Adding `safetensors` variant of this model

#2

This is an automated PR created with https://huggingface.co/spaces/safetensors/convert

This new file is equivalent to `pytorch_model.bin`, but safe in the sense that no arbitrary code can be embedded in it.
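The safety claim comes from the file format itself: a `pytorch_model.bin` file is a pickle, and unpickling can execute arbitrary code, whereas a safetensors file is just a JSON header plus raw tensor bytes, so parsing it cannot run anything. A minimal sketch of the pickle risk (the `Payload` class is a made-up demonstration, not anything from this repo):

```python
import pickle

class Payload:
    """A malicious object: pickle stores *instructions*, not just data."""
    def __reduce__(self):
        # On load, pickle will CALL eval("40 + 2") -- this could be any code,
        # e.g. os.system(...), which is why loading untrusted .bin files is risky.
        return (eval, ("40 + 2",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # merely loading the blob executes the embedded call
print(result)
```

With safetensors you would instead call something like `safetensors.torch.load_file("model.safetensors")`, which only reads tensor metadata and raw buffers and never invokes embedded code.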

These files also load much faster than their PyTorch counterparts:
https://colab.research.google.com/github/huggingface/notebooks/blob/main/safetensors_doc/en/speed.ipynb

The widgets on your model page will run using this file even before the PR is merged, which verifies that the file actually works.

If you find any issues, please report them here: https://huggingface.co/spaces/safetensors/convert/discussions

Feel free to ignore this PR.

Token generation seems broken: all generated tokens are `<unk>`.

Ran with text-generation-inference using `--revision "refs/pr/2" --quantize bitsandbytes`.
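For reproducibility, the run above corresponds roughly to a text-generation-inference launcher invocation like the following (`<model-id>` is a placeholder, since the model repo is not named in this thread; flags as documented for the TGI launcher):

```shell
# Serve the safetensors weights from this PR branch, quantized with bitsandbytes
text-generation-launcher \
  --model-id <model-id> \
  --revision refs/pr/2 \
  --quantize bitsandbytes
```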

However, token generation does look faster compared to standard Mistral 7B on my machine.

