---
license: apache-2.0
---
# Model Card for Soniox-7B-v1.0
Soniox 7B is a powerful large language model. It supports English and code, with an 8K context length, and matches GPT-4 performance on some benchmarks.
It is built on top of Mistral 7B, enhanced with additional pre-training and fine-tuning for strong problem-solving capabilities, and is released under the Apache 2.0 License.
For more details, please read our [blog post](https://soniox.com/news/soniox-7B).
## Usage in Transformers
The model is available in transformers and can be used as follows:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "soniox/Soniox-7B-v1.0"

# Load the model in half precision and move it to the GPU.
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

device = "cuda"
model.to(device)

# A multi-turn conversation; the final user turn is the one to answer.
messages = [
    {"role": "user", "content": "12 plus 21?"},
    {"role": "assistant", "content": "33."},
    {"role": "user", "content": "Five minus one?"},
]

# Format the conversation with the model's chat template and tokenize it.
tok_prompt = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = tok_prompt.to(device)

# Sample up to 1000 new tokens; the decoded output includes the prompt.
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
## Inference deployment
Refer to our [documentation](https://docs.soniox.com) for inference with vLLM and other
deployment options.
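For example, offline batch inference with vLLM's Python API might look like the sketch below. This is a minimal sketch, assuming vLLM can load this checkpoint directly; the sampling settings are illustrative, and a raw prompt is passed without the chat template.

```python
from vllm import LLM, SamplingParams

# Load the model with vLLM (assumes vLLM supports this checkpoint directly).
llm = LLM(model="soniox/Soniox-7B-v1.0")

# Illustrative sampling settings, not tuned recommendations.
sampling_params = SamplingParams(temperature=0.7, max_tokens=256)

# Raw prompts; for chat-style inputs, apply the tokenizer's chat template first.
prompts = ["Five minus one?"]
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.outputs[0].text)
```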