---
license: apache-2.0
---
# Model Card for Soniox-7B-v1.0
Soniox 7B is a powerful large language model that supports English and code with an 8K-token context window.
It matches GPT-4 performance on some benchmarks.
It is built on top of Mistral 7B and enhanced with additional pre-training and fine-tuning for strong problem-solving capabilities.
The model is released under the Apache 2.0 license.
For more details, please read our [blog post](https://soniox.com/news/soniox-7B).
## Usage in Transformers
The model can be loaded with the Hugging Face Transformers library and used as follows:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "soniox/Soniox-7B-v1.0"

# Load the weights in half precision and move the model to the GPU.
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_path)
device = "cuda"
model.to(device)

# Conversation history; the final user turn is the one the model answers.
messages = [
    {"role": "user", "content": "12 plus 21?"},
    {"role": "assistant", "content": "33."},
    {"role": "user", "content": "Five minus one?"},
]

# Apply the model's chat template and tokenize in one step.
tok_prompt = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = tok_prompt.to(device)

# Sample up to 1000 new tokens, then decode the full sequence.
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
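For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded text above includes the whole conversation. If only the reply is wanted, the prompt can be sliced off before decoding. The sketch below illustrates this with plain Python lists; the helper name `strip_prompt` and the toy token ids are hypothetical and not part of the model or the library.

```python
from typing import List, Sequence


def strip_prompt(generated: Sequence[Sequence[int]], prompt_len: int) -> List[List[int]]:
    """Drop the first `prompt_len` token ids from each generated sequence."""
    return [list(seq[prompt_len:]) for seq in generated]


# Toy ids standing in for model output: 3 prompt tokens followed by 2 new ones.
toy_generated = [[1, 733, 16289, 28705, 2]]
new_tokens = strip_prompt(toy_generated, prompt_len=3)
print(new_tokens)
```

With the objects from the example above, this would likely be called as `strip_prompt(generated_ids.tolist(), model_inputs.shape[1])`, and the result passed to `tokenizer.batch_decode(..., skip_special_tokens=True)` to get the reply text without special tokens.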
## Inference deployment
Refer to our [documentation](https://docs.soniox.com) for inference with vLLM and other
deployment options.