license: apache-2.0

Model Card for Soniox-7B-v1.0

Soniox 7B is a large language model that supports English and code, with an 8K-token context window. It matches GPT-4 performance on some benchmarks. It is built on top of Mistral 7B and enhanced with additional pre-training and fine-tuning for strong problem-solving capabilities. The model is released under the Apache 2.0 license. For more details, please read our blog post.

Usage in Transformers

The model is available in transformers and can be used as follows:

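import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the weights in half precision along with the matching tokenizer.
model_path = "soniox/Soniox-7B-v1.0"
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# This example assumes a CUDA-capable GPU is available.
device = "cuda"
model.to(device)

# Conversation history; the final user turn is the one the model answers.
messages = [
    {"role": "user", "content": "12 plus 21?"},
    {"role": "assistant", "content": "33."},
    {"role": "user", "content": "Five minus one?"},
]
# Render the conversation with the model's chat template and tokenize it.
tok_prompt = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = tok_prompt.to(device)
# Sample up to 1000 new tokens; the output contains the prompt followed by the reply.
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])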
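
For decoder-only models, generate returns the prompt tokens followed by the newly generated ones, so the decoded text above repeats the whole conversation. The following is a minimal sketch of one way to keep only the reply. The helper name strip_prompt is hypothetical and not part of transformers, and it assumes every sequence in the batch starts with a prompt of the same length (a single prompt, or a left-padded batch). The token IDs in the demo are arbitrary placeholders.

```python
def strip_prompt(generated_ids, prompt_len):
    """Drop the first `prompt_len` tokens from every generated sequence.

    Hypothetical helper: works on any sequence of sliceable rows, such as
    nested lists of token IDs or a 2-D tensor returned by `generate`.
    """
    return [seq[prompt_len:] for seq in generated_ids]


# Placeholder token IDs: a 3-token prompt followed by 2 generated tokens.
fake_output = [[10, 11, 12, 13, 14]]
reply_ids = strip_prompt(fake_output, prompt_len=3)
print(reply_ids)
```

With the variables from the usage example, the prompt length would be model_inputs.shape[1], and the result can be passed to tokenizer.batch_decode with skip_special_tokens=True to print only the model's answer.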

Inference deployment

Refer to our documentation for inference with vLLM and other deployment options.