
Model Card for Soniox-7B-v1.0

Soniox 7B is a large language model that supports English and code with an 8K context window. It matches GPT-4 performance on some benchmarks. It is built on top of Mistral 7B and enhanced with additional pre-training and fine-tuning for strong problem-solving capabilities. It is released under the Apache 2.0 license. For more details, please read our blog post.

Usage in Transformers

The model is available in transformers and can be used as follows:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the weights in half precision together with the matching tokenizer.
model_path = "soniox/Soniox-7B-v1.0"
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Move the model to the GPU.
device = "cuda"
model.to(device)

# A multi-turn conversation; the final user message is the one to answer.
messages = [
    {"role": "user", "content": "12 plus 21?"},
    {"role": "assistant", "content": "33."},
    {"role": "user", "content": "Five minus one?"},
]
# Format the conversation with the model's chat template and tokenize it.
tok_prompt = tokenizer.apply_chat_template(messages, return_tensors="pt")

# Sample up to 1000 new tokens, then decode the full sequence (prompt + reply).
model_inputs = tok_prompt.to(device)
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
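
The decoded string printed above contains the prompt as well as the reply, because `generate` returns the prompt tokens followed by the new ones for decoder-only models. A minimal sketch of separating the two is shown here on plain Python lists so it runs without a GPU; the helper name `new_token_ids` and the toy token ids are illustrative, not part of the model or the library.

```python
def new_token_ids(generated_ids, prompt_length):
    """Drop the first `prompt_length` ids from every generated sequence."""
    return [list(seq[prompt_length:]) for seq in generated_ids]


# Toy stand-ins for token ids (made up, not real vocabulary entries).
prompt_ids = [101, 102, 103]
generated = [prompt_ids + [201, 202, 203]]

reply_ids = new_token_ids(generated, len(prompt_ids))
print(reply_ids)
```

With the tensors from the snippet above, the same idea can be written as `tokenizer.batch_decode(generated_ids[:, model_inputs.shape[1]:], skip_special_tokens=True)`, which decodes only the newly generated tokens and omits special tokens.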

Inference deployment

Refer to our documentation for inference with vLLM and other deployment options.
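
As one possible illustration, vLLM can expose a model through an OpenAI-compatible HTTP server. The sketch below builds a chat completion request for such a server using only the standard library. It assumes the server is already running locally on vLLM's default port 8000 and serving this model id; the helper names, the address, and the sampling values are assumptions for illustration and are not taken from the Soniox documentation.

```python
import json
import urllib.request

# Assumed values: a local vLLM server on its default port and this card's model id.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "soniox/Soniox-7B-v1.0"


def build_chat_request(messages, max_tokens=256, temperature=0.7):
    """Return the JSON body for an OpenAI-style chat completion request."""
    return {
        "model": MODEL_ID,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def send_chat_request(payload, base_url=BASE_URL):
    """POST the payload and return the reply text (requires a running server)."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Build and inspect the request body; nothing is sent at this point.
payload = build_chat_request([{"role": "user", "content": "Five minus one?"}])
print(json.dumps(payload, indent=2))
```

Calling `send_chat_request(payload)` would send the request once a server is reachable at `BASE_URL`. Keeping payload construction separate from the network call makes the request easy to inspect before deployment.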

Model size: 7.24B params (Safetensors)
Tensor type: BF16