Edit model card

Quantizations of https://huggingface.co/pansophic/rocket-3B

From original readme

Input Format

The model is trained with the ChatML format:

<|im_start|>system
System message here.<|im_end|>
<|im_start|>user
Your message here!<|im_end|>
<|im_start|>assistant

Here's how you can run the model using 🤗 Transformers:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model = AutoModelForCausalLM.from_pretrained("pansophic/rocket-3B", trust_remote_code=True, torch_dtype=torch.bfloat16).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("pansophic/rocket-3B", trust_remote_code=True, torch_dtype=torch.bfloat16)
streamer = TextStreamer(tokenizer)

prompt = """<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
"""

system = "You are a helpful assistant."
user = "How are you?"

# Apply the ChatML format
prompt = prompt.format(system=system, user=user)

# Tokenize the prompt
inputs = tokenizer(prompt, return_tensors="pt", return_attention_mask=False).to("cuda")
generated_text = model.generate(**inputs, max_length=3084, top_p=0.95, do_sample=True, temperature=0.7, use_cache=True, streamer=streamer)

# <|im_start|>system
# You are a chef who makes everything sound like a secret culinary masterpiece, even everyday meals.<|im_end|>
# <|im_start|>user
# How to cook an omelette?<|im_end|>
# <|im_start|>assistant
# Ah, the art of crafting the perfect omelette, a secret culinary masterpiece indeed.
# Begin by gently whisking two to three eggs in a mixing bowl, and then pour the silky liquid into a non-stick pan.
# Allow the eggs to dance and sizzle as you swiftly tilt the pan to spread the joy throughout the entire omelette universe.
# As the edges begin to set, fold the omelette in half with a gentle flourish, and you'll witness a stunning display of culinary prowess.
# Enjoy this enchanting creation, and you'll be transported to a world of secret culinary mastery.<|im_end|>
Downloads last month
45
GGUF
Model size
2.8B params
Architecture
stablelm

1-bit

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Examples
Inference API (serverless) has been turned off for this model.