
Orpo-Llama-3.2-1B-15k

AdamLucek/Orpo-Llama-3.2-1B-15k is an ORPO fine-tuned version of meta-llama/Llama-3.2-1B, trained on a shuffled 15,000-entry subset of mlabonne/orpo-dpo-mix-40k.
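
The 15,000-entry subset described above amounts to a seeded shuffle followed by keeping the first 15,000 rows. The sketch below illustrates that idea on row indices using only the standard library. The function name `shuffled_subset_indices`, the seed, and the total row count are illustrative assumptions, not values taken from the training script. With the Hugging Face `datasets` library, the analogous calls would be `Dataset.shuffle(seed=...)` followed by `Dataset.select(range(15000))`.

```python
import random


def shuffled_subset_indices(n_rows: int, subset_size: int, seed: int) -> list[int]:
    """Return `subset_size` row indices taken from a seeded shuffle of range(n_rows)."""
    if subset_size > n_rows:
        raise ValueError("subset_size cannot exceed n_rows")
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle reproducible and
    # independent of the global random state.
    random.Random(seed).shuffle(indices)
    return indices[:subset_size]


# Hypothetical numbers: 40,000 rows is only a placeholder for the dataset size,
# and 42 is an arbitrary seed.
subset = shuffled_subset_indices(n_rows=40_000, subset_size=15_000, seed=42)
print(len(subset))
```

Fixing the seed means the same 15,000 rows are selected on every run, which makes a training run on the subset repeatable.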

The model was trained for 7 hours on an L4 GPU using a training script modified from Maxime Labonne's original guide.
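
For readers unfamiliar with ORPO, the following is a minimal sketch of the objective as it is usually stated: the supervised negative log-likelihood on the chosen response plus a weighted odds-ratio penalty that favors the chosen response over the rejected one. It is not the training script used for this model. The function names, the scalar inputs (length-averaged log-probabilities), and the weight `lam=0.1` are assumptions made for illustration, and library implementations may differ in detail.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Compute log(p / (1 - p)) from log p, for 0 < p < 1."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(sft_nll: float, avg_logp_chosen: float,
              avg_logp_rejected: float, lam: float = 0.1) -> float:
    """Sketch of the ORPO loss: SFT loss + lam * odds-ratio term.

    sft_nll            -- negative log-likelihood of the chosen response
    avg_logp_chosen    -- length-averaged log-probability of the chosen response
    avg_logp_rejected  -- length-averaged log-probability of the rejected response
    lam                -- weight of the odds-ratio term (assumed value)
    """
    log_odds_ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(x)) = softplus(-x), written in a numerically stable form.
    x = -log_odds_ratio
    or_term = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return sft_nll + lam * or_term


# The penalty shrinks when the chosen response is more likely than the rejected one.
preferred = orpo_loss(1.0, math.log(0.6), math.log(0.2))
reversed_pair = orpo_loss(1.0, math.log(0.2), math.log(0.6))
print(preferred < reversed_pair)
```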

For full model details, refer to the base model page for meta-llama/Llama-3.2-1B.

Evaluations

| Benchmark | Accuracy | Notes |
|---|---|---|
| AGIEval | 20.99% | Average across multiple reasoning tasks |
| GPT4ALL | 51.12% | Average across all categories |
| TruthfulQA | 42.80% | MC2 accuracy |
| BigBench | 31.75% | Average across 18 tasks |
| MMLU | 31.23% | Average across all categories |
| Winogrande | 61.33% | 5-shot evaluation |
| ARC Challenge | 35.92% | 25-shot evaluation |
| HellaSwag | 48.65% | 10-shot evaluation |

Detailed Eval Metrics Available Here

Using this Model

from transformers import AutoTokenizer
import transformers
import torch

# Load Model and Pipeline
model = "AdamLucek/Orpo-Llama-3.2-1B-15k"

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model)

# Generate Message
messages = [{"role": "user", "content": "What is a language model?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
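
By default, the text-generation pipeline returns the prompt followed by the completion in `generated_text`, so the printed output above includes the chat-template text. One option is to pass `return_full_text=False` to the pipeline call. Another is to strip the prompt afterwards, as in this small sketch; `strip_prompt` is a hypothetical helper, and the strings below are stand-ins rather than real model output.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the completion when the output begins with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    # If the prompt is not a prefix, leave the text unchanged.
    return generated_text


# Stand-in strings; with the snippet above these would be
# outputs[0]["generated_text"] and prompt.
example_prompt = "User: What is a language model?\nAssistant:"
example_output = example_prompt + " A language model predicts the next token."
completion = strip_prompt(example_output, example_prompt)
print(completion)
```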

Training Statistics

[Figure: four training-statistics panels, Panel 1 through Panel 4]
Model size: 1.24B params (Safetensors, tensor type FP16)
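
The parameter count and tensor type listed above give a rough lower bound on the memory needed to hold the weights. The sketch below does that arithmetic; `weight_memory_gb` is a hypothetical helper, and the estimate covers weights only, not activations, the KV cache, or framework overhead.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# 1.24B parameters stored as FP16, which uses 2 bytes per parameter.
fp16_gb = weight_memory_gb(1.24e9, 2)
print(f"{fp16_gb:.2f} GB")
```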