
Model Card: GPT-Neo 1.3B Fine-Tuned with ORPO

This is a fine-tuned version of EleutherAI's GPT-Neo 1.3B model, trained with ORPO (Odds Ratio Preference Optimization) on the mlabonne/orpo-dpo-mix-40k dataset. LoRA (Low-Rank Adaptation) adapters were used for parameter-efficient fine-tuning.
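
The training script is not included in this card, but the setup described above (ORPO preference optimization with a LoRA adapter on the orpo-dpo-mix-40k data) can be reproduced along the lines of the sketch below using trl and peft. The hyperparameters, LoRA target modules, and output path shown are illustrative assumptions, not the exact values used for this checkpoint.

```python
# Minimal sketch of an ORPO + LoRA fine-tune of GPT-Neo 1.3B.
# Assumes transformers, datasets, peft, and trl are installed.
# All hyperparameters below are illustrative, not the settings used for this model.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base = "EleutherAI/gpt-neo-1.3B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-Neo has no pad token by default

# Preference data with chosen/rejected pairs; depending on the trl version,
# the conversational columns may need to be flattened to plain text first.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

# LoRA adapter on the attention projections (assumed rank and targets).
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj"],
    task_type="CAUSAL_LM",
)

# ORPO training configuration (assumed values).
orpo_args = ORPOConfig(
    output_dir="gpt-neo-1.3B-orpo",
    beta=0.1,  # weight of the odds-ratio penalty relative to the SFT loss
    max_length=1024,
    max_prompt_length=512,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=orpo_args,
    train_dataset=dataset,
    tokenizer=tokenizer,  # named processing_class in recent trl releases
    peft_config=peft_config,
)
trainer.train()
```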

Evaluation results

| Tasks     | Version | Filter | n-shot | Metric     | Value  | Stderr   |
|-----------|--------:|--------|-------:|------------|-------:|----------|
| hellaswag |       1 | none   |      0 | acc ↑      | 0.3859 | ± 0.0049 |
| hellaswag |       1 | none   |      0 | acc_norm ↑ | 0.4891 | ± 0.0050 |
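
These numbers follow the output format of EleutherAI's lm-evaluation-harness. A comparable zero-shot HellaSwag run can be reproduced roughly as sketched below; the checkpoint path is a placeholder and the batch size is an arbitrary choice, not the settings behind the table above.

```python
# Sketch of a zero-shot HellaSwag evaluation with lm-evaluation-harness
# (pip install lm-eval). "path/to/orpo-gpt-neo-1.3B" is a placeholder for
# the fine-tuned checkpoint (merged or with the LoRA adapter applied).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=path/to/orpo-gpt-neo-1.3B,dtype=float32",
    tasks=["hellaswag"],
    num_fewshot=0,
    batch_size=8,
)

# Per-task metrics (acc, acc_norm, and their stderr) are reported here.
print(results["results"]["hellaswag"])
```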
Model size: 1.32B parameters, stored as F32 tensors in Safetensors format.