Model Card for trainhubai/gpt-neo-1-3B-orpo
This model is a fine-tuned version of EleutherAI's GPT-Neo 1.3B, trained with ORPO (Odds Ratio Preference Optimization) on the 'mlabonne/orpo-dpo-mix-40k' dataset. Fine-tuning used LoRA (Low-Rank Adaptation) for parameter-efficient training.
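To make the ORPO objective concrete, here is a minimal sketch of the per-example loss in plain Python. It follows the general form of the ORPO objective (a supervised term on the chosen response plus a weighted odds-ratio term), not this model's actual training code. The function names and the `lam` weight are hypothetical; the card does not state the value used for training.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp).

    avg_logp is the length-normalised log-likelihood of a response
    (mean token log-probability), so it must be strictly negative.
    """
    # log1p(-exp(x)) evaluates log(1 - p) without losing precision.
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """Sketch of the ORPO loss for one (chosen, rejected) pair.

    lam is an illustrative weight, not a value taken from this model card.
    """
    # Supervised fine-tuning term: negative log-likelihood of the chosen response.
    nll = -chosen_avg_logp
    # Log odds ratio between the chosen and rejected responses.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll + lam * odds_ratio_term


# The loss is lower when the chosen response is the more likely one.
good = orpo_loss(chosen_avg_logp=-0.5, rejected_avg_logp=-2.0)
bad = orpo_loss(chosen_avg_logp=-2.0, rejected_avg_logp=-0.5)
print(good < bad)
```

Because the preference signal is folded into the supervised loss, ORPO does not need a separate reference model, which pairs naturally with a lightweight method such as LoRA.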
Evaluation Results
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| eq_bench | 2.1 | none | 0 | eqbench | ↑ | 3.9776 | ± | 1.7012 |
| | | none | 0 | percent_parseable | ↑ | 54.9708 | ± | 3.8158 |
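As a reading aid for the standard errors above, the following sketch turns each reported value into an approximate 95% interval. It assumes the sampling distribution is roughly normal and uses z = 1.96; the helper name `normal_ci` is hypothetical and not part of any evaluation tool.

```python
def normal_ci(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate confidence interval: value ± z * stderr (normal assumption)."""
    half_width = z * stderr
    return value - half_width, value + half_width


# (value, stderr) pairs copied from the evaluation table.
results = {
    "eqbench": (3.9776, 1.7012),
    "percent_parseable": (54.9708, 3.8158),
}

intervals = {name: normal_ci(value, se) for name, (value, se) in results.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: [{low:.2f}, {high:.2f}]")
```

Under this approximation the eqbench interval is wide relative to the score itself, so small differences against other models on this metric should be read with caution.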
Model tree for trainhubai/gpt-neo-1-3B-orpo
Base model
EleutherAI/gpt-neo-1.3B