ORPO

Updates (24.03.25)

This is the official repository for ORPO: Monolithic Preference Optimization without Reference Model. The detailed results in the paper can be found in:

Model Checkpoints

Our models trained with ORPO can be found in:

The corresponding logs of the average log probabilities of chosen and rejected responses during training are reported in:

AlpacaEval

Figure 1. AlpacaEval 2.0 score for the models trained with different alignment methods.

MT-Bench

Figure 2. MT-Bench result by category.

IFEval

IFEval scores are measured with EleutherAI/lm-evaluation-harness, with the chat template applied. The scores for Llama-2-Chat (70B), Zephyr-β (7B), and Mixtral-8X7B-Instruct-v0.1 were originally reported in this tweet.

| Model Type | Prompt-Strict | Prompt-Loose | Inst-Strict | Inst-Loose |
|---|---|---|---|---|
| Llama-2-Chat (70B) | 0.4436 | 0.5342 | 0.5468 | 0.6319 |
| Zephyr-β (7B) | 0.4233 | 0.4547 | 0.5492 | 0.5767 |
| Mixtral-8X7B-Instruct-v0.1 | 0.5213 | 0.5712 | 0.6343 | 0.6823 |
| Mistral-ORPO-⍺ (7B) | 0.5009 | 0.5083 | 0.5995 | 0.6163 |
| Mistral-ORPO-β (7B) | 0.5287 | 0.5564 | 0.6355 | 0.6619 |
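
For readers unfamiliar with the column names: the "Prompt" columns count a prompt as correct only when every instruction attached to it is followed, while the "Inst" columns score each instruction on its own. The sketch below illustrates that aggregation on made-up verdicts. It is not the harness's code; `PromptResult` and the toy data are hypothetical names introduced here, and the strict/loose distinction (which concerns how leniently each response is checked) is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptResult:
    """Hypothetical container: one boolean verdict per instruction in a prompt."""
    followed: List[bool]


def prompt_level_accuracy(results: List[PromptResult]) -> float:
    # A prompt counts as correct only if all of its instructions were followed.
    return sum(all(r.followed) for r in results) / len(results)


def instruction_level_accuracy(results: List[PromptResult]) -> float:
    # Every instruction counts individually, regardless of which prompt it came from.
    verdicts = [ok for r in results for ok in r.followed]
    return sum(verdicts) / len(verdicts)


# Made-up verdicts for three prompts (illustration only, not real IFEval data).
toy = [
    PromptResult([True, True]),   # both instructions followed
    PromptResult([True, False]),  # one instruction missed
    PromptResult([True]),         # single instruction followed
]

print(prompt_level_accuracy(toy))       # 2 of 3 prompts fully followed
print(instruction_level_accuracy(toy))  # 4 of 5 instructions followed
```

Because a single missed instruction fails the whole prompt, the prompt-level score can never exceed the instruction-level score on the same verdicts, which matches the pattern in the table.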