lewtun (HF staff) committed on
Commit 313cbf5
1 Parent(s): 728c8ea

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -15,9 +15,10 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# mistral-7b-dpo-v21.0cai.0.2
+# Mistral 7B Constitutional AI
+
+This model is a DPO-aligned version of [HuggingFaceH4/mistral-7b-cai](https://huggingface.co/HuggingFaceH4/mistral-7b-cai) on the HuggingFaceH4/ultrafeedback_binarized_fixed and the HuggingFaceH4/cai-conversation-harmless datasets.
 
-This model is a fine-tuned version of [HuggingFaceH4/mistral-7b-cai](https://huggingface.co/HuggingFaceH4/mistral-7b-cai) on the HuggingFaceH4/ultrafeedback_binarized_fixed and the HuggingFaceH4/cai-conversation-harmless datasets.
 It achieves the following results on the evaluation set:
 - Loss: 0.6327
 - Rewards/chosen: -9.8716