Update README.md
README.md
CHANGED
@@ -14,6 +14,9 @@ license: artistic-2.0
 # juanako-7b-v1 (UNA: Uniform Neural Alignment)
 
 This model uses uniform neural alignment (UNA) for the DPO training phases and is a fine-tuned version of [fblgit/zephyr-lora-dpo-b1](https://huggingface.co/fblgit/zephyr-lora-dpo-b1) on the HuggingFaceH4/ultrafeedback_binarized dataset.
+
+**It is recommended to use the latest [Juanako Version](https://huggingface.co/fblgit/juanako-7b-UNA) which highly outperforms the v1**
+
 It achieves the following results on the evaluation set:
 - Loss: 0.4594
 - Rewards/chosen: -1.1095
@@ -27,7 +30,7 @@ It achieves the following results on the evaluation set:
 
 Followed [alignment-handbook](https://github.com/huggingface/alignment-handbook) to perform DPO (Phase 2) over Zephyr-SFT model.
 
-**Please feel free to run more tests and commit the results. Also if you are interested to participate in [UNA's paper research or GPU sponsorship](mailto:
+**Please feel free to run more tests and commit the results. Also if you are interested to participate in [UNA's paper research or GPU sponsorship](mailto:xavi@juanako.ai) to support UNA research, feel free to contact.**
 
 Special thanks to [TheBloke](https://huggingface.co/TheBloke) for converting the model into multiple formats and overall his enormous contribution to the community.
 Here are the models: