Update README.md
README.md
CHANGED
@@ -13,6 +13,8 @@ pipeline_tag: text-generation
 
 # DPOpenHermes 7B
 
+![image/png](https://huggingface.co/openaccess-ai-collective/DPOpenHermes-7B/resolve/main/assets/dpopenhermes.png)
+
 ## OpenHermes x Notus x Neural
 
 This is an RL fine tuned [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) using the [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs) and [argilla/ultrafeedback-binarized-preferences](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences) preference datasets for reinforcement learning using Direct Preference Optimization (DPO)
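The commit itself only adds the banner image, but since the surrounding README text describes DPO training of the base model on the two linked preference datasets, a minimal illustrative sketch of such a run follows. This is not the training code from this repository: the use of TRL's `DPOTrainer`, the `beta` value, the batch size, the output directory name, and the column mapping (including dropping the `system` field from Intel/orca_dpo_pairs) are all assumptions, and exact argument names vary across TRL versions.

```python
# Hypothetical sketch of DPO fine-tuning with TRL -- not the code used for this model.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "teknium/OpenHermes-2.5-Mistral-7B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Intel/orca_dpo_pairs ships "system"/"question"/"chosen"/"rejected" columns;
# DPOTrainer expects "prompt"/"chosen"/"rejected". The system prompt is dropped
# here purely to keep the sketch short.
ds = load_dataset("Intel/orca_dpo_pairs", split="train")
ds = ds.map(
    lambda r: {"prompt": r["question"], "chosen": r["chosen"], "rejected": r["rejected"]},
    remove_columns=ds.column_names,
)

args = TrainingArguments(
    output_dir="dpopenhermes-sketch",      # placeholder name
    per_device_train_batch_size=1,         # assumed, not from the repo
    remove_unused_columns=False,           # required by DPOTrainer's collator
)

trainer = DPOTrainer(
    model,
    ref_model=None,          # TRL builds the frozen reference copy when None
    beta=0.1,                # DPO temperature (assumed value)
    args=args,
    train_dataset=ds,
    tokenizer=tokenizer,
)
trainer.train()
```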