Commit ae87025 (1 parent: 1fb444b), committed by winglian

Update README.md

Files changed (1)
  1. README.md +3 -15
README.md CHANGED
@@ -8,29 +8,17 @@ model-index:
  results: []
  datasets:
  - winglian/no_robots_rlhf
+ - HuggingFaceH4/no_robots
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
- # qlora-out
+ # openhermes-2_5-dpo-no-robots

- This model is a fine-tuned version of [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) on an unknown dataset.
+ This model is a RL fine-tuned version of [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) on a preference dataset derived from HuggingFace's [no robots dataset](https://huggingface.co/datasets/HuggingFaceH4/no_robots) using DPO.

- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure

  ### Training hyperparameters