Update README.md
README.md
CHANGED
@@ -92,7 +92,6 @@ OpenBioLLM-70B is an advanced open source language model designed specifically f
 </div>


-- **Reward Model**: [Nexusflow/Starling-RM-34B](https://huggingface.co/Nexusflow/Starling-RM-34B)
 - **Policy Optimization**: [Fine-Tuning Language Models from Human Preferences (PPO)](https://arxiv.org/abs/1909.08593)
 - **Ranking Dataset**: [berkeley-nest/Nectar](https://huggingface.co/datasets/berkeley-nest/Nectar)
 - **Fine-tuning dataset**: Custom Medical Instruct dataset (We plan to release a sample training dataset in our upcoming paper; please stay updated)
@@ -106,7 +105,7 @@ This combination of cutting-edge techniques enables OpenBioLLM-70B to align with
 - **Language(s) (NLP):** en
 - **Developed By**: [Ankit Pal (Aaditya Ura)](https://aadityaura.github.io/) from Saama AI Labs
 - **License:** Meta-Llama License
-- **Fine-tuned from models:** [Meta-Llama-3-70B-Instruct](meta-llama/Meta-Llama-3-70B-Instruct)
+- **Fine-tuned from models:** [Meta-Llama-3-70B-Instruct](meta-llama/Meta-Llama-3-70B-Instruct)
 - **Resources for more information:**
 - Paper: Coming soon
