Update README.md
README.md CHANGED
@@ -10,20 +10,20 @@ datasets:
   - allenai/RLVR-GSM-MATH-IF-Mixed-Constraints
 ---
 
-<img src="https://
+<img alt="OLMo Logo" src="https://huggingface.co/datasets/allenai/blog-images/resolve/main/olmo2/olmo.png" width="242px">
 
 # OLMo-2-1124-13B-Instruct
 
-OLMo-2 13B Instruct November 2024 is
+OLMo-2 13B Instruct November 2024 is a post-trained variant of the [OLMo-2 13B November 2024](https://huggingface.co/allenai/OLMo2-13B-1124) model, which has undergone supervised finetuning on an OLMo-specific variant of the [Tülu 3 dataset](https://huggingface.co/datasets/allenai/tulu-3-sft-olmo-2-mixture), further DPO training on [this dataset](https://huggingface.co/datasets/allenai/olmo-2-1124-13b-preference-mix), and finally RLVR training using [this data](https://huggingface.co/datasets/allenai/RLVR-GSM-MATH-IF-Mixed-Constraints).
 Tülu 3 is designed for state-of-the-art performance on a diversity of tasks in addition to chat, such as MATH, GSM8K, and IFEval.
-Check out
+Check out the OLMo 2 paper (forthcoming) or the [Tülu 3 paper](https://arxiv.org/abs/2411.15124) for more details!
 
 OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
 These models are trained on the Dolma dataset. We are releasing all code, checkpoints, logs (coming soon), and associated training details.
 The core models released in this batch include the following:
 
 
-| **Stage** | **OLMo-2 7B** | **OLMo
+| **Stage**            | **OLMo-2 7B**                                                                    | **OLMo 2 13B**                                                                      |
 |----------------------|----------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|
 | **Base Model**       | [allenai/OLMo2-7B-1124](https://huggingface.co/allenai/OLMo2-7B-1124)             | [allenai/OLMo-2-13B-1124](https://huggingface.co/allenai/OLMo-2-13B-1124)            |
 | **SFT**              | [allenai/OLMo-2-1124-7B-SFT](https://huggingface.co/allenai/OLMo-2-1124-7B-SFT)   | [allenai/OLMo-2-1124-13B-SFT](https://huggingface.co/allenai/OLMo-2-1124-13B-SFT)    |
@@ -47,7 +47,7 @@ The core models released in this batch include the following:
 - Core repo (training, inference, fine-tuning etc.): https://github.com/allenai/OLMo
 - Evaluation code: https://github.com/allenai/olmes
 - Further fine-tuning code: https://github.com/allenai/open-instruct
-- **Paper:** Coming soon!
+- **Paper:** Coming soon!
 - **Demo:** https://playground.allenai.org/
 
 ## Using the model
@@ -86,7 +86,7 @@ The model has not been trained with a specific system prompt in mind.
 
 ### Bias, Risks, and Limitations
 
-The OLMo
+The OLMo 2 models have limited safety training and are not deployed with automatic in-the-loop filtering of responses the way ChatGPT is, so the model can produce problematic outputs (especially when prompted to do so).
 See the Falcon 180B model card for an example of this.
 
 
@@ -138,9 +138,10 @@ PPO settings for RLVR:
 
 ## License and use
 
-OLMo
-OLMo
+OLMo 2 is licensed under the Apache 2.0 license.
+OLMo 2 is intended for research and educational use.
 For more information, please see our [Responsible Use Guidelines](https://allenai.org/responsible-use).
+This model has been fine-tuned using a dataset mix with outputs generated from third-party models and is subject to additional terms: the [Gemma Terms of Use](https://ai.google.dev/gemma/terms).
 
 ## Citation
 
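For context on the "Using the model" section touched by this diff, below is a minimal sketch of loading the instruct model with Hugging Face `transformers` and generating a reply through its chat template. This is an illustrative example, not the card's official snippet: the model ID follows the repo naming above, and the standard `AutoModelForCausalLM`/`AutoTokenizer` API is assumed.

```python
# Minimal sketch (not the card's official snippet): load OLMo-2-1124-13B-Instruct
# with Hugging Face transformers and generate a reply via its chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-13B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The card notes the model was not trained with a specific system prompt in mind,
# so a plain user turn is used here.
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```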