natolambert committed
Commit 97afec2 • 1 Parent(s): c45397a

Update README.md

README.md CHANGED
@@ -10,13 +10,12 @@ language:
 <img src="https://allenai.org/olmo/olmo-7b-animation.gif" alt="OLMo Logo" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
 
 # TODO
-* Change using model section if is in transformers
 * Update summary of Dolma 1.7
 * Remove installation requirements?
 * Evals pre and post annealing
 * details on annealing / accessing checkpoint (remove previous checkpoint instructions)
 
-# Model Card for OLMo
+# Model Card for OLMo 1.7-7B
 
 <!-- Provide a quick summary of what the model is/does. -->
 
@@ -32,25 +31,25 @@ The core models released in this batch are the following:
 | [OLMo 1B](https://huggingface.co/allenai/OLMo-1B) | 3 Trillion | 16 | 2048 | 16 | 2048 |
 | [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) | 2.5 Trillion | 32 | 4096 | 32 | 2048 |
 | [OLMo 7B Twin 2T](https://huggingface.co/allenai/OLMo-7B-Twin-2T) | 2 Trillion | 32 | 4096 | 32 | 2048 |
-| [OLMo
+| [OLMo 1.7-7B](https://huggingface.co/allenai/OLMo-1.7-7B) | 2.05 Trillion | 32 | 4096 | 32 | 4096 |
 
-*Note: OLMo
+*Note: OLMo 1.7-7B also includes QKV clipping.*
 
 
-We are releasing many checkpoints for these models, for every 1000 training steps.
+[Coming soon] We are releasing many checkpoints for these models, for every 1000 training steps.
 The naming convention is `step1000-tokens4B`.
 
 To load a specific model revision with HuggingFace, simply add the argument `revision`:
 ```python
 import hf_olmo  # pip install ai2-olmo
-olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-
+olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B", revision="step1000-tokens4B")
 ```
 
 All revisions/branches are listed in the file `revisions.txt`.
 Or, you can access all the revisions for the models via the following code snippet:
 ```python
 from huggingface_hub import list_repo_refs
-out = list_repo_refs("allenai/OLMo-
+out = list_repo_refs("allenai/OLMo-1.7-7B")
 branches = [b.name for b in out.branches]
 ```
 A few revisions were lost due to an error, but the vast majority are present.
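As written, the revision snippet leaves the `AutoModelForCausalLM` import implicit. A minimal, self-contained sketch that lists the available revision branches and then loads one of them; the model name and the `step1000-tokens4B` branch are taken from the diff above, and any branch listed in `revisions.txt` can be substituted:

```python
# List the revision branches of the repo, then load a specific checkpoint.
# Assumes `pip install ai2-olmo huggingface_hub transformers` and that the
# chosen branch exists on allenai/OLMo-1.7-7B (see revisions.txt).
import hf_olmo  # registers the OLMo architecture with transformers
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM

refs = list_repo_refs("allenai/OLMo-1.7-7B")
branches = [b.name for b in refs.branches]
print(branches[:5])  # e.g. ['step1000-tokens4B', ...]

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-1.7-7B", revision="step1000-tokens4B"
)
```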
@@ -87,6 +86,11 @@ A few revisions were lost due to an error, but the vast majority are present.
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
 ### Inference
+
+*Note: The OLMo models will shortly be included in Transformers.*
+When the [PR](https://github.com/huggingface/transformers/pull/29890) is merged, you will no longer need to use `trust_remote_code=True` or install `ai2-olmo` to use the model.
+Then, install Transformers [from source](https://huggingface.co/docs/transformers/en/installation#install-from-source).
+
 Quickly get inference running with the following required installation:
 ```bash
 pip install ai2-olmo
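The note above implies two loading paths for now: installing `ai2-olmo` (and importing `hf_olmo`), or passing `trust_remote_code=True`. A sketch of the second path, under the assumption that the Hub repo ships the remote modeling code that flag refers to:

```python
# Load OLMo without ai2-olmo by trusting the code hosted in the model repo.
# Only needed until native OLMo support ships in a Transformers release.
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-1.7-7B", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "allenai/OLMo-1.7-7B", trust_remote_code=True
)

# Once the linked PR is merged and released, the plain call should suffice:
# olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B")
```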
@@ -96,8 +100,8 @@ Now, proceed as usual with HuggingFace:
 import hf_olmo
 
 from transformers import AutoModelForCausalLM, AutoTokenizer
-olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-
-tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-
+olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B")
+tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1.7-7B")
 message = ["Language modeling is "]
 inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
 # optional verifying cuda
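The hunk above cuts off right after tokenization, at the optional CUDA check. A self-contained sketch of the rest of the generation flow; the `generate` arguments here are illustrative choices, not values prescribed by the card:

```python
# End-to-end generation with the hf_olmo-registered model.
import hf_olmo  # pip install ai2-olmo
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B")
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1.7-7B")
message = ["Language modeling is "]
inputs = tokenizer(message, return_tensors="pt", return_token_type_ids=False)

# Optional: move model and inputs to GPU when available.
if torch.cuda.is_available():
    olmo = olmo.to("cuda")
    inputs = {k: v.to("cuda") for k, v in inputs.items()}

response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```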
@@ -112,12 +116,12 @@ Alternatively, with the pipeline abstraction:
 import hf_olmo
 
 from transformers import pipeline
-olmo_pipe = pipeline("text-generation", model="allenai/OLMo-
+olmo_pipe = pipeline("text-generation", model="allenai/OLMo-1.7-7B")
 print(olmo_pipe("Language modeling is "))
 >> 'Language modeling is a branch of natural language processing that aims to...'
 ```
 
-Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-
+Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
 The quantized model is more sensitive to typing / cuda, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
 
 Note, you may see the following error if `ai2-olmo` is not installed correctly, which is caused by internal Python check naming. We'll update the code soon to make this error clearer.
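The inline 8-bit example above is only a fragment. A fuller sketch, assuming `bitsandbytes` is installed and a CUDA device is available, and passing raw input ids to the model as the note recommends:

```python
# 8-bit loading plus generation; requires bitsandbytes and a CUDA device.
import hf_olmo  # pip install ai2-olmo
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-1.7-7B", torch_dtype=torch.float16, load_in_8bit=True
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1.7-7B")
inputs = tokenizer(["Language modeling is "], return_tensors="pt", return_token_type_ids=False)

# Pass the raw token ids on CUDA rather than the whole BatchEncoding.
response = olmo.generate(inputs.input_ids.to("cuda"), max_new_tokens=50)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```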
@@ -144,24 +148,23 @@ For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?
 
 <!-- This section describes the evaluation protocols and provides the results. -->
 
-Core model results for the 7B model are found below.
-| boolq
-| copa
-| hellaswag
-| openbookqa
-| piqa
-| sciq
-| winogrande
-| **Full average** | 57.8 | 59.3 | 59.2 | 59.3 | 59.8 |
+Core model results for the new and original 7B model are found below.
+
+| Task | Llama-7b | Llama2-7b | Falcon-7b | Mpt-7b | OLMo-7B | Llama2-13b | **OLMo 1.7-7B** |
+|-------------------|----------|-----------|-----------|--------|---------|------------|-------------|
+| arc_c | 44.5 | 48.5 | 47.5 | 46.5 | 48.5 | 52.8 | 42.5 |
+| arc_e | 67.9 | 69.5 | 70.4 | 70.5 | 65.4 | 73.7 | 67.2 |
+| boolq | 75.4 | 80.2 | 74.6 | 74.2 | 73.4 | 82.2 | 83.7 |
+| copa | 91.0 | 86.0 | 86.0 | 85.0 | 90.0 | 90.0 | 86.0 |
+| hellaswag | 76.2 | 76.8 | 75.9 | 77.6 | 76.4 | 78.6 | 75.5 |
+| openbookqa | 51.2 | 48.4 | 53.0 | 48.6 | 50.4 | 51.8 | 50.0 |
+| piqa | 77.2 | 76.7 | 78.5 | 77.3 | 78.4 | 79.0 | 77.5 |
+| sciq | 93.9 | 94.5 | 93.9 | 93.7 | 93.8 | 95.5 | 96.7 |
+| winogrande | 70.5 | 69.4 | 68.9 | 69.9 | 67.9 | 73.5 | 69.8 |
+| truthfulQA (MC2) | 33.9 | 38.5 | 34.0 | 33.0 | 36.0 | 36.8 | 35.8 |
+| MMLU (5 shot MC) | 31.5 | 45.0 | 24.0 | 30.8 | 28.3 | 55.5 | 52.0 |
+| GSM8k | 10.0 | 12.0 | 4.0 | 4.5 | 8.5 | 25.0 | 29.0 |
+| Full average | 60.3 | 62.1 | 59.2 | 59.3 | 59.8 | 66.2 | 63.8 |
 
 And for the 1B model:
 
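In the 7B results table above, the Full average row reads as the unweighted mean of the twelve task rows; that is an assumption, but it reproduces the reported 63.8 for OLMo 1.7-7B:

```python
# Check the Full average column for OLMo 1.7-7B against the per-task scores.
olmo_1_7_7b = [42.5, 67.2, 83.7, 86.0, 75.5, 50.0, 77.5, 96.7, 69.8, 35.8, 52.0, 29.0]
print(f"{sum(olmo_1_7_7b) / len(olmo_1_7_7b):.1f}")  # 63.8
```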