natolambert committed
Commit 97afec2
1 Parent(s): c45397a

Update README.md

Files changed (1)
  1. README.md +32 -29
README.md CHANGED
@@ -10,13 +10,12 @@ language:
 <img src="https://allenai.org/olmo/olmo-7b-animation.gif" alt="OLMo Logo" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>

 # TODO
- * Change using model section if is in transformers
 * Update summary of Dolma 1.7
 * Remove installation requirements?
 * Evals pre and post annealing
 * details on annealing / accessing checkpoint (remove previous checkpoint instructions)

- # Model Card for OLMo 7B v1.7
+ # Model Card for OLMo 1.7-7B

 <!-- Provide a quick summary of what the model is/does. -->

@@ -32,25 +31,25 @@ The core models released in this batch are the following:
 | [OLMo 1B](https://huggingface.co/allenai/OLMo-1B) | 3 Trillion | 16 | 2048 | 16 | 2048 |
 | [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) | 2.5 Trillion | 32 | 4096 | 32 | 2048 |
 | [OLMo 7B Twin 2T](https://huggingface.co/allenai/OLMo-7B-Twin-2T) | 2 Trillion | 32 | 4096 | 32 | 2048 |
- | [OLMo 7B v1.7](https://huggingface.co/allenai/OLMo-7B-v1.7) | 2.TODO | 32 | 4096 | 32 | 4096 |
+ | [OLMo 1.7-7B](https://huggingface.co/allenai/OLMo-1.7-7B) | 2.05 Trillion | 32 | 4096 | 32 | 4096 |

- *Note: OLMo 7B v1.7 also includes QKV clipping.*
+ *Note: OLMo 1.7-7B also includes QKV clipping.*


- We are releasing many checkpoints for these models, for every 1000 traing steps.
+ [Coming soon] We are releasing many checkpoints for these models, one for every 1000 training steps.
 The naming convention is `step1000-tokens4B`.

 To load a specific model revision with HuggingFace, simply add the argument `revision`:
 ```python
 import hf_olmo  # pip install ai2-olmo
 from transformers import AutoModelForCausalLM

- olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-v1.7", revision="step1000-tokens4B")
+ olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B", revision="step1000-tokens4B")
 ```

 All revisions/branches are listed in the file `revisions.txt`.
 Or, you can access all the revisions for the models via the following code snippet:
 ```python
 from huggingface_hub import list_repo_refs
- out = list_repo_refs("allenai/OLMo-7B-v1.7")
+ out = list_repo_refs("allenai/OLMo-1.7-7B")
 branches = [b.name for b in out.branches]
 ```
 A few revisions were lost due to an error, but the vast majority are present.
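
For example, to pick the newest available checkpoint from those branches, here is a minimal sketch; it assumes every checkpoint branch follows the `step{N}-tokens{M}B` naming convention described above:
```python
from huggingface_hub import list_repo_refs

out = list_repo_refs("allenai/OLMo-1.7-7B")
branches = [b.name for b in out.branches]

# Keep only checkpoint branches (skipping e.g. "main") and take the highest step count.
checkpoints = [b for b in branches if b.startswith("step")]
latest = max(checkpoints, key=lambda name: int(name.split("-")[0][len("step"):]))
print(latest)  # e.g. "step1000-tokens4B"
```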
@@ -87,6 +86,11 @@ A few revisions were lost due to an error, but the vast majority are present.
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

 ### Inference
+
+ *Note: The OLMo models will shortly be included in Transformers.*
+ Once the [PR](https://github.com/huggingface/transformers/pull/29890) is merged, you will no longer need to use `trust_remote_code=True` or install `ai2-olmo` to use the model.
+ At that point, install Transformers [from source](https://huggingface.co/docs/transformers/en/installation#install-from-source).
+
 Quickly get inference running with the following required installation:
 ```bash
 pip install ai2-olmo
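# (Sketch) Once OLMo is merged into Transformers via the PR above, the
# `ai2-olmo` package should no longer be needed; the "install from source"
# step referenced above would then be (assumption, per the linked docs):
pip install git+https://github.com/huggingface/transformers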
@@ -96,8 +100,8 @@ Now, proceed as usual with HuggingFace:
 import hf_olmo

 from transformers import AutoModelForCausalLM, AutoTokenizer
- olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-v1.7")
- tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B-v1.7")
+ olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B")
+ tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1.7-7B")
 message = ["Language modeling is "]
 inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
 # optional verifying cuda
@@ -112,12 +116,12 @@ Alternatively, with the pipeline abstraction:
 import hf_olmo

 from transformers import pipeline
- olmo_pipe = pipeline("text-generation", model="allenai/OLMo-7B-v1.7")
+ olmo_pipe = pipeline("text-generation", model="allenai/OLMo-1.7-7B")
 print(olmo_pipe("Language modeling is "))
 >> 'Language modeling is a branch of natural language processing that aims to...'
 ```

- Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-v1.7", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
+ Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
 The quantized model is more sensitive to input dtypes and CUDA placement, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
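
Putting those two notes together, a minimal end-to-end sketch of quantized generation (assuming a CUDA device and `bitsandbytes` installed; the generation settings here are illustrative, not prescribed by the model card):
```python
import hf_olmo  # registers the OLMo architecture with transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 8-bit quantized load, as described above (requires bitsandbytes)
olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-1.7-7B", torch_dtype=torch.float16, load_in_8bit=True
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1.7-7B")

inputs = tokenizer(["Language modeling is "], return_tensors="pt", return_token_type_ids=False)
# Pass input_ids explicitly on CUDA, as recommended above for the quantized model.
response = olmo.generate(input_ids=inputs.input_ids.to("cuda"), max_new_tokens=100)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```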
 
 Note, you may see the following error if `ai2-olmo` is not installed correctly; it is caused by an internal Python naming check. We'll update the code soon to make this error clearer.
@@ -144,24 +148,23 @@ For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?

 <!-- This section describes the evaluation protocols and provides the results. -->

- Core model results for the 7B model are found below.
-
- | | [Llama 7B](https://arxiv.org/abs/2302.13971) | [Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b) | [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b) | [MPT 7B](https://huggingface.co/mosaicml/mpt-7b) | **OLMo 7B** (ours) |
- | --------------------------------- | -------- | ---------- | --------- | ------ | ------- |
- | arc_challenge | 44.5 | 39.8 | 47.5 | 46.5 | 48.5 |
- | arc_easy | 57.0 | 57.7 | 70.4 | 70.5 | 65.4 |
- | boolq | 73.1 | 73.5 | 74.6 | 74.2 | 73.4 |
- | copa | 85.0 | 87.0 | 86.0 | 85.0 | 90 |
- | hellaswag | 74.5 | 74.5 | 75.9 | 77.6 | 76.4 |
- | openbookqa | 49.8 | 48.4 | 53.0 | 48.6 | 50.2 |
- | piqa | 76.3 | 76.4 | 78.5 | 77.3 | 78.4 |
- | sciq | 89.5 | 90.8 | 93.9 | 93.7 | 93.8 |
- | winogrande | 68.2 | 67.3 | 68.9 | 69.9 | 67.9 |
- | **Core tasks average** | 68.7 | 68.4 | 72.1 | 71.5 | 71.6 |
- | truthfulQA (MC2) | 33.9 | 38.5 | 34.0 | 33 | 36.0 |
- | MMLU (5 shot MC) | 31.5 | 45.0 | 24.0 | 30.8 | 28.3 |
- | GSM8k (mixed eval.) | 10.0 (8shot CoT) | 12.0 (8shot CoT) | 4.0 (5 shot) | 4.5 (5 shot) | 8.5 (8shot CoT) |
- | **Full average** | 57.8 | 59.3 | 59.2 | 59.3 | 59.8 |
+ Core model results for the new and original 7B models are found below.
+
+ | Task | Llama 7B | Llama 2 7B | Falcon 7B | MPT 7B | OLMo 7B | Llama 2 13B | **OLMo 1.7-7B** |
+ |-------------------|----------|------------|-----------|--------|---------|-------------|-----------------|
+ | arc_c | 44.5 | 48.5 | 47.5 | 46.5 | 48.5 | 52.8 | 42.5 |
+ | arc_e | 67.9 | 69.5 | 70.4 | 70.5 | 65.4 | 73.7 | 67.2 |
+ | boolq | 75.4 | 80.2 | 74.6 | 74.2 | 73.4 | 82.2 | 83.7 |
+ | copa | 91.0 | 86.0 | 86.0 | 85.0 | 90.0 | 90.0 | 86.0 |
+ | hellaswag | 76.2 | 76.8 | 75.9 | 77.6 | 76.4 | 78.6 | 75.5 |
+ | openbookqa | 51.2 | 48.4 | 53.0 | 48.6 | 50.4 | 51.8 | 50.0 |
+ | piqa | 77.2 | 76.7 | 78.5 | 77.3 | 78.4 | 79.0 | 77.5 |
+ | sciq | 93.9 | 94.5 | 93.9 | 93.7 | 93.8 | 95.5 | 96.7 |
+ | winogrande | 70.5 | 69.4 | 68.9 | 69.9 | 67.9 | 73.5 | 69.8 |
+ | truthfulQA (MC2) | 33.9 | 38.5 | 34.0 | 33.0 | 36.0 | 36.8 | 35.8 |
+ | MMLU (5 shot MC) | 31.5 | 45.0 | 24.0 | 30.8 | 28.3 | 55.5 | 52.0 |
+ | GSM8k | 10.0 | 12.0 | 4.0 | 4.5 | 8.5 | 25.0 | 29.0 |
+ | **Full average** | 60.3 | 62.1 | 59.2 | 59.3 | 59.8 | 66.2 | 63.8 |

 And for the 1B model:
 
 
170