@@ -53,6 +53,100 @@ The models are made available under a non-commercial CC BY-NC 4.0 license. More
 The GALACTICA models are trained on 106 billion tokens of open-access scientific text and data. This includes papers, textbooks, scientific websites, encyclopedias, reference material, knowledge bases, and more. We tokenize different modalities to provide a natural language interface for different tasks. See the README.md for more information. See the paper for full information on the training data.
+## How to use
+The example scripts below show how to use the model with `transformers`:
+## Using the PyTorch model
+### Running the model on a CPU
+<details>
+<summary> Click to expand </summary>
+```python
+from transformers import AutoTokenizer, OPTForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b")
+model = OPTForCausalLM.from_pretrained("facebook/galactica-1.3b")
+input_text = "The Transformer architecture [START_REF]"
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids
+outputs = model.generate(input_ids)
+print(tokenizer.decode(outputs[0]))
+```
+</details>
+### Running the model on a GPU
+<details>
+<summary> Click to expand </summary>
+```python
+# pip install accelerate
+from transformers import AutoTokenizer, OPTForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b")
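+# device_map="auto" lets accelerate place the weights on the available device(s), GPU first
+model = OPTForCausalLM.from_pretrained("facebook/galactica-1.3b", device_map="auto")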
+input_text = "The Transformer architecture [START_REF]"
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+outputs = model.generate(input_ids)
+print(tokenizer.decode(outputs[0]))
+```
+</details>
+### Running the model on a GPU using different precisions
+#### FP16
+<details>
+<summary> Click to expand </summary>
+```python
+# pip install accelerate
+import torch
+from transformers import AutoTokenizer, OPTForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b")
+# torch_dtype=torch.float16 loads the weights in half precision
+model = OPTForCausalLM.from_pretrained("facebook/galactica-1.3b", device_map="auto", torch_dtype=torch.float16)
+input_text = "The Transformer architecture [START_REF]"
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+outputs = model.generate(input_ids)
+print(tokenizer.decode(outputs[0]))
+```
+</details>
+#### INT8
+<details>
+<summary> Click to expand </summary>
+```python
+# pip install bitsandbytes accelerate
+from transformers import AutoTokenizer, OPTForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b")
+# load_in_8bit=True quantizes the weights to 8-bit via bitsandbytes
+model = OPTForCausalLM.from_pretrained("facebook/galactica-1.3b", device_map="auto", load_in_8bit=True)
+input_text = "The Transformer architecture [START_REF]"
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+outputs = model.generate(input_ids)
+print(tokenizer.decode(outputs[0]))
+```
+</details>
 ## Performance and Limitations
 The model outperforms several existing language models on a range of knowledge probes, reasoning, and knowledge-intensive scientific tasks. This also extends to general NLP tasks, where GALACTICA outperforms other open source general language models. That being said, we note a number of limitations in this section.