Update README.md
README.md CHANGED
@@ -72,7 +72,7 @@ If you are looking for the finetuned model, please use [DBRX Instruct](https://h
 Getting started with DBRX models is easy with the `transformers` library. The model requires ~264GB of RAM and the following packages:
 
 ```bash
-pip install transformers tiktoken
+pip install "transformers>=4.39.2" "tiktoken>=0.6.0"
 ```
 
 If you'd like to speed up download time, you can use the `hf_transfer` package as described by Huggingface [here](https://huggingface.co/docs/huggingface_hub/en/guides/download#faster-downloads).
@@ -81,13 +81,16 @@ pip install hf_transfer
 export HF_HUB_ENABLE_HF_TRANSFER=1
 ```
 
+You will need to request access to this repository to download the model. Once this is granted,
+[obtain an access token](https://huggingface.co/docs/hub/en/security-tokens) with `read` permission, and supply the token below.
+
 ### Run the model on a CPU:
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 
-tokenizer = AutoTokenizer.from_pretrained("Undi95/dbrx-base", trust_remote_code=True)
-model = AutoModelForCausalLM.from_pretrained("Undi95/dbrx-base", device_map="cpu", torch_dtype=torch.bfloat16, trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained("Undi95/dbrx-base", trust_remote_code=True, token="hf_YOUR_TOKEN")
+model = AutoModelForCausalLM.from_pretrained("Undi95/dbrx-base", device_map="cpu", torch_dtype=torch.bfloat16, trust_remote_code=True, token="hf_YOUR_TOKEN")
 
 input_text = "Databricks was founded in "
 input_ids = tokenizer(input_text, return_tensors="pt")
@@ -101,8 +104,8 @@ print(tokenizer.decode(outputs[0]))
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 
-tokenizer = AutoTokenizer.from_pretrained("Undi95/dbrx-base", trust_remote_code=True)
-model = AutoModelForCausalLM.from_pretrained("Undi95/dbrx-base", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained("Undi95/dbrx-base", trust_remote_code=True, token="hf_YOUR_TOKEN")
+model = AutoModelForCausalLM.from_pretrained("Undi95/dbrx-base", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True, token="hf_YOUR_TOKEN")
 
 input_text = "Databricks was founded in "
 input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
@@ -170,4 +173,4 @@ Full evaluation details can be found in our [technical blog post](https://www.da
 ## Acknowledgements
 The DBRX models were made possible thanks in large part to the open-source community, especially:
 * The [MegaBlocks](https://arxiv.org/abs/2211.15841) library, which established a foundation for our MoE implementation.
-* [PyTorch FSDP](https://arxiv.org/abs/2304.11277), which we built on for distributed training.
+* [PyTorch FSDP](https://arxiv.org/abs/2304.11277), which we built on for distributed training.
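
For anyone trying the change locally: the hunks above truncate the example scripts at the hunk boundaries. A complete, runnable version of the GPU snippet as it stands after this commit would look roughly like the sketch below. The `model.generate` call and the `max_new_tokens=100` value are assumptions filled in from the `print(tokenizer.decode(outputs[0]))` context line; they are not part of the diff.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# The repo is gated: pass a read-scoped access token (see the note added in this commit).
tokenizer = AutoTokenizer.from_pretrained("Undi95/dbrx-base", trust_remote_code=True, token="hf_YOUR_TOKEN")
model = AutoModelForCausalLM.from_pretrained("Undi95/dbrx-base", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True, token="hf_YOUR_TOKEN")

input_text = "Databricks was founded in "
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

# Assumed generation step; max_new_tokens=100 is an illustrative value.
outputs = model.generate(**input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```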