Update README.md
README.md CHANGED
@@ -14,13 +14,15 @@ Zamba requires you use `transformers` version 4.39.0 or higher:
 pip install transformers>=4.39.0
 ```
 
-In order to run optimized Mamba implementations, you first need to install `mamba-ssm` and `causal-conv1d`:
+In order to run optimized Mamba implementations on a CUDA device, you first need to install `mamba-ssm` and `causal-conv1d`:
 ```bash
 pip install mamba-ssm causal-conv1d>=1.2.0
 ```
-You also have to have the model on a CUDA device.
 
-You can run the model not using the optimized Mamba kernels, but it is **not** recommended as it will result in significantly higher latency.
+You can run the model not using the optimized Mamba kernels, but it is **not** recommended as it will result in significantly higher latency.
+
+To run on CPU, please specify `use_mamba_kernels=False` when loading the model using ``AutoModelForCausalLM.from_pretrained``.
+
 
 ## Inference
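The added line documents CPU fallback via `use_mamba_kernels=False`. A minimal sketch of what that load call could look like, assuming `transformers>=4.39.0`; the checkpoint id `Zyphra/Zamba-7B-v1` is a placeholder assumption, not stated in this diff:

```python
# Sketch: loading Zamba on CPU with the optimized Mamba kernels disabled.
kwargs = {
    "use_mamba_kernels": False,  # fall back to the pure-PyTorch Mamba path
}

# The actual call is left commented out because it downloads multi-GB weights;
# the repo id is an assumed placeholder -- substitute the real checkpoint name:
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained("Zyphra/Zamba-7B-v1", **kwargs)

print(kwargs["use_mamba_kernels"])  # -> False
```

Without this flag the model class will try to use the `mamba-ssm` / `causal-conv1d` kernels, which require CUDA, so the flag is what makes CPU-only inference possible (at notably higher latency, as the README warns).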