Update README.md

Zamba-7B-v1-phase1 is a hybrid model between Mamba, a state-space model, and transformers.

### Prerequisites

To download Zamba, clone Zyphra's fork of transformers:
1. `git clone https://github.com/Zyphra/transformers_zamba`
2. `cd transformers_zamba`
3. Install the repository: `pip install -e .`.

In order to run optimized Mamba implementations on a CUDA device, you need to install `mamba-ssm` and `causal-conv1d`:
```bash
pip install mamba-ssm "causal-conv1d>=1.2.0"
```
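
For convenience, the full setup can be run as a single shell session. This only strings together the commands above; nothing new is assumed beyond a CUDA-capable machine for the kernel packages:

```bash
# Clone and install Zyphra's fork of transformers (steps 1-3 above).
git clone https://github.com/Zyphra/transformers_zamba
cd transformers_zamba
pip install -e .

# Optimized Mamba kernels for CUDA devices. The version constraint is quoted
# so the shell does not parse ">=" as an output redirection.
pip install mamba-ssm "causal-conv1d>=1.2.0"
```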

You can run the model without using the optimized Mamba kernels, but it is **not** recommended as it will result in significantly higher latency.

To run on CPU, please specify `use_mamba_kernels=False` when loading the model using `AutoModelForCausalLM.from_pretrained`.
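
As an illustration, a CPU load might look like the following (a minimal sketch; the model id `Zyphra/Zamba-7B-v1-phase1` is assumed from this card's title, not stated in the text above):

```python
from transformers import AutoModelForCausalLM

# Assumed model id, inferred from the card title.
model = AutoModelForCausalLM.from_pretrained(
    "Zyphra/Zamba-7B-v1-phase1",
    use_mamba_kernels=False,  # run the pure-PyTorch Mamba path instead of the CUDA kernels
)
```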

### Inference
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
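import torch

# The snippet is truncated at this point in the diff view. What follows is a
# minimal sketch of typical generation usage, not the original continuation;
# the model id is assumed from this card's title.
tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba-7B-v1-phase1")
model = AutoModelForCausalLM.from_pretrained(
    "Zyphra/Zamba-7B-v1-phase1",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Tokenize a prompt and generate a short continuation.
inputs = tokenizer("A hybrid of Mamba and transformer blocks", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0]))
```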
|