BerenMillidge committed
Commit 19d07cb • 1 Parent(s): 5095564
Update README.md
README.md CHANGED
@@ -3,7 +3,7 @@ license: apache-2.0
 ---
 # Model Card for Zamba
 
-Zamba-7B-v1 is a hybrid between state-space
+Zamba-7B-v1-phase1 is a hybrid of Mamba, a state-space model, and transformers. It uses a Mamba backbone with a shared transformer layer every 6 blocks, was trained with next-token prediction, and uses the Mistral v0.1 tokenizer. We arrived at this architecture after a series of ablations at small scales. Zamba-7B-v1-phase1 was pre-trained on 1T tokens of text and code sourced from open web datasets. Unlike Zamba-7B-v1, this model is the checkpoint after pure pretraining on web datasets only; we envision it primarily as a comparison point for exploring the effects of our annealing process.
 
 ## Quick start
 
@@ -28,8 +28,8 @@ You can run the model not using the optimized Mamba kernels, but it is **not** recommended
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 
-tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba-7B-v1")
-model = AutoModelForCausalLM.from_pretrained("Zyphra/Zamba-7B-v1", device_map="auto", torch_dtype=torch.bfloat16)
+tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba-7B-v1-phase1")
+model = AutoModelForCausalLM.from_pretrained("Zyphra/Zamba-7B-v1-phase1", device_map="auto", torch_dtype=torch.bfloat16)
 
 input_text = "A funny prompt would be "
 input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
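The quick-start snippet in the updated README stops after tokenizing the prompt. Below is a minimal sketch of the remaining generation step; it assumes a transformers install that supports Zamba and uses only the standard `generate`/`decode` API, and `max_new_tokens=100` is an illustrative choice rather than a value taken from the model card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the phase-1 checkpoint referenced in this commit.
tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba-7B-v1-phase1")
model = AutoModelForCausalLM.from_pretrained(
    "Zyphra/Zamba-7B-v1-phase1",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Tokenize a prompt and move the tensors to the GPU, as in the README snippet.
input_text = "A funny prompt would be "
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

# Generate a continuation; max_new_tokens=100 is an illustrative value.
with torch.no_grad():
    outputs = model.generate(**input_ids, max_new_tokens=100)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

As the context of the second hunk notes, the model can be run without the optimized Mamba kernels, but this is not recommended; the exact installation steps for those kernels should be taken from the model card itself.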