How to use this model
I guess `transformers` has not been updated yet. Mamba is a brand-new deep learning architecture, parallel to RNNs, LSTMs, Transformers, and so on. There is nothing like a `modeling.py` in this repository, so I guess you need to wait for a repository update or for `transformers` to be upgraded.
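As a quick sanity check, you can probe whether your installed `transformers` version already ships a native Mamba implementation. The class name `MambaForCausalLM` here is the one `transformers` eventually added; this is just an illustrative snippet, adjust the name if your version differs:

```python
import importlib.util


def has_native_mamba_support() -> bool:
    """Return True if the installed transformers version exposes a
    Mamba causal-LM class, False otherwise (including when the
    transformers package is not installed at all)."""
    if importlib.util.find_spec("transformers") is None:
        return False
    try:
        # Added to transformers in a later release; raises ImportError
        # on older versions that predate Mamba support.
        from transformers import MambaForCausalLM  # noqa: F401
        return True
    except ImportError:
        return False


print(has_native_mamba_support())
```

If this prints `False`, you are limited to the `mamba_ssm` package for now.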
How to fine-tune this model?
There is nothing like a `modeling.py` in the repository, so you need to wait for a repository update or for `transformers` to be upgraded. Or you can take their simple generator and base your usage on it.
Here's a simple working example, assuming `mamba_ssm` is installed and the model weights lie in `~/models`:
```python
import os

import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# Mamba ships no tokenizer of its own; it reuses the GPT-NeoX tokenizer.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

model = MambaLMHeadModel.from_pretrained(
    os.path.expanduser("~/models/state-spaces_mamba-2.8b/"),
    device="cuda",
    dtype=torch.bfloat16,
)

tokens = tokenizer("Once upon a time, a cat named", return_tensors="pt")
input_ids = tokens.input_ids.to(device="cuda")
max_length = input_ids.shape[1] + 80  # generate up to 80 new tokens

out = model.generate(
    input_ids=input_ids,
    max_length=max_length,
    cg=True,  # capture the decoding step with CUDA graphs for speed
    return_dict_in_generate=True,
    output_scores=True,
    enable_timing=False,
    temperature=0.9,
    top_k=40,
    top_p=0.9,
)
print(tokenizer.decode(out.sequences[0]))
```
Sample output:

> Once upon a time, a cat named Puss-in-Boots was running around town. And when Puss-in-Boots ran, he left little pawprints. And when Puss-in-Boots climbed, he left little pawprints. And when Puss-in-Boots fell, he left little pawprints. And, when Puss-in-Boots was sleeping, the cats in town
You can also see an example here: https://huggingface.co/spaces/reach-vb/mamba/tree/main
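On the fine-tuning question: since `MambaLMHeadModel` is a regular PyTorch module whose forward pass produces next-token logits, one option is a plain causal-LM training loop. Below is a minimal sketch; the `TinyLM` stand-in is purely illustrative so the loop runs anywhere — in practice you would load `MambaLMHeadModel.from_pretrained(...)` instead (note its forward returns an object carrying `.logits` rather than a raw tensor) and feed batches from your real tokenized dataset:

```python
import torch
import torch.nn as nn

# Illustrative stand-in with the same [batch, seq_len, vocab_size] logit
# shape; swap in MambaLMHeadModel.from_pretrained(...) for real use.
class TinyLM(nn.Module):
    def __init__(self, vocab_size=128, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, input_ids):
        return self.head(self.embed(input_ids))


model = TinyLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Toy batch of token ids; replace with your tokenized training data.
batch = torch.randint(0, 128, (2, 16))

model.train()
for step in range(3):
    logits = model(batch[:, :-1])  # predict each next token
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # [batch*seq, vocab]
        batch[:, 1:].reshape(-1),             # targets shifted by one
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss {loss.item():.3f}")
```

For the real 2.8B model you would also want bf16, gradient checkpointing, and a proper data loader, but the shift-by-one cross-entropy objective stays the same.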