I can't load this model on an L4 GPU:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Keep the checkpoint ID in its own variable so it isn't shadowed by the model object
model_id = "google/recurrentgemma-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
ValueError: The checkpoint you are trying to load has model type `recurrent_gemma` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
Not in a Transformers release yet; support is being added in this PR: https://github.com/huggingface/transformers/pull/30143
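Once that PR is merged, you can install Transformers from source to pick it up before the next release (a standard pip-from-git install, nothing specific to this model):

pip install git+https://github.com/huggingface/transformers.git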
It's merged! Transformers now supports RecurrentGemma: https://github.com/huggingface/transformers/pull/30143
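To pick it up, upgrade Transformers and check your installed version; a minimal check, assuming the merged PR ships in the next release (I believe that's v4.40.0, but verify against the release notes):

pip install -U transformers

import transformers
# RecurrentGemma requires a version that includes PR #30143
print(transformers.__version__)

After that, the snippet above should run as-is; the 2B checkpoint fits comfortably in the L4's 24 GB of VRAM.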