Running it on CPU using pretrained weights
#35 by himanshuyadav62
from transformers import AutoTokenizer, AutoModelForCausalLM
Can we use this to run the model only on CPU?
Yes, you can run the smaller Gemma models on CPU. Just make sure not to set `device_map` to a GPU explicitly, so the model loads on CPU. You can also use a quantized version of the model to reduce memory usage. Please have a look at the gist for reference, where I run the Gemma2-2b-it model on CPU only in Google Colab.
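For illustration, here is a minimal CPU-only sketch along those lines, assuming the `google/gemma-2-2b-it` checkpoint (a gated repo, so you need to accept its license on the Hub first) and that `accelerate` is installed for the `device_map` argument; if it is not, simply omit `device_map` and the model still defaults to CPU when no GPU is visible:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed checkpoint from this thread; gated on the Hub, accept the license first.
model_id = "google/gemma-2-2b-it"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cpu",           # pin to CPU explicitly (needs accelerate; or omit entirely)
    torch_dtype=torch.float32,  # full precision is the safe default on CPU
)

inputs = tokenizer("Why is the sky blue?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Expect generation to be noticeably slower than on GPU; for the 2B model it is still usable on a typical Colab CPU runtime.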