CUDA usage is low

#28
by Max545 - opened

When I train Gemma 2 with LoRA (using the peft library), GPU utilization is low (0% most of the time). When I use the same method to train Llama, GPU utilization stays at around 100%. What could be the reason?

Google org

Hi @Max545,

I ran both models on a single NVIDIA Tesla A100 GPU. When loading models such as google/gemma-2b and meta-llama/Llama-2-7b-hf, if you don't specify a device, the weights are loaded into system RAM and the model runs on the CPU, which is why GPU utilization stays near 0%. If you pass device_map="auto" (or explicitly move the model to "cuda"), the model runs on the GPU and takes advantage of its computational power for much faster processing. Please refer to the following gist for more details: link to gist.
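As a minimal sketch of the contrast (assuming transformers with accelerate installed and a CUDA-capable GPU; google/gemma-2b is just an example checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM

# No device argument: weights load into system RAM and run on the CPU,
# so nvidia-smi shows ~0% GPU utilization during training.
model_cpu = AutoModelForCausalLM.from_pretrained("google/gemma-2b")
print(next(model_cpu.parameters()).device)  # cpu

# device_map="auto" (or an explicit .to("cuda")) places the weights on the GPU.
model_gpu = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b",
    device_map="auto",
    torch_dtype=torch.bfloat16,  # A100 supports bfloat16; roughly halves memory use
)
print(next(model_gpu.parameters()).device)  # cuda:0
```

While training, `nvidia-smi` (or `watch -n 1 nvidia-smi`) is the quickest way to confirm which of the two cases you are in.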

The difference in GPU usage between Gemma 2 and Llama during LoRA fine-tuning can come from several factors:

- Model architecture: Llama's architecture is well supported by optimized GPU kernels, while some Gemma 2 features (such as attention logit soft-capping) can fall back to slower code paths in older library versions.
- Memory bottlenecks: if the model (or part of it) sits in system RAM, or data transfer between CPU and GPU is slow, the GPU idles while waiting for work and utilization drops toward 0%.
- Framework support: Llama has had longer and broader support in the PEFT library and related tooling, which can translate into better GPU utilization than Gemma 2. A quick way to rule out a placement problem is shown in the sketch below.
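To check that the LoRA setup itself is actually on the GPU, here is a minimal sketch (assuming peft and transformers are installed; the target_modules names are the standard attention projections in Gemma and Llama, and google/gemma-2b is an example checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b",
    device_map="auto",          # place the base weights on the GPU up front
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)  # adapters are created on the base model's device

# Sanity checks before training: if either of these points at the CPU,
# the near-0% GPU utilization is a placement problem, not a model problem.
print(next(model.parameters()).device)            # expect cuda:0
print(torch.cuda.memory_allocated() / 1e9, "GB")  # expect > 0 once weights are on the GPU
```

If both checks pass and utilization is still near 0%, the bottleneck is more likely in the data pipeline (for example, tokenization or CPU-to-GPU transfer in the DataLoader) than in the model itself.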

Thank you.
