---
library_name: peft
---

README for Gemma-2-2B-IT Fine-Tuning with LoRA

This project fine-tunes the Gemma-2-2B-IT model with LoRA (Low-Rank Adaptation) on the Wikitext-2 dataset, targeting Question Answering tasks. Training fits into limited GPU memory by freezing the base model's weights and applying low-rank adapters to a small set of projection layers.

Project Overview

  • Model: Gemma-2-2B-IT, a causal language model.
  • Dataset: Wikitext-2 for text generation and causal language modeling.
  • Training Strategy: LoRA adaptation for low-resource fine-tuning.
  • Frameworks: Hugging Face transformers, peft, and datasets (a loading sketch follows this list).
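
The snippet below is a minimal loading sketch: it pulls the base model and tokenizer from the Hub and tokenizes Wikitext-2 for causal language modeling. The model ID, dataset configuration name, and max_length are illustrative assumptions; the actual training script may differ.

    import torch
    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumed Hub ID for the base model (the Gemma models on the Hub require accepting the license).
    model_name = "google/gemma-2-2b-it"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

    # Wikitext-2 for causal language modeling; the config name and max_length are illustrative.
    dataset = load_dataset("wikitext", "wikitext-2-raw-v1")

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])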

Key Features

  • LoRA Configuration (see the configuration sketch after this list):
    • LoRA is applied to the following projection layers: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, and down_proj.
    • LoRA hyperparameters:
      • Rank (r): 4
      • LoRA Alpha: 8
      • Dropout: 0.1
  • Training Configuration (also shown in the sketch after this list):
    • Mixed precision (fp16) enabled for faster, more memory-efficient training.
    • Gradient accumulation over 32 steps to reach a larger effective batch size on a small GPU.
    • Per-device batch size of 1 due to GPU memory constraints.
    • Learning rate of 5e-5 with a weight decay of 0.01.
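
The sketch below shows one way to express the configuration above with peft and transformers. It assumes the model, tokenizer, and tokenized objects from the loading sketch in the Project Overview section; the output directory and epoch count are placeholders not specified in this README.

    from peft import LoraConfig, get_peft_model
    from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

    # LoRA applied to the projection layers listed above, with r=4, alpha=8, dropout=0.1.
    lora_config = LoraConfig(
        r=4,
        lora_alpha=8,
        lora_dropout=0.1,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # base weights stay frozen; only the adapters train

    # Training setup: fp16, gradient accumulation over 32 steps, per-device batch size 1.
    training_args = TrainingArguments(
        output_dir="./gemma2-2b-it-lora",   # placeholder path
        per_device_train_batch_size=1,
        gradient_accumulation_steps=32,
        learning_rate=5e-5,
        weight_decay=0.01,
        fp16=True,
        num_train_epochs=1,                 # epoch count is not specified in this README
        logging_steps=10,
    )

    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=tokenized["train"],
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()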

System Requirements

  • GPU: Required for efficient training; the training script was tested on CUDA-enabled GPUs.
  • Python Packages: Install dependencies with:
    pip install -r requirements.txt
    

Notes

  • LoRA adapts the large Gemma-2-2B-IT model with only a small number of trainable parameters, so fine-tuning is feasible even on hardware with limited memory.
  • The resulting adapter can be loaded on top of the base model for tasks such as Question Answering and is well suited to resource-efficient deployment (see the inference sketch below).
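
As a usage example, the adapter can be loaded back on top of the base model for generation. The sketch below is illustrative only; the adapter path and the prompt are assumptions.

    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base_id = "google/gemma-2-2b-it"        # assumed base model ID
    adapter_dir = "./gemma2-2b-it-lora"     # placeholder; point this at the saved adapter

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base_model, adapter_dir)

    prompt = "Question: What is the capital of France?\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt").to(base_model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0], skip_special_tokens=True))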

Memory Usage

  • The training script prints CUDA memory summaries before and after training to monitor GPU memory consumption (a sketch follows).
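
A minimal sketch of this kind of monitoring using PyTorch's built-in reporting; it assumes the trainer object from the training sketch above.

    import torch

    def report_cuda_memory(tag):
        # Print an abbreviated CUDA memory report so peak usage can be compared across stages.
        if torch.cuda.is_available():
            print(f"--- CUDA memory ({tag}) ---")
            print(torch.cuda.memory_summary(abbreviated=True))

    report_cuda_memory("before training")
    trainer.train()   # trainer comes from the training sketch above
    report_cuda_memory("after training")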