
Gemma-2b-MultiCap

This model is a fine-tuned version of google/gemma-2-2b-it on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5983

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 600
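
For reference, here is a minimal sketch of how these settings map onto a Hugging Face TrainingArguments object. The output directory is hypothetical, and the model/dataset wiring is omitted since the card does not specify it:

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
# Effective train batch size: 4 (per device) x 8 (accumulation) = 32.
training_args = TrainingArguments(
    output_dir="gemma-2b-multicap",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=8,
    adam_beta1=0.9,    # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    max_steps=600,
)
```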

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.8045        | 0.0564 | 50   | 0.8067          |
| 0.7271        | 0.1128 | 100  | 0.6777          |
| 0.688         | 0.1692 | 150  | 0.6309          |
| 0.6268        | 0.2256 | 200  | 0.6176          |
| 0.572         | 0.2820 | 250  | 0.6118          |
| 0.5864        | 0.3384 | 300  | 0.6065          |
| 0.5528        | 0.3948 | 350  | 0.6030          |
| 0.5396        | 0.4512 | 400  | 0.6015          |
| 0.5726        | 0.5076 | 450  | 0.6005          |
| 0.5655        | 0.5640 | 500  | 0.5997          |
| 0.5712        | 0.6204 | 550  | 0.5988          |
| 0.5213        | 0.6768 | 600  | 0.5983          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.2
  • Pytorch 2.4.0+cu124
  • Datasets 2.21.0
  • Tokenizers 0.19.1
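
Since this repository contains a PEFT adapter rather than full model weights, it is loaded on top of the base model. A minimal usage sketch, assuming the adapter is hosted as sofyc/Gemma-2b-MultiCap (the prompt is illustrative only):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "google/gemma-2-2b-it"
adapter_id = "sofyc/Gemma-2b-MultiCap"

# Load the base model, then attach the fine-tuned adapter weights.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)

inputs = tokenizer("Write a caption for a photo of a sunset:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```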
