---
base_model: google/paligemma-3b-pt-224
library_name: peft
license: gemma
tags:
  - generated_from_trainer
model-index:
  - name: paligemma_newslakeandmadvqa_conbime
    results: []
---

paligemma_newslakeandmadvqa_conbime

This model is a fine-tuned version (PEFT adapter) of google/paligemma-3b-pt-224; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 1.0349
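Because this repository holds a PEFT adapter rather than full model weights, it is loaded on top of the base checkpoint. Below is a minimal sketch, assuming the adapter is published as RoyRoyRpy/paligemma_newslakeandmadvqa_conbime (a repo id inferred from the model name, not stated in this card); a local adapter directory works the same way.

```python
# Minimal sketch: load the base PaliGemma checkpoint, attach this PEFT adapter,
# and run one VQA-style generation. The adapter repo id is an assumption inferred
# from the model name; replace it with the actual Hub repo id or a local path.
import torch
from PIL import Image
from transformers import PaliGemmaForConditionalGeneration, PaliGemmaProcessor
from peft import PeftModel

base_model_id = "google/paligemma-3b-pt-224"
adapter_id = "RoyRoyRpy/paligemma_newslakeandmadvqa_conbime"  # assumed repo id
device = "cuda" if torch.cuda.is_available() else "cpu"

processor = PaliGemmaProcessor.from_pretrained(base_model_id)
base_model = PaliGemmaForConditionalGeneration.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(base_model, adapter_id).to(device).eval()

# VQA-style prompt following the PaliGemma "answer en <question>" convention.
image = Image.open("example.jpg").convert("RGB")  # any RGB image
inputs = processor(text="answer en What is shown in the image?", images=image,
                   return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)

# The output contains the prompt tokens; strip them before decoding the answer.
answer = processor.decode(output_ids[0][inputs["input_ids"].shape[-1]:],
                          skip_special_tokens=True)
print(answer)
```

On a GPU you would typically also pass torch_dtype=torch.bfloat16 to from_pretrained; the fp32 default above is used only to keep the sketch hardware-agnostic.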

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 10
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 40
  • optimizer: AdamW (adamw_hf) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 900
  • num_epochs: 3
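As referenced above, here is a rough sketch of how these values could be expressed as transformers TrainingArguments. The output directory and the evaluation/logging cadence (every 100 steps, matching the results table below) are assumptions; the dataset, data collator, and PEFT model setup are not documented in this card and are omitted.

```python
# Sketch of the reported hyperparameters expressed as transformers TrainingArguments.
# output_dir and the eval/logging cadence are assumptions, not stated in the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="paligemma_newslakeandmadvqa_conbime",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # effective train batch size: 10 * 4 = 40
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_steps=900,
    optim="adamw_hf",                # AdamW as listed above
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    eval_strategy="steps",           # assumed: the table below logs eval every 100 steps
    eval_steps=100,
    logging_steps=100,
)
```

These arguments would then be passed to a Trainer together with the PEFT-wrapped model and the (undocumented) training and evaluation datasets.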

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 5.6592        | 0.1768 | 100  | 5.9832          |
| 5.1792        | 0.3535 | 200  | 4.5564          |
| 3.6303        | 0.5303 | 300  | 3.2205          |
| 2.8756        | 0.7070 | 400  | 2.5592          |
| 2.3211        | 0.8838 | 500  | 2.1059          |
| 1.9172        | 1.0605 | 600  | 1.8195          |
| 1.7352        | 1.2373 | 700  | 1.6310          |
| 1.621         | 1.4141 | 800  | 1.4883          |
| 1.4855        | 1.5908 | 900  | 1.3832          |
| 1.4496        | 1.7676 | 1000 | 1.2959          |
| 1.2769        | 1.9443 | 1100 | 1.2350          |
| 1.2199        | 2.1211 | 1200 | 1.1831          |
| 1.2619        | 2.2978 | 1300 | 1.1391          |
| 1.1412        | 2.4746 | 1400 | 1.0943          |
| 1.0869        | 2.6513 | 1500 | 1.0589          |
| 1.123         | 2.8281 | 1600 | 1.0349          |

Framework versions

  • PEFT 0.13.0
  • Transformers 4.46.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0