---
library_name: peft
license: gemma
base_model: google/paligemma-3b-pt-224
tags:
  - generated_from_trainer
model-index:
  - name: paligemma_vqav2
    results: []
---

# paligemma_vqav2

This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.2568

## Model description

More information needed

## Intended uses & limitations

More information needed
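
No usage guidance is documented here, but a minimal inference sketch for loading this PEFT adapter on top of PaliGemma might look like the following. The adapter id `seyviour/paligemma_VQAv2_enel645`, the example image path, and the `answer en ...` prompt prefix are assumptions, not facts documented by this card.

```python
# Minimal sketch, assuming the adapter is hosted at seyviour/paligemma_VQAv2_enel645
# and applies on top of google/paligemma-3b-pt-224 (the stated base model).
from PIL import Image
from peft import PeftModel
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

BASE_ID = "google/paligemma-3b-pt-224"
ADAPTER_ID = "seyviour/paligemma_VQAv2_enel645"  # assumed repository id

processor = AutoProcessor.from_pretrained(BASE_ID)
model = PaliGemmaForConditionalGeneration.from_pretrained(BASE_ID)
model = PeftModel.from_pretrained(model, ADAPTER_ID)  # attach the PEFT adapter
model.eval()

image = Image.open("example.jpg")             # replace with your own image
prompt = "answer en What is in the picture?"  # PaliGemma-style VQA prompt (assumed)

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=20)
# Strip the prompt tokens before decoding so only the answer is printed.
answer = generated[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(answer, skip_special_tokens=True))
```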

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: AdamW (`adamw_hf`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 10
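
These settings map naturally onto `transformers.TrainingArguments`. The sketch below is illustrative only, assuming a standard `Trainer` setup; the training script itself is not included in this card, and `output_dir` is a placeholder.

```python
# Illustrative TrainingArguments matching the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="paligemma_vqav2",    # placeholder path
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,   # 8 x 4 = 32 effective train batch size
    seed=42,
    optim="adamw_hf",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
)
```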

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 6.2845        | 0.3137 | 100  | 1.7300          |
| 0.9773        | 0.6275 | 200  | 0.6286          |
| 0.5181        | 0.9412 | 300  | 0.4802          |
| 0.4126        | 1.2549 | 400  | 0.4208          |
| 0.365         | 1.5686 | 500  | 0.3774          |
| 0.3369        | 1.8824 | 600  | 0.3645          |
| 0.3037        | 2.1961 | 700  | 0.3299          |
| 0.2854        | 2.5098 | 800  | 0.3164          |
| 0.2939        | 2.8235 | 900  | 0.3093          |
| 0.2547        | 3.1373 | 1000 | 0.2961          |
| 0.2275        | 3.4510 | 1100 | 0.2943          |
| 0.2456        | 3.7647 | 1200 | 0.2824          |
| 0.2368        | 4.0784 | 1300 | 0.2723          |
| 0.2148        | 4.3922 | 1400 | 0.2733          |
| 0.2118        | 4.7059 | 1500 | 0.2737          |
| 0.1991        | 5.0196 | 1600 | 0.2715          |
| 0.1879        | 5.3333 | 1700 | 0.2657          |
| 0.1841        | 5.6471 | 1800 | 0.2746          |
| 0.1912        | 5.9608 | 1900 | 0.2642          |
| 0.1509        | 6.2745 | 2000 | 0.2964          |
| 0.1818        | 6.5882 | 2100 | 0.2607          |
| 0.1736        | 6.9020 | 2200 | 0.2644          |
| 0.1618        | 7.2157 | 2300 | 0.2663          |
| 0.1563        | 7.5294 | 2400 | 0.2637          |
| 0.159         | 7.8431 | 2500 | 0.2561          |
| 0.1488        | 8.1569 | 2600 | 0.2554          |
| 0.1417        | 8.4706 | 2700 | 0.2589          |
| 0.1329        | 8.7843 | 2800 | 0.2599          |
| 0.1455        | 9.0980 | 2900 | 0.2589          |
| 0.1521        | 9.4118 | 3000 | 0.2576          |
| 0.1334        | 9.7255 | 3100 | 0.2568          |

### Framework versions

- PEFT 0.13.2
- Transformers 4.46.2
- Pytorch 2.4.0.post301
- Datasets 3.1.0
- Tokenizers 0.20.3