---
library_name: peft
license: gemma
base_model: google/paligemma-3b-pt-224
tags:
- generated_from_trainer
model-index:
- name: paligemma_vqav2
  results: []
---

# paligemma_vqav2

This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2568

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after the framework versions for the equivalent `TrainingArguments`):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: AdamW (`adamw_hf`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 10

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 6.2845        | 0.3137 | 100  | 1.7300          |
| 0.9773        | 0.6275 | 200  | 0.6286          |
| 0.5181        | 0.9412 | 300  | 0.4802          |
| 0.4126        | 1.2549 | 400  | 0.4208          |
| 0.365         | 1.5686 | 500  | 0.3774          |
| 0.3369        | 1.8824 | 600  | 0.3645          |
| 0.3037        | 2.1961 | 700  | 0.3299          |
| 0.2854        | 2.5098 | 800  | 0.3164          |
| 0.2939        | 2.8235 | 900  | 0.3093          |
| 0.2547        | 3.1373 | 1000 | 0.2961          |
| 0.2275        | 3.4510 | 1100 | 0.2943          |
| 0.2456        | 3.7647 | 1200 | 0.2824          |
| 0.2368        | 4.0784 | 1300 | 0.2723          |
| 0.2148        | 4.3922 | 1400 | 0.2733          |
| 0.2118        | 4.7059 | 1500 | 0.2737          |
| 0.1991        | 5.0196 | 1600 | 0.2715          |
| 0.1879        | 5.3333 | 1700 | 0.2657          |
| 0.1841        | 5.6471 | 1800 | 0.2746          |
| 0.1912        | 5.9608 | 1900 | 0.2642          |
| 0.1509        | 6.2745 | 2000 | 0.2964          |
| 0.1818        | 6.5882 | 2100 | 0.2607          |
| 0.1736        | 6.9020 | 2200 | 0.2644          |
| 0.1618        | 7.2157 | 2300 | 0.2663          |
| 0.1563        | 7.5294 | 2400 | 0.2637          |
| 0.159         | 7.8431 | 2500 | 0.2561          |
| 0.1488        | 8.1569 | 2600 | 0.2554          |
| 0.1417        | 8.4706 | 2700 | 0.2589          |
| 0.1329        | 8.7843 | 2800 | 0.2599          |
| 0.1455        | 9.0980 | 2900 | 0.2589          |
| 0.1521        | 9.4118 | 3000 | 0.2576          |
| 0.1334        | 9.7255 | 3100 | 0.2568          |

### Framework versions

- PEFT 0.13.2
- Transformers 4.46.2
- Pytorch 2.4.0.post301
- Datasets 3.1.0
- Tokenizers 0.20.3
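
### Reproducing the hyperparameters

The bullet list above maps one-to-one onto `transformers.TrainingArguments`. The following is a minimal sketch of that mapping; `output_dir` and everything the card does not record (model, datasets, evaluation strategy) are assumptions.

```python
from transformers import TrainingArguments

# Sketch only: reproduces the hyperparameters recorded in this card.
# output_dir is a hypothetical name; the card does not record it.
args = TrainingArguments(
    output_dir="paligemma_vqav2",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,   # effective train batch size: 8 * 4 = 32
    seed=42,
    optim="adamw_hf",                # OptimizerNames.ADAMW_HF
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
)
```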
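
## Example usage

The card does not include inference code. Below is a minimal sketch of attaching this PEFT adapter to the base checkpoint with `peft` and `transformers`. The adapter id, the image path, and the `"answer en ..."` prompt (the convention used in PaliGemma VQA fine-tuning examples) are assumptions, not part of the card.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from peft import PeftModel

base_id = "google/paligemma-3b-pt-224"
adapter_id = "paligemma_vqav2"  # hypothetical: local output dir or Hub repo of this adapter

processor = AutoProcessor.from_pretrained(base_id)
base = PaliGemmaForConditionalGeneration.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter weights
model.eval()

image = Image.open("example.jpg")  # hypothetical input image
prompt = "answer en What is shown in the image?"
inputs = processor(text=prompt, images=image, return_tensors="pt")

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=20)

# Strip the prompt tokens and decode only the generated answer
answer = processor.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)
```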