---
license: gemma
library_name: peft
tags:
  - generated_from_trainer
base_model: google/paligemma-3b-pt-224
model-index:
  - name: paligemma_vqav2
    results: []
---

# paligemma_vqav2

This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on the [Magneto/caption_for_mars_image_512_qa_format](https://huggingface.co/datasets/Magneto/caption_for_mars_image_512_qa_format) dataset. It achieves the following results on the evaluation set:

- Loss: 1.0115
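
The adapter can be loaded on top of the base checkpoint with `peft`. A minimal inference sketch, assuming the adapter is published on the Hub as `Magneto/paligemma_vqav2` (the image file name and question are illustrative, and the `answer en` prompt prefix follows the base model's VQA convention):

```python
from PIL import Image
import torch
from peft import PeftModel
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

# Load the frozen base model and its processor.
base = PaliGemmaForConditionalGeneration.from_pretrained("google/paligemma-3b-pt-224")
processor = AutoProcessor.from_pretrained("google/paligemma-3b-pt-224")

# Attach the fine-tuned adapter; the repo id is an assumption.
model = PeftModel.from_pretrained(base, "Magneto/paligemma_vqav2")
model.eval()

# Ask a question about an image (path and question are placeholders).
image = Image.open("mars_surface.jpg").convert("RGB")
inputs = processor(text="answer en What is in the image?", images=image, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(processor.decode(out[0], skip_special_tokens=True))
```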

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 1
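
These map one-to-one onto `transformers.TrainingArguments`; a sketch of the equivalent configuration (`output_dir` is an assumption, everything else mirrors the list above):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="paligemma_vqav2",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    adam_beta1=0.9,                # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=1,
)
```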

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.6221 | 0.0114 | 50 | 2.4036 |
| 2.244 | 0.0228 | 100 | 2.0980 |
| 2.0078 | 0.0343 | 150 | 1.8881 |
| 1.8561 | 0.0457 | 200 | 1.7707 |
| 1.6108 | 0.0571 | 250 | 1.6833 |
| 1.5712 | 0.0685 | 300 | 1.6297 |
| 1.6298 | 0.0800 | 350 | 1.5834 |
| 1.469 | 0.0914 | 400 | 1.5454 |
| 1.4758 | 0.1028 | 450 | 1.5210 |
| 1.5303 | 0.1142 | 500 | 1.4936 |
| 1.3559 | 0.1257 | 550 | 1.4793 |
| 1.4407 | 0.1371 | 600 | 1.4596 |
| 1.4655 | 0.1485 | 650 | 1.4360 |
| 1.4213 | 0.1599 | 700 | 1.4223 |
| 1.3744 | 0.1714 | 750 | 1.4022 |
| 1.4285 | 0.1828 | 800 | 1.3906 |
| 1.2105 | 0.1942 | 850 | 1.3790 |
| 1.3653 | 0.2056 | 900 | 1.3687 |
| 1.337 | 0.2170 | 950 | 1.3602 |
| 1.1845 | 0.2285 | 1000 | 1.3509 |
| 1.3404 | 0.2399 | 1050 | 1.3384 |
| 1.2957 | 0.2513 | 1100 | 1.3278 |
| 1.2107 | 0.2627 | 1150 | 1.3176 |
| 1.4208 | 0.2742 | 1200 | 1.3132 |
| 1.2522 | 0.2856 | 1250 | 1.3032 |
| 1.2735 | 0.2970 | 1300 | 1.2992 |
| 1.3567 | 0.3084 | 1350 | 1.2854 |
| 1.0994 | 0.3199 | 1400 | 1.2805 |
| 1.2496 | 0.3313 | 1450 | 1.2710 |
| 1.1944 | 0.3427 | 1500 | 1.2660 |
| 1.3303 | 0.3541 | 1550 | 1.2610 |
| 1.2942 | 0.3655 | 1600 | 1.2524 |
| 1.2187 | 0.3770 | 1650 | 1.2458 |
| 1.2071 | 0.3884 | 1700 | 1.2395 |
| 1.1734 | 0.3998 | 1750 | 1.2356 |
| 1.182 | 0.4112 | 1800 | 1.2301 |
| 1.2104 | 0.4227 | 1850 | 1.2302 |
| 1.1961 | 0.4341 | 1900 | 1.2258 |
| 1.1749 | 0.4455 | 1950 | 1.2244 |
| 1.1283 | 0.4569 | 2000 | 1.2189 |
| 1.095 | 0.4684 | 2050 | 1.2174 |
| 1.1376 | 0.4798 | 2100 | 1.2172 |
| 1.0772 | 0.4912 | 2150 | 1.2137 |
| 1.255 | 0.5026 | 2200 | 1.2111 |
| 1.1682 | 0.5141 | 2250 | 1.2076 |
| 1.1455 | 0.5255 | 2300 | 1.2052 |
| 1.151 | 0.5369 | 2350 | 1.2034 |
| 0.9805 | 0.5483 | 2400 | 1.2007 |
| 1.1706 | 0.5597 | 2450 | 1.1985 |
| 1.1961 | 0.5712 | 2500 | 1.1960 |
| 1.0449 | 0.5826 | 2550 | 1.1937 |
| 1.1375 | 0.5940 | 2600 | 1.1908 |
| 1.1205 | 0.6054 | 2650 | 1.1896 |
| 1.2097 | 0.6169 | 2700 | 1.1908 |
| 1.1976 | 0.6283 | 2750 | 1.1856 |
| 1.1327 | 0.6397 | 2800 | 1.0918 |
| 1.0446 | 0.6511 | 2850 | 1.0929 |
| 1.0804 | 0.6626 | 2900 | 1.0878 |
| 0.9446 | 0.6740 | 2950 | 1.0871 |
| 1.0722 | 0.6854 | 3000 | 1.0851 |
| 1.1224 | 0.6968 | 3050 | 1.0865 |
| 1.2711 | 0.7082 | 3100 | 1.0826 |
| 1.0378 | 0.7197 | 3150 | 1.0835 |
| 1.0873 | 0.7311 | 3200 | 1.0823 |
| 1.1336 | 0.7425 | 3250 | 1.0815 |
| 1.1407 | 0.7539 | 3300 | 1.0782 |
| 1.0805 | 0.7654 | 3350 | 1.0786 |
| 1.2204 | 0.7768 | 3400 | 1.0773 |
| 1.0855 | 0.7882 | 3450 | 1.1838 |
| 1.1151 | 0.7996 | 3500 | 1.1843 |
| 1.01 | 0.8111 | 3550 | 1.1815 |
| 1.1389 | 0.8225 | 3600 | 1.1828 |
| 1.0964 | 0.8339 | 3650 | 1.1802 |
| 0.9706 | 0.8453 | 3700 | 1.1803 |
| 1.0022 | 0.8568 | 3750 | 1.1764 |
| 1.0751 | 0.8682 | 3800 | 1.1764 |
| 0.9681 | 0.8796 | 3850 | 1.1764 |
| 1.101 | 0.8910 | 3900 | 1.1740 |
| 1.0931 | 0.9024 | 3950 | 1.1730 |
| 1.0791 | 0.9139 | 4000 | 1.1721 |
| 1.1654 | 0.9253 | 4050 | 1.1711 |
| 1.0536 | 0.9367 | 4100 | 1.1669 |
| 1.1077 | 0.9481 | 4150 | 1.1691 |
| 1.1421 | 0.9596 | 4200 | 1.1674 |
| 1.1065 | 0.9710 | 4250 | 1.1684 |
| 1.1226 | 0.9824 | 4300 | 1.1670 |
| 1.1432 | 0.9938 | 4350 | 1.1641 |
| 1.1632 | 1.0053 | 4400 | 1.1614 |
| 0.9927 | 1.0167 | 4450 | 1.1600 |
| 0.9685 | 1.0281 | 4500 | 1.1559 |
| 1.1403 | 1.0395 | 4550 | 1.1563 |
| 1.1059 | 1.0509 | 4600 | 1.1546 |
| 1.071 | 1.0624 | 4650 | 1.1544 |
| 1.0969 | 1.0738 | 4700 | 1.1537 |
| 1.0136 | 1.0852 | 4750 | 1.1521 |
| 1.0297 | 1.0966 | 4800 | 1.1519 |
| 1.1304 | 1.1081 | 4850 | 1.1508 |
| 1.2172 | 1.1195 | 4900 | 1.1517 |
| 1.0156 | 1.1309 | 4950 | 1.1511 |
| 1.0726 | 1.1423 | 5000 | 1.1483 |
| 1.0272 | 1.1538 | 5050 | 1.0159 |
| 1.1042 | 1.1652 | 5100 | 1.0153 |
| 1.0118 | 1.1766 | 5150 | 1.0127 |
| 1.1269 | 1.1880 | 5200 | 1.0148 |
| 1.0389 | 1.1995 | 5250 | 1.0152 |
| 1.1804 | 1.2109 | 5300 | 1.0154 |
| 1.1138 | 1.2223 | 5350 | 1.0153 |
| 1.0319 | 1.2337 | 5400 | 1.0144 |
| 1.0 | 1.2451 | 5450 | 1.0153 |
| 1.1573 | 1.2566 | 5500 | 1.0152 |
| 1.0604 | 1.2680 | 5550 | 1.0126 |
| 1.081 | 1.2794 | 5600 | 1.0118 |
| 0.988 | 1.2908 | 5650 | 1.0126 |
| 1.1302 | 1.3023 | 5700 | 1.0119 |
| 1.0626 | 1.3137 | 5750 | 1.0129 |
| 1.051 | 1.3251 | 5800 | 1.0100 |
| 1.0849 | 1.3365 | 5850 | 1.0094 |
| 1.0739 | 1.3480 | 5900 | 1.0090 |
| 1.0457 | 1.3594 | 5950 | 1.0074 |
| 1.0924 | 1.3708 | 6000 | 1.0090 |
| 0.9545 | 1.3822 | 6050 | 1.0084 |
| 1.0727 | 1.3936 | 6100 | 1.0076 |
| 1.1274 | 1.4051 | 6150 | 1.0075 |
| 1.0515 | 1.4165 | 6200 | 1.0066 |
| 0.9465 | 1.4279 | 6250 | 1.0057 |
| 1.029 | 1.4393 | 6300 | 1.0062 |
| 1.0454 | 1.4508 | 6350 | 1.0058 |
| 0.9563 | 1.4622 | 6400 | 1.0053 |
| 1.1052 | 1.4736 | 6450 | 1.0049 |
| 0.9351 | 1.4850 | 6500 | 1.0059 |
| 1.0649 | 1.4965 | 6550 | 1.0048 |
| 1.0206 | 1.5079 | 6600 | 1.0039 |
| 1.0616 | 1.5193 | 6650 | 1.0032 |
| 1.1544 | 1.5307 | 6700 | 1.0047 |
| 1.012 | 1.5422 | 6750 | 1.0199 |
| 1.0374 | 1.5536 | 6800 | 1.0177 |
| 1.1414 | 1.5650 | 6850 | 1.0174 |
| 0.8807 | 1.5764 | 6900 | 1.0177 |
| 1.0647 | 1.5878 | 6950 | 1.0156 |
| 1.023 | 1.5993 | 7000 | 1.0173 |
| 1.0109 | 1.6107 | 7050 | 1.0156 |
| 1.005 | 1.6221 | 7100 | 1.0163 |
| 1.0047 | 1.6335 | 7150 | 1.0163 |
| 1.0304 | 1.6450 | 7200 | 1.0158 |
| 0.9394 | 1.6564 | 7250 | 1.0158 |
| 1.0 | 1.6678 | 7300 | 1.0150 |
| 1.0296 | 1.6792 | 7350 | 1.0148 |
| 1.0314 | 1.6907 | 7400 | 1.0152 |
| 0.9902 | 1.7021 | 7450 | 1.0148 |
| 1.0266 | 1.7135 | 7500 | 1.0159 |
| 1.1017 | 1.7249 | 7550 | 1.0152 |
| 1.0706 | 1.7363 | 7600 | 1.0150 |
| 0.9999 | 1.7478 | 7650 | 1.0149 |
| 0.9819 | 1.7592 | 7700 | 1.0138 |
| 1.0049 | 1.7706 | 7750 | 1.0137 |
| 1.0488 | 1.7820 | 7800 | 1.0131 |
| 1.1126 | 1.7935 | 7850 | 1.0140 |
| 1.0583 | 1.8049 | 7900 | 1.0141 |
| 1.075 | 1.8163 | 7950 | 1.0126 |
| 1.1158 | 1.8277 | 8000 | 1.0117 |
| 1.0319 | 1.8392 | 8050 | 1.0128 |
| 1.0514 | 1.8506 | 8100 | 1.0128 |
| 1.1144 | 1.8620 | 8150 | 1.0119 |
| 0.983 | 1.8734 | 8200 | 1.0119 |
| 1.1242 | 1.8849 | 8250 | 1.0126 |
| 1.1011 | 1.8963 | 8300 | 1.0123 |
| 0.9533 | 1.9077 | 8350 | 1.0127 |
| 1.0661 | 1.9191 | 8400 | 1.0118 |
| 1.0133 | 1.9305 | 8450 | 1.0117 |
| 1.0856 | 1.9420 | 8500 | 1.0118 |
| 1.1292 | 1.9534 | 8550 | 1.0117 |
| 0.9881 | 1.9648 | 8600 | 1.0118 |
| 0.9716 | 1.9762 | 8650 | 1.0121 |
| 1.0925 | 1.9877 | 8700 | 1.0117 |
| 1.0235 | 1.9991 | 8750 | 1.0115 |

### Framework versions

- PEFT 0.11.1
- Transformers 4.42.0.dev0
- Pytorch 2.3.0+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1