dwb2023/paligemma-cnmc-ft

Generated from Trainer


dwb2023 committed on Jul 2

Commit 9a04935 • 1 Parent(s): 57a7ebf


Files changed (2)

README.md +18 -12
adapter_model.safetensors +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3006
 ## Model description
@@ -43,21 +43,27 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 50
 - num_epochs: 100
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| No log        | 0.9645 | 17   | 1.2278          |
-| No log        | 1.9858 | 35   | 0.4162          |
-| 0.8676        | 2.9504 | 52   | 0.3132          |
-| 0.8676        | 3.9716 | 70   | 0.2602          |
-| 0.8676        | 4.9929 | 88   | 0.2446          |
-| 0.2526        | 5.9574 | 105  | 0.2100          |
-| 0.2526        | 6.9787 | 123  | 0.1986          |
-| 0.2526        | 8.0    | 141  | 0.3006          |
 ### Framework versions

 This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1871
 ## Model description
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 170
 - num_epochs: 100
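
The change from 50 to 170 warmup steps is easier to read with the schedule written out. The following is a minimal sketch of a linear schedule with warmup as it is commonly implemented (ramp from 0 up to the base learning rate, then linear decay to 0); whether the Trainer run behind this commit used exactly this form is an assumption. The base learning rate is not visible in this diff, so the function returns only a multiplier, and `TOTAL_STEPS` is a hypothetical value chosen for illustration.

```python
def linear_warmup_multiplier(step, warmup_steps, total_steps):
    """Factor applied to the base learning rate at a given optimizer step.

    Assumed form: linear ramp 0 -> 1 over `warmup_steps`, then linear
    decay 1 -> 0 until `total_steps`.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Hypothetical total number of optimizer steps; the real value is not in the diff.
TOTAL_STEPS = 1700

# Step 141 is the last logged step of the previous run (warmup 50).
for warmup in (50, 170):
    factor = linear_warmup_multiplier(141, warmup, TOTAL_STEPS)
    print(f"warmup_steps={warmup}: multiplier at step 141 = {factor:.3f}")
```

Under these assumptions, a run with 170 warmup steps is still ramping up at step 141, while a run with 50 warmup steps has already passed its peak learning rate by then.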
 ### Training results
+| Training Loss | Epoch   | Step | Validation Loss |
+|:-------------:|:-------:|:----:|:---------------:|
+| No log        | 0.9645  | 17   | 1.4372          |
+| No log        | 1.9858  | 35   | 1.1505          |
+| 1.1745        | 2.9504  | 52   | 0.6419          |
+| 1.1745        | 3.9716  | 70   | 0.3542          |
+| 1.1745        | 4.9929  | 88   | 0.3141          |
+| 0.3853        | 5.9574  | 105  | 0.2847          |
+| 0.3853        | 6.9787  | 123  | 0.2544          |
+| 0.3853        | 8.0     | 141  | 0.2498          |
+| 0.2598        | 8.9645  | 158  | 0.2074          |
+| 0.2598        | 9.9858  | 176  | 0.1840          |
+| 0.2598        | 10.9504 | 193  | 0.1656          |
+| 0.1867        | 11.9716 | 211  | 0.1665          |
+| 0.1867        | 12.9929 | 229  | 0.1719          |
+| 0.1867        | 13.9574 | 246  | 0.1871          |
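
The headline `Loss` in each version of the card matches the last row of its table, not the lowest validation loss. A short sketch that makes this explicit, using only the (step, validation loss) pairs from the two tables above; the variable names are illustrative, and nothing here says which checkpoint the committed adapter weights correspond to.

```python
# (step, validation loss) pairs copied from the removed and added tables.
old_run = [
    (17, 1.2278), (35, 0.4162), (52, 0.3132), (70, 0.2602),
    (88, 0.2446), (105, 0.2100), (123, 0.1986), (141, 0.3006),
]
new_run = [
    (17, 1.4372), (35, 1.1505), (52, 0.6419), (70, 0.3542),
    (88, 0.3141), (105, 0.2847), (123, 0.2544), (141, 0.2498),
    (158, 0.2074), (176, 0.1840), (193, 0.1656), (211, 0.1665),
    (229, 0.1719), (246, 0.1871),
]


def summarize(run):
    """Return the best (lowest-loss) and final evaluation entries of a run."""
    best = min(run, key=lambda entry: entry[1])
    return {"best": best, "final": run[-1]}


for name, run in (("old", old_run), ("new", new_run)):
    summary = summarize(run)
    print(name, "best:", summary["best"], "final:", summary["final"])
```

In the new run the validation loss bottoms out at step 193 (0.1656) and rises over the next three evaluations to the reported 0.1871.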
 ### Framework versions

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8289c9cac4199f2c4bc879c4571f40e24c1d2b88194b1f6170ef6ee9f11a4581
 size 45258384

 version https://git-lfs.github.com/spec/v1
+oid sha256:08b44f175e8ca9078cb6d4a441dea52782c15331f50d3f8e49f61fbab9ae1fde
 size 45258384
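
Only the `oid` line changes between the two pointers; the size stays at 45258384 bytes. As a minimal sketch, a downloaded file can be checked against a Git LFS pointer by comparing its byte size and SHA-256 digest. The helper names below are hypothetical and not part of any library, and the local path in the usage note is a placeholder.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer, chunk_size=1 << 20):
    """Check a local file's size and digest against a parsed pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer content as added by this commit.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:08b44f175e8ca9078cb6d4a441dea52782c15331f50d3f8e49f61fbab9ae1fde
size 45258384
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the weights, `file_matches_pointer("adapter_model.safetensors", pointer)` would report whether that copy is the one this commit points to.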