---
license: apache-2.0
base_model: google/flan-t5-small
tags:
- generated_from_trainer
model-index:
- name: flant5-tuned-3
  results: []
---
# flant5-tuned-3
This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.2189
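
A minimal inference sketch is shown below. The repository id is a placeholder; replace it with the path where this checkpoint is actually hosted.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "flant5-tuned-3"  # hypothetical repo id; adjust to the real location

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# FLAN-T5 checkpoints are instruction-tuned seq2seq models, so they are
# queried with a natural-language prompt.
inputs = tokenizer(
    "Summarize: The quick brown fox jumps over the lazy dog.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```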
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 8
- num_epochs: 3
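
The values above translate into a `Seq2SeqTrainingArguments` sketch like the one below. The dataset, tokenizer, and `Seq2SeqTrainer` wiring are omitted, and `output_dir` plus the per-step evaluation cadence are assumptions inferred from the results table that follows.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flant5-tuned-3",  # assumed output directory
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=8,
    num_train_epochs=3,
    # The Adam betas and epsilon listed above are the library defaults.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    # Assumption: the results table logs validation loss at every step.
    evaluation_strategy="steps",
    eval_steps=1,
    logging_steps=1,
)
```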
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.9667 | 0.02 | 1 | 1.6464 |
2.4545 | 0.04 | 2 | 1.6360 |
2.4307 | 0.06 | 3 | 1.6170 |
2.1502 | 0.09 | 4 | 1.5949 |
1.8514 | 0.11 | 5 | 1.5724 |
1.7189 | 0.13 | 6 | 1.5528 |
1.9036 | 0.15 | 7 | 1.5352 |
2.0255 | 0.17 | 8 | 1.5151 |
2.2073 | 0.19 | 9 | 1.4944 |
2.1577 | 0.21 | 10 | 1.4748 |
1.6581 | 0.23 | 11 | 1.4545 |
1.9323 | 0.26 | 12 | 1.4363 |
1.4871 | 0.28 | 13 | 1.4198 |
1.574 | 0.3 | 14 | 1.4032 |
1.7671 | 0.32 | 15 | 1.3898 |
1.567 | 0.34 | 16 | 1.3782 |
1.5162 | 0.36 | 17 | 1.3686 |
1.9622 | 0.38 | 18 | 1.3599 |
1.8378 | 0.4 | 19 | 1.3525 |
1.7199 | 0.43 | 20 | 1.3460 |
1.3917 | 0.45 | 21 | 1.3402 |
1.4417 | 0.47 | 22 | 1.3345 |
1.4023 | 0.49 | 23 | 1.3293 |
1.5427 | 0.51 | 24 | 1.3239 |
1.2344 | 0.53 | 25 | 1.3192 |
2.281 | 0.55 | 26 | 1.3136 |
1.9236 | 0.57 | 27 | 1.3077 |
1.4392 | 0.6 | 28 | 1.3029 |
1.9168 | 0.62 | 29 | 1.2976 |
2.1688 | 0.64 | 30 | 1.2930 |
1.2504 | 0.66 | 31 | 1.2890 |
1.5946 | 0.68 | 32 | 1.2853 |
1.6979 | 0.7 | 33 | 1.2820 |
1.6712 | 0.72 | 34 | 1.2789 |
1.7862 | 0.74 | 35 | 1.2759 |
1.534 | 0.77 | 36 | 1.2734 |
1.6904 | 0.79 | 37 | 1.2712 |
1.6023 | 0.81 | 38 | 1.2692 |
1.6756 | 0.83 | 39 | 1.2667 |
2.0195 | 0.85 | 40 | 1.2640 |
1.2913 | 0.87 | 41 | 1.2618 |
1.1534 | 0.89 | 42 | 1.2607 |
1.5612 | 0.91 | 43 | 1.2597 |
1.3159 | 0.94 | 44 | 1.2586 |
1.6303 | 0.96 | 45 | 1.2582 |
1.3721 | 0.98 | 46 | 1.2584 |
2.703 | 1.0 | 47 | 1.2583 |
1.3063 | 1.02 | 48 | 1.2585 |
1.1093 | 1.04 | 49 | 1.2594 |
1.1362 | 1.06 | 50 | 1.2609 |
1.6691 | 1.09 | 51 | 1.2619 |
1.44 | 1.11 | 52 | 1.2620 |
1.8026 | 1.13 | 53 | 1.2612 |
1.8663 | 1.15 | 54 | 1.2601 |
1.3662 | 1.17 | 55 | 1.2584 |
1.7172 | 1.19 | 56 | 1.2562 |
1.554 | 1.21 | 57 | 1.2535 |
1.0628 | 1.23 | 58 | 1.2514 |
1.389 | 1.26 | 59 | 1.2494 |
1.0307 | 1.28 | 60 | 1.2481 |
1.5557 | 1.3 | 61 | 1.2462 |
1.6536 | 1.32 | 62 | 1.2438 |
1.652 | 1.34 | 63 | 1.2415 |
1.51 | 1.36 | 64 | 1.2396 |
1.5407 | 1.38 | 65 | 1.2374 |
1.6681 | 1.4 | 66 | 1.2349 |
1.4797 | 1.43 | 67 | 1.2323 |
1.326 | 1.45 | 68 | 1.2304 |
1.8683 | 1.47 | 69 | 1.2285 |
1.3007 | 1.49 | 70 | 1.2270 |
1.5261 | 1.51 | 71 | 1.2256 |
1.6908 | 1.53 | 72 | 1.2241 |
1.4631 | 1.55 | 73 | 1.2226 |
1.5474 | 1.57 | 74 | 1.2213 |
1.0559 | 1.6 | 75 | 1.2209 |
1.5217 | 1.62 | 76 | 1.2206 |
1.7606 | 1.64 | 77 | 1.2201 |
1.5246 | 1.66 | 78 | 1.2197 |
1.8001 | 1.68 | 79 | 1.2192 |
1.4414 | 1.7 | 80 | 1.2185 |
1.4168 | 1.72 | 81 | 1.2182 |
1.2429 | 1.74 | 82 | 1.2180 |
1.7092 | 1.77 | 83 | 1.2178 |
1.4605 | 1.79 | 84 | 1.2178 |
1.2242 | 1.81 | 85 | 1.2180 |
1.6583 | 1.83 | 86 | 1.2180 |
1.7079 | 1.85 | 87 | 1.2181 |
0.9831 | 1.87 | 88 | 1.2186 |
1.6504 | 1.89 | 89 | 1.2191 |
1.7244 | 1.91 | 90 | 1.2194 |
1.2895 | 1.94 | 91 | 1.2196 |
1.03 | 1.96 | 92 | 1.2201 |
1.377 | 1.98 | 93 | 1.2205 |
1.0463 | 2.0 | 94 | 1.2210 |
1.3759 | 2.02 | 95 | 1.2214 |
1.7144 | 2.04 | 96 | 1.2218 |
1.6047 | 2.06 | 97 | 1.2220 |
1.6515 | 2.09 | 98 | 1.2222 |
1.2909 | 2.11 | 99 | 1.2221 |
1.6717 | 2.13 | 100 | 1.2219 |
1.1318 | 2.15 | 101 | 1.2220 |
1.3417 | 2.17 | 102 | 1.2219 |
1.3242 | 2.19 | 103 | 1.2219 |
1.4135 | 2.21 | 104 | 1.2221 |
1.3863 | 2.23 | 105 | 1.2222 |
1.2301 | 2.26 | 106 | 1.2222 |
1.5481 | 2.28 | 107 | 1.2220 |
1.0813 | 2.3 | 108 | 1.2219 |
1.4198 | 2.32 | 109 | 1.2218 |
1.4751 | 2.34 | 110 | 1.2214 |
1.4133 | 2.36 | 111 | 1.2212 |
0.8784 | 2.38 | 112 | 1.2212 |
1.514 | 2.4 | 113 | 1.2211 |
1.2913 | 2.43 | 114 | 1.2210 |
1.1341 | 2.45 | 115 | 1.2209 |
1.262 | 2.47 | 116 | 1.2210 |
1.3282 | 2.49 | 117 | 1.2209 |
1.5217 | 2.51 | 118 | 1.2206 |
1.6127 | 2.53 | 119 | 1.2203 |
1.5625 | 2.55 | 120 | 1.2201 |
1.4603 | 2.57 | 121 | 1.2199 |
1.8532 | 2.6 | 122 | 1.2196 |
1.3278 | 2.62 | 123 | 1.2194 |
1.0632 | 2.64 | 124 | 1.2192 |
1.5837 | 2.66 | 125 | 1.2190 |
1.4593 | 2.68 | 126 | 1.2188 |
1.2919 | 2.7 | 127 | 1.2188 |
1.1228 | 2.72 | 128 | 1.2187 |
1.3098 | 2.74 | 129 | 1.2188 |
1.6073 | 2.77 | 130 | 1.2188 |
1.1484 | 2.79 | 131 | 1.2189 |
1.6054 | 2.81 | 132 | 1.2190 |
1.5228 | 2.83 | 133 | 1.2190 |
1.5577 | 2.85 | 134 | 1.2190 |
1.4234 | 2.87 | 135 | 1.2191 |
1.7341 | 2.89 | 136 | 1.2191 |
1.6164 | 2.91 | 137 | 1.2190 |
1.6621 | 2.94 | 138 | 1.2190 |
1.5781 | 2.96 | 139 | 1.2189 |
1.0756 | 2.98 | 140 | 1.2189 |
1.8596 | 3.0 | 141 | 1.2189 |
### Framework versions
- Transformers 4.38.1
- Pytorch 2.1.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
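
To reproduce this environment, the versions above can be pinned in a `requirements.txt` such as the following; the extra index URL for the CUDA 12.1 PyTorch wheel is an assumption about how that build was installed.

```text
--extra-index-url https://download.pytorch.org/whl/cu121
transformers==4.38.1
torch==2.1.0+cu121
datasets==2.17.0
tokenizers==0.15.2
```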