---
license: apache-2.0
base_model: google/flan-t5-small
tags:
- generated_from_trainer
model-index:
- name: flant5-tuned-3
  results: []
---

# flant5-tuned-3

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.2189

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 8
- num_epochs: 3
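For reference, the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the original training script: `output_dir` is a placeholder, and the per-step evaluation cadence is inferred from the validation-loss table below.

```python
from transformers import Seq2SeqTrainingArguments

# Minimal sketch of the listed hyperparameters; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="flant5-tuned-3",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,     # the Adam betas/epsilon listed above are
    adam_beta2=0.999,   # the Trainer defaults for its AdamW optimizer
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=8,
    num_train_epochs=3,
    evaluation_strategy="steps",  # assumption: the results table logs eval loss every step
    eval_steps=1,
)
```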
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.9667 | 0.02 | 1 | 1.6464 |
| 2.4545 | 0.04 | 2 | 1.6360 |
| 2.4307 | 0.06 | 3 | 1.6170 |
| 2.1502 | 0.09 | 4 | 1.5949 |
| 1.8514 | 0.11 | 5 | 1.5724 |
| 1.7189 | 0.13 | 6 | 1.5528 |
| 1.9036 | 0.15 | 7 | 1.5352 |
| 2.0255 | 0.17 | 8 | 1.5151 |
| 2.2073 | 0.19 | 9 | 1.4944 |
| 2.1577 | 0.21 | 10 | 1.4748 |
| 1.6581 | 0.23 | 11 | 1.4545 |
| 1.9323 | 0.26 | 12 | 1.4363 |
| 1.4871 | 0.28 | 13 | 1.4198 |
| 1.574 | 0.3 | 14 | 1.4032 |
| 1.7671 | 0.32 | 15 | 1.3898 |
| 1.567 | 0.34 | 16 | 1.3782 |
| 1.5162 | 0.36 | 17 | 1.3686 |
| 1.9622 | 0.38 | 18 | 1.3599 |
| 1.8378 | 0.4 | 19 | 1.3525 |
| 1.7199 | 0.43 | 20 | 1.3460 |
| 1.3917 | 0.45 | 21 | 1.3402 |
| 1.4417 | 0.47 | 22 | 1.3345 |
| 1.4023 | 0.49 | 23 | 1.3293 |
| 1.5427 | 0.51 | 24 | 1.3239 |
| 1.2344 | 0.53 | 25 | 1.3192 |
| 2.281 | 0.55 | 26 | 1.3136 |
| 1.9236 | 0.57 | 27 | 1.3077 |
| 1.4392 | 0.6 | 28 | 1.3029 |
| 1.9168 | 0.62 | 29 | 1.2976 |
| 2.1688 | 0.64 | 30 | 1.2930 |
| 1.2504 | 0.66 | 31 | 1.2890 |
| 1.5946 | 0.68 | 32 | 1.2853 |
| 1.6979 | 0.7 | 33 | 1.2820 |
| 1.6712 | 0.72 | 34 | 1.2789 |
| 1.7862 | 0.74 | 35 | 1.2759 |
| 1.534 | 0.77 | 36 | 1.2734 |
| 1.6904 | 0.79 | 37 | 1.2712 |
| 1.6023 | 0.81 | 38 | 1.2692 |
| 1.6756 | 0.83 | 39 | 1.2667 |
| 2.0195 | 0.85 | 40 | 1.2640 |
| 1.2913 | 0.87 | 41 | 1.2618 |
| 1.1534 | 0.89 | 42 | 1.2607 |
| 1.5612 | 0.91 | 43 | 1.2597 |
| 1.3159 | 0.94 | 44 | 1.2586 |
| 1.6303 | 0.96 | 45 | 1.2582 |
| 1.3721 | 0.98 | 46 | 1.2584 |
| 2.703 | 1.0 | 47 | 1.2583 |
| 1.3063 | 1.02 | 48 | 1.2585 |
| 1.1093 | 1.04 | 49 | 1.2594 |
| 1.1362 | 1.06 | 50 | 1.2609 |
| 1.6691 | 1.09 | 51 | 1.2619 |
| 1.44 | 1.11 | 52 | 1.2620 |
| 1.8026 | 1.13 | 53 | 1.2612 |
| 1.8663 | 1.15 | 54 | 1.2601 |
| 1.3662 | 1.17 | 55 | 1.2584 |
| 1.7172 | 1.19 | 56 | 1.2562 |
| 1.554 | 1.21 | 57 | 1.2535 |
| 1.0628 | 1.23 | 58 | 1.2514 |
| 1.389 | 1.26 | 59 | 1.2494 |
| 1.0307 | 1.28 | 60 | 1.2481 |
| 1.5557 | 1.3 | 61 | 1.2462 |
| 1.6536 | 1.32 | 62 | 1.2438 |
| 1.652 | 1.34 | 63 | 1.2415 |
| 1.51 | 1.36 | 64 | 1.2396 |
| 1.5407 | 1.38 | 65 | 1.2374 |
| 1.6681 | 1.4 | 66 | 1.2349 |
| 1.4797 | 1.43 | 67 | 1.2323 |
| 1.326 | 1.45 | 68 | 1.2304 |
| 1.8683 | 1.47 | 69 | 1.2285 |
| 1.3007 | 1.49 | 70 | 1.2270 |
| 1.5261 | 1.51 | 71 | 1.2256 |
| 1.6908 | 1.53 | 72 | 1.2241 |
| 1.4631 | 1.55 | 73 | 1.2226 |
| 1.5474 | 1.57 | 74 | 1.2213 |
| 1.0559 | 1.6 | 75 | 1.2209 |
| 1.5217 | 1.62 | 76 | 1.2206 |
| 1.7606 | 1.64 | 77 | 1.2201 |
| 1.5246 | 1.66 | 78 | 1.2197 |
| 1.8001 | 1.68 | 79 | 1.2192 |
| 1.4414 | 1.7 | 80 | 1.2185 |
| 1.4168 | 1.72 | 81 | 1.2182 |
| 1.2429 | 1.74 | 82 | 1.2180 |
| 1.7092 | 1.77 | 83 | 1.2178 |
| 1.4605 | 1.79 | 84 | 1.2178 |
| 1.2242 | 1.81 | 85 | 1.2180 |
| 1.6583 | 1.83 | 86 | 1.2180 |
| 1.7079 | 1.85 | 87 | 1.2181 |
| 0.9831 | 1.87 | 88 | 1.2186 |
| 1.6504 | 1.89 | 89 | 1.2191 |
| 1.7244 | 1.91 | 90 | 1.2194 |
| 1.2895 | 1.94 | 91 | 1.2196 |
| 1.03 | 1.96 | 92 | 1.2201 |
| 1.377 | 1.98 | 93 | 1.2205 |
| 1.0463 | 2.0 | 94 | 1.2210 |
| 1.3759 | 2.02 | 95 | 1.2214 |
| 1.7144 | 2.04 | 96 | 1.2218 |
| 1.6047 | 2.06 | 97 | 1.2220 |
| 1.6515 | 2.09 | 98 | 1.2222 |
| 1.2909 | 2.11 | 99 | 1.2221 |
| 1.6717 | 2.13 | 100 | 1.2219 |
| 1.1318 | 2.15 | 101 | 1.2220 |
| 1.3417 | 2.17 | 102 | 1.2219 |
| 1.3242 | 2.19 | 103 | 1.2219 |
| 1.4135 | 2.21 | 104 | 1.2221 |
| 1.3863 | 2.23 | 105 | 1.2222 |
| 1.2301 | 2.26 | 106 | 1.2222 |
| 1.5481 | 2.28 | 107 | 1.2220 |
| 1.0813 | 2.3 | 108 | 1.2219 |
| 1.4198 | 2.32 | 109 | 1.2218 |
| 1.4751 | 2.34 | 110 | 1.2214 |
| 1.4133 | 2.36 | 111 | 1.2212 |
| 0.8784 | 2.38 | 112 | 1.2212 |
| 1.514 | 2.4 | 113 | 1.2211 |
| 1.2913 | 2.43 | 114 | 1.2210 |
| 1.1341 | 2.45 | 115 | 1.2209 |
| 1.262 | 2.47 | 116 | 1.2210 |
| 1.3282 | 2.49 | 117 | 1.2209 |
| 1.5217 | 2.51 | 118 | 1.2206 |
| 1.6127 | 2.53 | 119 | 1.2203 |
| 1.5625 | 2.55 | 120 | 1.2201 |
| 1.4603 | 2.57 | 121 | 1.2199 |
| 1.8532 | 2.6 | 122 | 1.2196 |
| 1.3278 | 2.62 | 123 | 1.2194 |
| 1.0632 | 2.64 | 124 | 1.2192 |
| 1.5837 | 2.66 | 125 | 1.2190 |
| 1.4593 | 2.68 | 126 | 1.2188 |
| 1.2919 | 2.7 | 127 | 1.2188 |
| 1.1228 | 2.72 | 128 | 1.2187 |
| 1.3098 | 2.74 | 129 | 1.2188 |
| 1.6073 | 2.77 | 130 | 1.2188 |
| 1.1484 | 2.79 | 131 | 1.2189 |
| 1.6054 | 2.81 | 132 | 1.2190 |
| 1.5228 | 2.83 | 133 | 1.2190 |
| 1.5577 | 2.85 | 134 | 1.2190 |
| 1.4234 | 2.87 | 135 | 1.2191 |
| 1.7341 | 2.89 | 136 | 1.2191 |
| 1.6164 | 2.91 | 137 | 1.2190 |
| 1.6621 | 2.94 | 138 | 1.2190 |
| 1.5781 | 2.96 | 139 | 1.2189 |
| 1.0756 | 2.98 | 140 | 1.2189 |
| 1.8596 | 3.0 | 141 | 1.2189 |

### Framework versions

- Transformers 4.38.1
- Pytorch 2.1.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
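A minimal inference sketch with the Transformers API is shown below. The checkpoint path is a placeholder (the card does not give a full Hub repo id), and the prompt is purely illustrative since the tuning task is undocumented.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder path: substitute the actual Hub repo id or a local checkpoint dir.
checkpoint = "flant5-tuned-3"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Illustrative FLAN-style prompt only; the card does not document the tuning task.
inputs = tokenizer("Translate English to German: How old are you?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```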