# fine-tune-radia-v1
This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unspecified dataset (the dataset name was not recorded in this card). It achieves the following results on the evaluation set:
- Loss: 0.4998
## Model description
More information needed
Intended uses & limitations
More information needed
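
Pending proper usage notes, a minimal inference sketch is shown below. It assumes the repository stores full fine-tuned weights loadable through the standard `transformers` text-generation path; the model id `joedonino/fine-tune-radia-v1` is taken from this card, and the prompt is a placeholder. If the repository actually holds a PEFT/LoRA adapter rather than full weights, load it via `peft` instead.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id from this card; assumes full fine-tuned weights are stored.
model_id = "joedonino/fine-tune-radia-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # Llama-2-7b in fp16 fits on a single 24 GB GPU
    device_map="auto",
)

prompt = "Hello, world."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```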
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.3
- num_epochs: 10
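
The values above map directly onto `transformers.TrainingArguments`. A configuration sketch follows, assuming a single-GPU run (8 per-device × 2 accumulation steps = total train batch size 16); the output directory and the evaluation/logging cadence are assumptions, as the card does not record them:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fine-tune-radia-v1",  # assumed
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,    # total train batch size: 16
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.3,
    num_train_epochs=10,
    adam_beta1=0.9,                   # Adam settings listed above
    adam_beta2=0.999,                 # (Transformers' AdamW defaults)
    adam_epsilon=1e-8,
    evaluation_strategy="steps",      # assumed: the results table logs every 5 steps
    eval_steps=5,
    logging_steps=5,
)
```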
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.1483 | 0.09 | 5 | 1.1421 |
1.1815 | 0.17 | 10 | 1.1398 |
1.1381 | 0.26 | 15 | 1.1355 |
1.1292 | 0.34 | 20 | 1.1281 |
1.0458 | 0.43 | 25 | 1.1161 |
1.0865 | 0.52 | 30 | 1.0966 |
1.0936 | 0.6 | 35 | 1.0627 |
1.0503 | 0.69 | 40 | 1.0142 |
0.9561 | 0.78 | 45 | 0.9525 |
0.9792 | 0.86 | 50 | 0.8784 |
0.8311 | 0.95 | 55 | 0.8306 |
0.8196 | 1.03 | 60 | 0.8108 |
0.7419 | 1.12 | 65 | 0.7988 |
0.8127 | 1.21 | 70 | 0.7876 |
0.7911 | 1.29 | 75 | 0.7769 |
0.7611 | 1.38 | 80 | 0.7659 |
0.7468 | 1.47 | 85 | 0.7534 |
0.7334 | 1.55 | 90 | 0.7389 |
0.7112 | 1.64 | 95 | 0.7180 |
0.7098 | 1.72 | 100 | 0.7032 |
0.6907 | 1.81 | 105 | 0.6859 |
0.6999 | 1.9 | 110 | 0.6696 |
0.6729 | 1.98 | 115 | 0.6541 |
0.6538 | 2.07 | 120 | 0.6415 |
0.5567 | 2.16 | 125 | 0.6302 |
0.5593 | 2.24 | 130 | 0.6220 |
0.6058 | 2.33 | 135 | 0.6070 |
0.5907 | 2.41 | 140 | 0.5963 |
0.6239 | 2.5 | 145 | 0.5879 |
0.6013 | 2.59 | 150 | 0.5762 |
0.4774 | 2.67 | 155 | 0.5640 |
0.4434 | 2.76 | 160 | 0.5566 |
0.5408 | 2.84 | 165 | 0.5455 |
0.5003 | 2.93 | 170 | 0.5383 |
0.5057 | 3.02 | 175 | 0.5307 |
0.444 | 3.1 | 180 | 0.5247 |
0.4443 | 3.19 | 185 | 0.5184 |
0.4213 | 3.28 | 190 | 0.5107 |
0.4798 | 3.36 | 195 | 0.5090 |
0.4392 | 3.45 | 200 | 0.5036 |
0.4078 | 3.53 | 205 | 0.5012 |
0.3802 | 3.62 | 210 | 0.4960 |
0.5128 | 3.71 | 215 | 0.4929 |
0.4806 | 3.79 | 220 | 0.4893 |
0.4421 | 3.88 | 225 | 0.4853 |
0.4077 | 3.97 | 230 | 0.4842 |
0.4018 | 4.05 | 235 | 0.4835 |
0.3358 | 4.14 | 240 | 0.4832 |
0.3444 | 4.22 | 245 | 0.4821 |
0.4141 | 4.31 | 250 | 0.4808 |
0.4197 | 4.4 | 255 | 0.4816 |
0.3877 | 4.48 | 260 | 0.4770 |
0.3193 | 4.57 | 265 | 0.4783 |
0.3672 | 4.66 | 270 | 0.4731 |
0.4183 | 4.74 | 275 | 0.4747 |
0.3834 | 4.83 | 280 | 0.4685 |
0.3415 | 4.91 | 285 | 0.4668 |
0.3261 | 5.0 | 290 | 0.4630 |
0.308 | 5.09 | 295 | 0.4800 |
0.4124 | 5.17 | 300 | 0.4702 |
0.303 | 5.26 | 305 | 0.4745 |
0.2956 | 5.34 | 310 | 0.4702 |
0.3457 | 5.43 | 315 | 0.4714 |
0.3472 | 5.52 | 320 | 0.4643 |
0.3447 | 5.6 | 325 | 0.4675 |
0.3006 | 5.69 | 330 | 0.4649 |
0.2983 | 5.78 | 335 | 0.4649 |
0.2822 | 5.86 | 340 | 0.4606 |
0.2892 | 5.95 | 345 | 0.4630 |
0.2897 | 6.03 | 350 | 0.4681 |
0.2688 | 6.12 | 355 | 0.4796 |
0.2686 | 6.21 | 360 | 0.4660 |
0.2701 | 6.29 | 365 | 0.4788 |
0.2438 | 6.38 | 370 | 0.4704 |
0.2524 | 6.47 | 375 | 0.4738 |
0.2868 | 6.55 | 380 | 0.4698 |
0.2882 | 6.64 | 385 | 0.4681 |
0.2674 | 6.72 | 390 | 0.4676 |
0.2946 | 6.81 | 395 | 0.4652 |
0.2631 | 6.9 | 400 | 0.4650 |
0.3074 | 6.98 | 405 | 0.4579 |
0.2277 | 7.07 | 410 | 0.4789 |
0.2542 | 7.16 | 415 | 0.4787 |
0.2099 | 7.24 | 420 | 0.4801 |
0.2146 | 7.33 | 425 | 0.4846 |
0.2354 | 7.41 | 430 | 0.4738 |
0.2417 | 7.5 | 435 | 0.4787 |
0.2468 | 7.59 | 440 | 0.4779 |
0.2207 | 7.67 | 445 | 0.4747 |
0.2407 | 7.76 | 450 | 0.4759 |
0.2635 | 7.84 | 455 | 0.4760 |
0.2173 | 7.93 | 460 | 0.4674 |
0.2301 | 8.02 | 465 | 0.4819 |
0.1849 | 8.1 | 470 | 0.5009 |
0.1803 | 8.19 | 475 | 0.4845 |
0.2169 | 8.28 | 480 | 0.4934 |
0.1835 | 8.36 | 485 | 0.4942 |
0.2192 | 8.45 | 490 | 0.4870 |
0.2121 | 8.53 | 495 | 0.4918 |
0.2181 | 8.62 | 500 | 0.4885 |
0.1984 | 8.71 | 505 | 0.4849 |
0.2049 | 8.79 | 510 | 0.4879 |
0.1923 | 8.88 | 515 | 0.4867 |
0.2313 | 8.97 | 520 | 0.4863 |
0.1874 | 9.05 | 525 | 0.4918 |
0.1658 | 9.14 | 530 | 0.5046 |
0.1817 | 9.22 | 535 | 0.5011 |
0.2038 | 9.31 | 540 | 0.4944 |
0.1927 | 9.4 | 545 | 0.4974 |
0.1712 | 9.48 | 550 | 0.5054 |
0.1853 | 9.57 | 555 | 0.5051 |
0.1685 | 9.66 | 560 | 0.5006 |
0.1601 | 9.74 | 565 | 0.4977 |
0.1611 | 9.83 | 570 | 0.4978 |
0.1873 | 9.91 | 575 | 0.4992 |
0.1957 | 10.0 | 580 | 0.4998 |
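
Validation loss bottoms out at 0.4579 around step 405 (epoch ~7) and drifts upward through the remaining epochs while training loss keeps falling, a typical overfitting pattern. If re-running this recipe, keeping the best checkpoint rather than the last one is cheap to enable with the stock `Trainer` options; a sketch follows, where the output directory and the early-stopping patience are assumptions:

```python
from transformers import EarlyStoppingCallback, TrainingArguments

# Checkpoint-selection settings; pass these to the same Trainer used for training.
training_args = TrainingArguments(
    output_dir="fine-tune-radia-v1",   # assumed
    evaluation_strategy="steps",
    eval_steps=5,
    save_strategy="steps",             # saving must align with evaluation
    save_steps=5,
    load_best_model_at_end=True,       # restore the lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# With these args, Trainer(..., callbacks=[EarlyStoppingCallback(early_stopping_patience=10)])
# would have stopped shortly after the step-405 minimum instead of training to epoch 10.
```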
### Framework versions
- Transformers 4.36.0.dev0
- Pytorch 2.1.0+cu118
- Datasets 2.14.7
- Tokenizers 0.15.0
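
A reproduction environment can be checked against the versions above at runtime. Note that `4.36.0.dev0` was a development build of Transformers, so the nearest release is used as the comparison target here; treat that substitution as an assumption:

```python
import datasets
import tokenizers
import torch
import transformers

# Versions reported in this card; 4.36.0 stands in for the 4.36.0.dev0 build.
expected = {
    "transformers": "4.36.0",
    "torch": "2.1.0",
    "datasets": "2.14.7",
    "tokenizers": "0.15.0",
}
for name, module in [("transformers", transformers), ("torch", torch),
                     ("datasets", datasets), ("tokenizers", tokenizers)]:
    print(f"{name}: {module.__version__} (card: {expected[name]})")
```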