# fine-tune-radia-v3
This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.4142
## Model description
More information needed
## Intended uses & limitations
More information needed
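In the absence of documented usage guidance, the snippet below is a minimal loading sketch using the standard `transformers` API, not an official example. It assumes the repository hosts full model weights; if it instead stores a PEFT/LoRA adapter, load it with `peft`'s `AutoPeftModelForCausalLM`.

```python
# Minimal sketch, not an official usage example: loads the fine-tuned
# checkpoint and generates a completion. Assumes full model weights are
# stored in the repo (use peft's AutoPeftModelForCausalLM for an adapter).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "joedonino/fine-tune-radia-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit a 7B model on one GPU
    device_map="auto",
)

inputs = tokenizer("Hello, ", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```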
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.3
- num_epochs: 6
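For reference, these settings map onto the standard `transformers` `TrainingArguments` roughly as follows. This is a sketch of the configuration only, not the original training script; the dataset, model setup, and any PEFT configuration are not documented in this card.

```python
# Sketch of the hyperparameters above expressed as TrainingArguments.
# The actual training script used for fine-tune-radia-v3 is not
# documented here; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fine-tune-radia-v3",
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    lr_scheduler_type="linear",
    warmup_ratio=0.3,               # 30% of training steps used for LR warmup
    num_train_epochs=6,
    seed=42,
    # Adam defaults already match the card: betas=(0.9, 0.999), epsilon=1e-8
)
```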
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.1521 | 0.09 | 5 | 1.1761 |
1.1482 | 0.17 | 10 | 1.1711 |
1.2242 | 0.26 | 15 | 1.1610 |
1.1821 | 0.34 | 20 | 1.1427 |
1.1905 | 0.43 | 25 | 1.1125 |
1.108 | 0.52 | 30 | 1.0602 |
1.0857 | 0.6 | 35 | 0.9833 |
0.9924 | 0.69 | 40 | 0.8858 |
0.8337 | 0.78 | 45 | 0.8109 |
0.8399 | 0.86 | 50 | 0.7848 |
0.7903 | 0.95 | 55 | 0.7641 |
0.8023 | 1.03 | 60 | 0.7455 |
0.7848 | 1.12 | 65 | 0.7305 |
0.7655 | 1.21 | 70 | 0.7155 |
0.7169 | 1.29 | 75 | 0.6981 |
0.7095 | 1.38 | 80 | 0.6774 |
0.6762 | 1.47 | 85 | 0.6524 |
0.7198 | 1.55 | 90 | 0.6331 |
0.6491 | 1.64 | 95 | 0.6082 |
0.6619 | 1.72 | 100 | 0.5881 |
0.645 | 1.81 | 105 | 0.5722 |
0.5815 | 1.9 | 110 | 0.5592 |
0.5906 | 1.98 | 115 | 0.5439 |
0.6282 | 2.07 | 120 | 0.5348 |
0.5758 | 2.16 | 125 | 0.5222 |
0.5252 | 2.24 | 130 | 0.5097 |
0.5269 | 2.33 | 135 | 0.5015 |
0.4826 | 2.41 | 140 | 0.4897 |
0.5443 | 2.5 | 145 | 0.4823 |
0.4485 | 2.59 | 150 | 0.4757 |
0.4363 | 2.67 | 155 | 0.4718 |
0.4674 | 2.76 | 160 | 0.4649 |
0.5294 | 2.84 | 165 | 0.4581 |
0.4529 | 2.93 | 170 | 0.4536 |
0.4642 | 3.02 | 175 | 0.4488 |
0.4224 | 3.1 | 180 | 0.4464 |
0.3956 | 3.19 | 185 | 0.4477 |
0.4081 | 3.28 | 190 | 0.4424 |
0.4678 | 3.36 | 195 | 0.4414 |
0.4652 | 3.45 | 200 | 0.4375 |
0.4233 | 3.53 | 205 | 0.4360 |
0.3717 | 3.62 | 210 | 0.4320 |
0.4267 | 3.71 | 215 | 0.4262 |
0.3676 | 3.79 | 220 | 0.4263 |
0.411 | 3.88 | 225 | 0.4206 |
0.4197 | 3.97 | 230 | 0.4215 |
0.373 | 4.05 | 235 | 0.4239 |
0.3347 | 4.14 | 240 | 0.4224 |
0.3834 | 4.22 | 245 | 0.4192 |
0.3055 | 4.31 | 250 | 0.4219 |
0.359 | 4.4 | 255 | 0.4182 |
0.3374 | 4.48 | 260 | 0.4219 |
0.3173 | 4.57 | 265 | 0.4207 |
0.3619 | 4.66 | 270 | 0.4153 |
0.3975 | 4.74 | 275 | 0.4141 |
0.4167 | 4.83 | 280 | 0.4135 |
0.42 | 4.91 | 285 | 0.4111 |
0.3459 | 5.0 | 290 | 0.4137 |
0.2955 | 5.09 | 295 | 0.4173 |
0.3668 | 5.17 | 300 | 0.4185 |
0.324 | 5.26 | 305 | 0.4152 |
0.3058 | 5.34 | 310 | 0.4162 |
0.3568 | 5.43 | 315 | 0.4155 |
0.3144 | 5.52 | 320 | 0.4151 |
0.2713 | 5.6 | 325 | 0.4153 |
0.3373 | 5.69 | 330 | 0.4143 |
0.3762 | 5.78 | 335 | 0.4140 |
0.3217 | 5.86 | 340 | 0.4140 |
0.3507 | 5.95 | 345 | 0.4142 |
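Validation loss falls steadily from 1.1761 to roughly 0.42 over the first four epochs, then plateaus between about 0.41 and 0.42 for the remainder of training; the final value of 0.4142 matches the evaluation loss reported at the top of this card.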
### Framework versions
- Transformers 4.36.0.dev0
- PyTorch 2.1.0+cu118
- Datasets 2.14.7
- Tokenizers 0.15.0