# codellama_instruct_spider_e10
This model is a fine-tuned version of codellama/CodeLlama-7b-Instruct-hf on the tmnam20/SpiderInstruct dataset.
It achieves a validation loss of 0.3985 at the end of training (see the training results table below).
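Spider is a text-to-SQL benchmark, so this checkpoint is presumably intended to translate a natural-language question plus a database schema into a SQL query. Below is a minimal inference sketch with 🤗 Transformers; the checkpoint path is a placeholder, and the `[INST] ... [/INST]` wrapper is the base CodeLlama-Instruct chat convention, not a documented prompt template for this fine-tune.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path -- substitute the actual Hub repo id or local directory.
checkpoint = "path/to/codellama_instruct_spider_e10"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.float16,  # 7B in fp16 fits on a single 24 GB GPU
    device_map="auto",          # requires the `accelerate` package
)

# The exact prompt format used by tmnam20/SpiderInstruct is not documented
# here; this follows the base model's [INST] chat convention.
schema = "CREATE TABLE singer (singer_id INT, name TEXT, age INT);"
question = "How many singers do we have?"
prompt = f"[INST] Given the schema:\n{schema}\nWrite a SQL query to answer: {question} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```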
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 10.0
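As a hedged sketch, here is how the listed values map onto `transformers.TrainingArguments`. The original training script is not published here, so anything beyond the listed hyperparameters (output directory, logging, evaluation cadence) is an assumption.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="codellama_instruct_spider_e10",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,  # 4 * 8 = total train batch size 32
    num_train_epochs=10.0,
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    seed=42,
    adam_beta1=0.9,     # transformers defaults, matching the
    adam_beta2=0.999,   # Adam settings listed above
    adam_epsilon=1e-8,
)
```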
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.822         | 0.37  | 100  | 0.5313          |
| 0.3014        | 0.74  | 200  | 0.2763          |
| 0.2091        | 1.11  | 300  | 0.2469          |
| 0.1697        | 1.48  | 400  | 0.2401          |
| 0.1495        | 1.85  | 500  | 0.2395          |
| 0.1256        | 2.22  | 600  | 0.2525          |
| 0.1097        | 2.59  | 700  | 0.2641          |
| 0.1107        | 2.96  | 800  | 0.2617          |
| 0.0951        | 3.33  | 900  | 0.2683          |
| 0.0882        | 3.7   | 1000 | 0.2892          |
| 0.0818        | 4.06  | 1100 | 0.3134          |
| 0.075         | 4.43  | 1200 | 0.2978          |
| 0.0745        | 4.8   | 1300 | 0.3095          |
| 0.0642        | 5.17  | 1400 | 0.3261          |
| 0.0622        | 5.54  | 1500 | 0.3201          |
| 0.0573        | 5.91  | 1600 | 0.3343          |
| 0.0552        | 6.28  | 1700 | 0.3396          |
| 0.0523        | 6.65  | 1800 | 0.3602          |
| 0.0538        | 7.02  | 1900 | 0.3464          |
| 0.0467        | 7.39  | 2000 | 0.3622          |
| 0.0465        | 7.76  | 2100 | 0.3697          |
| 0.044         | 8.13  | 2200 | 0.3890          |
| 0.043         | 8.5   | 2300 | 0.3785          |
| 0.0375        | 8.87  | 2400 | 0.3860          |
| 0.0384        | 9.24  | 2500 | 0.3952          |
| 0.0363        | 9.61  | 2600 | 0.3940          |
| 0.0352        | 9.98  | 2700 | 0.3985          |
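Note that the validation loss bottoms out around 0.24 near epoch 2 and climbs steadily afterward while the training loss keeps falling, a typical overfitting pattern; an earlier checkpoint (around steps 400-500) may generalize better than the final one.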
### Framework versions
- Transformers 4.34.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3