# gemma-2b fine-tuned for learning Python programming
This model is a fine-tuned version of [google/gemma-2b](https://huggingface.co/google/gemma-2b) on a very small dataset of 205 carefully curated data points on Python programming. It achieves the following results on the evaluation set:
- Loss: 1.2177
## Model description
This model is an experiment in fine-tuning a large language model for a narrow task: teaching Python in simple terms.
## Intended uses & limitations
The model is intended for experimental use only; it would need more data points and further training to perform substantially better.
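For local experimentation, the adapter can be loaded on top of the base model with PEFT. The following is a minimal inference sketch, assuming the adapter is published under the repo id `Ikeofai/gemma-2b-fine-tuned` and that you have access to the gated `google/gemma-2b` base model:

```python
# Minimal inference sketch: load the PEFT adapter on top of google/gemma-2b.
# Repo ids and generation settings are illustrative, not prescribed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2b"
adapter_id = "Ikeofai/gemma-2b-fine-tuned"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the fine-tuned adapter
model.eval()

prompt = "Explain what a Python list comprehension is in simple terms."
inputs = tokenizer(prompt, return_tensors="pt").to(base_model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```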
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 15
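
These settings map onto `transformers.TrainingArguments` roughly as sketched below. The actual training script is not part of this card, so `output_dir`, the optimizer string, and the evaluation/logging cadence are assumptions rather than the exact configuration used:

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
# output_dir and the eval/logging cadence are illustrative assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-2b-fine-tuned",   # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,      # effective train batch size of 8
    num_train_epochs=15,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    optim="adamw_torch",                # Adam with betas=(0.9, 0.999), epsilon=1e-8 (library defaults)
    evaluation_strategy="steps",        # assumption: evaluate every 2 steps, as in the results table
    eval_steps=2,
    logging_steps=2,
)
```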
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.756 | 0.19 | 2 | 1.6592 |
1.4272 | 0.39 | 4 | 1.6572 |
1.6918 | 0.58 | 6 | 1.6529 |
1.8009 | 0.77 | 8 | 1.6469 |
1.674 | 0.96 | 10 | 1.6384 |
1.5397 | 1.16 | 12 | 1.6273 |
1.6255 | 1.35 | 14 | 1.6131 |
1.5575 | 1.54 | 16 | 1.5947 |
1.5248 | 1.73 | 18 | 1.5750 |
1.5811 | 1.93 | 20 | 1.5545 |
1.7426 | 2.12 | 22 | 1.5339 |
1.5397 | 2.31 | 24 | 1.5140 |
1.421 | 2.51 | 26 | 1.4953 |
1.3699 | 2.7 | 28 | 1.4778 |
1.3421 | 2.89 | 30 | 1.4616 |
1.5048 | 3.08 | 32 | 1.4483 |
1.3779 | 3.28 | 34 | 1.4362 |
1.435 | 3.47 | 36 | 1.4247 |
1.2924 | 3.66 | 38 | 1.4130 |
1.375 | 3.86 | 40 | 1.4011 |
1.3808 | 4.05 | 42 | 1.3894 |
1.3854 | 4.24 | 44 | 1.3776 |
1.2755 | 4.43 | 46 | 1.3668 |
1.1832 | 4.63 | 48 | 1.3568 |
1.4068 | 4.82 | 50 | 1.3473 |
1.197 | 5.01 | 52 | 1.3383 |
1.396 | 5.2 | 54 | 1.3300 |
1.0756 | 5.4 | 56 | 1.3219 |
1.164 | 5.59 | 58 | 1.3140 |
1.2238 | 5.78 | 60 | 1.3067 |
1.2795 | 5.98 | 62 | 1.2999 |
1.2425 | 6.17 | 64 | 1.2940 |
1.1914 | 6.36 | 66 | 1.2884 |
1.2129 | 6.55 | 68 | 1.2832 |
1.0642 | 6.75 | 70 | 1.2783 |
1.1238 | 6.94 | 72 | 1.2736 |
1.0442 | 7.13 | 74 | 1.2692 |
1.1614 | 7.33 | 76 | 1.2650 |
1.2674 | 7.52 | 78 | 1.2613 |
0.973 | 7.71 | 80 | 1.2579 |
1.1108 | 7.9 | 82 | 1.2551 |
1.2114 | 8.1 | 84 | 1.2519 |
0.9327 | 8.29 | 86 | 1.2487 |
1.0495 | 8.48 | 88 | 1.2459 |
1.1297 | 8.67 | 90 | 1.2434 |
1.1777 | 8.87 | 92 | 1.2413 |
0.9277 | 9.06 | 94 | 1.2394 |
1.0063 | 9.25 | 96 | 1.2376 |
1.0652 | 9.45 | 98 | 1.2359 |
1.0928 | 9.64 | 100 | 1.2342 |
1.0611 | 9.83 | 102 | 1.2329 |
0.9749 | 10.02 | 104 | 1.2314 |
0.9305 | 10.22 | 106 | 1.2300 |
0.9944 | 10.41 | 108 | 1.2289 |
1.1229 | 10.6 | 110 | 1.2277 |
1.1502 | 10.8 | 112 | 1.2269 |
0.8728 | 10.99 | 114 | 1.2261 |
0.9504 | 11.18 | 116 | 1.2253 |
1.0989 | 11.37 | 118 | 1.2242 |
0.9485 | 11.57 | 120 | 1.2235 |
1.0335 | 11.76 | 122 | 1.2227 |
1.0332 | 11.95 | 124 | 1.2222 |
0.8178 | 12.14 | 126 | 1.2215 |
1.0058 | 12.34 | 128 | 1.2208 |
1.034 | 12.53 | 130 | 1.2202 |
0.9451 | 12.72 | 132 | 1.2197 |
0.9163 | 12.92 | 134 | 1.2193 |
1.173 | 13.11 | 136 | 1.2190 |
1.0758 | 13.3 | 138 | 1.2185 |
0.9012 | 13.49 | 140 | 1.2184 |
0.9099 | 13.69 | 142 | 1.2180 |
1.0 | 13.88 | 144 | 1.2180 |
1.0032 | 14.07 | 146 | 1.2179 |
0.991 | 14.27 | 148 | 1.2177 |
0.8836 | 14.46 | 150 | 1.2177 |
### Framework versions
- PEFT 0.10.0
- Transformers 4.39.3
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2