metadata
base_model: unsloth/gemma-2-2b-it-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- trl
- dpo
Uploaded model
- Developed by: SameedHussain
- License: apache-2.0
- Finetuned from model : unsloth/gemma-2-2b-it-bnb-4bit
This gemma2 model was trained 2x faster with Unsloth and Huggingface's TRL library.
Step | Training Loss | Rewards / Chosen | Rewards / Rejected | Rewards / Accuracies | Rewards / Margins | Logps / Rejected | Logps / Chosen | Logits / Rejected | Logits / Chosen |
---|---|---|---|---|---|---|---|---|---|
100 | 0.454700 | 6.241566 | 3.175092 | 0.750000 | 3.066474 | -102.758446 | -53.181263 | -14.580903 | -14.938275 |
200 | 0.264100 | 6.640531 | 2.823826 | 0.888750 | 3.816705 | -110.525520 | -50.815018 | -14.796252 | -15.198202 |
300 | 0.110200 | 6.310797 | 1.718347 | 0.985000 | 4.592450 | -118.720840 | -48.524315 | -15.263680 | -15.698647 |
400 | 0.046900 | 6.744057 | 0.677384 | 0.997500 | 6.066672 | -128.757660 | -48.107479 | -15.710546 | -16.174524 |
500 | 0.019700 | 6.714230 | -0.529035 | 1.000000 | 7.243264 | -143.408020 | -49.327625 | -16.120342 | -16.611662 |
600 | 0.013700 | 6.605389 | -1.275738 | 1.000000 | 7.881127 | -146.968491 | -48.847641 | -16.320650 | -16.836390 |
700 | 0.007900 | 6.333577 | -2.010140 | 1.000000 | 8.343716 | -154.255066 | -50.590134 | -16.486574 | -16.987421 |
800 | 0.006300 | 6.489099 | -2.076626 | 1.000000 | 8.565723 | -150.381393 | -49.992256 | -16.614525 | -17.117744 |
900 | 0.005100 | 6.429256 | -2.340122 | 1.000000 | 8.769380 | -160.874405 | -51.164425 | -16.687891 | -17.165791 |
1000 | 0.004700 | 6.494193 | -2.520164 | 1.000000 | 9.014358 | -163.852982 | -54.317467 | -16.757954 | -17.206339 |
1100 | 0.005900 | 6.287598 | -2.524287 | 1.000000 | 8.811884 | -161.473770 | -52.012741 | -16.825716 | -17.266563 |
1200 | 0.005200 | 6.246828 | -3.126722 | 0.998750 | 9.373549 | -167.766861 | -52.052780 | -16.795412 | -17.277397 |
1300 | 0.004300 | 6.347938 | -2.930621 | 1.000000 | 9.278559 | -165.971939 | -50.738480 | -16.836918 | -17.304783 |
1400 | 0.003900 | 6.232501 | -3.073614 | 1.000000 | 9.306114 | -165.787643 | -50.953049 | -16.813383 | -17.290031 |