---
base_model: unsloth/gemma-2-2b-it-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- trl
- dpo
---

# Uploaded model

- **Developed by:** SameedHussain
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-2-2b-it-bnb-4bit

This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

DPO training metrics, logged every 100 steps:

| Step | Training Loss | Rewards / Chosen | Rewards / Rejected | Rewards / Accuracies | Rewards / Margins | Logps / Rejected | Logps / Chosen | Logits / Rejected | Logits / Chosen |
|------|---------------|------------------|--------------------|----------------------|-------------------|------------------|----------------|-------------------|-----------------|
| 100 | 0.454700 | 6.241566 | 3.175092 | 0.750000 | 3.066474 | -102.758446 | -53.181263 | -14.580903 | -14.938275 |
| 200 | 0.264100 | 6.640531 | 2.823826 | 0.888750 | 3.816705 | -110.525520 | -50.815018 | -14.796252 | -15.198202 |
| 300 | 0.110200 | 6.310797 | 1.718347 | 0.985000 | 4.592450 | -118.720840 | -48.524315 | -15.263680 | -15.698647 |
| 400 | 0.046900 | 6.744057 | 0.677384 | 0.997500 | 6.066672 | -128.757660 | -48.107479 | -15.710546 | -16.174524 |
| 500 | 0.019700 | 6.714230 | -0.529035 | 1.000000 | 7.243264 | -143.408020 | -49.327625 | -16.120342 | -16.611662 |
| 600 | 0.013700 | 6.605389 | -1.275738 | 1.000000 | 7.881127 | -146.968491 | -48.847641 | -16.320650 | -16.836390 |
| 700 | 0.007900 | 6.333577 | -2.010140 | 1.000000 | 8.343716 | -154.255066 | -50.590134 | -16.486574 | -16.987421 |
| 800 | 0.006300 | 6.489099 | -2.076626 | 1.000000 | 8.565723 | -150.381393 | -49.992256 | -16.614525 | -17.117744 |
| 900 | 0.005100 | 6.429256 | -2.340122 | 1.000000 | 8.769380 | -160.874405 | -51.164425 | -16.687891 | -17.165791 |
| 1000 | 0.004700 | 6.494193 | -2.520164 | 1.000000 | 9.014358 | -163.852982 | -54.317467 | -16.757954 | -17.206339 |
| 1100 | 0.005900 | 6.287598 | -2.524287 | 1.000000 | 8.811884 | -161.473770 | -52.012741 | -16.825716 | -17.266563 |
| 1200 | 0.005200 | 6.246828 | -3.126722 | 0.998750 | 9.373549 | -167.766861 | -52.052780 | -16.795412 | -17.277397 |
| 1300 | 0.004300 | 6.347938 | -2.930621 | 1.000000 | 9.278559 | -165.971939 | -50.738480 | -16.836918 | -17.304783 |
| 1400 | 0.003900 | 6.232501 | -3.073614 | 1.000000 | 9.306114 | -165.787643 | -50.953049 | -16.813383 | -17.290031 |
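
As a reading aid for the table, the sketch below checks that the Rewards / Margins column matches Rewards / Chosen minus Rewards / Rejected, up to rounding in the last logged digit. The three rows are copied from the table; the `rows` list and the `margin` helper are illustrative names, not part of the training code.

```python
# Rows copied from the metrics table:
# (step, rewards/chosen, rewards/rejected, rewards/margins as logged)
rows = [
    (100, 6.241566, 3.175092, 3.066474),
    (500, 6.714230, -0.529035, 7.243264),
    (1400, 6.232501, -3.073614, 9.306114),
]


def margin(chosen: float, rejected: float) -> float:
    """Reward margin: how much higher the chosen reward is than the rejected one."""
    return chosen - rejected


for step, chosen, rejected, logged in rows:
    computed = margin(chosen, rejected)
    # Small differences are expected because the logged values are rounded.
    print(f"step {step}: computed={computed:.6f} logged={logged:.6f} "
          f"diff={abs(computed - logged):.1e}")
```

A growing margin alongside Rewards / Accuracies near 1.0 means the model scores the chosen responses above the rejected ones on almost every training pair. Here the margin grows mainly because the rejected reward falls, while the chosen reward stays roughly flat between about 6.2 and 6.7.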