Edit model card

Uploaded model

  • Developed by: SameedHussain
  • License: apache-2.0
  • Finetuned from model : unsloth/gemma-2-2b-it-bnb-4bit

This gemma2 model was trained 2x faster with Unsloth and Huggingface's TRL library.

Step Training Loss Rewards / Chosen Rewards / Rejected Rewards / Accuracies Rewards / Margins Logps / Rejected Logps / Chosen Logits / Rejected Logits / Chosen
100 0.454700 6.241566 3.175092 0.750000 3.066474 -102.758446 -53.181263 -14.580903 -14.938275
200 0.264100 6.640531 2.823826 0.888750 3.816705 -110.525520 -50.815018 -14.796252 -15.198202
300 0.110200 6.310797 1.718347 0.985000 4.592450 -118.720840 -48.524315 -15.263680 -15.698647
400 0.046900 6.744057 0.677384 0.997500 6.066672 -128.757660 -48.107479 -15.710546 -16.174524
500 0.019700 6.714230 -0.529035 1.000000 7.243264 -143.408020 -49.327625 -16.120342 -16.611662
600 0.013700 6.605389 -1.275738 1.000000 7.881127 -146.968491 -48.847641 -16.320650 -16.836390
700 0.007900 6.333577 -2.010140 1.000000 8.343716 -154.255066 -50.590134 -16.486574 -16.987421
800 0.006300 6.489099 -2.076626 1.000000 8.565723 -150.381393 -49.992256 -16.614525 -17.117744
900 0.005100 6.429256 -2.340122 1.000000 8.769380 -160.874405 -51.164425 -16.687891 -17.165791
1000 0.004700 6.494193 -2.520164 1.000000 9.014358 -163.852982 -54.317467 -16.757954 -17.206339
1100 0.005900 6.287598 -2.524287 1.000000 8.811884 -161.473770 -52.012741 -16.825716 -17.266563
1200 0.005200 6.246828 -3.126722 0.998750 9.373549 -167.766861 -52.052780 -16.795412 -17.277397
1300 0.004300 6.347938 -2.930621 1.000000 9.278559 -165.971939 -50.738480 -16.836918 -17.304783
1400 0.003900 6.232501 -3.073614 1.000000 9.306114 -165.787643 -50.953049 -16.813383 -17.290031
Downloads last month
11
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for SameedHussain/gemma-2-2b-it-Flight-Multi-Turn-V3-DPO

Finetuned
(86)
this model
Quantizations
1 model