---
license: apache-2.0
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
tags:
  - xcomet_xl_xxl
  - generated_from_trainer
model-index:
  - name: cpo-xcomet-xl_xxl-inc7b-10p-shuff-5e-7-full-tiny
    results: []
---

# cpo-xcomet-xl_xxl-inc7b-10p-shuff-5e-7-full-tiny

This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) on the [Unbabel/TowerAligned-v0.1](https://huggingface.co/datasets/Unbabel/TowerAligned-v0.1) dataset. It achieves the following results on the evaluation set:

- Loss: 2.5861
- Nll Loss: 0.9631
- Logps/best: -93.9342
- Rewards/chosen: -9.3934
- Rewards/rejected: -8.9617
- Rewards/accuracies: 0.4760
- Rewards/margins: -0.4317
- Logps/rejected: -89.6171
- Logps/chosen: -93.9342
- Logits/rejected: -1.8016
- Logits/chosen: -1.9358
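
Note that each `Rewards/*` value above is exactly 0.1 × the corresponding `Logps/*` value, which is consistent with a preference-optimization reward of β · log p with β = 0.1.

For a quick smoke test, a minimal inference sketch is below. The repo id is an assumption inferred from this card's title and author, and the translation-style prompt is a guess motivated by the TowerAligned training data; neither is specified by the card.

```python
# Minimal inference sketch; repo id and prompt format are assumptions, not from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "martimfasantos/cpo-xcomet-xl_xxl-inc7b-10p-shuff-5e-7-full-tiny"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = (
    "Translate the following text from English into German.\n"
    "English: The weather is nice today.\n"
    "German:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```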

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a hedged sketch wiring them into a trainer follows the list:

- learning_rate: 5e-07
- train_batch_size: 1
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
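
The card does not include the training script. The `cpo-` name prefix and the NLL/rewards/logps metrics match TRL's `CPOTrainer`, so the sketch below shows how the hyperparameters above could be wired up under that assumption; the dataset split, `beta`, and the preference-column layout are guesses, not taken from the card.

```python
# Hypothetical reconstruction using TRL's CPOTrainer; not the author's actual script.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import CPOConfig, CPOTrainer

base_model = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# CPOTrainer expects prompt/chosen/rejected columns; the split name is an assumption.
train_dataset = load_dataset("Unbabel/TowerAligned-v0.1", split="train")

config = CPOConfig(
    output_dir="cpo-xcomet-xl_xxl-inc7b-10p-shuff-5e-7-full-tiny",
    learning_rate=5e-7,
    per_device_train_batch_size=1,   # train_batch_size: 1
    per_device_eval_batch_size=4,    # eval_batch_size: 4
    gradient_accumulation_steps=16,  # total train batch size = 1 * 16 = 16
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-8,
    seed=42,
    beta=0.1,  # assumption: implied by Rewards/* = 0.1 * Logps/* in the eval metrics
)

trainer = CPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```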

### Training results

| Training Loss | Epoch | Step | Validation Loss | Nll Loss | Logps/best | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:------:|:----:|:---------------:|:--------:|:----------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 2.9692 | 0.1063 | 100 | 2.8057 | 1.0649 | -103.2722 | -10.3272 | -9.7340 | 0.4600 | -0.5932 | -97.3398 | -103.2722 | -1.8268 | -1.9640 |
| 3.0031 | 0.2127 | 200 | 2.7925 | 1.0588 | -102.7081 | -10.2708 | -9.6860 | 0.4580 | -0.5848 | -96.8598 | -102.7081 | -1.8260 | -1.9630 |
| 2.7823 | 0.3190 | 300 | 2.7675 | 1.0456 | -101.5052 | -10.1505 | -9.5824 | 0.4600 | -0.5681 | -95.8241 | -101.5052 | -1.8242 | -1.9611 |
| 2.8692 | 0.4254 | 400 | 2.7382 | 1.0320 | -100.2503 | -10.0250 | -9.4771 | 0.4580 | -0.5479 | -94.7710 | -100.2503 | -1.8210 | -1.9575 |
| 3.1882 | 0.5317 | 500 | 2.7126 | 1.0203 | -99.1754 | -9.9175 | -9.3884 | 0.4580 | -0.5291 | -93.8843 | -99.1754 | -1.8185 | -1.9548 |
| 2.8104 | 0.6381 | 600 | 2.6920 | 1.0101 | -98.2426 | -9.8243 | -9.3098 | 0.4640 | -0.5144 | -93.0982 | -98.2426 | -1.8161 | -1.9520 |
| 3.127 | 0.7444 | 700 | 2.6735 | 1.0018 | -97.4806 | -9.7481 | -9.2466 | 0.4680 | -0.5015 | -92.4660 | -97.4806 | -1.8148 | -1.9505 |
| 2.5488 | 0.8508 | 800 | 2.6599 | 0.9951 | -96.8657 | -9.6866 | -9.1952 | 0.4640 | -0.4914 | -91.9516 | -96.8657 | -1.8113 | -1.9467 |
| 2.9106 | 0.9571 | 900 | 2.6461 | 0.9892 | -96.3278 | -9.6328 | -9.1532 | 0.4700 | -0.4795 | -91.5325 | -96.3278 | -1.8107 | -1.9459 |
| 2.7349 | 1.0635 | 1000 | 2.6355 | 0.9845 | -95.8978 | -9.5898 | -9.1181 | 0.4660 | -0.4717 | -91.1811 | -95.8978 | -1.8091 | -1.9442 |
| 2.607 | 1.1698 | 1100 | 2.6258 | 0.9802 | -95.4924 | -9.5492 | -9.0851 | 0.4660 | -0.4642 | -90.8505 | -95.4924 | -1.8079 | -1.9429 |
| 2.5949 | 1.2762 | 1200 | 2.6187 | 0.9772 | -95.2189 | -9.5219 | -9.0638 | 0.4660 | -0.4581 | -90.6378 | -95.2189 | -1.8067 | -1.9415 |
| 3.0028 | 1.3825 | 1300 | 2.6133 | 0.9745 | -94.9713 | -9.4971 | -9.0415 | 0.4660 | -0.4556 | -90.4151 | -94.9713 | -1.8062 | -1.9409 |
| 2.5891 | 1.4889 | 1400 | 2.6075 | 0.9720 | -94.7468 | -9.4747 | -9.0242 | 0.4700 | -0.4505 | -90.2418 | -94.7468 | -1.8062 | -1.9409 |
| 2.5647 | 1.5952 | 1500 | 2.6035 | 0.9701 | -94.5637 | -9.4564 | -9.0103 | 0.4640 | -0.4460 | -90.1033 | -94.5637 | -1.8044 | -1.9389 |
| 2.566 | 1.7016 | 1600 | 2.5974 | 0.9682 | -94.3869 | -9.4387 | -8.9984 | 0.4660 | -0.4403 | -89.9837 | -94.3869 | -1.8037 | -1.9382 |
| 2.4615 | 1.8079 | 1700 | 2.5960 | 0.9672 | -94.3052 | -9.4305 | -8.9899 | 0.4760 | -0.4406 | -89.8987 | -94.3052 | -1.8029 | -1.9373 |
| 2.5336 | 1.9143 | 1800 | 2.5936 | 0.9662 | -94.2071 | -9.4207 | -8.9834 | 0.4700 | -0.4373 | -89.8344 | -94.2071 | -1.8026 | -1.9369 |
| 2.7186 | 2.0206 | 1900 | 2.5908 | 0.9653 | -94.1252 | -9.4125 | -8.9777 | 0.4820 | -0.4349 | -89.7766 | -94.1252 | -1.8021 | -1.9364 |
| 2.6496 | 2.1270 | 2000 | 2.5912 | 0.9646 | -94.0712 | -9.4071 | -8.9704 | 0.4700 | -0.4367 | -89.7039 | -94.0712 | -1.8020 | -1.9363 |
| 2.4786 | 2.2333 | 2100 | 2.5882 | 0.9642 | -94.0305 | -9.4031 | -8.9690 | 0.4820 | -0.4340 | -89.6904 | -94.0305 | -1.8022 | -1.9365 |
| 2.5261 | 2.3396 | 2200 | 2.5871 | 0.9636 | -93.9762 | -9.3976 | -8.9653 | 0.4720 | -0.4323 | -89.6528 | -93.9762 | -1.8015 | -1.9357 |
| 2.4197 | 2.4460 | 2300 | 2.5855 | 0.9634 | -93.9605 | -9.3961 | -8.9653 | 0.4720 | -0.4308 | -89.6529 | -93.9605 | -1.8015 | -1.9357 |
| 2.9723 | 2.5523 | 2400 | 2.5863 | 0.9633 | -93.9504 | -9.3950 | -8.9631 | 0.4760 | -0.4319 | -89.6311 | -93.9504 | -1.8016 | -1.9358 |
| 2.4721 | 2.6587 | 2500 | 2.5864 | 0.9634 | -93.9537 | -9.3954 | -8.9651 | 0.4740 | -0.4302 | -89.6514 | -93.9537 | -1.8014 | -1.9356 |
| 2.8984 | 2.7650 | 2600 | 2.5856 | 0.9630 | -93.9154 | -9.3915 | -8.9610 | 0.4700 | -0.4305 | -89.6100 | -93.9154 | -1.8014 | -1.9356 |
| 3.0422 | 2.8714 | 2700 | 2.5848 | 0.9630 | -93.9148 | -9.3915 | -8.9617 | 0.4800 | -0.4298 | -89.6169 | -93.9148 | -1.8015 | -1.9357 |
| 2.6226 | 2.9777 | 2800 | 2.5861 | 0.9631 | -93.9342 | -9.3934 | -8.9617 | 0.4760 | -0.4317 | -89.6171 | -93.9342 | -1.8016 | -1.9358 |

### Framework versions

- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.20.0
- Tokenizers 0.19.1
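
To recreate this environment, the pins below mirror the versions listed above (standard PyPI package names; TRL, if it was used for training, is not pinned by the card):

```text
transformers==4.41.2
torch==2.1.2
datasets==2.20.0
tokenizers==0.19.1
```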