---
license: apache-2.0
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
tags:
  - xcomet_xl_xxl
  - generated_from_trainer
model-index:
  - name: cpo-xcomet-xl_xxl-inc7b-10p-shuff-1e-7-full-tiny
    results: []
---

cpo-xcomet-xl_xxl-inc7b-10p-shuff-1e-7-full-tiny

This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T on the Unbabel/TowerAligned-v0.1 dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7960
  • Nll Loss: 1.0602
  • Logps/best: -102.8461
  • Rewards/chosen: -10.2846
  • Rewards/rejected: -9.6988
  • Rewards/accuracies: 0.4600
  • Rewards/margins: -0.5858
  • Logps/rejected: -96.9882
  • Logps/chosen: -102.8461
  • Logits/rejected: -1.8264
  • Logits/chosen: -1.9635
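
As a consistency check (not stated in the card, but implied by the ratios above), these numbers appear to match TRL's CPO convention, in which the reward is the policy log-probability scaled by β with no reference model, suggesting β = 0.1:

$$\text{rewards/chosen} = \beta \cdot \text{logps/chosen} = 0.1 \times (-102.8461) = -10.2846$$

$$\text{rewards/margins} = \text{rewards/chosen} - \text{rewards/rejected} = -10.2846 - (-9.6988) = -0.5858$$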

Model description

More information needed

Intended uses & limitations

More information needed
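
The card ships no usage example, so the following is a minimal inference sketch. The hub repository id is assumed from the model name above, and the prompt is purely hypothetical, since the card does not document a prompt format:

```python
# Minimal inference sketch for this card's model.
# The repo id is assumed from the model name above; adjust if needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "martimfasantos/cpo-xcomet-xl_xxl-inc7b-10p-shuff-1e-7-full-tiny"  # assumed

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Hypothetical translation-style prompt; not documented in the card.
prompt = (
    "Translate the following text from English into Portuguese.\n"
    "English: The weather is nice today.\n"
    "Portuguese:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```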

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-07
  • train_batch_size: 1
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
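
The training script itself is not included in this card. The sketch below shows how these hyperparameters could map onto TRL's CPOConfig/CPOTrainer (the trainer suggested by the model name and the CPO-style metrics above); the dataset split, column layout, and β value are all assumptions:

```python
# Hedged reconstruction of the training setup described above; the actual
# script is not part of this card, so treat every choice as an assumption.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import CPOConfig, CPOTrainer

base = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# CPO expects preference data with prompt/chosen/rejected columns; the split
# name and column layout of Unbabel/TowerAligned-v0.1 are assumptions here.
train_dataset = load_dataset("Unbabel/TowerAligned-v0.1", split="train")

config = CPOConfig(
    output_dir="cpo-xcomet-xl_xxl-inc7b-10p-shuff-1e-7-full-tiny",
    learning_rate=1e-7,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=16,  # effective train batch size of 16
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-8,
    seed=42,
    beta=0.1,  # assumed; inferred from the reward/log-prob ratio in the results
)

# TRL releases contemporary with Transformers 4.41 accept tokenizer= here;
# newer releases rename this parameter to processing_class=.
trainer = CPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```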

Training results

| Training Loss | Epoch | Step | Validation Loss | Nll Loss | Logps/best | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 2.9687 | 0.1063 | 100 | 2.8066 | 1.0659 | -103.3585 | -10.3358 | -9.7431 | 0.4540 | -0.5928 | -97.4308 | -103.3585 | -1.8274 | -1.9646 |
| 3.0173 | 0.2127 | 200 | 2.8063 | 1.0656 | -103.3396 | -10.3340 | -9.7402 | 0.4560 | -0.5937 | -97.4022 | -103.3396 | -1.8275 | -1.9648 |
| 2.8267 | 0.3190 | 300 | 2.8046 | 1.0650 | -103.2849 | -10.3285 | -9.7373 | 0.4540 | -0.5912 | -97.3725 | -103.2849 | -1.8273 | -1.9644 |
| 2.9404 | 0.4254 | 400 | 2.8046 | 1.0644 | -103.2318 | -10.3232 | -9.7301 | 0.4600 | -0.5931 | -97.3013 | -103.2318 | -1.8271 | -1.9643 |
| 3.3065 | 0.5317 | 500 | 2.8002 | 1.0637 | -103.1556 | -10.3156 | -9.7280 | 0.4600 | -0.5875 | -97.2803 | -103.1556 | -1.8268 | -1.9640 |
| 2.9333 | 0.6381 | 600 | 2.8021 | 1.0633 | -103.1282 | -10.3128 | -9.7212 | 0.4560 | -0.5916 | -97.2122 | -103.1282 | -1.8271 | -1.9642 |
| 3.2698 | 0.7444 | 700 | 2.8006 | 1.0627 | -103.0742 | -10.3074 | -9.7178 | 0.4580 | -0.5897 | -97.1777 | -103.0742 | -1.8268 | -1.9640 |
| 2.7002 | 0.8508 | 800 | 2.8003 | 1.0624 | -103.0458 | -10.3046 | -9.7147 | 0.4580 | -0.5899 | -97.1470 | -103.0458 | -1.8269 | -1.9641 |
| 3.0848 | 0.9571 | 900 | 2.7984 | 1.0620 | -103.0023 | -10.3002 | -9.7132 | 0.4580 | -0.5870 | -97.1324 | -103.0023 | -1.8267 | -1.9638 |
| 2.9243 | 1.0635 | 1000 | 2.7987 | 1.0617 | -102.9805 | -10.2980 | -9.7086 | 0.4580 | -0.5895 | -97.0859 | -102.9805 | -1.8268 | -1.9639 |
| 2.7945 | 1.1698 | 1100 | 2.7974 | 1.0615 | -102.9564 | -10.2956 | -9.7084 | 0.4580 | -0.5872 | -97.0842 | -102.9564 | -1.8267 | -1.9638 |
| 2.7893 | 1.2762 | 1200 | 2.7979 | 1.0613 | -102.9413 | -10.2941 | -9.7061 | 0.4620 | -0.5880 | -97.0609 | -102.9413 | -1.8266 | -1.9638 |
| 3.2162 | 1.3825 | 1300 | 2.7978 | 1.0611 | -102.9208 | -10.2921 | -9.7039 | 0.4540 | -0.5882 | -97.0387 | -102.9208 | -1.8266 | -1.9637 |
| 2.8123 | 1.4889 | 1400 | 2.7980 | 1.0611 | -102.9247 | -10.2925 | -9.7032 | 0.4580 | -0.5893 | -97.0320 | -102.9247 | -1.8266 | -1.9637 |
| 2.785 | 1.5952 | 1500 | 2.7973 | 1.0606 | -102.8798 | -10.2880 | -9.6993 | 0.4560 | -0.5887 | -96.9928 | -102.8798 | -1.8265 | -1.9636 |
| 2.7997 | 1.7016 | 1600 | 2.7952 | 1.0606 | -102.8751 | -10.2875 | -9.7026 | 0.4600 | -0.5849 | -97.0257 | -102.8751 | -1.8267 | -1.9638 |
| 2.6655 | 1.8079 | 1700 | 2.7956 | 1.0605 | -102.8628 | -10.2863 | -9.7005 | 0.4620 | -0.5858 | -97.0050 | -102.8628 | -1.8264 | -1.9635 |
| 2.7597 | 1.9143 | 1800 | 2.7966 | 1.0605 | -102.8715 | -10.2871 | -9.6999 | 0.4540 | -0.5872 | -96.9991 | -102.8715 | -1.8267 | -1.9637 |
| 2.9736 | 2.0206 | 1900 | 2.7955 | 1.0603 | -102.8511 | -10.2851 | -9.6990 | 0.4600 | -0.5861 | -96.9900 | -102.8511 | -1.8266 | -1.9637 |
| 2.8977 | 2.1270 | 2000 | 2.7954 | 1.0603 | -102.8550 | -10.2855 | -9.6990 | 0.4560 | -0.5865 | -96.9901 | -102.8550 | -1.8270 | -1.9641 |
| 2.7043 | 2.2333 | 2100 | 2.7961 | 1.0604 | -102.8632 | -10.2863 | -9.6997 | 0.4560 | -0.5867 | -96.9967 | -102.8632 | -1.8264 | -1.9635 |
| 2.7693 | 2.3396 | 2200 | 2.7951 | 1.0604 | -102.8550 | -10.2855 | -9.6998 | 0.4600 | -0.5857 | -96.9983 | -102.8550 | -1.8263 | -1.9634 |
| 2.6632 | 2.4460 | 2300 | 2.7943 | 1.0602 | -102.8407 | -10.2841 | -9.6989 | 0.4600 | -0.5851 | -96.9893 | -102.8407 | -1.8264 | -1.9635 |
| 3.2451 | 2.5523 | 2400 | 2.7953 | 1.0602 | -102.8434 | -10.2843 | -9.6989 | 0.4580 | -0.5855 | -96.9885 | -102.8434 | -1.8264 | -1.9635 |
| 2.7117 | 2.6587 | 2500 | 2.7955 | 1.0601 | -102.8357 | -10.2836 | -9.6962 | 0.4580 | -0.5873 | -96.9625 | -102.8357 | -1.8263 | -1.9634 |
| 3.148 | 2.7650 | 2600 | 2.7967 | 1.0604 | -102.8636 | -10.2864 | -9.6985 | 0.4560 | -0.5878 | -96.9853 | -102.8636 | -1.8265 | -1.9636 |
| 3.2951 | 2.8714 | 2700 | 2.7959 | 1.0602 | -102.8490 | -10.2849 | -9.6981 | 0.4620 | -0.5868 | -96.9812 | -102.8490 | -1.8263 | -1.9634 |
| 2.8486 | 2.9777 | 2800 | 2.7960 | 1.0602 | -102.8461 | -10.2846 | -9.6988 | 0.4600 | -0.5858 | -96.9882 | -102.8461 | -1.8264 | -1.9635 |

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.20.0
  • Tokenizers 0.19.1