# sft-zephyr-7b-sft-qlora-ultrafeedback-binarized-20241011-162008
This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1). The dataset is not recorded in the training metadata, though the model name indicates a QLoRA SFT run on the UltraFeedback binarized dataset. It achieves the following results on the evaluation set:
- Loss: 0.9805
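
Since the framework versions listed below include PEFT, this repository most likely hosts a QLoRA adapter rather than merged full weights. Below is a minimal loading sketch, assuming the adapter lives at `sahandrez/sft-zephyr-7b-sft-qlora-ultrafeedback` and that `accelerate` is installed for `device_map="auto"`; adjust to your setup.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-v0.1"
# Assumed adapter repo id; verify against the actual repository.
adapter_id = "sahandrez/sft-zephyr-7b-sft-qlora-ultrafeedback"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the LoRA adapter on top of the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```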
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
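
For reference, here is a hedged sketch of how these settings map onto `transformers.TrainingArguments`; the original training script is not included in this card, and whether the batch sizes are per-device or effective totals is an assumption.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="sft-zephyr-7b-sft-qlora",  # hypothetical output path
    learning_rate=2e-4,
    per_device_train_batch_size=32,  # assumed to be per-device
    per_device_eval_batch_size=64,   # assumed to be per-device
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1.0,
)
```

A QLoRA run additionally needs a quantization config and a LoRA config (rank, alpha, target modules); those values are not recorded in this card, so they are omitted here.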
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.0557 | 0.0105 | 20 | 1.2313 |
| 1.0725 | 0.0209 | 40 | 1.1635 |
| 1.0261 | 0.0314 | 60 | 1.1410 |
| 1.1577 | 0.0419 | 80 | 1.1209 |
| 1.1619 | 0.0523 | 100 | 1.1067 |
| 1.1234 | 0.0628 | 120 | 1.0981 |
| 1.0256 | 0.0733 | 140 | 1.0901 |
| 1.1511 | 0.0837 | 160 | 1.0850 |
| 1.2364 | 0.0942 | 180 | 1.0802 |
| 1.1676 | 0.1047 | 200 | 1.0761 |
| 1.2327 | 0.1151 | 220 | 1.0731 |
| 1.0082 | 0.1256 | 240 | 1.0695 |
| 0.9324 | 0.1361 | 260 | 1.0666 |
| 1.0435 | 0.1465 | 280 | 1.0630 |
| 0.8484 | 0.1570 | 300 | 1.0588 |
| 0.962 | 0.1675 | 320 | 1.0588 |
| 0.9531 | 0.1779 | 340 | 1.0549 |
| 0.8902 | 0.1884 | 360 | 1.0518 |
| 1.1103 | 0.1988 | 380 | 1.0485 |
| 1.0641 | 0.2093 | 400 | 1.0455 |
| 0.9541 | 0.2198 | 420 | 1.0431 |
| 1.0081 | 0.2302 | 440 | 1.0427 |
| 0.9761 | 0.2407 | 460 | 1.0407 |
| 1.0654 | 0.2512 | 480 | 1.0391 |
| 1.1185 | 0.2616 | 500 | 1.0367 |
| 1.0337 | 0.2721 | 520 | 1.0357 |
| 0.9059 | 0.2826 | 540 | 1.0335 |
| 1.1223 | 0.2930 | 560 | 1.0318 |
| 1.1514 | 0.3035 | 580 | 1.0300 |
| 1.0715 | 0.3140 | 600 | 1.0294 |
| 1.1336 | 0.3244 | 620 | 1.0263 |
| 1.0148 | 0.3349 | 640 | 1.0246 |
| 1.0242 | 0.3454 | 660 | 1.0238 |
| 1.1316 | 0.3558 | 680 | 1.0220 |
| 1.0114 | 0.3663 | 700 | 1.0216 |
| 1.1682 | 0.3768 | 720 | 1.0207 |
| 1.1026 | 0.3872 | 740 | 1.0180 |
| 1.0854 | 0.3977 | 760 | 1.0182 |
| 0.8933 | 0.4082 | 780 | 1.0164 |
| 1.0233 | 0.4186 | 800 | 1.0153 |
| 1.1105 | 0.4291 | 820 | 1.0140 |
| 0.8441 | 0.4396 | 840 | 1.0124 |
| 0.963 | 0.4500 | 860 | 1.0113 |
| 1.0488 | 0.4605 | 880 | 1.0093 |
| 0.8147 | 0.4710 | 900 | 1.0084 |
| 1.0005 | 0.4814 | 920 | 1.0081 |
| 0.959 | 0.4919 | 940 | 1.0071 |
| 0.8878 | 0.5024 | 960 | 1.0062 |
| 1.238 | 0.5128 | 980 | 1.0048 |
| 0.9114 | 0.5233 | 1000 | 1.0032 |
| 1.0474 | 0.5338 | 1020 | 1.0017 |
| 0.9858 | 0.5442 | 1040 | 1.0009 |
| 0.9642 | 0.5547 | 1060 | 1.0007 |
| 0.9116 | 0.5651 | 1080 | 0.9992 |
| 0.9444 | 0.5756 | 1100 | 0.9978 |
| 1.0698 | 0.5861 | 1120 | 0.9970 |
| 0.9379 | 0.5965 | 1140 | 0.9959 |
| 0.8902 | 0.6070 | 1160 | 0.9950 |
| 1.0654 | 0.6175 | 1180 | 0.9941 |
| 1.1352 | 0.6279 | 1200 | 0.9935 |
| 1.0493 | 0.6384 | 1220 | 0.9922 |
| 0.9792 | 0.6489 | 1240 | 0.9913 |
| 0.8634 | 0.6593 | 1260 | 0.9903 |
| 0.8152 | 0.6698 | 1280 | 0.9898 |
| 1.0059 | 0.6803 | 1300 | 0.9890 |
| 0.9244 | 0.6907 | 1320 | 0.9884 |
| 0.9918 | 0.7012 | 1340 | 0.9876 |
| 1.0536 | 0.7117 | 1360 | 0.9872 |
| 0.9883 | 0.7221 | 1380 | 0.9866 |
| 0.9426 | 0.7326 | 1400 | 0.9863 |
| 0.8653 | 0.7431 | 1420 | 0.9855 |
| 0.863 | 0.7535 | 1440 | 0.9849 |
| 0.9217 | 0.7640 | 1460 | 0.9847 |
| 1.0365 | 0.7745 | 1480 | 0.9844 |
| 0.8865 | 0.7849 | 1500 | 0.9841 |
| 1.1006 | 0.7954 | 1520 | 0.9836 |
| 0.9393 | 0.8059 | 1540 | 0.9832 |
| 0.8455 | 0.8163 | 1560 | 0.9826 |
| 1.1479 | 0.8268 | 1580 | 0.9823 |
| 1.0578 | 0.8373 | 1600 | 0.9820 |
| 0.7279 | 0.8477 | 1620 | 0.9818 |
| 0.973 | 0.8582 | 1640 | 0.9815 |
| 1.1227 | 0.8687 | 1660 | 0.9812 |
| 0.9897 | 0.8791 | 1680 | 0.9811 |
| 0.8196 | 0.8896 | 1700 | 0.9810 |
| 0.9309 | 0.9001 | 1720 | 0.9808 |
| 0.8774 | 0.9105 | 1740 | 0.9808 |
| 0.9671 | 0.9210 | 1760 | 0.9807 |
| 1.0849 | 0.9314 | 1780 | 0.9807 |
| 1.0233 | 0.9419 | 1800 | 0.9806 |
| 0.9742 | 0.9524 | 1820 | 0.9806 |
| 1.029 | 0.9628 | 1840 | 0.9806 |
| 1.0048 | 0.9733 | 1860 | 0.9806 |
| 0.9348 | 0.9838 | 1880 | 0.9805 |
| 0.8959 | 0.9942 | 1900 | 0.9805 |
### Framework versions
- PEFT 0.12.0
- Transformers 4.45.2
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.20.0
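
To reproduce results, it helps to match these versions. Below is a small sanity check comparing the installed environment against the versions listed above.

```python
import datasets, peft, tokenizers, torch, transformers

# Versions taken from the list above; mismatches are warnings, not errors.
expected = {
    datasets: "2.20.0",
    peft: "0.12.0",
    tokenizers: "0.20.0",
    torch: "2.4.0+cu121",
    transformers: "4.45.2",
}
for module, version in expected.items():
    status = "OK" if module.__version__ == version else "MISMATCH"
    print(f"{module.__name__}: installed {module.__version__}, card lists {version} [{status}]")
```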