---
base_model: mistralai/Mixtral-8x7B-v0.1
datasets:
  - generator
library_name: peft
license: apache-2.0
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: Mixtral_Alpace_v2
    results: []
---

# Mixtral_Alpace_v2

This model is a fine-tuned version of [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) on the generator dataset. It achieves the following results on the evaluation set:

- Loss: 0.5881
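
Because this repository is a PEFT adapter (`library_name: peft`) rather than a full set of model weights, inference loads the Mixtral base model first and then attaches the adapter. A minimal sketch, assuming the adapter lives at `cem13/complaint_to_sythoms_mix_8x7b` and that `accelerate` is installed so `device_map="auto"` works; the example prompt is purely illustrative, since the card does not document the prompt format used for fine-tuning:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "mistralai/Mixtral-8x7B-v0.1"
ADAPTER_ID = "cem13/complaint_to_sythoms_mix_8x7b"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype="auto", device_map="auto")

# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, ADAPTER_ID)
model.eval()

# Illustrative prompt; the actual training prompt template is not documented here.
inputs = tokenizer("Complaint: persistent cough and mild fever.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```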

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2.5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 15
- num_epochs: 15
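
The training script is not included in this repository; the sketch below shows how the hyperparameters above could map onto `transformers.TrainingArguments`. The `output_dir` value, the evaluation interval, and the surrounding `trl` `SFTTrainer` wiring are assumptions, and the Adam betas/epsilon listed above are the optimizer defaults:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="Mixtral_Alpace_v2",   # assumed output directory
    learning_rate=2.5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=15,
    num_train_epochs=15,
    eval_strategy="steps",
    eval_steps=10,                    # inferred from the results table below
)
```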

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 1.5291 | 0.0870 | 10 | 1.6326 |
| 1.58 | 0.1739 | 20 | 1.5665 |
| 1.4109 | 0.2609 | 30 | 1.4856 |
| 1.4493 | 0.3478 | 40 | 1.4159 |
| 1.2503 | 0.4348 | 50 | 1.3493 |
| 1.2441 | 0.5217 | 60 | 1.2719 |
| 1.1923 | 0.6087 | 70 | 1.1930 |
| 1.1158 | 0.6957 | 80 | 1.1193 |
| 1.0184 | 0.7826 | 90 | 1.0541 |
| 1.0231 | 0.8696 | 100 | 1.0056 |
| 0.9731 | 0.9565 | 110 | 0.9619 |
| 0.892 | 1.0435 | 120 | 0.9170 |
| 0.911 | 1.1304 | 130 | 0.8727 |
| 0.7789 | 1.2174 | 140 | 0.8338 |
| 0.8049 | 1.3043 | 150 | 0.8041 |
| 0.7691 | 1.3913 | 160 | 0.7788 |
| 0.7869 | 1.4783 | 170 | 0.7589 |
| 0.7366 | 1.5652 | 180 | 0.7428 |
| 0.7436 | 1.6522 | 190 | 0.7282 |
| 0.7271 | 1.7391 | 200 | 0.7157 |
| 0.6809 | 1.8261 | 210 | 0.7056 |
| 0.7068 | 1.9130 | 220 | 0.6960 |
| 0.6446 | 2.0 | 230 | 0.6872 |
| 0.6682 | 2.0870 | 240 | 0.6819 |
| 0.7003 | 2.1739 | 250 | 0.6745 |
| 0.6859 | 2.2609 | 260 | 0.6701 |
| 0.6169 | 2.3478 | 270 | 0.6655 |
| 0.666 | 2.4348 | 280 | 0.6607 |
| 0.6325 | 2.5217 | 290 | 0.6575 |
| 0.6408 | 2.6087 | 300 | 0.6536 |
| 0.6371 | 2.6957 | 310 | 0.6507 |
| 0.5933 | 2.7826 | 320 | 0.6474 |
| 0.6313 | 2.8696 | 330 | 0.6450 |
| 0.6453 | 2.9565 | 340 | 0.6421 |
| 0.6807 | 3.0435 | 350 | 0.6407 |
| 0.6217 | 3.1304 | 360 | 0.6390 |
| 0.589 | 3.2174 | 370 | 0.6355 |
| 0.5591 | 3.3043 | 380 | 0.6337 |
| 0.6818 | 3.3913 | 390 | 0.6319 |
| 0.6269 | 3.4783 | 400 | 0.6306 |
| 0.611 | 3.5652 | 410 | 0.6286 |
| 0.5602 | 3.6522 | 420 | 0.6268 |
| 0.6735 | 3.7391 | 430 | 0.6251 |
| 0.5269 | 3.8261 | 440 | 0.6246 |
| 0.6109 | 3.9130 | 450 | 0.6232 |
| 0.5745 | 4.0 | 460 | 0.6221 |
| 0.6348 | 4.0870 | 470 | 0.6227 |
| 0.5398 | 4.1739 | 480 | 0.6203 |
| 0.6145 | 4.2609 | 490 | 0.6194 |
| 0.621 | 4.3478 | 500 | 0.6178 |
| 0.6123 | 4.4348 | 510 | 0.6172 |
| 0.6113 | 4.5217 | 520 | 0.6162 |
| 0.5991 | 4.6087 | 530 | 0.6154 |
| 0.5244 | 4.6957 | 540 | 0.6143 |
| 0.5832 | 4.7826 | 550 | 0.6136 |
| 0.6284 | 4.8696 | 560 | 0.6120 |
| 0.54 | 4.9565 | 570 | 0.6121 |
| 0.541 | 5.0435 | 580 | 0.6120 |
| 0.5204 | 5.1304 | 590 | 0.6108 |
| 0.5961 | 5.2174 | 600 | 0.6101 |
| 0.5522 | 5.3043 | 610 | 0.6098 |
| 0.5778 | 5.3913 | 620 | 0.6087 |
| 0.6059 | 5.4783 | 630 | 0.6090 |
| 0.5852 | 5.5652 | 640 | 0.6085 |
| 0.5687 | 5.6522 | 650 | 0.6072 |
| 0.5685 | 5.7391 | 660 | 0.6061 |
| 0.593 | 5.8261 | 670 | 0.6052 |
| 0.5975 | 5.9130 | 680 | 0.6055 |
| 0.5489 | 6.0 | 690 | 0.6047 |
| 0.567 | 6.0870 | 700 | 0.6049 |
| 0.5706 | 6.1739 | 710 | 0.6035 |
| 0.658 | 6.2609 | 720 | 0.6024 |
| 0.559 | 6.3478 | 730 | 0.6023 |
| 0.545 | 6.4348 | 740 | 0.6019 |
| 0.6096 | 6.5217 | 750 | 0.6021 |
| 0.5385 | 6.6087 | 760 | 0.6018 |
| 0.5505 | 6.6957 | 770 | 0.6012 |
| 0.5058 | 6.7826 | 780 | 0.6003 |
| 0.5899 | 6.8696 | 790 | 0.5999 |
| 0.5102 | 6.9565 | 800 | 0.5995 |
| 0.5185 | 7.0435 | 810 | 0.5995 |
| 0.5055 | 7.1304 | 820 | 0.5991 |
| 0.5907 | 7.2174 | 830 | 0.5997 |
| 0.5636 | 7.3043 | 840 | 0.5991 |
| 0.5505 | 7.3913 | 850 | 0.5986 |
| 0.5621 | 7.4783 | 860 | 0.5977 |
| 0.4968 | 7.5652 | 870 | 0.5976 |
| 0.5713 | 7.6522 | 880 | 0.5970 |
| 0.5968 | 7.7391 | 890 | 0.5970 |
| 0.531 | 7.8261 | 900 | 0.5964 |
| 0.538 | 7.9130 | 910 | 0.5959 |
| 0.6087 | 8.0 | 920 | 0.5959 |
| 0.5845 | 8.0870 | 930 | 0.5963 |
| 0.5197 | 8.1739 | 940 | 0.5960 |
| 0.5128 | 8.2609 | 950 | 0.5959 |
| 0.5613 | 8.3478 | 960 | 0.5956 |
| 0.5268 | 8.4348 | 970 | 0.5953 |
| 0.5696 | 8.5217 | 980 | 0.5952 |
| 0.5755 | 8.6087 | 990 | 0.5941 |
| 0.5014 | 8.6957 | 1000 | 0.5945 |
| 0.5568 | 8.7826 | 1010 | 0.5936 |
| 0.5934 | 8.8696 | 1020 | 0.5944 |
| 0.5178 | 8.9565 | 1030 | 0.5941 |
| 0.4618 | 9.0435 | 1040 | 0.5936 |
| 0.4867 | 9.1304 | 1050 | 0.5934 |
| 0.5402 | 9.2174 | 1060 | 0.5937 |
| 0.5177 | 9.3043 | 1070 | 0.5936 |
| 0.5825 | 9.3913 | 1080 | 0.5926 |
| 0.5523 | 9.4783 | 1090 | 0.5929 |
| 0.583 | 9.5652 | 1100 | 0.5920 |
| 0.5232 | 9.6522 | 1110 | 0.5927 |
| 0.5367 | 9.7391 | 1120 | 0.5920 |
| 0.5321 | 9.8261 | 1130 | 0.5913 |
| 0.5672 | 9.9130 | 1140 | 0.5910 |
| 0.5549 | 10.0 | 1150 | 0.5910 |
| 0.5191 | 10.0870 | 1160 | 0.5915 |
| 0.5463 | 10.1739 | 1170 | 0.5915 |
| 0.5275 | 10.2609 | 1180 | 0.5913 |
| 0.5484 | 10.3478 | 1190 | 0.5915 |
| 0.5293 | 10.4348 | 1200 | 0.5910 |
| 0.519 | 10.5217 | 1210 | 0.5903 |
| 0.5129 | 10.6087 | 1220 | 0.5898 |
| 0.5793 | 10.6957 | 1230 | 0.5900 |
| 0.4481 | 10.7826 | 1240 | 0.5901 |
| 0.5309 | 10.8696 | 1250 | 0.5903 |
| 0.5887 | 10.9565 | 1260 | 0.5898 |
| 0.5109 | 11.0435 | 1270 | 0.5907 |
| 0.5776 | 11.1304 | 1280 | 0.5902 |
| 0.4984 | 11.2174 | 1290 | 0.5898 |
| 0.5656 | 11.3043 | 1300 | 0.5898 |
| 0.4931 | 11.3913 | 1310 | 0.5902 |
| 0.531 | 11.4783 | 1320 | 0.5900 |
| 0.5163 | 11.5652 | 1330 | 0.5892 |
| 0.5314 | 11.6522 | 1340 | 0.5894 |
| 0.4766 | 11.7391 | 1350 | 0.5893 |
| 0.5201 | 11.8261 | 1360 | 0.5896 |
| 0.6127 | 11.9130 | 1370 | 0.5889 |
| 0.5441 | 12.0 | 1380 | 0.5888 |
| 0.5258 | 12.0870 | 1390 | 0.5894 |
| 0.5722 | 12.1739 | 1400 | 0.5887 |
| 0.5228 | 12.2609 | 1410 | 0.5891 |
| 0.524 | 12.3478 | 1420 | 0.5884 |
| 0.4951 | 12.4348 | 1430 | 0.5894 |
| 0.5235 | 12.5217 | 1440 | 0.5893 |
| 0.5071 | 12.6087 | 1450 | 0.5889 |
| 0.5417 | 12.6957 | 1460 | 0.5886 |
| 0.4882 | 12.7826 | 1470 | 0.5889 |
| 0.548 | 12.8696 | 1480 | 0.5889 |
| 0.529 | 12.9565 | 1490 | 0.5889 |
| 0.5646 | 13.0435 | 1500 | 0.5887 |
| 0.5142 | 13.1304 | 1510 | 0.5889 |
| 0.5161 | 13.2174 | 1520 | 0.5886 |
| 0.5008 | 13.3043 | 1530 | 0.5888 |
| 0.5187 | 13.3913 | 1540 | 0.5887 |
| 0.5334 | 13.4783 | 1550 | 0.5886 |
| 0.5099 | 13.5652 | 1560 | 0.5884 |
| 0.5644 | 13.6522 | 1570 | 0.5888 |
| 0.5242 | 13.7391 | 1580 | 0.5882 |
| 0.4912 | 13.8261 | 1590 | 0.5886 |
| 0.5459 | 13.9130 | 1600 | 0.5884 |
| 0.5204 | 14.0 | 1610 | 0.5881 |
| 0.4644 | 14.0870 | 1620 | 0.5884 |
| 0.5364 | 14.1739 | 1630 | 0.5885 |
| 0.5852 | 14.2609 | 1640 | 0.5887 |
| 0.5135 | 14.3478 | 1650 | 0.5884 |
| 0.5192 | 14.4348 | 1660 | 0.5885 |
| 0.5093 | 14.5217 | 1670 | 0.5880 |
| 0.5398 | 14.6087 | 1680 | 0.5884 |
| 0.469 | 14.6957 | 1690 | 0.5882 |
| 0.5163 | 14.7826 | 1700 | 0.5883 |
| 0.5165 | 14.8696 | 1710 | 0.5883 |
| 0.5441 | 14.9565 | 1720 | 0.5881 |

### Framework versions

- PEFT 0.12.0
- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
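
To reproduce this environment, it can help to confirm that the installed packages match the versions listed above; a small, hypothetical check (package names assumed to be the standard PyPI distributions):

```python
from importlib.metadata import version

# Versions reported in this model card.
expected = {
    "peft": "0.12.0",
    "transformers": "4.44.0",
    "torch": "2.4.0",       # card lists Pytorch 2.4.0+cu121
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}

for package, wanted in expected.items():
    installed = version(package)
    status = "OK" if installed.startswith(wanted) else "MISMATCH"
    print(f"{package}: installed {installed}, expected {wanted} -> {status}")
```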