
Mixtral_texmin

This model is a fine-tuned version of mistralai/Mixtral-8x7B-v0.1 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0053

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2.5e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.03
  • training_steps: 1000
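
To make the schedule concrete, here is a minimal sketch of a linear learning-rate schedule with warmup, using the learning rate and step count listed above. It is an illustration in plain Python, not the training code used for this model. The function name `linear_schedule_lr` is hypothetical, and `warmup_steps=30` is an assumption: it reads the listed value 0.03 as a warmup ratio applied to 1000 steps, which the card itself does not confirm.

```python
def linear_schedule_lr(step, base_lr=2.5e-5, total_steps=1000, warmup_steps=30):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=30 is an assumption (0.03 read as a ratio of 1000 steps).
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 15, 30, 515, 1000):
        print(step, linear_schedule_lr(step))
```

Passing `warmup_steps=0` gives the other possible reading, in which the rate starts at 2.5e-05 and decays linearly from the first step.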

Training results

Training Loss  Epoch   Step  Validation Loss
1.1336         10.0    10    0.9527
0.994          20.0    20    0.8577
0.8959         30.0    30    0.7932
0.8284         40.0    40    0.7371
0.7632         50.0    50    0.6634
0.6947         60.0    60    0.6000
0.6178         70.0    70    0.5247
0.5351         80.0    80    0.4482
0.4532         90.0    90    0.3783
0.3764         100.0   100   0.3120
0.3038         110.0   110   0.2446
0.232          120.0   120   0.1830
0.1668         130.0   130   0.1273
0.1111         140.0   140   0.0854
0.0722         150.0   150   0.0608
0.0506         160.0   160   0.0479
0.0406         170.0   170   0.0422
0.0351         180.0   180   0.0386
0.0316         190.0   190   0.0364
0.0292         200.0   200   0.0350
0.0271         210.0   210   0.0329
0.0253         220.0   220   0.0311
0.0238         230.0   230   0.0303
0.0223         240.0   240   0.0285
0.021          250.0   250   0.0276
0.0196         260.0   260   0.0262
0.0184         270.0   270   0.0246
0.0174         280.0   280   0.0234
0.0165         290.0   290   0.0223
0.0153         300.0   300   0.0214
0.0147         310.0   310   0.0218
0.0137         320.0   320   0.0203
0.0129         330.0   330   0.0198
0.012          340.0   340   0.0190
0.0114         350.0   350   0.0217
0.0107         360.0   360   0.0183
0.0101         370.0   370   0.0149
0.0097         380.0   380   0.0149
0.0094         390.0   390   0.0145
0.0088         400.0   400   0.0140
0.0082         410.0   410   0.0132
0.0072         420.0   420   0.0122
0.0067         430.0   430   0.0115
0.0061         440.0   440   0.0106
0.0057         450.0   450   0.0102
0.0053         460.0   460   0.0095
0.005          470.0   470   0.0088
0.0048         480.0   480   0.0086
0.0047         490.0   490   0.0081
0.0045         500.0   500   0.0085
0.0045         510.0   510   0.0080
0.0043         520.0   520   0.0082
0.0042         530.0   530   0.0078
0.0041         540.0   540   0.0076
0.004          550.0   550   0.0075
0.0039         560.0   560   0.0074
0.0038         570.0   570   0.0072
0.0038         580.0   580   0.0072
0.0038         590.0   590   0.0070
0.0037         600.0   600   0.0070
0.0036         610.0   610   0.0069
0.0036         620.0   620   0.0068
0.0035         630.0   630   0.0067
0.0035         640.0   640   0.0066
0.0034         650.0   650   0.0064
0.0034         660.0   660   0.0064
0.0033         670.0   670   0.0064
0.0033         680.0   680   0.0064
0.0032         690.0   690   0.0062
0.0032         700.0   700   0.0062
0.0032         710.0   710   0.0061
0.0032         720.0   720   0.0060
0.0031         730.0   730   0.0060
0.0031         740.0   740   0.0060
0.0031         750.0   750   0.0059
0.003          760.0   760   0.0058
0.003          770.0   770   0.0057
0.003          780.0   780   0.0058
0.003          790.0   790   0.0057
0.0029         800.0   800   0.0056
0.0029         810.0   810   0.0055
0.0029         820.0   820   0.0056
0.0029         830.0   830   0.0055
0.0029         840.0   840   0.0054
0.0029         850.0   850   0.0055
0.0028         860.0   860   0.0055
0.0028         870.0   870   0.0055
0.0028         880.0   880   0.0054
0.0028         890.0   890   0.0053
0.0028         900.0   900   0.0053
0.0028         910.0   910   0.0054
0.0027         920.0   920   0.0053
0.0027         930.0   930   0.0053
0.0027         940.0   940   0.0052
0.0027         950.0   950   0.0052
0.0027         960.0   960   0.0053
0.0027         970.0   970   0.0052
0.0027         980.0   980   0.0053
0.0027         990.0   990   0.0052
0.0027         1000.0  1000  0.0053

Framework versions

  • PEFT 0.7.2.dev0
  • Transformers 4.37.0.dev0
  • Pytorch 2.1.2+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.0
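
Since the card lists PEFT among the framework versions, the weights are presumably a PEFT adapter applied on top of the base model. The following is a minimal, untested sketch of how such an adapter is commonly loaded. The repository id `tigerbhai/Mixtral_texmin` is taken from the Hub page for this card; the function name `load_model`, the half-precision dtype, and `device_map="auto"` (which needs the `accelerate` package) are assumptions, not settings documented by the author.

```python
BASE_MODEL_ID = "mistralai/Mixtral-8x7B-v0.1"  # base model named in this card
ADAPTER_ID = "tigerbhai/Mixtral_texmin"        # repository id as shown on the Hub page


def load_model(adapter_id=ADAPTER_ID, base_id=BASE_MODEL_ID):
    """Load the base model, then attach the adapter weights (hypothetical helper)."""
    # Heavy dependencies are imported lazily so the module itself stays importable.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.float16,  # assumption: half precision to reduce memory use
        device_map="auto",          # assumption: let accelerate place the layers
    )
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_model()` downloads the full Mixtral-8x7B checkpoint, so it needs substantial disk space and GPU memory.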