
qlora-mistral-hackatone-yandexq

This model is a QLoRA (PEFT) adapter fine-tuned from TheBloke/Mistral-7B-Instruct-v0.2-GPTQ on an unknown dataset. It achieves the following result on the evaluation set:

  • Loss: 1.8327
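
Since this repository contains a PEFT (QLoRA) adapter rather than full model weights, it is loaded on top of the GPTQ-quantized base model. Below is a minimal, untested sketch of how that could look with transformers and peft; the adapter id ggeorge/qlora-mistral-hackatone-yandexq is taken from this card, while the prompt and generation settings are purely illustrative.

```python
# Minimal sketch: load the GPTQ base model, then apply this QLoRA adapter.
# Assumes optimum and auto-gptq are installed (needed for GPTQ checkpoints,
# though they are not listed under "Framework versions" on this card).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
adapter_id = "ggeorge/qlora-mistral-hackatone-yandexq"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Mistral-Instruct expects [INST] ... [/INST] prompt formatting.
prompt = "[INST] What is QLoRA? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```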

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 60
  • mixed_precision_training: Native AMP
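
For reference, here is a hedged sketch of how the values above map onto transformers.TrainingArguments. Only the listed values come from this card; output_dir and the use of fp16 for "Native AMP" are assumptions, and the dataset, model preparation, and LoRA configuration are not documented here.

```python
# Hedged sketch: the hyperparameters above expressed as TrainingArguments.
# The optimizer (Adam, betas=(0.9, 0.999), epsilon=1e-08) matches the
# transformers default (AdamW), so no explicit optimizer argument is needed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qlora-mistral-hackatone-yandexq",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # 8 per device * 4 steps = 32 total
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=60,
    fp16=True,  # "Native AMP" mixed precision (assumed fp16 rather than bf16)
)
```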

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.0167 | 1.0 | 1 | 1.9699 |
| 2.0949 | 2.0 | 2 | 1.9681 |
| 2.0703 | 3.0 | 3 | 1.9624 |
| 2.0674 | 4.0 | 4 | 1.9563 |
| 2.0057 | 5.0 | 5 | 1.9500 |
| 2.0534 | 6.0 | 6 | 1.9431 |
| 1.9912 | 7.0 | 7 | 1.9359 |
| 2.0333 | 8.0 | 8 | 1.9285 |
| 1.9934 | 9.0 | 9 | 1.9210 |
| 2.0358 | 10.0 | 10 | 1.9136 |
| 1.9727 | 11.0 | 11 | 1.9064 |
| 1.9698 | 12.0 | 12 | 1.8994 |
| 1.9983 | 13.0 | 13 | 1.8928 |
| 1.981 | 14.0 | 14 | 1.8865 |
| 1.9554 | 15.0 | 15 | 1.8807 |
| 1.935 | 16.0 | 16 | 1.8755 |
| 1.9203 | 17.0 | 17 | 1.8705 |
| 1.9371 | 18.0 | 18 | 1.8663 |
| 1.9184 | 19.0 | 19 | 1.8625 |
| 1.938 | 20.0 | 20 | 1.8592 |
| 1.94 | 21.0 | 21 | 1.8565 |
| 1.9062 | 22.0 | 22 | 1.8542 |
| 1.9293 | 23.0 | 23 | 1.8520 |
| 1.9464 | 24.0 | 24 | 1.8503 |
| 1.9271 | 25.0 | 25 | 1.8488 |
| 1.8998 | 26.0 | 26 | 1.8473 |
| 1.9393 | 27.0 | 27 | 1.8461 |
| 1.9188 | 28.0 | 28 | 1.8449 |
| 1.9117 | 29.0 | 29 | 1.8438 |
| 1.8974 | 30.0 | 30 | 1.8428 |
| 1.9181 | 31.0 | 31 | 1.8418 |
| 1.9047 | 32.0 | 32 | 1.8409 |
| 1.8977 | 33.0 | 33 | 1.8400 |
| 1.8937 | 34.0 | 34 | 1.8392 |
| 1.8801 | 35.0 | 35 | 1.8385 |
| 1.9149 | 36.0 | 36 | 1.8377 |
| 1.9027 | 37.0 | 37 | 1.8372 |
| 1.9076 | 38.0 | 38 | 1.8366 |
| 1.8718 | 39.0 | 39 | 1.8362 |
| 1.9125 | 40.0 | 40 | 1.8357 |
| 1.8903 | 41.0 | 41 | 1.8353 |
| 1.8668 | 42.0 | 42 | 1.8350 |
| 1.8653 | 43.0 | 43 | 1.8347 |
| 1.9068 | 44.0 | 44 | 1.8345 |
| 1.869 | 45.0 | 45 | 1.8342 |
| 1.8844 | 46.0 | 46 | 1.8340 |
| 1.9001 | 47.0 | 47 | 1.8338 |
| 1.886 | 48.0 | 48 | 1.8336 |
| 1.8847 | 49.0 | 49 | 1.8335 |
| 1.8566 | 50.0 | 50 | 1.8333 |
| 1.8729 | 51.0 | 51 | 1.8332 |
| 1.8736 | 52.0 | 52 | 1.8330 |
| 1.9098 | 53.0 | 53 | 1.8330 |
| 1.897 | 54.0 | 54 | 1.8329 |
| 1.8966 | 55.0 | 55 | 1.8328 |
| 1.8942 | 56.0 | 56 | 1.8328 |
| 1.871 | 57.0 | 57 | 1.8328 |
| 1.8434 | 58.0 | 58 | 1.8327 |
| 1.8743 | 59.0 | 59 | 1.8327 |
| 1.8472 | 60.0 | 60 | 1.8327 |

Framework versions

  • PEFT 0.10.0
  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
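
A quick way to confirm a local environment matches these pins is a standard-library version check; this is only a sketch, and the torch pin includes the CUDA 12.1 build suffix listed above.

```python
# Sketch: verify installed package versions against the pins listed above.
from importlib.metadata import version

pins = {
    "peft": "0.10.0",
    "transformers": "4.38.2",
    "torch": "2.1.0+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}
for pkg, expected in pins.items():
    installed = version(pkg)
    status = "OK" if installed == expected else f"expected {expected}"
    print(f"{pkg}: {installed} ({status})")
```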