Mistral-7B-text-to-sql-flash-attention-2-FAISS-NEWPOC

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5718
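
Below is a minimal usage sketch, assuming this repository hosts PEFT adapter weights on top of the base model named above; the prompt and generation settings are purely illustrative, not part of the documented training setup:

```python
# Sketch: load the base model, then apply this adapter with PEFT.
# Repository names are taken from this card; everything else is illustrative.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-Instruct-v0.3"
adapter_id = "frankmorales2020/Mistral-7B-text-to-sql-flash-attention-2-FAISS-NEWPOC"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# Illustrative text-to-SQL prompt; the exact prompt template used in
# training is not documented on this card.
prompt = "Translate to SQL: list all customers who placed an order in 2023."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```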

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1.25e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • lr_scheduler_warmup_steps: 15
  • num_epochs: 50
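
A hedged sketch of how these values map onto transformers.TrainingArguments in Transformers 4.42; only the values listed above come from this card, output_dir is a placeholder, and the model/dataset wiring is omitted:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mistral-7b-text-to-sql-newpoc",  # placeholder path
    learning_rate=1.25e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,  # 32 x 8 = 256 total train batch size
    lr_scheduler_type="constant",
    warmup_ratio=0.03,   # both warmup fields are reported on this card;
    warmup_steps=15,     # when both are set, Transformers uses warmup_steps
    num_train_epochs=50,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the
    # TrainingArguments defaults, so no optimizer fields are needed here.
)
```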

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 1.6025        | 2.1053  | 10   | 1.3983          |
| 1.2594        | 4.2105  | 20   | 1.1677          |
| 1.1238        | 6.3158  | 30   | 1.0695          |
| 1.0331        | 8.4211  | 40   | 0.9917          |
| 0.9668        | 10.5263 | 50   | 0.9300          |
| 0.9064        | 12.6316 | 60   | 0.8783          |
| 0.8569        | 14.7368 | 70   | 0.8309          |
| 0.8099        | 16.8421 | 80   | 0.7842          |
| 0.7632        | 18.9474 | 90   | 0.7365          |
| 0.7188        | 21.0526 | 100  | 0.6991          |
| 0.6855        | 23.1579 | 110  | 0.6714          |
| 0.6587        | 25.2632 | 120  | 0.6492          |
| 0.6383        | 27.3684 | 130  | 0.6312          |
| 0.6206        | 29.4737 | 140  | 0.6171          |
| 0.6077        | 31.5789 | 150  | 0.6062          |
| 0.5964        | 33.6842 | 160  | 0.5973          |
| 0.5881        | 35.7895 | 170  | 0.5898          |
| 0.5805        | 37.8947 | 180  | 0.5831          |
| 0.5732        | 40.0    | 190  | 0.5771          |
| 0.5665        | 42.1053 | 200  | 0.5718          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.42.3
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
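
These versions can be pinned to approximate the original environment (a sketch; the PyTorch wheel must match the local CUDA setup, since the original run used a cu121 build):

```
peft==0.11.1
transformers==4.42.3
torch==2.3.0
datasets==2.20.0
tokenizers==0.19.1
```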