zephyr-7b-sft-lora-accum8-lr5e_6

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1811
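
Assuming this is the standard token-level cross-entropy loss reported by the Hugging Face Trainer, it corresponds to an evaluation perplexity of exp(1.1811) ≈ 3.26.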

Model description

More information needed

Intended uses & limitations

More information needed
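
The card does not describe usage, but the model name and base model suggest this repo hosts a LoRA adapter for mistralai/Mistral-7B-v0.1. A minimal loading sketch, assuming the adapter was saved with the peft library (which the card does not state; peft is not in the framework versions below):

```python
# Hypothetical usage sketch: assumes this repo contains a PEFT/LoRA adapter
# for mistralai/Mistral-7B-v0.1, which the card itself does not confirm.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "shkang/zephyr-7b-sft-lora-accum8-lr5e_6"  # this repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```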

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 5e-06
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 50.0
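
Note that total_train_batch_size = per-device batch 4 × 2 GPUs × 8 gradient-accumulation steps = 64. A minimal sketch of how these values might map onto transformers.TrainingArguments (the output_dir and the two-GPU launch command are assumptions, not from the card):

```python
# Sketch only: reconstructs the listed hyperparameters as TrainingArguments.
# Launch with 2 processes to match num_devices, e.g.:
#   torchrun --nproc_per_node=2 train.py
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="zephyr-7b-sft-lora-accum8-lr5e_6",  # assumed name
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,  # effective train batch: 4 * 2 GPUs * 8 = 64
    lr_scheduler_type="cosine",
    num_train_epochs=50.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```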

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.0831 | 0.51 | 6 | 2.0527 |
| 2.0602 | 1.53 | 13 | 2.0217 |
| 2.0191 | 2.55 | 20 | 1.9817 |
| 1.9825 | 3.57 | 27 | 1.9451 |
| 1.9646 | 4.51 | 33 | 1.9045 |
| 1.9049 | 5.53 | 40 | 1.8696 |
| 1.8708 | 6.55 | 47 | 1.8370 |
| 1.8597 | 7.57 | 54 | 1.8034 |
| 1.8119 | 8.51 | 60 | 1.7828 |
| 1.7937 | 9.53 | 67 | 1.7592 |
| 1.7699 | 10.55 | 74 | 1.7352 |
| 1.7492 | 11.57 | 81 | 1.7144 |
| 1.7242 | 12.51 | 87 | 1.6988 |
| 1.7190 | 13.53 | 94 | 1.6788 |
| 1.6837 | 14.55 | 101 | 1.6611 |
| 1.6724 | 15.57 | 108 | 1.6420 |
| 1.6683 | 16.51 | 114 | 1.6240 |
| 1.6336 | 17.53 | 121 | 1.6057 |
| 1.6163 | 18.55 | 128 | 1.5872 |
| 1.5923 | 19.57 | 135 | 1.5668 |
| 1.5794 | 20.51 | 141 | 1.5539 |
| 1.5658 | 21.53 | 148 | 1.5327 |
| 1.5324 | 22.55 | 155 | 1.5131 |
| 1.5230 | 23.57 | 162 | 1.4916 |
| 1.5045 | 24.51 | 168 | 1.4707 |
| 1.4709 | 25.53 | 175 | 1.4488 |
| 1.4638 | 26.55 | 182 | 1.4232 |
| 1.4426 | 27.57 | 189 | 1.3992 |
| 1.4043 | 28.51 | 195 | 1.3776 |
| 1.3797 | 29.53 | 202 | 1.3548 |
| 1.3790 | 30.55 | 209 | 1.3348 |
| 1.3437 | 31.57 | 216 | 1.3199 |
| 1.3380 | 32.51 | 222 | 1.3040 |
| 1.3048 | 33.53 | 229 | 1.2874 |
| 1.2980 | 34.55 | 236 | 1.2761 |
| 1.2925 | 35.57 | 243 | 1.2652 |
| 1.2770 | 36.51 | 249 | 1.2557 |
| 1.2679 | 37.53 | 256 | 1.2455 |
| 1.2551 | 38.55 | 263 | 1.2377 |
| 1.2395 | 39.57 | 270 | 1.2305 |
| 1.2384 | 40.51 | 276 | 1.2229 |
| 1.2342 | 41.53 | 283 | 1.2152 |
| 1.2240 | 42.55 | 290 | 1.2077 |
| 1.2179 | 43.57 | 297 | 1.2058 |
| 1.2249 | 44.51 | 303 | 1.1987 |
| 1.2093 | 45.53 | 310 | 1.1964 |
| 1.2011 | 46.55 | 317 | 1.1911 |
| 1.2078 | 47.57 | 324 | 1.1865 |
| 1.1994 | 48.51 | 330 | 1.1846 |
| 1.1915 | 49.53 | 337 | 1.1810 |
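
Both losses decrease steadily throughout training: validation loss falls from 2.0527 to 1.1810 over 50 epochs while tracking the training loss closely, so the run shows no obvious overfitting by the final checkpoint.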

Framework versions

  • Transformers 4.35.0
  • PyTorch 2.1.0
  • Datasets 2.14.6
  • Tokenizers 0.14.1
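
To reproduce this environment, the versions above can be pinned with pip, e.g. pip install transformers==4.35.0 torch==2.1.0 datasets==2.14.6 tokenizers==0.14.1 (PyTorch installs as the torch package; the peft version used for the LoRA adapter, if any, is not listed on the card).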

Model tree for shkang/zephyr-7b-sft-lora-accum8-lr5e_6

  • Base model: mistralai/Mistral-7B-v0.1