---
license: mit
library_name: peft
tags:
  - generated_from_trainer
base_model: microsoft/phi-2
model-index:
  - name: fine-tuning-Phi2-with-webglm-qa-with-lora_7
    results: []
---

fine-tuning-Phi2-with-webglm-qa-with-lora_7

This model is a LoRA fine-tuned version of microsoft/phi-2. The training dataset is not recorded in the card metadata, though the model name suggests the WebGLM-QA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0950
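
A minimal usage sketch for loading the adapter on top of the base model. The adapter repo id below is an assumption inferred from this card's title; replace it with the actual Hub location:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "microsoft/phi-2"
# Assumed adapter repo id, inferred from the card title.
adapter_id = "Gunslinger3D/fine-tuning-Phi2-with-webglm-qa-with-lora_7"

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, trust_remote_code=True
)
# Attach the LoRA adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("Question: What is LoRA?\nAnswer:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```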

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 10
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 60
  • training_steps: 1000
  • mixed_precision_training: Native AMP
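
As a reconstruction sketch only (not the original training script), these settings map onto transformers.TrainingArguments roughly as follows; output_dir is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fine-tuning-Phi2-with-webglm-qa-with-lora_7",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=5,  # effective train batch size: 2 * 5 = 10
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=60,
    max_steps=1000,
    fp16=True,  # "Native AMP" mixed precision
)
```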

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 7.3505        | 0.31  | 20   | 6.2863          |
| 4.0914        | 0.63  | 40   | 0.9255          |
| 0.6517        | 0.94  | 60   | 0.5762          |
| 0.4621        | 1.26  | 80   | 0.4062          |
| 0.3128        | 1.57  | 100  | 0.3056          |
| 0.2536        | 1.89  | 120  | 0.2604          |
| 0.2227        | 2.2   | 140  | 0.2247          |
| 0.1901        | 2.52  | 160  | 0.2041          |
| 0.176         | 2.83  | 180  | 0.1812          |
| 0.1453        | 3.14  | 200  | 0.1683          |
| 0.1557        | 3.46  | 220  | 0.1592          |
| 0.1441        | 3.77  | 240  | 0.1488          |
| 0.1282        | 4.09  | 260  | 0.1430          |
| 0.1215        | 4.4   | 280  | 0.1348          |
| 0.1217        | 4.72  | 300  | 0.1323          |
| 0.117         | 5.03  | 320  | 0.1271          |
| 0.109         | 5.35  | 340  | 0.1255          |
| 0.1094        | 5.66  | 360  | 0.1210          |
| 0.1057        | 5.97  | 380  | 0.1175          |
| 0.0937        | 6.29  | 400  | 0.1158          |
| 0.0942        | 6.6   | 420  | 0.1159          |
| 0.1007        | 6.92  | 440  | 0.1125          |
| 0.0876        | 7.23  | 460  | 0.1119          |
| 0.0894        | 7.55  | 480  | 0.1099          |
| 0.0827        | 7.86  | 500  | 0.1072          |
| 0.0894        | 8.18  | 520  | 0.1069          |
| 0.0805        | 8.49  | 540  | 0.1075          |
| 0.0782        | 8.81  | 560  | 0.1043          |
| 0.0881        | 9.12  | 580  | 0.1034          |
| 0.0839        | 9.43  | 600  | 0.1015          |
| 0.0694        | 9.75  | 620  | 0.1000          |
| 0.068         | 10.06 | 640  | 0.1007          |
| 0.072         | 10.38 | 660  | 0.0994          |
| 0.0709        | 10.69 | 680  | 0.0985          |
| 0.0712        | 11.01 | 700  | 0.0986          |
| 0.0673        | 11.32 | 720  | 0.0999          |
| 0.0669        | 11.64 | 740  | 0.0974          |
| 0.0706        | 11.95 | 760  | 0.0981          |
| 0.0641        | 12.26 | 780  | 0.0969          |
| 0.0652        | 12.58 | 800  | 0.0964          |
| 0.0668        | 12.89 | 820  | 0.0962          |
| 0.0617        | 13.21 | 840  | 0.0972          |
| 0.0628        | 13.52 | 860  | 0.0960          |
| 0.0637        | 13.84 | 880  | 0.0949          |
| 0.0633        | 14.15 | 900  | 0.0951          |
| 0.0577        | 14.47 | 920  | 0.0953          |
| 0.0646        | 14.78 | 940  | 0.0947          |
| 0.06          | 15.09 | 960  | 0.0946          |
| 0.0584        | 15.41 | 980  | 0.0949          |
| 0.0638        | 15.72 | 1000 | 0.0950          |

Framework versions

  • PEFT 0.7.1
  • Transformers 4.36.2
  • PyTorch 2.0.0
  • Datasets 2.15.0
  • Tokenizers 0.15.0
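
If a standalone checkpoint is preferred over base model plus adapter, PEFT can fold the LoRA weights into the base model. A sketch assuming the model object from the usage example above; the output path is hypothetical:

```python
# Merge LoRA weights into the base model for adapter-free deployment.
merged = model.merge_and_unload()
merged.save_pretrained("phi-2-webglm-qa-merged")     # hypothetical path
tokenizer.save_pretrained("phi-2-webglm-qa-merged")
```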