metadata

license: mit
library_name: peft
tags:
  - generated_from_trainer
base_model: microsoft/phi-2
model-index:
  - name: fine-tuning-Phi2-with-webglm-qa-with-lora_5
    results: []

fine-tuning-Phi2-with-webglm-qa-with-lora_5

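This model is a LoRA adapter for microsoft/phi-2, fine-tuned with PEFT. The training dataset is not recorded in this card; the model name suggests WebGLM-QA, but that is unconfirmed. It achieves the following results on the evaluation set: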

Loss: 0.0878

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 5
total_train_batch_size: 10
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
training_steps: 1000
mixed_precision_training: Native AMP
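
The values above can be collected into a small sketch that checks the reported total batch size and traces the learning-rate schedule. This is an illustration under stated assumptions, not the original training script: the key names mirror `transformers.TrainingArguments`, a single device is assumed (2 x 5 x 1 = 10), "Native AMP" is read as `fp16`, and the helper names `effective_batch_size` and `linear_lr` are hypothetical.

```python
# Hyperparameters copied from the list above. The key names follow
# transformers.TrainingArguments, which is an assumption about how the
# run was configured; the card does not include the training script.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 5,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "max_steps": 1000,
    "fp16": True,  # assumption: "Native AMP" is read as fp16 autocast
}

NUM_DEVICES = 1  # assumption: 2 * 5 * 1 matches the reported total of 10


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, kwargs=TRAINING_KWARGS):
    """Learning rate at optimizer step `step`: linear warmup, then linear decay to zero."""
    peak = kwargs["learning_rate"]
    warmup = kwargs["warmup_steps"]
    total = kwargs["max_steps"]
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAINING_KWARGS))
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate reaches its peak of 5e-05 at step 100 and decays to zero at step 1000, so the last few hundred steps in the results table run at a small fraction of the peak rate.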

Training results

Training Loss	Epoch	Step	Validation Loss
8.1591	0.2	10	7.9109
7.8077	0.4	20	7.4417
6.7423	0.6	30	6.1597
5.2815	0.8	40	3.7018
2.6395	1.0	50	1.1413
0.7209	1.2	60	0.6488
0.5959	1.39	70	0.5735
0.5036	1.59	80	0.5102
0.4103	1.79	90	0.4500
0.3433	1.99	100	0.3905
0.3235	2.19	110	0.3371
0.2567	2.39	120	0.3032
0.2298	2.59	130	0.2785
0.2451	2.79	140	0.2553
0.1935	2.99	150	0.2363
0.1946	3.19	160	0.2248
0.1836	3.39	170	0.2097
0.1681	3.59	180	0.1984
0.1571	3.78	190	0.1877
0.1713	3.98	200	0.1820
0.15	4.18	210	0.1741
0.1315	4.38	220	0.1696
0.1567	4.58	230	0.1619
0.1225	4.78	240	0.1528
0.1346	4.98	250	0.1491
0.1336	5.18	260	0.1464
0.105	5.38	270	0.1427
0.1245	5.58	280	0.1404
0.1282	5.78	290	0.1363
0.1042	5.98	300	0.1314
0.1112	6.18	310	0.1264
0.106	6.37	320	0.1249
0.1043	6.57	330	0.1240
0.1016	6.77	340	0.1196
0.096	6.97	350	0.1179
0.0927	7.17	360	0.1182
0.0997	7.37	370	0.1146
0.0914	7.57	380	0.1151
0.0993	7.77	390	0.1128
0.0863	7.97	400	0.1112
0.0757	8.17	410	0.1100
0.0803	8.37	420	0.1095
0.0969	8.57	430	0.1084
0.081	8.76	440	0.1079
0.088	8.96	450	0.1050
0.082	9.16	460	0.1036
0.078	9.36	470	0.1019
0.0782	9.56	480	0.1026
0.0733	9.76	490	0.1010
0.0754	9.96	500	0.1027
0.0741	10.16	510	0.1011
0.076	10.36	520	0.1023
0.078	10.56	530	0.1010
0.0701	10.76	540	0.0990
0.0636	10.96	550	0.0974
0.0668	11.16	560	0.0973
0.0672	11.35	570	0.0972
0.0634	11.55	580	0.0955
0.061	11.75	590	0.0969
0.0671	11.95	600	0.0956
0.0611	12.15	610	0.0973
0.061	12.35	620	0.0966
0.0632	12.55	630	0.0950
0.0655	12.75	640	0.0945
0.0643	12.95	650	0.0944
0.0557	13.15	660	0.0942
0.0585	13.35	670	0.0937
0.0582	13.55	680	0.0933
0.0544	13.75	690	0.0927
0.0663	13.94	700	0.0917
0.0627	14.14	710	0.0917
0.0561	14.34	720	0.0923
0.0504	14.54	730	0.0914
0.0656	14.74	740	0.0907
0.0528	14.94	750	0.0898
0.0581	15.14	760	0.0916
0.0604	15.34	770	0.0912
0.0467	15.54	780	0.0907
0.048	15.74	790	0.0904
0.0571	15.94	800	0.0902
0.0521	16.14	810	0.0904
0.052	16.33	820	0.0896
0.0521	16.53	830	0.0895
0.0498	16.73	840	0.0898
0.0569	16.93	850	0.0887
0.0481	17.13	860	0.0884
0.0531	17.33	870	0.0889
0.046	17.53	880	0.0886
0.0492	17.73	890	0.0887
0.0532	17.93	900	0.0885
0.0511	18.13	910	0.0878
0.0433	18.33	920	0.0881
0.0518	18.53	930	0.0884
0.049	18.73	940	0.0882
0.0493	18.92	950	0.0880
0.0479	19.12	960	0.0880
0.0439	19.32	970	0.0880
0.0535	19.52	980	0.0879
0.0501	19.72	990	0.0878
0.0466	19.92	1000	0.0878
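
The step and epoch columns allow a rough estimate of the training-set size, and the two loss columns show the gap between training and validation loss. The sketch below uses a handful of rows copied from the table; the function names are hypothetical, and the size estimate is approximate because the epoch values are rounded to two decimals and the exact batching details are not recorded.

```python
# A few rows copied from the table above:
# (training loss, epoch, step, validation loss)
ROWS = [
    (8.1591, 0.2, 10, 7.9109),
    (2.6395, 1.0, 50, 1.1413),
    (0.3433, 1.99, 100, 0.3905),
    (0.0754, 9.96, 500, 0.1027),
    (0.0466, 19.92, 1000, 0.0878),
]

EFFECTIVE_BATCH_SIZE = 10  # total_train_batch_size from the hyperparameter list


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _ = rows[-1]
    return step / epoch


def approx_train_examples(rows, batch_size=EFFECTIVE_BATCH_SIZE):
    """Rough training-set size implied by the step/epoch ratio."""
    return round(steps_per_epoch(rows) * batch_size)


def generalization_gap(row):
    """Validation loss minus training loss for one logged row."""
    train_loss, _, _, val_loss = row
    return val_loss - train_loss


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch(ROWS))
    print("approx. training examples:", approx_train_examples(ROWS))
    print("final gap:", generalization_gap(ROWS[-1]))
```

Read this way, the run covers roughly 20 passes over a training set of about 500 examples, and the validation loss has largely flattened over the last 100 steps while the training loss sits below it.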

Framework versions

PEFT 0.7.1
Transformers 4.36.2
PyTorch 2.0.0
Datasets 2.15.0
Tokenizers 0.15.0
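
With library versions like those listed above, attaching the adapter to the base model might look like the following sketch. It is not taken from the original card: `load_adapter` and its `adapter_path` argument are hypothetical, the card does not say where the adapter files are published, and `trust_remote_code=True` is an assumption that may or may not be needed depending on the Transformers version and the state of the base model repository.

```python
BASE_MODEL_ID = "microsoft/phi-2"  # base_model from the card metadata


def load_adapter(adapter_path, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter stored at `adapter_path`.

    `adapter_path` is a placeholder: a local directory or Hub repo id that
    contains the adapter files. The card does not state where they live.
    """
    # Imported lazily so this module can be read and imported without
    # the heavy dependencies installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype="auto",
        # Assumption: older Transformers releases may need the model's
        # custom code for Phi-2; newer ones can ignore this flag.
        trust_remote_code=True,
    )
    # Wrap the base model with the trained LoRA weights.
    model = PeftModel.from_pretrained(base_model, adapter_path)
    model.eval()
    return tokenizer, model
```

A call such as `tokenizer, model = load_adapter("path/to/adapter")` would then return objects ready for `model.generate`, where `"path/to/adapter"` stands in for the real adapter location.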