metadata

license: mit
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: microsoft/Phi-3-medium-128k-instruct
model-index:
  - name: results
    results: []

results

This model is a PEFT adapter fine-tuned from microsoft/Phi-3-medium-128k-instruct with TRL supervised fine-tuning (SFT), as indicated by the metadata tags. The training dataset is not documented. It achieves the following results on the evaluation set:

Loss: 0.3259
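
For readers who prefer perplexity, the reported loss can be converted with a single exponential. This is a minimal sketch that assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training with the Trainer; the card itself does not state this. The function name `perplexity` is illustrative.

```python
import math


def perplexity(mean_token_loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_token_loss)


# Evaluation loss reported in this card.
final_eval_loss = 0.3259

# Assumption: the loss is averaged per token and measured in nats.
print(f"perplexity: {perplexity(final_eval_loss):.3f}")
```

Because the evaluation data is undocumented, the resulting figure is only comparable to other runs evaluated on the same data.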

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3
mixed_precision_training: Native AMP
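
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup, since none is listed above, and it estimates the total step count from the results table in this card (step 2800 is logged at epoch 2.9819). The helper `linear_lr` is a hypothetical stand-in written for illustration, not a library function.

```python
def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: steps per epoch is estimated from one logged row of the
# results table (step 2800 at epoch 2.9819); it is not stated in the card.
STEPS_PER_EPOCH = round(2800 / 2.9819)
TOTAL_STEPS = STEPS_PER_EPOCH * 3  # num_epochs: 3

for step in (0, 1000, 2000, 2800):
    print(step, f"{linear_lr(step, TOTAL_STEPS):.2e}")
```

Under these assumptions the rate starts at 5e-05 and is close to zero by the final logged step, which is consistent with the small validation-loss changes late in training.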

Training results

Training Loss	Epoch	Step	Validation Loss
2.102	0.1065	100	2.1266
2.0156	0.2130	200	1.9941
1.8151	0.3195	300	1.8149
1.6951	0.4260	400	1.5771
1.2789	0.5325	500	1.3936
1.0007	0.6390	600	1.1524
0.7882	0.7455	700	0.9936
0.9486	0.8520	800	0.8539
0.7381	0.9585	900	0.7410
0.6254	1.0650	1000	0.6283
0.4915	1.1715	1100	0.5834
0.3432	1.2780	1200	0.5034
0.349	1.3845	1300	0.4476
0.4378	1.4909	1400	0.4160
0.4522	1.5974	1500	0.4061
0.3183	1.7039	1600	0.3795
0.3184	1.8104	1700	0.3707
0.267	1.9169	1800	0.3601
0.2966	2.0234	1900	0.3538
0.2697	2.1299	2000	0.3492
0.3662	2.2364	2100	0.3424
0.3135	2.3429	2200	0.3407
0.3339	2.4494	2300	0.3366
0.1828	2.5559	2400	0.3340
0.2824	2.6624	2500	0.3306
0.3204	2.7689	2600	0.3289
0.3062	2.8754	2700	0.3263
0.313	2.9819	2800	0.3259
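
The validation-loss column can be checked for convergence with a few lines of code. The sketch below copies that column from the table above and reports the best evaluation step and the relative improvement at the final evaluation. The 100-step evaluation interval is read off the Step column; the helper names are illustrative.

```python
# Validation losses copied from the table above, one per evaluation.
VAL_LOSS = [
    2.1266, 1.9941, 1.8149, 1.5771, 1.3936, 1.1524, 0.9936,
    0.8539, 0.7410, 0.6283, 0.5834, 0.5034, 0.4476, 0.4160,
    0.4061, 0.3795, 0.3707, 0.3601, 0.3538, 0.3492, 0.3424,
    0.3407, 0.3366, 0.3340, 0.3306, 0.3289, 0.3263, 0.3259,
]
EVAL_INTERVAL = 100  # evaluations are logged every 100 steps


def best_step(losses, interval):
    """Return the step with the lowest validation loss."""
    best_index = min(range(len(losses)), key=losses.__getitem__)
    return (best_index + 1) * interval


def relative_improvement(previous, current):
    """Fractional drop in loss from one evaluation to the next."""
    return (previous - current) / previous


# True when every evaluation improved on the one before it.
always_improving = all(a > b for a, b in zip(VAL_LOSS, VAL_LOSS[1:]))
last_gain = relative_improvement(VAL_LOSS[-2], VAL_LOSS[-1])

print("best step:", best_step(VAL_LOSS, EVAL_INTERVAL))
print("validation loss fell at every evaluation:", always_improving)
print(f"relative improvement at last evaluation: {last_gain:.2%}")
```

A small final improvement suggests the run was near a plateau by the end of the third epoch, although the logged training loss is noisy enough that single rows should not be over-interpreted.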

Framework versions

PEFT 0.11.1
Transformers 4.41.2
PyTorch 2.1.2+cu121
Datasets 2.19.2
Tokenizers 0.19.1
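
When reproducing or loading this model, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. It assumes the usual PyPI distribution names (`peft`, `transformers`, `torch`, `datasets`, `tokenizers`); the helper `compare_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "peft": "0.11.1",
    "transformers": "4.41.2",
    "torch": "2.1.2+cu121",
    "datasets": "2.19.2",
    "tokenizers": "0.19.1",
}


def compare_versions(expected):
    """Map each package to (expected, installed); installed is None if absent."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed)
    return report


for package, (wanted, installed) in compare_versions(EXPECTED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{package}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the adapter will fail to load; the check only flags where the environment departs from the one used in training.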