kostasman1/phi3_adapter_finetuned
metadata
license: mit
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: microsoft/Phi-3-medium-128k-instruct
model-index:
  - name: results
    results: []

results

This model is a PEFT adapter fine-tuned from microsoft/Phi-3-medium-128k-instruct using TRL supervised fine-tuning (SFT). The training dataset is not recorded. On the evaluation set it achieves:

  • Loss: 0.3259
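
Since the metadata lists `peft` as the library and names the base model, loading might look like the minimal sketch below. It assumes the adapter repository id is `kostasman1/phi3_adapter_finetuned` (taken from the identifier shown at the top of this page, not confirmed by the card body), and `load_finetuned_model` is a hypothetical helper name. Calling it downloads the full base model weights.

```python
BASE_MODEL_ID = "microsoft/Phi-3-medium-128k-instruct"  # from the card metadata
ADAPTER_ID = "kostasman1/phi3_adapter_finetuned"  # assumption: id shown at the top of the page


def load_finetuned_model(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the PEFT adapter (hypothetical helper)."""
    # Imported lazily so the constants above can be inspected without the libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype="auto")
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return model, tokenizer
```

The base model is large, so in practice a GPU and possibly quantized loading would be needed; neither is described in this card.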

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
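
As a rough sketch, the values above could be expressed as keyword arguments for `transformers.TrainingArguments`. The field names are real `TrainingArguments` fields, but the mapping rests on assumptions: the card does not say whether "Native AMP" meant fp16 or bf16, `output_dir` is a guess based on the model name, and the batch sizes are treated as per-device values. `build_training_arguments` is a hypothetical helper name.

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments field names.
training_kwargs = {
    "output_dir": "results",  # assumption: guessed from the model name
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 2,  # assumption: reported size is per device
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
    "fp16": True,  # assumption: "Native AMP" may also have meant bf16
}


def build_training_arguments():
    """Build TrainingArguments from the values above (hypothetical helper)."""
    # Imported lazily; fp16=True needs a device that supports it.
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs)
```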

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 2.102         | 0.1065 | 100  | 2.1266          |
| 2.0156        | 0.2130 | 200  | 1.9941          |
| 1.8151        | 0.3195 | 300  | 1.8149          |
| 1.6951        | 0.4260 | 400  | 1.5771          |
| 1.2789        | 0.5325 | 500  | 1.3936          |
| 1.0007        | 0.6390 | 600  | 1.1524          |
| 0.7882        | 0.7455 | 700  | 0.9936          |
| 0.9486        | 0.8520 | 800  | 0.8539          |
| 0.7381        | 0.9585 | 900  | 0.7410          |
| 0.6254        | 1.0650 | 1000 | 0.6283          |
| 0.4915        | 1.1715 | 1100 | 0.5834          |
| 0.3432        | 1.2780 | 1200 | 0.5034          |
| 0.349         | 1.3845 | 1300 | 0.4476          |
| 0.4378        | 1.4909 | 1400 | 0.4160          |
| 0.4522        | 1.5974 | 1500 | 0.4061          |
| 0.3183        | 1.7039 | 1600 | 0.3795          |
| 0.3184        | 1.8104 | 1700 | 0.3707          |
| 0.267         | 1.9169 | 1800 | 0.3601          |
| 0.2966        | 2.0234 | 1900 | 0.3538          |
| 0.2697        | 2.1299 | 2000 | 0.3492          |
| 0.3662        | 2.2364 | 2100 | 0.3424          |
| 0.3135        | 2.3429 | 2200 | 0.3407          |
| 0.3339        | 2.4494 | 2300 | 0.3366          |
| 0.1828        | 2.5559 | 2400 | 0.3340          |
| 0.2824        | 2.6624 | 2500 | 0.3306          |
| 0.3204        | 2.7689 | 2600 | 0.3289          |
| 0.3062        | 2.8754 | 2700 | 0.3263          |
| 0.313         | 2.9819 | 2800 | 0.3259          |
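
The epoch and step columns allow a back-of-the-envelope estimate of the run's size, which the card does not otherwise state. The sketch below derives steps per epoch from a few rows and, assuming a single device and no gradient accumulation (neither is stated in this card), an approximate training set size. The variable names are illustrative.

```python
# (epoch, step) pairs copied from the training results table.
checkpoints = [(0.1065, 100), (1.0650, 1000), (2.0234, 1900), (2.9819, 2800)]

# Each pair gives an estimate of optimizer steps per epoch; average and round.
estimates = [step / epoch for epoch, step in checkpoints]
steps_per_epoch = round(sum(estimates) / len(estimates))

train_batch_size = 2  # from the hyperparameter list
num_epochs = 3        # from the hyperparameter list

# Assumes one device and no gradient accumulation (not stated in the card).
approx_train_examples = steps_per_epoch * train_batch_size
approx_total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, approx_train_examples, approx_total_steps)
# prints: 939 1878 2817
```

Under those assumptions the training set holds roughly 1,878 examples, and the last logged step (2800) falls just short of the estimated 2,817 total steps.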

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1