
summary_train2

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6767
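
Since this is a PEFT adapter on meta-llama/Meta-Llama-3-8B, it can be loaded for inference with `peft` and `transformers` roughly as in the sketch below. This is a minimal, hedged sketch: the adapter repo id (`JasonBounre/summary_train2`), the prompt, and the generation settings are assumptions, since the card does not document the intended usage.

```python
# Minimal inference sketch (assumptions: the adapter is published as
# JasonBounre/summary_train2 and its base model is meta-llama/Meta-Llama-3-8B;
# the prompt below is only a placeholder).
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base model referenced in the adapter config and applies the adapter weights.
model = AutoPeftModelForCausalLM.from_pretrained(
    "JasonBounre/summary_train2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

inputs = tokenizer("Summarize the following text:\n...", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```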

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to TrainingArguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 20
  • mixed_precision_training: Native AMP
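
As a rough illustration, these hyperparameters map onto a transformers TrainingArguments configuration along the lines below. The output directory, evaluation strategy, and the surrounding Trainer and dataset setup are assumptions not documented on this card.

```python
# Hedged sketch: maps the hyperparameters listed above onto TrainingArguments.
# output_dir and eval_strategy are assumptions, not taken from this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="summary_train2",      # assumed output directory name
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,    # effective train batch size: 4 * 4 = 16
    num_train_epochs=20,
    lr_scheduler_type="cosine",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                        # "Native AMP" mixed-precision training
    eval_strategy="epoch",            # assumed; the card reports per-epoch validation loss
)
```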

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 17 | 0.9815 |
| No log | 2.0 | 34 | 0.9102 |
| No log | 3.0 | 51 | 0.8509 |
| No log | 4.0 | 68 | 0.8540 |
| No log | 5.0 | 85 | 0.8934 |
| 0.7594 | 6.0 | 102 | 0.9420 |
| 0.7594 | 7.0 | 119 | 1.0024 |
| 0.7594 | 8.0 | 136 | 1.1378 |
| 0.7594 | 9.0 | 153 | 1.2391 |
| 0.7594 | 10.0 | 170 | 1.3776 |
| 0.7594 | 11.0 | 187 | 1.5790 |
| 0.3001 | 12.0 | 204 | 1.8004 |
| 0.3001 | 13.0 | 221 | 2.0407 |
| 0.3001 | 14.0 | 238 | 2.2108 |
| 0.3001 | 15.0 | 255 | 2.3857 |
| 0.3001 | 16.0 | 272 | 2.5249 |
| 0.3001 | 17.0 | 289 | 2.6173 |
| 0.0996 | 18.0 | 306 | 2.6625 |
| 0.0996 | 19.0 | 323 | 2.6729 |
| 0.0996 | 20.0 | 340 | 2.6767 |

Framework versions

  • PEFT 0.13.1
  • Transformers 4.43.3
  • Pytorch 2.4.0+cu121
  • Datasets 3.0.1
  • Tokenizers 0.19.1