metadata
library_name: peft
tags:
  - generated_from_trainer
base_model: Qwen/Qwen2.5-1.5B-Instruct
model-index:
  - name: miniclaus-qw1.5B-UNAMGS
    results: []

miniclaus-qw1.5B-UNAMGS

Built with Axolotl

This model is a PEFT fine-tune of Qwen/Qwen2.5-1.5B-Instruct; the training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.7193
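
To make the loss figure easier to interpret, the sketch below converts it to perplexity. It assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 0.7193

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss} -> perplexity of roughly {perplexity:.2f}")
```

Under that assumption the evaluation perplexity comes out a little above 2.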

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • train_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 128
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • num_epochs: 1
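
The list does not include a gradient accumulation setting, but one can be inferred from the batch sizes. The sketch below assumes the usual relation total batch = per-device batch × devices × accumulation steps; the name `grad_accum_steps` is illustrative and does not come from the card.

```python
# Values taken from the hyperparameter list above.
train_batch_size = 1          # per-device micro-batch
num_devices = 8
total_train_batch_size = 128

# Assumption: total = per-device batch * devices * accumulation steps.
per_step_batch = train_batch_size * num_devices
grad_accum_steps, remainder = divmod(total_train_batch_size, per_step_batch)

# A non-zero remainder would mean the assumed relation does not hold.
assert remainder == 0, "batch sizes are not consistent with the assumed relation"

print(f"implied gradient accumulation steps: {grad_accum_steps}")
```

The same relation with no accumulation matches the evaluation side: 8 devices at a per-device batch of 1 gives the listed total_eval_batch_size of 8.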

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 1.1641        | 0.0007 | 1    | 0.8514          |
| 0.9246        | 0.0503 | 76   | 0.7921          |
| 0.8791        | 0.1006 | 152  | 0.7727          |
| 0.8507        | 0.1509 | 228  | 0.7611          |
| 0.8376        | 0.2012 | 304  | 0.7534          |
| 0.793         | 0.2515 | 380  | 0.7467          |
| 0.7834        | 0.3018 | 456  | 0.7421          |
| 0.7807        | 0.3521 | 532  | 0.7384          |
| 0.764         | 0.4023 | 608  | 0.7359          |
| 0.7738        | 0.4526 | 684  | 0.7320          |
| 0.7425        | 0.5029 | 760  | 0.7300          |
| 0.7519        | 0.5532 | 836  | 0.7279          |
| 0.7461        | 0.6035 | 912  | 0.7255          |
| 0.7489        | 0.6538 | 988  | 0.7245          |
| 0.7614        | 0.7041 | 1064 | 0.7222          |
| 0.7576        | 0.7544 | 1140 | 0.7222          |
| 0.7303        | 0.8047 | 1216 | 0.7209          |
| 0.7332        | 0.8550 | 1292 | 0.7199          |
| 0.7541        | 0.9053 | 1368 | 0.7202          |
| 0.7369        | 0.9556 | 1444 | 0.7193          |
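
The epoch and step columns allow a rough estimate of the run length. The sketch below uses the last logged row and the total train batch size from the hyperparameters; treating each optimizer step as exactly 128 sequences is an assumption, and with sample packing a sequence may hold several original examples.

```python
# Last logged row of the training results: (epoch fraction, optimizer step).
last_epoch_fraction = 0.9556
last_step = 1444

# total_train_batch_size from the hyperparameter list.
total_train_batch_size = 128

# Steps in one full epoch, extrapolated from the last logged row.
steps_per_epoch = round(last_step / last_epoch_fraction)

# Assumption: every optimizer step consumes exactly one total batch.
approx_sequences = steps_per_epoch * total_train_batch_size

print(f"about {steps_per_epoch} steps per epoch")
print(f"about {approx_sequences} training sequences per epoch")
```

The evaluation interval of 76 steps is consistent with this estimate: 76 steps is roughly 5% of an epoch, matching the 0.0503 epoch increments between rows.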

Framework versions

  • PEFT 0.13.2
  • Transformers 4.45.2
  • Pytorch 2.3.0+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.1
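
Since the metadata lists `library_name: peft` together with a base model, the weights are presumably published as an adapter to be applied on top of Qwen/Qwen2.5-1.5B-Instruct. The following is a minimal loading sketch under that assumption. The adapter repo id is a placeholder guess and not confirmed by this card; replace it with the actual repo id or a local path.

```python
BASE_MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct"    # from the card metadata
ADAPTER_ID = "fblgit/miniclaus-qw1.5B-UNAMGS"   # assumed repo id, not confirmed


def load_model(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base model and apply the PEFT adapter on top of it."""
    # Imported lazily so the heavy dependencies are only needed when called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)

    # Assumption: the repo holds adapter weights rather than a merged model.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return model, tokenizer
```

Calling `load_model()` downloads both the base model and the adapter, so it needs network access and the `peft` and `transformers` versions listed above or compatible ones.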