
qwen2_Magiccoder_evol_10k_qlora_ortho

This model is a fine-tuned version of Qwen/Qwen2-7B. The training dataset is not documented on this card; the model name and the PEFT listing below suggest a QLoRA adapter trained on a ~10k-example MagicCoder evol-instruct subset, but this is unconfirmed. It achieves the following results on the evaluation set:

  • Loss: 0.9025

Model description

More information needed

Intended uses & limitations

More information needed
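
Since the card gives no usage details, the following is a minimal inference sketch, assuming this repository hosts a PEFT adapter for Qwen/Qwen2-7B (as the fine-tuning setup implies). The adapter repository id below is a placeholder, since the card does not give the full id.

```python
# Minimal inference sketch for a PEFT (QLoRA) adapter on Qwen2-7B.
# "your-namespace/qwen2_Magiccoder_evol_10k_qlora_ortho" is a placeholder id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(
    base, "your-namespace/qwen2_Magiccoder_evol_10k_qlora_ortho"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B")

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```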

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of the corresponding setup follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.02 (a fractional value; this is most plausibly a warmup ratio of 2% rather than a step count)
  • num_epochs: 1
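
For orientation, here is a minimal sketch of a QLoRA training setup consistent with the hyperparameters above, using the transformers/peft versions listed under Framework versions. The LoRA rank, alpha, dropout, target modules, and output path are assumptions; the card does not state them.

```python
# Sketch of a QLoRA setup matching the listed hyperparameters.
# LoRA settings and target modules below are assumptions, not from the card.
import torch
from transformers import (AutoModelForCausalLM, BitsAndBytesConfig,
                          TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(                   # rank/alpha/targets are assumptions
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

args = TrainingArguments(
    output_dir="qwen2_magiccoder_qlora",    # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,          # effective batch size 8 x 8 = 64
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,                      # reading the card's 0.02 as a ratio
    num_train_epochs=1,
    seed=42,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the defaults.
)
```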

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| 0.8992 | 0.0261 | 4 | 0.9547 |
| 0.9045 | 0.0522 | 8 | 0.9234 |
| 0.9145 | 0.0783 | 12 | 0.9166 |
| 0.8688 | 0.1044 | 16 | 0.9117 |
| 0.9222 | 0.1305 | 20 | 0.9097 |
| 0.8108 | 0.1566 | 24 | 0.9090 |
| 0.8194 | 0.1827 | 28 | 0.9083 |
| 0.9616 | 0.2088 | 32 | 0.9086 |
| 0.8624 | 0.2349 | 36 | 0.9083 |
| 0.8898 | 0.2610 | 40 | 0.9088 |
| 0.9476 | 0.2871 | 44 | 0.9085 |
| 0.9156 | 0.3132 | 48 | 0.9091 |
| 0.8388 | 0.3393 | 52 | 0.9091 |
| 0.8429 | 0.3654 | 56 | 0.9087 |
| 0.8651 | 0.3915 | 60 | 0.9081 |
| 0.9228 | 0.4176 | 64 | 0.9082 |
| 0.9167 | 0.4437 | 68 | 0.9076 |
| 0.8769 | 0.4698 | 72 | 0.9068 |
| 0.9009 | 0.4959 | 76 | 0.9069 |
| 0.8611 | 0.5220 | 80 | 0.9074 |
| 0.9496 | 0.5481 | 84 | 0.9070 |
| 0.8562 | 0.5742 | 88 | 0.9067 |
| 0.943 | 0.6003 | 92 | 0.9060 |
| 0.8718 | 0.6264 | 96 | 0.9053 |
| 0.9642 | 0.6525 | 100 | 0.9046 |
| 0.8425 | 0.6786 | 104 | 0.9042 |
| 0.886 | 0.7047 | 108 | 0.9040 |
| 0.8576 | 0.7308 | 112 | 0.9043 |
| 0.823 | 0.7569 | 116 | 0.9036 |
| 0.8158 | 0.7830 | 120 | 0.9032 |
| 0.8854 | 0.8091 | 124 | 0.9031 |
| 0.8502 | 0.8352 | 128 | 0.9030 |
| 0.9493 | 0.8613 | 132 | 0.9026 |
| 0.8934 | 0.8874 | 136 | 0.9026 |
| 0.9158 | 0.9135 | 140 | 0.9026 |
| 0.8686 | 0.9396 | 144 | 0.9026 |
| 0.9321 | 0.9657 | 148 | 0.9027 |
| 0.8882 | 0.9918 | 152 | 0.9025 |
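
As a sanity check against the model name: one epoch spans roughly 152 optimization steps at an effective batch size of 64, i.e. about 152 × 64 = 9,728 training examples, consistent with the "evol_10k" in the name.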

Framework versions

  • PEFT 0.7.1
  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1