qwen2_Magiccoder_evol_10k_ortho

This model is a fine-tuned version of Qwen/Qwen2-7B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8039

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.02
  • num_epochs: 1
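
The batch-size entries above are related by simple arithmetic, and the cosine schedule can be sketched in a few lines. What follows is a minimal illustration, not the training script: the single-device setup, the function name `cosine_lr`, and its `total_steps` argument are assumptions introduced here, and warmup is ignored for simplicity.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 2
gradient_accumulation_steps = 32

# Assumption: a single device, so the effective batch size is
# per-device batch size x gradient accumulation steps.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 64, matching total_train_batch_size in the list


def cosine_lr(step, total_steps, peak_lr=learning_rate):
    """Hypothetical helper: cosine decay from peak_lr at step 0 to 0 at total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)  # clamp to [0, 1]
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the learning rate starts at 0.0001, reaches half of that value midway through training, and decays to zero at the final step.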

Training results

Training Loss  Epoch   Step  Validation Loss
0.8045         0.0261  4     0.8796
0.8394         0.0522  8     0.8315
0.8027         0.0784  12    0.8188
0.7742         0.1045  16    0.8136
0.8206         0.1306  20    0.8118
0.7117         0.1567  24    0.8110
0.7248         0.1828  28    0.8097
0.893          0.2089  32    0.8113
0.7788         0.2351  36    0.8096
0.8043         0.2612  40    0.8098
0.8427         0.2873  44    0.8108
0.8171         0.3134  48    0.8098
0.7509         0.3395  52    0.8103
0.7373         0.3656  56    0.8105
0.7708         0.3918  60    0.8107
0.7942         0.4179  64    0.8109
0.8188         0.4440  68    0.8103
0.768          0.4701  72    0.8100
0.786          0.4962  76    0.8095
0.7728         0.5223  80    0.8094
0.8575         0.5485  84    0.8091
0.7635         0.5746  88    0.8088
0.8469         0.6007  92    0.8082
0.7647         0.6268  96    0.8078
0.8741         0.6529  100   0.8073
0.7574         0.6790  104   0.8067
0.8048         0.7052  108   0.8061
0.7615         0.7313  112   0.8056
0.7452         0.7574  116   0.8051
0.7191         0.7835  120   0.8049
0.7999         0.8096  124   0.8046
0.7317         0.8357  128   0.8045
0.8619         0.8619  132   0.8044
0.8071         0.8880  136   0.8040
0.8034         0.9141  140   0.8040
0.7892         0.9402  144   0.8040
0.8291         0.9663  148   0.8040
0.7938         0.9925  152   0.8039
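
The epoch and step columns above also allow a rough estimate of how many optimizer steps make up one epoch and, combined with the total train batch size of 64, roughly how many training examples were used. This is a back-of-the-envelope sketch only: the epoch values are rounded to four decimals, and the variable names are introduced here for illustration.

```python
# Last logged row of the results table: epoch 0.9925 at step 152.
last_epoch, last_step = 0.9925, 152
total_train_batch_size = 64  # from the hyperparameter list

# Steps needed to complete one full epoch, extrapolated from the last row.
steps_per_epoch = last_step / last_epoch

# Approximate number of training examples seen in one epoch.
approx_examples = steps_per_epoch * total_train_batch_size

print(round(steps_per_epoch), round(approx_examples))
```

The estimate comes out at about 153 steps per epoch and a little under 10,000 examples, which would be consistent with the "10k" in the model name, though the card itself does not state the dataset size.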

Framework versions

  • PEFT 0.7.1
  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
