Edit model card

careerbot_PG6_Qwen_Qwen2.5-0.5B-Instruct_model

This model is a fine-tuned version of Qwen/Qwen2.5-0.5B-Instruct on the None dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4229

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Use OptimizerNames.ADAFACTOR and the args are: No additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1
  • training_steps: 5072

Training results

Training Loss Epoch Step Validation Loss
No log 0.9968 158 1.0200
No log 2.0 317 0.9880
No log 2.9968 475 0.9873
No log 4.0 634 1.0426
No log 4.9968 792 1.0514
No log 6.0 951 1.0938
No log 6.9968 1109 1.0742
No log 8.0 1268 1.1283
No log 8.9968 1426 1.1356
No log 10.0 1585 1.1581
No log 10.9968 1743 1.2045
No log 12.0 1902 1.2060
No log 12.9968 2060 1.2354
No log 14.0 2219 1.2285
No log 14.9968 2377 1.2401
No log 16.0 2536 1.2986
No log 16.9968 2694 1.2904
No log 18.0 2853 1.3051
No log 18.9968 3011 1.3109
No log 20.0 3170 1.3154
No log 20.9968 3328 1.3202
No log 22.0 3487 1.3282
No log 22.9968 3645 1.3385
No log 24.0 3804 1.3295
No log 24.9968 3962 1.3512
No log 26.0 4121 1.3583
No log 26.9968 4279 1.3666
No log 28.0 4438 1.3841
No log 28.9968 4596 1.3938
No log 30.0 4755 1.4084
No log 30.9968 4913 1.4178
No log 32.0 5072 1.4229

Framework versions

  • Transformers 4.46.1
  • Pytorch 2.5.0+cu124
  • Datasets 2.19.0
  • Tokenizers 0.20.1
Downloads last month
9
Safetensors
Model size
494M params
Tensor type
F32
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for Hazde/careerbot_PG6_Qwen_Qwen2.5-0.5B-Instruct_model

Base model

Qwen/Qwen2.5-0.5B
Finetuned
(53)
this model