---
license: apache-2.0
library_name: peft
tags:
  - generated_from_trainer
base_model: ibm-granite/granite-8b-code-instruct
model-index:
  - name: peft-dialogue-summary-training-1716404262
    results: []
---

# peft-dialogue-summary-training-1716404262

This model is a fine-tuned version of [ibm-granite/granite-8b-code-instruct](https://huggingface.co/ibm-granite/granite-8b-code-instruct). The training dataset was not recorded in this card. It achieves the following results on the evaluation set:

- Loss: 1.5224
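
Since this is a PEFT adapter, it is loaded on top of the base model rather than standalone. A minimal inference sketch, assuming the adapter repo id `sinanazeri/granite-8b-code-instruct-TM1-tunned` from the page header and an illustrative prompt format (the card does not document one):

```python
# Minimal inference sketch: load the base model, then apply the PEFT adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "ibm-granite/granite-8b-code-instruct"
adapter_id = "sinanazeri/granite-8b-code-instruct-TM1-tunned"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative prompt; adjust to your own format.
prompt = (
    "Summarize the following dialogue:\n"
    "A: Hi, did you ship the fix?\n"
    "B: Yes, it went out this morning."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```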

## Model description

This is a PEFT adapter for ibm-granite/granite-8b-code-instruct. Judging by the training-run name, it was fine-tuned for dialogue summarization; no further details were provided.

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1
- training_steps: 400
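
These values map onto `transformers.TrainingArguments` roughly as sketched below. Note that the card records plain Adam, whereas `Trainer` defaults to AdamW with the same betas and epsilon, and the LoRA settings (`r`, `lora_alpha`, target modules) are not recorded here, so the `LoraConfig` values are placeholders only:

```python
# Sketch of the recorded hyperparameters as TrainingArguments.
from transformers import TrainingArguments
from peft import LoraConfig

training_args = TrainingArguments(
    output_dir="peft-dialogue-summary-training",
    learning_rate=2e-4,
    per_device_train_batch_size=1,   # train_batch_size: 1
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    gradient_accumulation_steps=4,   # effective train batch size: 1 x 4 = 4
    lr_scheduler_type="linear",
    warmup_steps=1,
    max_steps=400,                   # training_steps: 400
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# LoRA hyperparameters are NOT in the card; these are illustrative only.
peft_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
```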

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 7.625         | 0.1453 | 25   | 5.7401          |
| 5.3519        | 0.2907 | 50   | 4.5831          |
| 4.2545        | 0.4360 | 75   | 3.8334          |
| 4.2506        | 0.5814 | 100  | 3.3488          |
| 3.5118        | 0.7267 | 125  | 2.8198          |
| 3.1713        | 0.8721 | 150  | 2.4324          |
| 2.5504        | 1.0174 | 175  | 2.3330          |
| 2.174         | 1.1628 | 200  | 2.2412          |
| 1.878         | 1.3081 | 225  | 2.1900          |
| 1.9039        | 1.4535 | 250  | 2.0439          |
| 1.7977        | 1.5988 | 275  | 1.9491          |
| 1.7755        | 1.7442 | 300  | 1.8278          |
| 1.629         | 1.8895 | 325  | 1.6587          |
| 1.5533        | 2.0349 | 350  | 1.5799          |
| 1.2937        | 2.1802 | 375  | 1.5344          |
| 1.1375        | 2.3256 | 400  | 1.5224          |
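
For a rough sense of scale, cross-entropy loss converts to perplexity via exp(loss), so the final validation loss of 1.5224 corresponds to a perplexity of roughly 4.6:

```python
import math

final_eval_loss = 1.5224  # from the table above
print(math.exp(final_eval_loss))  # ≈ 4.58, perplexity of the final checkpoint
```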

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.0
- PyTorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
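
To match this environment, the pinned versions can be installed with pip; a sketch (the CUDA 12.1 PyTorch wheel comes from the PyTorch package index):

```bash
pip install "peft==0.11.1" "transformers==4.41.0" "datasets==2.19.1" "tokenizers==0.19.1"
pip install "torch==2.3.0" --index-url https://download.pytorch.org/whl/cu121
```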