
flan-t5-small-sql

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3584

Model description

More information needed

Intended uses & limitations

More information needed
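
The card does not document usage, but since the framework versions below list PEFT, this repo is presumably an adapter on top of google/flan-t5-small. A minimal loading sketch, assuming the repo works with peft's AutoPeftModelForSeq2SeqLM and that the model takes a plain natural-language question as input (the actual prompt format is undocumented):

```python
# Minimal sketch, not documented usage. Assumes the repo is a PEFT adapter
# loadable via AutoPeftModelForSeq2SeqLM; the prompt below is hypothetical,
# since the training data and input format are not described in this card.
from peft import AutoPeftModelForSeq2SeqLM
from transformers import AutoTokenizer

adapter_id = "kyryl-opens-ml/flan-t5-small-sql"
model = AutoPeftModelForSeq2SeqLM.from_pretrained(adapter_id)

# Load the tokenizer from the base model the adapter wraps.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")

question = "How many employees work in the sales department?"  # hypothetical input
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```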

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 128
  • total_eval_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1000.0
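
For reference, a minimal sketch of training arguments mirroring these settings, assuming the standard Hugging Face Seq2SeqTrainer API was used (the per-device batch sizes of 16 across 8 GPUs give the reported totals of 128; output_dir is a placeholder):

```python
# Minimal sketch mirroring the hyperparameters above; assumes the standard
# transformers Seq2SeqTrainer setup. output_dir is a placeholder name.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-sql",
    learning_rate=1e-3,
    per_device_train_batch_size=16,  # x 8 GPUs = total train batch size 128
    per_device_eval_batch_size=16,   # x 8 GPUs = total eval batch size 128
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1000,
    # Adam betas (0.9, 0.999) and epsilon 1e-08 are the transformers defaults.
)
```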

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.3349        | 62.5   | 500  | 0.2107          |
| 0.1147        | 125.0  | 1000 | 0.2410          |
| 0.071         | 187.5  | 1500 | 0.2687          |
| 0.0502        | 250.0  | 2000 | 0.2901          |
| 0.0373        | 312.5  | 2500 | 0.3033          |
| 0.0301        | 375.0  | 3000 | 0.3141          |
| 0.025         | 437.5  | 3500 | 0.3235          |
| 0.0212        | 500.0  | 4000 | 0.3312          |
| 0.0187        | 562.5  | 4500 | 0.3404          |
| 0.017         | 625.0  | 5000 | 0.3371          |
| 0.0148        | 687.5  | 5500 | 0.3466          |
| 0.0139        | 750.0  | 6000 | 0.3480          |
| 0.0124        | 812.5  | 6500 | 0.3552          |
| 0.0118        | 875.0  | 7000 | 0.3594          |
| 0.0112        | 937.5  | 7500 | 0.3581          |
| 0.0106        | 1000.0 | 8000 | 0.3584          |

Validation loss is lowest (0.2107) at step 500 (epoch 62.5) and rises steadily afterwards while training loss keeps falling, which suggests the model overfits well before the end of the 1000-epoch run; the final validation loss of 0.3584 matches the evaluation loss reported above.

Framework versions

  • PEFT 0.7.1
  • Transformers 4.38.0
  • Pytorch 2.1.2+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2