metadata
license: cc-by-4.0
base_model: allegro/herbert-large-cased
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: herbert-large-cased_nli
    results: []

herbert-large-cased_nli

This model is a fine-tuned version of allegro/herbert-large-cased. The training dataset is not recorded in this card. After the final epoch (epoch 40), it achieves the following results on the evaluation set:

  • Loss: 2.0905
  • Accuracy: 0.77

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
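
The values listed above are related by simple arithmetic, which the sketch below works through. It assumes a single training device and zero warmup steps, neither of which is stated in this card; the figure of 625 steps per epoch is taken from the training results table. The helper names `effective_batch_size` and `linear_lr` are illustrative and not part of any library.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
NUM_EPOCHS = 40

# Assumptions (not stated in the card): one device, zero warmup steps.
NUM_DEVICES = 1
STEPS_PER_EPOCH = 625  # step count reported after epoch 1 in the results table


def effective_batch_size(per_device, accumulation, devices):
    """Number of examples consumed per optimizer step."""
    return per_device * accumulation * devices


def linear_lr(step, total_steps, base_lr):
    """Linear decay from base_lr to 0 over total_steps, with no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


total_batch = effective_batch_size(
    TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
# Approximate number of training examples seen per epoch, implied by the step count.
approx_train_examples = STEPS_PER_EPOCH * total_batch

print("total_train_batch_size:", total_batch)
print("total optimizer steps:", total_steps)
print("approx. training examples per epoch:", approx_train_examples)
print("learning rate halfway through training:", linear_lr(total_steps // 2, total_steps, LEARNING_RATE))
```

Under these assumptions the computed batch size matches the listed total_train_batch_size of 32, and the total step count matches the final step in the training results table.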

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|---------------|-------|-------|-----------------|----------|
| No log        | 1.0   | 625   | 0.6466          | 0.751    |
| No log        | 2.0   | 1250  | 0.5856          | 0.79     |
| 0.5915        | 3.0   | 1875  | 0.6142          | 0.761    |
| 0.5915        | 4.0   | 2500  | 0.6803          | 0.78     |
| 0.4204        | 5.0   | 3125  | 0.7207          | 0.786    |
| 0.4204        | 6.0   | 3750  | 0.7956          | 0.777    |
| 0.4204        | 7.0   | 4375  | 0.7964          | 0.787    |
| 0.306         | 8.0   | 5000  | 0.7869          | 0.766    |
| 0.306         | 9.0   | 5625  | 0.8671          | 0.766    |
| 0.2192        | 10.0  | 6250  | 0.8832          | 0.778    |
| 0.2192        | 11.0  | 6875  | 0.9147          | 0.768    |
| 0.1595        | 12.0  | 7500  | 1.1113          | 0.756    |
| 0.1595        | 13.0  | 8125  | 1.0984          | 0.761    |
| 0.1595        | 14.0  | 8750  | 1.3107          | 0.758    |
| 0.1288        | 15.0  | 9375  | 1.2892          | 0.764    |
| 0.1288        | 16.0  | 10000 | 1.5291          | 0.741    |
| 0.1037        | 17.0  | 10625 | 1.2105          | 0.786    |
| 0.1037        | 18.0  | 11250 | 1.3468          | 0.78     |
| 0.1037        | 19.0  | 11875 | 1.5642          | 0.758    |
| 0.0864        | 20.0  | 12500 | 1.5304          | 0.768    |
| 0.0864        | 21.0  | 13125 | 1.4310          | 0.776    |
| 0.0728        | 22.0  | 13750 | 1.5636          | 0.762    |
| 0.0728        | 23.0  | 14375 | 1.5032          | 0.766    |
| 0.0583        | 24.0  | 15000 | 1.7275          | 0.763    |
| 0.0583        | 25.0  | 15625 | 1.6669          | 0.758    |
| 0.0583        | 26.0  | 16250 | 1.6029          | 0.767    |
| 0.0453        | 27.0  | 16875 | 1.6239          | 0.771    |
| 0.0453        | 28.0  | 17500 | 1.6007          | 0.781    |
| 0.0335        | 29.0  | 18125 | 1.7028          | 0.766    |
| 0.0335        | 30.0  | 18750 | 1.8058          | 0.776    |
| 0.0335        | 31.0  | 19375 | 1.7894          | 0.766    |
| 0.0267        | 32.0  | 20000 | 1.8930          | 0.765    |
| 0.0267        | 33.0  | 20625 | 1.8582          | 0.775    |
| 0.022         | 34.0  | 21250 | 1.9610          | 0.764    |
| 0.022         | 35.0  | 21875 | 2.0128          | 0.775    |
| 0.0163        | 36.0  | 22500 | 2.0248          | 0.773    |
| 0.0163        | 37.0  | 23125 | 2.0203          | 0.77     |
| 0.0163        | 38.0  | 23750 | 2.0615          | 0.77     |
| 0.0115        | 39.0  | 24375 | 2.0787          | 0.769    |
| 0.0115        | 40.0  | 25000 | 2.0905          | 0.77     |
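
In the results above, validation loss reaches its minimum early and then rises while training loss keeps falling, a pattern typical of overfitting. The following is a minimal sketch for locating the best epoch under each validation metric, using the validation loss and accuracy values copied from the table. The variable names are illustrative, and nothing in this card says which checkpoint was actually saved or published.

```python
# Validation loss and accuracy per epoch (epochs 1..40), copied from the table above.
VAL_LOSS = [
    0.6466, 0.5856, 0.6142, 0.6803, 0.7207, 0.7956, 0.7964, 0.7869,
    0.8671, 0.8832, 0.9147, 1.1113, 1.0984, 1.3107, 1.2892, 1.5291,
    1.2105, 1.3468, 1.5642, 1.5304, 1.4310, 1.5636, 1.5032, 1.7275,
    1.6669, 1.6029, 1.6239, 1.6007, 1.7028, 1.8058, 1.7894, 1.8930,
    1.8582, 1.9610, 2.0128, 2.0248, 2.0203, 2.0615, 2.0787, 2.0905,
]
ACCURACY = [
    0.751, 0.79, 0.761, 0.78, 0.786, 0.777, 0.787, 0.766,
    0.766, 0.778, 0.768, 0.756, 0.761, 0.758, 0.764, 0.741,
    0.786, 0.78, 0.758, 0.768, 0.776, 0.762, 0.766, 0.763,
    0.758, 0.767, 0.771, 0.781, 0.766, 0.776, 0.766, 0.765,
    0.775, 0.764, 0.775, 0.773, 0.77, 0.77, 0.769, 0.77,
]

# Epochs are 1-based, list indices are 0-based, hence the +1.
best_loss_epoch = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__) + 1
best_acc_epoch = max(range(len(ACCURACY)), key=ACCURACY.__getitem__) + 1

print("epoch with lowest validation loss:", best_loss_epoch)
print("epoch with highest accuracy:", best_acc_epoch)
print("final-epoch validation loss:", VAL_LOSS[-1])
print("final-epoch accuracy:", ACCURACY[-1])
```

On these numbers both criteria select epoch 2 (validation loss 0.5856, accuracy 0.79), while the headline results reported at the top of this card correspond to epoch 40.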

Framework versions

  • Transformers 4.39.3
  • PyTorch 1.11.0a0+17540c5
  • Datasets 2.20.0
  • Tokenizers 0.15.2