
tool_choose

This model is a fine-tuned version of bert-base-multilingual-cased on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.0897
  • Micro F1: 0.8434
  • Macro F1: 0.7771
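
The card doesn't state what task the model performs; since micro and macro F1 are reported, the sketch below assumes a multi-label text-classification head and a 0.5 sigmoid threshold. The input text and threshold are placeholders, not taken from the card.

```python
# Minimal inference sketch for hohorong/tool_choose.
# Assumptions (not stated in the card): the checkpoint carries a
# multi-label classification head, and 0.5 is a reasonable cutoff.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo_id = "hohorong/tool_choose"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)

inputs = tokenizer("Example input text", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label decoding: independent sigmoid per label rather than softmax argmax.
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)
```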

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of an equivalent configuration follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 40
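
These settings map directly onto TrainingArguments from transformers. A minimal sketch, assuming the standard Trainer API; the output directory and per-epoch evaluation strategy are assumptions, not from the card. The Adam betas and epsilon listed above are the transformers defaults, so they need no explicit arguments.

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tool_choose",          # assumption: output path not given in the card
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=40,
    evaluation_strategy="epoch",       # assumption: the results table reports per-epoch eval
)
```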

Training results

| Training Loss | Epoch | Step | Validation Loss | Micro F1 | Macro F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|
| 0.2257 | 1.0 | 223 | 0.1608 | 0.0691 | 0.0279 |
| 0.131 | 2.0 | 446 | 0.1085 | 0.6282 | 0.2107 |
| 0.0971 | 3.0 | 669 | 0.0913 | 0.6594 | 0.2450 |
| 0.0791 | 4.0 | 892 | 0.0813 | 0.7333 | 0.2889 |
| 0.0676 | 5.0 | 1115 | 0.0730 | 0.7719 | 0.3426 |
| 0.0598 | 6.0 | 1338 | 0.0681 | 0.7931 | 0.3759 |
| 0.0499 | 7.0 | 1561 | 0.0756 | 0.7658 | 0.3849 |
| 0.0442 | 8.0 | 1784 | 0.0688 | 0.7894 | 0.3936 |
| 0.0404 | 9.0 | 2007 | 0.0637 | 0.8145 | 0.4532 |
| 0.0334 | 10.0 | 2230 | 0.0593 | 0.8276 | 0.4813 |
| 0.0293 | 11.0 | 2453 | 0.0672 | 0.8084 | 0.4919 |
| 0.0282 | 12.0 | 2676 | 0.0683 | 0.7967 | 0.5406 |
| 0.0244 | 13.0 | 2899 | 0.0617 | 0.8297 | 0.5594 |
| 0.0212 | 14.0 | 3122 | 0.0624 | 0.8372 | 0.6604 |
| 0.0201 | 15.0 | 3345 | 0.0731 | 0.7950 | 0.5878 |
| 0.0188 | 16.0 | 3568 | 0.0651 | 0.8283 | 0.6192 |
| 0.0157 | 17.0 | 3791 | 0.0705 | 0.8252 | 0.6689 |
| 0.0152 | 18.0 | 4014 | 0.0726 | 0.8115 | 0.6558 |
| 0.0138 | 19.0 | 4237 | 0.0707 | 0.8318 | 0.7159 |
| 0.0126 | 20.0 | 4460 | 0.0677 | 0.8387 | 0.7002 |
| 0.0129 | 21.0 | 4683 | 0.0707 | 0.8269 | 0.7254 |
| 0.0098 | 22.0 | 4906 | 0.0689 | 0.8257 | 0.7111 |
| 0.0089 | 23.0 | 5129 | 0.0793 | 0.8127 | 0.6561 |
| 0.0089 | 24.0 | 5352 | 0.0731 | 0.8227 | 0.6963 |
| 0.009 | 25.0 | 5575 | 0.0783 | 0.8203 | 0.7076 |
| 0.0099 | 26.0 | 5798 | 0.0745 | 0.8348 | 0.7155 |
| 0.0089 | 27.0 | 6021 | 0.0685 | 0.8458 | 0.7208 |
| 0.0077 | 28.0 | 6244 | 0.0780 | 0.8197 | 0.6605 |
| 0.0081 | 29.0 | 6467 | 0.0803 | 0.8193 | 0.6366 |
| 0.0085 | 30.0 | 6690 | 0.0764 | 0.8259 | 0.7797 |
| 0.0074 | 31.0 | 6913 | 0.0809 | 0.8269 | 0.7182 |
| 0.0036 | 32.0 | 7136 | 0.0808 | 0.8283 | 0.7305 |
| 0.0083 | 33.0 | 7359 | 0.0810 | 0.8378 | 0.7481 |
| 0.0071 | 34.0 | 7582 | 0.0826 | 0.8329 | 0.7348 |
| 0.0058 | 35.0 | 7805 | 0.1001 | 0.8041 | 0.6292 |
| 0.0047 | 36.0 | 8028 | 0.0864 | 0.8296 | 0.7206 |
| 0.006 | 37.0 | 8251 | 0.0820 | 0.8388 | 0.7131 |
| 0.0053 | 38.0 | 8474 | 0.0858 | 0.8194 | 0.7486 |
| 0.0056 | 39.0 | 8697 | 0.0902 | 0.8219 | 0.6887 |
| 0.0044 | 40.0 | 8920 | 0.0897 | 0.8434 | 0.7771 |
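
The per-epoch micro/macro F1 columns are consistent with a multi-label evaluation. A hedged sketch of a compute_metrics callback that would produce such numbers; the sigmoid threshold and the use of scikit-learn are assumptions, not documented in the card:

```python
# Sketch of a compute_metrics callback yielding the micro/macro F1 columns above.
# Assumptions: multi-hot label matrix and a 0.5 sigmoid threshold.
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    probs = 1 / (1 + np.exp(-logits))   # sigmoid over raw logits
    preds = (probs > 0.5).astype(int)   # independent per-label decision
    return {
        "micro_f1": f1_score(labels, preds, average="micro"),
        "macro_f1": f1_score(labels, preds, average="macro"),
    }
```

Note that macro F1 trails micro F1 throughout training, which usually means rarer labels are learned late; the best macro F1 (0.7797, epoch 30) also does not coincide with the best micro F1 (0.8458, epoch 27), and the reported final checkpoint is the epoch-40 one.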

Framework versions

  • Transformers 4.35.0
  • Pytorch 2.1.0+cu118
  • Datasets 2.14.6
  • Tokenizers 0.14.1