2023-10-20 10:18:41,990 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,990 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 10:18:41,990 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,991 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 10:18:41,991 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,991 Train:  6183 sentences
2023-10-20 10:18:41,991         (train_with_dev=False, train_with_test=False)
2023-10-20 10:18:41,991 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,991 Training Params:
2023-10-20 10:18:41,991  - learning_rate: "5e-05"
2023-10-20 10:18:41,991  - mini_batch_size: "8"
2023-10-20 10:18:41,991  - max_epochs: "10"
2023-10-20 10:18:41,991  - shuffle: "True"
2023-10-20 10:18:41,991 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,991 Plugins:
2023-10-20 10:18:41,991  - TensorboardLogger
2023-10-20 10:18:41,991  - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 10:18:41,991 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,991 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 10:18:41,991  - metric: "('micro avg', 'f1-score')"
2023-10-20 10:18:41,991 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,991 Computation:
2023-10-20 10:18:41,991  - compute on device: cuda:0
2023-10-20 10:18:41,991  - embedding storage: none
2023-10-20 10:18:41,991 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,991 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-20 10:18:41,991 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,991 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:41,991 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 10:18:43,737 epoch 1 - iter 77/773 - loss 2.76081357 - time (sec): 1.75 - samples/sec: 6847.88 - lr: 0.000005 - momentum: 0.000000
2023-10-20 10:18:45,553 epoch 1 - iter 154/773 - loss 2.49025256 - time (sec): 3.56 - samples/sec: 6893.04 - lr: 0.000010 - momentum: 0.000000
2023-10-20 10:18:47,348 epoch 1 - iter 231/773 - loss 2.06885508 - time (sec): 5.36 - samples/sec: 6935.37 - lr: 0.000015 - momentum: 0.000000
2023-10-20 10:18:49,131 epoch 1 - iter 308/773 - loss 1.68478080 - time (sec): 7.14 - samples/sec: 6911.07 - lr: 0.000020 - momentum: 0.000000
2023-10-20 10:18:50,956 epoch 1 - iter 385/773 - loss 1.40792105 - time (sec): 8.96 - samples/sec: 6895.16 - lr: 0.000025 - momentum: 0.000000
2023-10-20 10:18:52,667 epoch 1 - iter 462/773 - loss 1.21130332 - time (sec): 10.67 - samples/sec: 6976.42 - lr: 0.000030 - momentum: 0.000000
2023-10-20 10:18:54,479 epoch 1 - iter 539/773 - loss 1.07143392 - time (sec): 12.49 - samples/sec: 6984.16 - lr: 0.000035 - momentum: 0.000000
2023-10-20 10:18:56,262 epoch 1 - iter 616/773 - loss 0.96856792 - time (sec): 14.27 - samples/sec: 6956.59 - lr: 0.000040 - momentum: 0.000000
2023-10-20 10:18:58,015 epoch 1 - iter 693/773 - loss 0.88498916 - time (sec): 16.02 - samples/sec: 7008.21 - lr: 0.000045 - momentum: 0.000000
2023-10-20 10:18:59,721 epoch 1 - iter 770/773 - loss 0.82734217 - time (sec): 17.73 - samples/sec: 6979.35 - lr: 0.000050 - momentum: 0.000000
2023-10-20 10:18:59,793 ----------------------------------------------------------------------------------------------------
2023-10-20 10:18:59,793 EPOCH 1 done: loss 0.8246 - lr: 0.000050
2023-10-20 10:19:00,757 DEV : loss 0.13438178598880768 - f1-score (micro avg)  0.0
2023-10-20 10:19:00,769 ----------------------------------------------------------------------------------------------------
2023-10-20 10:19:02,611 epoch 2 - iter 77/773 - loss 0.21322552 - time (sec): 1.84 - samples/sec: 7093.97 - lr: 0.000049 - momentum: 0.000000
2023-10-20 10:19:04,402 epoch 2 - iter 154/773 - loss 0.21500217 - time (sec): 3.63 - samples/sec: 7159.82 - lr: 0.000049 - momentum: 0.000000
2023-10-20 10:19:06,169 epoch 2 - iter 231/773 - loss 0.20279902 - time (sec): 5.40 - samples/sec: 7164.36 - lr: 0.000048 - momentum: 0.000000
2023-10-20 10:19:07,981 epoch 2 - iter 308/773 - loss 0.19576272 - time (sec): 7.21 - samples/sec: 7091.38 - lr: 0.000048 - momentum: 0.000000
2023-10-20 10:19:09,778 epoch 2 - iter 385/773 - loss 0.19373068 - time (sec): 9.01 - samples/sec: 7031.61 - lr: 0.000047 - momentum: 0.000000
2023-10-20 10:19:11,544 epoch 2 - iter 462/773 - loss 0.19162637 - time (sec): 10.77 - samples/sec: 6997.23 - lr: 0.000047 - momentum: 0.000000
2023-10-20 10:19:13,330 epoch 2 - iter 539/773 - loss 0.18956385 - time (sec): 12.56 - samples/sec: 6995.11 - lr: 0.000046 - momentum: 0.000000
2023-10-20 10:19:15,046 epoch 2 - iter 616/773 - loss 0.18775229 - time (sec): 14.28 - samples/sec: 7033.48 - lr: 0.000046 - momentum: 0.000000
2023-10-20 10:19:16,741 epoch 2 - iter 693/773 - loss 0.18551744 - time (sec): 15.97 - samples/sec: 7026.58 - lr: 0.000045 - momentum: 0.000000
2023-10-20 10:19:18,462 epoch 2 - iter 770/773 - loss 0.18371429 - time (sec): 17.69 - samples/sec: 7002.49 - lr: 0.000044 - momentum: 0.000000
2023-10-20 10:19:18,526 ----------------------------------------------------------------------------------------------------
2023-10-20 10:19:18,526 EPOCH 2 done: loss 0.1838 - lr: 0.000044
2023-10-20 10:19:19,602 DEV : loss 0.0927828922867775 - f1-score (micro avg)  0.3561
2023-10-20 10:19:19,613 saving best model
2023-10-20 10:19:19,641 ----------------------------------------------------------------------------------------------------
2023-10-20 10:19:21,317 epoch 3 - iter 77/773 - loss 0.12862111 - time (sec): 1.68 - samples/sec: 7202.29 - lr: 0.000044 - momentum: 0.000000
2023-10-20 10:19:23,022 epoch 3 - iter 154/773 - loss 0.13829102 - time (sec): 3.38 - samples/sec: 7209.97 - lr: 0.000043 - momentum: 0.000000
2023-10-20 10:19:24,819 epoch 3 - iter 231/773 - loss 0.14396600 - time (sec): 5.18 - samples/sec: 7263.82 - lr: 0.000043 - momentum: 0.000000
2023-10-20 10:19:26,633 epoch 3 - iter 308/773 - loss 0.14985513 - time (sec): 6.99 - samples/sec: 7206.66 - lr: 0.000042 - momentum: 0.000000
2023-10-20 10:19:28,366 epoch 3 - iter 385/773 - loss 0.14727970 - time (sec): 8.72 - samples/sec: 7228.75 - lr: 0.000042 - momentum: 0.000000
2023-10-20 10:19:30,085 epoch 3 - iter 462/773 - loss 0.15254413 - time (sec): 10.44 - samples/sec: 7150.46 - lr: 0.000041 - momentum: 0.000000
2023-10-20 10:19:31,810 epoch 3 - iter 539/773 - loss 0.15014457 - time (sec): 12.17 - samples/sec: 7151.46 - lr: 0.000041 - momentum: 0.000000
2023-10-20 10:19:33,550 epoch 3 - iter 616/773 - loss 0.15014077 - time (sec): 13.91 - samples/sec: 7122.98 - lr: 0.000040 - momentum: 0.000000
2023-10-20 10:19:35,289 epoch 3 - iter 693/773 - loss 0.15350555 - time (sec): 15.65 - samples/sec: 7130.39 - lr: 0.000039 - momentum: 0.000000
2023-10-20 10:19:37,008 epoch 3 - iter 770/773 - loss 0.15441552 - time (sec): 17.37 - samples/sec: 7124.27 - lr: 0.000039 - momentum: 0.000000
2023-10-20 10:19:37,075 ----------------------------------------------------------------------------------------------------
2023-10-20 10:19:37,075 EPOCH 3 done: loss 0.1541 - lr: 0.000039
2023-10-20 10:19:38,138 DEV : loss 0.084617480635643 - f1-score (micro avg)  0.4567
2023-10-20 10:19:38,150 saving best model
2023-10-20 10:19:38,185 ----------------------------------------------------------------------------------------------------
2023-10-20 10:19:39,874 epoch 4 - iter 77/773 - loss 0.12106919 - time (sec): 1.69 - samples/sec: 7626.66 - lr: 0.000038 - momentum: 0.000000
2023-10-20 10:19:41,612 epoch 4 - iter 154/773 - loss 0.11504532 - time (sec): 3.43 - samples/sec: 7345.90 - lr: 0.000038 - momentum: 0.000000
2023-10-20 10:19:43,406 epoch 4 - iter 231/773 - loss 0.11895267 - time (sec): 5.22 - samples/sec: 7123.90 - lr: 0.000037 - momentum: 0.000000
2023-10-20 10:19:45,175 epoch 4 - iter 308/773 - loss 0.12487455 - time (sec): 6.99 - samples/sec: 7164.55 - lr: 0.000037 - momentum: 0.000000
2023-10-20 10:19:46,889 epoch 4 - iter 385/773 - loss 0.12594846 - time (sec): 8.70 - samples/sec: 7202.58 - lr: 0.000036 - momentum: 0.000000
2023-10-20 10:19:48,672 epoch 4 - iter 462/773 - loss 0.12764145 - time (sec): 10.49 - samples/sec: 7134.85 - lr: 0.000036 - momentum: 0.000000
2023-10-20 10:19:50,495 epoch 4 - iter 539/773 - loss 0.13107008 - time (sec): 12.31 - samples/sec: 7013.80 - lr: 0.000035 - momentum: 0.000000
2023-10-20 10:19:52,305 epoch 4 - iter 616/773 - loss 0.13554526 - time (sec): 14.12 - samples/sec: 6990.45 - lr: 0.000034 - momentum: 0.000000
2023-10-20 10:19:54,018 epoch 4 - iter 693/773 - loss 0.13496751 - time (sec): 15.83 - samples/sec: 6990.21 - lr: 0.000034 - momentum: 0.000000
2023-10-20 10:19:55,817 epoch 4 - iter 770/773 - loss 0.13519161 - time (sec): 17.63 - samples/sec: 7021.09 - lr: 0.000033 - momentum: 0.000000
2023-10-20 10:19:55,893 ----------------------------------------------------------------------------------------------------
2023-10-20 10:19:55,893 EPOCH 4 done: loss 0.1351 - lr: 0.000033
2023-10-20 10:19:57,288 DEV : loss 0.0839473307132721 - f1-score (micro avg)  0.5499
2023-10-20 10:19:57,300 saving best model
2023-10-20 10:19:57,331 ----------------------------------------------------------------------------------------------------
2023-10-20 10:19:59,013 epoch 5 - iter 77/773 - loss 0.12213229 - time (sec): 1.68 - samples/sec: 7265.57 - lr: 0.000033 - momentum: 0.000000
2023-10-20 10:20:00,701 epoch 5 - iter 154/773 - loss 0.12055572 - time (sec): 3.37 - samples/sec: 7464.07 - lr: 0.000032 - momentum: 0.000000
2023-10-20 10:20:02,455 epoch 5 - iter 231/773 - loss 0.12157099 - time (sec): 5.12 - samples/sec: 7243.51 - lr: 0.000032 - momentum: 0.000000
2023-10-20 10:20:04,200 epoch 5 - iter 308/773 - loss 0.11807165 - time (sec): 6.87 - samples/sec: 7190.00 - lr: 0.000031 - momentum: 0.000000
2023-10-20 10:20:05,954 epoch 5 - iter 385/773 - loss 0.12115097 - time (sec): 8.62 - samples/sec: 7241.57 - lr: 0.000031 - momentum: 0.000000
2023-10-20 10:20:07,703 epoch 5 - iter 462/773 - loss 0.12082630 - time (sec): 10.37 - samples/sec: 7199.19 - lr: 0.000030 - momentum: 0.000000
2023-10-20 10:20:09,452 epoch 5 - iter 539/773 - loss 0.12301358 - time (sec): 12.12 - samples/sec: 7200.49 - lr: 0.000029 - momentum: 0.000000
2023-10-20 10:20:11,161 epoch 5 - iter 616/773 - loss 0.12323717 - time (sec): 13.83 - samples/sec: 7218.90 - lr: 0.000029 - momentum: 0.000000
2023-10-20 10:20:12,848 epoch 5 - iter 693/773 - loss 0.12371624 - time (sec): 15.52 - samples/sec: 7218.93 - lr: 0.000028 - momentum: 0.000000
2023-10-20 10:20:14,577 epoch 5 - iter 770/773 - loss 0.12519008 - time (sec): 17.25 - samples/sec: 7183.33 - lr: 0.000028 - momentum: 0.000000
2023-10-20 10:20:14,644 ----------------------------------------------------------------------------------------------------
2023-10-20 10:20:14,644 EPOCH 5 done: loss 0.1255 - lr: 0.000028
2023-10-20 10:20:15,740 DEV : loss 0.08093783259391785 - f1-score (micro avg)  0.5624
2023-10-20 10:20:15,752 saving best model
2023-10-20 10:20:15,787 ----------------------------------------------------------------------------------------------------
2023-10-20 10:20:17,528 epoch 6 - iter 77/773 - loss 0.11746877 - time (sec): 1.74 - samples/sec: 7216.53 - lr: 0.000027 - momentum: 0.000000
2023-10-20 10:20:19,261 epoch 6 - iter 154/773 - loss 0.10613002 - time (sec): 3.47 - samples/sec: 7147.55 - lr: 0.000027 - momentum: 0.000000
2023-10-20 10:20:21,050 epoch 6 - iter 231/773 - loss 0.10693685 - time (sec): 5.26 - samples/sec: 7117.39 - lr: 0.000026 - momentum: 0.000000
2023-10-20 10:20:22,761 epoch 6 - iter 308/773 - loss 0.10922407 - time (sec): 6.97 - samples/sec: 7043.40 - lr: 0.000026 - momentum: 0.000000
2023-10-20 10:20:24,521 epoch 6 - iter 385/773 - loss 0.11360514 - time (sec): 8.73 - samples/sec: 7057.18 - lr: 0.000025 - momentum: 0.000000
2023-10-20 10:20:26,292 epoch 6 - iter 462/773 - loss 0.11438363 - time (sec): 10.50 - samples/sec: 7089.40 - lr: 0.000024 - momentum: 0.000000
2023-10-20 10:20:28,060 epoch 6 - iter 539/773 - loss 0.11502393 - time (sec): 12.27 - samples/sec: 7121.44 - lr: 0.000024 - momentum: 0.000000
2023-10-20 10:20:29,797 epoch 6 - iter 616/773 - loss 0.11536019 - time (sec): 14.01 - samples/sec: 7118.05 - lr: 0.000023 - momentum: 0.000000
2023-10-20 10:20:31,497 epoch 6 - iter 693/773 - loss 0.11465378 - time (sec): 15.71 - samples/sec: 7130.83 - lr: 0.000023 - momentum: 0.000000
2023-10-20 10:20:33,255 epoch 6 - iter 770/773 - loss 0.11661110 - time (sec): 17.47 - samples/sec: 7085.16 - lr: 0.000022 - momentum: 0.000000
2023-10-20 10:20:33,324 ----------------------------------------------------------------------------------------------------
2023-10-20 10:20:33,325 EPOCH 6 done: loss 0.1163 - lr: 0.000022
2023-10-20 10:20:34,423 DEV : loss 0.07920077443122864 - f1-score (micro avg)  0.5938
2023-10-20 10:20:34,434 saving best model
2023-10-20 10:20:34,470 ----------------------------------------------------------------------------------------------------
2023-10-20 10:20:36,168 epoch 7 - iter 77/773 - loss 0.09474261 - time (sec): 1.70 - samples/sec: 7339.27 - lr: 0.000022 - momentum: 0.000000
2023-10-20 10:20:37,847 epoch 7 - iter 154/773 - loss 0.11040553 - time (sec): 3.38 - samples/sec: 7152.24 - lr: 0.000021 - momentum: 0.000000
2023-10-20 10:20:39,639 epoch 7 - iter 231/773 - loss 0.10910553 - time (sec): 5.17 - samples/sec: 7035.86 - lr: 0.000021 - momentum: 0.000000
2023-10-20 10:20:41,210 epoch 7 - iter 308/773 - loss 0.10937316 - time (sec): 6.74 - samples/sec: 7219.39 - lr: 0.000020 - momentum: 0.000000
2023-10-20 10:20:43,003 epoch 7 - iter 385/773 - loss 0.11012476 - time (sec): 8.53 - samples/sec: 7185.89 - lr: 0.000019 - momentum: 0.000000
2023-10-20 10:20:44,892 epoch 7 - iter 462/773 - loss 0.10863752 - time (sec): 10.42 - samples/sec: 7076.21 - lr: 0.000019 - momentum: 0.000000
2023-10-20 10:20:46,629 epoch 7 - iter 539/773 - loss 0.11063320 - time (sec): 12.16 - samples/sec: 7116.68 - lr: 0.000018 - momentum: 0.000000
2023-10-20 10:20:48,375 epoch 7 - iter 616/773 - loss 0.11080838 - time (sec): 13.90 - samples/sec: 7158.60 - lr: 0.000018 - momentum: 0.000000
2023-10-20 10:20:50,075 epoch 7 - iter 693/773 - loss 0.10962384 - time (sec): 15.60 - samples/sec: 7182.38 - lr: 0.000017 - momentum: 0.000000
2023-10-20 10:20:51,796 epoch 7 - iter 770/773 - loss 0.11036441 - time (sec): 17.33 - samples/sec: 7152.72 - lr: 0.000017 - momentum: 0.000000
2023-10-20 10:20:51,856 ----------------------------------------------------------------------------------------------------
2023-10-20 10:20:51,856 EPOCH 7 done: loss 0.1102 - lr: 0.000017
2023-10-20 10:20:52,939 DEV : loss 0.08024124056100845 - f1-score (micro avg)  0.5759
2023-10-20 10:20:52,951 ----------------------------------------------------------------------------------------------------
2023-10-20 10:20:54,677 epoch 8 - iter 77/773 - loss 0.09697879 - time (sec): 1.73 - samples/sec: 7324.84 - lr: 0.000016 - momentum: 0.000000
2023-10-20 10:20:56,401 epoch 8 - iter 154/773 - loss 0.10163992 - time (sec): 3.45 - samples/sec: 6933.69 - lr: 0.000016 - momentum: 0.000000
2023-10-20 10:20:58,130 epoch 8 - iter 231/773 - loss 0.10296558 - time (sec): 5.18 - samples/sec: 6998.96 - lr: 0.000015 - momentum: 0.000000
2023-10-20 10:20:59,877 epoch 8 - iter 308/773 - loss 0.10472086 - time (sec): 6.93 - samples/sec: 7099.31 - lr: 0.000014 - momentum: 0.000000
2023-10-20 10:21:01,656 epoch 8 - iter 385/773 - loss 0.10171931 - time (sec): 8.70 - samples/sec: 7078.82 - lr: 0.000014 - momentum: 0.000000
2023-10-20 10:21:03,414 epoch 8 - iter 462/773 - loss 0.10123830 - time (sec): 10.46 - samples/sec: 7071.45 - lr: 0.000013 - momentum: 0.000000
2023-10-20 10:21:05,121 epoch 8 - iter 539/773 - loss 0.10188542 - time (sec): 12.17 - samples/sec: 7010.05 - lr: 0.000013 - momentum: 0.000000
2023-10-20 10:21:06,893 epoch 8 - iter 616/773 - loss 0.10110334 - time (sec): 13.94 - samples/sec: 7076.37 - lr: 0.000012 - momentum: 0.000000
2023-10-20 10:21:08,793 epoch 8 - iter 693/773 - loss 0.10353577 - time (sec): 15.84 - samples/sec: 7019.44 - lr: 0.000012 - momentum: 0.000000
2023-10-20 10:21:10,571 epoch 8 - iter 770/773 - loss 0.10656965 - time (sec): 17.62 - samples/sec: 7032.87 - lr: 0.000011 - momentum: 0.000000
2023-10-20 10:21:10,637 ----------------------------------------------------------------------------------------------------
2023-10-20 10:21:10,637 EPOCH 8 done: loss 0.1065 - lr: 0.000011
2023-10-20 10:21:11,725 DEV : loss 0.08284606039524078 - f1-score (micro avg)  0.5864
2023-10-20 10:21:11,737 ----------------------------------------------------------------------------------------------------
2023-10-20 10:21:13,496 epoch 9 - iter 77/773 - loss 0.10668199 - time (sec): 1.76 - samples/sec: 7717.89 - lr: 0.000011 - momentum: 0.000000
2023-10-20 10:21:15,237 epoch 9 - iter 154/773 - loss 0.10089596 - time (sec): 3.50 - samples/sec: 7436.55 - lr: 0.000010 - momentum: 0.000000
2023-10-20 10:21:17,044 epoch 9 - iter 231/773 - loss 0.10023690 - time (sec): 5.31 - samples/sec: 7290.69 - lr: 0.000009 - momentum: 0.000000
2023-10-20 10:21:18,782 epoch 9 - iter 308/773 - loss 0.10112886 - time (sec): 7.04 - samples/sec: 7224.41 - lr: 0.000009 - momentum: 0.000000
2023-10-20 10:21:20,480 epoch 9 - iter 385/773 - loss 0.09928799 - time (sec): 8.74 - samples/sec: 7262.97 - lr: 0.000008 - momentum: 0.000000
2023-10-20 10:21:22,194 epoch 9 - iter 462/773 - loss 0.10228269 - time (sec): 10.46 - samples/sec: 7222.17 - lr: 0.000008 - momentum: 0.000000
2023-10-20 10:21:23,955 epoch 9 - iter 539/773 - loss 0.09979659 - time (sec): 12.22 - samples/sec: 7174.05 - lr: 0.000007 - momentum: 0.000000
2023-10-20 10:21:25,713 epoch 9 - iter 616/773 - loss 0.09923517 - time (sec): 13.98 - samples/sec: 7078.44 - lr: 0.000007 - momentum: 0.000000
2023-10-20 10:21:27,481 epoch 9 - iter 693/773 - loss 0.10044792 - time (sec): 15.74 - samples/sec: 7108.21 - lr: 0.000006 - momentum: 0.000000
2023-10-20 10:21:29,198 epoch 9 - iter 770/773 - loss 0.10167632 - time (sec): 17.46 - samples/sec: 7088.31 - lr: 0.000006 - momentum: 0.000000
2023-10-20 10:21:29,267 ----------------------------------------------------------------------------------------------------
2023-10-20 10:21:29,267 EPOCH 9 done: loss 0.1015 - lr: 0.000006
2023-10-20 10:21:30,360 DEV : loss 0.08339545875787735 - f1-score (micro avg)  0.5895
2023-10-20 10:21:30,372 ----------------------------------------------------------------------------------------------------
2023-10-20 10:21:32,116 epoch 10 - iter 77/773 - loss 0.10844329 - time (sec): 1.74 - samples/sec: 7021.81 - lr: 0.000005 - momentum: 0.000000
2023-10-20 10:21:33,887 epoch 10 - iter 154/773 - loss 0.10838063 - time (sec): 3.52 - samples/sec: 7150.42 - lr: 0.000005 - momentum: 0.000000
2023-10-20 10:21:35,563 epoch 10 - iter 231/773 - loss 0.09648269 - time (sec): 5.19 - samples/sec: 7088.88 - lr: 0.000004 - momentum: 0.000000
2023-10-20 10:21:37,267 epoch 10 - iter 308/773 - loss 0.09663622 - time (sec): 6.90 - samples/sec: 7001.59 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:21:38,966 epoch 10 - iter 385/773 - loss 0.09900535 - time (sec): 8.59 - samples/sec: 7102.73 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:21:40,567 epoch 10 - iter 462/773 - loss 0.09936474 - time (sec): 10.19 - samples/sec: 7201.67 - lr: 0.000002 - momentum: 0.000000
2023-10-20 10:21:42,271 epoch 10 - iter 539/773 - loss 0.09809994 - time (sec): 11.90 - samples/sec: 7266.80 - lr: 0.000002 - momentum: 0.000000
2023-10-20 10:21:44,006 epoch 10 - iter 616/773 - loss 0.09826543 - time (sec): 13.63 - samples/sec: 7217.04 - lr: 0.000001 - momentum: 0.000000
2023-10-20 10:21:45,897 epoch 10 - iter 693/773 - loss 0.09749620 - time (sec): 15.52 - samples/sec: 7190.67 - lr: 0.000001 - momentum: 0.000000
2023-10-20 10:21:47,593 epoch 10 - iter 770/773 - loss 0.10058972 - time (sec): 17.22 - samples/sec: 7193.03 - lr: 0.000000 - momentum: 0.000000
2023-10-20 10:21:47,655 ----------------------------------------------------------------------------------------------------
2023-10-20 10:21:47,656 EPOCH 10 done: loss 0.1007 - lr: 0.000000
2023-10-20 10:21:48,756 DEV : loss 0.08416001498699188 - f1-score (micro avg)  0.5843
2023-10-20 10:21:48,800 ----------------------------------------------------------------------------------------------------
2023-10-20 10:21:48,800 Loading model from best epoch ...
2023-10-20 10:21:48,876 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 10:21:51,832 Results:
- F-score (micro) 0.5388
- F-score (macro) 0.2565
- Accuracy 0.381

By class:
              precision    recall  f1-score   support

         LOC     0.5544    0.6628    0.6038       946
    BUILDING     0.2267    0.0919    0.1308       185
      STREET     1.0000    0.0179    0.0351        56

   micro avg     0.5344    0.5434    0.5388      1187
   macro avg     0.5937    0.2575    0.2565      1187
weighted avg     0.5243    0.5434    0.5032      1187

2023-10-20 10:21:51,832 ----------------------------------------------------------------------------------------------------
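The learning-rate column in the log follows the `LinearScheduler | warmup_fraction: '0.1'` plugin: with 10 epochs of 773 mini-batches (7730 steps total), the rate climbs linearly from 0 to the peak of 5e-05 over the first 10% of steps (exactly epoch 1, matching the logged 0.000005 → 0.000050 ramp), then decays linearly to 0 by the last step. A minimal illustrative re-implementation of that schedule (not Flair's internal code; function name and structure are my own):

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    Illustrative sketch of the schedule reported by the LinearScheduler
    plugin in the log above; not Flair's actual implementation.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup: ramp from 0 up to peak_lr
        return peak_lr * step / warmup_steps
    # decay: fall linearly from peak_lr back to 0
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 7730  # 10 epochs x 773 mini-batches, as in the log

print(linear_schedule_lr(77, total))    # ~0.000005, matches epoch 1, iter 77
print(linear_schedule_lr(773, total))   # 5e-05 peak at the end of warmup
print(linear_schedule_lr(7730, total))  # 0.0 at the final step
```

The epoch-2 values in the log (0.000049 and falling) are consistent with the decay branch starting right after epoch 1.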
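The aggregate scores in the final table can be re-derived from the per-class rows as a consistency check: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 is the unweighted mean of the three per-class F1 scores. A short sketch using the numbers from the log (small last-digit differences come from the table's 4-decimal rounding):

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

# (precision, recall, f1, support) per class, copied from the results table
per_class = {
    "LOC":      (0.5544, 0.6628, 0.6038, 946),
    "BUILDING": (0.2267, 0.0919, 0.1308, 185),
    "STREET":   (1.0000, 0.0179, 0.0351, 56),
}

# Micro F1 from the micro-averaged precision/recall row (0.5344 / 0.5434).
micro = f1(0.5344, 0.5434)
# Macro F1: unweighted mean of the per-class F1 scores.
macro = sum(v[2] for v in per_class.values()) / len(per_class)

print(round(micro, 4))  # 0.5389 -- log reports 0.5388 (rounding of p and r)
print(round(macro, 4))  # 0.2566 -- log reports 0.2565
```

The large gap between micro (0.5388) and macro (0.2565) F1 reflects the class imbalance: LOC dominates the support (946 of 1187), while BUILDING and especially STREET are recalled poorly.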