2023-10-20 10:00:36,344 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,344 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 10:00:36,344 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,344 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 10:00:36,345 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,345 Train: 6183 sentences
2023-10-20 10:00:36,345 (train_with_dev=False, train_with_test=False)
2023-10-20 10:00:36,345 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,345 Training Params:
2023-10-20 10:00:36,345 - learning_rate: "3e-05"
2023-10-20 10:00:36,345 - mini_batch_size: "8"
2023-10-20 10:00:36,345 - max_epochs: "10"
2023-10-20 10:00:36,345 - shuffle: "True"
2023-10-20 10:00:36,345 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,345 Plugins:
2023-10-20 10:00:36,345 - TensorboardLogger
2023-10-20 10:00:36,345 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 10:00:36,345 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,345 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 10:00:36,345 - metric: "('micro avg', 'f1-score')"
2023-10-20 10:00:36,345 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,345 Computation:
2023-10-20 10:00:36,345 - compute on device: cuda:0
2023-10-20 10:00:36,345 - embedding storage: none
2023-10-20 10:00:36,345 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,345 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-20 10:00:36,345 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,345 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:36,345 Logging anything other than scalars to 
TensorBoard is currently not supported.
2023-10-20 10:00:38,005 epoch 1 - iter 77/773 - loss 3.81241056 - time (sec): 1.66 - samples/sec: 7452.07 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:00:39,655 epoch 1 - iter 154/773 - loss 3.61951758 - time (sec): 3.31 - samples/sec: 7488.42 - lr: 0.000006 - momentum: 0.000000
2023-10-20 10:00:41,384 epoch 1 - iter 231/773 - loss 3.30275246 - time (sec): 5.04 - samples/sec: 7254.09 - lr: 0.000009 - momentum: 0.000000
2023-10-20 10:00:43,171 epoch 1 - iter 308/773 - loss 2.89747658 - time (sec): 6.82 - samples/sec: 7201.04 - lr: 0.000012 - momentum: 0.000000
2023-10-20 10:00:44,906 epoch 1 - iter 385/773 - loss 2.50629080 - time (sec): 8.56 - samples/sec: 7072.38 - lr: 0.000015 - momentum: 0.000000
2023-10-20 10:00:46,625 epoch 1 - iter 462/773 - loss 2.15258275 - time (sec): 10.28 - samples/sec: 7082.99 - lr: 0.000018 - momentum: 0.000000
2023-10-20 10:00:48,309 epoch 1 - iter 539/773 - loss 1.88716852 - time (sec): 11.96 - samples/sec: 7097.11 - lr: 0.000021 - momentum: 0.000000
2023-10-20 10:00:50,102 epoch 1 - iter 616/773 - loss 1.66647173 - time (sec): 13.76 - samples/sec: 7139.05 - lr: 0.000024 - momentum: 0.000000
2023-10-20 10:00:51,814 epoch 1 - iter 693/773 - loss 1.50059755 - time (sec): 15.47 - samples/sec: 7193.98 - lr: 0.000027 - momentum: 0.000000
2023-10-20 10:00:53,502 epoch 1 - iter 770/773 - loss 1.37496153 - time (sec): 17.16 - samples/sec: 7219.32 - lr: 0.000030 - momentum: 0.000000
2023-10-20 10:00:53,558 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:53,558 EPOCH 1 done: loss 1.3710 - lr: 0.000030
2023-10-20 10:00:54,265 DEV : loss 0.15286844968795776 - f1-score (micro avg) 0.0
2023-10-20 10:00:54,278 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:55,770 epoch 2 - iter 77/773 - loss 0.23214451 - time (sec): 1.49 - samples/sec: 8859.76 - lr: 0.000030 - momentum: 0.000000
2023-10-20 10:00:57,781 epoch 2 - iter 154/773 - loss 0.24097431 - time (sec): 3.50 - samples/sec: 7562.15 - lr: 0.000029 - momentum: 0.000000
2023-10-20 10:00:59,442 epoch 2 - iter 231/773 - loss 0.24356049 - time (sec): 5.16 - samples/sec: 7206.01 - lr: 0.000029 - momentum: 0.000000
2023-10-20 10:01:01,167 epoch 2 - iter 308/773 - loss 0.24470903 - time (sec): 6.89 - samples/sec: 7125.02 - lr: 0.000029 - momentum: 0.000000
2023-10-20 10:01:02,897 epoch 2 - iter 385/773 - loss 0.24262456 - time (sec): 8.62 - samples/sec: 7071.31 - lr: 0.000028 - momentum: 0.000000
2023-10-20 10:01:04,655 epoch 2 - iter 462/773 - loss 0.23888426 - time (sec): 10.38 - samples/sec: 7082.59 - lr: 0.000028 - momentum: 0.000000
2023-10-20 10:01:06,429 epoch 2 - iter 539/773 - loss 0.22964243 - time (sec): 12.15 - samples/sec: 7132.08 - lr: 0.000028 - momentum: 0.000000
2023-10-20 10:01:08,166 epoch 2 - iter 616/773 - loss 0.23048232 - time (sec): 13.89 - samples/sec: 7075.56 - lr: 0.000027 - momentum: 0.000000
2023-10-20 10:01:09,923 epoch 2 - iter 693/773 - loss 0.22376378 - time (sec): 15.64 - samples/sec: 7042.07 - lr: 0.000027 - momentum: 0.000000
2023-10-20 10:01:11,712 epoch 2 - iter 770/773 - loss 0.21921775 - time (sec): 17.43 - samples/sec: 7098.42 - lr: 0.000027 - momentum: 0.000000
2023-10-20 10:01:11,774 ----------------------------------------------------------------------------------------------------
2023-10-20 10:01:11,774 EPOCH 2 done: loss 0.2189 - lr: 0.000027
2023-10-20 10:01:12,842 DEV : loss 0.10697879642248154 - f1-score (micro avg) 0.2113
2023-10-20 10:01:12,854 saving best model
2023-10-20 10:01:12,882 ----------------------------------------------------------------------------------------------------
2023-10-20 10:01:14,585 epoch 3 - iter 77/773 - loss 0.17960081 - time (sec): 1.70 - samples/sec: 7327.95 - lr: 0.000026 - momentum: 0.000000
2023-10-20 10:01:16,295 epoch 3 - iter 154/773 - loss 0.17934017 - time (sec): 3.41 - samples/sec: 7047.87 - lr: 0.000026 - momentum: 0.000000
2023-10-20 10:01:18,025 epoch 3 - iter 231/773 - loss 0.17558716 - time (sec): 5.14 - samples/sec: 7188.59 - lr: 0.000026 - momentum: 0.000000
2023-10-20 10:01:19,753 epoch 3 - iter 308/773 - loss 0.17212991 - time (sec): 6.87 - samples/sec: 7197.64 - lr: 0.000025 - momentum: 0.000000
2023-10-20 10:01:21,516 epoch 3 - iter 385/773 - loss 0.17346036 - time (sec): 8.63 - samples/sec: 7219.64 - lr: 0.000025 - momentum: 0.000000
2023-10-20 10:01:23,201 epoch 3 - iter 462/773 - loss 0.17673051 - time (sec): 10.32 - samples/sec: 7092.46 - lr: 0.000025 - momentum: 0.000000
2023-10-20 10:01:24,962 epoch 3 - iter 539/773 - loss 0.17649304 - time (sec): 12.08 - samples/sec: 7123.71 - lr: 0.000024 - momentum: 0.000000
2023-10-20 10:01:26,682 epoch 3 - iter 616/773 - loss 0.17628556 - time (sec): 13.80 - samples/sec: 7172.27 - lr: 0.000024 - momentum: 0.000000
2023-10-20 10:01:28,419 epoch 3 - iter 693/773 - loss 0.17756531 - time (sec): 15.54 - samples/sec: 7157.36 - lr: 0.000024 - momentum: 0.000000
2023-10-20 10:01:30,165 epoch 3 - iter 770/773 - loss 0.17579288 - time (sec): 17.28 - samples/sec: 7169.81 - lr: 0.000023 - momentum: 0.000000
2023-10-20 10:01:30,225 ----------------------------------------------------------------------------------------------------
2023-10-20 10:01:30,225 EPOCH 3 done: loss 0.1759 - lr: 0.000023
2023-10-20 10:01:31,314 DEV : loss 0.0952216237783432 - f1-score (micro avg) 0.3757
2023-10-20 10:01:31,327 saving best model
2023-10-20 10:01:31,361 ----------------------------------------------------------------------------------------------------
2023-10-20 10:01:33,173 epoch 4 - iter 77/773 - loss 0.18477387 - time (sec): 1.81 - samples/sec: 7281.37 - lr: 0.000023 - momentum: 0.000000
2023-10-20 10:01:34,897 epoch 4 - iter 154/773 - loss 0.18279801 - time (sec): 3.53 - samples/sec: 7010.16 - lr: 0.000023 - momentum: 0.000000
2023-10-20 10:01:36,632 epoch 4 - iter 231/773 - loss 0.17577730 - time (sec): 
5.27 - samples/sec: 6871.73 - lr: 0.000022 - momentum: 0.000000
2023-10-20 10:01:38,395 epoch 4 - iter 308/773 - loss 0.17374644 - time (sec): 7.03 - samples/sec: 7028.03 - lr: 0.000022 - momentum: 0.000000
2023-10-20 10:01:40,155 epoch 4 - iter 385/773 - loss 0.17036912 - time (sec): 8.79 - samples/sec: 6952.04 - lr: 0.000022 - momentum: 0.000000
2023-10-20 10:01:41,895 epoch 4 - iter 462/773 - loss 0.16676008 - time (sec): 10.53 - samples/sec: 6967.86 - lr: 0.000021 - momentum: 0.000000
2023-10-20 10:01:43,678 epoch 4 - iter 539/773 - loss 0.16364124 - time (sec): 12.32 - samples/sec: 6973.48 - lr: 0.000021 - momentum: 0.000000
2023-10-20 10:01:45,415 epoch 4 - iter 616/773 - loss 0.16324635 - time (sec): 14.05 - samples/sec: 7019.45 - lr: 0.000021 - momentum: 0.000000
2023-10-20 10:01:47,202 epoch 4 - iter 693/773 - loss 0.16215188 - time (sec): 15.84 - samples/sec: 7030.24 - lr: 0.000020 - momentum: 0.000000
2023-10-20 10:01:48,917 epoch 4 - iter 770/773 - loss 0.15946543 - time (sec): 17.56 - samples/sec: 7046.10 - lr: 0.000020 - momentum: 0.000000
2023-10-20 10:01:48,986 ----------------------------------------------------------------------------------------------------
2023-10-20 10:01:48,986 EPOCH 4 done: loss 0.1598 - lr: 0.000020
2023-10-20 10:01:50,067 DEV : loss 0.08685088157653809 - f1-score (micro avg) 0.4806
2023-10-20 10:01:50,079 saving best model
2023-10-20 10:01:50,119 ----------------------------------------------------------------------------------------------------
2023-10-20 10:01:51,873 epoch 5 - iter 77/773 - loss 0.14344519 - time (sec): 1.75 - samples/sec: 6814.44 - lr: 0.000020 - momentum: 0.000000
2023-10-20 10:01:53,683 epoch 5 - iter 154/773 - loss 0.15258628 - time (sec): 3.56 - samples/sec: 7121.76 - lr: 0.000019 - momentum: 0.000000
2023-10-20 10:01:55,431 epoch 5 - iter 231/773 - loss 0.15926291 - time (sec): 5.31 - samples/sec: 7141.00 - lr: 0.000019 - momentum: 0.000000
2023-10-20 10:01:57,170 epoch 5 - iter 308/773 - loss 0.15525040 - time (sec): 7.05 - samples/sec: 7002.21 - lr: 0.000019 - momentum: 0.000000
2023-10-20 10:01:59,041 epoch 5 - iter 385/773 - loss 0.15298376 - time (sec): 8.92 - samples/sec: 6989.00 - lr: 0.000018 - momentum: 0.000000
2023-10-20 10:02:00,849 epoch 5 - iter 462/773 - loss 0.14900655 - time (sec): 10.73 - samples/sec: 6924.76 - lr: 0.000018 - momentum: 0.000000
2023-10-20 10:02:02,653 epoch 5 - iter 539/773 - loss 0.14781168 - time (sec): 12.53 - samples/sec: 6941.72 - lr: 0.000018 - momentum: 0.000000
2023-10-20 10:02:04,495 epoch 5 - iter 616/773 - loss 0.15062811 - time (sec): 14.38 - samples/sec: 6896.32 - lr: 0.000017 - momentum: 0.000000
2023-10-20 10:02:06,282 epoch 5 - iter 693/773 - loss 0.15115235 - time (sec): 16.16 - samples/sec: 6910.24 - lr: 0.000017 - momentum: 0.000000
2023-10-20 10:02:08,002 epoch 5 - iter 770/773 - loss 0.14968908 - time (sec): 17.88 - samples/sec: 6919.99 - lr: 0.000017 - momentum: 0.000000
2023-10-20 10:02:08,074 ----------------------------------------------------------------------------------------------------
2023-10-20 10:02:08,074 EPOCH 5 done: loss 0.1495 - lr: 0.000017
2023-10-20 10:02:09,173 DEV : loss 0.08496666699647903 - f1-score (micro avg) 0.4939
2023-10-20 10:02:09,185 saving best model
2023-10-20 10:02:09,219 ----------------------------------------------------------------------------------------------------
2023-10-20 10:02:10,915 epoch 6 - iter 77/773 - loss 0.13683614 - time (sec): 1.70 - samples/sec: 6668.03 - lr: 0.000016 - momentum: 0.000000
2023-10-20 10:02:12,711 epoch 6 - iter 154/773 - loss 0.14211735 - time (sec): 3.49 - samples/sec: 6801.54 - lr: 0.000016 - momentum: 0.000000
2023-10-20 10:02:14,513 epoch 6 - iter 231/773 - loss 0.14128117 - time (sec): 5.29 - samples/sec: 6853.01 - lr: 0.000016 - momentum: 0.000000
2023-10-20 10:02:16,289 epoch 6 - iter 308/773 - loss 0.13590630 - time (sec): 7.07 - samples/sec: 6891.57 - lr: 0.000015 - momentum: 0.000000
2023-10-20 10:02:17,903 epoch 6 - iter 385/773 - loss 0.14259420 - time (sec): 8.68 - samples/sec: 7003.45 - lr: 0.000015 - momentum: 0.000000
2023-10-20 10:02:19,585 epoch 6 - iter 462/773 - loss 0.14552202 - time (sec): 10.37 - samples/sec: 7037.70 - lr: 0.000015 - momentum: 0.000000
2023-10-20 10:02:21,375 epoch 6 - iter 539/773 - loss 0.14254133 - time (sec): 12.16 - samples/sec: 7031.44 - lr: 0.000014 - momentum: 0.000000
2023-10-20 10:02:23,107 epoch 6 - iter 616/773 - loss 0.14102050 - time (sec): 13.89 - samples/sec: 7092.37 - lr: 0.000014 - momentum: 0.000000
2023-10-20 10:02:24,871 epoch 6 - iter 693/773 - loss 0.13907785 - time (sec): 15.65 - samples/sec: 7098.85 - lr: 0.000014 - momentum: 0.000000
2023-10-20 10:02:26,636 epoch 6 - iter 770/773 - loss 0.14100570 - time (sec): 17.42 - samples/sec: 7095.47 - lr: 0.000013 - momentum: 0.000000
2023-10-20 10:02:26,717 ----------------------------------------------------------------------------------------------------
2023-10-20 10:02:26,717 EPOCH 6 done: loss 0.1404 - lr: 0.000013
2023-10-20 10:02:27,791 DEV : loss 0.08332625776529312 - f1-score (micro avg) 0.5304
2023-10-20 10:02:27,802 saving best model
2023-10-20 10:02:27,836 ----------------------------------------------------------------------------------------------------
2023-10-20 10:02:29,593 epoch 7 - iter 77/773 - loss 0.14476114 - time (sec): 1.76 - samples/sec: 7188.18 - lr: 0.000013 - momentum: 0.000000
2023-10-20 10:02:31,331 epoch 7 - iter 154/773 - loss 0.14402242 - time (sec): 3.49 - samples/sec: 7109.92 - lr: 0.000013 - momentum: 0.000000
2023-10-20 10:02:33,063 epoch 7 - iter 231/773 - loss 0.14036025 - time (sec): 5.23 - samples/sec: 7082.29 - lr: 0.000012 - momentum: 0.000000
2023-10-20 10:02:34,756 epoch 7 - iter 308/773 - loss 0.14534767 - time (sec): 6.92 - samples/sec: 7173.84 - lr: 0.000012 - momentum: 0.000000
2023-10-20 10:02:36,559 epoch 7 - iter 385/773 - loss 0.14110367 - time (sec): 8.72 - samples/sec: 7275.71 - lr: 0.000012 - momentum: 0.000000
2023-10-20 
10:02:38,203 epoch 7 - iter 462/773 - loss 0.13753663 - time (sec): 10.37 - samples/sec: 7309.11 - lr: 0.000011 - momentum: 0.000000
2023-10-20 10:02:39,822 epoch 7 - iter 539/773 - loss 0.13875122 - time (sec): 11.99 - samples/sec: 7318.44 - lr: 0.000011 - momentum: 0.000000
2023-10-20 10:02:41,511 epoch 7 - iter 616/773 - loss 0.13955186 - time (sec): 13.67 - samples/sec: 7296.14 - lr: 0.000011 - momentum: 0.000000
2023-10-20 10:02:43,257 epoch 7 - iter 693/773 - loss 0.13895365 - time (sec): 15.42 - samples/sec: 7257.89 - lr: 0.000010 - momentum: 0.000000
2023-10-20 10:02:44,971 epoch 7 - iter 770/773 - loss 0.13686198 - time (sec): 17.13 - samples/sec: 7227.08 - lr: 0.000010 - momentum: 0.000000
2023-10-20 10:02:45,031 ----------------------------------------------------------------------------------------------------
2023-10-20 10:02:45,031 EPOCH 7 done: loss 0.1368 - lr: 0.000010
2023-10-20 10:02:46,107 DEV : loss 0.08445805311203003 - f1-score (micro avg) 0.5253
2023-10-20 10:02:46,118 ----------------------------------------------------------------------------------------------------
2023-10-20 10:02:47,862 epoch 8 - iter 77/773 - loss 0.12222683 - time (sec): 1.74 - samples/sec: 7201.77 - lr: 0.000010 - momentum: 0.000000
2023-10-20 10:02:49,623 epoch 8 - iter 154/773 - loss 0.12160289 - time (sec): 3.50 - samples/sec: 7112.73 - lr: 0.000009 - momentum: 0.000000
2023-10-20 10:02:51,352 epoch 8 - iter 231/773 - loss 0.12601210 - time (sec): 5.23 - samples/sec: 7119.34 - lr: 0.000009 - momentum: 0.000000
2023-10-20 10:02:53,104 epoch 8 - iter 308/773 - loss 0.12825557 - time (sec): 6.99 - samples/sec: 7114.87 - lr: 0.000009 - momentum: 0.000000
2023-10-20 10:02:54,802 epoch 8 - iter 385/773 - loss 0.13149787 - time (sec): 8.68 - samples/sec: 7182.75 - lr: 0.000008 - momentum: 0.000000
2023-10-20 10:02:56,573 epoch 8 - iter 462/773 - loss 0.13242894 - time (sec): 10.45 - samples/sec: 7128.04 - lr: 0.000008 - momentum: 0.000000
2023-10-20 10:02:58,284 epoch 8 - iter 539/773 - loss 0.13472536 - time (sec): 12.17 - samples/sec: 7127.85 - lr: 0.000008 - momentum: 0.000000
2023-10-20 10:03:00,085 epoch 8 - iter 616/773 - loss 0.13365370 - time (sec): 13.97 - samples/sec: 7098.48 - lr: 0.000007 - momentum: 0.000000
2023-10-20 10:03:01,833 epoch 8 - iter 693/773 - loss 0.13290360 - time (sec): 15.71 - samples/sec: 7091.52 - lr: 0.000007 - momentum: 0.000000
2023-10-20 10:03:03,535 epoch 8 - iter 770/773 - loss 0.13146102 - time (sec): 17.42 - samples/sec: 7107.06 - lr: 0.000007 - momentum: 0.000000
2023-10-20 10:03:03,599 ----------------------------------------------------------------------------------------------------
2023-10-20 10:03:03,599 EPOCH 8 done: loss 0.1312 - lr: 0.000007
2023-10-20 10:03:04,675 DEV : loss 0.08267096430063248 - f1-score (micro avg) 0.5442
2023-10-20 10:03:04,687 saving best model
2023-10-20 10:03:04,730 ----------------------------------------------------------------------------------------------------
2023-10-20 10:03:06,377 epoch 9 - iter 77/773 - loss 0.13426845 - time (sec): 1.65 - samples/sec: 7221.24 - lr: 0.000006 - momentum: 0.000000
2023-10-20 10:03:08,121 epoch 9 - iter 154/773 - loss 0.12764335 - time (sec): 3.39 - samples/sec: 7073.83 - lr: 0.000006 - momentum: 0.000000
2023-10-20 10:03:09,885 epoch 9 - iter 231/773 - loss 0.13064958 - time (sec): 5.15 - samples/sec: 7054.91 - lr: 0.000006 - momentum: 0.000000
2023-10-20 10:03:11,689 epoch 9 - iter 308/773 - loss 0.12547793 - time (sec): 6.96 - samples/sec: 7098.59 - lr: 0.000005 - momentum: 0.000000
2023-10-20 10:03:13,457 epoch 9 - iter 385/773 - loss 0.12185205 - time (sec): 8.73 - samples/sec: 7038.94 - lr: 0.000005 - momentum: 0.000000
2023-10-20 10:03:15,360 epoch 9 - iter 462/773 - loss 0.12534457 - time (sec): 10.63 - samples/sec: 6904.04 - lr: 0.000005 - momentum: 0.000000
2023-10-20 10:03:17,212 epoch 9 - iter 539/773 - loss 0.12499457 - time (sec): 12.48 - samples/sec: 6875.73 - lr: 0.000004 - momentum: 0.000000
2023-10-20 10:03:19,029 epoch 9 - iter 616/773 - loss 0.12754088 - time (sec): 14.30 - samples/sec: 6896.18 - lr: 0.000004 - momentum: 0.000000
2023-10-20 10:03:20,794 epoch 9 - iter 693/773 - loss 0.12629823 - time (sec): 16.06 - samples/sec: 6965.13 - lr: 0.000004 - momentum: 0.000000
2023-10-20 10:03:22,545 epoch 9 - iter 770/773 - loss 0.12671079 - time (sec): 17.81 - samples/sec: 6951.05 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:03:22,608 ----------------------------------------------------------------------------------------------------
2023-10-20 10:03:22,609 EPOCH 9 done: loss 0.1270 - lr: 0.000003
2023-10-20 10:03:23,687 DEV : loss 0.08247760683298111 - f1-score (micro avg) 0.5471
2023-10-20 10:03:23,698 saving best model
2023-10-20 10:03:23,736 ----------------------------------------------------------------------------------------------------
2023-10-20 10:03:25,433 epoch 10 - iter 77/773 - loss 0.14764006 - time (sec): 1.70 - samples/sec: 6905.70 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:03:27,260 epoch 10 - iter 154/773 - loss 0.13385632 - time (sec): 3.52 - samples/sec: 6919.85 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:03:28,975 epoch 10 - iter 231/773 - loss 0.12965434 - time (sec): 5.24 - samples/sec: 7108.23 - lr: 0.000002 - momentum: 0.000000
2023-10-20 10:03:30,805 epoch 10 - iter 308/773 - loss 0.12967758 - time (sec): 7.07 - samples/sec: 7041.74 - lr: 0.000002 - momentum: 0.000000
2023-10-20 10:03:32,607 epoch 10 - iter 385/773 - loss 0.13165440 - time (sec): 8.87 - samples/sec: 6989.40 - lr: 0.000002 - momentum: 0.000000
2023-10-20 10:03:34,348 epoch 10 - iter 462/773 - loss 0.13289038 - time (sec): 10.61 - samples/sec: 6946.57 - lr: 0.000001 - momentum: 0.000000
2023-10-20 10:03:36,176 epoch 10 - iter 539/773 - loss 0.13066958 - time (sec): 12.44 - samples/sec: 6936.41 - lr: 0.000001 - momentum: 0.000000
2023-10-20 10:03:37,989 epoch 10 - iter 616/773 - loss 0.12645987 - time (sec): 14.25 - samples/sec: 6961.31 - lr: 0.000001 - 
momentum: 0.000000
2023-10-20 10:03:39,744 epoch 10 - iter 693/773 - loss 0.12677426 - time (sec): 16.01 - samples/sec: 6945.06 - lr: 0.000000 - momentum: 0.000000
2023-10-20 10:03:41,586 epoch 10 - iter 770/773 - loss 0.12613690 - time (sec): 17.85 - samples/sec: 6927.38 - lr: 0.000000 - momentum: 0.000000
2023-10-20 10:03:41,655 ----------------------------------------------------------------------------------------------------
2023-10-20 10:03:41,655 EPOCH 10 done: loss 0.1262 - lr: 0.000000
2023-10-20 10:03:42,740 DEV : loss 0.08230794966220856 - f1-score (micro avg) 0.5483
2023-10-20 10:03:42,752 saving best model
2023-10-20 10:03:42,817 ----------------------------------------------------------------------------------------------------
2023-10-20 10:03:42,817 Loading model from best epoch ...
2023-10-20 10:03:42,890 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 10:03:45,758 Results:
- F-score (micro) 0.5024
- F-score (macro) 0.189
- Accuracy 0.3412

By class:
              precision    recall  f1-score   support

         LOC     0.5714    0.5624    0.5669       946
    BUILDING     0.0000    0.0000    0.0000       185
      STREET     0.0000    0.0000    0.0000        56

   micro avg     0.5714    0.4482    0.5024      1187
   macro avg     0.1905    0.1875    0.1890      1187
weighted avg     0.4554    0.4482    0.4518      1187

2023-10-20 10:03:45,758 ----------------------------------------------------------------------------------------------------
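
The summary rows of the final "By class" report can be re-derived from the per-class rows. The sketch below is a minimal check, assuming the usual definitions: F1 is the harmonic mean of precision and recall, the macro average is the unweighted mean over classes, and the weighted average is weighted by support. The helper names (`f1`, `macro_avg`, `weighted_avg`) are illustrative and not part of Flair. The inputs are the 4-decimal values printed in the log, so agreement is only expected to 4 decimals.

```python
# Per-class rows copied from the "By class" report in the log:
# label -> (precision, recall, f1, support)
per_class = {
    "LOC": (0.5714, 0.5624, 0.5669, 946),
    "BUILDING": (0.0, 0.0, 0.0, 185),
    "STREET": (0.0, 0.0, 0.0, 56),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_avg(values):
    """Unweighted mean over classes."""
    values = list(values)
    return sum(values) / len(values)


def weighted_avg(values, weights):
    """Mean over classes, weighted by each class's support."""
    values, weights = list(values), list(weights)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


supports = [row[3] for row in per_class.values()]
f1_scores = [row[2] for row in per_class.values()]

total_support = sum(supports)
# Micro F1 from the micro-averaged precision and recall printed in the log.
micro_f1 = f1(0.5714, 0.4482)
macro_f1 = macro_avg(f1_scores)
weighted_f1 = weighted_avg(f1_scores, supports)

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

Because BUILDING and STREET both score 0.0, the macro F1 is simply the LOC F1 divided by three, which is why it sits far below the micro F1.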