2023-10-20 09:36:51,182 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,183 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 09:36:51,183 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,183 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 09:36:51,183 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,183 Train:  6183 sentences
2023-10-20 09:36:51,183         (train_with_dev=False, train_with_test=False)
2023-10-20 09:36:51,183 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,183 Training Params:
2023-10-20 09:36:51,183  - learning_rate: "3e-05"
2023-10-20 09:36:51,183  - mini_batch_size: "4"
2023-10-20 09:36:51,183  - max_epochs: "10"
2023-10-20 09:36:51,183  - shuffle: "True"
2023-10-20 09:36:51,183 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,183 Plugins:
2023-10-20 09:36:51,183  - TensorboardLogger
2023-10-20 09:36:51,183  - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 09:36:51,183 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,183 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 09:36:51,183  - metric: "('micro avg', 'f1-score')"
2023-10-20 09:36:51,183 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,183 Computation:
2023-10-20 09:36:51,184  - compute on device: cuda:0
2023-10-20 09:36:51,184  - embedding storage: none
2023-10-20 09:36:51,184 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,184 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-20 09:36:51,184 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,184 ----------------------------------------------------------------------------------------------------
2023-10-20 09:36:51,184 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 09:36:53,298 epoch 1 - iter 154/1546 - loss 2.79890329 - time (sec): 2.11 - samples/sec: 5867.61 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:36:55,571 epoch 1 - iter 308/1546 - loss 2.53326360 - time (sec): 4.39 - samples/sec: 5870.77 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:36:57,605 epoch 1 - iter 462/1546 - loss 2.15682711 - time (sec): 6.42 - samples/sec: 5800.26 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:36:59,879 epoch 1 - iter 616/1546 - loss 1.77921467 - time (sec): 8.69 - samples/sec: 5583.56 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:37:02,210 epoch 1 - iter 770/1546 - loss 1.48207150 - time (sec): 11.03 - samples/sec: 5520.04 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:37:04,541 epoch 1 - iter 924/1546 - loss 1.28110685 - time (sec): 13.36 - samples/sec: 5480.86 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:37:06,951 epoch 1 - iter 1078/1546 - loss 1.12408317 - time (sec): 15.77 - samples/sec: 5515.77 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:37:09,275 epoch 1 - iter 1232/1546 - loss 1.02332161 - time (sec): 18.09 - samples/sec: 5461.45 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:37:11,670 epoch 1 - iter 1386/1546 - loss 0.93494889 - time (sec): 20.49 - samples/sec: 5427.01 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:37:14,048 epoch 1 - iter 1540/1546 - loss 0.86149992 - time (sec): 22.86 - samples/sec: 5415.54 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:37:14,137 ----------------------------------------------------------------------------------------------------
2023-10-20 09:37:14,137 EPOCH 1 done: loss 0.8587 - lr: 0.000030
2023-10-20 09:37:15,099 DEV : loss 0.14802159368991852 - f1-score (micro avg) 0.0
2023-10-20 09:37:15,110 ----------------------------------------------------------------------------------------------------
2023-10-20 09:37:17,332 epoch 2 - iter 154/1546 - loss 0.22428531 - time (sec): 2.22 - samples/sec: 6129.82 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:37:19,725 epoch 2 - iter 308/1546 - loss 0.21893451 - time (sec): 4.61 - samples/sec: 5623.79 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:37:22,217 epoch 2 - iter 462/1546 - loss 0.21988534 - time (sec): 7.11 - samples/sec: 5397.39 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:37:24,613 epoch 2 - iter 616/1546 - loss 0.21515815 - time (sec): 9.50 - samples/sec: 5316.52 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:37:27,001 epoch 2 - iter 770/1546 - loss 0.21763414 - time (sec): 11.89 - samples/sec: 5285.81 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:37:29,346 epoch 2 - iter 924/1546 - loss 0.21211405 - time (sec): 14.23 - samples/sec: 5315.52 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:37:31,702 epoch 2 - iter 1078/1546 - loss 0.21109794 - time (sec): 16.59 - samples/sec: 5292.76 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:37:34,038 epoch 2 - iter 1232/1546 - loss 0.20927776 - time (sec): 18.93 - samples/sec: 5250.19 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:37:36,385 epoch 2 - iter 1386/1546 - loss 0.20802068 - time (sec): 21.27 - samples/sec: 5246.75 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:37:38,756 epoch 2 - iter 1540/1546 - loss 0.20296366 - time (sec): 23.65 - samples/sec: 5236.15 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:37:38,850 ----------------------------------------------------------------------------------------------------
2023-10-20 09:37:38,851 EPOCH 2 done: loss 0.2029 - lr: 0.000027
2023-10-20 09:37:40,186 DEV : loss 0.10443291068077087 - f1-score (micro avg) 0.344
2023-10-20 09:37:40,197 saving best model
2023-10-20 09:37:40,226 ----------------------------------------------------------------------------------------------------
2023-10-20 09:37:42,489 epoch 3 - iter 154/1546 - loss 0.20147018 - time (sec): 2.26 - samples/sec: 4926.64 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:37:44,868 epoch 3 - iter 308/1546 - loss 0.17592743 - time (sec): 4.64 - samples/sec: 5245.39 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:37:47,263 epoch 3 - iter 462/1546 - loss 0.17831688 - time (sec): 7.04 - samples/sec: 5218.33 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:37:49,657 epoch 3 - iter 616/1546 - loss 0.17034345 - time (sec): 9.43 - samples/sec: 5125.70 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:37:52,040 epoch 3 - iter 770/1546 - loss 0.16874713 - time (sec): 11.81 - samples/sec: 5166.65 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:37:54,424 epoch 3 - iter 924/1546 - loss 0.16515313 - time (sec): 14.20 - samples/sec: 5218.98 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:37:56,803 epoch 3 - iter 1078/1546 - loss 0.16574309 - time (sec): 16.58 - samples/sec: 5214.14 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:37:59,277 epoch 3 - iter 1232/1546 - loss 0.16826926 - time (sec): 19.05 - samples/sec: 5177.75 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:38:01,687 epoch 3 - iter 1386/1546 - loss 0.16679726 - time (sec): 21.46 - samples/sec: 5201.01 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:38:03,985 epoch 3 - iter 1540/1546 - loss 0.16650517 - time (sec): 23.76 - samples/sec: 5211.52 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:38:04,074 ----------------------------------------------------------------------------------------------------
2023-10-20 09:38:04,074 EPOCH 3 done: loss 0.1671 - lr: 0.000023
2023-10-20 09:38:05,156 DEV : loss 0.09469402581453323 - f1-score (micro avg) 0.4395
2023-10-20 09:38:05,168 saving best model
2023-10-20 09:38:05,201 ----------------------------------------------------------------------------------------------------
2023-10-20 09:38:07,560 epoch 4 - iter 154/1546 - loss 0.15130876 - time (sec): 2.36 - samples/sec: 5779.11 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:38:09,859 epoch 4 - iter 308/1546 - loss 0.15539301 - time (sec): 4.66 - samples/sec: 5410.04 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:38:12,109 epoch 4 - iter 462/1546 - loss 0.14623792 - time (sec): 6.91 - samples/sec: 5419.50 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:38:14,528 epoch 4 - iter 616/1546 - loss 0.14975285 - time (sec): 9.33 - samples/sec: 5314.68 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:38:16,920 epoch 4 - iter 770/1546 - loss 0.14675710 - time (sec): 11.72 - samples/sec: 5342.28 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:38:19,260 epoch 4 - iter 924/1546 - loss 0.15174538 - time (sec): 14.06 - samples/sec: 5319.26 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:38:21,650 epoch 4 - iter 1078/1546 - loss 0.15309846 - time (sec): 16.45 - samples/sec: 5314.66 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:38:23,987 epoch 4 - iter 1232/1546 - loss 0.15364382 - time (sec): 18.79 - samples/sec: 5332.47 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:38:26,333 epoch 4 - iter 1386/1546 - loss 0.15105626 - time (sec): 21.13 - samples/sec: 5323.99 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:38:28,678 epoch 4 - iter 1540/1546 - loss 0.15181308 - time (sec): 23.48 - samples/sec: 5275.29 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:38:28,768 ----------------------------------------------------------------------------------------------------
2023-10-20 09:38:28,769 EPOCH 4 done: loss 0.1517 - lr: 0.000020
2023-10-20 09:38:29,859 DEV : loss 0.0922057181596756 - f1-score (micro avg) 0.4876
2023-10-20 09:38:29,871 saving best model
2023-10-20 09:38:29,904 ----------------------------------------------------------------------------------------------------
2023-10-20 09:38:32,254 epoch 5 - iter 154/1546 - loss 0.13556178 - time (sec): 2.35 - samples/sec: 5306.08 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:38:34,612 epoch 5 - iter 308/1546 - loss 0.13460505 - time (sec): 4.71 - samples/sec: 5253.37 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:38:36,978 epoch 5 - iter 462/1546 - loss 0.14191314 - time (sec): 7.07 - samples/sec: 5187.37 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:38:39,356 epoch 5 - iter 616/1546 - loss 0.13895130 - time (sec): 9.45 - samples/sec: 5248.85 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:38:41,592 epoch 5 - iter 770/1546 - loss 0.13551886 - time (sec): 11.69 - samples/sec: 5344.94 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:38:44,198 epoch 5 - iter 924/1546 - loss 0.14269409 - time (sec): 14.29 - samples/sec: 5247.88 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:38:46,538 epoch 5 - iter 1078/1546 - loss 0.14406233 - time (sec): 16.63 - samples/sec: 5240.10 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:38:48,909 epoch 5 - iter 1232/1546 - loss 0.14545796 - time (sec): 19.00 - samples/sec: 5218.97 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:38:51,305 epoch 5 - iter 1386/1546 - loss 0.14493046 - time (sec): 21.40 - samples/sec: 5211.35 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:38:53,695 epoch 5 - iter 1540/1546 - loss 0.14116276 - time (sec): 23.79 - samples/sec: 5205.40 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:38:53,791 ----------------------------------------------------------------------------------------------------
2023-10-20 09:38:53,791 EPOCH 5 done: loss 0.1414 - lr: 0.000017
2023-10-20 09:38:54,889 DEV : loss 0.09423129260540009 - f1-score (micro avg) 0.5231
2023-10-20 09:38:54,903 saving best model
2023-10-20 09:38:54,937 ----------------------------------------------------------------------------------------------------
2023-10-20 09:38:57,276 epoch 6 - iter 154/1546 - loss 0.17239964 - time (sec): 2.34 - samples/sec: 5316.19 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:38:59,665 epoch 6 - iter 308/1546 - loss 0.14716903 - time (sec): 4.73 - samples/sec: 5315.60 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:39:02,051 epoch 6 - iter 462/1546 - loss 0.14330122 - time (sec): 7.11 - samples/sec: 5254.12 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:39:04,296 epoch 6 - iter 616/1546 - loss 0.14312237 - time (sec): 9.36 - samples/sec: 5219.55 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:39:06,537 epoch 6 - iter 770/1546 - loss 0.14059730 - time (sec): 11.60 - samples/sec: 5356.49 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:39:08,788 epoch 6 - iter 924/1546 - loss 0.13924984 - time (sec): 13.85 - samples/sec: 5356.26 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:39:11,148 epoch 6 - iter 1078/1546 - loss 0.13854212 - time (sec): 16.21 - samples/sec: 5341.04 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:39:13,494 epoch 6 - iter 1232/1546 - loss 0.13605262 - time (sec): 18.56 - samples/sec: 5332.13 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:39:15,923 epoch 6 - iter 1386/1546 - loss 0.13714874 - time (sec): 20.99 - samples/sec: 5339.02 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:39:18,245 epoch 6 - iter 1540/1546 - loss 0.13631501 - time (sec): 23.31 - samples/sec: 5312.01 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:39:18,346 ----------------------------------------------------------------------------------------------------
2023-10-20 09:39:18,346 EPOCH 6 done: loss 0.1359 - lr: 0.000013
2023-10-20 09:39:19,427 DEV : loss 0.09495244920253754 - f1-score (micro avg) 0.5244
2023-10-20 09:39:19,438 saving best model
2023-10-20 09:39:19,471 ----------------------------------------------------------------------------------------------------
2023-10-20 09:39:21,806 epoch 7 - iter 154/1546 - loss 0.14752563 - time (sec): 2.33 - samples/sec: 5149.54 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:39:24,232 epoch 7 - iter 308/1546 - loss 0.13113680 - time (sec): 4.76 - samples/sec: 5290.20 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:39:26,651 epoch 7 - iter 462/1546 - loss 0.13857370 - time (sec): 7.18 - samples/sec: 5145.21 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:39:29,063 epoch 7 - iter 616/1546 - loss 0.13159319 - time (sec): 9.59 - samples/sec: 5188.73 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:39:31,376 epoch 7 - iter 770/1546 - loss 0.13076584 - time (sec): 11.90 - samples/sec: 5104.35 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:39:33,682 epoch 7 - iter 924/1546 - loss 0.13098026 - time (sec): 14.21 - samples/sec: 5121.17 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:39:36,050 epoch 7 - iter 1078/1546 - loss 0.12919210 - time (sec): 16.58 - samples/sec: 5167.48 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:39:38,492 epoch 7 - iter 1232/1546 - loss 0.12864315 - time (sec): 19.02 - samples/sec: 5180.07 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:39:41,077 epoch 7 - iter 1386/1546 - loss 0.12674351 - time (sec): 21.61 - samples/sec: 5160.14 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:39:43,447 epoch 7 - iter 1540/1546 - loss 0.12927173 - time (sec): 23.98 - samples/sec: 5163.47 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:39:43,540 ----------------------------------------------------------------------------------------------------
2023-10-20 09:39:43,540 EPOCH 7 done: loss 0.1291 - lr: 0.000010
2023-10-20 09:39:44,621 DEV : loss 0.09416339546442032 - f1-score (micro avg) 0.5244
2023-10-20 09:39:44,632 saving best model
2023-10-20 09:39:44,665 ----------------------------------------------------------------------------------------------------
2023-10-20 09:39:46,989 epoch 8 - iter 154/1546 - loss 0.09374014 - time (sec): 2.32 - samples/sec: 5181.15 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:39:49,344 epoch 8 - iter 308/1546 - loss 0.11719451 - time (sec): 4.68 - samples/sec: 5105.28 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:39:51,710 epoch 8 - iter 462/1546 - loss 0.12624370 - time (sec): 7.04 - samples/sec: 5186.76 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:39:54,127 epoch 8 - iter 616/1546 - loss 0.11692275 - time (sec): 9.46 - samples/sec: 5285.04 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:39:56,514 epoch 8 - iter 770/1546 - loss 0.12077014 - time (sec): 11.85 - samples/sec: 5301.64 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:39:58,661 epoch 8 - iter 924/1546 - loss 0.12287291 - time (sec): 13.99 - samples/sec: 5374.79 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:40:00,799 epoch 8 - iter 1078/1546 - loss 0.12786612 - time (sec): 16.13 - samples/sec: 5390.23 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:40:03,083 epoch 8 - iter 1232/1546 - loss 0.12689061 - time (sec): 18.42 - samples/sec: 5408.54 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:40:05,420 epoch 8 - iter 1386/1546 - loss 0.12815882 - time (sec): 20.75 - samples/sec: 5356.73 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:40:07,818 epoch 8 - iter 1540/1546 - loss 0.12590845 - time (sec): 23.15 - samples/sec: 5346.46 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:40:07,905 ----------------------------------------------------------------------------------------------------
2023-10-20 09:40:07,905 EPOCH 8 done: loss 0.1257 - lr: 0.000007
2023-10-20 09:40:09,012 DEV : loss 0.09722863882780075 - f1-score (micro avg) 0.5451
2023-10-20 09:40:09,024 saving best model
2023-10-20 09:40:09,059 ----------------------------------------------------------------------------------------------------
2023-10-20 09:40:11,325 epoch 9 - iter 154/1546 - loss 0.10817421 - time (sec): 2.27 - samples/sec: 5175.05 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:40:13,467 epoch 9 - iter 308/1546 - loss 0.12342581 - time (sec): 4.41 - samples/sec: 5568.49 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:40:15,778 epoch 9 - iter 462/1546 - loss 0.12682072 - time (sec): 6.72 - samples/sec: 5457.65 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:40:18,158 epoch 9 - iter 616/1546 - loss 0.12687305 - time (sec): 9.10 - samples/sec: 5419.48 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:40:20,538 epoch 9 - iter 770/1546 - loss 0.12821242 - time (sec): 11.48 - samples/sec: 5454.34 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:40:22,915 epoch 9 - iter 924/1546 - loss 0.12560217 - time (sec): 13.85 - samples/sec: 5367.20 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:40:25,293 epoch 9 - iter 1078/1546 - loss 0.12400418 - time (sec): 16.23 - samples/sec: 5324.97 - lr: 0.000004 - momentum: 0.000000
2023-10-20 09:40:27,690 epoch 9 - iter 1232/1546 - loss 0.12298350 - time (sec): 18.63 - samples/sec: 5322.95 - lr: 0.000004 - momentum: 0.000000
2023-10-20 09:40:30,149 epoch 9 - iter 1386/1546 - loss 0.12259995 - time (sec): 21.09 - samples/sec: 5287.73 - lr: 0.000004 - momentum: 0.000000
2023-10-20 09:40:32,566 epoch 9 - iter 1540/1546 - loss 0.12278321 - time (sec): 23.51 - samples/sec: 5268.08 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:40:32,660 ----------------------------------------------------------------------------------------------------
2023-10-20 09:40:32,660 EPOCH 9 done: loss 0.1227 - lr: 0.000003
2023-10-20 09:40:33,764 DEV : loss 0.09657485783100128 - f1-score (micro avg) 0.5519
2023-10-20 09:40:33,776 saving best model
2023-10-20 09:40:33,809 ----------------------------------------------------------------------------------------------------
2023-10-20 09:40:36,158 epoch 10 - iter 154/1546 - loss 0.10385896 - time (sec): 2.35 - samples/sec: 5195.63 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:40:38,477 epoch 10 - iter 308/1546 - loss 0.10040242 - time (sec): 4.67 - samples/sec: 4907.18 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:40:40,921 epoch 10 - iter 462/1546 - loss 0.10847853 - time (sec): 7.11 - samples/sec: 5160.21 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:40:43,081 epoch 10 - iter 616/1546 - loss 0.11351600 - time (sec): 9.27 - samples/sec: 5330.63 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:40:45,449 epoch 10 - iter 770/1546 - loss 0.11505025 - time (sec): 11.64 - samples/sec: 5320.26 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:40:47,855 epoch 10 - iter 924/1546 - loss 0.11901451 - time (sec): 14.05 - samples/sec: 5305.07 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:40:50,133 epoch 10 - iter 1078/1546 - loss 0.12079585 - time (sec): 16.32 - samples/sec: 5292.55 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:40:52,549 epoch 10 - iter 1232/1546 - loss 0.11885783 - time (sec): 18.74 - samples/sec: 5319.55 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:40:54,836 epoch 10 - iter 1386/1546 - loss 0.11905771 - time (sec): 21.03 - samples/sec: 5295.29 - lr: 0.000000 - momentum: 0.000000
2023-10-20 09:40:57,202 epoch 10 - iter 1540/1546 - loss 0.12050577 - time (sec): 23.39 - samples/sec: 5295.64 - lr: 0.000000 - momentum: 0.000000
2023-10-20 09:40:57,291 ----------------------------------------------------------------------------------------------------
2023-10-20 09:40:57,291 EPOCH 10 done: loss 0.1203 - lr: 0.000000
2023-10-20 09:40:58,380 DEV : loss 0.09660445898771286 - f1-score (micro avg) 0.551
2023-10-20 09:40:58,421 ----------------------------------------------------------------------------------------------------
2023-10-20 09:40:58,421 Loading model from best epoch ...
2023-10-20 09:40:58,494 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 09:41:01,404 Results:
- F-score (micro) 0.5262
- F-score (macro) 0.2198
- Accuracy 0.3668

By class:
              precision    recall  f1-score   support

         LOC     0.5788    0.6173    0.5974       946
    BUILDING     0.0789    0.0162    0.0269       185
      STREET     1.0000    0.0179    0.0351        56

   micro avg     0.5611    0.4954    0.5262      1187
   macro avg     0.5526    0.2171    0.2198      1187
weighted avg     0.5208    0.4954    0.4820      1187

2023-10-20 09:41:01,404 ----------------------------------------------------------------------------------------------------
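The lr column in the log traces the LinearScheduler plugin with warmup_fraction '0.1': the learning rate climbs linearly to the 3e-05 peak over the first 10% of batches (here exactly epoch 1, since there are 1546 iterations per epoch over 10 epochs), then decays linearly to zero. A minimal sketch of that shape in plain Python; the step counting and function name are assumptions for illustration, not Flair's exact implementation:

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (assumed sketch of the
    warmup-then-decay schedule the LinearScheduler plugin reports in the log)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1546 * 10  # iterations per epoch x max_epochs, as in the log

print(linear_schedule_lr(1546, total))   # end of epoch 1: at the 3e-05 peak
print(linear_schedule_lr(total, total))  # end of epoch 10: decayed to 0.0
```

This matches the logged trajectory: lr 0.000030 at the end of epoch 1, lr 0.000000 at the end of epoch 10.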
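The 13-entry tag dictionary the loaded SequenceTagger predicts is a BIOES encoding of the three entity types: S(ingle), B(egin), I(nside) and E(nd) variants for each of LOC, BUILDING and STREET, plus O (3 x 4 + 1 = 13). A minimal decoder from such a tag sequence to entity spans, written as an illustration (`bioes_to_spans` is a hypothetical helper, not a Flair API):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end_exclusive, label) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        pos, label = tag.split("-", 1)
        if pos == "S":                        # single-token entity
            spans.append((i, i + 1, label))
        elif pos == "B":                      # entity opens here
            start = i
        elif pos == "E" and start is not None:  # entity closes here
            spans.append((start, i + 1, label))
            start = None
        # "I" tokens simply extend the open entity
    return spans

print(bioes_to_spans(["O", "B-LOC", "E-LOC", "S-STREET"]))
# [(1, 3, 'LOC'), (3, 4, 'STREET')]
```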
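The averages in the final results follow the usual span-evaluation conventions: the micro-avg F-score is the harmonic mean of the pooled precision and recall, while the macro-avg F-score is the unweighted mean of the per-class F1 values (not the harmonic mean of the macro precision and recall). Both reported numbers can be reproduced from the per-class table:

```python
def f1(p, r):
    # harmonic mean of precision and recall; 0 if both are 0
    return 2 * p * r / (p + r) if (p + r) else 0.0

# per-class (precision, recall) from the final "By class" table
per_class = {
    "LOC": (0.5788, 0.6173),
    "BUILDING": (0.0789, 0.0162),
    "STREET": (1.0000, 0.0179),
}

# micro avg: F1 of the pooled precision/recall
print(round(f1(0.5611, 0.4954), 4))  # 0.5262, the reported micro F-score

# macro avg: unweighted mean of the per-class F1 scores
macro = sum(f1(p, r) for p, r in per_class.values()) / len(per_class)
print(round(macro, 4))               # 0.2198, the reported macro F-score
```

The gap between the two (0.5262 vs 0.2198) reflects the near-zero BUILDING and STREET F1 scores: STREET has perfect precision but recalls only 1 of its 56 support spans.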