2023-10-20 09:15:12,661 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,661 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 09:15:12,661 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,661 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 09:15:12,661 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,661 Train: 6183 sentences
2023-10-20 09:15:12,661 (train_with_dev=False, train_with_test=False)
2023-10-20 09:15:12,661 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,661 Training Params:
2023-10-20 09:15:12,661 - learning_rate: "3e-05"
2023-10-20 09:15:12,661 - mini_batch_size: "8"
2023-10-20 09:15:12,661 - max_epochs: "10"
2023-10-20 09:15:12,661 - shuffle: "True"
2023-10-20 09:15:12,661 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,661 Plugins:
2023-10-20 09:15:12,661 - TensorboardLogger
2023-10-20 09:15:12,662 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 09:15:12,662 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,662 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 09:15:12,662 - metric: "('micro avg', 'f1-score')"
2023-10-20 09:15:12,662 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,662 Computation:
2023-10-20 09:15:12,662 - compute on device: cuda:0
2023-10-20 09:15:12,662 - embedding storage: none
2023-10-20 09:15:12,662 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,662 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-20 09:15:12,662 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,662 ----------------------------------------------------------------------------------------------------
2023-10-20 09:15:12,662 Logging anything other than scalars to
TensorBoard is currently not supported. 2023-10-20 09:15:14,192 epoch 1 - iter 77/773 - loss 3.33952986 - time (sec): 1.53 - samples/sec: 8303.34 - lr: 0.000003 - momentum: 0.000000 2023-10-20 09:15:15,870 epoch 1 - iter 154/773 - loss 3.15376051 - time (sec): 3.21 - samples/sec: 7599.49 - lr: 0.000006 - momentum: 0.000000 2023-10-20 09:15:17,643 epoch 1 - iter 231/773 - loss 2.86285749 - time (sec): 4.98 - samples/sec: 7338.12 - lr: 0.000009 - momentum: 0.000000 2023-10-20 09:15:19,406 epoch 1 - iter 308/773 - loss 2.46722800 - time (sec): 6.74 - samples/sec: 7316.92 - lr: 0.000012 - momentum: 0.000000 2023-10-20 09:15:21,097 epoch 1 - iter 385/773 - loss 2.08159163 - time (sec): 8.43 - samples/sec: 7287.20 - lr: 0.000015 - momentum: 0.000000 2023-10-20 09:15:22,893 epoch 1 - iter 462/773 - loss 1.80370396 - time (sec): 10.23 - samples/sec: 7143.10 - lr: 0.000018 - momentum: 0.000000 2023-10-20 09:15:24,673 epoch 1 - iter 539/773 - loss 1.58250891 - time (sec): 12.01 - samples/sec: 7140.71 - lr: 0.000021 - momentum: 0.000000 2023-10-20 09:15:26,482 epoch 1 - iter 616/773 - loss 1.40958823 - time (sec): 13.82 - samples/sec: 7147.35 - lr: 0.000024 - momentum: 0.000000 2023-10-20 09:15:28,279 epoch 1 - iter 693/773 - loss 1.28574327 - time (sec): 15.62 - samples/sec: 7086.91 - lr: 0.000027 - momentum: 0.000000 2023-10-20 09:15:30,017 epoch 1 - iter 770/773 - loss 1.17895676 - time (sec): 17.35 - samples/sec: 7130.19 - lr: 0.000030 - momentum: 0.000000 2023-10-20 09:15:30,086 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:15:30,086 EPOCH 1 done: loss 1.1740 - lr: 0.000030 2023-10-20 09:15:31,013 DEV : loss 0.14998705685138702 - f1-score (micro avg) 0.0 2023-10-20 09:15:31,024 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:15:32,728 epoch 2 - iter 77/773 - loss 0.24465311 - time (sec): 1.70 - samples/sec: 7292.65 - lr: 0.000030 - 
momentum: 0.000000 2023-10-20 09:15:34,422 epoch 2 - iter 154/773 - loss 0.23774020 - time (sec): 3.40 - samples/sec: 7056.70 - lr: 0.000029 - momentum: 0.000000 2023-10-20 09:15:36,135 epoch 2 - iter 231/773 - loss 0.24048091 - time (sec): 5.11 - samples/sec: 6938.55 - lr: 0.000029 - momentum: 0.000000 2023-10-20 09:15:37,877 epoch 2 - iter 308/773 - loss 0.22885706 - time (sec): 6.85 - samples/sec: 7056.28 - lr: 0.000029 - momentum: 0.000000 2023-10-20 09:15:39,449 epoch 2 - iter 385/773 - loss 0.22732366 - time (sec): 8.42 - samples/sec: 7262.06 - lr: 0.000028 - momentum: 0.000000 2023-10-20 09:15:41,053 epoch 2 - iter 462/773 - loss 0.22283623 - time (sec): 10.03 - samples/sec: 7270.02 - lr: 0.000028 - momentum: 0.000000 2023-10-20 09:15:42,778 epoch 2 - iter 539/773 - loss 0.22091189 - time (sec): 11.75 - samples/sec: 7218.38 - lr: 0.000028 - momentum: 0.000000 2023-10-20 09:15:44,509 epoch 2 - iter 616/773 - loss 0.21641942 - time (sec): 13.49 - samples/sec: 7231.27 - lr: 0.000027 - momentum: 0.000000 2023-10-20 09:15:46,296 epoch 2 - iter 693/773 - loss 0.21432633 - time (sec): 15.27 - samples/sec: 7192.00 - lr: 0.000027 - momentum: 0.000000 2023-10-20 09:15:48,070 epoch 2 - iter 770/773 - loss 0.20934767 - time (sec): 17.05 - samples/sec: 7251.00 - lr: 0.000027 - momentum: 0.000000 2023-10-20 09:15:48,144 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:15:48,145 EPOCH 2 done: loss 0.2087 - lr: 0.000027 2023-10-20 09:15:49,200 DEV : loss 0.10451915115118027 - f1-score (micro avg) 0.2064 2023-10-20 09:15:49,212 saving best model 2023-10-20 09:15:49,242 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:15:50,941 epoch 3 - iter 77/773 - loss 0.18760397 - time (sec): 1.70 - samples/sec: 6733.13 - lr: 0.000026 - momentum: 0.000000 2023-10-20 09:15:52,749 epoch 3 - iter 154/773 - loss 0.17290474 - time (sec): 3.51 - samples/sec: 
6870.26 - lr: 0.000026 - momentum: 0.000000 2023-10-20 09:15:54,525 epoch 3 - iter 231/773 - loss 0.16574787 - time (sec): 5.28 - samples/sec: 6901.08 - lr: 0.000026 - momentum: 0.000000 2023-10-20 09:15:56,360 epoch 3 - iter 308/773 - loss 0.17220049 - time (sec): 7.12 - samples/sec: 6944.67 - lr: 0.000025 - momentum: 0.000000 2023-10-20 09:15:58,130 epoch 3 - iter 385/773 - loss 0.17055457 - time (sec): 8.89 - samples/sec: 6903.34 - lr: 0.000025 - momentum: 0.000000 2023-10-20 09:16:00,200 epoch 3 - iter 462/773 - loss 0.17102499 - time (sec): 10.96 - samples/sec: 6834.09 - lr: 0.000025 - momentum: 0.000000 2023-10-20 09:16:02,000 epoch 3 - iter 539/773 - loss 0.17156964 - time (sec): 12.76 - samples/sec: 6850.19 - lr: 0.000024 - momentum: 0.000000 2023-10-20 09:16:03,740 epoch 3 - iter 616/773 - loss 0.16973025 - time (sec): 14.50 - samples/sec: 6895.67 - lr: 0.000024 - momentum: 0.000000 2023-10-20 09:16:05,425 epoch 3 - iter 693/773 - loss 0.16937423 - time (sec): 16.18 - samples/sec: 6868.31 - lr: 0.000024 - momentum: 0.000000 2023-10-20 09:16:07,199 epoch 3 - iter 770/773 - loss 0.16819305 - time (sec): 17.96 - samples/sec: 6887.55 - lr: 0.000023 - momentum: 0.000000 2023-10-20 09:16:07,273 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:16:07,273 EPOCH 3 done: loss 0.1680 - lr: 0.000023 2023-10-20 09:16:08,371 DEV : loss 0.0877489447593689 - f1-score (micro avg) 0.4861 2023-10-20 09:16:08,383 saving best model 2023-10-20 09:16:08,418 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:16:10,216 epoch 4 - iter 77/773 - loss 0.16112576 - time (sec): 1.80 - samples/sec: 7014.62 - lr: 0.000023 - momentum: 0.000000 2023-10-20 09:16:11,957 epoch 4 - iter 154/773 - loss 0.15054189 - time (sec): 3.54 - samples/sec: 7020.61 - lr: 0.000023 - momentum: 0.000000 2023-10-20 09:16:13,681 epoch 4 - iter 231/773 - loss 0.15757889 - time (sec): 
5.26 - samples/sec: 6798.16 - lr: 0.000022 - momentum: 0.000000 2023-10-20 09:16:15,454 epoch 4 - iter 308/773 - loss 0.15835643 - time (sec): 7.04 - samples/sec: 6887.47 - lr: 0.000022 - momentum: 0.000000 2023-10-20 09:16:17,147 epoch 4 - iter 385/773 - loss 0.15939542 - time (sec): 8.73 - samples/sec: 6952.52 - lr: 0.000022 - momentum: 0.000000 2023-10-20 09:16:18,879 epoch 4 - iter 462/773 - loss 0.15526926 - time (sec): 10.46 - samples/sec: 6969.98 - lr: 0.000021 - momentum: 0.000000 2023-10-20 09:16:20,636 epoch 4 - iter 539/773 - loss 0.15163147 - time (sec): 12.22 - samples/sec: 7051.10 - lr: 0.000021 - momentum: 0.000000 2023-10-20 09:16:22,428 epoch 4 - iter 616/773 - loss 0.15053741 - time (sec): 14.01 - samples/sec: 7075.84 - lr: 0.000021 - momentum: 0.000000 2023-10-20 09:16:24,135 epoch 4 - iter 693/773 - loss 0.15041744 - time (sec): 15.72 - samples/sec: 7074.15 - lr: 0.000020 - momentum: 0.000000 2023-10-20 09:16:25,852 epoch 4 - iter 770/773 - loss 0.14997795 - time (sec): 17.43 - samples/sec: 7104.06 - lr: 0.000020 - momentum: 0.000000 2023-10-20 09:16:25,917 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:16:25,917 EPOCH 4 done: loss 0.1499 - lr: 0.000020 2023-10-20 09:16:26,996 DEV : loss 0.08475784212350845 - f1-score (micro avg) 0.5195 2023-10-20 09:16:27,007 saving best model 2023-10-20 09:16:27,046 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:16:28,737 epoch 5 - iter 77/773 - loss 0.13296687 - time (sec): 1.69 - samples/sec: 7299.21 - lr: 0.000020 - momentum: 0.000000 2023-10-20 09:16:30,473 epoch 5 - iter 154/773 - loss 0.13988942 - time (sec): 3.43 - samples/sec: 6991.62 - lr: 0.000019 - momentum: 0.000000 2023-10-20 09:16:32,180 epoch 5 - iter 231/773 - loss 0.14188857 - time (sec): 5.13 - samples/sec: 7007.76 - lr: 0.000019 - momentum: 0.000000 2023-10-20 09:16:33,909 epoch 5 - iter 308/773 - loss 
0.13800861 - time (sec): 6.86 - samples/sec: 7172.74 - lr: 0.000019 - momentum: 0.000000 2023-10-20 09:16:35,608 epoch 5 - iter 385/773 - loss 0.13726507 - time (sec): 8.56 - samples/sec: 7265.92 - lr: 0.000018 - momentum: 0.000000 2023-10-20 09:16:37,518 epoch 5 - iter 462/773 - loss 0.13563549 - time (sec): 10.47 - samples/sec: 7100.19 - lr: 0.000018 - momentum: 0.000000 2023-10-20 09:16:39,257 epoch 5 - iter 539/773 - loss 0.13353442 - time (sec): 12.21 - samples/sec: 7072.29 - lr: 0.000018 - momentum: 0.000000 2023-10-20 09:16:40,988 epoch 5 - iter 616/773 - loss 0.13630665 - time (sec): 13.94 - samples/sec: 7084.86 - lr: 0.000017 - momentum: 0.000000 2023-10-20 09:16:42,795 epoch 5 - iter 693/773 - loss 0.13728145 - time (sec): 15.75 - samples/sec: 7104.29 - lr: 0.000017 - momentum: 0.000000 2023-10-20 09:16:44,524 epoch 5 - iter 770/773 - loss 0.13822704 - time (sec): 17.48 - samples/sec: 7084.17 - lr: 0.000017 - momentum: 0.000000 2023-10-20 09:16:44,587 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:16:44,587 EPOCH 5 done: loss 0.1381 - lr: 0.000017 2023-10-20 09:16:45,681 DEV : loss 0.08033832907676697 - f1-score (micro avg) 0.5468 2023-10-20 09:16:45,693 saving best model 2023-10-20 09:16:45,728 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:16:47,419 epoch 6 - iter 77/773 - loss 0.11510914 - time (sec): 1.69 - samples/sec: 7063.17 - lr: 0.000016 - momentum: 0.000000 2023-10-20 09:16:49,141 epoch 6 - iter 154/773 - loss 0.12357112 - time (sec): 3.41 - samples/sec: 6951.14 - lr: 0.000016 - momentum: 0.000000 2023-10-20 09:16:50,894 epoch 6 - iter 231/773 - loss 0.13493875 - time (sec): 5.16 - samples/sec: 6930.16 - lr: 0.000016 - momentum: 0.000000 2023-10-20 09:16:52,666 epoch 6 - iter 308/773 - loss 0.13607145 - time (sec): 6.94 - samples/sec: 7020.17 - lr: 0.000015 - momentum: 0.000000 2023-10-20 09:16:54,438 epoch 6 - 
iter 385/773 - loss 0.14047184 - time (sec): 8.71 - samples/sec: 6924.28 - lr: 0.000015 - momentum: 0.000000 2023-10-20 09:16:56,257 epoch 6 - iter 462/773 - loss 0.13593672 - time (sec): 10.53 - samples/sec: 6963.41 - lr: 0.000015 - momentum: 0.000000 2023-10-20 09:16:57,977 epoch 6 - iter 539/773 - loss 0.13231574 - time (sec): 12.25 - samples/sec: 7002.98 - lr: 0.000014 - momentum: 0.000000 2023-10-20 09:16:59,741 epoch 6 - iter 616/773 - loss 0.13109122 - time (sec): 14.01 - samples/sec: 7059.51 - lr: 0.000014 - momentum: 0.000000 2023-10-20 09:17:01,314 epoch 6 - iter 693/773 - loss 0.13015771 - time (sec): 15.58 - samples/sec: 7092.09 - lr: 0.000014 - momentum: 0.000000 2023-10-20 09:17:03,013 epoch 6 - iter 770/773 - loss 0.13172020 - time (sec): 17.28 - samples/sec: 7161.60 - lr: 0.000013 - momentum: 0.000000 2023-10-20 09:17:03,081 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:17:03,082 EPOCH 6 done: loss 0.1316 - lr: 0.000013 2023-10-20 09:17:04,172 DEV : loss 0.07792612910270691 - f1-score (micro avg) 0.5833 2023-10-20 09:17:04,184 saving best model 2023-10-20 09:17:04,222 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:17:05,950 epoch 7 - iter 77/773 - loss 0.12524588 - time (sec): 1.73 - samples/sec: 7772.00 - lr: 0.000013 - momentum: 0.000000 2023-10-20 09:17:07,631 epoch 7 - iter 154/773 - loss 0.12230570 - time (sec): 3.41 - samples/sec: 7297.43 - lr: 0.000013 - momentum: 0.000000 2023-10-20 09:17:09,492 epoch 7 - iter 231/773 - loss 0.11811755 - time (sec): 5.27 - samples/sec: 7190.68 - lr: 0.000012 - momentum: 0.000000 2023-10-20 09:17:11,333 epoch 7 - iter 308/773 - loss 0.12745433 - time (sec): 7.11 - samples/sec: 6958.52 - lr: 0.000012 - momentum: 0.000000 2023-10-20 09:17:12,836 epoch 7 - iter 385/773 - loss 0.12730702 - time (sec): 8.61 - samples/sec: 7215.29 - lr: 0.000012 - momentum: 0.000000 2023-10-20 
09:17:14,570 epoch 7 - iter 462/773 - loss 0.12624463 - time (sec): 10.35 - samples/sec: 7225.78 - lr: 0.000011 - momentum: 0.000000 2023-10-20 09:17:16,355 epoch 7 - iter 539/773 - loss 0.12879208 - time (sec): 12.13 - samples/sec: 7200.24 - lr: 0.000011 - momentum: 0.000000 2023-10-20 09:17:18,072 epoch 7 - iter 616/773 - loss 0.12718972 - time (sec): 13.85 - samples/sec: 7220.27 - lr: 0.000011 - momentum: 0.000000 2023-10-20 09:17:19,851 epoch 7 - iter 693/773 - loss 0.12645319 - time (sec): 15.63 - samples/sec: 7147.23 - lr: 0.000010 - momentum: 0.000000 2023-10-20 09:17:21,728 epoch 7 - iter 770/773 - loss 0.12643290 - time (sec): 17.51 - samples/sec: 7072.56 - lr: 0.000010 - momentum: 0.000000 2023-10-20 09:17:21,794 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:17:21,795 EPOCH 7 done: loss 0.1262 - lr: 0.000010 2023-10-20 09:17:22,891 DEV : loss 0.0776781365275383 - f1-score (micro avg) 0.5872 2023-10-20 09:17:22,903 saving best model 2023-10-20 09:17:22,938 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:17:24,719 epoch 8 - iter 77/773 - loss 0.10583346 - time (sec): 1.78 - samples/sec: 6834.39 - lr: 0.000010 - momentum: 0.000000 2023-10-20 09:17:26,504 epoch 8 - iter 154/773 - loss 0.12530026 - time (sec): 3.57 - samples/sec: 6987.51 - lr: 0.000009 - momentum: 0.000000 2023-10-20 09:17:28,247 epoch 8 - iter 231/773 - loss 0.12693031 - time (sec): 5.31 - samples/sec: 6952.20 - lr: 0.000009 - momentum: 0.000000 2023-10-20 09:17:29,969 epoch 8 - iter 308/773 - loss 0.12130837 - time (sec): 7.03 - samples/sec: 7006.83 - lr: 0.000009 - momentum: 0.000000 2023-10-20 09:17:31,722 epoch 8 - iter 385/773 - loss 0.11920464 - time (sec): 8.78 - samples/sec: 7095.66 - lr: 0.000008 - momentum: 0.000000 2023-10-20 09:17:33,516 epoch 8 - iter 462/773 - loss 0.12199633 - time (sec): 10.58 - samples/sec: 7154.64 - lr: 0.000008 - momentum: 
0.000000 2023-10-20 09:17:35,189 epoch 8 - iter 539/773 - loss 0.12080649 - time (sec): 12.25 - samples/sec: 7113.25 - lr: 0.000008 - momentum: 0.000000 2023-10-20 09:17:37,007 epoch 8 - iter 616/773 - loss 0.12327536 - time (sec): 14.07 - samples/sec: 7011.62 - lr: 0.000007 - momentum: 0.000000 2023-10-20 09:17:38,708 epoch 8 - iter 693/773 - loss 0.12158899 - time (sec): 15.77 - samples/sec: 7028.09 - lr: 0.000007 - momentum: 0.000000 2023-10-20 09:17:40,487 epoch 8 - iter 770/773 - loss 0.12160387 - time (sec): 17.55 - samples/sec: 7063.84 - lr: 0.000007 - momentum: 0.000000 2023-10-20 09:17:40,546 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:17:40,547 EPOCH 8 done: loss 0.1214 - lr: 0.000007 2023-10-20 09:17:41,635 DEV : loss 0.078594870865345 - f1-score (micro avg) 0.594 2023-10-20 09:17:41,648 saving best model 2023-10-20 09:17:41,684 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:17:43,444 epoch 9 - iter 77/773 - loss 0.11865783 - time (sec): 1.76 - samples/sec: 6953.78 - lr: 0.000006 - momentum: 0.000000 2023-10-20 09:17:45,352 epoch 9 - iter 154/773 - loss 0.11779580 - time (sec): 3.67 - samples/sec: 6731.92 - lr: 0.000006 - momentum: 0.000000 2023-10-20 09:17:47,177 epoch 9 - iter 231/773 - loss 0.11079074 - time (sec): 5.49 - samples/sec: 6945.20 - lr: 0.000006 - momentum: 0.000000 2023-10-20 09:17:48,925 epoch 9 - iter 308/773 - loss 0.11269628 - time (sec): 7.24 - samples/sec: 6912.83 - lr: 0.000005 - momentum: 0.000000 2023-10-20 09:17:50,690 epoch 9 - iter 385/773 - loss 0.11775128 - time (sec): 9.01 - samples/sec: 7031.36 - lr: 0.000005 - momentum: 0.000000 2023-10-20 09:17:52,442 epoch 9 - iter 462/773 - loss 0.12178041 - time (sec): 10.76 - samples/sec: 6994.68 - lr: 0.000005 - momentum: 0.000000 2023-10-20 09:17:54,230 epoch 9 - iter 539/773 - loss 0.12266438 - time (sec): 12.55 - samples/sec: 7009.50 - lr: 
0.000004 - momentum: 0.000000 2023-10-20 09:17:55,948 epoch 9 - iter 616/773 - loss 0.12347143 - time (sec): 14.26 - samples/sec: 7020.21 - lr: 0.000004 - momentum: 0.000000 2023-10-20 09:17:57,590 epoch 9 - iter 693/773 - loss 0.12164974 - time (sec): 15.91 - samples/sec: 7013.24 - lr: 0.000004 - momentum: 0.000000 2023-10-20 09:17:59,268 epoch 9 - iter 770/773 - loss 0.12028417 - time (sec): 17.58 - samples/sec: 7045.20 - lr: 0.000003 - momentum: 0.000000 2023-10-20 09:17:59,328 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:17:59,328 EPOCH 9 done: loss 0.1202 - lr: 0.000003 2023-10-20 09:18:00,418 DEV : loss 0.07745374739170074 - f1-score (micro avg) 0.5978 2023-10-20 09:18:00,430 saving best model 2023-10-20 09:18:00,463 ---------------------------------------------------------------------------------------------------- 2023-10-20 09:18:02,196 epoch 10 - iter 77/773 - loss 0.13395817 - time (sec): 1.73 - samples/sec: 6936.02 - lr: 0.000003 - momentum: 0.000000 2023-10-20 09:18:03,975 epoch 10 - iter 154/773 - loss 0.12555449 - time (sec): 3.51 - samples/sec: 7085.61 - lr: 0.000003 - momentum: 0.000000 2023-10-20 09:18:05,577 epoch 10 - iter 231/773 - loss 0.12235248 - time (sec): 5.11 - samples/sec: 7455.37 - lr: 0.000002 - momentum: 0.000000 2023-10-20 09:18:07,115 epoch 10 - iter 308/773 - loss 0.11713785 - time (sec): 6.65 - samples/sec: 7646.12 - lr: 0.000002 - momentum: 0.000000 2023-10-20 09:18:08,958 epoch 10 - iter 385/773 - loss 0.11947063 - time (sec): 8.49 - samples/sec: 7426.05 - lr: 0.000002 - momentum: 0.000000 2023-10-20 09:18:10,695 epoch 10 - iter 462/773 - loss 0.11681252 - time (sec): 10.23 - samples/sec: 7355.83 - lr: 0.000001 - momentum: 0.000000 2023-10-20 09:18:12,412 epoch 10 - iter 539/773 - loss 0.11331848 - time (sec): 11.95 - samples/sec: 7367.65 - lr: 0.000001 - momentum: 0.000000 2023-10-20 09:18:14,164 epoch 10 - iter 616/773 - loss 0.11310343 - time (sec): 13.70 
- samples/sec: 7264.87 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:18:15,932 epoch 10 - iter 693/773 - loss 0.11715195 - time (sec): 15.47 - samples/sec: 7228.65 - lr: 0.000000 - momentum: 0.000000
2023-10-20 09:18:17,637 epoch 10 - iter 770/773 - loss 0.11898981 - time (sec): 17.17 - samples/sec: 7219.65 - lr: 0.000000 - momentum: 0.000000
2023-10-20 09:18:17,698 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:17,698 EPOCH 10 done: loss 0.1189 - lr: 0.000000
2023-10-20 09:18:18,836 DEV : loss 0.07805749028921127 - f1-score (micro avg) 0.5968
2023-10-20 09:18:18,880 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:18,881 Loading model from best epoch ...
2023-10-20 09:18:18,958 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 09:18:21,824
Results:
- F-score (micro) 0.5601
- F-score (macro) 0.2106
- Accuracy 0.4021

By class:
              precision    recall  f1-score   support

         LOC     0.6210    0.6427    0.6317       946
    BUILDING     0.0000    0.0000    0.0000       185
      STREET     0.0000    0.0000    0.0000        56

   micro avg     0.6179    0.5122    0.5601      1187
   macro avg     0.2070    0.2142    0.2106      1187
weighted avg     0.4949    0.5122    0.5034      1187

2023-10-20 09:18:21,824 ----------------------------------------------------------------------------------------------------