2023-10-18 22:15:13,407 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,407 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:15:13,407 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,407 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:15:13,407 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,407 Train: 5777 sentences
2023-10-18 22:15:13,407 (train_with_dev=False, train_with_test=False)
2023-10-18 22:15:13,407 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,407 Training Params:
2023-10-18 22:15:13,407 - learning_rate: "3e-05"
2023-10-18 22:15:13,407 - mini_batch_size: "4"
2023-10-18 22:15:13,407 - max_epochs: "10"
2023-10-18 22:15:13,407 - shuffle: "True"
2023-10-18 22:15:13,407 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,407 Plugins:
2023-10-18 22:15:13,407 - TensorboardLogger
2023-10-18 22:15:13,407 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:15:13,408 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,408 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:15:13,408 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:15:13,408 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,408 Computation:
2023-10-18 22:15:13,408 - compute on device: cuda:0
2023-10-18 22:15:13,408 - embedding storage: none
2023-10-18 22:15:13,408 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,408 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 22:15:13,408 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,408 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:13,408 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:15:15,962 epoch 1 - iter 144/1445 - loss 2.39314632 - time (sec): 2.55 - samples/sec: 7270.88 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:15:18,317 epoch 1 - iter 288/1445 - loss 2.14684515 - time (sec): 4.91 - samples/sec: 7182.74 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:15:20,799 epoch 1 - iter 432/1445 - loss 1.76829385 - time (sec): 7.39 - samples/sec: 7053.44 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:15:23,263 epoch 1 - iter 576/1445 - loss 1.43081858 - time (sec): 9.85 - samples/sec: 7146.25 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:15:25,690 epoch 1 - iter 720/1445 - loss 1.21350836 - time (sec): 12.28 - samples/sec: 7143.92 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:15:28,172 epoch 1 - iter 864/1445 - loss 1.06801420 - time (sec): 14.76 - samples/sec: 7135.56 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:15:30,643 epoch 1 - iter 1008/1445 - loss 0.95360992 - time (sec): 17.23 - samples/sec: 7166.69 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:15:33,003 epoch 1 - iter 1152/1445 - loss 0.87386994 - time (sec): 19.60 - samples/sec: 7184.61 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:15:35,453 epoch 1 - iter 1296/1445 - loss 0.80394806 - time (sec): 22.05 - samples/sec: 7197.68 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:15:37,856 epoch 1 - iter 1440/1445 - loss 0.74866431 - time (sec): 24.45 - samples/sec: 7186.43 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:15:37,934 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:37,934 EPOCH 1 done: loss 0.7469 - lr: 0.000030
2023-10-18 22:15:39,177 DEV : loss 0.2806679308414459 - f1-score (micro avg) 0.0
2023-10-18 22:15:39,191 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:41,536 epoch 2 - iter 144/1445 - loss 0.27792179 - time (sec): 2.34 - samples/sec: 6994.31 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:15:43,948 epoch 2 - iter 288/1445 - loss 0.23375750 - time (sec): 4.76 - samples/sec: 7259.12 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:15:46,367 epoch 2 - iter 432/1445 - loss 0.22939829 - time (sec): 7.18 - samples/sec: 7373.97 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:15:48,717 epoch 2 - iter 576/1445 - loss 0.22226944 - time (sec): 9.53 - samples/sec: 7411.64 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:15:51,131 epoch 2 - iter 720/1445 - loss 0.22022531 - time (sec): 11.94 - samples/sec: 7351.84 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:15:53,833 epoch 2 - iter 864/1445 - loss 0.21955073 - time (sec): 14.64 - samples/sec: 7177.20 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:15:56,214 epoch 2 - iter 1008/1445 - loss 0.22076945 - time (sec): 17.02 - samples/sec: 7144.54 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:15:58,820 epoch 2 - iter 1152/1445 - loss 0.21903683 - time (sec): 19.63 - samples/sec: 7152.33 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:16:01,266 epoch 2 - iter 1296/1445 - loss 0.21933025 - time (sec): 22.07 - samples/sec: 7115.29 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:16:03,826 epoch 2 - iter 1440/1445 - loss 0.21626617 - time (sec): 24.63 - samples/sec: 7132.47 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:16:03,904 ----------------------------------------------------------------------------------------------------
2023-10-18 22:16:03,904 EPOCH 2 done: loss 0.2162 - lr: 0.000027
2023-10-18 22:16:05,672 DEV : loss 0.23692606389522552 - f1-score (micro avg) 0.3287
2023-10-18 22:16:05,687 saving best model
2023-10-18 22:16:05,718 ----------------------------------------------------------------------------------------------------
2023-10-18 22:16:08,206 epoch 3 - iter 144/1445 - loss 0.19237069 - time (sec): 2.49 - samples/sec: 6801.18 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:16:10,568 epoch 3 - iter 288/1445 - loss 0.20044421 - time (sec): 4.85 - samples/sec: 7017.71 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:16:13,038 epoch 3 - iter 432/1445 - loss 0.20343289 - time (sec): 7.32 - samples/sec: 7328.74 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:16:15,337 epoch 3 - iter 576/1445 - loss 0.20071860 - time (sec): 9.62 - samples/sec: 7316.26 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:16:17,741 epoch 3 - iter 720/1445 - loss 0.19738794 - time (sec): 12.02 - samples/sec: 7323.13 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:16:20,177 epoch 3 - iter 864/1445 - loss 0.19558068 - time (sec): 14.46 - samples/sec: 7277.40 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:16:22,657 epoch 3 - iter 1008/1445 - loss 0.19494579 - time (sec): 16.94 - samples/sec: 7339.32 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:16:24,966 epoch 3 - iter 1152/1445 - loss 0.19436800 - time (sec): 19.25 - samples/sec: 7309.17 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:16:27,470 epoch 3 - iter 1296/1445 - loss 0.19204889 - time (sec): 21.75 - samples/sec: 7305.79 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:16:29,826 epoch 3 - iter 1440/1445 - loss 0.19140078 - time (sec): 24.11 - samples/sec: 7287.40 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:16:29,906 ----------------------------------------------------------------------------------------------------
2023-10-18 22:16:29,906 EPOCH 3 done: loss 0.1913 - lr: 0.000023
2023-10-18 22:16:31,653 DEV : loss 0.2540631890296936 - f1-score (micro avg) 0.3256
2023-10-18 22:16:31,668 ----------------------------------------------------------------------------------------------------
2023-10-18 22:16:34,014 epoch 4 - iter 144/1445 - loss 0.17625734 - time (sec): 2.35 - samples/sec: 7496.14 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:16:36,430 epoch 4 - iter 288/1445 - loss 0.16013685 - time (sec): 4.76 - samples/sec: 7586.55 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:16:38,804 epoch 4 - iter 432/1445 - loss 0.16861437 - time (sec): 7.14 - samples/sec: 7450.16 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:16:41,070 epoch 4 - iter 576/1445 - loss 0.16940708 - time (sec): 9.40 - samples/sec: 7533.32 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:16:43,335 epoch 4 - iter 720/1445 - loss 0.17263654 - time (sec): 11.67 - samples/sec: 7472.43 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:16:45,874 epoch 4 - iter 864/1445 - loss 0.17443784 - time (sec): 14.21 - samples/sec: 7423.53 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:16:48,244 epoch 4 - iter 1008/1445 - loss 0.17680417 - time (sec): 16.58 - samples/sec: 7427.32 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:16:50,606 epoch 4 - iter 1152/1445 - loss 0.17608810 - time (sec): 18.94 - samples/sec: 7413.92 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:16:52,990 epoch 4 - iter 1296/1445 - loss 0.17675409 - time (sec): 21.32 - samples/sec: 7379.58 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:16:55,447 epoch 4 - iter 1440/1445 - loss 0.17717412 - time (sec): 23.78 - samples/sec: 7385.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:16:55,531 ----------------------------------------------------------------------------------------------------
2023-10-18 22:16:55,531 EPOCH 4 done: loss 0.1772 - lr: 0.000020
2023-10-18 22:16:57,593 DEV : loss 0.21325278282165527 - f1-score (micro avg) 0.4364
2023-10-18 22:16:57,607 saving best model
2023-10-18 22:16:57,642 ----------------------------------------------------------------------------------------------------
2023-10-18 22:17:00,004 epoch 5 - iter 144/1445 - loss 0.16636630 - time (sec): 2.36 - samples/sec: 7283.76 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:17:02,528 epoch 5 - iter 288/1445 - loss 0.15766715 - time (sec): 4.88 - samples/sec: 7419.05 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:17:04,918 epoch 5 - iter 432/1445 - loss 0.15448021 - time (sec): 7.27 - samples/sec: 7341.91 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:17:07,397 epoch 5 - iter 576/1445 - loss 0.15950395 - time (sec): 9.75 - samples/sec: 7339.99 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:17:09,810 epoch 5 - iter 720/1445 - loss 0.16067601 - time (sec): 12.17 - samples/sec: 7397.82 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:17:12,001 epoch 5 - iter 864/1445 - loss 0.16175646 - time (sec): 14.36 - samples/sec: 7517.75 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:17:14,235 epoch 5 - iter 1008/1445 - loss 0.16105262 - time (sec): 16.59 - samples/sec: 7524.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:17:16,662 epoch 5 - iter 1152/1445 - loss 0.16316996 - time (sec): 19.02 - samples/sec: 7536.66 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:17:19,015 epoch 5 - iter 1296/1445 - loss 0.16557631 - time (sec): 21.37 - samples/sec: 7486.45 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:17:21,350 epoch 5 - iter 1440/1445 - loss 0.16498905 - time (sec): 23.71 - samples/sec: 7403.27 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:17:21,438 ----------------------------------------------------------------------------------------------------
2023-10-18 22:17:21,438 EPOCH 5 done: loss 0.1647 - lr: 0.000017
2023-10-18 22:17:23,208 DEV : loss 0.20644906163215637 - f1-score (micro avg) 0.4324
2023-10-18 22:17:23,222 ----------------------------------------------------------------------------------------------------
2023-10-18 22:17:25,647 epoch 6 - iter 144/1445 - loss 0.17082670 - time (sec): 2.42 - samples/sec: 7555.88 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:17:28,129 epoch 6 - iter 288/1445 - loss 0.16750043 - time (sec): 4.91 - samples/sec: 7212.45 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:17:30,591 epoch 6 - iter 432/1445 - loss 0.17029904 - time (sec): 7.37 - samples/sec: 7130.80 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:17:33,171 epoch 6 - iter 576/1445 - loss 0.16210786 - time (sec): 9.95 - samples/sec: 7113.75 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:17:35,596 epoch 6 - iter 720/1445 - loss 0.15996811 - time (sec): 12.37 - samples/sec: 7192.79 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:17:37,952 epoch 6 - iter 864/1445 - loss 0.16028998 - time (sec): 14.73 - samples/sec: 7181.61 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:17:40,277 epoch 6 - iter 1008/1445 - loss 0.16103140 - time (sec): 17.05 - samples/sec: 7197.95 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:17:42,653 epoch 6 - iter 1152/1445 - loss 0.16240655 - time (sec): 19.43 - samples/sec: 7226.49 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:17:45,046 epoch 6 - iter 1296/1445 - loss 0.15993163 - time (sec): 21.82 - samples/sec: 7245.37 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:17:47,386 epoch 6 - iter 1440/1445 - loss 0.15859787 - time (sec): 24.16 - samples/sec: 7262.98 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:17:47,478 ----------------------------------------------------------------------------------------------------
2023-10-18 22:17:47,478 EPOCH 6 done: loss 0.1587 - lr: 0.000013
2023-10-18 22:17:49,275 DEV : loss 0.19981378316879272 - f1-score (micro avg) 0.4602
2023-10-18 22:17:49,290 saving best model
2023-10-18 22:17:49,327 ----------------------------------------------------------------------------------------------------
2023-10-18 22:17:51,814 epoch 7 - iter 144/1445 - loss 0.15298779 - time (sec): 2.49 - samples/sec: 6654.90 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:17:54,333 epoch 7 - iter 288/1445 - loss 0.15264329 - time (sec): 5.00 - samples/sec: 7121.17 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:17:56,774 epoch 7 - iter 432/1445 - loss 0.15380859 - time (sec): 7.45 - samples/sec: 7220.78 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:17:59,139 epoch 7 - iter 576/1445 - loss 0.15272071 - time (sec): 9.81 - samples/sec: 7239.52 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:18:01,538 epoch 7 - iter 720/1445 - loss 0.15533373 - time (sec): 12.21 - samples/sec: 7244.29 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:18:03,695 epoch 7 - iter 864/1445 - loss 0.15379752 - time (sec): 14.37 - samples/sec: 7386.12 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:18:05,960 epoch 7 - iter 1008/1445 - loss 0.15411990 - time (sec): 16.63 - samples/sec: 7428.33 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:18:08,406 epoch 7 - iter 1152/1445 - loss 0.15631995 - time (sec): 19.08 - samples/sec: 7477.96 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:18:10,797 epoch 7 - iter 1296/1445 - loss 0.15702230 - time (sec): 21.47 - samples/sec: 7414.47 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:18:13,172 epoch 7 - iter 1440/1445 - loss 0.15433279 - time (sec): 23.84 - samples/sec: 7372.37 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:18:13,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:18:13,250 EPOCH 7 done: loss 0.1543 - lr: 0.000010
2023-10-18 22:18:15,383 DEV : loss 0.20650173723697662 - f1-score (micro avg) 0.4613
2023-10-18 22:18:15,397 saving best model
2023-10-18 22:18:15,432 ----------------------------------------------------------------------------------------------------
2023-10-18 22:18:17,902 epoch 8 - iter 144/1445 - loss 0.17182834 - time (sec): 2.47 - samples/sec: 7720.37 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:18:20,373 epoch 8 - iter 288/1445 - loss 0.16207302 - time (sec): 4.94 - samples/sec: 7399.34 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:18:22,749 epoch 8 - iter 432/1445 - loss 0.15599699 - time (sec): 7.32 - samples/sec: 7254.95 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:18:25,114 epoch 8 - iter 576/1445 - loss 0.16015713 - time (sec): 9.68 - samples/sec: 7189.52 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:18:27,471 epoch 8 - iter 720/1445 - loss 0.15628888 - time (sec): 12.04 - samples/sec: 7161.34 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:18:29,816 epoch 8 - iter 864/1445 - loss 0.15567396 - time (sec): 14.38 - samples/sec: 7204.07 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:18:32,111 epoch 8 - iter 1008/1445 - loss 0.15296437 - time (sec): 16.68 - samples/sec: 7302.17 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:18:34,490 epoch 8 - iter 1152/1445 - loss 0.15205419 - time (sec): 19.06 - samples/sec: 7357.23 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:18:36,881 epoch 8 - iter 1296/1445 - loss 0.15117031 - time (sec): 21.45 - samples/sec: 7361.91 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:18:39,320 epoch 8 - iter 1440/1445 - loss 0.14875319 - time (sec): 23.89 - samples/sec: 7355.33 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:18:39,405 ----------------------------------------------------------------------------------------------------
2023-10-18 22:18:39,405 EPOCH 8 done: loss 0.1489 - lr: 0.000007
2023-10-18 22:18:41,182 DEV : loss 0.2007928192615509 - f1-score (micro avg) 0.4698
2023-10-18 22:18:41,197 saving best model
2023-10-18 22:18:41,233 ----------------------------------------------------------------------------------------------------
2023-10-18 22:18:43,612 epoch 9 - iter 144/1445 - loss 0.13803425 - time (sec): 2.38 - samples/sec: 7192.10 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:18:45,994 epoch 9 - iter 288/1445 - loss 0.13390889 - time (sec): 4.76 - samples/sec: 7388.05 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:18:48,274 epoch 9 - iter 432/1445 - loss 0.14039521 - time (sec): 7.04 - samples/sec: 7375.77 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:18:50,726 epoch 9 - iter 576/1445 - loss 0.13920635 - time (sec): 9.49 - samples/sec: 7421.14 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:18:53,125 epoch 9 - iter 720/1445 - loss 0.13962614 - time (sec): 11.89 - samples/sec: 7412.45 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:18:55,524 epoch 9 - iter 864/1445 - loss 0.14289517 - time (sec): 14.29 - samples/sec: 7323.98 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:18:57,951 epoch 9 - iter 1008/1445 - loss 0.14366200 - time (sec): 16.72 - samples/sec: 7317.65 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:19:00,430 epoch 9 - iter 1152/1445 - loss 0.14679698 - time (sec): 19.20 - samples/sec: 7336.73 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:19:02,838 epoch 9 - iter 1296/1445 - loss 0.14740503 - time (sec): 21.60 - samples/sec: 7335.93 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:19:05,203 epoch 9 - iter 1440/1445 - loss 0.14602498 - time (sec): 23.97 - samples/sec: 7329.86 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:19:05,283 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:05,284 EPOCH 9 done: loss 0.1462 - lr: 0.000003
2023-10-18 22:19:07,083 DEV : loss 0.19696438312530518 - f1-score (micro avg) 0.471
2023-10-18 22:19:07,098 saving best model
2023-10-18 22:19:07,138 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:09,593 epoch 10 - iter 144/1445 - loss 0.17038935 - time (sec): 2.45 - samples/sec: 7036.32 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:19:11,999 epoch 10 - iter 288/1445 - loss 0.15496516 - time (sec): 4.86 - samples/sec: 7198.44 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:19:14,441 epoch 10 - iter 432/1445 - loss 0.14934365 - time (sec): 7.30 - samples/sec: 7333.86 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:19:16,858 epoch 10 - iter 576/1445 - loss 0.14818797 - time (sec): 9.72 - samples/sec: 7263.78 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:19:19,266 epoch 10 - iter 720/1445 - loss 0.14649527 - time (sec): 12.13 - samples/sec: 7274.87 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:19:21,642 epoch 10 - iter 864/1445 - loss 0.14868735 - time (sec): 14.50 - samples/sec: 7264.03 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:19:24,031 epoch 10 - iter 1008/1445 - loss 0.14662905 - time (sec): 16.89 - samples/sec: 7265.70 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:19:26,493 epoch 10 - iter 1152/1445 - loss 0.14401068 - time (sec): 19.35 - samples/sec: 7295.78 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:19:28,722 epoch 10 - iter 1296/1445 - loss 0.14499991 - time (sec): 21.58 - samples/sec: 7392.64 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:19:31,128 epoch 10 - iter 1440/1445 - loss 0.14488256 - time (sec): 23.99 - samples/sec: 7327.85 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:19:31,208 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:31,208 EPOCH 10 done: loss 0.1450 - lr: 0.000000
2023-10-18 22:19:33,334 DEV : loss 0.19905780255794525 - f1-score (micro avg) 0.4676
2023-10-18 22:19:33,378 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:33,378 Loading model from best epoch ...
2023-10-18 22:19:33,459 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 22:19:34,792 Results:
- F-score (micro) 0.4923
- F-score (macro) 0.3343
- Accuracy 0.3404

By class:
              precision    recall  f1-score   support

         LOC     0.5121    0.6463    0.5714       458
         PER     0.5556    0.3527    0.4315       482
         ORG     0.0000    0.0000    0.0000        69

   micro avg     0.5271    0.4618    0.4923      1009
   macro avg     0.3559    0.3330    0.3343      1009
weighted avg     0.4978    0.4618    0.4655      1009

2023-10-18 22:19:34,792 ----------------------------------------------------------------------------------------------------
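
The averaged F-scores in the final results can be re-derived from the per-class rows. The following is a minimal sketch, assuming the usual definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean). Only the numbers are taken from the log; the helper `f1` and the variable names are illustrative and not part of Flair.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
per_class = {
    "LOC": (0.5121, 0.6463, 0.5714, 458),
    "PER": (0.5556, 0.3527, 0.4315, 482),
    "ORG": (0.0000, 0.0000, 0.0000, 69),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1 from the reported micro-averaged precision and recall.
micro_f1 = f1(0.5271, 0.4618)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support (number of gold entities).
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

Because the inputs are already rounded to four decimals, the recomputed values can only be expected to match the reported 0.4923, 0.3343 and 0.4655 up to rounding. The ORG row contributes zero to every average, which is why the macro score sits well below the micro score.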