2023-10-19 20:37:47,501 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,501 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 20:37:47,501 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,501 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 20:37:47,501 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,501 Train:  7142 sentences
2023-10-19 20:37:47,502 (train_with_dev=False, train_with_test=False)
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Training Params:
2023-10-19 20:37:47,502  - learning_rate: "5e-05"
2023-10-19 20:37:47,502  - mini_batch_size: "4"
2023-10-19 20:37:47,502  - max_epochs: "10"
2023-10-19 20:37:47,502  - shuffle: "True"
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Plugins:
2023-10-19 20:37:47,502  - TensorboardLogger
2023-10-19 20:37:47,502  - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 20:37:47,502  - metric: "('micro avg', 'f1-score')"
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Computation:
2023-10-19 20:37:47,502  - compute on device: cuda:0
2023-10-19 20:37:47,502  - embedding storage: none
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Logging anything other than scalars to
TensorBoard is currently not supported.
2023-10-19 20:37:50,218 epoch 1 - iter 178/1786 - loss 3.25038440 - time (sec): 2.71 - samples/sec: 9003.73 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:37:53,334 epoch 1 - iter 356/1786 - loss 2.72723456 - time (sec): 5.83 - samples/sec: 8560.72 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:37:56,391 epoch 1 - iter 534/1786 - loss 2.13524414 - time (sec): 8.89 - samples/sec: 8481.62 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:37:59,443 epoch 1 - iter 712/1786 - loss 1.75821465 - time (sec): 11.94 - samples/sec: 8570.20 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:38:02,460 epoch 1 - iter 890/1786 - loss 1.56053783 - time (sec): 14.96 - samples/sec: 8457.10 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:38:05,477 epoch 1 - iter 1068/1786 - loss 1.41754035 - time (sec): 17.97 - samples/sec: 8347.51 - lr: 0.000030 - momentum: 0.000000
2023-10-19 20:38:08,514 epoch 1 - iter 1246/1786 - loss 1.30441193 - time (sec): 21.01 - samples/sec: 8250.97 - lr: 0.000035 - momentum: 0.000000
2023-10-19 20:38:11,602 epoch 1 - iter 1424/1786 - loss 1.20423029 - time (sec): 24.10 - samples/sec: 8223.35 - lr: 0.000040 - momentum: 0.000000
2023-10-19 20:38:14,683 epoch 1 - iter 1602/1786 - loss 1.12451472 - time (sec): 27.18 - samples/sec: 8226.15 - lr: 0.000045 - momentum: 0.000000
2023-10-19 20:38:17,750 epoch 1 - iter 1780/1786 - loss 1.05967954 - time (sec): 30.25 - samples/sec: 8210.13 - lr: 0.000050 - momentum: 0.000000
2023-10-19 20:38:17,841 ----------------------------------------------------------------------------------------------------
2023-10-19 20:38:17,841 EPOCH 1 done: loss 1.0591 - lr: 0.000050
2023-10-19 20:38:19,268 DEV : loss 0.2911563515663147 - f1-score (micro avg) 0.2194
2023-10-19 20:38:19,281 saving best model
2023-10-19 20:38:19,317 ----------------------------------------------------------------------------------------------------
2023-10-19 20:38:22,372 epoch 2 - iter 178/1786 - loss 0.43030123 - time (sec): 3.05 - samples/sec: 8566.85 - lr: 0.000049 - momentum: 0.000000
2023-10-19 20:38:25,394 epoch 2 - iter 356/1786 - loss 0.42697674 - time (sec): 6.08 - samples/sec: 8308.39 - lr: 0.000049 - momentum: 0.000000
2023-10-19 20:38:28,377 epoch 2 - iter 534/1786 - loss 0.41531955 - time (sec): 9.06 - samples/sec: 8180.29 - lr: 0.000048 - momentum: 0.000000
2023-10-19 20:38:31,485 epoch 2 - iter 712/1786 - loss 0.41406429 - time (sec): 12.17 - samples/sec: 8249.87 - lr: 0.000048 - momentum: 0.000000
2023-10-19 20:38:34,537 epoch 2 - iter 890/1786 - loss 0.40178088 - time (sec): 15.22 - samples/sec: 8234.11 - lr: 0.000047 - momentum: 0.000000
2023-10-19 20:38:37,582 epoch 2 - iter 1068/1786 - loss 0.40129750 - time (sec): 18.26 - samples/sec: 8234.00 - lr: 0.000047 - momentum: 0.000000
2023-10-19 20:38:40,571 epoch 2 - iter 1246/1786 - loss 0.39641341 - time (sec): 21.25 - samples/sec: 8194.34 - lr: 0.000046 - momentum: 0.000000
2023-10-19 20:38:43,411 epoch 2 - iter 1424/1786 - loss 0.39072789 - time (sec): 24.09 - samples/sec: 8264.27 - lr: 0.000046 - momentum: 0.000000
2023-10-19 20:38:46,396 epoch 2 - iter 1602/1786 - loss 0.38909677 - time (sec): 27.08 - samples/sec: 8278.05 - lr: 0.000045 - momentum: 0.000000
2023-10-19 20:38:49,429 epoch 2 - iter 1780/1786 - loss 0.38366296 - time (sec): 30.11 - samples/sec: 8243.44 - lr: 0.000044 - momentum: 0.000000
2023-10-19 20:38:49,517 ----------------------------------------------------------------------------------------------------
2023-10-19 20:38:49,517 EPOCH 2 done: loss 0.3839 - lr: 0.000044
2023-10-19 20:38:52,342 DEV : loss 0.22596031427383423 - f1-score (micro avg) 0.44
2023-10-19 20:38:52,357 saving best model
2023-10-19 20:38:52,401 ----------------------------------------------------------------------------------------------------
2023-10-19 20:38:55,409 epoch 3 - iter 178/1786 - loss 0.31914122 - time (sec): 3.01 - samples/sec: 7689.78 - lr: 0.000044 - momentum: 0.000000
2023-10-19 20:38:58,513 epoch
3 - iter 356/1786 - loss 0.31467803 - time (sec): 6.11 - samples/sec: 7874.30 - lr: 0.000043 - momentum: 0.000000
2023-10-19 20:39:01,608 epoch 3 - iter 534/1786 - loss 0.31791586 - time (sec): 9.21 - samples/sec: 7952.03 - lr: 0.000043 - momentum: 0.000000
2023-10-19 20:39:04,722 epoch 3 - iter 712/1786 - loss 0.32683842 - time (sec): 12.32 - samples/sec: 8044.64 - lr: 0.000042 - momentum: 0.000000
2023-10-19 20:39:07,739 epoch 3 - iter 890/1786 - loss 0.32735327 - time (sec): 15.34 - samples/sec: 8054.04 - lr: 0.000042 - momentum: 0.000000
2023-10-19 20:39:10,840 epoch 3 - iter 1068/1786 - loss 0.31925669 - time (sec): 18.44 - samples/sec: 8066.25 - lr: 0.000041 - momentum: 0.000000
2023-10-19 20:39:14,011 epoch 3 - iter 1246/1786 - loss 0.31637069 - time (sec): 21.61 - samples/sec: 8038.72 - lr: 0.000041 - momentum: 0.000000
2023-10-19 20:39:17,146 epoch 3 - iter 1424/1786 - loss 0.31200384 - time (sec): 24.74 - samples/sec: 8043.31 - lr: 0.000040 - momentum: 0.000000
2023-10-19 20:39:20,329 epoch 3 - iter 1602/1786 - loss 0.30757124 - time (sec): 27.93 - samples/sec: 8017.97 - lr: 0.000039 - momentum: 0.000000
2023-10-19 20:39:23,304 epoch 3 - iter 1780/1786 - loss 0.30421286 - time (sec): 30.90 - samples/sec: 8009.38 - lr: 0.000039 - momentum: 0.000000
2023-10-19 20:39:23,420 ----------------------------------------------------------------------------------------------------
2023-10-19 20:39:23,420 EPOCH 3 done: loss 0.3041 - lr: 0.000039
2023-10-19 20:39:25,770 DEV : loss 0.20335422456264496 - f1-score (micro avg) 0.4838
2023-10-19 20:39:25,784 saving best model
2023-10-19 20:39:25,818 ----------------------------------------------------------------------------------------------------
2023-10-19 20:39:28,873 epoch 4 - iter 178/1786 - loss 0.28120448 - time (sec): 3.05 - samples/sec: 8065.57 - lr: 0.000038 - momentum: 0.000000
2023-10-19 20:39:31,879 epoch 4 - iter 356/1786 - loss 0.28162907 - time (sec): 6.06 - samples/sec: 8160.54 - lr: 0.000038 - momentum: 0.000000
2023-10-19 20:39:34,912 epoch 4 - iter 534/1786 - loss 0.26744737 - time (sec): 9.09 - samples/sec: 8154.76 - lr: 0.000037 - momentum: 0.000000
2023-10-19 20:39:37,961 epoch 4 - iter 712/1786 - loss 0.26511953 - time (sec): 12.14 - samples/sec: 8148.29 - lr: 0.000037 - momentum: 0.000000
2023-10-19 20:39:41,026 epoch 4 - iter 890/1786 - loss 0.26550097 - time (sec): 15.21 - samples/sec: 8141.25 - lr: 0.000036 - momentum: 0.000000
2023-10-19 20:39:44,100 epoch 4 - iter 1068/1786 - loss 0.26573746 - time (sec): 18.28 - samples/sec: 8179.43 - lr: 0.000036 - momentum: 0.000000
2023-10-19 20:39:47,135 epoch 4 - iter 1246/1786 - loss 0.26849947 - time (sec): 21.32 - samples/sec: 8116.37 - lr: 0.000035 - momentum: 0.000000
2023-10-19 20:39:50,221 epoch 4 - iter 1424/1786 - loss 0.26853077 - time (sec): 24.40 - samples/sec: 8134.82 - lr: 0.000034 - momentum: 0.000000
2023-10-19 20:39:53,283 epoch 4 - iter 1602/1786 - loss 0.26807341 - time (sec): 27.46 - samples/sec: 8071.04 - lr: 0.000034 - momentum: 0.000000
2023-10-19 20:39:56,481 epoch 4 - iter 1780/1786 - loss 0.26526365 - time (sec): 30.66 - samples/sec: 8085.04 - lr: 0.000033 - momentum: 0.000000
2023-10-19 20:39:56,581 ----------------------------------------------------------------------------------------------------
2023-10-19 20:39:56,581 EPOCH 4 done: loss 0.2650 - lr: 0.000033
2023-10-19 20:39:59,392 DEV : loss 0.1998205929994583 - f1-score (micro avg) 0.4911
2023-10-19 20:39:59,406 saving best model
2023-10-19 20:39:59,441 ----------------------------------------------------------------------------------------------------
2023-10-19 20:40:02,525 epoch 5 - iter 178/1786 - loss 0.22521854 - time (sec): 3.08 - samples/sec: 7905.59 - lr: 0.000033 - momentum: 0.000000
2023-10-19 20:40:05,575 epoch 5 - iter 356/1786 - loss 0.23509832 - time (sec): 6.13 - samples/sec: 7930.76 - lr: 0.000032 - momentum: 0.000000
2023-10-19 20:40:08,566 epoch 5 - iter 534/1786 - loss 0.23834051 - time (sec): 9.12 -
samples/sec: 7997.55 - lr: 0.000032 - momentum: 0.000000
2023-10-19 20:40:11,659 epoch 5 - iter 712/1786 - loss 0.23851056 - time (sec): 12.22 - samples/sec: 8049.31 - lr: 0.000031 - momentum: 0.000000
2023-10-19 20:40:14,717 epoch 5 - iter 890/1786 - loss 0.24170666 - time (sec): 15.28 - samples/sec: 8111.02 - lr: 0.000031 - momentum: 0.000000
2023-10-19 20:40:17,635 epoch 5 - iter 1068/1786 - loss 0.23929066 - time (sec): 18.19 - samples/sec: 8180.01 - lr: 0.000030 - momentum: 0.000000
2023-10-19 20:40:20,743 epoch 5 - iter 1246/1786 - loss 0.24080387 - time (sec): 21.30 - samples/sec: 8140.72 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:40:23,818 epoch 5 - iter 1424/1786 - loss 0.24269516 - time (sec): 24.38 - samples/sec: 8151.89 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:40:26,856 epoch 5 - iter 1602/1786 - loss 0.23904279 - time (sec): 27.41 - samples/sec: 8140.69 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:40:29,905 epoch 5 - iter 1780/1786 - loss 0.23838990 - time (sec): 30.46 - samples/sec: 8138.26 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:40:30,002 ----------------------------------------------------------------------------------------------------
2023-10-19 20:40:30,002 EPOCH 5 done: loss 0.2383 - lr: 0.000028
2023-10-19 20:40:32,346 DEV : loss 0.19193986058235168 - f1-score (micro avg) 0.5141
2023-10-19 20:40:32,359 saving best model
2023-10-19 20:40:32,394 ----------------------------------------------------------------------------------------------------
2023-10-19 20:40:35,541 epoch 6 - iter 178/1786 - loss 0.22022921 - time (sec): 3.15 - samples/sec: 7916.90 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:40:38,573 epoch 6 - iter 356/1786 - loss 0.22009712 - time (sec): 6.18 - samples/sec: 7793.77 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:40:41,626 epoch 6 - iter 534/1786 - loss 0.21614358 - time (sec): 9.23 - samples/sec: 7792.40 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:40:44,653 epoch 6 - iter 712/1786 - loss 0.21370923 - time (sec): 12.26 - samples/sec: 7868.70 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:40:47,713 epoch 6 - iter 890/1786 - loss 0.21638230 - time (sec): 15.32 - samples/sec: 7875.92 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:40:50,715 epoch 6 - iter 1068/1786 - loss 0.21880081 - time (sec): 18.32 - samples/sec: 7941.46 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:40:53,782 epoch 6 - iter 1246/1786 - loss 0.21972606 - time (sec): 21.39 - samples/sec: 7993.42 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:40:56,910 epoch 6 - iter 1424/1786 - loss 0.22189381 - time (sec): 24.52 - samples/sec: 8039.84 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:41:00,080 epoch 6 - iter 1602/1786 - loss 0.22127930 - time (sec): 27.69 - samples/sec: 8040.59 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:41:03,207 epoch 6 - iter 1780/1786 - loss 0.22077708 - time (sec): 30.81 - samples/sec: 8040.50 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:41:03,326 ----------------------------------------------------------------------------------------------------
2023-10-19 20:41:03,326 EPOCH 6 done: loss 0.2204 - lr: 0.000022
2023-10-19 20:41:06,202 DEV : loss 0.18350262939929962 - f1-score (micro avg) 0.536
2023-10-19 20:41:06,216 saving best model
2023-10-19 20:41:06,251 ----------------------------------------------------------------------------------------------------
2023-10-19 20:41:09,203 epoch 7 - iter 178/1786 - loss 0.19164778 - time (sec): 2.95 - samples/sec: 7736.55 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:41:12,209 epoch 7 - iter 356/1786 - loss 0.20305516 - time (sec): 5.96 - samples/sec: 7908.04 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:41:15,276 epoch 7 - iter 534/1786 - loss 0.20243257 - time (sec): 9.02 - samples/sec: 8026.84 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:41:18,401 epoch 7 - iter 712/1786 - loss 0.20043618 - time (sec): 12.15 - samples/sec: 7926.15 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:41:21,489 epoch 7 - iter 890/1786 - loss 0.19885104 - time (sec): 15.24 - samples/sec: 8005.15 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:41:24,379 epoch 7 - iter 1068/1786 - loss 0.20114197 - time (sec): 18.13 - samples/sec: 8023.68 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:41:27,266 epoch 7 - iter 1246/1786 - loss 0.20353714 - time (sec): 21.01 - samples/sec: 8008.93 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:41:30,473 epoch 7 - iter 1424/1786 - loss 0.20389885 - time (sec): 24.22 - samples/sec: 8134.58 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:41:33,490 epoch 7 - iter 1602/1786 - loss 0.20418664 - time (sec): 27.24 - samples/sec: 8220.64 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:41:36,145 epoch 7 - iter 1780/1786 - loss 0.20471845 - time (sec): 29.89 - samples/sec: 8297.20 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:41:36,243 ----------------------------------------------------------------------------------------------------
2023-10-19 20:41:36,243 EPOCH 7 done: loss 0.2044 - lr: 0.000017
2023-10-19 20:41:38,607 DEV : loss 0.18785029649734497 - f1-score (micro avg) 0.5566
2023-10-19 20:41:38,621 saving best model
2023-10-19 20:41:38,656 ----------------------------------------------------------------------------------------------------
2023-10-19 20:41:41,747 epoch 8 - iter 178/1786 - loss 0.17209114 - time (sec): 3.09 - samples/sec: 7635.51 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:41:44,834 epoch 8 - iter 356/1786 - loss 0.18649148 - time (sec): 6.18 - samples/sec: 8050.90 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:41:47,770 epoch 8 - iter 534/1786 - loss 0.19257228 - time (sec): 9.11 - samples/sec: 8143.11 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:41:50,802 epoch 8 - iter 712/1786 - loss 0.18865919 - time (sec): 12.15 - samples/sec: 8159.67 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:41:53,862 epoch 8 - iter 890/1786 - loss 0.19551212 - time (sec): 15.21 - samples/sec: 8069.21 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:41:56,869 epoch 8 - iter 1068/1786 - loss 0.19666334 - time (sec): 18.21 - samples/sec: 8097.11 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:41:59,961 epoch 8 - iter 1246/1786 - loss 0.19544818 - time (sec): 21.30 - samples/sec: 8042.12 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:42:02,975 epoch 8 - iter 1424/1786 - loss 0.19344021 - time (sec): 24.32 - samples/sec: 8041.95 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:42:06,103 epoch 8 - iter 1602/1786 - loss 0.19343569 - time (sec): 27.45 - samples/sec: 8117.39 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:42:09,630 epoch 8 - iter 1780/1786 - loss 0.19218063 - time (sec): 30.97 - samples/sec: 8008.93 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:42:09,731 ----------------------------------------------------------------------------------------------------
2023-10-19 20:42:09,731 EPOCH 8 done: loss 0.1927 - lr: 0.000011
2023-10-19 20:42:12,129 DEV : loss 0.18572503328323364 - f1-score (micro avg) 0.5618
2023-10-19 20:42:12,143 saving best model
2023-10-19 20:42:12,182 ----------------------------------------------------------------------------------------------------
2023-10-19 20:42:15,250 epoch 9 - iter 178/1786 - loss 0.19667861 - time (sec): 3.07 - samples/sec: 7940.55 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:42:18,298 epoch 9 - iter 356/1786 - loss 0.18488315 - time (sec): 6.12 - samples/sec: 8020.57 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:42:21,350 epoch 9 - iter 534/1786 - loss 0.18801911 - time (sec): 9.17 - samples/sec: 7986.67 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:42:24,420 epoch 9 - iter 712/1786 - loss 0.19094468 - time (sec): 12.24 - samples/sec: 8068.42 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:42:27,582 epoch 9 - iter 890/1786 - loss 0.19203212 - time (sec): 15.40 - samples/sec: 8162.53 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:42:30,593 epoch 9 - iter 1068/1786 - loss
0.18757669 - time (sec): 18.41 - samples/sec: 8116.35 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:42:33,700 epoch 9 - iter 1246/1786 - loss 0.18805714 - time (sec): 21.52 - samples/sec: 8113.44 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:42:36,559 epoch 9 - iter 1424/1786 - loss 0.18712464 - time (sec): 24.38 - samples/sec: 8144.68 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:42:39,647 epoch 9 - iter 1602/1786 - loss 0.18607916 - time (sec): 27.46 - samples/sec: 8143.69 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:42:42,752 epoch 9 - iter 1780/1786 - loss 0.18572632 - time (sec): 30.57 - samples/sec: 8108.96 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:42:42,854 ----------------------------------------------------------------------------------------------------
2023-10-19 20:42:42,854 EPOCH 9 done: loss 0.1856 - lr: 0.000006
2023-10-19 20:42:45,252 DEV : loss 0.19007962942123413 - f1-score (micro avg) 0.5685
2023-10-19 20:42:45,266 saving best model
2023-10-19 20:42:45,300 ----------------------------------------------------------------------------------------------------
2023-10-19 20:42:48,484 epoch 10 - iter 178/1786 - loss 0.17314883 - time (sec): 3.18 - samples/sec: 8236.77 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:42:51,678 epoch 10 - iter 356/1786 - loss 0.17597820 - time (sec): 6.38 - samples/sec: 8122.48 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:42:54,852 epoch 10 - iter 534/1786 - loss 0.18240659 - time (sec): 9.55 - samples/sec: 8166.07 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:42:57,846 epoch 10 - iter 712/1786 - loss 0.18381425 - time (sec): 12.54 - samples/sec: 8166.67 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:43:00,912 epoch 10 - iter 890/1786 - loss 0.18290301 - time (sec): 15.61 - samples/sec: 8151.54 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:43:04,044 epoch 10 - iter 1068/1786 - loss 0.18308905 - time (sec): 18.74 - samples/sec: 8086.03 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:43:07,605 epoch 10 - iter 1246/1786 - loss 0.18156946 - time (sec): 22.30 - samples/sec: 7889.83 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:43:10,669 epoch 10 - iter 1424/1786 - loss 0.18071674 - time (sec): 25.37 - samples/sec: 7877.57 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:43:13,642 epoch 10 - iter 1602/1786 - loss 0.18145714 - time (sec): 28.34 - samples/sec: 7889.82 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:43:17,011 epoch 10 - iter 1780/1786 - loss 0.18146549 - time (sec): 31.71 - samples/sec: 7809.46 - lr: 0.000000 - momentum: 0.000000
2023-10-19 20:43:17,140 ----------------------------------------------------------------------------------------------------
2023-10-19 20:43:17,140 EPOCH 10 done: loss 0.1815 - lr: 0.000000
2023-10-19 20:43:19,536 DEV : loss 0.18993768095970154 - f1-score (micro avg) 0.5632
2023-10-19 20:43:19,580 ----------------------------------------------------------------------------------------------------
2023-10-19 20:43:19,580 Loading model from best epoch ...
2023-10-19 20:43:19,660 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 20:43:24,342 Results:
- F-score (micro) 0.4627
- F-score (macro) 0.3013
- Accuracy 0.3107

By class:
              precision    recall  f1-score   support

         LOC     0.4447    0.5653    0.4978      1095
         PER     0.4861    0.5346    0.5092      1012
         ORG     0.2215    0.1793    0.1981       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.4381    0.4902    0.4627      2497
   macro avg     0.2881    0.3198    0.3013      2497
weighted avg     0.4237    0.4902    0.4530      2497

2023-10-19 20:43:24,342 ----------------------------------------------------------------------------------------------------
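The lr column in the log follows the LinearScheduler plugin with warmup_fraction '0.1': with 1786 batches per epoch over 10 epochs (17,860 steps), the learning rate ramps linearly from 0 to 5e-05 over the first 1,786 steps, then decays linearly back to 0. A minimal sketch of that schedule, assuming the standard linear warmup/decay formula (pure Python, independent of Flair; the function name is illustrative):

```python
def linear_warmup_lr(step: int,
                     base_lr: float = 5e-05,
                     total_steps: int = 10 * 1786,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to base_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 1786 steps here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # Linear decay over the remaining steps.
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Rounded to six decimals, these match the log's lr column:
# epoch 1, iter 178 -> 0.000005; epoch 1, iter 1780 -> 0.000050;
# epoch 2, iter 178 (global step 1964) -> 0.000049; final step -> 0.000000.
```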
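The aggregate scores in the final table can be reproduced from the per-class rows. The TP and predicted counts below are inferred from the reported precision/recall and support figures (they do not appear in the log itself); micro-averaging pools counts across classes, while macro-averaging is the unweighted mean of the per-class F1 scores:

```python
# Inferred per-class counts: (true positives, predicted, support).
# TP = recall * support; predicted = TP / precision, both rounded to the
# integers consistent with the reported four-decimal figures.
counts = {
    "LOC":       (619, 1392, 1095),
    "PER":       (541, 1113, 1012),
    "ORG":       ( 64,  289,  357),
    "HumanProd": (  0,    0,   33),
}

tp = sum(c[0] for c in counts.values())    # 1224 pooled true positives
pred = sum(c[1] for c in counts.values())  # 2794 pooled predictions
supp = sum(c[2] for c in counts.values())  # 2497 gold spans

micro_p = tp / pred                                     # ~0.4381
micro_r = tp / supp                                     # ~0.4902
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)  # ~0.4627

per_class_f1 = [0.4978, 0.5092, 0.1981, 0.0000]   # f1-score column
macro_f1 = sum(per_class_f1) / len(per_class_f1)  # ~0.3013
```

The gap between micro (0.4627) and macro (0.3013) F1 reflects the zero score on the small HumanProd class (33 spans), which drags the unweighted mean down but barely affects the pooled counts.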