2023-10-19 20:37:47,501 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,501 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-19 20:37:47,501 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,501 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 20:37:47,501 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,501 Train: 7142 sentences
2023-10-19 20:37:47,502 (train_with_dev=False, train_with_test=False)
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Training Params:
2023-10-19 20:37:47,502 - learning_rate: "5e-05"
2023-10-19 20:37:47,502 - mini_batch_size: "4"
2023-10-19 20:37:47,502 - max_epochs: "10"
2023-10-19 20:37:47,502 - shuffle: "True"
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Plugins:
2023-10-19 20:37:47,502 - TensorboardLogger
2023-10-19 20:37:47,502 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 20:37:47,502 - metric: "('micro avg', 'f1-score')"
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Computation:
2023-10-19 20:37:47,502 - compute on device: cuda:0
2023-10-19 20:37:47,502 - embedding storage: none
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:47,502 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 20:37:50,218 epoch 1 - iter 178/1786 - loss 3.25038440 - time (sec): 2.71 - samples/sec: 9003.73 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:37:53,334 epoch 1 - iter 356/1786 - loss 2.72723456 - time (sec): 5.83 - samples/sec: 8560.72 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:37:56,391 epoch 1 - iter 534/1786 - loss 2.13524414 - time (sec): 8.89 - samples/sec: 8481.62 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:37:59,443 epoch 1 - iter 712/1786 - loss 1.75821465 - time (sec): 11.94 - samples/sec: 8570.20 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:38:02,460 epoch 1 - iter 890/1786 - loss 1.56053783 - time (sec): 14.96 - samples/sec: 8457.10 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:38:05,477 epoch 1 - iter 1068/1786 - loss 1.41754035 - time (sec): 17.97 - samples/sec: 8347.51 - lr: 0.000030 - momentum: 0.000000
2023-10-19 20:38:08,514 epoch 1 - iter 1246/1786 - loss 1.30441193 - time (sec): 21.01 - samples/sec: 8250.97 - lr: 0.000035 - momentum: 0.000000
2023-10-19 20:38:11,602 epoch 1 - iter 1424/1786 - loss 1.20423029 - time (sec): 24.10 - samples/sec: 8223.35 - lr: 0.000040 - momentum: 0.000000
2023-10-19 20:38:14,683 epoch 1 - iter 1602/1786 - loss 1.12451472 - time (sec): 27.18 - samples/sec: 8226.15 - lr: 0.000045 - momentum: 0.000000
2023-10-19 20:38:17,750 epoch 1 - iter 1780/1786 - loss 1.05967954 - time (sec): 30.25 - samples/sec: 8210.13 - lr: 0.000050 - momentum: 0.000000
2023-10-19 20:38:17,841 ----------------------------------------------------------------------------------------------------
2023-10-19 20:38:17,841 EPOCH 1 done: loss 1.0591 - lr: 0.000050
2023-10-19 20:38:19,268 DEV : loss 0.2911563515663147 - f1-score (micro avg) 0.2194
2023-10-19 20:38:19,281 saving best model
2023-10-19 20:38:19,317 ----------------------------------------------------------------------------------------------------
2023-10-19 20:38:22,372 epoch 2 - iter 178/1786 - loss 0.43030123 - time (sec): 3.05 - samples/sec: 8566.85 - lr: 0.000049 - momentum: 0.000000
2023-10-19 20:38:25,394 epoch 2 - iter 356/1786 - loss 0.42697674 - time (sec): 6.08 - samples/sec: 8308.39 - lr: 0.000049 - momentum: 0.000000
2023-10-19 20:38:28,377 epoch 2 - iter 534/1786 - loss 0.41531955 - time (sec): 9.06 - samples/sec: 8180.29 - lr: 0.000048 - momentum: 0.000000
2023-10-19 20:38:31,485 epoch 2 - iter 712/1786 - loss 0.41406429 - time (sec): 12.17 - samples/sec: 8249.87 - lr: 0.000048 - momentum: 0.000000
2023-10-19 20:38:34,537 epoch 2 - iter 890/1786 - loss 0.40178088 - time (sec): 15.22 - samples/sec: 8234.11 - lr: 0.000047 - momentum: 0.000000
2023-10-19 20:38:37,582 epoch 2 - iter 1068/1786 - loss 0.40129750 - time (sec): 18.26 - samples/sec: 8234.00 - lr: 0.000047 - momentum: 0.000000
2023-10-19 20:38:40,571 epoch 2 - iter 1246/1786 - loss 0.39641341 - time (sec): 21.25 - samples/sec: 8194.34 - lr: 0.000046 - momentum: 0.000000
2023-10-19 20:38:43,411 epoch 2 - iter 1424/1786 - loss 0.39072789 - time (sec): 24.09 - samples/sec: 8264.27 - lr: 0.000046 - momentum: 0.000000
2023-10-19 20:38:46,396 epoch 2 - iter 1602/1786 - loss 0.38909677 - time (sec): 27.08 - samples/sec: 8278.05 - lr: 0.000045 - momentum: 0.000000
2023-10-19 20:38:49,429 epoch 2 - iter 1780/1786 - loss 0.38366296 - time (sec): 30.11 - samples/sec: 8243.44 - lr: 0.000044 - momentum: 0.000000
2023-10-19 20:38:49,517 ----------------------------------------------------------------------------------------------------
2023-10-19 20:38:49,517 EPOCH 2 done: loss 0.3839 - lr: 0.000044
2023-10-19 20:38:52,342 DEV : loss 0.22596031427383423 - f1-score (micro avg) 0.44
2023-10-19 20:38:52,357 saving best model
2023-10-19 20:38:52,401 ----------------------------------------------------------------------------------------------------
2023-10-19 20:38:55,409 epoch 3 - iter 178/1786 - loss 0.31914122 - time (sec): 3.01 - samples/sec: 7689.78 - lr: 0.000044 - momentum: 0.000000
2023-10-19 20:38:58,513 epoch 3 - iter 356/1786 - loss 0.31467803 - time (sec): 6.11 - samples/sec: 7874.30 - lr: 0.000043 - momentum: 0.000000
2023-10-19 20:39:01,608 epoch 3 - iter 534/1786 - loss 0.31791586 - time (sec): 9.21 - samples/sec: 7952.03 - lr: 0.000043 - momentum: 0.000000
2023-10-19 20:39:04,722 epoch 3 - iter 712/1786 - loss 0.32683842 - time (sec): 12.32 - samples/sec: 8044.64 - lr: 0.000042 - momentum: 0.000000
2023-10-19 20:39:07,739 epoch 3 - iter 890/1786 - loss 0.32735327 - time (sec): 15.34 - samples/sec: 8054.04 - lr: 0.000042 - momentum: 0.000000
2023-10-19 20:39:10,840 epoch 3 - iter 1068/1786 - loss 0.31925669 - time (sec): 18.44 - samples/sec: 8066.25 - lr: 0.000041 - momentum: 0.000000
2023-10-19 20:39:14,011 epoch 3 - iter 1246/1786 - loss 0.31637069 - time (sec): 21.61 - samples/sec: 8038.72 - lr: 0.000041 - momentum: 0.000000
2023-10-19 20:39:17,146 epoch 3 - iter 1424/1786 - loss 0.31200384 - time (sec): 24.74 - samples/sec: 8043.31 - lr: 0.000040 - momentum: 0.000000
2023-10-19 20:39:20,329 epoch 3 - iter 1602/1786 - loss 0.30757124 - time (sec): 27.93 - samples/sec: 8017.97 - lr: 0.000039 - momentum: 0.000000
2023-10-19 20:39:23,304 epoch 3 - iter 1780/1786 - loss 0.30421286 - time (sec): 30.90 - samples/sec: 8009.38 - lr: 0.000039 - momentum: 0.000000
2023-10-19 20:39:23,420 ----------------------------------------------------------------------------------------------------
2023-10-19 20:39:23,420 EPOCH 3 done: loss 0.3041 - lr: 0.000039
2023-10-19 20:39:25,770 DEV : loss 0.20335422456264496 - f1-score (micro avg) 0.4838
2023-10-19 20:39:25,784 saving best model
2023-10-19 20:39:25,818 ----------------------------------------------------------------------------------------------------
2023-10-19 20:39:28,873 epoch 4 - iter 178/1786 - loss 0.28120448 - time (sec): 3.05 - samples/sec: 8065.57 - lr: 0.000038 - momentum: 0.000000
2023-10-19 20:39:31,879 epoch 4 - iter 356/1786 - loss 0.28162907 - time (sec): 6.06 - samples/sec: 8160.54 - lr: 0.000038 - momentum: 0.000000
2023-10-19 20:39:34,912 epoch 4 - iter 534/1786 - loss 0.26744737 - time (sec): 9.09 - samples/sec: 8154.76 - lr: 0.000037 - momentum: 0.000000
2023-10-19 20:39:37,961 epoch 4 - iter 712/1786 - loss 0.26511953 - time (sec): 12.14 - samples/sec: 8148.29 - lr: 0.000037 - momentum: 0.000000
2023-10-19 20:39:41,026 epoch 4 - iter 890/1786 - loss 0.26550097 - time (sec): 15.21 - samples/sec: 8141.25 - lr: 0.000036 - momentum: 0.000000
2023-10-19 20:39:44,100 epoch 4 - iter 1068/1786 - loss 0.26573746 - time (sec): 18.28 - samples/sec: 8179.43 - lr: 0.000036 - momentum: 0.000000
2023-10-19 20:39:47,135 epoch 4 - iter 1246/1786 - loss 0.26849947 - time (sec): 21.32 - samples/sec: 8116.37 - lr: 0.000035 - momentum: 0.000000
2023-10-19 20:39:50,221 epoch 4 - iter 1424/1786 - loss 0.26853077 - time (sec): 24.40 - samples/sec: 8134.82 - lr: 0.000034 - momentum: 0.000000
2023-10-19 20:39:53,283 epoch 4 - iter 1602/1786 - loss 0.26807341 - time (sec): 27.46 - samples/sec: 8071.04 - lr: 0.000034 - momentum: 0.000000
2023-10-19 20:39:56,481 epoch 4 - iter 1780/1786 - loss 0.26526365 - time (sec): 30.66 - samples/sec: 8085.04 - lr: 0.000033 - momentum: 0.000000
2023-10-19 20:39:56,581 ----------------------------------------------------------------------------------------------------
2023-10-19 20:39:56,581 EPOCH 4 done: loss 0.2650 - lr: 0.000033
2023-10-19 20:39:59,392 DEV : loss 0.1998205929994583 - f1-score (micro avg) 0.4911
2023-10-19 20:39:59,406 saving best model
2023-10-19 20:39:59,441 ----------------------------------------------------------------------------------------------------
2023-10-19 20:40:02,525 epoch 5 - iter 178/1786 - loss 0.22521854 - time (sec): 3.08 - samples/sec: 7905.59 - lr: 0.000033 - momentum: 0.000000
2023-10-19 20:40:05,575 epoch 5 - iter 356/1786 - loss 0.23509832 - time (sec): 6.13 - samples/sec: 7930.76 - lr: 0.000032 - momentum: 0.000000
2023-10-19 20:40:08,566 epoch 5 - iter 534/1786 - loss 0.23834051 - time (sec): 9.12 - samples/sec: 7997.55 - lr: 0.000032 - momentum: 0.000000
2023-10-19 20:40:11,659 epoch 5 - iter 712/1786 - loss 0.23851056 - time (sec): 12.22 - samples/sec: 8049.31 - lr: 0.000031 - momentum: 0.000000
2023-10-19 20:40:14,717 epoch 5 - iter 890/1786 - loss 0.24170666 - time (sec): 15.28 - samples/sec: 8111.02 - lr: 0.000031 - momentum: 0.000000
2023-10-19 20:40:17,635 epoch 5 - iter 1068/1786 - loss 0.23929066 - time (sec): 18.19 - samples/sec: 8180.01 - lr: 0.000030 - momentum: 0.000000
2023-10-19 20:40:20,743 epoch 5 - iter 1246/1786 - loss 0.24080387 - time (sec): 21.30 - samples/sec: 8140.72 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:40:23,818 epoch 5 - iter 1424/1786 - loss 0.24269516 - time (sec): 24.38 - samples/sec: 8151.89 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:40:26,856 epoch 5 - iter 1602/1786 - loss 0.23904279 - time (sec): 27.41 - samples/sec: 8140.69 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:40:29,905 epoch 5 - iter 1780/1786 - loss 0.23838990 - time (sec): 30.46 - samples/sec: 8138.26 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:40:30,002 ----------------------------------------------------------------------------------------------------
2023-10-19 20:40:30,002 EPOCH 5 done: loss 0.2383 - lr: 0.000028
2023-10-19 20:40:32,346 DEV : loss 0.19193986058235168 - f1-score (micro avg) 0.5141
2023-10-19 20:40:32,359 saving best model
2023-10-19 20:40:32,394 ----------------------------------------------------------------------------------------------------
2023-10-19 20:40:35,541 epoch 6 - iter 178/1786 - loss 0.22022921 - time (sec): 3.15 - samples/sec: 7916.90 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:40:38,573 epoch 6 - iter 356/1786 - loss 0.22009712 - time (sec): 6.18 - samples/sec: 7793.77 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:40:41,626 epoch 6 - iter 534/1786 - loss 0.21614358 - time (sec): 9.23 - samples/sec: 7792.40 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:40:44,653 epoch 6 - iter 712/1786 - loss 0.21370923 - time (sec): 12.26 - samples/sec: 7868.70 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:40:47,713 epoch 6 - iter 890/1786 - loss 0.21638230 - time (sec): 15.32 - samples/sec: 7875.92 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:40:50,715 epoch 6 - iter 1068/1786 - loss 0.21880081 - time (sec): 18.32 - samples/sec: 7941.46 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:40:53,782 epoch 6 - iter 1246/1786 - loss 0.21972606 - time (sec): 21.39 - samples/sec: 7993.42 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:40:56,910 epoch 6 - iter 1424/1786 - loss 0.22189381 - time (sec): 24.52 - samples/sec: 8039.84 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:41:00,080 epoch 6 - iter 1602/1786 - loss 0.22127930 - time (sec): 27.69 - samples/sec: 8040.59 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:41:03,207 epoch 6 - iter 1780/1786 - loss 0.22077708 - time (sec): 30.81 - samples/sec: 8040.50 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:41:03,326 ----------------------------------------------------------------------------------------------------
2023-10-19 20:41:03,326 EPOCH 6 done: loss 0.2204 - lr: 0.000022
2023-10-19 20:41:06,202 DEV : loss 0.18350262939929962 - f1-score (micro avg) 0.536
2023-10-19 20:41:06,216 saving best model
2023-10-19 20:41:06,251 ----------------------------------------------------------------------------------------------------
2023-10-19 20:41:09,203 epoch 7 - iter 178/1786 - loss 0.19164778 - time (sec): 2.95 - samples/sec: 7736.55 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:41:12,209 epoch 7 - iter 356/1786 - loss 0.20305516 - time (sec): 5.96 - samples/sec: 7908.04 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:41:15,276 epoch 7 - iter 534/1786 - loss 0.20243257 - time (sec): 9.02 - samples/sec: 8026.84 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:41:18,401 epoch 7 - iter 712/1786 - loss 0.20043618 - time (sec): 12.15 - samples/sec: 7926.15 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:41:21,489 epoch 7 - iter 890/1786 - loss 0.19885104 - time (sec): 15.24 - samples/sec: 8005.15 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:41:24,379 epoch 7 - iter 1068/1786 - loss 0.20114197 - time (sec): 18.13 - samples/sec: 8023.68 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:41:27,266 epoch 7 - iter 1246/1786 - loss 0.20353714 - time (sec): 21.01 - samples/sec: 8008.93 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:41:30,473 epoch 7 - iter 1424/1786 - loss 0.20389885 - time (sec): 24.22 - samples/sec: 8134.58 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:41:33,490 epoch 7 - iter 1602/1786 - loss 0.20418664 - time (sec): 27.24 - samples/sec: 8220.64 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:41:36,145 epoch 7 - iter 1780/1786 - loss 0.20471845 - time (sec): 29.89 - samples/sec: 8297.20 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:41:36,243 ----------------------------------------------------------------------------------------------------
2023-10-19 20:41:36,243 EPOCH 7 done: loss 0.2044 - lr: 0.000017
2023-10-19 20:41:38,607 DEV : loss 0.18785029649734497 - f1-score (micro avg) 0.5566
2023-10-19 20:41:38,621 saving best model
2023-10-19 20:41:38,656 ----------------------------------------------------------------------------------------------------
2023-10-19 20:41:41,747 epoch 8 - iter 178/1786 - loss 0.17209114 - time (sec): 3.09 - samples/sec: 7635.51 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:41:44,834 epoch 8 - iter 356/1786 - loss 0.18649148 - time (sec): 6.18 - samples/sec: 8050.90 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:41:47,770 epoch 8 - iter 534/1786 - loss 0.19257228 - time (sec): 9.11 - samples/sec: 8143.11 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:41:50,802 epoch 8 - iter 712/1786 - loss 0.18865919 - time (sec): 12.15 - samples/sec: 8159.67 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:41:53,862 epoch 8 - iter 890/1786 - loss 0.19551212 - time (sec): 15.21 - samples/sec: 8069.21 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:41:56,869 epoch 8 - iter 1068/1786 - loss 0.19666334 - time (sec): 18.21 - samples/sec: 8097.11 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:41:59,961 epoch 8 - iter 1246/1786 - loss 0.19544818 - time (sec): 21.30 - samples/sec: 8042.12 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:42:02,975 epoch 8 - iter 1424/1786 - loss 0.19344021 - time (sec): 24.32 - samples/sec: 8041.95 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:42:06,103 epoch 8 - iter 1602/1786 - loss 0.19343569 - time (sec): 27.45 - samples/sec: 8117.39 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:42:09,630 epoch 8 - iter 1780/1786 - loss 0.19218063 - time (sec): 30.97 - samples/sec: 8008.93 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:42:09,731 ----------------------------------------------------------------------------------------------------
2023-10-19 20:42:09,731 EPOCH 8 done: loss 0.1927 - lr: 0.000011
2023-10-19 20:42:12,129 DEV : loss 0.18572503328323364 - f1-score (micro avg) 0.5618
2023-10-19 20:42:12,143 saving best model
2023-10-19 20:42:12,182 ----------------------------------------------------------------------------------------------------
2023-10-19 20:42:15,250 epoch 9 - iter 178/1786 - loss 0.19667861 - time (sec): 3.07 - samples/sec: 7940.55 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:42:18,298 epoch 9 - iter 356/1786 - loss 0.18488315 - time (sec): 6.12 - samples/sec: 8020.57 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:42:21,350 epoch 9 - iter 534/1786 - loss 0.18801911 - time (sec): 9.17 - samples/sec: 7986.67 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:42:24,420 epoch 9 - iter 712/1786 - loss 0.19094468 - time (sec): 12.24 - samples/sec: 8068.42 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:42:27,582 epoch 9 - iter 890/1786 - loss 0.19203212 - time (sec): 15.40 - samples/sec: 8162.53 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:42:30,593 epoch 9 - iter 1068/1786 - loss 0.18757669 - time (sec): 18.41 - samples/sec: 8116.35 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:42:33,700 epoch 9 - iter 1246/1786 - loss 0.18805714 - time (sec): 21.52 - samples/sec: 8113.44 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:42:36,559 epoch 9 - iter 1424/1786 - loss 0.18712464 - time (sec): 24.38 - samples/sec: 8144.68 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:42:39,647 epoch 9 - iter 1602/1786 - loss 0.18607916 - time (sec): 27.46 - samples/sec: 8143.69 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:42:42,752 epoch 9 - iter 1780/1786 - loss 0.18572632 - time (sec): 30.57 - samples/sec: 8108.96 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:42:42,854 ----------------------------------------------------------------------------------------------------
2023-10-19 20:42:42,854 EPOCH 9 done: loss 0.1856 - lr: 0.000006
2023-10-19 20:42:45,252 DEV : loss 0.19007962942123413 - f1-score (micro avg) 0.5685
2023-10-19 20:42:45,266 saving best model
2023-10-19 20:42:45,300 ----------------------------------------------------------------------------------------------------
2023-10-19 20:42:48,484 epoch 10 - iter 178/1786 - loss 0.17314883 - time (sec): 3.18 - samples/sec: 8236.77 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:42:51,678 epoch 10 - iter 356/1786 - loss 0.17597820 - time (sec): 6.38 - samples/sec: 8122.48 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:42:54,852 epoch 10 - iter 534/1786 - loss 0.18240659 - time (sec): 9.55 - samples/sec: 8166.07 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:42:57,846 epoch 10 - iter 712/1786 - loss 0.18381425 - time (sec): 12.54 - samples/sec: 8166.67 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:43:00,912 epoch 10 - iter 890/1786 - loss 0.18290301 - time (sec): 15.61 - samples/sec: 8151.54 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:43:04,044 epoch 10 - iter 1068/1786 - loss 0.18308905 - time (sec): 18.74 - samples/sec: 8086.03 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:43:07,605 epoch 10 - iter 1246/1786 - loss 0.18156946 - time (sec): 22.30 - samples/sec: 7889.83 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:43:10,669 epoch 10 - iter 1424/1786 - loss 0.18071674 - time (sec): 25.37 - samples/sec: 7877.57 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:43:13,642 epoch 10 - iter 1602/1786 - loss 0.18145714 - time (sec): 28.34 - samples/sec: 7889.82 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:43:17,011 epoch 10 - iter 1780/1786 - loss 0.18146549 - time (sec): 31.71 - samples/sec: 7809.46 - lr: 0.000000 - momentum: 0.000000
2023-10-19 20:43:17,140 ----------------------------------------------------------------------------------------------------
2023-10-19 20:43:17,140 EPOCH 10 done: loss 0.1815 - lr: 0.000000
2023-10-19 20:43:19,536 DEV : loss 0.18993768095970154 - f1-score (micro avg) 0.5632
2023-10-19 20:43:19,580 ----------------------------------------------------------------------------------------------------
2023-10-19 20:43:19,580 Loading model from best epoch ...
2023-10-19 20:43:19,660 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 20:43:24,342
Results:
- F-score (micro) 0.4627
- F-score (macro) 0.3013
- Accuracy 0.3107
By class:
              precision    recall  f1-score   support

         LOC     0.4447    0.5653    0.4978      1095
         PER     0.4861    0.5346    0.5092      1012
         ORG     0.2215    0.1793    0.1981       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.4381    0.4902    0.4627      2497
   macro avg     0.2881    0.3198    0.3013      2497
weighted avg     0.4237    0.4902    0.4530      2497
2023-10-19 20:43:24,342 ----------------------------------------------------------------------------------------------------