2023-10-13 23:38:56,260 ----------------------------------------------------------------------------------------------------
2023-10-13 23:38:56,261 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 23:38:56,261 ----------------------------------------------------------------------------------------------------
2023-10-13 23:38:56,261 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 23:38:56,261 ----------------------------------------------------------------------------------------------------
2023-10-13 23:38:56,261 Train: 7936 sentences
2023-10-13 23:38:56,261 (train_with_dev=False, train_with_test=False)
2023-10-13 23:38:56,261 ----------------------------------------------------------------------------------------------------
2023-10-13 23:38:56,261 Training Params:
2023-10-13 23:38:56,261 - learning_rate: "3e-05"
2023-10-13 23:38:56,261 - mini_batch_size: "4"
2023-10-13 23:38:56,261 - max_epochs: "10"
2023-10-13 23:38:56,261 - shuffle: "True"
2023-10-13 23:38:56,261 ----------------------------------------------------------------------------------------------------
2023-10-13 23:38:56,261 Plugins:
2023-10-13 23:38:56,261 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 23:38:56,261 ----------------------------------------------------------------------------------------------------
2023-10-13 23:38:56,261 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 23:38:56,261 - metric: "('micro avg', 'f1-score')"
2023-10-13 23:38:56,262 ----------------------------------------------------------------------------------------------------
2023-10-13 23:38:56,262 Computation:
2023-10-13 23:38:56,262 - compute on device: cuda:0
2023-10-13 23:38:56,262 - embedding storage: none
2023-10-13 23:38:56,262 ----------------------------------------------------------------------------------------------------
2023-10-13 23:38:56,262 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 23:38:56,262 ----------------------------------------------------------------------------------------------------
2023-10-13 23:38:56,262 ----------------------------------------------------------------------------------------------------
2023-10-13 23:39:05,623 epoch 1 - iter 198/1984 - loss 1.83762815 - time (sec): 9.36 - samples/sec: 1740.22 - lr: 0.000003 - momentum: 0.000000
2023-10-13 23:39:14,861 epoch 1 - iter 396/1984 - loss 1.09662191 - time (sec): 18.60 - samples/sec: 1741.17 - lr: 0.000006 - momentum: 0.000000
2023-10-13 23:39:23,842 epoch 1 - iter 594/1984 - loss 0.81504132 - time (sec): 27.58 - samples/sec: 1746.82 - lr: 0.000009 - momentum: 0.000000
2023-10-13 23:39:32,828 epoch 1 - iter 792/1984 - loss 0.66249214 - time (sec): 36.57 - samples/sec: 1765.29 - lr: 0.000012 - momentum: 0.000000
2023-10-13 23:39:41,985 epoch 1 - iter 990/1984 - loss 0.56311956 - time (sec): 45.72 - samples/sec: 1779.23 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:39:51,226 epoch 1 - iter 1188/1984 - loss 0.48540145 - time (sec): 54.96 - samples/sec: 1808.89 - lr: 0.000018 - momentum: 0.000000
2023-10-13 23:40:00,329 epoch 1 - iter 1386/1984 - loss 0.44028878 - time (sec): 64.07 - samples/sec: 1802.31 - lr: 0.000021 - momentum: 0.000000
2023-10-13 23:40:09,323 epoch 1 - iter 1584/1984 - loss 0.40258648 - time (sec): 73.06 - samples/sec: 1804.47 - lr: 0.000024 - momentum: 0.000000
2023-10-13 23:40:18,196 epoch 1 - iter 1782/1984 - loss 0.37540297 - time (sec): 81.93 - samples/sec: 1799.90 - lr: 0.000027 - momentum: 0.000000
2023-10-13 23:40:27,100 epoch 1 - iter 1980/1984 - loss 0.35246436 - time (sec): 90.84 - samples/sec: 1801.35 - lr: 0.000030 - momentum: 0.000000
2023-10-13 23:40:27,278 ----------------------------------------------------------------------------------------------------
2023-10-13 23:40:27,278 EPOCH 1 done: loss 0.3524 - lr: 0.000030
2023-10-13 23:40:30,422 DEV : loss 0.12433891743421555 - f1-score (micro avg) 0.7099
2023-10-13 23:40:30,444 saving best model
2023-10-13 23:40:30,861 ----------------------------------------------------------------------------------------------------
2023-10-13 23:40:39,812 epoch 2 - iter 198/1984 - loss 0.11828575 - time (sec): 8.95 - samples/sec: 1789.32 - lr: 0.000030 - momentum: 0.000000
2023-10-13 23:40:48,784 epoch 2 - iter 396/1984 - loss 0.11140177 - time (sec): 17.92 - samples/sec: 1815.27 - lr: 0.000029 - momentum: 0.000000
2023-10-13 23:40:58,222 epoch 2 - iter 594/1984 - loss 0.11940888 - time (sec): 27.36 - samples/sec: 1789.15 - lr: 0.000029 - momentum: 0.000000
2023-10-13 23:41:07,249 epoch 2 - iter 792/1984 - loss 0.11906737 - time (sec): 36.39 - samples/sec: 1799.64 - lr: 0.000029 - momentum: 0.000000
2023-10-13 23:41:16,257 epoch 2 - iter 990/1984 - loss 0.11891276 - time (sec): 45.39 - samples/sec: 1804.73 - lr: 0.000028 - momentum: 0.000000
2023-10-13 23:41:25,219 epoch 2 - iter 1188/1984 - loss 0.11640076 - time (sec): 54.36 - samples/sec: 1812.33 - lr: 0.000028 - momentum: 0.000000
2023-10-13 23:41:34,108 epoch 2 - iter 1386/1984 - loss 0.11635323 - time (sec): 63.25 - samples/sec: 1815.08 - lr: 0.000028 - momentum: 0.000000
2023-10-13 23:41:43,049 epoch 2 - iter 1584/1984 - loss 0.11619891 - time (sec): 72.19 - samples/sec: 1814.34 - lr: 0.000027 - momentum: 0.000000
2023-10-13 23:41:52,022 epoch 2 - iter 1782/1984 - loss 0.11338861 - time (sec): 81.16 - samples/sec: 1818.54 - lr: 0.000027 - momentum: 0.000000
2023-10-13 23:42:01,193 epoch 2 - iter 1980/1984 - loss 0.11087439 - time (sec): 90.33 - samples/sec: 1812.59 - lr: 0.000027 - momentum: 0.000000
2023-10-13 23:42:01,372 ----------------------------------------------------------------------------------------------------
2023-10-13 23:42:01,372 EPOCH 2 done: loss 0.1110 - lr: 0.000027
2023-10-13 23:42:05,217 DEV : loss 0.09629001468420029 - f1-score (micro avg) 0.7285
2023-10-13 23:42:05,237 saving best model
2023-10-13 23:42:05,739 ----------------------------------------------------------------------------------------------------
2023-10-13 23:42:15,161 epoch 3 - iter 198/1984 - loss 0.07429461 - time (sec): 9.42 - samples/sec: 1699.53 - lr: 0.000026 - momentum: 0.000000
2023-10-13 23:42:24,281 epoch 3 - iter 396/1984 - loss 0.08110074 - time (sec): 18.54 - samples/sec: 1720.44 - lr: 0.000026 - momentum: 0.000000
2023-10-13 23:42:33,319 epoch 3 - iter 594/1984 - loss 0.08477110 - time (sec): 27.58 - samples/sec: 1770.13 - lr: 0.000026 - momentum: 0.000000
2023-10-13 23:42:42,426 epoch 3 - iter 792/1984 - loss 0.08227969 - time (sec): 36.68 - samples/sec: 1798.59 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:42:51,384 epoch 3 - iter 990/1984 - loss 0.08206052 - time (sec): 45.64 - samples/sec: 1800.84 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:43:00,378 epoch 3 - iter 1188/1984 - loss 0.08348155 - time (sec): 54.63 - samples/sec: 1794.26 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:43:09,338 epoch 3 - iter 1386/1984 - loss 0.08114266 - time (sec): 63.60 - samples/sec: 1796.94 - lr: 0.000024 - momentum: 0.000000
2023-10-13 23:43:18,462 epoch 3 - iter 1584/1984 - loss 0.07984718 - time (sec): 72.72 - samples/sec: 1803.30 - lr: 0.000024 - momentum: 0.000000
2023-10-13 23:43:27,624 epoch 3 - iter 1782/1984 - loss 0.07934502 - time (sec): 81.88 - samples/sec: 1801.00 - lr: 0.000024 - momentum: 0.000000
2023-10-13 23:43:36,624 epoch 3 - iter 1980/1984 - loss 0.07959415 - time (sec): 90.88 - samples/sec: 1802.37 - lr: 0.000023 - momentum: 0.000000
2023-10-13 23:43:36,802 ----------------------------------------------------------------------------------------------------
2023-10-13 23:43:36,802 EPOCH 3 done: loss 0.0797 - lr: 0.000023
2023-10-13 23:43:40,249 DEV : loss 0.11717832088470459 - f1-score (micro avg) 0.7546
2023-10-13 23:43:40,270 saving best model
2023-10-13 23:43:40,823 ----------------------------------------------------------------------------------------------------
2023-10-13 23:43:50,001 epoch 4 - iter 198/1984 - loss 0.06052117 - time (sec): 9.18 - samples/sec: 1734.76 - lr: 0.000023 - momentum: 0.000000
2023-10-13 23:43:59,043 epoch 4 - iter 396/1984 - loss 0.06140378 - time (sec): 18.22 - samples/sec: 1796.43 - lr: 0.000023 - momentum: 0.000000
2023-10-13 23:44:08,039 epoch 4 - iter 594/1984 - loss 0.06052478 - time (sec): 27.21 - samples/sec: 1754.30 - lr: 0.000022 - momentum: 0.000000
2023-10-13 23:44:17,139 epoch 4 - iter 792/1984 - loss 0.06090929 - time (sec): 36.31 - samples/sec: 1772.51 - lr: 0.000022 - momentum: 0.000000
2023-10-13 23:44:26,199 epoch 4 - iter 990/1984 - loss 0.05940225 - time (sec): 45.37 - samples/sec: 1784.17 - lr: 0.000022 - momentum: 0.000000
2023-10-13 23:44:35,410 epoch 4 - iter 1188/1984 - loss 0.06144572 - time (sec): 54.59 - samples/sec: 1792.61 - lr: 0.000021 - momentum: 0.000000
2023-10-13 23:44:44,471 epoch 4 - iter 1386/1984 - loss 0.06104930 - time (sec): 63.65 - samples/sec: 1790.11 - lr: 0.000021 - momentum: 0.000000
2023-10-13 23:44:53,448 epoch 4 - iter 1584/1984 - loss 0.06021919 - time (sec): 72.62 - samples/sec: 1787.03 - lr: 0.000021 - momentum: 0.000000
2023-10-13 23:45:02,466 epoch 4 - iter 1782/1984 - loss 0.06016900 - time (sec): 81.64 - samples/sec: 1792.86 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:45:11,679 epoch 4 - iter 1980/1984 - loss 0.05925545 - time (sec): 90.85 - samples/sec: 1801.73 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:45:11,872 ----------------------------------------------------------------------------------------------------
2023-10-13 23:45:11,872 EPOCH 4 done: loss 0.0592 - lr: 0.000020
2023-10-13 23:45:15,405 DEV : loss 0.14380605518817902 - f1-score (micro avg) 0.7823
2023-10-13 23:45:15,440 saving best model
2023-10-13 23:45:15,946 ----------------------------------------------------------------------------------------------------
2023-10-13 23:45:25,143 epoch 5 - iter 198/1984 - loss 0.04087458 - time (sec): 9.19 - samples/sec: 1756.29 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:45:34,337 epoch 5 - iter 396/1984 - loss 0.04482892 - time (sec): 18.39 - samples/sec: 1786.25 - lr: 0.000019 - momentum: 0.000000
2023-10-13 23:45:43,527 epoch 5 - iter 594/1984 - loss 0.04311401 - time (sec): 27.58 - samples/sec: 1811.77 - lr: 0.000019 - momentum: 0.000000
2023-10-13 23:45:52,597 epoch 5 - iter 792/1984 - loss 0.04281423 - time (sec): 36.65 - samples/sec: 1791.80 - lr: 0.000019 - momentum: 0.000000
2023-10-13 23:46:01,560 epoch 5 - iter 990/1984 - loss 0.04332734 - time (sec): 45.61 - samples/sec: 1786.28 - lr: 0.000018 - momentum: 0.000000
2023-10-13 23:46:10,615 epoch 5 - iter 1188/1984 - loss 0.04358208 - time (sec): 54.66 - samples/sec: 1794.86 - lr: 0.000018 - momentum: 0.000000
2023-10-13 23:46:19,677 epoch 5 - iter 1386/1984 - loss 0.04379144 - time (sec): 63.73 - samples/sec: 1803.16 - lr: 0.000018 - momentum: 0.000000
2023-10-13 23:46:28,892 epoch 5 - iter 1584/1984 - loss 0.04545313 - time (sec): 72.94 - samples/sec: 1811.13 - lr: 0.000017 - momentum: 0.000000
2023-10-13 23:46:37,793 epoch 5 - iter 1782/1984 - loss 0.04375927 - time (sec): 81.84 - samples/sec: 1807.89 - lr: 0.000017 - momentum: 0.000000
2023-10-13 23:46:46,890 epoch 5 - iter 1980/1984 - loss 0.04461810 - time (sec): 90.94 - samples/sec: 1798.77 - lr: 0.000017 - momentum: 0.000000
2023-10-13 23:46:47,084 ----------------------------------------------------------------------------------------------------
2023-10-13 23:46:47,084 EPOCH 5 done: loss 0.0446 - lr: 0.000017
2023-10-13 23:46:51,086 DEV : loss 0.16696855425834656 - f1-score (micro avg) 0.767
2023-10-13 23:46:51,110 ----------------------------------------------------------------------------------------------------
2023-10-13 23:47:00,477 epoch 6 - iter 198/1984 - loss 0.03505539 - time (sec): 9.37 - samples/sec: 1863.68 - lr: 0.000016 - momentum: 0.000000
2023-10-13 23:47:09,450 epoch 6 - iter 396/1984 - loss 0.03251886 - time (sec): 18.34 - samples/sec: 1813.90 - lr: 0.000016 - momentum: 0.000000
2023-10-13 23:47:18,382 epoch 6 - iter 594/1984 - loss 0.03280510 - time (sec): 27.27 - samples/sec: 1791.04 - lr: 0.000016 - momentum: 0.000000
2023-10-13 23:47:27,447 epoch 6 - iter 792/1984 - loss 0.03472134 - time (sec): 36.34 - samples/sec: 1798.71 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:47:36,364 epoch 6 - iter 990/1984 - loss 0.03368279 - time (sec): 45.25 - samples/sec: 1793.01 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:47:45,255 epoch 6 - iter 1188/1984 - loss 0.03396626 - time (sec): 54.14 - samples/sec: 1792.49 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:47:54,494 epoch 6 - iter 1386/1984 - loss 0.03365918 - time (sec): 63.38 - samples/sec: 1800.85 - lr: 0.000014 - momentum: 0.000000
2023-10-13 23:48:03,472 epoch 6 - iter 1584/1984 - loss 0.03440385 - time (sec): 72.36 - samples/sec: 1803.11 - lr: 0.000014 - momentum: 0.000000
2023-10-13 23:48:12,439 epoch 6 - iter 1782/1984 - loss 0.03479304 - time (sec): 81.33 - samples/sec: 1808.13 - lr: 0.000014 - momentum: 0.000000
2023-10-13 23:48:21,422 epoch 6 - iter 1980/1984 - loss 0.03544213 - time (sec): 90.31 - samples/sec: 1813.02 - lr: 0.000013 - momentum: 0.000000
2023-10-13 23:48:21,597 ----------------------------------------------------------------------------------------------------
2023-10-13 23:48:21,598 EPOCH 6 done: loss 0.0354 - lr: 0.000013
2023-10-13 23:48:24,988 DEV : loss 0.1906966120004654 - f1-score (micro avg) 0.7609
2023-10-13 23:48:25,009 ----------------------------------------------------------------------------------------------------
2023-10-13 23:48:34,032 epoch 7 - iter 198/1984 - loss 0.02239552 - time (sec): 9.02 - samples/sec: 1854.40 - lr: 0.000013 - momentum: 0.000000
2023-10-13 23:48:42,982 epoch 7 - iter 396/1984 - loss 0.01908052 - time (sec): 17.97 - samples/sec: 1843.91 - lr: 0.000013 - momentum: 0.000000
2023-10-13 23:48:52,026 epoch 7 - iter 594/1984 - loss 0.01846812 - time (sec): 27.02 - samples/sec: 1843.30 - lr: 0.000012 - momentum: 0.000000
2023-10-13 23:49:01,113 epoch 7 - iter 792/1984 - loss 0.02010717 - time (sec): 36.10 - samples/sec: 1803.01 - lr: 0.000012 - momentum: 0.000000
2023-10-13 23:49:10,095 epoch 7 - iter 990/1984 - loss 0.02248072 - time (sec): 45.08 - samples/sec: 1819.55 - lr: 0.000012 - momentum: 0.000000
2023-10-13 23:49:18,978 epoch 7 - iter 1188/1984 - loss 0.02304749 - time (sec): 53.97 - samples/sec: 1817.48 - lr: 0.000011 - momentum: 0.000000
2023-10-13 23:49:27,935 epoch 7 - iter 1386/1984 - loss 0.02228094 - time (sec): 62.93 - samples/sec: 1812.85 - lr: 0.000011 - momentum: 0.000000
2023-10-13 23:49:36,927 epoch 7 - iter 1584/1984 - loss 0.02350682 - time (sec): 71.92 - samples/sec: 1811.32 - lr: 0.000011 - momentum: 0.000000
2023-10-13 23:49:46,117 epoch 7 - iter 1782/1984 - loss 0.02341870 - time (sec): 81.11 - samples/sec: 1810.88 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:49:55,199 epoch 7 - iter 1980/1984 - loss 0.02353248 - time (sec): 90.19 - samples/sec: 1815.61 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:49:55,377 ----------------------------------------------------------------------------------------------------
2023-10-13 23:49:55,377 EPOCH 7 done: loss 0.0235 - lr: 0.000010
2023-10-13 23:49:59,310 DEV : loss 0.19835640490055084 - f1-score (micro avg) 0.782
2023-10-13 23:49:59,331 ----------------------------------------------------------------------------------------------------
2023-10-13 23:50:08,784 epoch 8 - iter 198/1984 - loss 0.02230710 - time (sec): 9.45 - samples/sec: 1803.36 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:50:17,765 epoch 8 - iter 396/1984 - loss 0.01940484 - time (sec): 18.43 - samples/sec: 1809.43 - lr: 0.000009 - momentum: 0.000000
2023-10-13 23:50:26,817 epoch 8 - iter 594/1984 - loss 0.01764565 - time (sec): 27.48 - samples/sec: 1836.45 - lr: 0.000009 - momentum: 0.000000
2023-10-13 23:50:35,737 epoch 8 - iter 792/1984 - loss 0.01686537 - time (sec): 36.40 - samples/sec: 1833.82 - lr: 0.000009 - momentum: 0.000000
2023-10-13 23:50:44,795 epoch 8 - iter 990/1984 - loss 0.01829213 - time (sec): 45.46 - samples/sec: 1805.63 - lr: 0.000008 - momentum: 0.000000
2023-10-13 23:50:53,724 epoch 8 - iter 1188/1984 - loss 0.01820113 - time (sec): 54.39 - samples/sec: 1809.96 - lr: 0.000008 - momentum: 0.000000
2023-10-13 23:51:02,869 epoch 8 - iter 1386/1984 - loss 0.01748894 - time (sec): 63.54 - samples/sec: 1807.49 - lr: 0.000008 - momentum: 0.000000
2023-10-13 23:51:12,333 epoch 8 - iter 1584/1984 - loss 0.01719619 - time (sec): 73.00 - samples/sec: 1800.67 - lr: 0.000007 - momentum: 0.000000
2023-10-13 23:51:21,277 epoch 8 - iter 1782/1984 - loss 0.01734332 - time (sec): 81.94 - samples/sec: 1807.45 - lr: 0.000007 - momentum: 0.000000
2023-10-13 23:51:30,188 epoch 8 - iter 1980/1984 - loss 0.01756418 - time (sec): 90.86 - samples/sec: 1801.14 - lr: 0.000007 - momentum: 0.000000
2023-10-13 23:51:30,370 ----------------------------------------------------------------------------------------------------
2023-10-13 23:51:30,370 EPOCH 8 done: loss 0.0175 - lr: 0.000007
2023-10-13 23:51:33,748 DEV : loss 0.215502068400383 - f1-score (micro avg) 0.7633
2023-10-13 23:51:33,769 ----------------------------------------------------------------------------------------------------
2023-10-13 23:51:42,760 epoch 9 - iter 198/1984 - loss 0.01149768 - time (sec): 8.99 - samples/sec: 1769.82 - lr: 0.000006 - momentum: 0.000000
2023-10-13 23:51:51,819 epoch 9 - iter 396/1984 - loss 0.01515882 - time (sec): 18.05 - samples/sec: 1787.98 - lr: 0.000006 - momentum: 0.000000
2023-10-13 23:52:00,844 epoch 9 - iter 594/1984 - loss 0.01337974 - time (sec): 27.07 - samples/sec: 1824.64 - lr: 0.000006 - momentum: 0.000000
2023-10-13 23:52:09,839 epoch 9 - iter 792/1984 - loss 0.01205560 - time (sec): 36.07 - samples/sec: 1828.37 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:52:18,833 epoch 9 - iter 990/1984 - loss 0.01052937 - time (sec): 45.06 - samples/sec: 1821.80 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:52:27,846 epoch 9 - iter 1188/1984 - loss 0.01105230 - time (sec): 54.08 - samples/sec: 1822.13 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:52:36,854 epoch 9 - iter 1386/1984 - loss 0.01084377 - time (sec): 63.08 - samples/sec: 1815.25 - lr: 0.000004 - momentum: 0.000000
2023-10-13 23:52:45,813 epoch 9 - iter 1584/1984 - loss 0.01116629 - time (sec): 72.04 - samples/sec: 1817.09 - lr: 0.000004 - momentum: 0.000000
2023-10-13 23:52:54,838 epoch 9 - iter 1782/1984 - loss 0.01094512 - time (sec): 81.07 - samples/sec: 1812.73 - lr: 0.000004 - momentum: 0.000000
2023-10-13 23:53:03,779 epoch 9 - iter 1980/1984 - loss 0.01048321 - time (sec): 90.01 - samples/sec: 1818.01 - lr: 0.000003 - momentum: 0.000000
2023-10-13 23:53:03,957 ----------------------------------------------------------------------------------------------------
2023-10-13 23:53:03,957 EPOCH 9 done: loss 0.0105 - lr: 0.000003
2023-10-13 23:53:07,359 DEV : loss 0.2330678254365921 - f1-score (micro avg) 0.7632
2023-10-13 23:53:07,381 ----------------------------------------------------------------------------------------------------
2023-10-13 23:53:16,538 epoch 10 - iter 198/1984 - loss 0.00727112 - time (sec): 9.16 - samples/sec: 1873.57 - lr: 0.000003 - momentum: 0.000000
2023-10-13 23:53:25,529 epoch 10 - iter 396/1984 - loss 0.00644925 - time (sec): 18.15 - samples/sec: 1829.60 - lr: 0.000003 - momentum: 0.000000
2023-10-13 23:53:34,573 epoch 10 - iter 594/1984 - loss 0.00644003 - time (sec): 27.19 - samples/sec: 1831.35 - lr: 0.000002 - momentum: 0.000000
2023-10-13 23:53:44,140 epoch 10 - iter 792/1984 - loss 0.00614650 - time (sec): 36.76 - samples/sec: 1815.49 - lr: 0.000002 - momentum: 0.000000
2023-10-13 23:53:53,201 epoch 10 - iter 990/1984 - loss 0.00665805 - time (sec): 45.82 - samples/sec: 1833.66 - lr: 0.000002 - momentum: 0.000000
2023-10-13 23:54:02,072 epoch 10 - iter 1188/1984 - loss 0.00708536 - time (sec): 54.69 - samples/sec: 1829.89 - lr: 0.000001 - momentum: 0.000000
2023-10-13 23:54:10,915 epoch 10 - iter 1386/1984 - loss 0.00771984 - time (sec): 63.53 - samples/sec: 1817.53 - lr: 0.000001 - momentum: 0.000000
2023-10-13 23:54:19,803 epoch 10 - iter 1584/1984 - loss 0.00781777 - time (sec): 72.42 - samples/sec: 1807.47 - lr: 0.000001 - momentum: 0.000000
2023-10-13 23:54:28,689 epoch 10 - iter 1782/1984 - loss 0.00737069 - time (sec): 81.31 - samples/sec: 1806.15 - lr: 0.000000 - momentum: 0.000000
2023-10-13 23:54:38,294 epoch 10 - iter 1980/1984 - loss 0.00694992 - time (sec): 90.91 - samples/sec: 1800.46 - lr: 0.000000 - momentum: 0.000000
2023-10-13 23:54:38,470 ----------------------------------------------------------------------------------------------------
2023-10-13 23:54:38,470 EPOCH 10 done: loss 0.0069 - lr: 0.000000
2023-10-13 23:54:41,942 DEV : loss 0.23772144317626953 - f1-score (micro avg) 0.7709
2023-10-13 23:54:42,395 ----------------------------------------------------------------------------------------------------
2023-10-13 23:54:42,396 Loading model from best epoch ...
2023-10-13 23:54:43,833 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 23:54:47,114
Results:
- F-score (micro) 0.7789
- F-score (macro) 0.6828
- Accuracy 0.6588

By class:
              precision    recall  f1-score   support

         LOC     0.8215    0.8641    0.8423       655
         PER     0.6963    0.8430    0.7627       223
         ORG     0.4951    0.4016    0.4435       127

   micro avg     0.7580    0.8010    0.7789      1005
   macro avg     0.6710    0.7029    0.6828      1005
weighted avg     0.7525    0.8010    0.7742      1005

2023-10-13 23:54:47,115 ----------------------------------------------------------------------------------------------------
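
The averaged rows of the "By class" table can be cross-checked from the per-class rows. The following is a minimal sketch, not part of the training run: it assumes the usual definitions (macro = unweighted mean of per-class F1, weighted = support-weighted mean, micro F1 = harmonic mean of the micro-averaged precision and recall), and the names `rows` and `f1` are illustrative only.

```python
# Per-class rows copied from the "By class" table in the log:
# label -> (precision, recall, f1-score, support)
rows = {
    "LOC": (0.8215, 0.8641, 0.8423, 655),
    "PER": (0.6963, 0.8430, 0.7627, 223),
    "ORG": (0.4951, 0.4016, 0.4435, 127),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for _, _, _, support in rows.values())

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(score for _, _, score, _ in rows.values()) / len(rows)

# Weighted average: each class F1 is weighted by its support.
weighted_f1 = sum(score * support for _, _, score, support in rows.values()) / total_support

# Micro F1, recomputed from the logged micro-averaged precision and recall.
micro_f1 = f1(0.7580, 0.8010)

print(f"support={total_support}")
print(f"macro F1={macro_f1:.4f}  weighted F1={weighted_f1:.4f}  micro F1={micro_f1:.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed values should be compared with the logged ones only to that precision.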