2023-10-23 17:50:48,289 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,290 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-23 17:50:48,290 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,290 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-23 17:50:48,290 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,290 Train:  1214 sentences
2023-10-23 17:50:48,290         (train_with_dev=False, train_with_test=False)
2023-10-23 17:50:48,290 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,290 Training Params:
2023-10-23 17:50:48,290  - learning_rate: "3e-05"
2023-10-23 17:50:48,290  - mini_batch_size: "4"
2023-10-23 17:50:48,290  - max_epochs: "10"
2023-10-23 17:50:48,290  - shuffle: "True"
2023-10-23 17:50:48,290 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,290 Plugins:
2023-10-23 17:50:48,290  - TensorboardLogger
2023-10-23 17:50:48,290  - LinearScheduler | warmup_fraction: '0.1'
2023-10-23 17:50:48,290 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,291 Final evaluation on model from best epoch (best-model.pt)
2023-10-23 17:50:48,291  - metric: "('micro avg', 'f1-score')"
2023-10-23 17:50:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,291 Computation:
2023-10-23 17:50:48,291  - compute on device: cuda:0
2023-10-23 17:50:48,291  - embedding storage: none
2023-10-23 17:50:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,291 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-23 17:50:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 17:50:48,291 Logging anything other than scalars to
TensorBoard is currently not supported.
2023-10-23 17:50:50,356 epoch 1 - iter 30/304 - loss 3.69722481 - time (sec): 2.06 - samples/sec: 1655.82 - lr: 0.000003 - momentum: 0.000000
2023-10-23 17:50:51,989 epoch 1 - iter 60/304 - loss 2.74719092 - time (sec): 3.70 - samples/sec: 1780.24 - lr: 0.000006 - momentum: 0.000000
2023-10-23 17:50:53,607 epoch 1 - iter 90/304 - loss 2.12779152 - time (sec): 5.32 - samples/sec: 1787.84 - lr: 0.000009 - momentum: 0.000000
2023-10-23 17:50:55,228 epoch 1 - iter 120/304 - loss 1.74025184 - time (sec): 6.94 - samples/sec: 1797.95 - lr: 0.000012 - momentum: 0.000000
2023-10-23 17:50:56,852 epoch 1 - iter 150/304 - loss 1.47975099 - time (sec): 8.56 - samples/sec: 1817.19 - lr: 0.000015 - momentum: 0.000000
2023-10-23 17:50:58,231 epoch 1 - iter 180/304 - loss 1.27411463 - time (sec): 9.94 - samples/sec: 1867.48 - lr: 0.000018 - momentum: 0.000000
2023-10-23 17:50:59,678 epoch 1 - iter 210/304 - loss 1.12417703 - time (sec): 11.39 - samples/sec: 1903.83 - lr: 0.000021 - momentum: 0.000000
2023-10-23 17:51:01,302 epoch 1 - iter 240/304 - loss 1.01577048 - time (sec): 13.01 - samples/sec: 1893.29 - lr: 0.000024 - momentum: 0.000000
2023-10-23 17:51:02,937 epoch 1 - iter 270/304 - loss 0.92647351 - time (sec): 14.64 - samples/sec: 1889.22 - lr: 0.000027 - momentum: 0.000000
2023-10-23 17:51:04,559 epoch 1 - iter 300/304 - loss 0.85337022 - time (sec): 16.27 - samples/sec: 1881.87 - lr: 0.000030 - momentum: 0.000000
2023-10-23 17:51:04,773 ----------------------------------------------------------------------------------------------------
2023-10-23 17:51:04,773 EPOCH 1 done: loss 0.8459 - lr: 0.000030
2023-10-23 17:51:05,612 DEV : loss 0.18107786774635315 - f1-score (micro avg) 0.7157
2023-10-23 17:51:05,620 saving best model
2023-10-23 17:51:06,072 ----------------------------------------------------------------------------------------------------
2023-10-23 17:51:07,691 epoch 2 - iter 30/304 - loss 0.16115555 - time (sec): 1.62 - samples/sec: 1892.62 - lr: 0.000030 - momentum: 0.000000
2023-10-23 17:51:09,330 epoch 2 - iter 60/304 - loss 0.14404024 - time (sec): 3.26 - samples/sec: 1853.56 - lr: 0.000029 - momentum: 0.000000
2023-10-23 17:51:10,964 epoch 2 - iter 90/304 - loss 0.15544101 - time (sec): 4.89 - samples/sec: 1866.09 - lr: 0.000029 - momentum: 0.000000
2023-10-23 17:51:12,592 epoch 2 - iter 120/304 - loss 0.14600381 - time (sec): 6.52 - samples/sec: 1838.67 - lr: 0.000029 - momentum: 0.000000
2023-10-23 17:51:14,216 epoch 2 - iter 150/304 - loss 0.13989853 - time (sec): 8.14 - samples/sec: 1830.00 - lr: 0.000028 - momentum: 0.000000
2023-10-23 17:51:15,851 epoch 2 - iter 180/304 - loss 0.13760840 - time (sec): 9.78 - samples/sec: 1870.74 - lr: 0.000028 - momentum: 0.000000
2023-10-23 17:51:17,477 epoch 2 - iter 210/304 - loss 0.12848240 - time (sec): 11.40 - samples/sec: 1873.95 - lr: 0.000028 - momentum: 0.000000
2023-10-23 17:51:19,107 epoch 2 - iter 240/304 - loss 0.12642604 - time (sec): 13.03 - samples/sec: 1874.56 - lr: 0.000027 - momentum: 0.000000
2023-10-23 17:51:20,743 epoch 2 - iter 270/304 - loss 0.11972906 - time (sec): 14.67 - samples/sec: 1876.07 - lr: 0.000027 - momentum: 0.000000
2023-10-23 17:51:22,373 epoch 2 - iter 300/304 - loss 0.12418244 - time (sec): 16.30 - samples/sec: 1877.37 - lr: 0.000027 - momentum: 0.000000
2023-10-23 17:51:22,591 ----------------------------------------------------------------------------------------------------
2023-10-23 17:51:22,591 EPOCH 2 done: loss 0.1238 - lr: 0.000027
2023-10-23 17:51:23,455 DEV : loss 0.15617969632148743 - f1-score (micro avg) 0.7948
2023-10-23 17:51:23,461 saving best model
2023-10-23 17:51:24,052 ----------------------------------------------------------------------------------------------------
2023-10-23 17:51:25,664 epoch 3 - iter 30/304 - loss 0.07088016 - time (sec): 1.61 - samples/sec: 1807.21 - lr: 0.000026 - momentum: 0.000000
2023-10-23 17:51:27,305 epoch 3 - iter 60/304 - loss 0.07201890 - time (sec): 3.25 - samples/sec: 1900.19 - lr: 0.000026 - momentum: 0.000000
2023-10-23 17:51:28,942 epoch 3 - iter 90/304 - loss 0.07456532 - time (sec): 4.89 - samples/sec: 1963.05 - lr: 0.000026 - momentum: 0.000000
2023-10-23 17:51:30,366 epoch 3 - iter 120/304 - loss 0.08787217 - time (sec): 6.31 - samples/sec: 2023.68 - lr: 0.000025 - momentum: 0.000000
2023-10-23 17:51:31,944 epoch 3 - iter 150/304 - loss 0.08021076 - time (sec): 7.89 - samples/sec: 1997.95 - lr: 0.000025 - momentum: 0.000000
2023-10-23 17:51:33,571 epoch 3 - iter 180/304 - loss 0.07958015 - time (sec): 9.52 - samples/sec: 1967.41 - lr: 0.000025 - momentum: 0.000000
2023-10-23 17:51:35,208 epoch 3 - iter 210/304 - loss 0.07942447 - time (sec): 11.15 - samples/sec: 1964.35 - lr: 0.000024 - momentum: 0.000000
2023-10-23 17:51:36,828 epoch 3 - iter 240/304 - loss 0.08143248 - time (sec): 12.77 - samples/sec: 1930.93 - lr: 0.000024 - momentum: 0.000000
2023-10-23 17:51:38,456 epoch 3 - iter 270/304 - loss 0.07695793 - time (sec): 14.40 - samples/sec: 1913.58 - lr: 0.000024 - momentum: 0.000000
2023-10-23 17:51:40,076 epoch 3 - iter 300/304 - loss 0.07907368 - time (sec): 16.02 - samples/sec: 1908.68 - lr: 0.000023 - momentum: 0.000000
2023-10-23 17:51:40,294 ----------------------------------------------------------------------------------------------------
2023-10-23 17:51:40,294 EPOCH 3 done: loss 0.0798 - lr: 0.000023
2023-10-23 17:51:41,232 DEV : loss 0.18515826761722565 - f1-score (micro avg) 0.8115
2023-10-23 17:51:41,240 saving best model
2023-10-23 17:51:41,808 ----------------------------------------------------------------------------------------------------
2023-10-23 17:51:43,431 epoch 4 - iter 30/304 - loss 0.08009535 - time (sec): 1.62 - samples/sec: 1933.48 - lr: 0.000023 - momentum: 0.000000
2023-10-23 17:51:45,062 epoch 4 - iter 60/304 - loss 0.05354314 - time (sec): 3.25 - samples/sec: 1963.95 - lr: 0.000023 - momentum: 0.000000
2023-10-23 17:51:46,699 epoch 4 - iter 90/304 - loss 0.05563023 - time (sec): 4.89 - samples/sec: 2018.05 - lr: 0.000022 - momentum: 0.000000
2023-10-23 17:51:48,339 epoch 4 - iter 120/304 - loss 0.05586782 - time (sec): 6.53 - samples/sec: 2002.79 - lr: 0.000022 - momentum: 0.000000
2023-10-23 17:51:49,970 epoch 4 - iter 150/304 - loss 0.05452144 - time (sec): 8.16 - samples/sec: 1956.68 - lr: 0.000022 - momentum: 0.000000
2023-10-23 17:51:51,593 epoch 4 - iter 180/304 - loss 0.05103357 - time (sec): 9.78 - samples/sec: 1918.00 - lr: 0.000021 - momentum: 0.000000
2023-10-23 17:51:53,220 epoch 4 - iter 210/304 - loss 0.05587431 - time (sec): 11.41 - samples/sec: 1903.48 - lr: 0.000021 - momentum: 0.000000
2023-10-23 17:51:54,844 epoch 4 - iter 240/304 - loss 0.05468130 - time (sec): 13.03 - samples/sec: 1891.71 - lr: 0.000021 - momentum: 0.000000
2023-10-23 17:51:56,470 epoch 4 - iter 270/304 - loss 0.05462271 - time (sec): 14.66 - samples/sec: 1884.47 - lr: 0.000020 - momentum: 0.000000
2023-10-23 17:51:58,103 epoch 4 - iter 300/304 - loss 0.05458237 - time (sec): 16.29 - samples/sec: 1878.39 - lr: 0.000020 - momentum: 0.000000
2023-10-23 17:51:58,318 ----------------------------------------------------------------------------------------------------
2023-10-23 17:51:58,318 EPOCH 4 done: loss 0.0539 - lr: 0.000020
2023-10-23 17:51:59,172 DEV : loss 0.1898188292980194 - f1-score (micro avg) 0.8154
2023-10-23 17:51:59,179 saving best model
2023-10-23 17:51:59,731 ----------------------------------------------------------------------------------------------------
2023-10-23 17:52:01,335 epoch 5 - iter 30/304 - loss 0.04767102 - time (sec): 1.60 - samples/sec: 1757.29 - lr: 0.000020 - momentum: 0.000000
2023-10-23 17:52:02,960 epoch 5 - iter 60/304 - loss 0.03422077 - time (sec): 3.23 - samples/sec: 1860.53 - lr: 0.000019 - momentum: 0.000000
2023-10-23 17:52:04,583 epoch 5 - iter 90/304 - loss 0.05289543 - time (sec): 4.85 - samples/sec: 1865.16 - lr: 0.000019 - momentum: 0.000000
2023-10-23 17:52:06,204 epoch 5 -
iter 120/304 - loss 0.05165638 - time (sec): 6.47 - samples/sec: 1828.59 - lr: 0.000019 - momentum: 0.000000
2023-10-23 17:52:07,827 epoch 5 - iter 150/304 - loss 0.05063179 - time (sec): 8.09 - samples/sec: 1841.61 - lr: 0.000018 - momentum: 0.000000
2023-10-23 17:52:09,464 epoch 5 - iter 180/304 - loss 0.04770632 - time (sec): 9.73 - samples/sec: 1858.78 - lr: 0.000018 - momentum: 0.000000
2023-10-23 17:52:11,089 epoch 5 - iter 210/304 - loss 0.04098756 - time (sec): 11.36 - samples/sec: 1871.70 - lr: 0.000018 - momentum: 0.000000
2023-10-23 17:52:12,711 epoch 5 - iter 240/304 - loss 0.03813000 - time (sec): 12.98 - samples/sec: 1867.83 - lr: 0.000017 - momentum: 0.000000
2023-10-23 17:52:14,346 epoch 5 - iter 270/304 - loss 0.04061018 - time (sec): 14.61 - samples/sec: 1891.22 - lr: 0.000017 - momentum: 0.000000
2023-10-23 17:52:15,974 epoch 5 - iter 300/304 - loss 0.04162435 - time (sec): 16.24 - samples/sec: 1889.30 - lr: 0.000017 - momentum: 0.000000
2023-10-23 17:52:16,185 ----------------------------------------------------------------------------------------------------
2023-10-23 17:52:16,185 EPOCH 5 done: loss 0.0417 - lr: 0.000017
2023-10-23 17:52:17,072 DEV : loss 0.20237892866134644 - f1-score (micro avg) 0.8476
2023-10-23 17:52:17,080 saving best model
2023-10-23 17:52:17,660 ----------------------------------------------------------------------------------------------------
2023-10-23 17:52:19,284 epoch 6 - iter 30/304 - loss 0.04356615 - time (sec): 1.62 - samples/sec: 1856.35 - lr: 0.000016 - momentum: 0.000000
2023-10-23 17:52:20,904 epoch 6 - iter 60/304 - loss 0.05192274 - time (sec): 3.24 - samples/sec: 1816.39 - lr: 0.000016 - momentum: 0.000000
2023-10-23 17:52:22,530 epoch 6 - iter 90/304 - loss 0.04799394 - time (sec): 4.87 - samples/sec: 1822.11 - lr: 0.000016 - momentum: 0.000000
2023-10-23 17:52:24,147 epoch 6 - iter 120/304 - loss 0.03667820 - time (sec): 6.49 - samples/sec: 1817.67 - lr: 0.000015 - momentum: 0.000000
2023-10-23 17:52:25,765 epoch 6 - iter 150/304 - loss 0.03542625 - time (sec): 8.10 - samples/sec: 1833.18 - lr: 0.000015 - momentum: 0.000000
2023-10-23 17:52:27,378 epoch 6 - iter 180/304 - loss 0.03351087 - time (sec): 9.72 - samples/sec: 1806.89 - lr: 0.000015 - momentum: 0.000000
2023-10-23 17:52:29,001 epoch 6 - iter 210/304 - loss 0.03366804 - time (sec): 11.34 - samples/sec: 1817.09 - lr: 0.000014 - momentum: 0.000000
2023-10-23 17:52:30,634 epoch 6 - iter 240/304 - loss 0.03115946 - time (sec): 12.97 - samples/sec: 1849.52 - lr: 0.000014 - momentum: 0.000000
2023-10-23 17:52:32,263 epoch 6 - iter 270/304 - loss 0.03044961 - time (sec): 14.60 - samples/sec: 1857.96 - lr: 0.000014 - momentum: 0.000000
2023-10-23 17:52:33,896 epoch 6 - iter 300/304 - loss 0.02969013 - time (sec): 16.23 - samples/sec: 1884.08 - lr: 0.000013 - momentum: 0.000000
2023-10-23 17:52:34,112 ----------------------------------------------------------------------------------------------------
2023-10-23 17:52:34,112 EPOCH 6 done: loss 0.0295 - lr: 0.000013
2023-10-23 17:52:35,127 DEV : loss 0.21144907176494598 - f1-score (micro avg) 0.8353
2023-10-23 17:52:35,134 ----------------------------------------------------------------------------------------------------
2023-10-23 17:52:36,753 epoch 7 - iter 30/304 - loss 0.02126686 - time (sec): 1.62 - samples/sec: 1871.39 - lr: 0.000013 - momentum: 0.000000
2023-10-23 17:52:38,376 epoch 7 - iter 60/304 - loss 0.01593123 - time (sec): 3.24 - samples/sec: 1822.70 - lr: 0.000013 - momentum: 0.000000
2023-10-23 17:52:40,009 epoch 7 - iter 90/304 - loss 0.01390258 - time (sec): 4.87 - samples/sec: 1915.04 - lr: 0.000012 - momentum: 0.000000
2023-10-23 17:52:41,637 epoch 7 - iter 120/304 - loss 0.01400485 - time (sec): 6.50 - samples/sec: 1906.17 - lr: 0.000012 - momentum: 0.000000
2023-10-23 17:52:43,268 epoch 7 - iter 150/304 - loss 0.02486925 - time (sec): 8.13 - samples/sec: 1907.03 - lr: 0.000012 - momentum: 0.000000
2023-10-23 17:52:44,894 epoch 7 - iter 180/304 - loss 0.02498553 - time (sec): 9.76 - samples/sec: 1906.24 - lr: 0.000011 - momentum: 0.000000
2023-10-23 17:52:46,526 epoch 7 - iter 210/304 - loss 0.02236850 - time (sec): 11.39 - samples/sec: 1911.54 - lr: 0.000011 - momentum: 0.000000
2023-10-23 17:52:48,152 epoch 7 - iter 240/304 - loss 0.01961862 - time (sec): 13.02 - samples/sec: 1913.30 - lr: 0.000011 - momentum: 0.000000
2023-10-23 17:52:49,775 epoch 7 - iter 270/304 - loss 0.02088018 - time (sec): 14.64 - samples/sec: 1901.12 - lr: 0.000010 - momentum: 0.000000
2023-10-23 17:52:51,391 epoch 7 - iter 300/304 - loss 0.02117849 - time (sec): 16.26 - samples/sec: 1882.96 - lr: 0.000010 - momentum: 0.000000
2023-10-23 17:52:51,605 ----------------------------------------------------------------------------------------------------
2023-10-23 17:52:51,605 EPOCH 7 done: loss 0.0213 - lr: 0.000010
2023-10-23 17:52:52,481 DEV : loss 0.2095845639705658 - f1-score (micro avg) 0.8551
2023-10-23 17:52:52,489 saving best model
2023-10-23 17:52:53,005 ----------------------------------------------------------------------------------------------------
2023-10-23 17:52:54,630 epoch 8 - iter 30/304 - loss 0.02580967 - time (sec): 1.62 - samples/sec: 2042.98 - lr: 0.000010 - momentum: 0.000000
2023-10-23 17:52:56,254 epoch 8 - iter 60/304 - loss 0.01504738 - time (sec): 3.25 - samples/sec: 1891.36 - lr: 0.000009 - momentum: 0.000000
2023-10-23 17:52:57,880 epoch 8 - iter 90/304 - loss 0.01507099 - time (sec): 4.87 - samples/sec: 1911.65 - lr: 0.000009 - momentum: 0.000000
2023-10-23 17:52:59,500 epoch 8 - iter 120/304 - loss 0.01583357 - time (sec): 6.49 - samples/sec: 1884.16 - lr: 0.000009 - momentum: 0.000000
2023-10-23 17:53:01,134 epoch 8 - iter 150/304 - loss 0.02004945 - time (sec): 8.13 - samples/sec: 1888.81 - lr: 0.000008 - momentum: 0.000000
2023-10-23 17:53:02,759 epoch 8 - iter 180/304 - loss 0.01977157 - time (sec): 9.75 - samples/sec: 1878.69 - lr: 0.000008 - momentum: 0.000000 2023-10-23
17:53:04,391 epoch 8 - iter 210/304 - loss 0.01789790 - time (sec): 11.38 - samples/sec: 1886.15 - lr: 0.000008 - momentum: 0.000000
2023-10-23 17:53:06,018 epoch 8 - iter 240/304 - loss 0.01644621 - time (sec): 13.01 - samples/sec: 1893.46 - lr: 0.000007 - momentum: 0.000000
2023-10-23 17:53:07,649 epoch 8 - iter 270/304 - loss 0.01762986 - time (sec): 14.64 - samples/sec: 1897.81 - lr: 0.000007 - momentum: 0.000000
2023-10-23 17:53:09,279 epoch 8 - iter 300/304 - loss 0.01804904 - time (sec): 16.27 - samples/sec: 1886.64 - lr: 0.000007 - momentum: 0.000000
2023-10-23 17:53:09,491 ----------------------------------------------------------------------------------------------------
2023-10-23 17:53:09,491 EPOCH 8 done: loss 0.0179 - lr: 0.000007
2023-10-23 17:53:10,356 DEV : loss 0.22657519578933716 - f1-score (micro avg) 0.8507
2023-10-23 17:53:10,364 ----------------------------------------------------------------------------------------------------
2023-10-23 17:53:12,010 epoch 9 - iter 30/304 - loss 0.00543554 - time (sec): 1.64 - samples/sec: 1887.04 - lr: 0.000006 - momentum: 0.000000
2023-10-23 17:53:13,660 epoch 9 - iter 60/304 - loss 0.00288948 - time (sec): 3.29 - samples/sec: 1843.10 - lr: 0.000006 - momentum: 0.000000
2023-10-23 17:53:15,313 epoch 9 - iter 90/304 - loss 0.00255373 - time (sec): 4.95 - samples/sec: 1836.88 - lr: 0.000006 - momentum: 0.000000
2023-10-23 17:53:16,959 epoch 9 - iter 120/304 - loss 0.00421364 - time (sec): 6.59 - samples/sec: 1835.11 - lr: 0.000005 - momentum: 0.000000
2023-10-23 17:53:18,586 epoch 9 - iter 150/304 - loss 0.00514879 - time (sec): 8.22 - samples/sec: 1855.15 - lr: 0.000005 - momentum: 0.000000
2023-10-23 17:53:20,221 epoch 9 - iter 180/304 - loss 0.00649532 - time (sec): 9.86 - samples/sec: 1852.98 - lr: 0.000005 - momentum: 0.000000
2023-10-23 17:53:21,845 epoch 9 - iter 210/304 - loss 0.00604209 - time (sec): 11.48 - samples/sec: 1832.45 - lr: 0.000004 - momentum: 0.000000
2023-10-23 17:53:23,476 epoch 9 - iter 240/304 - loss 0.00627424 - time (sec): 13.11 - samples/sec: 1844.19 - lr: 0.000004 - momentum: 0.000000
2023-10-23 17:53:25,108 epoch 9 - iter 270/304 - loss 0.00662459 - time (sec): 14.74 - samples/sec: 1841.46 - lr: 0.000004 - momentum: 0.000000
2023-10-23 17:53:26,744 epoch 9 - iter 300/304 - loss 0.00836449 - time (sec): 16.38 - samples/sec: 1864.75 - lr: 0.000003 - momentum: 0.000000
2023-10-23 17:53:26,962 ----------------------------------------------------------------------------------------------------
2023-10-23 17:53:26,962 EPOCH 9 done: loss 0.0091 - lr: 0.000003
2023-10-23 17:53:27,808 DEV : loss 0.22202661633491516 - f1-score (micro avg) 0.8548
2023-10-23 17:53:27,815 ----------------------------------------------------------------------------------------------------
2023-10-23 17:53:29,342 epoch 10 - iter 30/304 - loss 0.01129457 - time (sec): 1.53 - samples/sec: 2105.48 - lr: 0.000003 - momentum: 0.000000
2023-10-23 17:53:30,873 epoch 10 - iter 60/304 - loss 0.00749556 - time (sec): 3.06 - samples/sec: 2065.81 - lr: 0.000003 - momentum: 0.000000
2023-10-23 17:53:32,399 epoch 10 - iter 90/304 - loss 0.00658369 - time (sec): 4.58 - samples/sec: 2028.26 - lr: 0.000002 - momentum: 0.000000
2023-10-23 17:53:33,926 epoch 10 - iter 120/304 - loss 0.01060511 - time (sec): 6.11 - samples/sec: 2024.80 - lr: 0.000002 - momentum: 0.000000
2023-10-23 17:53:35,458 epoch 10 - iter 150/304 - loss 0.00932383 - time (sec): 7.64 - samples/sec: 2031.58 - lr: 0.000002 - momentum: 0.000000
2023-10-23 17:53:36,987 epoch 10 - iter 180/304 - loss 0.00954633 - time (sec): 9.17 - samples/sec: 2031.06 - lr: 0.000001 - momentum: 0.000000
2023-10-23 17:53:38,506 epoch 10 - iter 210/304 - loss 0.00864562 - time (sec): 10.69 - samples/sec: 2003.68 - lr: 0.000001 - momentum: 0.000000
2023-10-23 17:53:40,030 epoch 10 - iter 240/304 - loss 0.00926603 - time (sec): 12.21 - samples/sec: 2003.07 - lr: 0.000001 - momentum: 0.000000
2023-10-23 17:53:41,557 epoch 10 - iter 270/304 - loss 0.00918413 - time (sec): 13.74 - samples/sec: 2003.56 - lr: 0.000000 - momentum: 0.000000
2023-10-23 17:53:43,093 epoch 10 - iter 300/304 - loss 0.00838212 - time (sec): 15.28 - samples/sec: 2002.68 - lr: 0.000000 - momentum: 0.000000
2023-10-23 17:53:43,294 ----------------------------------------------------------------------------------------------------
2023-10-23 17:53:43,294 EPOCH 10 done: loss 0.0090 - lr: 0.000000
2023-10-23 17:53:44,148 DEV : loss 0.21371521055698395 - f1-score (micro avg) 0.8565
2023-10-23 17:53:44,155 saving best model
2023-10-23 17:53:45,159 ----------------------------------------------------------------------------------------------------
2023-10-23 17:53:45,160 Loading model from best epoch ...
2023-10-23 17:53:47,222 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-23 17:53:48,038 Results:
- F-score (micro) 0.8192
- F-score (macro) 0.657
- Accuracy 0.6986

By class:
              precision    recall  f1-score   support

       scope     0.7888    0.8411    0.8141       151
        work     0.7523    0.8632    0.8039        95
        pers     0.8224    0.9167    0.8670        96
        date     0.0000    0.0000    0.0000         3
         loc     1.0000    0.6667    0.8000         3

   micro avg     0.7827    0.8592    0.8192       348
   macro avg     0.6727    0.6575    0.6570       348
weighted avg     0.7831    0.8592    0.8188       348

2023-10-23 17:53:48,038 ----------------------------------------------------------------------------------------------------
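
The aggregate rows of the final classification report follow mechanically from the per-class numbers. As a minimal sanity-check sketch (pure Python, using only the precision/recall/f1/support values printed in the log above), the micro, macro, and weighted F-scores can be recomputed like this:

```python
# (precision, recall, f1, support) per entity class, copied from the log.
per_class = {
    "scope": (0.7888, 0.8411, 0.8141, 151),
    "work":  (0.7523, 0.8632, 0.8039, 95),
    "pers":  (0.8224, 0.9167, 0.8670, 96),
    "date":  (0.0000, 0.0000, 0.0000, 3),
    "loc":   (1.0000, 0.6667, 0.8000, 3),
}

# Micro F1 is the harmonic mean of the micro-averaged precision and recall.
micro_p, micro_r = 0.7827, 0.8592
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1 weights each class F1 by its support.
total_support = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

print(round(micro_f1, 4))     # 0.8192 - the headline "F-score (micro)"
print(round(macro_f1, 3))     # 0.657  - the "F-score (macro)"
print(round(weighted_f1, 4))  # 0.8188 - the "weighted avg" f1-score row
```

The recomputed values match the report, including the low macro average: the two three-support classes (`date` at 0.0 and `loc` at 0.8) count as much toward the macro mean as `scope` with 151 test entities.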