2023-10-17 09:37:27,236 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,238 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 09:37:27,238 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,238 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 09:37:27,238 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,239 Train: 6183 sentences
2023-10-17 09:37:27,239 (train_with_dev=False, train_with_test=False)
2023-10-17 09:37:27,239 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,239 Training Params:
2023-10-17 09:37:27,239 - learning_rate: "3e-05"
2023-10-17 09:37:27,239 - mini_batch_size: "8"
2023-10-17 09:37:27,239 - max_epochs: "10"
2023-10-17 09:37:27,239 - shuffle: "True"
2023-10-17 09:37:27,239 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,239 Plugins:
2023-10-17 09:37:27,239 - TensorboardLogger
2023-10-17 09:37:27,239 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:37:27,239 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,239 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:37:27,240 - metric: "('micro avg', 'f1-score')"
2023-10-17 09:37:27,240 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,240 Computation:
2023-10-17 09:37:27,240 - compute on device: cuda:0
2023-10-17 09:37:27,240 - embedding storage: none
2023-10-17 09:37:27,240 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,240 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 09:37:27,240 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,240 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,240 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 09:37:34,576 epoch 1 - iter 77/773 - loss 2.33879566 - time (sec): 7.33 - samples/sec: 1752.72 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:37:41,584 epoch 1 - iter 154/773 - loss 1.42646834 - time (sec): 14.34 - samples/sec: 1749.19 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:37:48,222 epoch 1 - iter 231/773 - loss 1.01107736 - time (sec): 20.98 - samples/sec: 1783.57 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:37:55,237 epoch 1 - iter 308/773 - loss 0.78648939 - time (sec): 28.00 - samples/sec: 1800.87 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:38:02,692 epoch 1 - iter 385/773 - loss 0.65487679 - time (sec): 35.45 - samples/sec: 1767.05 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:38:09,721 epoch 1 - iter 462/773 - loss 0.56517862 - time (sec): 42.48 - samples/sec: 1763.85 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:38:17,445 epoch 1 - iter 539/773 - loss 0.50859951 - time (sec): 50.20 - samples/sec: 1727.34 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:38:24,994 epoch 1 - iter 616/773 - loss 0.46183259 - time (sec): 57.75 - samples/sec: 1710.88 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:38:32,538 epoch 1 - iter 693/773 - loss 0.41835378 - time (sec): 65.30 - samples/sec: 1708.79 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:38:39,593 epoch 1 - iter 770/773 - loss 0.38499874 - time (sec): 72.35 - samples/sec: 1713.83 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:38:39,844 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:39,844 EPOCH 1 done: loss 0.3842 - lr: 0.000030
2023-10-17 09:38:42,623 DEV : loss 0.05754069611430168 - f1-score (micro avg) 0.7534
2023-10-17 09:38:42,650 saving best model
2023-10-17 09:38:43,192 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:49,786 epoch 2 - iter 77/773 - loss 0.10261422 - time (sec): 6.59 - samples/sec: 1792.64 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:38:56,506 epoch 2 - iter 154/773 - loss 0.08620224 - time (sec): 13.31 - samples/sec: 1814.77 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:39:03,586 epoch 2 - iter 231/773 - loss 0.08217454 - time (sec): 20.39 - samples/sec: 1848.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:39:10,410 epoch 2 - iter 308/773 - loss 0.08124803 - time (sec): 27.22 - samples/sec: 1841.98 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:39:17,499 epoch 2 - iter 385/773 - loss 0.07988568 - time (sec): 34.31 - samples/sec: 1829.13 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:39:25,060 epoch 2 - iter 462/773 - loss 0.07997375 - time (sec): 41.87 - samples/sec: 1789.65 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:39:32,052 epoch 2 - iter 539/773 - loss 0.07763047 - time (sec): 48.86 - samples/sec: 1790.10 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:39:39,204 epoch 2 - iter 616/773 - loss 0.07703563 - time (sec): 56.01 - samples/sec: 1794.93 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:46,211 epoch 2 - iter 693/773 - loss 0.07619810 - time (sec): 63.02 - samples/sec: 1780.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:53,478 epoch 2 - iter 770/773 - loss 0.07711628 - time (sec): 70.28 - samples/sec: 1764.60 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:53,755 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:53,755 EPOCH 2 done: loss 0.0772 - lr: 0.000027
2023-10-17 09:39:56,582 DEV : loss 0.047075219452381134 - f1-score (micro avg) 0.7597
2023-10-17 09:39:56,610 saving best model
2023-10-17 09:39:57,996 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:05,357 epoch 3 - iter 77/773 - loss 0.04621094 - time (sec): 7.36 - samples/sec: 1588.98 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:40:13,046 epoch 3 - iter 154/773 - loss 0.04711698 - time (sec): 15.05 - samples/sec: 1650.32 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:40:20,395 epoch 3 - iter 231/773 - loss 0.04627990 - time (sec): 22.39 - samples/sec: 1705.13 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:40:27,418 epoch 3 - iter 308/773 - loss 0.04368407 - time (sec): 29.42 - samples/sec: 1719.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:40:34,373 epoch 3 - iter 385/773 - loss 0.04513632 - time (sec): 36.37 - samples/sec: 1716.82 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:40:41,366 epoch 3 - iter 462/773 - loss 0.04669092 - time (sec): 43.37 - samples/sec: 1731.18 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:40:48,481 epoch 3 - iter 539/773 - loss 0.04627462 - time (sec): 50.48 - samples/sec: 1726.60 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:40:55,508 epoch 3 - iter 616/773 - loss 0.04565734 - time (sec): 57.51 - samples/sec: 1730.70 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:41:02,972 epoch 3 - iter 693/773 - loss 0.04735239 - time (sec): 64.97 - samples/sec: 1698.13 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:41:10,809 epoch 3 - iter 770/773 - loss 0.04789770 - time (sec): 72.81 - samples/sec: 1701.70 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:41:11,097 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:11,097 EPOCH 3 done: loss 0.0479 - lr: 0.000023
2023-10-17 09:41:14,193 DEV : loss 0.04879758879542351 - f1-score (micro avg) 0.804
2023-10-17 09:41:14,221 saving best model
2023-10-17 09:41:15,684 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:23,490 epoch 4 - iter 77/773 - loss 0.02803532 - time (sec): 7.80 - samples/sec: 1647.61 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:41:31,172 epoch 4 - iter 154/773 - loss 0.02614174 - time (sec): 15.49 - samples/sec: 1583.78 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:41:38,912 epoch 4 - iter 231/773 - loss 0.02740037 - time (sec): 23.22 - samples/sec: 1615.22 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:41:46,418 epoch 4 - iter 308/773 - loss 0.02810327 - time (sec): 30.73 - samples/sec: 1627.11 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:41:54,017 epoch 4 - iter 385/773 - loss 0.02856760 - time (sec): 38.33 - samples/sec: 1628.38 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:42:02,005 epoch 4 - iter 462/773 - loss 0.03151706 - time (sec): 46.32 - samples/sec: 1625.94 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:42:09,711 epoch 4 - iter 539/773 - loss 0.03118289 - time (sec): 54.02 - samples/sec: 1629.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:42:17,213 epoch 4 - iter 616/773 - loss 0.03069938 - time (sec): 61.53 - samples/sec: 1620.50 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:42:24,916 epoch 4 - iter 693/773 - loss 0.03102882 - time (sec): 69.23 - samples/sec: 1610.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:42:32,858 epoch 4 - iter 770/773 - loss 0.03091728 - time (sec): 77.17 - samples/sec: 1606.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:42:33,116 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:33,116 EPOCH 4 done: loss 0.0311 - lr: 0.000020
2023-10-17 09:42:36,004 DEV : loss 0.0732564851641655 - f1-score (micro avg) 0.7938
2023-10-17 09:42:36,033 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:43,080 epoch 5 - iter 77/773 - loss 0.02562462 - time (sec): 7.04 - samples/sec: 1684.07 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:42:50,338 epoch 5 - iter 154/773 - loss 0.02186778 - time (sec): 14.30 - samples/sec: 1698.22 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:42:57,692 epoch 5 - iter 231/773 - loss 0.02197283 - time (sec): 21.66 - samples/sec: 1670.07 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:43:05,104 epoch 5 - iter 308/773 - loss 0.02085800 - time (sec): 29.07 - samples/sec: 1665.35 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:43:12,720 epoch 5 - iter 385/773 - loss 0.02066640 - time (sec): 36.68 - samples/sec: 1673.37 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:43:19,967 epoch 5 - iter 462/773 - loss 0.02032896 - time (sec): 43.93 - samples/sec: 1684.46 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:43:27,490 epoch 5 - iter 539/773 - loss 0.01942875 - time (sec): 51.45 - samples/sec: 1681.14 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:43:35,178 epoch 5 - iter 616/773 - loss 0.02036285 - time (sec): 59.14 - samples/sec: 1668.33 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:43:42,777 epoch 5 - iter 693/773 - loss 0.02050199 - time (sec): 66.74 - samples/sec: 1676.73 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:43:50,579 epoch 5 - iter 770/773 - loss 0.02101369 - time (sec): 74.54 - samples/sec: 1660.09 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:43:50,903 ----------------------------------------------------------------------------------------------------
2023-10-17 09:43:50,904 EPOCH 5 done: loss 0.0212 - lr: 0.000017
2023-10-17 09:43:54,291 DEV : loss 0.09599114209413528 - f1-score (micro avg) 0.7787
2023-10-17 09:43:54,321 ----------------------------------------------------------------------------------------------------
2023-10-17 09:44:02,198 epoch 6 - iter 77/773 - loss 0.01113512 - time (sec): 7.87 - samples/sec: 1629.16 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:44:09,915 epoch 6 - iter 154/773 - loss 0.01041077 - time (sec): 15.59 - samples/sec: 1650.97 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:44:17,666 epoch 6 - iter 231/773 - loss 0.01184309 - time (sec): 23.34 - samples/sec: 1629.14 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:44:25,575 epoch 6 - iter 308/773 - loss 0.01381827 - time (sec): 31.25 - samples/sec: 1617.95 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:44:32,527 epoch 6 - iter 385/773 - loss 0.01530116 - time (sec): 38.20 - samples/sec: 1663.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:44:39,617 epoch 6 - iter 462/773 - loss 0.01664026 - time (sec): 45.29 - samples/sec: 1658.96 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:44:46,872 epoch 6 - iter 539/773 - loss 0.01619538 - time (sec): 52.55 - samples/sec: 1653.81 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:44:54,077 epoch 6 - iter 616/773 - loss 0.01625589 - time (sec): 59.75 - samples/sec: 1652.08 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:45:01,550 epoch 6 - iter 693/773 - loss 0.01623770 - time (sec): 67.22 - samples/sec: 1656.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:45:09,080 epoch 6 - iter 770/773 - loss 0.01610707 - time (sec): 74.75 - samples/sec: 1657.18 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:45:09,364 ----------------------------------------------------------------------------------------------------
2023-10-17 09:45:09,364 EPOCH 6 done: loss 0.0161 - lr: 0.000013
2023-10-17 09:45:12,313 DEV : loss 0.10213357210159302 - f1-score (micro avg) 0.7975
2023-10-17 09:45:12,341 ----------------------------------------------------------------------------------------------------
2023-10-17 09:45:19,108 epoch 7 - iter 77/773 - loss 0.00659626 - time (sec): 6.77 - samples/sec: 1731.83 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:45:25,993 epoch 7 - iter 154/773 - loss 0.01050483 - time (sec): 13.65 - samples/sec: 1738.79 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:45:32,858 epoch 7 - iter 231/773 - loss 0.01031709 - time (sec): 20.52 - samples/sec: 1766.06 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:45:39,920 epoch 7 - iter 308/773 - loss 0.01094286 - time (sec): 27.58 - samples/sec: 1774.72 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:45:46,969 epoch 7 - iter 385/773 - loss 0.01060798 - time (sec): 34.63 - samples/sec: 1778.49 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:45:53,878 epoch 7 - iter 462/773 - loss 0.00964801 - time (sec): 41.53 - samples/sec: 1779.93 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:46:00,494 epoch 7 - iter 539/773 - loss 0.00899137 - time (sec): 48.15 - samples/sec: 1785.57 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:46:07,442 epoch 7 - iter 616/773 - loss 0.00899506 - time (sec): 55.10 - samples/sec: 1798.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:46:14,354 epoch 7 - iter 693/773 - loss 0.00947777 - time (sec): 62.01 - samples/sec: 1804.02 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:46:21,321 epoch 7 - iter 770/773 - loss 0.01042790 - time (sec): 68.98 - samples/sec: 1793.18 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:46:21,605 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:21,605 EPOCH 7 done: loss 0.0105 - lr: 0.000010
2023-10-17 09:46:24,548 DEV : loss 0.10294033586978912 - f1-score (micro avg) 0.7984
2023-10-17 09:46:24,575 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:31,444 epoch 8 - iter 77/773 - loss 0.01082701 - time (sec): 6.87 - samples/sec: 1802.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:46:38,217 epoch 8 - iter 154/773 - loss 0.00937227 - time (sec): 13.64 - samples/sec: 1852.12 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:46:44,941 epoch 8 - iter 231/773 - loss 0.01054096 - time (sec): 20.36 - samples/sec: 1835.19 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:46:51,562 epoch 8 - iter 308/773 - loss 0.00981916 - time (sec): 26.99 - samples/sec: 1833.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:46:58,389 epoch 8 - iter 385/773 - loss 0.00891815 - time (sec): 33.81 - samples/sec: 1819.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:47:05,424 epoch 8 - iter 462/773 - loss 0.00820688 - time (sec): 40.85 - samples/sec: 1827.76 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:47:12,186 epoch 8 - iter 539/773 - loss 0.00762464 - time (sec): 47.61 - samples/sec: 1840.44 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:47:18,915 epoch 8 - iter 616/773 - loss 0.00722670 - time (sec): 54.34 - samples/sec: 1830.88 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:47:26,084 epoch 8 - iter 693/773 - loss 0.00708273 - time (sec): 61.51 - samples/sec: 1804.20 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:47:33,076 epoch 8 - iter 770/773 - loss 0.00691715 - time (sec): 68.50 - samples/sec: 1809.37 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:47:33,365 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:33,366 EPOCH 8 done: loss 0.0069 - lr: 0.000007
2023-10-17 09:47:36,625 DEV : loss 0.12377041578292847 - f1-score (micro avg) 0.7854
2023-10-17 09:47:36,665 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:43,762 epoch 9 - iter 77/773 - loss 0.00448652 - time (sec): 7.09 - samples/sec: 1776.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:47:50,939 epoch 9 - iter 154/773 - loss 0.00413824 - time (sec): 14.27 - samples/sec: 1718.66 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:47:58,038 epoch 9 - iter 231/773 - loss 0.00450347 - time (sec): 21.37 - samples/sec: 1750.13 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:48:04,825 epoch 9 - iter 308/773 - loss 0.00438937 - time (sec): 28.16 - samples/sec: 1744.81 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:48:11,751 epoch 9 - iter 385/773 - loss 0.00389918 - time (sec): 35.08 - samples/sec: 1764.04 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:48:18,655 epoch 9 - iter 462/773 - loss 0.00409811 - time (sec): 41.99 - samples/sec: 1762.60 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:48:25,972 epoch 9 - iter 539/773 - loss 0.00408809 - time (sec): 49.30 - samples/sec: 1762.28 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:33,178 epoch 9 - iter 616/773 - loss 0.00422096 - time (sec): 56.51 - samples/sec: 1752.15 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:40,631 epoch 9 - iter 693/773 - loss 0.00424621 - time (sec): 63.96 - samples/sec: 1756.00 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:47,886 epoch 9 - iter 770/773 - loss 0.00460245 - time (sec): 71.22 - samples/sec: 1738.54 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:48:48,169 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:48,169 EPOCH 9 done: loss 0.0046 - lr: 0.000003
2023-10-17 09:48:51,211 DEV : loss 0.11872641742229462 - f1-score (micro avg) 0.8089
2023-10-17 09:48:51,241 saving best model
2023-10-17 09:48:51,815 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:58,687 epoch 10 - iter 77/773 - loss 0.00222354 - time (sec): 6.87 - samples/sec: 1822.68 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:49:05,736 epoch 10 - iter 154/773 - loss 0.00255318 - time (sec): 13.92 - samples/sec: 1781.09 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:49:12,757 epoch 10 - iter 231/773 - loss 0.00220584 - time (sec): 20.94 - samples/sec: 1807.28 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:49:19,469 epoch 10 - iter 308/773 - loss 0.00272066 - time (sec): 27.65 - samples/sec: 1817.54 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:49:26,263 epoch 10 - iter 385/773 - loss 0.00296102 - time (sec): 34.45 - samples/sec: 1814.48 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:49:32,975 epoch 10 - iter 462/773 - loss 0.00304331 - time (sec): 41.16 - samples/sec: 1803.57 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:49:40,073 epoch 10 - iter 539/773 - loss 0.00318944 - time (sec): 48.26 - samples/sec: 1803.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:49:47,138 epoch 10 - iter 616/773 - loss 0.00312936 - time (sec): 55.32 - samples/sec: 1786.57 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:49:54,283 epoch 10 - iter 693/773 - loss 0.00302278 - time (sec): 62.47 - samples/sec: 1784.02 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:50:01,888 epoch 10 - iter 770/773 - loss 0.00307193 - time (sec): 70.07 - samples/sec: 1767.38 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:50:02,150 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:02,150 EPOCH 10 done: loss 0.0031 - lr: 0.000000
2023-10-17 09:50:05,066 DEV : loss 0.12313356250524521 - f1-score (micro avg) 0.7967
2023-10-17 09:50:05,705 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:05,708 Loading model from best epoch ...
2023-10-17 09:50:08,303 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 09:50:17,304 Results:
- F-score (micro) 0.8152
- F-score (macro) 0.7257
- Accuracy 0.7076

By class:
              precision    recall  f1-score   support

         LOC     0.8495    0.8710    0.8601       946
    BUILDING     0.6301    0.5892    0.6089       185
      STREET     0.7018    0.7143    0.7080        56

   micro avg     0.8108    0.8197    0.8152      1187
   macro avg     0.7271    0.7248    0.7257      1187
weighted avg     0.8083    0.8197    0.8138      1187

2023-10-17 09:50:17,304 ----------------------------------------------------------------------------------------------------
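
Note on the averaged rows in the final report: they can be re-derived from the per-class rows. The sketch below is not part of the logged run. It assumes the usual definitions (macro = unweighted mean over classes, weighted = support-weighted mean, micro F1 = harmonic mean of the micro precision and recall), and it copies the rounded values from the table, so agreement is only expected to about three decimal places. All variable and function names are illustrative.

```python
# Per-class rows copied from the "By class" table above:
# label -> (precision, recall, f1, support). Values are rounded to 4 places.
per_class = {
    "LOC": (0.8495, 0.8710, 0.8601, 946),
    "BUILDING": (0.6301, 0.5892, 0.6089, 185),
    "STREET": (0.7018, 0.7143, 0.7080, 56),
}


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (0.0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(row[3] for row in per_class.values())

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: each class F1 is weighted by its number of gold spans.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1 from the micro-averaged precision and recall reported in the log.
micro_f1 = harmonic_f1(0.8108, 0.8197)

print(f"support     {total_support}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The gap between micro F1 and macro F1 reflects class imbalance: LOC accounts for 946 of the 1187 test spans, so the micro score is dominated by the best-performing class, while the macro score gives BUILDING and STREET equal weight.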