2023-10-17 09:37:27,236 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,238 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 09:37:27,238 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,238 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 09:37:27,238 ----------------------------------------------------------------------------------------------------
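A minimal sketch (not part of the original log) of how the corpus listed above could be loaded with Flair. The parameter names of the NER_HIPE_2022 loader are assumptions inferred from the dataset path printed above (topres19th, English, v2.1, with document separators); check them against the installed Flair version.

from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(
    dataset_name="topres19th",    # assumed from the dataset path above
    language="en",
    version="v2.1",               # assumed from the dataset path above
    add_document_separator=True,  # the path ends in "with_doc_seperator"
)
print(corpus)  # should report 6183 train + 680 dev + 2113 test sentences

# Label dictionary for the "ner" layer (13 tags in this run, see the tagger printout)
label_dict = corpus.make_label_dictionary(label_type="ner")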
2023-10-17 09:37:27,239 Train: 6183 sentences
2023-10-17 09:37:27,239 (train_with_dev=False, train_with_test=False)
2023-10-17 09:37:27,239 ----------------------------------------------------------------------------------------------------
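Continuing the sketch above: embeddings and tagger set up to match the architecture printed at the top of this log (ELECTRA backbone, last layer only, first-subtoken pooling, linear head with 13 outputs, no CRF or RNN). The backbone identifier is an assumption inferred from the "Model training base path" further down and may differ from the exact checkpoint used here.

from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumption
    layers="-1",               # "layers-1" in the base path: last hidden layer only
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,            # unused here, since no RNN is added
    embeddings=embeddings,
    tag_dictionary=label_dict,  # from the corpus sketch above
    tag_type="ner",
    use_crf=False,              # "crfFalse" in the base path
    use_rnn=False,              # plain linear head on the embeddings, as printed above
    reproject_embeddings=False,
)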
2023-10-17 09:37:27,239 Training Params:
2023-10-17 09:37:27,239 - learning_rate: "3e-05"
2023-10-17 09:37:27,239 - mini_batch_size: "8"
2023-10-17 09:37:27,239 - max_epochs: "10"
2023-10-17 09:37:27,239 - shuffle: "True"
2023-10-17 09:37:27,239 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,239 Plugins:
2023-10-17 09:37:27,239 - TensorboardLogger
2023-10-17 09:37:27,239 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:37:27,239 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,239 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:37:27,240 - metric: "('micro avg', 'f1-score')"
2023-10-17 09:37:27,240 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,240 Computation:
2023-10-17 09:37:27,240 - compute on device: cuda:0
2023-10-17 09:37:27,240 - embedding storage: none
2023-10-17 09:37:27,240 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,240 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 09:37:27,240 ----------------------------------------------------------------------------------------------------
2023-10-17 09:37:27,240 ----------------------------------------------------------------------------------------------------
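Continuing the sketch: a fine-tuning call that mirrors the training parameters listed above (learning rate 3e-05, mini-batch size 8, 10 epochs, shuffling, linear schedule with 10% warmup, best model selected on micro F1). Argument names beyond learning_rate, mini_batch_size and max_epochs are assumptions and may vary across Flair versions.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=3e-05,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,          # assumption: matches "shuffle: True" above
    warmup_fraction=0.1,   # assumption: maps to the LinearScheduler plugin above
)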
2023-10-17 09:37:27,240 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 09:37:34,576 epoch 1 - iter 77/773 - loss 2.33879566 - time (sec): 7.33 - samples/sec: 1752.72 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:37:41,584 epoch 1 - iter 154/773 - loss 1.42646834 - time (sec): 14.34 - samples/sec: 1749.19 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:37:48,222 epoch 1 - iter 231/773 - loss 1.01107736 - time (sec): 20.98 - samples/sec: 1783.57 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:37:55,237 epoch 1 - iter 308/773 - loss 0.78648939 - time (sec): 28.00 - samples/sec: 1800.87 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:38:02,692 epoch 1 - iter 385/773 - loss 0.65487679 - time (sec): 35.45 - samples/sec: 1767.05 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:38:09,721 epoch 1 - iter 462/773 - loss 0.56517862 - time (sec): 42.48 - samples/sec: 1763.85 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:38:17,445 epoch 1 - iter 539/773 - loss 0.50859951 - time (sec): 50.20 - samples/sec: 1727.34 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:38:24,994 epoch 1 - iter 616/773 - loss 0.46183259 - time (sec): 57.75 - samples/sec: 1710.88 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:38:32,538 epoch 1 - iter 693/773 - loss 0.41835378 - time (sec): 65.30 - samples/sec: 1708.79 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:38:39,593 epoch 1 - iter 770/773 - loss 0.38499874 - time (sec): 72.35 - samples/sec: 1713.83 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:38:39,844 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:39,844 EPOCH 1 done: loss 0.3842 - lr: 0.000030
2023-10-17 09:38:42,623 DEV : loss 0.05754069611430168 - f1-score (micro avg) 0.7534
2023-10-17 09:38:42,650 saving best model
2023-10-17 09:38:43,192 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:49,786 epoch 2 - iter 77/773 - loss 0.10261422 - time (sec): 6.59 - samples/sec: 1792.64 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:38:56,506 epoch 2 - iter 154/773 - loss 0.08620224 - time (sec): 13.31 - samples/sec: 1814.77 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:39:03,586 epoch 2 - iter 231/773 - loss 0.08217454 - time (sec): 20.39 - samples/sec: 1848.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:39:10,410 epoch 2 - iter 308/773 - loss 0.08124803 - time (sec): 27.22 - samples/sec: 1841.98 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:39:17,499 epoch 2 - iter 385/773 - loss 0.07988568 - time (sec): 34.31 - samples/sec: 1829.13 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:39:25,060 epoch 2 - iter 462/773 - loss 0.07997375 - time (sec): 41.87 - samples/sec: 1789.65 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:39:32,052 epoch 2 - iter 539/773 - loss 0.07763047 - time (sec): 48.86 - samples/sec: 1790.10 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:39:39,204 epoch 2 - iter 616/773 - loss 0.07703563 - time (sec): 56.01 - samples/sec: 1794.93 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:46,211 epoch 2 - iter 693/773 - loss 0.07619810 - time (sec): 63.02 - samples/sec: 1780.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:53,478 epoch 2 - iter 770/773 - loss 0.07711628 - time (sec): 70.28 - samples/sec: 1764.60 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:53,755 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:53,755 EPOCH 2 done: loss 0.0772 - lr: 0.000027
2023-10-17 09:39:56,582 DEV : loss 0.047075219452381134 - f1-score (micro avg) 0.7597
2023-10-17 09:39:56,610 saving best model
2023-10-17 09:39:57,996 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:05,357 epoch 3 - iter 77/773 - loss 0.04621094 - time (sec): 7.36 - samples/sec: 1588.98 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:40:13,046 epoch 3 - iter 154/773 - loss 0.04711698 - time (sec): 15.05 - samples/sec: 1650.32 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:40:20,395 epoch 3 - iter 231/773 - loss 0.04627990 - time (sec): 22.39 - samples/sec: 1705.13 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:40:27,418 epoch 3 - iter 308/773 - loss 0.04368407 - time (sec): 29.42 - samples/sec: 1719.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:40:34,373 epoch 3 - iter 385/773 - loss 0.04513632 - time (sec): 36.37 - samples/sec: 1716.82 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:40:41,366 epoch 3 - iter 462/773 - loss 0.04669092 - time (sec): 43.37 - samples/sec: 1731.18 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:40:48,481 epoch 3 - iter 539/773 - loss 0.04627462 - time (sec): 50.48 - samples/sec: 1726.60 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:40:55,508 epoch 3 - iter 616/773 - loss 0.04565734 - time (sec): 57.51 - samples/sec: 1730.70 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:41:02,972 epoch 3 - iter 693/773 - loss 0.04735239 - time (sec): 64.97 - samples/sec: 1698.13 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:41:10,809 epoch 3 - iter 770/773 - loss 0.04789770 - time (sec): 72.81 - samples/sec: 1701.70 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:41:11,097 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:11,097 EPOCH 3 done: loss 0.0479 - lr: 0.000023
2023-10-17 09:41:14,193 DEV : loss 0.04879758879542351 - f1-score (micro avg) 0.804
2023-10-17 09:41:14,221 saving best model
2023-10-17 09:41:15,684 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:23,490 epoch 4 - iter 77/773 - loss 0.02803532 - time (sec): 7.80 - samples/sec: 1647.61 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:41:31,172 epoch 4 - iter 154/773 - loss 0.02614174 - time (sec): 15.49 - samples/sec: 1583.78 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:41:38,912 epoch 4 - iter 231/773 - loss 0.02740037 - time (sec): 23.22 - samples/sec: 1615.22 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:41:46,418 epoch 4 - iter 308/773 - loss 0.02810327 - time (sec): 30.73 - samples/sec: 1627.11 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:41:54,017 epoch 4 - iter 385/773 - loss 0.02856760 - time (sec): 38.33 - samples/sec: 1628.38 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:42:02,005 epoch 4 - iter 462/773 - loss 0.03151706 - time (sec): 46.32 - samples/sec: 1625.94 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:42:09,711 epoch 4 - iter 539/773 - loss 0.03118289 - time (sec): 54.02 - samples/sec: 1629.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:42:17,213 epoch 4 - iter 616/773 - loss 0.03069938 - time (sec): 61.53 - samples/sec: 1620.50 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:42:24,916 epoch 4 - iter 693/773 - loss 0.03102882 - time (sec): 69.23 - samples/sec: 1610.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:42:32,858 epoch 4 - iter 770/773 - loss 0.03091728 - time (sec): 77.17 - samples/sec: 1606.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:42:33,116 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:33,116 EPOCH 4 done: loss 0.0311 - lr: 0.000020
2023-10-17 09:42:36,004 DEV : loss 0.0732564851641655 - f1-score (micro avg) 0.7938
2023-10-17 09:42:36,033 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:43,080 epoch 5 - iter 77/773 - loss 0.02562462 - time (sec): 7.04 - samples/sec: 1684.07 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:42:50,338 epoch 5 - iter 154/773 - loss 0.02186778 - time (sec): 14.30 - samples/sec: 1698.22 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:42:57,692 epoch 5 - iter 231/773 - loss 0.02197283 - time (sec): 21.66 - samples/sec: 1670.07 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:43:05,104 epoch 5 - iter 308/773 - loss 0.02085800 - time (sec): 29.07 - samples/sec: 1665.35 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:43:12,720 epoch 5 - iter 385/773 - loss 0.02066640 - time (sec): 36.68 - samples/sec: 1673.37 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:43:19,967 epoch 5 - iter 462/773 - loss 0.02032896 - time (sec): 43.93 - samples/sec: 1684.46 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:43:27,490 epoch 5 - iter 539/773 - loss 0.01942875 - time (sec): 51.45 - samples/sec: 1681.14 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:43:35,178 epoch 5 - iter 616/773 - loss 0.02036285 - time (sec): 59.14 - samples/sec: 1668.33 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:43:42,777 epoch 5 - iter 693/773 - loss 0.02050199 - time (sec): 66.74 - samples/sec: 1676.73 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:43:50,579 epoch 5 - iter 770/773 - loss 0.02101369 - time (sec): 74.54 - samples/sec: 1660.09 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:43:50,903 ----------------------------------------------------------------------------------------------------
2023-10-17 09:43:50,904 EPOCH 5 done: loss 0.0212 - lr: 0.000017
2023-10-17 09:43:54,291 DEV : loss 0.09599114209413528 - f1-score (micro avg) 0.7787
2023-10-17 09:43:54,321 ----------------------------------------------------------------------------------------------------
2023-10-17 09:44:02,198 epoch 6 - iter 77/773 - loss 0.01113512 - time (sec): 7.87 - samples/sec: 1629.16 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:44:09,915 epoch 6 - iter 154/773 - loss 0.01041077 - time (sec): 15.59 - samples/sec: 1650.97 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:44:17,666 epoch 6 - iter 231/773 - loss 0.01184309 - time (sec): 23.34 - samples/sec: 1629.14 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:44:25,575 epoch 6 - iter 308/773 - loss 0.01381827 - time (sec): 31.25 - samples/sec: 1617.95 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:44:32,527 epoch 6 - iter 385/773 - loss 0.01530116 - time (sec): 38.20 - samples/sec: 1663.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:44:39,617 epoch 6 - iter 462/773 - loss 0.01664026 - time (sec): 45.29 - samples/sec: 1658.96 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:44:46,872 epoch 6 - iter 539/773 - loss 0.01619538 - time (sec): 52.55 - samples/sec: 1653.81 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:44:54,077 epoch 6 - iter 616/773 - loss 0.01625589 - time (sec): 59.75 - samples/sec: 1652.08 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:45:01,550 epoch 6 - iter 693/773 - loss 0.01623770 - time (sec): 67.22 - samples/sec: 1656.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:45:09,080 epoch 6 - iter 770/773 - loss 0.01610707 - time (sec): 74.75 - samples/sec: 1657.18 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:45:09,364 ----------------------------------------------------------------------------------------------------
2023-10-17 09:45:09,364 EPOCH 6 done: loss 0.0161 - lr: 0.000013
2023-10-17 09:45:12,313 DEV : loss 0.10213357210159302 - f1-score (micro avg) 0.7975
2023-10-17 09:45:12,341 ----------------------------------------------------------------------------------------------------
2023-10-17 09:45:19,108 epoch 7 - iter 77/773 - loss 0.00659626 - time (sec): 6.77 - samples/sec: 1731.83 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:45:25,993 epoch 7 - iter 154/773 - loss 0.01050483 - time (sec): 13.65 - samples/sec: 1738.79 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:45:32,858 epoch 7 - iter 231/773 - loss 0.01031709 - time (sec): 20.52 - samples/sec: 1766.06 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:45:39,920 epoch 7 - iter 308/773 - loss 0.01094286 - time (sec): 27.58 - samples/sec: 1774.72 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:45:46,969 epoch 7 - iter 385/773 - loss 0.01060798 - time (sec): 34.63 - samples/sec: 1778.49 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:45:53,878 epoch 7 - iter 462/773 - loss 0.00964801 - time (sec): 41.53 - samples/sec: 1779.93 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:46:00,494 epoch 7 - iter 539/773 - loss 0.00899137 - time (sec): 48.15 - samples/sec: 1785.57 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:46:07,442 epoch 7 - iter 616/773 - loss 0.00899506 - time (sec): 55.10 - samples/sec: 1798.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:46:14,354 epoch 7 - iter 693/773 - loss 0.00947777 - time (sec): 62.01 - samples/sec: 1804.02 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:46:21,321 epoch 7 - iter 770/773 - loss 0.01042790 - time (sec): 68.98 - samples/sec: 1793.18 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:46:21,605 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:21,605 EPOCH 7 done: loss 0.0105 - lr: 0.000010
2023-10-17 09:46:24,548 DEV : loss 0.10294033586978912 - f1-score (micro avg) 0.7984
2023-10-17 09:46:24,575 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:31,444 epoch 8 - iter 77/773 - loss 0.01082701 - time (sec): 6.87 - samples/sec: 1802.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:46:38,217 epoch 8 - iter 154/773 - loss 0.00937227 - time (sec): 13.64 - samples/sec: 1852.12 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:46:44,941 epoch 8 - iter 231/773 - loss 0.01054096 - time (sec): 20.36 - samples/sec: 1835.19 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:46:51,562 epoch 8 - iter 308/773 - loss 0.00981916 - time (sec): 26.99 - samples/sec: 1833.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:46:58,389 epoch 8 - iter 385/773 - loss 0.00891815 - time (sec): 33.81 - samples/sec: 1819.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:47:05,424 epoch 8 - iter 462/773 - loss 0.00820688 - time (sec): 40.85 - samples/sec: 1827.76 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:47:12,186 epoch 8 - iter 539/773 - loss 0.00762464 - time (sec): 47.61 - samples/sec: 1840.44 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:47:18,915 epoch 8 - iter 616/773 - loss 0.00722670 - time (sec): 54.34 - samples/sec: 1830.88 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:47:26,084 epoch 8 - iter 693/773 - loss 0.00708273 - time (sec): 61.51 - samples/sec: 1804.20 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:47:33,076 epoch 8 - iter 770/773 - loss 0.00691715 - time (sec): 68.50 - samples/sec: 1809.37 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:47:33,365 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:33,366 EPOCH 8 done: loss 0.0069 - lr: 0.000007
2023-10-17 09:47:36,625 DEV : loss 0.12377041578292847 - f1-score (micro avg) 0.7854
2023-10-17 09:47:36,665 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:43,762 epoch 9 - iter 77/773 - loss 0.00448652 - time (sec): 7.09 - samples/sec: 1776.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:47:50,939 epoch 9 - iter 154/773 - loss 0.00413824 - time (sec): 14.27 - samples/sec: 1718.66 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:47:58,038 epoch 9 - iter 231/773 - loss 0.00450347 - time (sec): 21.37 - samples/sec: 1750.13 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:48:04,825 epoch 9 - iter 308/773 - loss 0.00438937 - time (sec): 28.16 - samples/sec: 1744.81 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:48:11,751 epoch 9 - iter 385/773 - loss 0.00389918 - time (sec): 35.08 - samples/sec: 1764.04 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:48:18,655 epoch 9 - iter 462/773 - loss 0.00409811 - time (sec): 41.99 - samples/sec: 1762.60 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:48:25,972 epoch 9 - iter 539/773 - loss 0.00408809 - time (sec): 49.30 - samples/sec: 1762.28 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:33,178 epoch 9 - iter 616/773 - loss 0.00422096 - time (sec): 56.51 - samples/sec: 1752.15 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:40,631 epoch 9 - iter 693/773 - loss 0.00424621 - time (sec): 63.96 - samples/sec: 1756.00 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:47,886 epoch 9 - iter 770/773 - loss 0.00460245 - time (sec): 71.22 - samples/sec: 1738.54 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:48:48,169 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:48,169 EPOCH 9 done: loss 0.0046 - lr: 0.000003
2023-10-17 09:48:51,211 DEV : loss 0.11872641742229462 - f1-score (micro avg) 0.8089
2023-10-17 09:48:51,241 saving best model
2023-10-17 09:48:51,815 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:58,687 epoch 10 - iter 77/773 - loss 0.00222354 - time (sec): 6.87 - samples/sec: 1822.68 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:49:05,736 epoch 10 - iter 154/773 - loss 0.00255318 - time (sec): 13.92 - samples/sec: 1781.09 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:49:12,757 epoch 10 - iter 231/773 - loss 0.00220584 - time (sec): 20.94 - samples/sec: 1807.28 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:49:19,469 epoch 10 - iter 308/773 - loss 0.00272066 - time (sec): 27.65 - samples/sec: 1817.54 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:49:26,263 epoch 10 - iter 385/773 - loss 0.00296102 - time (sec): 34.45 - samples/sec: 1814.48 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:49:32,975 epoch 10 - iter 462/773 - loss 0.00304331 - time (sec): 41.16 - samples/sec: 1803.57 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:49:40,073 epoch 10 - iter 539/773 - loss 0.00318944 - time (sec): 48.26 - samples/sec: 1803.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:49:47,138 epoch 10 - iter 616/773 - loss 0.00312936 - time (sec): 55.32 - samples/sec: 1786.57 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:49:54,283 epoch 10 - iter 693/773 - loss 0.00302278 - time (sec): 62.47 - samples/sec: 1784.02 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:50:01,888 epoch 10 - iter 770/773 - loss 0.00307193 - time (sec): 70.07 - samples/sec: 1767.38 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:50:02,150 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:02,150 EPOCH 10 done: loss 0.0031 - lr: 0.000000
2023-10-17 09:50:05,066 DEV : loss 0.12313356250524521 - f1-score (micro avg) 0.7967
2023-10-17 09:50:05,705 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:05,708 Loading model from best epoch ...
2023-10-17 09:50:08,303 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 09:50:17,304
Results:
- F-score (micro) 0.8152
- F-score (macro) 0.7257
- Accuracy 0.7076
By class:
              precision    recall  f1-score   support

         LOC     0.8495    0.8710    0.8601       946
    BUILDING     0.6301    0.5892    0.6089       185
      STREET     0.7018    0.7143    0.7080        56

   micro avg     0.8108    0.8197    0.8152      1187
   macro avg     0.7271    0.7248    0.7257      1187
weighted avg     0.8083    0.8197    0.8138      1187
2023-10-17 09:50:17,304 ----------------------------------------------------------------------------------------------------
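A minimal sketch (not part of the log) of loading the saved best model and tagging a sentence with the 13-tag LOC/BUILDING/STREET scheme reported above. The checkpoint path is an assumption built from the base path and the "best-model.pt" mentioned in the log; the example sentence is made up.

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

sentence = Sentence("He lived on Baker Street in London .")
tagger.predict(sentence)

# Print the predicted LOC / BUILDING / STREET spans
for span in sentence.get_spans("ner"):
    print(span)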