2024-03-26 16:29:38,174 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(31103, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2024-03-26 16:29:38,175 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 16:29:38,175 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 Train: 758 sentences
2024-03-26 16:29:38,175 (train_with_dev=False, train_with_test=False)
2024-03-26 16:29:38,175 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 Training Params:
2024-03-26 16:29:38,175 - learning_rate: "3e-05"
2024-03-26 16:29:38,175 - mini_batch_size: "8"
2024-03-26 16:29:38,175 - max_epochs: "10"
2024-03-26 16:29:38,175 - shuffle: "True"
2024-03-26 16:29:38,175 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 Plugins:
2024-03-26 16:29:38,175 - TensorboardLogger
2024-03-26 16:29:38,175 - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 16:29:38,175 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 16:29:38,175 - metric: "('micro avg', 'f1-score')"
2024-03-26 16:29:38,175 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 Computation:
2024-03-26 16:29:38,175 - compute on device: cuda:0
2024-03-26 16:29:38,175 - embedding storage: none
2024-03-26 16:29:38,175 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 Model training base path: "flair-co-funer-german_dbmdz_bert_base-bs8-e10-lr3e-05-5"
2024-03-26 16:29:38,175 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:38,175 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 16:29:40,026 epoch 1 - iter 9/95 - loss 3.26803712 - time (sec): 1.85 - samples/sec: 1694.05 - lr: 0.000003 - momentum: 0.000000
2024-03-26 16:29:41,878 epoch 1 - iter 18/95 - loss 3.16168639 - time (sec): 3.70 - samples/sec: 1791.47 - lr: 0.000005 - momentum: 0.000000
2024-03-26 16:29:44,152 epoch 1 - iter 27/95 - loss 2.95928444 - time (sec): 5.98 - samples/sec: 1735.30 - lr: 0.000008 - momentum: 0.000000
2024-03-26 16:29:45,625 epoch 1 - iter 36/95 - loss 2.77842393 - time (sec): 7.45 - samples/sec: 1813.39 - lr: 0.000011 - momentum: 0.000000
2024-03-26 16:29:47,756 epoch 1 - iter 45/95 - loss 2.59475970 - time (sec): 9.58 - samples/sec: 1793.81 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:29:49,318 epoch 1 - iter 54/95 - loss 2.43721303 - time (sec): 11.14 - samples/sec: 1816.01 - lr: 0.000017 - momentum: 0.000000
2024-03-26 16:29:50,944 epoch 1 - iter 63/95 - loss 2.29872757 - time (sec): 12.77 - samples/sec: 1832.95 - lr: 0.000020 - momentum: 0.000000
2024-03-26 16:29:52,801 epoch 1 - iter 72/95 - loss 2.16340330 - time (sec): 14.62 - samples/sec: 1825.60 - lr: 0.000022 - momentum: 0.000000
2024-03-26 16:29:54,859 epoch 1 - iter 81/95 - loss 2.01632633 - time (sec): 16.68 - samples/sec: 1807.15 - lr: 0.000025 - momentum: 0.000000
2024-03-26 16:29:56,473 epoch 1 - iter 90/95 - loss 1.90367232 - time (sec): 18.30 - samples/sec: 1799.51 - lr: 0.000028 - momentum: 0.000000
2024-03-26 16:29:57,220 ----------------------------------------------------------------------------------------------------
2024-03-26 16:29:57,220 EPOCH 1 done: loss 1.8467 - lr: 0.000028
2024-03-26 16:29:58,132 DEV : loss 0.5060820579528809 - f1-score (micro avg) 0.6598
2024-03-26 16:29:58,133 saving best model
2024-03-26 16:29:58,427 ----------------------------------------------------------------------------------------------------
2024-03-26 16:30:00,681 epoch 2 - iter 9/95 - loss 0.59828895 - time (sec): 2.25 - samples/sec: 1693.18 - lr: 0.000030 - momentum: 0.000000
2024-03-26 16:30:02,580 epoch 2 - iter 18/95 - loss 0.55104273 - time (sec): 4.15 - samples/sec: 1684.44 - lr: 0.000029 - momentum: 0.000000
2024-03-26 16:30:04,892 epoch 2 - iter 27/95 - loss 0.49400970 - time (sec): 6.46 - samples/sec: 1654.65 - lr: 0.000029 - momentum: 0.000000
2024-03-26 16:30:06,250 epoch 2 - iter 36/95 - loss 0.47958751 - time (sec): 7.82 - samples/sec: 1766.70 - lr: 0.000029 - momentum: 0.000000
2024-03-26 16:30:08,193 epoch 2 - iter 45/95 - loss 0.44983894 - time (sec): 9.77 - samples/sec: 1728.38 - lr: 0.000028 - momentum: 0.000000
2024-03-26 16:30:09,499 epoch 2 - iter 54/95 - loss 0.44502692 - time (sec): 11.07 - samples/sec: 1775.28 - lr: 0.000028 - momentum: 0.000000
2024-03-26 16:30:11,061 epoch 2 - iter 63/95 - loss 0.43034408 - time (sec): 12.63 - samples/sec: 1791.36 - lr: 0.000028 - momentum: 0.000000
2024-03-26 16:30:13,119 epoch 2 - iter 72/95 - loss 0.42003547 - time (sec): 14.69 - samples/sec: 1784.04 - lr: 0.000028 - momentum: 0.000000
2024-03-26 16:30:15,027 epoch 2 - iter 81/95 - loss 0.42048863 - time (sec): 16.60 - samples/sec: 1782.80 - lr: 0.000027 - momentum: 0.000000
2024-03-26 16:30:16,956 epoch 2 - iter 90/95 - loss 0.40536615 - time (sec): 18.53 - samples/sec: 1785.20 - lr: 0.000027 - momentum: 0.000000
2024-03-26 16:30:17,543 ----------------------------------------------------------------------------------------------------
2024-03-26 16:30:17,543 EPOCH 2 done: loss 0.4047 - lr: 0.000027
2024-03-26 16:30:18,472 DEV : loss 0.3058844208717346 - f1-score (micro avg) 0.8146
2024-03-26 16:30:18,473 saving best model
2024-03-26 16:30:18,934 ----------------------------------------------------------------------------------------------------
2024-03-26 16:30:20,131 epoch 3 - iter 9/95 - loss 0.36501735 - time (sec): 1.20 - samples/sec: 2166.44 - lr: 0.000026 - momentum: 0.000000
2024-03-26 16:30:22,371 epoch 3 - iter 18/95 - loss 0.28272292 - time (sec): 3.44 - samples/sec: 1867.73 - lr: 0.000026 - momentum: 0.000000
2024-03-26 16:30:24,064 epoch 3 - iter 27/95 - loss 0.28437566 - time (sec): 5.13 - samples/sec: 1902.56 - lr: 0.000026 - momentum: 0.000000
2024-03-26 16:30:25,767 epoch 3 - iter 36/95 - loss 0.26517000 - time (sec): 6.83 - samples/sec: 1927.57 - lr: 0.000025 - momentum: 0.000000
2024-03-26 16:30:27,211 epoch 3 - iter 45/95 - loss 0.25081771 - time (sec): 8.28 - samples/sec: 1918.73 - lr: 0.000025 - momentum: 0.000000
2024-03-26 16:30:29,326 epoch 3 - iter 54/95 - loss 0.24159034 - time (sec): 10.39 - samples/sec: 1859.04 - lr: 0.000025 - momentum: 0.000000
2024-03-26 16:30:31,005 epoch 3 - iter 63/95 - loss 0.23814016 - time (sec): 12.07 - samples/sec: 1843.90 - lr: 0.000025 - momentum: 0.000000
2024-03-26 16:30:33,274 epoch 3 - iter 72/95 - loss 0.22675339 - time (sec): 14.34 - samples/sec: 1808.88 - lr: 0.000024 - momentum: 0.000000
2024-03-26 16:30:35,461 epoch 3 - iter 81/95 - loss 0.22929871 - time (sec): 16.53 - samples/sec: 1801.11 - lr: 0.000024 - momentum: 0.000000
2024-03-26 16:30:37,212 epoch 3 - iter 90/95 - loss 0.22536947 - time (sec): 18.28 - samples/sec: 1790.28 - lr: 0.000024 - momentum: 0.000000
2024-03-26 16:30:38,093 ----------------------------------------------------------------------------------------------------
2024-03-26 16:30:38,093 EPOCH 3 done: loss 0.2228 - lr: 0.000024
2024-03-26 16:30:38,991 DEV : loss 0.2297995537519455 - f1-score (micro avg) 0.8511
2024-03-26 16:30:38,992 saving best model
2024-03-26 16:30:39,451 ----------------------------------------------------------------------------------------------------
2024-03-26 16:30:42,214 epoch 4 - iter 9/95 - loss 0.13046065 - time (sec): 2.76 - samples/sec: 1545.06 - lr: 0.000023 - momentum: 0.000000
2024-03-26 16:30:43,247 epoch 4 - iter 18/95 - loss 0.16091422 - time (sec): 3.80 - samples/sec: 1753.28 - lr: 0.000023 - momentum: 0.000000
2024-03-26 16:30:45,734 epoch 4 - iter 27/95 - loss 0.14821611 - time (sec): 6.28 - samples/sec: 1693.16 - lr: 0.000022 - momentum: 0.000000
2024-03-26 16:30:48,334 epoch 4 - iter 36/95 - loss 0.14666483 - time (sec): 8.88 - samples/sec: 1633.76 - lr: 0.000022 - momentum: 0.000000
2024-03-26 16:30:50,034 epoch 4 - iter 45/95 - loss 0.13947575 - time (sec): 10.58 - samples/sec: 1664.40 - lr: 0.000022 - momentum: 0.000000
2024-03-26 16:30:51,736 epoch 4 - iter 54/95 - loss 0.13842701 - time (sec): 12.28 - samples/sec: 1676.99 - lr: 0.000022 - momentum: 0.000000
2024-03-26 16:30:53,637 epoch 4 - iter 63/95 - loss 0.13985858 - time (sec): 14.19 - samples/sec: 1702.88 - lr: 0.000021 - momentum: 0.000000
2024-03-26 16:30:55,307 epoch 4 - iter 72/95 - loss 0.13970582 - time (sec): 15.86 - samples/sec: 1750.57 - lr: 0.000021 - momentum: 0.000000
2024-03-26 16:30:56,345 epoch 4 - iter 81/95 - loss 0.14079165 - time (sec): 16.89 - samples/sec: 1788.59 - lr: 0.000021 - momentum: 0.000000
2024-03-26 16:30:57,748 epoch 4 - iter 90/95 - loss 0.14099099 - time (sec): 18.30 - samples/sec: 1813.90 - lr: 0.000020 - momentum: 0.000000
2024-03-26 16:30:58,295 ----------------------------------------------------------------------------------------------------
2024-03-26 16:30:58,295 EPOCH 4 done: loss 0.1421 - lr: 0.000020
2024-03-26 16:30:59,197 DEV : loss 0.19711963832378387 - f1-score (micro avg) 0.8793
2024-03-26 16:30:59,198 saving best model
2024-03-26 16:30:59,650 ----------------------------------------------------------------------------------------------------
2024-03-26 16:31:01,298 epoch 5 - iter 9/95 - loss 0.12454574 - time (sec): 1.65 - samples/sec: 1989.41 - lr: 0.000020 - momentum: 0.000000
2024-03-26 16:31:03,260 epoch 5 - iter 18/95 - loss 0.11089770 - time (sec): 3.61 - samples/sec: 1973.58 - lr: 0.000019 - momentum: 0.000000
2024-03-26 16:31:05,378 epoch 5 - iter 27/95 - loss 0.09499923 - time (sec): 5.73 - samples/sec: 1849.03 - lr: 0.000019 - momentum: 0.000000
2024-03-26 16:31:06,710 epoch 5 - iter 36/95 - loss 0.10839084 - time (sec): 7.06 - samples/sec: 1904.77 - lr: 0.000019 - momentum: 0.000000
2024-03-26 16:31:08,803 epoch 5 - iter 45/95 - loss 0.10543794 - time (sec): 9.15 - samples/sec: 1861.82 - lr: 0.000019 - momentum: 0.000000
2024-03-26 16:31:09,980 epoch 5 - iter 54/95 - loss 0.10672654 - time (sec): 10.33 - samples/sec: 1895.66 - lr: 0.000018 - momentum: 0.000000
2024-03-26 16:31:11,457 epoch 5 - iter 63/95 - loss 0.11147467 - time (sec): 11.80 - samples/sec: 1909.79 - lr: 0.000018 - momentum: 0.000000
2024-03-26 16:31:13,380 epoch 5 - iter 72/95 - loss 0.11029657 - time (sec): 13.73 - samples/sec: 1881.09 - lr: 0.000018 - momentum: 0.000000
2024-03-26 16:31:15,150 epoch 5 - iter 81/95 - loss 0.10777118 - time (sec): 15.50 - samples/sec: 1868.93 - lr: 0.000017 - momentum: 0.000000
2024-03-26 16:31:17,654 epoch 5 - iter 90/95 - loss 0.10489690 - time (sec): 18.00 - samples/sec: 1826.39 - lr: 0.000017 - momentum: 0.000000
2024-03-26 16:31:18,628 ----------------------------------------------------------------------------------------------------
2024-03-26 16:31:18,628 EPOCH 5 done: loss 0.1030 - lr: 0.000017
2024-03-26 16:31:19,547 DEV : loss 0.20496806502342224 - f1-score (micro avg) 0.9068
2024-03-26 16:31:19,549 saving best model
2024-03-26 16:31:20,001 ----------------------------------------------------------------------------------------------------
2024-03-26 16:31:21,951 epoch 6 - iter 9/95 - loss 0.08427941 - time (sec): 1.95 - samples/sec: 1672.91 - lr: 0.000016 - momentum: 0.000000
2024-03-26 16:31:24,382 epoch 6 - iter 18/95 - loss 0.09516629 - time (sec): 4.38 - samples/sec: 1693.06 - lr: 0.000016 - momentum: 0.000000
2024-03-26 16:31:25,533 epoch 6 - iter 27/95 - loss 0.10718208 - time (sec): 5.53 - samples/sec: 1787.27 - lr: 0.000016 - momentum: 0.000000
2024-03-26 16:31:27,132 epoch 6 - iter 36/95 - loss 0.09756097 - time (sec): 7.13 - samples/sec: 1809.69 - lr: 0.000016 - momentum: 0.000000
2024-03-26 16:31:29,076 epoch 6 - iter 45/95 - loss 0.09333884 - time (sec): 9.07 - samples/sec: 1800.44 - lr: 0.000015 - momentum: 0.000000
2024-03-26 16:31:31,224 epoch 6 - iter 54/95 - loss 0.08922901 - time (sec): 11.22 - samples/sec: 1763.30 - lr: 0.000015 - momentum: 0.000000
2024-03-26 16:31:32,909 epoch 6 - iter 63/95 - loss 0.09258864 - time (sec): 12.91 - samples/sec: 1779.44 - lr: 0.000015 - momentum: 0.000000
2024-03-26 16:31:34,471 epoch 6 - iter 72/95 - loss 0.09222911 - time (sec): 14.47 - samples/sec: 1799.91 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:31:35,710 epoch 6 - iter 81/95 - loss 0.08983495 - time (sec): 15.71 - samples/sec: 1829.96 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:31:37,583 epoch 6 - iter 90/95 - loss 0.08564523 - time (sec): 17.58 - samples/sec: 1827.65 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:31:39,095 ----------------------------------------------------------------------------------------------------
2024-03-26 16:31:39,095 EPOCH 6 done: loss 0.0821 - lr: 0.000014
2024-03-26 16:31:40,004 DEV : loss 0.2126017063856125 - f1-score (micro avg) 0.911
2024-03-26 16:31:40,005 saving best model
2024-03-26 16:31:40,454 ----------------------------------------------------------------------------------------------------
2024-03-26 16:31:42,131 epoch 7 - iter 9/95 - loss 0.04698518 - time (sec): 1.68 - samples/sec: 1878.40 - lr: 0.000013 - momentum: 0.000000
2024-03-26 16:31:43,630 epoch 7 - iter 18/95 - loss 0.06511447 - time (sec): 3.17 - samples/sec: 1853.03 - lr: 0.000013 - momentum: 0.000000
2024-03-26 16:31:44,922 epoch 7 - iter 27/95 - loss 0.07786885 - time (sec): 4.47 - samples/sec: 1895.87 - lr: 0.000013 - momentum: 0.000000
2024-03-26 16:31:47,160 epoch 7 - iter 36/95 - loss 0.06772422 - time (sec): 6.70 - samples/sec: 1895.81 - lr: 0.000012 - momentum: 0.000000
2024-03-26 16:31:49,073 epoch 7 - iter 45/95 - loss 0.07114282 - time (sec): 8.62 - samples/sec: 1889.46 - lr: 0.000012 - momentum: 0.000000
2024-03-26 16:31:50,763 epoch 7 - iter 54/95 - loss 0.06993658 - time (sec): 10.31 - samples/sec: 1881.41 - lr: 0.000012 - momentum: 0.000000
2024-03-26 16:31:52,310 epoch 7 - iter 63/95 - loss 0.06884486 - time (sec): 11.85 - samples/sec: 1901.33 - lr: 0.000011 - momentum: 0.000000
2024-03-26 16:31:53,832 epoch 7 - iter 72/95 - loss 0.06833758 - time (sec): 13.38 - samples/sec: 1889.58 - lr: 0.000011 - momentum: 0.000000
2024-03-26 16:31:56,531 epoch 7 - iter 81/95 - loss 0.06575132 - time (sec): 16.08 - samples/sec: 1827.53 - lr: 0.000011 - momentum: 0.000000
2024-03-26 16:31:58,132 epoch 7 - iter 90/95 - loss 0.06568883 - time (sec): 17.68 - samples/sec: 1837.28 - lr: 0.000010 - momentum: 0.000000
2024-03-26 16:31:59,304 ----------------------------------------------------------------------------------------------------
2024-03-26 16:31:59,305 EPOCH 7 done: loss 0.0645 - lr: 0.000010
2024-03-26 16:32:00,244 DEV : loss 0.20466142892837524 - f1-score (micro avg) 0.9148
2024-03-26 16:32:00,245 saving best model
2024-03-26 16:32:00,700 ----------------------------------------------------------------------------------------------------
2024-03-26 16:32:02,861 epoch 8 - iter 9/95 - loss 0.06658622 - time (sec): 2.16 - samples/sec: 1565.45 - lr: 0.000010 - momentum: 0.000000
2024-03-26 16:32:04,372 epoch 8 - iter 18/95 - loss 0.04907228 - time (sec): 3.67 - samples/sec: 1663.12 - lr: 0.000010 - momentum: 0.000000
2024-03-26 16:32:06,375 epoch 8 - iter 27/95 - loss 0.05007111 - time (sec): 5.67 - samples/sec: 1729.59 - lr: 0.000009 - momentum: 0.000000
2024-03-26 16:32:08,344 epoch 8 - iter 36/95 - loss 0.04963051 - time (sec): 7.64 - samples/sec: 1761.80 - lr: 0.000009 - momentum: 0.000000
2024-03-26 16:32:09,761 epoch 8 - iter 45/95 - loss 0.04655741 - time (sec): 9.06 - samples/sec: 1816.85 - lr: 0.000009 - momentum: 0.000000
2024-03-26 16:32:11,249 epoch 8 - iter 54/95 - loss 0.04860590 - time (sec): 10.55 - samples/sec: 1884.34 - lr: 0.000008 - momentum: 0.000000
2024-03-26 16:32:12,817 epoch 8 - iter 63/95 - loss 0.05087392 - time (sec): 12.12 - samples/sec: 1877.07 - lr: 0.000008 - momentum: 0.000000
2024-03-26 16:32:14,908 epoch 8 - iter 72/95 - loss 0.04871077 - time (sec): 14.21 - samples/sec: 1843.05 - lr: 0.000008 - momentum: 0.000000
2024-03-26 16:32:16,476 epoch 8 - iter 81/95 - loss 0.05079554 - time (sec): 15.77 - samples/sec: 1866.81 - lr: 0.000007 - momentum: 0.000000
2024-03-26 16:32:18,569 epoch 8 - iter 90/95 - loss 0.05202209 - time (sec): 17.87 - samples/sec: 1840.24 - lr: 0.000007 - momentum: 0.000000
2024-03-26 16:32:19,204 ----------------------------------------------------------------------------------------------------
2024-03-26 16:32:19,204 EPOCH 8 done: loss 0.0518 - lr: 0.000007
2024-03-26 16:32:20,118 DEV : loss 0.1933898627758026 - f1-score (micro avg) 0.9237
2024-03-26 16:32:20,120 saving best model
2024-03-26 16:32:20,591 ----------------------------------------------------------------------------------------------------
2024-03-26 16:32:23,224 epoch 9 - iter 9/95 - loss 0.03140119 - time (sec): 2.63 - samples/sec: 1638.93 - lr: 0.000007 - momentum: 0.000000
2024-03-26 16:32:24,787 epoch 9 - iter 18/95 - loss 0.03679889 - time (sec): 4.20 - samples/sec: 1723.84 - lr: 0.000006 - momentum: 0.000000
2024-03-26 16:32:27,258 epoch 9 - iter 27/95 - loss 0.04302958 - time (sec): 6.67 - samples/sec: 1695.72 - lr: 0.000006 - momentum: 0.000000
2024-03-26 16:32:29,088 epoch 9 - iter 36/95 - loss 0.04876538 - time (sec): 8.50 - samples/sec: 1705.94 - lr: 0.000006 - momentum: 0.000000
2024-03-26 16:32:30,251 epoch 9 - iter 45/95 - loss 0.04541126 - time (sec): 9.66 - samples/sec: 1765.93 - lr: 0.000005 - momentum: 0.000000
2024-03-26 16:32:31,996 epoch 9 - iter 54/95 - loss 0.04161509 - time (sec): 11.40 - samples/sec: 1759.36 - lr: 0.000005 - momentum: 0.000000
2024-03-26 16:32:33,389 epoch 9 - iter 63/95 - loss 0.04353728 - time (sec): 12.80 - samples/sec: 1805.12 - lr: 0.000005 - momentum: 0.000000
2024-03-26 16:32:34,556 epoch 9 - iter 72/95 - loss 0.04204169 - time (sec): 13.96 - samples/sec: 1855.33 - lr: 0.000004 - momentum: 0.000000
2024-03-26 16:32:36,082 epoch 9 - iter 81/95 - loss 0.03960533 - time (sec): 15.49 - samples/sec: 1854.55 - lr: 0.000004 - momentum: 0.000000
2024-03-26 16:32:38,829 epoch 9 - iter 90/95 - loss 0.04161902 - time (sec): 18.24 - samples/sec: 1808.12 - lr: 0.000004 - momentum: 0.000000
2024-03-26 16:32:39,594 ----------------------------------------------------------------------------------------------------
2024-03-26 16:32:39,594 EPOCH 9 done: loss 0.0402 - lr: 0.000004
2024-03-26 16:32:40,526 DEV : loss 0.2051326334476471 - f1-score (micro avg) 0.9281
2024-03-26 16:32:40,527 saving best model
2024-03-26 16:32:40,994 ----------------------------------------------------------------------------------------------------
2024-03-26 16:32:43,436 epoch 10 - iter 9/95 - loss 0.03423904 - time (sec): 2.44 - samples/sec: 1653.71 - lr: 0.000003 - momentum: 0.000000
2024-03-26 16:32:45,012 epoch 10 - iter 18/95 - loss 0.02828668 - time (sec): 4.02 - samples/sec: 1735.92 - lr: 0.000003 - momentum: 0.000000
2024-03-26 16:32:46,977 epoch 10 - iter 27/95 - loss 0.02533777 - time (sec): 5.98 - samples/sec: 1684.70 - lr: 0.000003 - momentum: 0.000000
2024-03-26 16:32:48,985 epoch 10 - iter 36/95 - loss 0.02854483 - time (sec): 7.99 - samples/sec: 1707.95 - lr: 0.000002 - momentum: 0.000000
2024-03-26 16:32:50,853 epoch 10 - iter 45/95 - loss 0.02832813 - time (sec): 9.86 - samples/sec: 1720.80 - lr: 0.000002 - momentum: 0.000000
2024-03-26 16:32:51,990 epoch 10 - iter 54/95 - loss 0.03171872 - time (sec): 11.00 - samples/sec: 1781.85 - lr: 0.000002 - momentum: 0.000000
2024-03-26 16:32:53,616 epoch 10 - iter 63/95 - loss 0.03580322 - time (sec): 12.62 - samples/sec: 1803.06 - lr: 0.000001 - momentum: 0.000000
2024-03-26 16:32:55,431 epoch 10 - iter 72/95 - loss 0.03568848 - time (sec): 14.44 - samples/sec: 1792.10 - lr: 0.000001 - momentum: 0.000000
2024-03-26 16:32:57,103 epoch 10 - iter 81/95 - loss 0.03880628 - time (sec): 16.11 - samples/sec: 1802.60 - lr: 0.000001 - momentum: 0.000000
2024-03-26 16:32:59,872 epoch 10 - iter 90/95 - loss 0.03625832 - time (sec): 18.88 - samples/sec: 1766.14 - lr: 0.000000 - momentum: 0.000000
2024-03-26 16:33:00,433 ----------------------------------------------------------------------------------------------------
2024-03-26 16:33:00,433 EPOCH 10 done: loss 0.0366 - lr: 0.000000
2024-03-26 16:33:01,363 DEV : loss 0.20437297224998474 - f1-score (micro avg) 0.926
2024-03-26 16:33:01,644 ----------------------------------------------------------------------------------------------------
2024-03-26 16:33:01,644 Loading model from best epoch ...
2024-03-26 16:33:02,547 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 16:33:03,330
Results:
- F-score (micro) 0.9016
- F-score (macro) 0.6853
- Accuracy 0.8243
By class:
              precision    recall  f1-score   support
Unternehmen      0.8893    0.8759    0.8826       266
Auslagerung      0.8692    0.9076    0.8880       249
Ort              0.9565    0.9851    0.9706       134
Software         0.0000    0.0000    0.0000         0
micro avg        0.8927    0.9106    0.9016       649
macro avg        0.6788    0.6922    0.6853       649
weighted avg     0.8955    0.9106    0.9028       649
2024-03-26 16:33:03,330 ----------------------------------------------------------------------------------------------------
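
Note on the averages above: the macro F-score (0.6853) sits well below the micro F-score (0.9016) because the Software class has support 0 in the test split and so contributes an F1 of 0.0000 to the unweighted mean. The sketch below recomputes the macro and weighted F1 from the rounded per-class values in the table. It is a plain arithmetic check, not Flair's own evaluation code. The helper name `macro_f1` and the "supported classes only" variant are illustrative assumptions.

```python
# Per-class F1 and support, copied from the "By class" table in the log.
f1_by_class = {
    "Unternehmen": 0.8826,
    "Auslagerung": 0.8880,
    "Ort": 0.9706,
    "Software": 0.0000,  # no Software spans in the test split (support 0)
}
support = {"Unternehmen": 266, "Auslagerung": 249, "Ort": 134, "Software": 0}


def macro_f1(scores):
    """Unweighted mean of per-class F1 (hypothetical helper for this check)."""
    return sum(scores.values()) / len(scores)


# Macro average over all four classes, as reported in the log.
macro_all = macro_f1(f1_by_class)

# Illustrative variant: average only over classes that occur in the test split.
supported = {name: f1 for name, f1 in f1_by_class.items() if support[name] > 0}
macro_supported = macro_f1(supported)

# Support-weighted average; the zero-support class drops out automatically.
total_support = sum(support.values())
weighted = sum(f1_by_class[name] * support[name] for name in support) / total_support

print(f"macro F1 (all classes):       {macro_all:.4f}")
print(f"macro F1 (supported classes): {macro_supported:.4f}")
print(f"weighted F1:                  {weighted:.4f}")
```

Averaged over only the three classes that occur in the test split, the macro F1 comes out near 0.914, much closer to the micro and weighted figures.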