2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
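The shapes printed in the module dump above are enough for a back-of-the-envelope size estimate of this bert-tiny backbone. The sketch below is illustrative (variable names are mine); it counts the weight matrices, biases, and LayerNorms listed above:

```python
# Parameter count estimated from the module shapes printed in the log above.
vocab, max_pos, hidden, inter, n_layers = 32001, 512, 128, 512, 2

def linear(n_in, n_out):
    return n_in * n_out + n_out  # weight matrix + bias vector

# word/position/token-type embeddings + embedding LayerNorm (weight + bias)
embeddings = (vocab + max_pos + 2) * hidden + 2 * hidden
per_layer = (
    4 * linear(hidden, hidden)   # query, key, value, attention output
    + linear(hidden, inter)      # intermediate dense
    + linear(inter, hidden)      # output dense
    + 2 * 2 * hidden             # two LayerNorms (weight + bias each)
)
pooler = linear(hidden, hidden)
total = embeddings + n_layers * per_layer + pooler
print(f"{total:,}")  # 4,575,232 -- roughly 4.6M parameters
```

Almost all of the budget sits in the word embeddings (32001 × 128 ≈ 4.1M); the two transformer layers contribute only about 0.4M.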
2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 Train: 3575 sentences
2023-10-18 17:43:04,080 (train_with_dev=False, train_with_test=False)
2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 Training Params:
2023-10-18 17:43:04,080 - learning_rate: "3e-05"
2023-10-18 17:43:04,080 - mini_batch_size: "8"
2023-10-18 17:43:04,080 - max_epochs: "10"
2023-10-18 17:43:04,080 - shuffle: "True"
2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 Plugins:
2023-10-18 17:43:04,080 - TensorboardLogger
2023-10-18 17:43:04,080 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,081 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 17:43:04,081 - metric: "('micro avg', 'f1-score')"
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,081 Computation:
2023-10-18 17:43:04,081 - compute on device: cuda:0
2023-10-18 17:43:04,081 - embedding storage: none
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,081 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
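The `LinearScheduler | warmup_fraction: '0.1'` plugin explains the lr column in the iteration lines below: the rate ramps linearly up to 3e-05 over the first 10% of all optimizer steps (447 of 447 × 10 ≈ 4470), then decays linearly to zero. A minimal sketch of that schedule (function name and step arithmetic are illustrative, not Flair's internals):

```python
def linear_warmup_lr(step, total_steps=4470, base_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to zero (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# The peak is reached right around the end of epoch 1, then the rate falls off.
print(linear_warmup_lr(0))     # 0.0
print(linear_warmup_lr(447))   # 3e-05 (end of warmup)
print(linear_warmup_lr(4470))  # 0.0 (end of training)
```

This matches the log: lr climbs through epoch 1, peaks near 0.000030 at the start of epoch 2, and reaches 0.000000 in epoch 10.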
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,081 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 17:43:05,144 epoch 1 - iter 44/447 - loss 3.65841700 - time (sec): 1.06 - samples/sec: 7713.80 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:43:06,199 epoch 1 - iter 88/447 - loss 3.57458239 - time (sec): 2.12 - samples/sec: 7828.32 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:43:07,234 epoch 1 - iter 132/447 - loss 3.39036802 - time (sec): 3.15 - samples/sec: 7956.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:43:08,235 epoch 1 - iter 176/447 - loss 3.17329702 - time (sec): 4.15 - samples/sec: 8053.02 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:43:09,270 epoch 1 - iter 220/447 - loss 2.85941374 - time (sec): 5.19 - samples/sec: 8162.15 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:43:10,278 epoch 1 - iter 264/447 - loss 2.55730823 - time (sec): 6.20 - samples/sec: 8256.04 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:43:11,302 epoch 1 - iter 308/447 - loss 2.29715011 - time (sec): 7.22 - samples/sec: 8268.17 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:43:12,297 epoch 1 - iter 352/447 - loss 2.08270657 - time (sec): 8.22 - samples/sec: 8344.94 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:43:13,298 epoch 1 - iter 396/447 - loss 1.92246771 - time (sec): 9.22 - samples/sec: 8373.56 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:43:14,289 epoch 1 - iter 440/447 - loss 1.79687681 - time (sec): 10.21 - samples/sec: 8344.92 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:43:14,437 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:14,437 EPOCH 1 done: loss 1.7781 - lr: 0.000029
2023-10-18 17:43:16,615 DEV : loss 0.4775766432285309 - f1-score (micro avg) 0.0
2023-10-18 17:43:16,640 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:17,685 epoch 2 - iter 44/447 - loss 0.59589924 - time (sec): 1.04 - samples/sec: 9017.00 - lr: 0.000030 - momentum: 0.000000
2023-10-18 17:43:18,715 epoch 2 - iter 88/447 - loss 0.56401892 - time (sec): 2.07 - samples/sec: 8811.19 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:43:19,789 epoch 2 - iter 132/447 - loss 0.56470757 - time (sec): 3.15 - samples/sec: 8633.50 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:43:20,799 epoch 2 - iter 176/447 - loss 0.56520829 - time (sec): 4.16 - samples/sec: 8345.06 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:43:21,845 epoch 2 - iter 220/447 - loss 0.55623726 - time (sec): 5.20 - samples/sec: 8399.40 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:43:22,847 epoch 2 - iter 264/447 - loss 0.54920417 - time (sec): 6.21 - samples/sec: 8342.39 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:43:23,868 epoch 2 - iter 308/447 - loss 0.55616306 - time (sec): 7.23 - samples/sec: 8312.48 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:43:24,933 epoch 2 - iter 352/447 - loss 0.55121105 - time (sec): 8.29 - samples/sec: 8275.02 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:43:25,962 epoch 2 - iter 396/447 - loss 0.55138490 - time (sec): 9.32 - samples/sec: 8261.89 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:43:26,977 epoch 2 - iter 440/447 - loss 0.54548733 - time (sec): 10.34 - samples/sec: 8232.48 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:43:27,142 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:27,142 EPOCH 2 done: loss 0.5453 - lr: 0.000027
2023-10-18 17:43:32,386 DEV : loss 0.3954547643661499 - f1-score (micro avg) 0.0047
2023-10-18 17:43:32,412 saving best model
2023-10-18 17:43:32,446 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:33,438 epoch 3 - iter 44/447 - loss 0.46568998 - time (sec): 0.99 - samples/sec: 7873.33 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:43:34,448 epoch 3 - iter 88/447 - loss 0.48584638 - time (sec): 2.00 - samples/sec: 8243.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:43:35,426 epoch 3 - iter 132/447 - loss 0.48381208 - time (sec): 2.98 - samples/sec: 8146.00 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:43:36,433 epoch 3 - iter 176/447 - loss 0.46960679 - time (sec): 3.99 - samples/sec: 8330.61 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:43:37,453 epoch 3 - iter 220/447 - loss 0.46623827 - time (sec): 5.01 - samples/sec: 8371.67 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:43:38,479 epoch 3 - iter 264/447 - loss 0.46159907 - time (sec): 6.03 - samples/sec: 8435.20 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:43:39,455 epoch 3 - iter 308/447 - loss 0.45514585 - time (sec): 7.01 - samples/sec: 8383.41 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:43:40,425 epoch 3 - iter 352/447 - loss 0.45263927 - time (sec): 7.98 - samples/sec: 8415.36 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:43:41,417 epoch 3 - iter 396/447 - loss 0.45536448 - time (sec): 8.97 - samples/sec: 8428.60 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:43:42,459 epoch 3 - iter 440/447 - loss 0.45158447 - time (sec): 10.01 - samples/sec: 8515.27 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:43:42,605 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:42,606 EPOCH 3 done: loss 0.4501 - lr: 0.000023
2023-10-18 17:43:47,818 DEV : loss 0.34006714820861816 - f1-score (micro avg) 0.1762
2023-10-18 17:43:47,844 saving best model
2023-10-18 17:43:47,878 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:48,930 epoch 4 - iter 44/447 - loss 0.40096435 - time (sec): 1.05 - samples/sec: 8641.77 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:43:49,946 epoch 4 - iter 88/447 - loss 0.42313975 - time (sec): 2.07 - samples/sec: 8518.95 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:43:50,941 epoch 4 - iter 132/447 - loss 0.43615643 - time (sec): 3.06 - samples/sec: 8605.19 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:43:51,951 epoch 4 - iter 176/447 - loss 0.43038311 - time (sec): 4.07 - samples/sec: 8692.14 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:43:52,941 epoch 4 - iter 220/447 - loss 0.42491578 - time (sec): 5.06 - samples/sec: 8589.74 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:43:53,937 epoch 4 - iter 264/447 - loss 0.41934533 - time (sec): 6.06 - samples/sec: 8550.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:43:54,941 epoch 4 - iter 308/447 - loss 0.41228277 - time (sec): 7.06 - samples/sec: 8549.29 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:43:56,014 epoch 4 - iter 352/447 - loss 0.40713092 - time (sec): 8.14 - samples/sec: 8457.46 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:43:57,053 epoch 4 - iter 396/447 - loss 0.41039188 - time (sec): 9.17 - samples/sec: 8419.34 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:43:58,086 epoch 4 - iter 440/447 - loss 0.40812616 - time (sec): 10.21 - samples/sec: 8356.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:43:58,234 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:58,234 EPOCH 4 done: loss 0.4086 - lr: 0.000020
2023-10-18 17:44:03,498 DEV : loss 0.33860334753990173 - f1-score (micro avg) 0.2534
2023-10-18 17:44:03,524 saving best model
2023-10-18 17:44:03,556 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:04,590 epoch 5 - iter 44/447 - loss 0.36098289 - time (sec): 1.03 - samples/sec: 7878.70 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:44:05,601 epoch 5 - iter 88/447 - loss 0.39788915 - time (sec): 2.04 - samples/sec: 7724.01 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:44:06,617 epoch 5 - iter 132/447 - loss 0.37581248 - time (sec): 3.06 - samples/sec: 7780.33 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:44:07,695 epoch 5 - iter 176/447 - loss 0.37494525 - time (sec): 4.14 - samples/sec: 8048.39 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:44:08,706 epoch 5 - iter 220/447 - loss 0.37534730 - time (sec): 5.15 - samples/sec: 8161.94 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:44:09,732 epoch 5 - iter 264/447 - loss 0.37572997 - time (sec): 6.18 - samples/sec: 8254.05 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:44:10,768 epoch 5 - iter 308/447 - loss 0.37803576 - time (sec): 7.21 - samples/sec: 8247.46 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:44:11,797 epoch 5 - iter 352/447 - loss 0.38296535 - time (sec): 8.24 - samples/sec: 8275.23 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:44:12,634 epoch 5 - iter 396/447 - loss 0.38360699 - time (sec): 9.08 - samples/sec: 8430.98 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:44:13,491 epoch 5 - iter 440/447 - loss 0.38337258 - time (sec): 9.93 - samples/sec: 8583.20 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:44:13,633 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:13,633 EPOCH 5 done: loss 0.3872 - lr: 0.000017
2023-10-18 17:44:18,644 DEV : loss 0.3213781416416168 - f1-score (micro avg) 0.2904
2023-10-18 17:44:18,669 saving best model
2023-10-18 17:44:18,713 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:19,667 epoch 6 - iter 44/447 - loss 0.39724277 - time (sec): 0.95 - samples/sec: 8724.03 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:44:20,673 epoch 6 - iter 88/447 - loss 0.35175566 - time (sec): 1.96 - samples/sec: 8833.06 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:44:21,765 epoch 6 - iter 132/447 - loss 0.33678911 - time (sec): 3.05 - samples/sec: 8716.95 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:44:22,726 epoch 6 - iter 176/447 - loss 0.35169461 - time (sec): 4.01 - samples/sec: 8618.93 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:44:24,015 epoch 6 - iter 220/447 - loss 0.36233877 - time (sec): 5.30 - samples/sec: 8117.05 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:44:24,995 epoch 6 - iter 264/447 - loss 0.36252351 - time (sec): 6.28 - samples/sec: 8149.49 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:44:25,991 epoch 6 - iter 308/447 - loss 0.36318080 - time (sec): 7.28 - samples/sec: 8180.81 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:44:27,018 epoch 6 - iter 352/447 - loss 0.36149952 - time (sec): 8.31 - samples/sec: 8230.02 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:44:28,069 epoch 6 - iter 396/447 - loss 0.36522263 - time (sec): 9.36 - samples/sec: 8223.75 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:44:29,035 epoch 6 - iter 440/447 - loss 0.36696941 - time (sec): 10.32 - samples/sec: 8242.81 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:44:29,194 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:29,194 EPOCH 6 done: loss 0.3667 - lr: 0.000013
2023-10-18 17:44:34,156 DEV : loss 0.3145124316215515 - f1-score (micro avg) 0.3153
2023-10-18 17:44:34,182 saving best model
2023-10-18 17:44:34,215 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:35,237 epoch 7 - iter 44/447 - loss 0.33254949 - time (sec): 1.02 - samples/sec: 8026.91 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:44:36,289 epoch 7 - iter 88/447 - loss 0.33867551 - time (sec): 2.07 - samples/sec: 8083.21 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:44:37,293 epoch 7 - iter 132/447 - loss 0.34664698 - time (sec): 3.08 - samples/sec: 7937.45 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:44:38,336 epoch 7 - iter 176/447 - loss 0.35008385 - time (sec): 4.12 - samples/sec: 8059.32 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:44:39,361 epoch 7 - iter 220/447 - loss 0.34738790 - time (sec): 5.15 - samples/sec: 8109.70 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:44:40,353 epoch 7 - iter 264/447 - loss 0.34675104 - time (sec): 6.14 - samples/sec: 8102.12 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:44:41,348 epoch 7 - iter 308/447 - loss 0.35204448 - time (sec): 7.13 - samples/sec: 8185.69 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:44:42,408 epoch 7 - iter 352/447 - loss 0.35087997 - time (sec): 8.19 - samples/sec: 8314.56 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:44:43,385 epoch 7 - iter 396/447 - loss 0.34971556 - time (sec): 9.17 - samples/sec: 8296.34 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:44:44,485 epoch 7 - iter 440/447 - loss 0.35367941 - time (sec): 10.27 - samples/sec: 8320.95 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:44:44,643 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:44,644 EPOCH 7 done: loss 0.3538 - lr: 0.000010
2023-10-18 17:44:49,903 DEV : loss 0.30769509077072144 - f1-score (micro avg) 0.3193
2023-10-18 17:44:49,929 saving best model
2023-10-18 17:44:49,962 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:50,958 epoch 8 - iter 44/447 - loss 0.33422276 - time (sec): 0.99 - samples/sec: 8165.06 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:44:51,968 epoch 8 - iter 88/447 - loss 0.33262216 - time (sec): 2.01 - samples/sec: 8318.97 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:44:52,970 epoch 8 - iter 132/447 - loss 0.33401019 - time (sec): 3.01 - samples/sec: 8188.09 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:44:54,003 epoch 8 - iter 176/447 - loss 0.34047320 - time (sec): 4.04 - samples/sec: 8112.27 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:44:55,056 epoch 8 - iter 220/447 - loss 0.34431160 - time (sec): 5.09 - samples/sec: 8109.95 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:44:56,114 epoch 8 - iter 264/447 - loss 0.35113437 - time (sec): 6.15 - samples/sec: 8116.36 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:44:57,217 epoch 8 - iter 308/447 - loss 0.34790518 - time (sec): 7.25 - samples/sec: 8267.13 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:44:58,227 epoch 8 - iter 352/447 - loss 0.35030628 - time (sec): 8.26 - samples/sec: 8323.70 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:44:59,235 epoch 8 - iter 396/447 - loss 0.35029828 - time (sec): 9.27 - samples/sec: 8291.91 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:45:00,271 epoch 8 - iter 440/447 - loss 0.34699832 - time (sec): 10.31 - samples/sec: 8284.79 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:45:00,431 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:00,431 EPOCH 8 done: loss 0.3456 - lr: 0.000007
2023-10-18 17:45:05,789 DEV : loss 0.3056319057941437 - f1-score (micro avg) 0.3282
2023-10-18 17:45:05,815 saving best model
2023-10-18 17:45:05,850 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:06,908 epoch 9 - iter 44/447 - loss 0.33033877 - time (sec): 1.06 - samples/sec: 7764.12 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:45:07,972 epoch 9 - iter 88/447 - loss 0.32438109 - time (sec): 2.12 - samples/sec: 7960.53 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:45:08,980 epoch 9 - iter 132/447 - loss 0.33253674 - time (sec): 3.13 - samples/sec: 8029.34 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:45:10,020 epoch 9 - iter 176/447 - loss 0.32688422 - time (sec): 4.17 - samples/sec: 8005.44 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:45:11,047 epoch 9 - iter 220/447 - loss 0.32819605 - time (sec): 5.20 - samples/sec: 7999.77 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:45:12,072 epoch 9 - iter 264/447 - loss 0.32313458 - time (sec): 6.22 - samples/sec: 8130.90 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:45:13,119 epoch 9 - iter 308/447 - loss 0.32996163 - time (sec): 7.27 - samples/sec: 8219.22 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:45:14,091 epoch 9 - iter 352/447 - loss 0.33131371 - time (sec): 8.24 - samples/sec: 8213.97 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:45:15,118 epoch 9 - iter 396/447 - loss 0.33364538 - time (sec): 9.27 - samples/sec: 8198.10 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:45:16,158 epoch 9 - iter 440/447 - loss 0.33168082 - time (sec): 10.31 - samples/sec: 8290.47 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:45:16,310 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:16,310 EPOCH 9 done: loss 0.3324 - lr: 0.000003
2023-10-18 17:45:21,581 DEV : loss 0.30642732977867126 - f1-score (micro avg) 0.3224
2023-10-18 17:45:21,606 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:22,469 epoch 10 - iter 44/447 - loss 0.31571016 - time (sec): 0.86 - samples/sec: 9189.95 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:45:23,521 epoch 10 - iter 88/447 - loss 0.30383316 - time (sec): 1.91 - samples/sec: 8967.01 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:45:24,548 epoch 10 - iter 132/447 - loss 0.31717265 - time (sec): 2.94 - samples/sec: 8708.30 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:45:25,537 epoch 10 - iter 176/447 - loss 0.32673786 - time (sec): 3.93 - samples/sec: 8612.34 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:45:26,610 epoch 10 - iter 220/447 - loss 0.33065968 - time (sec): 5.00 - samples/sec: 8635.05 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:45:27,600 epoch 10 - iter 264/447 - loss 0.32901583 - time (sec): 5.99 - samples/sec: 8607.52 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:45:28,597 epoch 10 - iter 308/447 - loss 0.33533615 - time (sec): 6.99 - samples/sec: 8580.40 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:45:29,593 epoch 10 - iter 352/447 - loss 0.33627064 - time (sec): 7.99 - samples/sec: 8559.46 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:45:30,644 epoch 10 - iter 396/447 - loss 0.33572652 - time (sec): 9.04 - samples/sec: 8512.58 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:45:31,694 epoch 10 - iter 440/447 - loss 0.33227974 - time (sec): 10.09 - samples/sec: 8452.37 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:45:31,857 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:31,857 EPOCH 10 done: loss 0.3314 - lr: 0.000000
2023-10-18 17:45:36,818 DEV : loss 0.3060940206050873 - f1-score (micro avg) 0.3249
2023-10-18 17:45:36,875 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:36,875 Loading model from best epoch ...
2023-10-18 17:45:36,950 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
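The 21 tags listed above are a BIOES scheme (single/begin/end/inside spans) over five entity types plus the outside tag `O`, which also accounts for the `out_features=21` of the final linear layer. A quick check:

```python
# Rebuild the tag dictionary from the five entity types, in the order logged above.
entity_types = ["loc", "pers", "org", "prod", "time"]
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 21
print(tags[:5])   # ['O', 'S-loc', 'B-loc', 'E-loc', 'I-loc']
```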
2023-10-18 17:45:39,234
Results:
- F-score (micro) 0.3356
- F-score (macro) 0.1275
- Accuracy 0.2112
By class:
              precision    recall  f1-score   support

         loc     0.4857    0.5117    0.4984       596
        pers     0.1595    0.1231    0.1390       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3905    0.2942    0.3356      1176
   macro avg     0.1290    0.1270    0.1275      1176
weighted avg     0.2913    0.2942    0.2919      1176
2023-10-18 17:45:39,234 ----------------------------------------------------------------------------------------------------
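The averages in the final table are internally consistent: micro F1 is the harmonic mean of the micro precision and recall, and macro F1 is the unweighted mean of the five per-class F1 scores. A quick verification on the numbers printed above:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

micro = f1(0.3905, 0.2942)                      # from the "micro avg" row
macro = sum([0.4984, 0.1390, 0.0, 0.0, 0.0]) / 5  # mean of per-class f1 column

print(round(micro, 4))  # ~0.3356, matching the "F-score (micro)" line
print(round(macro, 4))  # ~0.1275, matching the "F-score (macro)" line
```

Note how the macro score is dragged down by the three classes (org, prod, time) on which this tiny model predicts nothing correctly.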