2023-10-19 12:38:44,207 ----------------------------------------------------------------------------------------------------
2023-10-19 12:38:44,207 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-19 12:38:44,207 ----------------------------------------------------------------------------------------------------
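A minimal sketch of how a model matching the architecture printed above could be assembled with Flair. The backbone name (dbmdz/bert-tiny-historic-multilingual-cased) is inferred from the training base path further down, and the constructor arguments are assumptions based on the public Flair API, not the exact script used for this run; the 17 tags are taken from the tag dictionary logged at the end.

```python
from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Assumed backbone, inferred from the base path in this log.
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-tiny-historic-multilingual-cased",
    layers="-1",               # "layers-1" in the run name: last layer only
    subtoken_pooling="first",  # "poolingfirst" in the run name
    fine_tune=True,
)

# 17 BIOES tags, matching the dictionary printed at the end of this log.
label_dict = Dictionary(add_unk=False)
for tag in ["O"] + [f"{p}-{t}" for t in ("LOC", "PER", "ORG", "HumanProd")
                    for p in ("S", "B", "E", "I")]:
    label_dict.add_item(tag)

# No CRF ("crfFalse") and no RNN, so the head is Linear(128, 17) with
# CrossEntropyLoss, as shown in the model summary above.
tagger = SequenceTagger(
    hidden_size=256,           # unused without an RNN/reprojection layer
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)
```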
2023-10-19 12:38:44,207 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-19 12:38:44,207 ----------------------------------------------------------------------------------------------------
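A sketch of loading the corpus summarized above. NER_HIPE_2022 is a Flair dataset class; the exact constructor arguments for the German "newseye" configuration are an assumption based on the dataset path in the log, and make_label_dictionary is the version-dependent way to derive the tag dictionary from the corpus instead of building it by hand.

```python
from flair.datasets import NER_HIPE_2022

# Assumed arguments; the log shows the data under
# .../ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
corpus = NER_HIPE_2022(dataset_name="newseye", language="de")
print(corpus)  # expected: 20847 train + 1123 dev + 3350 test sentences

# Alternative to the hand-built Dictionary in the previous sketch:
label_dict = corpus.make_label_dictionary(label_type="ner")
```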
2023-10-19 12:38:44,207 Train: 20847 sentences
2023-10-19 12:38:44,207 (train_with_dev=False, train_with_test=False)
2023-10-19 12:38:44,207 ----------------------------------------------------------------------------------------------------
2023-10-19 12:38:44,207 Training Params:
2023-10-19 12:38:44,207 - learning_rate: "5e-05"
2023-10-19 12:38:44,207 - mini_batch_size: "4"
2023-10-19 12:38:44,207 - max_epochs: "10"
2023-10-19 12:38:44,207 - shuffle: "True"
2023-10-19 12:38:44,207 ----------------------------------------------------------------------------------------------------
2023-10-19 12:38:44,207 Plugins:
2023-10-19 12:38:44,207 - TensorboardLogger
2023-10-19 12:38:44,207 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 12:38:44,207 ----------------------------------------------------------------------------------------------------
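A sketch of the training call implied by the parameters and plugins above, assuming Flair's ModelTrainer.fine_tune(), which uses a linear warmup/decay learning-rate schedule. `tagger` and `corpus` are the objects from the previous sketches; the keyword names reflect the public Flair API, not a verified copy of the original script.

```python
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-"
    "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    main_evaluation_metric=("micro avg", "f1-score"),
    # The LinearScheduler (warmup_fraction 0.1) and TensorboardLogger listed
    # above are trainer plugins; fine_tune attaches the linear scheduler
    # itself, and how the TensorBoard plugin is enabled depends on the Flair
    # version (assumption).
)
```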
2023-10-19 12:38:44,208 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 12:38:44,208 - metric: "('micro avg', 'f1-score')"
2023-10-19 12:38:44,208 ----------------------------------------------------------------------------------------------------
2023-10-19 12:38:44,208 Computation:
2023-10-19 12:38:44,208 - compute on device: cuda:0
2023-10-19 12:38:44,208 - embedding storage: none
2023-10-19 12:38:44,208 ----------------------------------------------------------------------------------------------------
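The computation settings above correspond to selecting the first GPU and keeping no embeddings cached between batches; a minimal sketch:

```python
import torch
import flair

flair.device = torch.device("cuda:0")  # "compute on device: cuda:0"
# "embedding storage: none" corresponds to embeddings_storage_mode="none"
# in the training call, i.e. embeddings are recomputed for every batch.
```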
2023-10-19 12:38:44,208 Model training base path: "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-19 12:38:44,208 ----------------------------------------------------------------------------------------------------
2023-10-19 12:38:44,208 ----------------------------------------------------------------------------------------------------
2023-10-19 12:38:44,208 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 12:38:52,418 epoch 1 - iter 521/5212 - loss 3.42604136 - time (sec): 8.21 - samples/sec: 4698.46 - lr: 0.000005 - momentum: 0.000000
2023-10-19 12:39:00,610 epoch 1 - iter 1042/5212 - loss 2.44104122 - time (sec): 16.40 - samples/sec: 4513.24 - lr: 0.000010 - momentum: 0.000000
2023-10-19 12:39:08,630 epoch 1 - iter 1563/5212 - loss 1.85175851 - time (sec): 24.42 - samples/sec: 4442.18 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:39:17,015 epoch 1 - iter 2084/5212 - loss 1.52346946 - time (sec): 32.81 - samples/sec: 4460.50 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:39:25,253 epoch 1 - iter 2605/5212 - loss 1.33374847 - time (sec): 41.05 - samples/sec: 4415.37 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:39:33,797 epoch 1 - iter 3126/5212 - loss 1.18123210 - time (sec): 49.59 - samples/sec: 4431.28 - lr: 0.000030 - momentum: 0.000000
2023-10-19 12:39:42,091 epoch 1 - iter 3647/5212 - loss 1.06507178 - time (sec): 57.88 - samples/sec: 4460.86 - lr: 0.000035 - momentum: 0.000000
2023-10-19 12:39:50,274 epoch 1 - iter 4168/5212 - loss 0.98648191 - time (sec): 66.07 - samples/sec: 4435.47 - lr: 0.000040 - momentum: 0.000000
2023-10-19 12:39:58,678 epoch 1 - iter 4689/5212 - loss 0.91790985 - time (sec): 74.47 - samples/sec: 4441.81 - lr: 0.000045 - momentum: 0.000000
2023-10-19 12:40:07,124 epoch 1 - iter 5210/5212 - loss 0.86893234 - time (sec): 82.92 - samples/sec: 4427.56 - lr: 0.000050 - momentum: 0.000000
2023-10-19 12:40:07,164 ----------------------------------------------------------------------------------------------------
2023-10-19 12:40:07,164 EPOCH 1 done: loss 0.8683 - lr: 0.000050
2023-10-19 12:40:10,143 DEV : loss 0.14668601751327515 - f1-score (micro avg) 0.2553
2023-10-19 12:40:10,167 saving best model
2023-10-19 12:40:10,195 ----------------------------------------------------------------------------------------------------
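The learning-rate column in the log is consistent with a standard linear warmup/decay schedule over all 10 * 5212 = 52120 updates with warmup_fraction 0.1 (the LinearScheduler plugin listed above): epoch 1 is the warmup ramp to 5e-05, after which the rate decays linearly to 0. A small sketch, assuming that standard formula:

```python
def linear_lr(step, peak=5e-05, total=10 * 5212, warmup_fraction=0.1):
    warmup = int(total * warmup_fraction)   # 5212 warmup steps = epoch 1
    if step < warmup:
        return peak * step / warmup         # linear ramp during epoch 1
    return peak * (total - step) / (total - warmup)  # linear decay to 0

print(round(linear_lr(521), 6))    # ~0.000005, as logged at epoch 1, iter 521
print(round(linear_lr(5210), 6))   # ~0.00005,  as logged at epoch 1, iter 5210
print(round(linear_lr(10422), 6))  # ~0.000044, as logged at epoch 2, iter 5210
```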
2023-10-19 12:40:18,443 epoch 2 - iter 521/5212 - loss 0.36989339 - time (sec): 8.25 - samples/sec: 4548.82 - lr: 0.000049 - momentum: 0.000000
2023-10-19 12:40:26,940 epoch 2 - iter 1042/5212 - loss 0.36460474 - time (sec): 16.74 - samples/sec: 4638.12 - lr: 0.000049 - momentum: 0.000000
2023-10-19 12:40:34,539 epoch 2 - iter 1563/5212 - loss 0.35676632 - time (sec): 24.34 - samples/sec: 4753.46 - lr: 0.000048 - momentum: 0.000000
2023-10-19 12:40:42,578 epoch 2 - iter 2084/5212 - loss 0.35219138 - time (sec): 32.38 - samples/sec: 4628.18 - lr: 0.000048 - momentum: 0.000000
2023-10-19 12:40:50,828 epoch 2 - iter 2605/5212 - loss 0.35698932 - time (sec): 40.63 - samples/sec: 4589.75 - lr: 0.000047 - momentum: 0.000000
2023-10-19 12:40:59,176 epoch 2 - iter 3126/5212 - loss 0.35632972 - time (sec): 48.98 - samples/sec: 4562.33 - lr: 0.000047 - momentum: 0.000000
2023-10-19 12:41:07,474 epoch 2 - iter 3647/5212 - loss 0.35437571 - time (sec): 57.28 - samples/sec: 4520.49 - lr: 0.000046 - momentum: 0.000000
2023-10-19 12:41:15,443 epoch 2 - iter 4168/5212 - loss 0.35077830 - time (sec): 65.25 - samples/sec: 4511.87 - lr: 0.000046 - momentum: 0.000000
2023-10-19 12:41:23,605 epoch 2 - iter 4689/5212 - loss 0.34724407 - time (sec): 73.41 - samples/sec: 4480.54 - lr: 0.000045 - momentum: 0.000000
2023-10-19 12:41:31,941 epoch 2 - iter 5210/5212 - loss 0.34213703 - time (sec): 81.75 - samples/sec: 4493.35 - lr: 0.000044 - momentum: 0.000000
2023-10-19 12:41:31,969 ----------------------------------------------------------------------------------------------------
2023-10-19 12:41:31,969 EPOCH 2 done: loss 0.3421 - lr: 0.000044
2023-10-19 12:41:37,054 DEV : loss 0.1477486938238144 - f1-score (micro avg) 0.268
2023-10-19 12:41:37,079 saving best model
2023-10-19 12:41:37,111 ----------------------------------------------------------------------------------------------------
2023-10-19 12:41:45,513 epoch 3 - iter 521/5212 - loss 0.30625132 - time (sec): 8.40 - samples/sec: 4389.44 - lr: 0.000044 - momentum: 0.000000
2023-10-19 12:41:53,837 epoch 3 - iter 1042/5212 - loss 0.28511737 - time (sec): 16.73 - samples/sec: 4611.01 - lr: 0.000043 - momentum: 0.000000
2023-10-19 12:42:02,178 epoch 3 - iter 1563/5212 - loss 0.29631007 - time (sec): 25.07 - samples/sec: 4461.36 - lr: 0.000043 - momentum: 0.000000
2023-10-19 12:42:10,497 epoch 3 - iter 2084/5212 - loss 0.29906238 - time (sec): 33.39 - samples/sec: 4388.92 - lr: 0.000042 - momentum: 0.000000
2023-10-19 12:42:18,831 epoch 3 - iter 2605/5212 - loss 0.29236860 - time (sec): 41.72 - samples/sec: 4411.83 - lr: 0.000042 - momentum: 0.000000
2023-10-19 12:42:27,376 epoch 3 - iter 3126/5212 - loss 0.29086057 - time (sec): 50.26 - samples/sec: 4484.68 - lr: 0.000041 - momentum: 0.000000
2023-10-19 12:42:35,391 epoch 3 - iter 3647/5212 - loss 0.28709763 - time (sec): 58.28 - samples/sec: 4418.75 - lr: 0.000041 - momentum: 0.000000
2023-10-19 12:42:43,630 epoch 3 - iter 4168/5212 - loss 0.28615399 - time (sec): 66.52 - samples/sec: 4392.11 - lr: 0.000040 - momentum: 0.000000
2023-10-19 12:42:52,134 epoch 3 - iter 4689/5212 - loss 0.28645582 - time (sec): 75.02 - samples/sec: 4417.42 - lr: 0.000039 - momentum: 0.000000
2023-10-19 12:43:00,575 epoch 3 - iter 5210/5212 - loss 0.28485558 - time (sec): 83.46 - samples/sec: 4401.47 - lr: 0.000039 - momentum: 0.000000
2023-10-19 12:43:00,608 ----------------------------------------------------------------------------------------------------
2023-10-19 12:43:00,609 EPOCH 3 done: loss 0.2848 - lr: 0.000039
2023-10-19 12:43:05,127 DEV : loss 0.15424801409244537 - f1-score (micro avg) 0.2655
2023-10-19 12:43:05,150 ----------------------------------------------------------------------------------------------------
2023-10-19 12:43:13,654 epoch 4 - iter 521/5212 - loss 0.23083735 - time (sec): 8.50 - samples/sec: 4449.71 - lr: 0.000038 - momentum: 0.000000
2023-10-19 12:43:21,963 epoch 4 - iter 1042/5212 - loss 0.24038158 - time (sec): 16.81 - samples/sec: 4370.66 - lr: 0.000038 - momentum: 0.000000
2023-10-19 12:43:30,476 epoch 4 - iter 1563/5212 - loss 0.24387218 - time (sec): 25.33 - samples/sec: 4396.26 - lr: 0.000037 - momentum: 0.000000
2023-10-19 12:43:38,752 epoch 4 - iter 2084/5212 - loss 0.25131770 - time (sec): 33.60 - samples/sec: 4347.25 - lr: 0.000037 - momentum: 0.000000
2023-10-19 12:43:46,995 epoch 4 - iter 2605/5212 - loss 0.25240929 - time (sec): 41.84 - samples/sec: 4359.80 - lr: 0.000036 - momentum: 0.000000
2023-10-19 12:43:55,908 epoch 4 - iter 3126/5212 - loss 0.24536869 - time (sec): 50.76 - samples/sec: 4309.73 - lr: 0.000036 - momentum: 0.000000
2023-10-19 12:44:04,103 epoch 4 - iter 3647/5212 - loss 0.24569598 - time (sec): 58.95 - samples/sec: 4322.62 - lr: 0.000035 - momentum: 0.000000
2023-10-19 12:44:12,545 epoch 4 - iter 4168/5212 - loss 0.24208870 - time (sec): 67.39 - samples/sec: 4334.05 - lr: 0.000034 - momentum: 0.000000
2023-10-19 12:44:20,894 epoch 4 - iter 4689/5212 - loss 0.24553610 - time (sec): 75.74 - samples/sec: 4371.14 - lr: 0.000034 - momentum: 0.000000
2023-10-19 12:44:29,241 epoch 4 - iter 5210/5212 - loss 0.24595979 - time (sec): 84.09 - samples/sec: 4367.98 - lr: 0.000033 - momentum: 0.000000
2023-10-19 12:44:29,274 ----------------------------------------------------------------------------------------------------
2023-10-19 12:44:29,274 EPOCH 4 done: loss 0.2459 - lr: 0.000033
2023-10-19 12:44:33,816 DEV : loss 0.16588500142097473 - f1-score (micro avg) 0.2542
2023-10-19 12:44:33,839 ----------------------------------------------------------------------------------------------------
2023-10-19 12:44:41,920 epoch 5 - iter 521/5212 - loss 0.23619366 - time (sec): 8.08 - samples/sec: 4011.71 - lr: 0.000033 - momentum: 0.000000
2023-10-19 12:44:50,407 epoch 5 - iter 1042/5212 - loss 0.22990360 - time (sec): 16.57 - samples/sec: 4419.65 - lr: 0.000032 - momentum: 0.000000
2023-10-19 12:44:58,588 epoch 5 - iter 1563/5212 - loss 0.23443858 - time (sec): 24.75 - samples/sec: 4332.12 - lr: 0.000032 - momentum: 0.000000
2023-10-19 12:45:06,990 epoch 5 - iter 2084/5212 - loss 0.23265323 - time (sec): 33.15 - samples/sec: 4402.86 - lr: 0.000031 - momentum: 0.000000
2023-10-19 12:45:15,383 epoch 5 - iter 2605/5212 - loss 0.22890484 - time (sec): 41.54 - samples/sec: 4363.15 - lr: 0.000031 - momentum: 0.000000
2023-10-19 12:45:23,928 epoch 5 - iter 3126/5212 - loss 0.22567797 - time (sec): 50.09 - samples/sec: 4377.36 - lr: 0.000030 - momentum: 0.000000
2023-10-19 12:45:32,221 epoch 5 - iter 3647/5212 - loss 0.22504579 - time (sec): 58.38 - samples/sec: 4384.04 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:45:40,448 epoch 5 - iter 4168/5212 - loss 0.22175996 - time (sec): 66.61 - samples/sec: 4373.65 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:45:48,958 epoch 5 - iter 4689/5212 - loss 0.22060023 - time (sec): 75.12 - samples/sec: 4394.33 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:45:57,313 epoch 5 - iter 5210/5212 - loss 0.21920627 - time (sec): 83.47 - samples/sec: 4400.79 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:45:57,347 ----------------------------------------------------------------------------------------------------
2023-10-19 12:45:57,348 EPOCH 5 done: loss 0.2193 - lr: 0.000028
2023-10-19 12:46:02,433 DEV : loss 0.1704423874616623 - f1-score (micro avg) 0.2369
2023-10-19 12:46:02,455 ----------------------------------------------------------------------------------------------------
2023-10-19 12:46:10,589 epoch 6 - iter 521/5212 - loss 0.20066139 - time (sec): 8.13 - samples/sec: 4352.84 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:46:18,871 epoch 6 - iter 1042/5212 - loss 0.20163315 - time (sec): 16.42 - samples/sec: 4372.98 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:46:26,922 epoch 6 - iter 1563/5212 - loss 0.20471101 - time (sec): 24.47 - samples/sec: 4398.00 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:46:34,965 epoch 6 - iter 2084/5212 - loss 0.20285373 - time (sec): 32.51 - samples/sec: 4460.23 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:46:43,049 epoch 6 - iter 2605/5212 - loss 0.20365457 - time (sec): 40.59 - samples/sec: 4462.42 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:46:51,455 epoch 6 - iter 3126/5212 - loss 0.20122669 - time (sec): 49.00 - samples/sec: 4485.43 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:46:59,804 epoch 6 - iter 3647/5212 - loss 0.20056229 - time (sec): 57.35 - samples/sec: 4484.76 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:47:08,064 epoch 6 - iter 4168/5212 - loss 0.20088302 - time (sec): 65.61 - samples/sec: 4461.25 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:47:16,409 epoch 6 - iter 4689/5212 - loss 0.20360026 - time (sec): 73.95 - samples/sec: 4446.39 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:47:24,692 epoch 6 - iter 5210/5212 - loss 0.20135037 - time (sec): 82.24 - samples/sec: 4465.47 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:47:24,730 ----------------------------------------------------------------------------------------------------
2023-10-19 12:47:24,730 EPOCH 6 done: loss 0.2015 - lr: 0.000022
2023-10-19 12:47:29,838 DEV : loss 0.18821725249290466 - f1-score (micro avg) 0.2472
2023-10-19 12:47:29,861 ----------------------------------------------------------------------------------------------------
2023-10-19 12:47:38,045 epoch 7 - iter 521/5212 - loss 0.18170328 - time (sec): 8.18 - samples/sec: 4512.41 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:47:45,967 epoch 7 - iter 1042/5212 - loss 0.18669584 - time (sec): 16.11 - samples/sec: 4459.69 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:47:54,222 epoch 7 - iter 1563/5212 - loss 0.19044056 - time (sec): 24.36 - samples/sec: 4417.82 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:48:02,478 epoch 7 - iter 2084/5212 - loss 0.18825042 - time (sec): 32.62 - samples/sec: 4460.22 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:48:10,835 epoch 7 - iter 2605/5212 - loss 0.18756143 - time (sec): 40.97 - samples/sec: 4474.00 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:48:19,272 epoch 7 - iter 3126/5212 - loss 0.18748580 - time (sec): 49.41 - samples/sec: 4448.10 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:48:27,883 epoch 7 - iter 3647/5212 - loss 0.19022095 - time (sec): 58.02 - samples/sec: 4465.43 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:48:36,192 epoch 7 - iter 4168/5212 - loss 0.18650095 - time (sec): 66.33 - samples/sec: 4433.66 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:48:44,645 epoch 7 - iter 4689/5212 - loss 0.18516860 - time (sec): 74.78 - samples/sec: 4424.80 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:48:52,930 epoch 7 - iter 5210/5212 - loss 0.18473747 - time (sec): 83.07 - samples/sec: 4420.26 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:48:52,972 ----------------------------------------------------------------------------------------------------
2023-10-19 12:48:52,972 EPOCH 7 done: loss 0.1847 - lr: 0.000017
2023-10-19 12:48:58,094 DEV : loss 0.21395063400268555 - f1-score (micro avg) 0.2538
2023-10-19 12:48:58,117 ----------------------------------------------------------------------------------------------------
2023-10-19 12:49:06,575 epoch 8 - iter 521/5212 - loss 0.15862877 - time (sec): 8.46 - samples/sec: 4334.42 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:49:14,911 epoch 8 - iter 1042/5212 - loss 0.18100841 - time (sec): 16.79 - samples/sec: 4502.79 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:49:23,046 epoch 8 - iter 1563/5212 - loss 0.18687514 - time (sec): 24.93 - samples/sec: 4498.75 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:49:31,042 epoch 8 - iter 2084/5212 - loss 0.18206795 - time (sec): 32.92 - samples/sec: 4519.24 - lr: 0.000014 - momentum: 0.000000
2023-10-19 12:49:39,110 epoch 8 - iter 2605/5212 - loss 0.17991319 - time (sec): 40.99 - samples/sec: 4479.84 - lr: 0.000014 - momentum: 0.000000
2023-10-19 12:49:47,355 epoch 8 - iter 3126/5212 - loss 0.17865515 - time (sec): 49.24 - samples/sec: 4467.04 - lr: 0.000013 - momentum: 0.000000
2023-10-19 12:49:55,723 epoch 8 - iter 3647/5212 - loss 0.17797690 - time (sec): 57.61 - samples/sec: 4466.78 - lr: 0.000013 - momentum: 0.000000
2023-10-19 12:50:03,712 epoch 8 - iter 4168/5212 - loss 0.17640421 - time (sec): 65.59 - samples/sec: 4452.77 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:50:12,162 epoch 8 - iter 4689/5212 - loss 0.17412305 - time (sec): 74.05 - samples/sec: 4450.19 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:50:20,336 epoch 8 - iter 5210/5212 - loss 0.17327899 - time (sec): 82.22 - samples/sec: 4466.46 - lr: 0.000011 - momentum: 0.000000
2023-10-19 12:50:20,368 ----------------------------------------------------------------------------------------------------
2023-10-19 12:50:20,369 EPOCH 8 done: loss 0.1732 - lr: 0.000011
2023-10-19 12:50:25,581 DEV : loss 0.2319490760564804 - f1-score (micro avg) 0.2526
2023-10-19 12:50:25,603 ----------------------------------------------------------------------------------------------------
2023-10-19 12:50:33,726 epoch 9 - iter 521/5212 - loss 0.18312888 - time (sec): 8.12 - samples/sec: 4590.45 - lr: 0.000011 - momentum: 0.000000
2023-10-19 12:50:41,977 epoch 9 - iter 1042/5212 - loss 0.17592071 - time (sec): 16.37 - samples/sec: 4421.50 - lr: 0.000010 - momentum: 0.000000
2023-10-19 12:50:50,476 epoch 9 - iter 1563/5212 - loss 0.16919437 - time (sec): 24.87 - samples/sec: 4446.16 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:50:58,858 epoch 9 - iter 2084/5212 - loss 0.16984065 - time (sec): 33.25 - samples/sec: 4421.93 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:51:07,083 epoch 9 - iter 2605/5212 - loss 0.16510263 - time (sec): 41.48 - samples/sec: 4411.94 - lr: 0.000008 - momentum: 0.000000
2023-10-19 12:51:15,417 epoch 9 - iter 3126/5212 - loss 0.16523013 - time (sec): 49.81 - samples/sec: 4377.83 - lr: 0.000008 - momentum: 0.000000
2023-10-19 12:51:23,816 epoch 9 - iter 3647/5212 - loss 0.16658392 - time (sec): 58.21 - samples/sec: 4403.46 - lr: 0.000007 - momentum: 0.000000
2023-10-19 12:51:32,026 epoch 9 - iter 4168/5212 - loss 0.16708629 - time (sec): 66.42 - samples/sec: 4393.88 - lr: 0.000007 - momentum: 0.000000
2023-10-19 12:51:40,671 epoch 9 - iter 4689/5212 - loss 0.16558440 - time (sec): 75.07 - samples/sec: 4369.94 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:51:49,043 epoch 9 - iter 5210/5212 - loss 0.16580617 - time (sec): 83.44 - samples/sec: 4403.02 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:51:49,078 ----------------------------------------------------------------------------------------------------
2023-10-19 12:51:49,078 EPOCH 9 done: loss 0.1658 - lr: 0.000006
2023-10-19 12:51:54,229 DEV : loss 0.2346869856119156 - f1-score (micro avg) 0.2311
2023-10-19 12:51:54,254 ----------------------------------------------------------------------------------------------------
2023-10-19 12:52:02,385 epoch 10 - iter 521/5212 - loss 0.12981061 - time (sec): 8.13 - samples/sec: 4684.15 - lr: 0.000005 - momentum: 0.000000
2023-10-19 12:52:10,554 epoch 10 - iter 1042/5212 - loss 0.14790609 - time (sec): 16.30 - samples/sec: 4512.11 - lr: 0.000004 - momentum: 0.000000
2023-10-19 12:52:18,659 epoch 10 - iter 1563/5212 - loss 0.15676457 - time (sec): 24.40 - samples/sec: 4452.83 - lr: 0.000004 - momentum: 0.000000
2023-10-19 12:52:26,867 epoch 10 - iter 2084/5212 - loss 0.15662785 - time (sec): 32.61 - samples/sec: 4479.24 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:52:35,401 epoch 10 - iter 2605/5212 - loss 0.15975847 - time (sec): 41.15 - samples/sec: 4497.68 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:52:43,847 epoch 10 - iter 3126/5212 - loss 0.15964390 - time (sec): 49.59 - samples/sec: 4397.75 - lr: 0.000002 - momentum: 0.000000
2023-10-19 12:52:52,232 epoch 10 - iter 3647/5212 - loss 0.15852081 - time (sec): 57.98 - samples/sec: 4435.35 - lr: 0.000002 - momentum: 0.000000
2023-10-19 12:53:00,489 epoch 10 - iter 4168/5212 - loss 0.16049237 - time (sec): 66.23 - samples/sec: 4398.62 - lr: 0.000001 - momentum: 0.000000
2023-10-19 12:53:08,832 epoch 10 - iter 4689/5212 - loss 0.15889932 - time (sec): 74.58 - samples/sec: 4416.14 - lr: 0.000001 - momentum: 0.000000
2023-10-19 12:53:17,265 epoch 10 - iter 5210/5212 - loss 0.15970840 - time (sec): 83.01 - samples/sec: 4424.80 - lr: 0.000000 - momentum: 0.000000
2023-10-19 12:53:17,299 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:17,299 EPOCH 10 done: loss 0.1597 - lr: 0.000000
2023-10-19 12:53:22,483 DEV : loss 0.23591217398643494 - f1-score (micro avg) 0.2428
2023-10-19 12:53:22,536 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:22,536 Loading model from best epoch ...
2023-10-19 12:53:22,620 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
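The 17 tags follow the BIOES scheme: S- marks a single-token entity, B-/I-/E- mark the beginning, inside and end of a multi-token entity, and O marks tokens outside any entity. A small illustration with hypothetical tokens (not taken from the corpus):

```python
# "Sankt Peter Ording" as a three-token LOC span in BIOES encoding.
example = [("Sankt", "B-LOC"), ("Peter", "I-LOC"), ("Ording", "E-LOC"), ("liegt", "O")]
```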
2023-10-19 12:53:28,802
Results:
- F-score (micro) 0.2064
- F-score (macro) 0.1091
- Accuracy 0.1156
By class:
              precision    recall  f1-score   support

         LOC     0.4440    0.2611    0.3288      1214
         PER     0.1258    0.0755    0.0944       808
         ORG     0.0294    0.0085    0.0132       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.2929    0.1594    0.2064      2390
   macro avg     0.1498    0.0863    0.1091      2390
weighted avg     0.2724    0.1594    0.2009      2390
2023-10-19 12:53:28,802 ----------------------------------------------------------------------------------------------------
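A sketch of the final evaluation step logged above: the best checkpoint is reloaded and scored on the test split with micro-averaged F1. The evaluate() keyword names follow the public Flair API; `corpus` is the object from the earlier corpus sketch.

```python
from flair.models import SequenceTagger

best = SequenceTagger.load(
    "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-"
    "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)
result = best.evaluate(
    corpus.test,
    gold_label_type="ner",
    mini_batch_size=4,
    main_evaluation_metric=("micro avg", "f1-score"),
)
print(result.detailed_results)  # per-class precision/recall/F1, as printed above
```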