2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 Train: 1100 sentences
2023-10-18 14:42:40,217 (train_with_dev=False, train_with_test=False)
2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 Training Params:
2023-10-18 14:42:40,217 - learning_rate: "3e-05"
2023-10-18 14:42:40,217 - mini_batch_size: "4"
2023-10-18 14:42:40,217 - max_epochs: "10"
2023-10-18 14:42:40,217 - shuffle: "True"
2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 Plugins:
2023-10-18 14:42:40,217 - TensorboardLogger
2023-10-18 14:42:40,217 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:42:40,218 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 Computation:
2023-10-18 14:42:40,218 - compute on device: cuda:0
2023-10-18 14:42:40,218 - embedding storage: none
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:42:40,625 epoch 1 - iter 27/275 - loss 4.04291065 - time (sec): 0.41 - samples/sec: 4876.59 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:42:41,036 epoch 1 - iter 54/275 - loss 3.99046555 - time (sec): 0.82 - samples/sec: 5295.40 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:42:41,450 epoch 1 - iter 81/275 - loss 3.88300995 - time (sec): 1.23 - samples/sec: 5217.03 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:42:41,839 epoch 1 - iter 108/275 - loss 3.72596285 - time (sec): 1.62 - samples/sec: 5451.17 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:42:42,204 epoch 1 - iter 135/275 - loss 3.59012547 - time (sec): 1.99 - samples/sec: 5587.39 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:42:42,567 epoch 1 - iter 162/275 - loss 3.42139254 - time (sec): 2.35 - samples/sec: 5723.76 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:42:42,940 epoch 1 - iter 189/275 - loss 3.22434273 - time (sec): 2.72 - samples/sec: 5769.38 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:43,317 epoch 1 - iter 216/275 - loss 3.00442121 - time (sec): 3.10 - samples/sec: 5893.76 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:43,695 epoch 1 - iter 243/275 - loss 2.83483712 - time (sec): 3.48 - samples/sec: 5821.97 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:44,062 epoch 1 - iter 270/275 - loss 2.67508372 - time (sec): 3.84 - samples/sec: 5835.49 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:44,125 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:44,125 EPOCH 1 done: loss 2.6478 - lr: 0.000029
2023-10-18 14:42:44,372 DEV : loss 0.9118794202804565 - f1-score (micro avg) 0.0
2023-10-18 14:42:44,376 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:44,748 epoch 2 - iter 27/275 - loss 0.87662447 - time (sec): 0.37 - samples/sec: 6845.47 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:42:45,110 epoch 2 - iter 54/275 - loss 0.91374560 - time (sec): 0.73 - samples/sec: 6393.62 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:45,483 epoch 2 - iter 81/275 - loss 0.95889295 - time (sec): 1.11 - samples/sec: 6284.21 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:45,845 epoch 2 - iter 108/275 - loss 0.98239975 - time (sec): 1.47 - samples/sec: 6046.67 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:46,212 epoch 2 - iter 135/275 - loss 0.96439524 - time (sec): 1.84 - samples/sec: 5984.52 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:42:46,584 epoch 2 - iter 162/275 - loss 0.95106604 - time (sec): 2.21 - samples/sec: 6042.69 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:42:46,944 epoch 2 - iter 189/275 - loss 0.94723193 - time (sec): 2.57 - samples/sec: 5948.24 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:42:47,317 epoch 2 - iter 216/275 - loss 0.94445831 - time (sec): 2.94 - samples/sec: 6021.22 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:42:47,685 epoch 2 - iter 243/275 - loss 0.93548787 - time (sec): 3.31 - samples/sec: 6070.65 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:42:48,054 epoch 2 - iter 270/275 - loss 0.91241355 - time (sec): 3.68 - samples/sec: 6096.45 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:42:48,125 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:48,125 EPOCH 2 done: loss 0.9173 - lr: 0.000027
2023-10-18 14:42:48,490 DEV : loss 0.7475361227989197 - f1-score (micro avg) 0.0
2023-10-18 14:42:48,496 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:48,902 epoch 3 - iter 27/275 - loss 0.84635988 - time (sec): 0.41 - samples/sec: 5704.90 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:49,321 epoch 3 - iter 54/275 - loss 0.86258624 - time (sec): 0.82 - samples/sec: 5841.02 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:49,724 epoch 3 - iter 81/275 - loss 0.82020005 - time (sec): 1.23 - samples/sec: 5773.89 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:50,139 epoch 3 - iter 108/275 - loss 0.77919393 - time (sec): 1.64 - samples/sec: 5707.72 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:42:50,534 epoch 3 - iter 135/275 - loss 0.76811019 - time (sec): 2.04 - samples/sec: 5700.03 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:42:50,947 epoch 3 - iter 162/275 - loss 0.74306539 - time (sec): 2.45 - samples/sec: 5612.75 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:42:51,356 epoch 3 - iter 189/275 - loss 0.73905325 - time (sec): 2.86 - samples/sec: 5540.48 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:42:51,754 epoch 3 - iter 216/275 - loss 0.73527931 - time (sec): 3.26 - samples/sec: 5553.83 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:42:52,169 epoch 3 - iter 243/275 - loss 0.72744124 - time (sec): 3.67 - samples/sec: 5554.37 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:42:52,563 epoch 3 - iter 270/275 - loss 0.73156120 - time (sec): 4.07 - samples/sec: 5514.56 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:52,639 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:52,639 EPOCH 3 done: loss 0.7276 - lr: 0.000023
2023-10-18 14:42:52,997 DEV : loss 0.5736738443374634 - f1-score (micro avg) 0.0998
2023-10-18 14:42:53,001 saving best model
2023-10-18 14:42:53,036 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:53,436 epoch 4 - iter 27/275 - loss 0.66304226 - time (sec): 0.40 - samples/sec: 5203.57 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:53,839 epoch 4 - iter 54/275 - loss 0.67760661 - time (sec): 0.80 - samples/sec: 5109.62 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:54,238 epoch 4 - iter 81/275 - loss 0.67789303 - time (sec): 1.20 - samples/sec: 5267.91 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:42:54,648 epoch 4 - iter 108/275 - loss 0.67021179 - time (sec): 1.61 - samples/sec: 5332.48 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:42:55,055 epoch 4 - iter 135/275 - loss 0.65333439 - time (sec): 2.02 - samples/sec: 5361.47 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:42:55,468 epoch 4 - iter 162/275 - loss 0.65800262 - time (sec): 2.43 - samples/sec: 5419.86 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:55,871 epoch 4 - iter 189/275 - loss 0.65164776 - time (sec): 2.83 - samples/sec: 5427.11 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:56,277 epoch 4 - iter 216/275 - loss 0.64308123 - time (sec): 3.24 - samples/sec: 5493.28 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:56,700 epoch 4 - iter 243/275 - loss 0.62933789 - time (sec): 3.66 - samples/sec: 5518.39 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:42:57,100 epoch 4 - iter 270/275 - loss 0.62240563 - time (sec): 4.06 - samples/sec: 5504.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:42:57,175 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:57,175 EPOCH 4 done: loss 0.6153 - lr: 0.000020
2023-10-18 14:42:57,655 DEV : loss 0.4884113371372223 - f1-score (micro avg) 0.2163
2023-10-18 14:42:57,659 saving best model
2023-10-18 14:42:57,700 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:58,108 epoch 5 - iter 27/275 - loss 0.57991470 - time (sec): 0.41 - samples/sec: 5953.53 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:42:58,511 epoch 5 - iter 54/275 - loss 0.53287747 - time (sec): 0.81 - samples/sec: 5885.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:42:58,934 epoch 5 - iter 81/275 - loss 0.52636296 - time (sec): 1.23 - samples/sec: 5630.94 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:42:59,347 epoch 5 - iter 108/275 - loss 0.54229407 - time (sec): 1.65 - samples/sec: 5617.38 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:42:59,771 epoch 5 - iter 135/275 - loss 0.53641215 - time (sec): 2.07 - samples/sec: 5556.48 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:43:00,171 epoch 5 - iter 162/275 - loss 0.54184928 - time (sec): 2.47 - samples/sec: 5497.97 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:43:00,583 epoch 5 - iter 189/275 - loss 0.54976097 - time (sec): 2.88 - samples/sec: 5537.77 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:43:00,985 epoch 5 - iter 216/275 - loss 0.54676931 - time (sec): 3.28 - samples/sec: 5481.02 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:43:01,389 epoch 5 - iter 243/275 - loss 0.53924318 - time (sec): 3.69 - samples/sec: 5420.89 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:43:01,804 epoch 5 - iter 270/275 - loss 0.53859676 - time (sec): 4.10 - samples/sec: 5448.37 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:43:01,878 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:01,878 EPOCH 5 done: loss 0.5378 - lr: 0.000017
2023-10-18 14:43:02,242 DEV : loss 0.4137643575668335 - f1-score (micro avg) 0.3669
2023-10-18 14:43:02,246 saving best model
2023-10-18 14:43:02,279 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:02,688 epoch 6 - iter 27/275 - loss 0.50724664 - time (sec): 0.41 - samples/sec: 5132.05 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:43:03,089 epoch 6 - iter 54/275 - loss 0.49264158 - time (sec): 0.81 - samples/sec: 5192.29 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:43:03,494 epoch 6 - iter 81/275 - loss 0.48482614 - time (sec): 1.21 - samples/sec: 5061.78 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:43:03,914 epoch 6 - iter 108/275 - loss 0.48640583 - time (sec): 1.63 - samples/sec: 5233.22 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:43:04,315 epoch 6 - iter 135/275 - loss 0.49124330 - time (sec): 2.04 - samples/sec: 5313.77 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:43:04,734 epoch 6 - iter 162/275 - loss 0.49211512 - time (sec): 2.45 - samples/sec: 5412.89 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:43:05,139 epoch 6 - iter 189/275 - loss 0.48077587 - time (sec): 2.86 - samples/sec: 5433.05 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:43:05,534 epoch 6 - iter 216/275 - loss 0.48715522 - time (sec): 3.25 - samples/sec: 5387.09 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:43:05,942 epoch 6 - iter 243/275 - loss 0.50096569 - time (sec): 3.66 - samples/sec: 5386.51 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:43:06,355 epoch 6 - iter 270/275 - loss 0.49391955 - time (sec): 4.08 - samples/sec: 5466.99 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:43:06,438 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:06,438 EPOCH 6 done: loss 0.4934 - lr: 0.000013
2023-10-18 14:43:06,811 DEV : loss 0.38504037261009216 - f1-score (micro avg) 0.4258
2023-10-18 14:43:06,816 saving best model
2023-10-18 14:43:06,850 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:07,252 epoch 7 - iter 27/275 - loss 0.54413041 - time (sec): 0.40 - samples/sec: 4764.45 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:43:07,667 epoch 7 - iter 54/275 - loss 0.50300859 - time (sec): 0.82 - samples/sec: 5186.24 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:43:08,091 epoch 7 - iter 81/275 - loss 0.46386285 - time (sec): 1.24 - samples/sec: 5448.61 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:43:08,495 epoch 7 - iter 108/275 - loss 0.47033759 - time (sec): 1.65 - samples/sec: 5491.03 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:43:08,907 epoch 7 - iter 135/275 - loss 0.47671658 - time (sec): 2.06 - samples/sec: 5404.26 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:43:09,315 epoch 7 - iter 162/275 - loss 0.47678839 - time (sec): 2.46 - samples/sec: 5373.05 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:43:09,737 epoch 7 - iter 189/275 - loss 0.47287308 - time (sec): 2.89 - samples/sec: 5432.45 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:43:10,149 epoch 7 - iter 216/275 - loss 0.47324651 - time (sec): 3.30 - samples/sec: 5403.72 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:43:10,558 epoch 7 - iter 243/275 - loss 0.47351055 - time (sec): 3.71 - samples/sec: 5457.27 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:43:10,974 epoch 7 - iter 270/275 - loss 0.46992356 - time (sec): 4.12 - samples/sec: 5429.40 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:43:11,045 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:11,045 EPOCH 7 done: loss 0.4690 - lr: 0.000010
2023-10-18 14:43:11,424 DEV : loss 0.3652515113353729 - f1-score (micro avg) 0.4913
2023-10-18 14:43:11,428 saving best model
2023-10-18 14:43:11,464 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:11,885 epoch 8 - iter 27/275 - loss 0.48504152 - time (sec): 0.42 - samples/sec: 5742.71 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:43:12,284 epoch 8 - iter 54/275 - loss 0.47425987 - time (sec): 0.82 - samples/sec: 5788.13 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:43:12,697 epoch 8 - iter 81/275 - loss 0.49146730 - time (sec): 1.23 - samples/sec: 5823.19 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:43:13,118 epoch 8 - iter 108/275 - loss 0.47302956 - time (sec): 1.65 - samples/sec: 5631.56 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:43:13,530 epoch 8 - iter 135/275 - loss 0.47714840 - time (sec): 2.07 - samples/sec: 5559.89 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:43:13,943 epoch 8 - iter 162/275 - loss 0.46515601 - time (sec): 2.48 - samples/sec: 5455.87 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:43:14,350 epoch 8 - iter 189/275 - loss 0.46293655 - time (sec): 2.89 - samples/sec: 5408.67 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:43:14,754 epoch 8 - iter 216/275 - loss 0.45905739 - time (sec): 3.29 - samples/sec: 5412.04 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:43:15,169 epoch 8 - iter 243/275 - loss 0.45136845 - time (sec): 3.71 - samples/sec: 5420.68 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:43:15,589 epoch 8 - iter 270/275 - loss 0.44211449 - time (sec): 4.12 - samples/sec: 5430.32 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:43:15,662 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:15,662 EPOCH 8 done: loss 0.4418 - lr: 0.000007
2023-10-18 14:43:16,034 DEV : loss 0.35667097568511963 - f1-score (micro avg) 0.5222
2023-10-18 14:43:16,037 saving best model
2023-10-18 14:43:16,073 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:16,475 epoch 9 - iter 27/275 - loss 0.46086074 - time (sec): 0.40 - samples/sec: 5520.10 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:43:16,880 epoch 9 - iter 54/275 - loss 0.44215721 - time (sec): 0.81 - samples/sec: 5428.44 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:43:17,284 epoch 9 - iter 81/275 - loss 0.43449692 - time (sec): 1.21 - samples/sec: 5342.81 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:43:17,695 epoch 9 - iter 108/275 - loss 0.43572935 - time (sec): 1.62 - samples/sec: 5257.18 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:43:18,106 epoch 9 - iter 135/275 - loss 0.44037196 - time (sec): 2.03 - samples/sec: 5185.35 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:43:18,504 epoch 9 - iter 162/275 - loss 0.43832233 - time (sec): 2.43 - samples/sec: 5265.56 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:43:18,920 epoch 9 - iter 189/275 - loss 0.44870956 - time (sec): 2.85 - samples/sec: 5342.73 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:43:19,346 epoch 9 - iter 216/275 - loss 0.43530574 - time (sec): 3.27 - samples/sec: 5358.24 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:43:19,757 epoch 9 - iter 243/275 - loss 0.43530429 - time (sec): 3.68 - samples/sec: 5466.45 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:43:20,171 epoch 9 - iter 270/275 - loss 0.43049081 - time (sec): 4.10 - samples/sec: 5463.51 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:43:20,242 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:20,243 EPOCH 9 done: loss 0.4341 - lr: 0.000003
2023-10-18 14:43:20,612 DEV : loss 0.3517986536026001 - f1-score (micro avg) 0.5435
2023-10-18 14:43:20,616 saving best model
2023-10-18 14:43:20,652 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:21,060 epoch 10 - iter 27/275 - loss 0.44948570 - time (sec): 0.41 - samples/sec: 5415.35 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:43:21,472 epoch 10 - iter 54/275 - loss 0.44181683 - time (sec): 0.82 - samples/sec: 5634.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:43:21,886 epoch 10 - iter 81/275 - loss 0.47116250 - time (sec): 1.23 - samples/sec: 5627.40 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:43:22,289 epoch 10 - iter 108/275 - loss 0.44182285 - time (sec): 1.64 - samples/sec: 5505.07 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:43:22,707 epoch 10 - iter 135/275 - loss 0.44570033 - time (sec): 2.05 - samples/sec: 5632.63 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:43:23,122 epoch 10 - iter 162/275 - loss 0.43757809 - time (sec): 2.47 - samples/sec: 5552.83 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:43:23,543 epoch 10 - iter 189/275 - loss 0.43587334 - time (sec): 2.89 - samples/sec: 5475.08 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:43:23,949 epoch 10 - iter 216/275 - loss 0.43145224 - time (sec): 3.30 - samples/sec: 5462.81 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:43:24,352 epoch 10 - iter 243/275 - loss 0.43014733 - time (sec): 3.70 - samples/sec: 5453.66 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:43:24,769 epoch 10 - iter 270/275 - loss 0.42830775 - time (sec): 4.12 - samples/sec: 5433.27 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:43:24,848 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:24,848 EPOCH 10 done: loss 0.4263 - lr: 0.000000
2023-10-18 14:43:25,217 DEV : loss 0.3487985134124756 - f1-score (micro avg) 0.5408
2023-10-18 14:43:25,251 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:25,251 Loading model from best epoch ...
2023-10-18 14:43:25,333 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:43:25,632
Results:
- F-score (micro) 0.5514
- F-score (macro) 0.3272
- Accuracy 0.3903
By class:
              precision    recall  f1-score   support

       scope     0.5460    0.5398    0.5429       176
        pers     0.7826    0.5625    0.6545       128
        work     0.4198    0.4595    0.4387        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.5793    0.5262    0.5514       382
   macro avg     0.3497    0.3123    0.3272       382
weighted avg     0.5951    0.5262    0.5544       382
2023-10-18 14:43:25,632 ----------------------------------------------------------------------------------------------------