2023-10-18 16:47:47,857 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:47,857 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:47:47,858 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:47,858 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:47:47,858 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:47,858 Train: 966 sentences
2023-10-18 16:47:47,858 (train_with_dev=False, train_with_test=False)
2023-10-18 16:47:47,858 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:47,858 Training Params:
2023-10-18 16:47:47,858 - learning_rate: "5e-05"
2023-10-18 16:47:47,858 - mini_batch_size: "4"
2023-10-18 16:47:47,858 - max_epochs: "10"
2023-10-18 16:47:47,858 - shuffle: "True"
2023-10-18 16:47:47,858 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:47,858 Plugins:
2023-10-18 16:47:47,858 - TensorboardLogger
2023-10-18 16:47:47,858 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:47:47,858 ----------------------------------------------------------------------------------------------------
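Note on the LinearScheduler entry: with 966 training sentences and mini_batch_size 4 there are 242 iterations per epoch, so 10 epochs give 2420 optimizer steps, and a warmup_fraction of 0.1 would cover 242 steps, roughly the first epoch. The sketch below is an assumed reconstruction of a linear warmup followed by linear decay to zero. The helper name `linear_warmup_lr` is hypothetical, and the exact formula and step offset used by Flair may differ, so the printed values are only expected to approximate the `lr:` column in the iteration lines.

```python
def linear_warmup_lr(step, peak_lr=5e-5, total_steps=2420, warmup_fraction=0.1):
    """Learning rate after `step` optimizer updates (hypothetical helper).

    Assumption: linear ramp from 0 to peak_lr over the warmup steps,
    then linear decay from peak_lr to 0 over the remaining steps.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # 242 iterations per epoch, 10 epochs, as reported in the log.
    for step in (0, 121, 242, 1331, 2420):
        print(step, f"{linear_warmup_lr(step):.6f}")
```

Under these assumptions the rate peaks at the end of epoch 1 and reaches zero at the end of epoch 10, which agrees with the overall shape of the logged values.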
2023-10-18 16:47:47,858 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:47:47,858 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:47:47,858 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:47,858 Computation:
2023-10-18 16:47:47,858 - compute on device: cuda:0
2023-10-18 16:47:47,858 - embedding storage: none
2023-10-18 16:47:47,858 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:47,858 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 16:47:47,858 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:47,858 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:47,859 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:47:48,254 epoch 1 - iter 24/242 - loss 3.75850769 - time (sec): 0.40 - samples/sec: 5890.04 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:47:48,614 epoch 1 - iter 48/242 - loss 3.69682205 - time (sec): 0.76 - samples/sec: 5693.84 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:47:49,004 epoch 1 - iter 72/242 - loss 3.55368346 - time (sec): 1.14 - samples/sec: 6244.70 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:47:49,375 epoch 1 - iter 96/242 - loss 3.39273430 - time (sec): 1.52 - samples/sec: 6261.71 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:47:49,743 epoch 1 - iter 120/242 - loss 3.19611354 - time (sec): 1.88 - samples/sec: 6331.42 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:47:50,103 epoch 1 - iter 144/242 - loss 2.97659823 - time (sec): 2.24 - samples/sec: 6258.32 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:47:50,496 epoch 1 - iter 168/242 - loss 2.68820008 - time (sec): 2.64 - samples/sec: 6368.43 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:47:50,881 epoch 1 - iter 192/242 - loss 2.43041565 - time (sec): 3.02 - samples/sec: 6522.56 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:47:51,261 epoch 1 - iter 216/242 - loss 2.23894411 - time (sec): 3.40 - samples/sec: 6554.26 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:47:51,619 epoch 1 - iter 240/242 - loss 2.10798640 - time (sec): 3.76 - samples/sec: 6556.78 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:47:51,644 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:51,644 EPOCH 1 done: loss 2.1061 - lr: 0.000049
2023-10-18 16:47:52,153 DEV : loss 0.6494866609573364 - f1-score (micro avg) 0.0
2023-10-18 16:47:52,158 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:52,526 epoch 2 - iter 24/242 - loss 0.70753707 - time (sec): 0.37 - samples/sec: 6256.75 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:47:52,897 epoch 2 - iter 48/242 - loss 0.69061854 - time (sec): 0.74 - samples/sec: 6476.07 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:47:53,260 epoch 2 - iter 72/242 - loss 0.66939334 - time (sec): 1.10 - samples/sec: 6544.56 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:47:53,643 epoch 2 - iter 96/242 - loss 0.67917229 - time (sec): 1.48 - samples/sec: 6655.00 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:47:54,016 epoch 2 - iter 120/242 - loss 0.66943186 - time (sec): 1.86 - samples/sec: 6506.69 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:47:54,394 epoch 2 - iter 144/242 - loss 0.65133492 - time (sec): 2.24 - samples/sec: 6580.86 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:47:54,794 epoch 2 - iter 168/242 - loss 0.61596553 - time (sec): 2.64 - samples/sec: 6568.16 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:47:55,162 epoch 2 - iter 192/242 - loss 0.59965228 - time (sec): 3.00 - samples/sec: 6595.00 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:47:55,521 epoch 2 - iter 216/242 - loss 0.60012603 - time (sec): 3.36 - samples/sec: 6573.83 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:47:55,887 epoch 2 - iter 240/242 - loss 0.59113460 - time (sec): 3.73 - samples/sec: 6573.62 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:47:55,915 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:55,915 EPOCH 2 done: loss 0.5933 - lr: 0.000045
2023-10-18 16:47:56,345 DEV : loss 0.3973432183265686 - f1-score (micro avg) 0.2292
2023-10-18 16:47:56,350 saving best model
2023-10-18 16:47:56,377 ----------------------------------------------------------------------------------------------------
2023-10-18 16:47:56,758 epoch 3 - iter 24/242 - loss 0.47094547 - time (sec): 0.38 - samples/sec: 6558.17 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:47:57,124 epoch 3 - iter 48/242 - loss 0.49377345 - time (sec): 0.75 - samples/sec: 6428.73 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:47:57,481 epoch 3 - iter 72/242 - loss 0.47276259 - time (sec): 1.10 - samples/sec: 6393.37 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:47:57,859 epoch 3 - iter 96/242 - loss 0.47065066 - time (sec): 1.48 - samples/sec: 6450.24 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:47:58,250 epoch 3 - iter 120/242 - loss 0.46440215 - time (sec): 1.87 - samples/sec: 6423.01 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:47:58,622 epoch 3 - iter 144/242 - loss 0.44989391 - time (sec): 2.24 - samples/sec: 6365.98 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:47:58,996 epoch 3 - iter 168/242 - loss 0.44144086 - time (sec): 2.62 - samples/sec: 6442.83 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:47:59,377 epoch 3 - iter 192/242 - loss 0.43386345 - time (sec): 3.00 - samples/sec: 6568.59 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:47:59,760 epoch 3 - iter 216/242 - loss 0.42346837 - time (sec): 3.38 - samples/sec: 6565.04 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:48:00,133 epoch 3 - iter 240/242 - loss 0.42387037 - time (sec): 3.76 - samples/sec: 6544.86 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:48:00,159 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:00,160 EPOCH 3 done: loss 0.4236 - lr: 0.000039
2023-10-18 16:48:00,579 DEV : loss 0.315167635679245 - f1-score (micro avg) 0.5118
2023-10-18 16:48:00,583 saving best model
2023-10-18 16:48:00,617 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:00,976 epoch 4 - iter 24/242 - loss 0.37876134 - time (sec): 0.36 - samples/sec: 5829.16 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:48:01,352 epoch 4 - iter 48/242 - loss 0.39303874 - time (sec): 0.73 - samples/sec: 6092.55 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:48:01,733 epoch 4 - iter 72/242 - loss 0.38107267 - time (sec): 1.11 - samples/sec: 6523.26 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:48:02,105 epoch 4 - iter 96/242 - loss 0.38620105 - time (sec): 1.49 - samples/sec: 6472.65 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:48:02,472 epoch 4 - iter 120/242 - loss 0.38815951 - time (sec): 1.85 - samples/sec: 6515.11 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:48:02,836 epoch 4 - iter 144/242 - loss 0.37887595 - time (sec): 2.22 - samples/sec: 6579.66 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:48:03,210 epoch 4 - iter 168/242 - loss 0.37809209 - time (sec): 2.59 - samples/sec: 6674.66 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:48:03,586 epoch 4 - iter 192/242 - loss 0.36934478 - time (sec): 2.97 - samples/sec: 6664.24 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:48:03,939 epoch 4 - iter 216/242 - loss 0.37593866 - time (sec): 3.32 - samples/sec: 6695.54 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:48:04,276 epoch 4 - iter 240/242 - loss 0.36981793 - time (sec): 3.66 - samples/sec: 6742.05 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:48:04,300 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:04,300 EPOCH 4 done: loss 0.3695 - lr: 0.000033
2023-10-18 16:48:04,730 DEV : loss 0.2958347797393799 - f1-score (micro avg) 0.512
2023-10-18 16:48:04,734 saving best model
2023-10-18 16:48:04,769 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:05,110 epoch 5 - iter 24/242 - loss 0.35480541 - time (sec): 0.34 - samples/sec: 7224.38 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:48:05,450 epoch 5 - iter 48/242 - loss 0.35459295 - time (sec): 0.68 - samples/sec: 7398.74 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:48:05,784 epoch 5 - iter 72/242 - loss 0.34181991 - time (sec): 1.01 - samples/sec: 7421.59 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:48:06,124 epoch 5 - iter 96/242 - loss 0.35420576 - time (sec): 1.35 - samples/sec: 7517.23 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:48:06,457 epoch 5 - iter 120/242 - loss 0.35768803 - time (sec): 1.69 - samples/sec: 7460.18 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:48:06,792 epoch 5 - iter 144/242 - loss 0.35518931 - time (sec): 2.02 - samples/sec: 7420.94 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:48:07,137 epoch 5 - iter 168/242 - loss 0.34918538 - time (sec): 2.37 - samples/sec: 7362.92 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:48:07,471 epoch 5 - iter 192/242 - loss 0.34897337 - time (sec): 2.70 - samples/sec: 7380.41 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:48:07,809 epoch 5 - iter 216/242 - loss 0.33635374 - time (sec): 3.04 - samples/sec: 7341.39 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:48:08,132 epoch 5 - iter 240/242 - loss 0.33487180 - time (sec): 3.36 - samples/sec: 7325.66 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:48:08,157 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:08,157 EPOCH 5 done: loss 0.3352 - lr: 0.000028
2023-10-18 16:48:08,597 DEV : loss 0.2639392018318176 - f1-score (micro avg) 0.5037
2023-10-18 16:48:08,602 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:08,962 epoch 6 - iter 24/242 - loss 0.31375370 - time (sec): 0.36 - samples/sec: 7149.00 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:09,335 epoch 6 - iter 48/242 - loss 0.31377803 - time (sec): 0.73 - samples/sec: 6842.22 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:09,710 epoch 6 - iter 72/242 - loss 0.32611832 - time (sec): 1.11 - samples/sec: 6809.96 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:48:10,072 epoch 6 - iter 96/242 - loss 0.29529436 - time (sec): 1.47 - samples/sec: 6638.72 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:48:10,448 epoch 6 - iter 120/242 - loss 0.29275130 - time (sec): 1.85 - samples/sec: 6661.09 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:48:10,841 epoch 6 - iter 144/242 - loss 0.29614971 - time (sec): 2.24 - samples/sec: 6602.79 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:48:11,207 epoch 6 - iter 168/242 - loss 0.30099987 - time (sec): 2.60 - samples/sec: 6576.18 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:48:11,598 epoch 6 - iter 192/242 - loss 0.30400838 - time (sec): 3.00 - samples/sec: 6568.08 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:48:11,968 epoch 6 - iter 216/242 - loss 0.30647937 - time (sec): 3.37 - samples/sec: 6519.76 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:48:12,357 epoch 6 - iter 240/242 - loss 0.30712937 - time (sec): 3.75 - samples/sec: 6543.49 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:48:12,383 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:12,384 EPOCH 6 done: loss 0.3079 - lr: 0.000022
2023-10-18 16:48:12,821 DEV : loss 0.2603849470615387 - f1-score (micro avg) 0.5104
2023-10-18 16:48:12,826 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:13,215 epoch 7 - iter 24/242 - loss 0.34851167 - time (sec): 0.39 - samples/sec: 6616.61 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:48:13,577 epoch 7 - iter 48/242 - loss 0.34722378 - time (sec): 0.75 - samples/sec: 6247.39 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:13,944 epoch 7 - iter 72/242 - loss 0.32206806 - time (sec): 1.12 - samples/sec: 6327.44 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:14,324 epoch 7 - iter 96/242 - loss 0.30779121 - time (sec): 1.50 - samples/sec: 6279.87 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:48:14,695 epoch 7 - iter 120/242 - loss 0.29721250 - time (sec): 1.87 - samples/sec: 6374.73 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:48:15,019 epoch 7 - iter 144/242 - loss 0.29074408 - time (sec): 2.19 - samples/sec: 6609.94 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:48:15,322 epoch 7 - iter 168/242 - loss 0.28770394 - time (sec): 2.50 - samples/sec: 6789.27 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:15,738 epoch 7 - iter 192/242 - loss 0.28551813 - time (sec): 2.91 - samples/sec: 6828.57 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:16,136 epoch 7 - iter 216/242 - loss 0.28346652 - time (sec): 3.31 - samples/sec: 6699.72 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:48:16,525 epoch 7 - iter 240/242 - loss 0.28920613 - time (sec): 3.70 - samples/sec: 6634.99 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:48:16,553 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:16,553 EPOCH 7 done: loss 0.2910 - lr: 0.000017
2023-10-18 16:48:16,998 DEV : loss 0.24460992217063904 - f1-score (micro avg) 0.5291
2023-10-18 16:48:17,002 saving best model
2023-10-18 16:48:17,034 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:17,417 epoch 8 - iter 24/242 - loss 0.26034674 - time (sec): 0.38 - samples/sec: 6988.76 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:48:17,790 epoch 8 - iter 48/242 - loss 0.27344716 - time (sec): 0.76 - samples/sec: 6928.02 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:48:18,146 epoch 8 - iter 72/242 - loss 0.29604730 - time (sec): 1.11 - samples/sec: 6753.89 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:18,516 epoch 8 - iter 96/242 - loss 0.28714663 - time (sec): 1.48 - samples/sec: 6654.86 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:18,918 epoch 8 - iter 120/242 - loss 0.28223276 - time (sec): 1.88 - samples/sec: 6585.68 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:48:19,302 epoch 8 - iter 144/242 - loss 0.29541982 - time (sec): 2.27 - samples/sec: 6662.43 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:48:19,670 epoch 8 - iter 168/242 - loss 0.29484539 - time (sec): 2.63 - samples/sec: 6590.53 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:48:20,039 epoch 8 - iter 192/242 - loss 0.28665546 - time (sec): 3.00 - samples/sec: 6571.13 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:20,419 epoch 8 - iter 216/242 - loss 0.28212190 - time (sec): 3.38 - samples/sec: 6572.82 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:20,800 epoch 8 - iter 240/242 - loss 0.28301868 - time (sec): 3.76 - samples/sec: 6532.36 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:48:20,830 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:20,830 EPOCH 8 done: loss 0.2845 - lr: 0.000011
2023-10-18 16:48:21,262 DEV : loss 0.24663908779621124 - f1-score (micro avg) 0.547
2023-10-18 16:48:21,266 saving best model
2023-10-18 16:48:21,306 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:21,698 epoch 9 - iter 24/242 - loss 0.25854915 - time (sec): 0.39 - samples/sec: 5206.54 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:48:22,074 epoch 9 - iter 48/242 - loss 0.26339176 - time (sec): 0.77 - samples/sec: 5874.13 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:48:22,487 epoch 9 - iter 72/242 - loss 0.27772691 - time (sec): 1.18 - samples/sec: 6048.06 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:48:22,904 epoch 9 - iter 96/242 - loss 0.29699368 - time (sec): 1.60 - samples/sec: 6003.44 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:48:23,309 epoch 9 - iter 120/242 - loss 0.29822567 - time (sec): 2.00 - samples/sec: 6020.52 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:23,675 epoch 9 - iter 144/242 - loss 0.28198629 - time (sec): 2.37 - samples/sec: 6033.25 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:24,029 epoch 9 - iter 168/242 - loss 0.28545819 - time (sec): 2.72 - samples/sec: 6058.61 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:48:24,407 epoch 9 - iter 192/242 - loss 0.28498541 - time (sec): 3.10 - samples/sec: 6204.10 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:48:24,789 epoch 9 - iter 216/242 - loss 0.27924178 - time (sec): 3.48 - samples/sec: 6316.12 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:48:25,167 epoch 9 - iter 240/242 - loss 0.27642574 - time (sec): 3.86 - samples/sec: 6399.42 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:48:25,195 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:25,195 EPOCH 9 done: loss 0.2766 - lr: 0.000006
2023-10-18 16:48:25,629 DEV : loss 0.2363634556531906 - f1-score (micro avg) 0.5531
2023-10-18 16:48:25,633 saving best model
2023-10-18 16:48:25,666 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:26,029 epoch 10 - iter 24/242 - loss 0.25436219 - time (sec): 0.36 - samples/sec: 6207.54 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:48:26,399 epoch 10 - iter 48/242 - loss 0.25221311 - time (sec): 0.73 - samples/sec: 6402.65 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:48:26,783 epoch 10 - iter 72/242 - loss 0.25841165 - time (sec): 1.12 - samples/sec: 6376.23 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:48:27,177 epoch 10 - iter 96/242 - loss 0.25213135 - time (sec): 1.51 - samples/sec: 6420.68 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:48:27,546 epoch 10 - iter 120/242 - loss 0.26644826 - time (sec): 1.88 - samples/sec: 6397.10 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:48:27,924 epoch 10 - iter 144/242 - loss 0.27616057 - time (sec): 2.26 - samples/sec: 6541.61 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:48:28,299 epoch 10 - iter 168/242 - loss 0.26407937 - time (sec): 2.63 - samples/sec: 6555.33 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:48:28,661 epoch 10 - iter 192/242 - loss 0.26815631 - time (sec): 2.99 - samples/sec: 6561.04 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:48:29,039 epoch 10 - iter 216/242 - loss 0.26617524 - time (sec): 3.37 - samples/sec: 6565.71 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:48:29,406 epoch 10 - iter 240/242 - loss 0.27271415 - time (sec): 3.74 - samples/sec: 6568.71 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:48:29,435 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:29,435 EPOCH 10 done: loss 0.2712 - lr: 0.000000
2023-10-18 16:48:29,878 DEV : loss 0.23566073179244995 - f1-score (micro avg) 0.5531
2023-10-18 16:48:29,912 ----------------------------------------------------------------------------------------------------
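Note on checkpoint selection: the "saving best model" lines above appear after epochs 2, 3, 4, 7, 8 and 9, which is consistent with keeping a checkpoint only when the dev micro-F1 strictly improves. The sketch below replays the DEV lines of this log under that assumption. The helper `saved_epochs` and the starting value of 0.0 are hypothetical and are not taken from Flair's source.

```python
import re

# DEV lines copied from this log (timestamps dropped for brevity).
DEV_LINES = """\
DEV : loss 0.6494866609573364 - f1-score (micro avg) 0.0
DEV : loss 0.3973432183265686 - f1-score (micro avg) 0.2292
DEV : loss 0.315167635679245 - f1-score (micro avg) 0.5118
DEV : loss 0.2958347797393799 - f1-score (micro avg) 0.512
DEV : loss 0.2639392018318176 - f1-score (micro avg) 0.5037
DEV : loss 0.2603849470615387 - f1-score (micro avg) 0.5104
DEV : loss 0.24460992217063904 - f1-score (micro avg) 0.5291
DEV : loss 0.24663908779621124 - f1-score (micro avg) 0.547
DEV : loss 0.2363634556531906 - f1-score (micro avg) 0.5531
DEV : loss 0.23566073179244995 - f1-score (micro avg) 0.5531
""".splitlines()

DEV_RE = re.compile(r"DEV : loss (\S+) - f1-score \(micro avg\) (\S+)")


def saved_epochs(lines, initial_best=0.0):
    """Return (epochs that would be saved, best dev F1).

    Assumption: a checkpoint is written only on a strict improvement.
    """
    best = initial_best
    saved = []
    epoch = 0
    for line in lines:
        match = DEV_RE.search(line)
        if match is None:
            continue
        epoch += 1
        f1 = float(match.group(2))
        if f1 > best:
            best = f1
            saved.append(epoch)
    return saved, best


if __name__ == "__main__":
    print(saved_epochs(DEV_LINES))
```

Under this rule epoch 10 ties epoch 9 at 0.5531 and is not saved, so the model loaded for the final evaluation would be the epoch 9 checkpoint.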
2023-10-18 16:48:29,913 Loading model from best epoch ...
2023-10-18 16:48:29,981 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 16:48:30,398
Results:
- F-score (micro) 0.4879
- F-score (macro) 0.2774
- Accuracy 0.3387
By class:
              precision    recall  f1-score   support

       scope     0.3192    0.5271    0.3977       129
        pers     0.6154    0.7482    0.6753       139
        work     0.4634    0.2375    0.3140        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4515    0.5306    0.4879       360
   macro avg     0.2796    0.3026    0.2774       360
weighted avg     0.4550    0.5306    0.4730       360
2023-10-18 16:48:30,398 ----------------------------------------------------------------------------------------------------
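Note on the averages: the summary rows can be cross-checked from the per-class rows. The sketch below recomputes micro, macro and weighted F1 from the rounded values printed in the table, so agreement is only expected to about three decimal places. It assumes the usual definitions: micro F1 as the harmonic mean of micro precision and recall, macro F1 as the unweighted mean over the five listed classes, and weighted F1 as the support-weighted mean.

```python
# Per-class rows from the table: (precision, recall, f1, support).
per_class = {
    "scope": (0.3192, 0.5271, 0.3977, 129),
    "pers": (0.6154, 0.7482, 0.6753, 139),
    "work": (0.4634, 0.2375, 0.3140, 80),
    "loc": (0.0000, 0.0000, 0.0000, 9),
    "date": (0.0000, 0.0000, 0.0000, 3),
}

# Micro-averaged precision and recall as reported in the "micro avg" row.
micro_p, micro_r = 0.4515, 0.5306

# Harmonic mean of micro precision and recall.
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Mean of the per-class F1 scores weighted by support.
total_support = sum(n for _, _, _, n in per_class.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in per_class.values()) / total_support

if __name__ == "__main__":
    print(f"micro F1    {micro_f1:.4f}")
    print(f"macro F1    {macro_f1:.4f}")
    print(f"weighted F1 {weighted_f1:.4f}")
```

The two classes with zero scores (loc and date, 12 of 360 test entities) account for the gap between the macro and micro averages.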