2023-10-23 15:10:48,290 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,291 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-23 15:10:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,291 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-23 15:10:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,291 Train: 1100 sentences
2023-10-23 15:10:48,291 (train_with_dev=False, train_with_test=False)
2023-10-23 15:10:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,291 Training Params:
2023-10-23 15:10:48,291  - learning_rate: "3e-05"
2023-10-23 15:10:48,291  - mini_batch_size: "4"
2023-10-23 15:10:48,291  - max_epochs: "10"
2023-10-23 15:10:48,291  - shuffle: "True"
2023-10-23 15:10:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,291 Plugins:
2023-10-23 15:10:48,291  - TensorboardLogger
2023-10-23 15:10:48,291  - LinearScheduler | warmup_fraction: '0.1'
2023-10-23 15:10:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,291 Final evaluation on model from best epoch (best-model.pt)
2023-10-23 15:10:48,291  - metric: "('micro avg', 'f1-score')"
2023-10-23 15:10:48,291 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,292 Computation:
2023-10-23 15:10:48,292  - compute on device: cuda:0
2023-10-23 15:10:48,292  - embedding storage: none
2023-10-23 15:10:48,292 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,292 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-23 15:10:48,292 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,292 ----------------------------------------------------------------------------------------------------
2023-10-23 15:10:48,292 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-23 15:10:49,679 epoch 1 - iter 27/275 - loss 2.98712407 - time (sec): 1.39 - samples/sec: 1495.05 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:10:51,072 epoch 1 - iter 54/275 - loss 2.25454461 - time (sec): 2.78 - samples/sec: 1512.36 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:10:52,451 epoch 1 - iter 81/275 - loss 1.78847859 - time (sec): 4.16 - samples/sec: 1526.75 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:10:53,833 epoch 1 - iter 108/275 - loss 1.50816752 - time (sec): 5.54 - samples/sec: 1544.47 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:10:55,212 epoch 1 - iter 135/275 - loss 1.30242695 - time (sec): 6.92 - samples/sec: 1588.82 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:10:56,592 epoch 1 - iter 162/275 - loss 1.15981979 - time (sec): 8.30 - samples/sec: 1591.12 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:10:57,991 epoch 1 - iter 189/275 - loss 1.03963225 - time (sec): 9.70 - samples/sec: 1610.86 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:10:59,377 epoch 1 - iter 216/275 - loss 0.93902343 - time (sec): 11.08 - samples/sec: 1613.22 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:11:00,763 epoch 1 - iter 243/275 - loss 0.86646736 - time (sec): 12.47 - samples/sec: 1612.16 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:11:02,155 epoch 1 - iter 270/275 - loss 0.80171802 - time (sec): 13.86 - samples/sec: 1614.01 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:11:02,412 ----------------------------------------------------------------------------------------------------
2023-10-23 15:11:02,412 EPOCH 1 done: loss 0.7906 - lr: 0.000029
2023-10-23 15:11:02,833 DEV : loss 0.17415276169776917 - f1-score (micro avg) 0.757
2023-10-23 15:11:02,839 saving best model
2023-10-23 15:11:03,251 ----------------------------------------------------------------------------------------------------
2023-10-23 15:11:04,629 epoch 2 - iter 27/275 - loss 0.21117263 - time (sec): 1.38 - samples/sec: 1486.65 - lr: 0.000030 - momentum: 0.000000
2023-10-23 15:11:06,028 epoch 2 - iter 54/275 - loss 0.19557817 - time (sec): 2.78 - samples/sec: 1506.62 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:11:07,419 epoch 2 - iter 81/275 - loss 0.17493755 - time (sec): 4.17 - samples/sec: 1497.31 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:11:08,804 epoch 2 - iter 108/275 - loss 0.16479708 - time (sec): 5.55 - samples/sec: 1588.80 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:11:10,209 epoch 2 - iter 135/275 - loss 0.17148826 - time (sec): 6.96 - samples/sec: 1586.72 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:11:11,626 epoch 2 - iter 162/275 - loss 0.16754563 - time (sec): 8.37 - samples/sec: 1595.84 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:11:13,027 epoch 2 - iter 189/275 - loss 0.15856896 - time (sec): 9.77 - samples/sec: 1613.49 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:11:14,415 epoch 2 - iter 216/275 - loss 0.15952132 - time (sec): 11.16 - samples/sec: 1605.57 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:11:15,801 epoch 2 - iter 243/275 - loss 0.15685946 - time (sec): 12.55 - samples/sec: 1598.51 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:11:17,188 epoch 2 - iter 270/275 - loss 0.15667590 - time (sec): 13.94 - samples/sec: 1608.14 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:11:17,448 ----------------------------------------------------------------------------------------------------
2023-10-23 15:11:17,448 EPOCH 2 done: loss 0.1553 - lr: 0.000027
2023-10-23 15:11:17,993 DEV : loss 0.11974433809518814 - f1-score (micro avg) 0.8386
2023-10-23 15:11:17,999 saving best model
2023-10-23 15:11:18,556 ----------------------------------------------------------------------------------------------------
2023-10-23 15:11:19,954 epoch 3 - iter 27/275 - loss 0.08190109 - time (sec): 1.39 - samples/sec: 1589.65 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:11:21,347 epoch 3 - iter 54/275 - loss 0.07224184 - time (sec): 2.79 - samples/sec: 1630.55 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:11:22,748 epoch 3 - iter 81/275 - loss 0.08907885 - time (sec): 4.19 - samples/sec: 1624.61 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:11:24,132 epoch 3 - iter 108/275 - loss 0.09545926 - time (sec): 5.57 - samples/sec: 1683.21 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:11:25,524 epoch 3 - iter 135/275 - loss 0.11002197 - time (sec): 6.96 - samples/sec: 1635.95 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:11:26,913 epoch 3 - iter 162/275 - loss 0.10701995 - time (sec): 8.35 - samples/sec: 1646.36 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:11:28,304 epoch 3 - iter 189/275 - loss 0.10587302 - time (sec): 9.74 - samples/sec: 1626.83 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:11:29,700 epoch 3 - iter 216/275 - loss 0.09875895 - time (sec): 11.14 - samples/sec: 1628.62 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:11:31,255 epoch 3 - iter 243/275 - loss 0.09656745 - time (sec): 12.70 - samples/sec: 1596.35 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:11:32,623 epoch 3 - iter 270/275 - loss 0.09612509 - time (sec): 14.06 - samples/sec: 1585.43 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:11:32,880 ----------------------------------------------------------------------------------------------------
2023-10-23 15:11:32,880 EPOCH 3 done: loss 0.0960 - lr: 0.000023
2023-10-23 15:11:33,415 DEV : loss 0.133950337767601 - f1-score (micro avg) 0.8643
2023-10-23 15:11:33,421 saving best model
2023-10-23 15:11:33,969 ----------------------------------------------------------------------------------------------------
2023-10-23 15:11:35,356 epoch 4 - iter 27/275 - loss 0.07581656 - time (sec): 1.38 - samples/sec: 1409.15 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:11:36,738 epoch 4 - iter 54/275 - loss 0.06649911 - time (sec): 2.77 - samples/sec: 1469.15 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:11:38,116 epoch 4 - iter 81/275 - loss 0.07343234 - time (sec): 4.14 - samples/sec: 1485.47 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:11:39,488 epoch 4 - iter 108/275 - loss 0.07039412 - time (sec): 5.52 - samples/sec: 1511.62 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:11:40,880 epoch 4 - iter 135/275 - loss 0.06985930 - time (sec): 6.91 - samples/sec: 1561.83 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:11:42,261 epoch 4 - iter 162/275 - loss 0.07025187 - time (sec): 8.29 - samples/sec: 1573.98 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:11:43,662 epoch 4 - iter 189/275 - loss 0.07404057 - time (sec): 9.69 - samples/sec: 1575.52 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:11:45,065 epoch 4 - iter 216/275 - loss 0.07100467 - time (sec): 11.09 - samples/sec: 1564.05 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:11:46,453 epoch 4 - iter 243/275 - loss 0.07510567 - time (sec): 12.48 - samples/sec: 1601.41 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:11:47,852 epoch 4 - iter 270/275 - loss 0.07349453 - time (sec): 13.88 - samples/sec: 1606.91 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:11:48,112 ----------------------------------------------------------------------------------------------------
2023-10-23 15:11:48,112 EPOCH 4 done: loss 0.0734 - lr: 0.000020
2023-10-23 15:11:48,655 DEV : loss 0.15477542579174042 - f1-score (micro avg) 0.868
2023-10-23 15:11:48,661 saving best model
2023-10-23 15:11:49,214 ----------------------------------------------------------------------------------------------------
2023-10-23 15:11:50,601 epoch 5 - iter 27/275 - loss 0.06044995 - time (sec): 1.38 - samples/sec: 1492.89 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:11:51,975 epoch 5 - iter 54/275 - loss 0.05635748 - time (sec): 2.76 - samples/sec: 1608.01 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:11:53,296 epoch 5 - iter 81/275 - loss 0.04774531 - time (sec): 4.08 - samples/sec: 1654.49 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:11:54,684 epoch 5 - iter 108/275 - loss 0.05419372 - time (sec): 5.47 - samples/sec: 1643.54 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:11:56,066 epoch 5 - iter 135/275 - loss 0.05555452 - time (sec): 6.85 - samples/sec: 1638.67 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:11:57,452 epoch 5 - iter 162/275 - loss 0.05160946 - time (sec): 8.23 - samples/sec: 1628.45 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:11:58,841 epoch 5 - iter 189/275 - loss 0.04729330 - time (sec): 9.62 - samples/sec: 1617.99 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:12:00,229 epoch 5 - iter 216/275 - loss 0.05073596 - time (sec): 11.01 - samples/sec: 1615.73 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:12:01,618 epoch 5 - iter 243/275 - loss 0.05253668 - time (sec): 12.40 - samples/sec: 1603.74 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:12:03,013 epoch 5 - iter 270/275 - loss 0.04891192 - time (sec): 13.80 - samples/sec: 1611.14 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:12:03,277 ----------------------------------------------------------------------------------------------------
2023-10-23 15:12:03,278 EPOCH 5 done: loss 0.0479 - lr: 0.000017
2023-10-23 15:12:03,819 DEV : loss 0.15213628113269806 - f1-score (micro avg) 0.8765
2023-10-23 15:12:03,825 saving best model
2023-10-23 15:12:04,380 ----------------------------------------------------------------------------------------------------
2023-10-23 15:12:05,755 epoch 6 - iter 27/275 - loss 0.03389629 - time (sec): 1.37 - samples/sec: 1523.25 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:12:07,145 epoch 6 - iter 54/275 - loss 0.02513946 - time (sec): 2.76 - samples/sec: 1557.93 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:12:08,533 epoch 6 - iter 81/275 - loss 0.03593165 - time (sec): 4.15 - samples/sec: 1567.17 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:12:09,925 epoch 6 - iter 108/275 - loss 0.04743005 - time (sec): 5.54 - samples/sec: 1612.35 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:12:11,308 epoch 6 - iter 135/275 - loss 0.04356837 - time (sec): 6.93 - samples/sec: 1641.40 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:12:12,696 epoch 6 - iter 162/275 - loss 0.04032726 - time (sec): 8.31 - samples/sec: 1648.15 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:12:14,081 epoch 6 - iter 189/275 - loss 0.04025393 - time (sec): 9.70 - samples/sec: 1625.17 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:12:15,472 epoch 6 - iter 216/275 - loss 0.03849286 - time (sec): 11.09 - samples/sec: 1630.18 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:12:16,854 epoch 6 - iter 243/275 - loss 0.03682546 - time (sec): 12.47 - samples/sec: 1636.50 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:12:18,228 epoch 6 - iter 270/275 - loss 0.03378875 - time (sec): 13.85 - samples/sec: 1610.76 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:12:18,485 ----------------------------------------------------------------------------------------------------
2023-10-23 15:12:18,485 EPOCH 6 done: loss 0.0347 - lr: 0.000013
2023-10-23 15:12:19,022 DEV : loss 0.15631859004497528 - f1-score (micro avg) 0.8779
2023-10-23 15:12:19,028 saving best model
2023-10-23 15:12:19,594 ----------------------------------------------------------------------------------------------------
2023-10-23 15:12:20,868 epoch 7 - iter 27/275 - loss 0.02196458 - time (sec): 1.27 - samples/sec: 1969.43 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:12:22,123 epoch 7 - iter 54/275 - loss 0.04158811 - time (sec): 2.53 - samples/sec: 1817.16 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:12:23,403 epoch 7 - iter 81/275 - loss 0.04296887 - time (sec): 3.81 - samples/sec: 1706.74 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:12:24,694 epoch 7 - iter 108/275 - loss 0.04355020 - time (sec): 5.10 - samples/sec: 1746.20 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:12:25,994 epoch 7 - iter 135/275 - loss 0.04131326 - time (sec): 6.40 - samples/sec: 1717.86 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:12:27,308 epoch 7 - iter 162/275 - loss 0.03799736 - time (sec): 7.71 - samples/sec: 1736.24 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:12:28,602 epoch 7 - iter 189/275 - loss 0.03327018 - time (sec): 9.01 - samples/sec: 1738.02 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:12:29,895 epoch 7 - iter 216/275 - loss 0.03359652 - time (sec): 10.30 - samples/sec: 1727.37 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:12:31,204 epoch 7 - iter 243/275 - loss 0.03160177 - time (sec): 11.61 - samples/sec: 1718.90 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:12:32,501 epoch 7 - iter 270/275 - loss 0.02882578 - time (sec): 12.90 - samples/sec: 1737.04 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:12:32,736 ----------------------------------------------------------------------------------------------------
2023-10-23 15:12:32,736 EPOCH 7 done: loss 0.0285 - lr: 0.000010
2023-10-23 15:12:33,277 DEV : loss 0.16955971717834473 - f1-score (micro avg) 0.8744
2023-10-23 15:12:33,282 ----------------------------------------------------------------------------------------------------
2023-10-23 15:12:34,599 epoch 8 - iter 27/275 - loss 0.01998180 - time (sec): 1.32 - samples/sec: 1537.72 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:12:35,899 epoch 8 - iter 54/275 - loss 0.01378686 - time (sec): 2.62 - samples/sec: 1613.88 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:12:37,203 epoch 8 - iter 81/275 - loss 0.01143161 - time (sec): 3.92 - samples/sec: 1678.18 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:12:38,496 epoch 8 - iter 108/275 - loss 0.01176226 - time (sec): 5.21 - samples/sec: 1631.65 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:12:39,772 epoch 8 - iter 135/275 - loss 0.01366111 - time (sec): 6.49 - samples/sec: 1678.66 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:12:41,024 epoch 8 - iter 162/275 - loss 0.01647324 - time (sec): 7.74 - samples/sec: 1694.04 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:12:42,271 epoch 8 - iter 189/275 - loss 0.01798398 - time (sec): 8.99 - samples/sec: 1686.30 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:12:43,525 epoch 8 - iter 216/275 - loss 0.02142971 - time (sec): 10.24 - samples/sec: 1723.83 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:12:44,787 epoch 8 - iter 243/275 - loss 0.02026423 - time (sec): 11.50 - samples/sec: 1749.84 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:12:46,033 epoch 8 - iter 270/275 - loss 0.02207763 - time (sec): 12.75 - samples/sec: 1749.45 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:12:46,267 ----------------------------------------------------------------------------------------------------
2023-10-23 15:12:46,267 EPOCH 8 done: loss 0.0216 - lr: 0.000007
2023-10-23 15:12:46,812 DEV : loss 0.15825608372688293 - f1-score (micro avg) 0.8967
2023-10-23 15:12:46,818 saving best model
2023-10-23 15:12:47,390 ----------------------------------------------------------------------------------------------------
2023-10-23 15:12:48,799 epoch 9 - iter 27/275 - loss 0.03110460 - time (sec): 1.40 - samples/sec: 1515.44 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:12:50,181 epoch 9 - iter 54/275 - loss 0.03372885 - time (sec): 2.79 - samples/sec: 1642.27 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:12:51,581 epoch 9 - iter 81/275 - loss 0.02492065 - time (sec): 4.19 - samples/sec: 1655.58 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:12:52,963 epoch 9 - iter 108/275 - loss 0.02139891 - time (sec): 5.57 - samples/sec: 1642.73 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:12:54,343 epoch 9 - iter 135/275 - loss 0.02002837 - time (sec): 6.95 - samples/sec: 1633.58 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:12:55,729 epoch 9 - iter 162/275 - loss 0.01736368 - time (sec): 8.33 - samples/sec: 1619.78 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:12:57,098 epoch 9 - iter 189/275 - loss 0.01947021 - time (sec): 9.70 - samples/sec: 1605.30 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:12:58,487 epoch 9 - iter 216/275 - loss 0.01800969 - time (sec): 11.09 - samples/sec: 1608.53 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:12:59,873 epoch 9 - iter 243/275 - loss 0.01719567 - time (sec): 12.48 - samples/sec: 1611.67 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:13:01,256 epoch 9 - iter 270/275 - loss 0.01779564 - time (sec): 13.86 - samples/sec: 1619.62 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:13:01,516 ----------------------------------------------------------------------------------------------------
2023-10-23 15:13:01,517 EPOCH 9 done: loss 0.0175 - lr: 0.000003
2023-10-23 15:13:02,062 DEV : loss 0.1630079746246338 - f1-score (micro avg) 0.8851
2023-10-23 15:13:02,068 ----------------------------------------------------------------------------------------------------
2023-10-23 15:13:03,456 epoch 10 - iter 27/275 - loss 0.00301554 - time (sec): 1.39 - samples/sec: 1637.10 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:13:04,837 epoch 10 - iter 54/275 - loss 0.00340856 - time (sec): 2.77 - samples/sec: 1583.62 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:13:06,220 epoch 10 - iter 81/275 - loss 0.00567843 - time (sec): 4.15 - samples/sec: 1566.23 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:13:07,608 epoch 10 - iter 108/275 - loss 0.00460529 - time (sec): 5.54 - samples/sec: 1549.78 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:13:08,996 epoch 10 - iter 135/275 - loss 0.00808461 - time (sec): 6.93 - samples/sec: 1543.84 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:13:10,376 epoch 10 - iter 162/275 - loss 0.00815247 - time (sec): 8.31 - samples/sec: 1548.22 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:13:11,766 epoch 10 - iter 189/275 - loss 0.00864685 - time (sec): 9.70 - samples/sec: 1577.54 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:13:13,155 epoch 10 - iter 216/275 - loss 0.00795843 - time (sec): 11.09 - samples/sec: 1580.00 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:13:14,540 epoch 10 - iter 243/275 - loss 0.01293225 - time (sec): 12.47 - samples/sec: 1583.35 - lr: 0.000000 - momentum: 0.000000
2023-10-23 15:13:15,920 epoch 10 - iter 270/275 - loss 0.01618902 - time (sec): 13.85 - samples/sec: 1610.49 - lr: 0.000000 - momentum: 0.000000
2023-10-23 15:13:16,171 ----------------------------------------------------------------------------------------------------
2023-10-23 15:13:16,171 EPOCH 10 done: loss 0.0159 - lr: 0.000000
2023-10-23 15:13:16,714 DEV : loss 0.16544003784656525 - f1-score (micro avg) 0.8843
2023-10-23 15:13:17,161 ----------------------------------------------------------------------------------------------------
2023-10-23 15:13:17,162 Loading model from best epoch ...
2023-10-23 15:13:19,008 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-23 15:13:19,684
Results:
- F-score (micro) 0.9117
- F-score (macro) 0.7427
- Accuracy 0.8543

By class:
              precision    recall  f1-score   support

       scope     0.9000    0.9205    0.9101       176
        pers     1.0000    0.9297    0.9636       128
        work     0.8289    0.8514    0.8400        74
      object     1.0000    1.0000    1.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.9178    0.9058    0.9117       382
   macro avg     0.7458    0.7403    0.7427       382
weighted avg     0.9156    0.9058    0.9101       382

2023-10-23 15:13:19,684 ----------------------------------------------------------------------------------------------------
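
The summary rows of the "By class" table can be recomputed from the per-class rows, which helps explain why the macro F-score (0.7427) sits so far below the micro F-score (0.9117): the `loc` class has only 2 test entities and scores 0.0, and the macro average weights it the same as `scope` with 176. The sketch below is not part of the training run. It assumes the usual definitions (macro = unweighted mean over classes, weighted = support-weighted mean, F1 = harmonic mean of precision and recall), and the helper names are illustrative. Because the inputs are the 4-decimal values printed in the log, the results can differ from the logged ones in the last digit.

```python
# Per-class rows copied from the "By class" table in the log:
# label -> (precision, recall, f1, support)
per_class = {
    "scope":  (0.9000, 0.9205, 0.9101, 176),
    "pers":   (1.0000, 0.9297, 0.9636, 128),
    "work":   (0.8289, 0.8514, 0.8400, 74),
    "object": (1.0000, 1.0000, 1.0000, 2),
    "loc":    (0.0000, 0.0000, 0.0000, 2),
}

def macro_average(rows, column):
    """Unweighted mean of one metric column over all classes."""
    values = [row[column] for row in rows.values()]
    return sum(values) / len(values)

def weighted_average(rows, column):
    """Mean of one metric column, weighted by each class's support."""
    total_support = sum(row[3] for row in rows.values())
    return sum(row[column] * row[3] for row in rows.values()) / total_support

def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

total_support = sum(row[3] for row in per_class.values())
macro_f1 = macro_average(per_class, 2)
weighted_f1 = weighted_average(per_class, 2)
# Micro precision and recall are taken from the "micro avg" row of the log.
micro_f1 = f1_from(0.9178, 0.9058)

print(f"support={total_support}")
print(f"macro F1    ~ {macro_f1:.4f}")
print(f"weighted F1 ~ {weighted_f1:.4f}")
print(f"micro F1    ~ {micro_f1:.4f}")
```

Dropping the `loc` row from `per_class` and rerunning shows how much a single two-entity class moves the macro figure, while the micro figure barely changes.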