2023-10-20 09:18:28,284 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Train: 6183 sentences
2023-10-20 09:18:28,285 (train_with_dev=False, train_with_test=False)
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Training Params:
2023-10-20 09:18:28,285 - learning_rate: "5e-05"
2023-10-20 09:18:28,285 - mini_batch_size: "8"
2023-10-20 09:18:28,285 - max_epochs: "10"
2023-10-20 09:18:28,285 - shuffle: "True"
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Plugins:
2023-10-20 09:18:28,285 - TensorboardLogger
2023-10-20 09:18:28,285 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 09:18:28,285 - metric: "('micro avg', 'f1-score')"
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Computation:
2023-10-20 09:18:28,285 - compute on device: cuda:0
2023-10-20 09:18:28,285 - embedding storage: none
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,286 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,286 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 09:18:30,058 epoch 1 - iter 77/773 - loss 3.29697785 - time (sec): 1.77 - samples/sec: 7168.00 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:18:31,992 epoch 1 - iter 154/773 - loss 2.98890475 - time (sec): 3.71 - samples/sec: 6576.90 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:18:33,803 epoch 1 - iter 231/773 - loss 2.52454647 - time (sec): 5.52 - samples/sec: 6624.09 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:18:35,611 epoch 1 - iter 308/773 - loss 2.02302739 - time (sec): 7.32 - samples/sec: 6736.79 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:18:37,338 epoch 1 - iter 385/773 - loss 1.67463555 - time (sec): 9.05 - samples/sec: 6790.49 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:18:39,053 epoch 1 - iter 462/773 - loss 1.45168116 - time (sec): 10.77 - samples/sec: 6786.86 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:18:40,795 epoch 1 - iter 539/773 - loss 1.27870844 - time (sec): 12.51 - samples/sec: 6856.46 - lr: 0.000035 - momentum: 0.000000
2023-10-20 09:18:42,569 epoch 1 - iter 616/773 - loss 1.14349474 - time (sec): 14.28 - samples/sec: 6915.75 - lr: 0.000040 - momentum: 0.000000
2023-10-20 09:18:44,300 epoch 1 - iter 693/773 - loss 1.04649070 - time (sec): 16.01 - samples/sec: 6911.30 - lr: 0.000045 - momentum: 0.000000
2023-10-20 09:18:46,008 epoch 1 - iter 770/773 - loss 0.96192728 - time (sec): 17.72 - samples/sec: 6982.19 - lr: 0.000050 - momentum: 0.000000
2023-10-20 09:18:46,078 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:46,078 EPOCH 1 done: loss 0.9579 - lr: 0.000050
2023-10-20 09:18:47,081 DEV : loss 0.13065889477729797 - f1-score (micro avg) 0.0
2023-10-20 09:18:47,093 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:48,821 epoch 2 - iter 77/773 - loss 0.21812324 - time (sec): 1.73 - samples/sec: 7194.59 - lr: 0.000049 - momentum: 0.000000
2023-10-20 09:18:50,518 epoch 2 - iter 154/773 - loss 0.20906091 - time (sec): 3.42 - samples/sec: 7000.84 - lr: 0.000049 - momentum: 0.000000
2023-10-20 09:18:52,209 epoch 2 - iter 231/773 - loss 0.20993493 - time (sec): 5.12 - samples/sec: 6931.96 - lr: 0.000048 - momentum: 0.000000
2023-10-20 09:18:54,042 epoch 2 - iter 308/773 - loss 0.19690084 - time (sec): 6.95 - samples/sec: 6958.80 - lr: 0.000048 - momentum: 0.000000
2023-10-20 09:18:55,800 epoch 2 - iter 385/773 - loss 0.19499084 - time (sec): 8.71 - samples/sec: 7026.30 - lr: 0.000047 - momentum: 0.000000
2023-10-20 09:18:57,611 epoch 2 - iter 462/773 - loss 0.19104238 - time (sec): 10.52 - samples/sec: 6931.75 - lr: 0.000047 - momentum: 0.000000
2023-10-20 09:18:59,427 epoch 2 - iter 539/773 - loss 0.18943072 - time (sec): 12.33 - samples/sec: 6878.89 - lr: 0.000046 - momentum: 0.000000
2023-10-20 09:19:01,231 epoch 2 - iter 616/773 - loss 0.18586711 - time (sec): 14.14 - samples/sec: 6897.61 - lr: 0.000046 - momentum: 0.000000
2023-10-20 09:19:03,004 epoch 2 - iter 693/773 - loss 0.18459343 - time (sec): 15.91 - samples/sec: 6903.21 - lr: 0.000045 - momentum: 0.000000
2023-10-20 09:19:04,772 epoch 2 - iter 770/773 - loss 0.18021271 - time (sec): 17.68 - samples/sec: 6991.26 - lr: 0.000044 - momentum: 0.000000
2023-10-20 09:19:04,847 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:04,848 EPOCH 2 done: loss 0.1796 - lr: 0.000044
2023-10-20 09:19:06,215 DEV : loss 0.08806052803993225 - f1-score (micro avg) 0.4817
2023-10-20 09:19:06,227 saving best model
2023-10-20 09:19:06,256 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:08,034 epoch 3 - iter 77/773 - loss 0.16021178 - time (sec): 1.78 - samples/sec: 6437.36 - lr: 0.000044 - momentum: 0.000000
2023-10-20 09:19:09,769 epoch 3 - iter 154/773 - loss 0.14807212 - time (sec): 3.51 - samples/sec: 6858.33 - lr: 0.000043 - momentum: 0.000000
2023-10-20 09:19:11,509 epoch 3 - iter 231/773 - loss 0.14019763 - time (sec): 5.25 - samples/sec: 6941.49 - lr: 0.000043 - momentum: 0.000000
2023-10-20 09:19:13,236 epoch 3 - iter 308/773 - loss 0.14847424 - time (sec): 6.98 - samples/sec: 7082.25 - lr: 0.000042 - momentum: 0.000000
2023-10-20 09:19:14,934 epoch 3 - iter 385/773 - loss 0.14709744 - time (sec): 8.68 - samples/sec: 7070.55 - lr: 0.000042 - momentum: 0.000000
2023-10-20 09:19:16,681 epoch 3 - iter 462/773 - loss 0.14818828 - time (sec): 10.42 - samples/sec: 7183.86 - lr: 0.000041 - momentum: 0.000000
2023-10-20 09:19:18,432 epoch 3 - iter 539/773 - loss 0.14929835 - time (sec): 12.18 - samples/sec: 7178.10 - lr: 0.000041 - momentum: 0.000000
2023-10-20 09:19:20,162 epoch 3 - iter 616/773 - loss 0.14786552 - time (sec): 13.91 - samples/sec: 7189.40 - lr: 0.000040 - momentum: 0.000000
2023-10-20 09:19:21,899 epoch 3 - iter 693/773 - loss 0.14779447 - time (sec): 15.64 - samples/sec: 7105.63 - lr: 0.000039 - momentum: 0.000000
2023-10-20 09:19:23,605 epoch 3 - iter 770/773 - loss 0.14693652 - time (sec): 17.35 - samples/sec: 7129.47 - lr: 0.000039 - momentum: 0.000000
2023-10-20 09:19:23,676 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:23,676 EPOCH 3 done: loss 0.1467 - lr: 0.000039
2023-10-20 09:19:24,745 DEV : loss 0.07945284247398376 - f1-score (micro avg) 0.585
2023-10-20 09:19:24,756 saving best model
2023-10-20 09:19:24,790 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:26,577 epoch 4 - iter 77/773 - loss 0.13996426 - time (sec): 1.79 - samples/sec: 7057.97 - lr: 0.000038 - momentum: 0.000000
2023-10-20 09:19:28,330 epoch 4 - iter 154/773 - loss 0.13144568 - time (sec): 3.54 - samples/sec: 7020.35 - lr: 0.000038 - momentum: 0.000000
2023-10-20 09:19:30,061 epoch 4 - iter 231/773 - loss 0.13656365 - time (sec): 5.27 - samples/sec: 6788.28 - lr: 0.000037 - momentum: 0.000000
2023-10-20 09:19:31,745 epoch 4 - iter 308/773 - loss 0.13712233 - time (sec): 6.95 - samples/sec: 6967.80 - lr: 0.000037 - momentum: 0.000000
2023-10-20 09:19:33,517 epoch 4 - iter 385/773 - loss 0.13771739 - time (sec): 8.73 - samples/sec: 6955.05 - lr: 0.000036 - momentum: 0.000000
2023-10-20 09:19:35,269 epoch 4 - iter 462/773 - loss 0.13408346 - time (sec): 10.48 - samples/sec: 6958.02 - lr: 0.000036 - momentum: 0.000000
2023-10-20 09:19:37,000 epoch 4 - iter 539/773 - loss 0.13073922 - time (sec): 12.21 - samples/sec: 7056.03 - lr: 0.000035 - momentum: 0.000000
2023-10-20 09:19:38,795 epoch 4 - iter 616/773 - loss 0.13003178 - time (sec): 14.00 - samples/sec: 7078.88 - lr: 0.000034 - momentum: 0.000000
2023-10-20 09:19:40,490 epoch 4 - iter 693/773 - loss 0.13006617 - time (sec): 15.70 - samples/sec: 7082.33 - lr: 0.000034 - momentum: 0.000000
2023-10-20 09:19:42,234 epoch 4 - iter 770/773 - loss 0.12994693 - time (sec): 17.44 - samples/sec: 7100.14 - lr: 0.000033 - momentum: 0.000000
2023-10-20 09:19:42,298 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:42,298 EPOCH 4 done: loss 0.1299 - lr: 0.000033
2023-10-20 09:19:43,375 DEV : loss 0.07809021323919296 - f1-score (micro avg) 0.6071
2023-10-20 09:19:43,386 saving best model
2023-10-20 09:19:43,425 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:45,188 epoch 5 - iter 77/773 - loss 0.11117031 - time (sec): 1.76 - samples/sec: 6996.81 - lr: 0.000033 - momentum: 0.000000
2023-10-20 09:19:46,900 epoch 5 - iter 154/773 - loss 0.11764714 - time (sec): 3.47 - samples/sec: 6894.04 - lr: 0.000032 - momentum: 0.000000
2023-10-20 09:19:48,635 epoch 5 - iter 231/773 - loss 0.11991202 - time (sec): 5.21 - samples/sec: 6905.69 - lr: 0.000032 - momentum: 0.000000
2023-10-20 09:19:50,432 epoch 5 - iter 308/773 - loss 0.11700014 - time (sec): 7.01 - samples/sec: 7024.72 - lr: 0.000031 - momentum: 0.000000
2023-10-20 09:19:52,197 epoch 5 - iter 385/773 - loss 0.11617792 - time (sec): 8.77 - samples/sec: 7091.97 - lr: 0.000031 - momentum: 0.000000
2023-10-20 09:19:53,901 epoch 5 - iter 462/773 - loss 0.11525526 - time (sec): 10.48 - samples/sec: 7097.11 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:19:55,632 epoch 5 - iter 539/773 - loss 0.11333040 - time (sec): 12.21 - samples/sec: 7074.53 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:19:57,353 epoch 5 - iter 616/773 - loss 0.11550300 - time (sec): 13.93 - samples/sec: 7091.84 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:19:59,064 epoch 5 - iter 693/773 - loss 0.11629394 - time (sec): 15.64 - samples/sec: 7154.16 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:20:00,821 epoch 5 - iter 770/773 - loss 0.11699897 - time (sec): 17.40 - samples/sec: 7117.38 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:20:00,890 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:00,890 EPOCH 5 done: loss 0.1169 - lr: 0.000028
2023-10-20 09:20:01,993 DEV : loss 0.07527422904968262 - f1-score (micro avg) 0.5973
2023-10-20 09:20:02,004 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:03,679 epoch 6 - iter 77/773 - loss 0.09373771 - time (sec): 1.67 - samples/sec: 7129.91 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:20:05,341 epoch 6 - iter 154/773 - loss 0.10168393 - time (sec): 3.34 - samples/sec: 7107.39 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:20:07,029 epoch 6 - iter 231/773 - loss 0.11169160 - time (sec): 5.02 - samples/sec: 7123.67 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:20:08,763 epoch 6 - iter 308/773 - loss 0.11308050 - time (sec): 6.76 - samples/sec: 7205.74 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:20:10,476 epoch 6 - iter 385/773 - loss 0.11734393 - time (sec): 8.47 - samples/sec: 7119.05 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:20:12,242 epoch 6 - iter 462/773 - loss 0.11359539 - time (sec): 10.24 - samples/sec: 7161.69 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:20:13,952 epoch 6 - iter 539/773 - loss 0.11053055 - time (sec): 11.95 - samples/sec: 7179.03 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:20:15,718 epoch 6 - iter 616/773 - loss 0.10958303 - time (sec): 13.71 - samples/sec: 7213.13 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:20:17,429 epoch 6 - iter 693/773 - loss 0.10856611 - time (sec): 15.42 - samples/sec: 7165.92 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:20:19,168 epoch 6 - iter 770/773 - loss 0.10997793 - time (sec): 17.16 - samples/sec: 7211.89 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:20:19,240 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:19,241 EPOCH 6 done: loss 0.1098 - lr: 0.000022
2023-10-20 09:20:20,340 DEV : loss 0.07457771897315979 - f1-score (micro avg) 0.6144
2023-10-20 09:20:20,354 saving best model
2023-10-20 09:20:20,394 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:22,182 epoch 7 - iter 77/773 - loss 0.10104822 - time (sec): 1.79 - samples/sec: 7514.85 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:20:23,898 epoch 7 - iter 154/773 - loss 0.09821556 - time (sec): 3.50 - samples/sec: 7100.36 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:20:25,626 epoch 7 - iter 231/773 - loss 0.09548090 - time (sec): 5.23 - samples/sec: 7242.33 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:20:27,321 epoch 7 - iter 308/773 - loss 0.10355270 - time (sec): 6.93 - samples/sec: 7143.99 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:20:29,117 epoch 7 - iter 385/773 - loss 0.10438038 - time (sec): 8.72 - samples/sec: 7124.85 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:20:30,854 epoch 7 - iter 462/773 - loss 0.10313345 - time (sec): 10.46 - samples/sec: 7148.72 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:20:32,639 epoch 7 - iter 539/773 - loss 0.10557270 - time (sec): 12.24 - samples/sec: 7134.59 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:20:34,316 epoch 7 - iter 616/773 - loss 0.10463863 - time (sec): 13.92 - samples/sec: 7183.24 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:20:36,084 epoch 7 - iter 693/773 - loss 0.10402035 - time (sec): 15.69 - samples/sec: 7119.43 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:20:37,828 epoch 7 - iter 770/773 - loss 0.10392693 - time (sec): 17.43 - samples/sec: 7102.19 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:20:37,899 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:37,899 EPOCH 7 done: loss 0.1037 - lr: 0.000017
2023-10-20 09:20:39,024 DEV : loss 0.07657597213983536 - f1-score (micro avg) 0.6242
2023-10-20 09:20:39,037 saving best model
2023-10-20 09:20:39,075 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:40,803 epoch 8 - iter 77/773 - loss 0.08347295 - time (sec): 1.73 - samples/sec: 7045.81 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:20:42,539 epoch 8 - iter 154/773 - loss 0.10176786 - time (sec): 3.46 - samples/sec: 7193.34 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:20:44,261 epoch 8 - iter 231/773 - loss 0.10219115 - time (sec): 5.19 - samples/sec: 7117.05 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:20:45,993 epoch 8 - iter 308/773 - loss 0.09661994 - time (sec): 6.92 - samples/sec: 7120.96 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:20:47,714 epoch 8 - iter 385/773 - loss 0.09443349 - time (sec): 8.64 - samples/sec: 7214.99 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:20:49,480 epoch 8 - iter 462/773 - loss 0.09714750 - time (sec): 10.40 - samples/sec: 7273.85 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:20:51,177 epoch 8 - iter 539/773 - loss 0.09639649 - time (sec): 12.10 - samples/sec: 7201.04 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:20:52,899 epoch 8 - iter 616/773 - loss 0.09879490 - time (sec): 13.82 - samples/sec: 7136.22 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:20:54,622 epoch 8 - iter 693/773 - loss 0.09745774 - time (sec): 15.55 - samples/sec: 7128.81 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:20:56,382 epoch 8 - iter 770/773 - loss 0.09749713 - time (sec): 17.31 - samples/sec: 7163.01 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:20:56,439 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:56,439 EPOCH 8 done: loss 0.0973 - lr: 0.000011
2023-10-20 09:20:57,540 DEV : loss 0.07944915443658829 - f1-score (micro avg) 0.6269
2023-10-20 09:20:57,552 saving best model
2023-10-20 09:20:57,590 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:59,260 epoch 9 - iter 77/773 - loss 0.09459646 - time (sec): 1.67 - samples/sec: 7330.85 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:21:00,859 epoch 9 - iter 154/773 - loss 0.09513909 - time (sec): 3.27 - samples/sec: 7555.16 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:21:02,459 epoch 9 - iter 231/773 - loss 0.08891959 - time (sec): 4.87 - samples/sec: 7837.97 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:21:04,135 epoch 9 - iter 308/773 - loss 0.09023339 - time (sec): 6.54 - samples/sec: 7649.30 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:21:05,867 epoch 9 - iter 385/773 - loss 0.09340261 - time (sec): 8.28 - samples/sec: 7652.07 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:21:07,610 epoch 9 - iter 462/773 - loss 0.09643093 - time (sec): 10.02 - samples/sec: 7510.55 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:21:09,362 epoch 9 - iter 539/773 - loss 0.09664935 - time (sec): 11.77 - samples/sec: 7470.84 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:21:11,036 epoch 9 - iter 616/773 - loss 0.09738019 - time (sec): 13.45 - samples/sec: 7447.59 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:21:12,730 epoch 9 - iter 693/773 - loss 0.09569268 - time (sec): 15.14 - samples/sec: 7368.56 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:21:14,449 epoch 9 - iter 770/773 - loss 0.09441439 - time (sec): 16.86 - samples/sec: 7348.71 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:21:14,509 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:14,509 EPOCH 9 done: loss 0.0943 - lr: 0.000006
2023-10-20 09:21:15,576 DEV : loss 0.08020081371068954 - f1-score (micro avg) 0.6476
2023-10-20 09:21:15,587 saving best model
2023-10-20 09:21:15,626 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:17,352 epoch 10 - iter 77/773 - loss 0.10683426 - time (sec): 1.73 - samples/sec: 6965.15 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:21:19,118 epoch 10 - iter 154/773 - loss 0.09922985 - time (sec): 3.49 - samples/sec: 7124.64 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:21:20,969 epoch 10 - iter 231/773 - loss 0.09606770 - time (sec): 5.34 - samples/sec: 7135.38 - lr: 0.000004 - momentum: 0.000000
2023-10-20 09:21:22,727 epoch 10 - iter 308/773 - loss 0.09168936 - time (sec): 7.10 - samples/sec: 7163.36 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:21:24,454 epoch 10 - iter 385/773 - loss 0.09319490 - time (sec): 8.83 - samples/sec: 7145.89 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:21:26,193 epoch 10 - iter 462/773 - loss 0.09108964 - time (sec): 10.57 - samples/sec: 7122.69 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:21:27,920 epoch 10 - iter 539/773 - loss 0.08863410 - time (sec): 12.29 - samples/sec: 7160.35 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:21:29,676 epoch 10 - iter 616/773 - loss 0.08839821 - time (sec): 14.05 - samples/sec: 7084.18 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:21:31,407 epoch 10 - iter 693/773 - loss 0.09194621 - time (sec): 15.78 - samples/sec: 7085.52 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:21:33,142 epoch 10 - iter 770/773 - loss 0.09345324 - time (sec): 17.52 - samples/sec: 7078.58 - lr: 0.000000 - momentum: 0.000000
2023-10-20 09:21:33,207 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:33,207 EPOCH 10 done: loss 0.0933 - lr: 0.000000
2023-10-20 09:21:34,289 DEV : loss 0.08138395845890045 - f1-score (micro avg) 0.6491
2023-10-20 09:21:34,302 saving best model
2023-10-20 09:21:34,370 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:34,371 Loading model from best epoch ...
2023-10-20 09:21:34,443 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 09:21:37,354
Results:
- F-score (micro) 0.5866
- F-score (macro) 0.3173
- Accuracy 0.4314

By class:
              precision    recall  f1-score   support

         LOC     0.6274    0.6818    0.6535       946
    BUILDING     0.3400    0.0919    0.1447       185
      STREET     0.5556    0.0893    0.1538        56

   micro avg     0.6136    0.5619    0.5866      1187
   macro avg     0.5077    0.2877    0.3173      1187
weighted avg     0.5792    0.5619    0.5506      1187

2023-10-20 09:21:37,354 ----------------------------------------------------------------------------------------------------