2023-10-18 16:03:03,599 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,600 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:03:03,600 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,600 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-18
16:03:03,600 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,600 Train: 1214 sentences
2023-10-18 16:03:03,600 (train_with_dev=False, train_with_test=False)
2023-10-18 16:03:03,600 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,600 Training Params:
2023-10-18 16:03:03,600 - learning_rate: "5e-05"
2023-10-18 16:03:03,600 - mini_batch_size: "4"
2023-10-18 16:03:03,600 - max_epochs: "10"
2023-10-18 16:03:03,600 - shuffle: "True"
2023-10-18 16:03:03,600 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,600 Plugins:
2023-10-18 16:03:03,600 - TensorboardLogger
2023-10-18 16:03:03,600 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:03:03,600 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,600 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:03:03,600 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:03:03,600 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,600 Computation:
2023-10-18 16:03:03,600 - compute on device: cuda:0
2023-10-18 16:03:03,600 - embedding storage: none
2023-10-18 16:03:03,600 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,600 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 16:03:03,600 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,601 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:03,601 Logging anything other than scalars to TensorBoard
is currently not supported. 2023-10-18 16:03:04,052 epoch 1 - iter 30/304 - loss 3.41849300 - time (sec): 0.45 - samples/sec: 5984.17 - lr: 0.000005 - momentum: 0.000000 2023-10-18 16:03:04,517 epoch 1 - iter 60/304 - loss 3.39176236 - time (sec): 0.92 - samples/sec: 6393.79 - lr: 0.000010 - momentum: 0.000000 2023-10-18 16:03:04,967 epoch 1 - iter 90/304 - loss 3.25672873 - time (sec): 1.37 - samples/sec: 6365.90 - lr: 0.000015 - momentum: 0.000000 2023-10-18 16:03:05,439 epoch 1 - iter 120/304 - loss 3.00508260 - time (sec): 1.84 - samples/sec: 6496.00 - lr: 0.000020 - momentum: 0.000000 2023-10-18 16:03:05,883 epoch 1 - iter 150/304 - loss 2.73441411 - time (sec): 2.28 - samples/sec: 6562.78 - lr: 0.000025 - momentum: 0.000000 2023-10-18 16:03:06,342 epoch 1 - iter 180/304 - loss 2.44462257 - time (sec): 2.74 - samples/sec: 6658.87 - lr: 0.000029 - momentum: 0.000000 2023-10-18 16:03:06,800 epoch 1 - iter 210/304 - loss 2.22589982 - time (sec): 3.20 - samples/sec: 6615.17 - lr: 0.000034 - momentum: 0.000000 2023-10-18 16:03:07,261 epoch 1 - iter 240/304 - loss 2.04161125 - time (sec): 3.66 - samples/sec: 6689.50 - lr: 0.000039 - momentum: 0.000000 2023-10-18 16:03:07,711 epoch 1 - iter 270/304 - loss 1.90319145 - time (sec): 4.11 - samples/sec: 6649.05 - lr: 0.000044 - momentum: 0.000000 2023-10-18 16:03:08,190 epoch 1 - iter 300/304 - loss 1.78350865 - time (sec): 4.59 - samples/sec: 6672.37 - lr: 0.000049 - momentum: 0.000000 2023-10-18 16:03:08,247 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:08,247 EPOCH 1 done: loss 1.7748 - lr: 0.000049 2023-10-18 16:03:08,578 DEV : loss 0.6327566504478455 - f1-score (micro avg) 0.0 2023-10-18 16:03:08,583 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:09,036 epoch 2 - iter 30/304 - loss 0.73044315 - time (sec): 0.45 - samples/sec: 6992.55 - lr: 0.000049 - momentum: 0.000000 
2023-10-18 16:03:09,489 epoch 2 - iter 60/304 - loss 0.67758439 - time (sec): 0.91 - samples/sec: 6945.61 - lr: 0.000049 - momentum: 0.000000 2023-10-18 16:03:09,935 epoch 2 - iter 90/304 - loss 0.65488942 - time (sec): 1.35 - samples/sec: 6952.36 - lr: 0.000048 - momentum: 0.000000 2023-10-18 16:03:10,395 epoch 2 - iter 120/304 - loss 0.63046535 - time (sec): 1.81 - samples/sec: 6926.77 - lr: 0.000048 - momentum: 0.000000 2023-10-18 16:03:10,844 epoch 2 - iter 150/304 - loss 0.62704908 - time (sec): 2.26 - samples/sec: 6863.02 - lr: 0.000047 - momentum: 0.000000 2023-10-18 16:03:11,315 epoch 2 - iter 180/304 - loss 0.61356493 - time (sec): 2.73 - samples/sec: 6874.44 - lr: 0.000047 - momentum: 0.000000 2023-10-18 16:03:11,782 epoch 2 - iter 210/304 - loss 0.58009022 - time (sec): 3.20 - samples/sec: 6763.45 - lr: 0.000046 - momentum: 0.000000 2023-10-18 16:03:12,244 epoch 2 - iter 240/304 - loss 0.56905276 - time (sec): 3.66 - samples/sec: 6809.87 - lr: 0.000046 - momentum: 0.000000 2023-10-18 16:03:12,716 epoch 2 - iter 270/304 - loss 0.55723235 - time (sec): 4.13 - samples/sec: 6749.75 - lr: 0.000045 - momentum: 0.000000 2023-10-18 16:03:13,161 epoch 2 - iter 300/304 - loss 0.54249495 - time (sec): 4.58 - samples/sec: 6690.50 - lr: 0.000045 - momentum: 0.000000 2023-10-18 16:03:13,218 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:13,219 EPOCH 2 done: loss 0.5419 - lr: 0.000045 2023-10-18 16:03:13,847 DEV : loss 0.40577632188796997 - f1-score (micro avg) 0.2819 2023-10-18 16:03:13,853 saving best model 2023-10-18 16:03:13,885 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:14,334 epoch 3 - iter 30/304 - loss 0.45536769 - time (sec): 0.45 - samples/sec: 6574.68 - lr: 0.000044 - momentum: 0.000000 2023-10-18 16:03:14,791 epoch 3 - iter 60/304 - loss 0.44191078 - time (sec): 0.91 - samples/sec: 6718.80 - lr: 0.000043 - 
momentum: 0.000000 2023-10-18 16:03:15,263 epoch 3 - iter 90/304 - loss 0.45784976 - time (sec): 1.38 - samples/sec: 6812.34 - lr: 0.000043 - momentum: 0.000000 2023-10-18 16:03:15,727 epoch 3 - iter 120/304 - loss 0.43398176 - time (sec): 1.84 - samples/sec: 6792.47 - lr: 0.000042 - momentum: 0.000000 2023-10-18 16:03:16,184 epoch 3 - iter 150/304 - loss 0.41661428 - time (sec): 2.30 - samples/sec: 6759.62 - lr: 0.000042 - momentum: 0.000000 2023-10-18 16:03:16,663 epoch 3 - iter 180/304 - loss 0.41113762 - time (sec): 2.78 - samples/sec: 6605.85 - lr: 0.000041 - momentum: 0.000000 2023-10-18 16:03:17,116 epoch 3 - iter 210/304 - loss 0.39123677 - time (sec): 3.23 - samples/sec: 6603.38 - lr: 0.000041 - momentum: 0.000000 2023-10-18 16:03:17,563 epoch 3 - iter 240/304 - loss 0.38228251 - time (sec): 3.68 - samples/sec: 6553.87 - lr: 0.000040 - momentum: 0.000000 2023-10-18 16:03:18,018 epoch 3 - iter 270/304 - loss 0.39412474 - time (sec): 4.13 - samples/sec: 6606.38 - lr: 0.000040 - momentum: 0.000000 2023-10-18 16:03:18,468 epoch 3 - iter 300/304 - loss 0.39209364 - time (sec): 4.58 - samples/sec: 6676.78 - lr: 0.000039 - momentum: 0.000000 2023-10-18 16:03:18,524 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:18,524 EPOCH 3 done: loss 0.3940 - lr: 0.000039 2023-10-18 16:03:19,040 DEV : loss 0.33201417326927185 - f1-score (micro avg) 0.364 2023-10-18 16:03:19,045 saving best model 2023-10-18 16:03:19,078 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:19,522 epoch 4 - iter 30/304 - loss 0.43769105 - time (sec): 0.44 - samples/sec: 6506.48 - lr: 0.000038 - momentum: 0.000000 2023-10-18 16:03:19,979 epoch 4 - iter 60/304 - loss 0.38667198 - time (sec): 0.90 - samples/sec: 6612.74 - lr: 0.000038 - momentum: 0.000000 2023-10-18 16:03:20,445 epoch 4 - iter 90/304 - loss 0.36723145 - time (sec): 1.37 - samples/sec: 6696.71 - lr: 
0.000037 - momentum: 0.000000 2023-10-18 16:03:20,886 epoch 4 - iter 120/304 - loss 0.35749767 - time (sec): 1.81 - samples/sec: 6650.50 - lr: 0.000037 - momentum: 0.000000 2023-10-18 16:03:21,333 epoch 4 - iter 150/304 - loss 0.34604597 - time (sec): 2.25 - samples/sec: 6701.69 - lr: 0.000036 - momentum: 0.000000 2023-10-18 16:03:21,788 epoch 4 - iter 180/304 - loss 0.34194832 - time (sec): 2.71 - samples/sec: 6660.53 - lr: 0.000036 - momentum: 0.000000 2023-10-18 16:03:22,247 epoch 4 - iter 210/304 - loss 0.34182085 - time (sec): 3.17 - samples/sec: 6637.69 - lr: 0.000035 - momentum: 0.000000 2023-10-18 16:03:22,690 epoch 4 - iter 240/304 - loss 0.33618300 - time (sec): 3.61 - samples/sec: 6606.68 - lr: 0.000035 - momentum: 0.000000 2023-10-18 16:03:23,148 epoch 4 - iter 270/304 - loss 0.34094492 - time (sec): 4.07 - samples/sec: 6631.16 - lr: 0.000034 - momentum: 0.000000 2023-10-18 16:03:23,617 epoch 4 - iter 300/304 - loss 0.33560213 - time (sec): 4.54 - samples/sec: 6740.23 - lr: 0.000033 - momentum: 0.000000 2023-10-18 16:03:23,673 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:23,673 EPOCH 4 done: loss 0.3338 - lr: 0.000033 2023-10-18 16:03:24,186 DEV : loss 0.3059861958026886 - f1-score (micro avg) 0.407 2023-10-18 16:03:24,191 saving best model 2023-10-18 16:03:24,224 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:24,677 epoch 5 - iter 30/304 - loss 0.33129831 - time (sec): 0.45 - samples/sec: 6674.16 - lr: 0.000033 - momentum: 0.000000 2023-10-18 16:03:25,130 epoch 5 - iter 60/304 - loss 0.37119634 - time (sec): 0.91 - samples/sec: 6901.10 - lr: 0.000032 - momentum: 0.000000 2023-10-18 16:03:25,571 epoch 5 - iter 90/304 - loss 0.32412318 - time (sec): 1.35 - samples/sec: 6929.96 - lr: 0.000032 - momentum: 0.000000 2023-10-18 16:03:26,021 epoch 5 - iter 120/304 - loss 0.31805337 - time (sec): 1.80 - samples/sec: 
6895.60 - lr: 0.000031 - momentum: 0.000000 2023-10-18 16:03:26,466 epoch 5 - iter 150/304 - loss 0.31946870 - time (sec): 2.24 - samples/sec: 6905.54 - lr: 0.000031 - momentum: 0.000000 2023-10-18 16:03:26,925 epoch 5 - iter 180/304 - loss 0.32047596 - time (sec): 2.70 - samples/sec: 6790.96 - lr: 0.000030 - momentum: 0.000000 2023-10-18 16:03:27,385 epoch 5 - iter 210/304 - loss 0.31070659 - time (sec): 3.16 - samples/sec: 6815.16 - lr: 0.000030 - momentum: 0.000000 2023-10-18 16:03:27,847 epoch 5 - iter 240/304 - loss 0.31020519 - time (sec): 3.62 - samples/sec: 6824.09 - lr: 0.000029 - momentum: 0.000000 2023-10-18 16:03:28,308 epoch 5 - iter 270/304 - loss 0.30180350 - time (sec): 4.08 - samples/sec: 6792.84 - lr: 0.000028 - momentum: 0.000000 2023-10-18 16:03:28,771 epoch 5 - iter 300/304 - loss 0.30001712 - time (sec): 4.55 - samples/sec: 6733.90 - lr: 0.000028 - momentum: 0.000000 2023-10-18 16:03:28,827 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:28,827 EPOCH 5 done: loss 0.2995 - lr: 0.000028 2023-10-18 16:03:29,342 DEV : loss 0.28282132744789124 - f1-score (micro avg) 0.4335 2023-10-18 16:03:29,348 saving best model 2023-10-18 16:03:29,386 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:29,836 epoch 6 - iter 30/304 - loss 0.26956007 - time (sec): 0.45 - samples/sec: 6573.33 - lr: 0.000027 - momentum: 0.000000 2023-10-18 16:03:30,308 epoch 6 - iter 60/304 - loss 0.26688923 - time (sec): 0.92 - samples/sec: 6400.89 - lr: 0.000027 - momentum: 0.000000 2023-10-18 16:03:30,771 epoch 6 - iter 90/304 - loss 0.25761729 - time (sec): 1.38 - samples/sec: 6363.04 - lr: 0.000026 - momentum: 0.000000 2023-10-18 16:03:31,221 epoch 6 - iter 120/304 - loss 0.27056175 - time (sec): 1.83 - samples/sec: 6388.51 - lr: 0.000026 - momentum: 0.000000 2023-10-18 16:03:31,667 epoch 6 - iter 150/304 - loss 0.26627782 - time (sec): 2.28 - 
samples/sec: 6472.70 - lr: 0.000025 - momentum: 0.000000 2023-10-18 16:03:32,114 epoch 6 - iter 180/304 - loss 0.27338736 - time (sec): 2.73 - samples/sec: 6547.04 - lr: 0.000025 - momentum: 0.000000 2023-10-18 16:03:32,576 epoch 6 - iter 210/304 - loss 0.27799182 - time (sec): 3.19 - samples/sec: 6589.22 - lr: 0.000024 - momentum: 0.000000 2023-10-18 16:03:33,026 epoch 6 - iter 240/304 - loss 0.27093308 - time (sec): 3.64 - samples/sec: 6570.55 - lr: 0.000023 - momentum: 0.000000 2023-10-18 16:03:33,489 epoch 6 - iter 270/304 - loss 0.27081462 - time (sec): 4.10 - samples/sec: 6603.37 - lr: 0.000023 - momentum: 0.000000 2023-10-18 16:03:33,930 epoch 6 - iter 300/304 - loss 0.27061244 - time (sec): 4.54 - samples/sec: 6729.91 - lr: 0.000022 - momentum: 0.000000 2023-10-18 16:03:33,986 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:33,986 EPOCH 6 done: loss 0.2694 - lr: 0.000022 2023-10-18 16:03:34,501 DEV : loss 0.26524126529693604 - f1-score (micro avg) 0.4669 2023-10-18 16:03:34,506 saving best model 2023-10-18 16:03:34,539 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:34,993 epoch 7 - iter 30/304 - loss 0.28101260 - time (sec): 0.45 - samples/sec: 6389.55 - lr: 0.000022 - momentum: 0.000000 2023-10-18 16:03:35,458 epoch 7 - iter 60/304 - loss 0.27498826 - time (sec): 0.92 - samples/sec: 6450.64 - lr: 0.000021 - momentum: 0.000000 2023-10-18 16:03:35,925 epoch 7 - iter 90/304 - loss 0.26776232 - time (sec): 1.39 - samples/sec: 6661.64 - lr: 0.000021 - momentum: 0.000000 2023-10-18 16:03:36,380 epoch 7 - iter 120/304 - loss 0.26913052 - time (sec): 1.84 - samples/sec: 6661.69 - lr: 0.000020 - momentum: 0.000000 2023-10-18 16:03:36,828 epoch 7 - iter 150/304 - loss 0.27673328 - time (sec): 2.29 - samples/sec: 6657.56 - lr: 0.000020 - momentum: 0.000000 2023-10-18 16:03:37,281 epoch 7 - iter 180/304 - loss 0.27165379 - time 
(sec): 2.74 - samples/sec: 6753.70 - lr: 0.000019 - momentum: 0.000000 2023-10-18 16:03:37,750 epoch 7 - iter 210/304 - loss 0.26761556 - time (sec): 3.21 - samples/sec: 6731.27 - lr: 0.000018 - momentum: 0.000000 2023-10-18 16:03:38,219 epoch 7 - iter 240/304 - loss 0.26677610 - time (sec): 3.68 - samples/sec: 6669.33 - lr: 0.000018 - momentum: 0.000000 2023-10-18 16:03:38,673 epoch 7 - iter 270/304 - loss 0.26443704 - time (sec): 4.13 - samples/sec: 6710.91 - lr: 0.000017 - momentum: 0.000000 2023-10-18 16:03:39,129 epoch 7 - iter 300/304 - loss 0.25585676 - time (sec): 4.59 - samples/sec: 6672.50 - lr: 0.000017 - momentum: 0.000000 2023-10-18 16:03:39,182 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:39,183 EPOCH 7 done: loss 0.2550 - lr: 0.000017 2023-10-18 16:03:39,712 DEV : loss 0.26162639260292053 - f1-score (micro avg) 0.4688 2023-10-18 16:03:39,718 saving best model 2023-10-18 16:03:39,749 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:40,194 epoch 8 - iter 30/304 - loss 0.21202093 - time (sec): 0.44 - samples/sec: 6514.37 - lr: 0.000016 - momentum: 0.000000 2023-10-18 16:03:40,644 epoch 8 - iter 60/304 - loss 0.23492687 - time (sec): 0.89 - samples/sec: 6513.80 - lr: 0.000016 - momentum: 0.000000 2023-10-18 16:03:41,106 epoch 8 - iter 90/304 - loss 0.24604760 - time (sec): 1.36 - samples/sec: 6579.71 - lr: 0.000015 - momentum: 0.000000 2023-10-18 16:03:41,555 epoch 8 - iter 120/304 - loss 0.25063907 - time (sec): 1.80 - samples/sec: 6609.82 - lr: 0.000015 - momentum: 0.000000 2023-10-18 16:03:42,011 epoch 8 - iter 150/304 - loss 0.25023869 - time (sec): 2.26 - samples/sec: 6714.22 - lr: 0.000014 - momentum: 0.000000 2023-10-18 16:03:42,469 epoch 8 - iter 180/304 - loss 0.24448700 - time (sec): 2.72 - samples/sec: 6793.03 - lr: 0.000013 - momentum: 0.000000 2023-10-18 16:03:42,925 epoch 8 - iter 210/304 - loss 
0.24623843 - time (sec): 3.17 - samples/sec: 6642.87 - lr: 0.000013 - momentum: 0.000000 2023-10-18 16:03:43,395 epoch 8 - iter 240/304 - loss 0.24266111 - time (sec): 3.64 - samples/sec: 6676.07 - lr: 0.000012 - momentum: 0.000000 2023-10-18 16:03:43,874 epoch 8 - iter 270/304 - loss 0.24394083 - time (sec): 4.12 - samples/sec: 6692.48 - lr: 0.000012 - momentum: 0.000000 2023-10-18 16:03:44,321 epoch 8 - iter 300/304 - loss 0.24413546 - time (sec): 4.57 - samples/sec: 6701.64 - lr: 0.000011 - momentum: 0.000000 2023-10-18 16:03:44,381 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:44,381 EPOCH 8 done: loss 0.2434 - lr: 0.000011 2023-10-18 16:03:44,889 DEV : loss 0.25079071521759033 - f1-score (micro avg) 0.5082 2023-10-18 16:03:44,894 saving best model 2023-10-18 16:03:44,926 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:45,382 epoch 9 - iter 30/304 - loss 0.21395798 - time (sec): 0.45 - samples/sec: 6901.27 - lr: 0.000011 - momentum: 0.000000 2023-10-18 16:03:45,868 epoch 9 - iter 60/304 - loss 0.22029121 - time (sec): 0.94 - samples/sec: 6881.67 - lr: 0.000010 - momentum: 0.000000 2023-10-18 16:03:46,336 epoch 9 - iter 90/304 - loss 0.23881693 - time (sec): 1.41 - samples/sec: 6719.35 - lr: 0.000010 - momentum: 0.000000 2023-10-18 16:03:46,782 epoch 9 - iter 120/304 - loss 0.23843637 - time (sec): 1.86 - samples/sec: 6624.58 - lr: 0.000009 - momentum: 0.000000 2023-10-18 16:03:47,192 epoch 9 - iter 150/304 - loss 0.24307718 - time (sec): 2.26 - samples/sec: 6869.47 - lr: 0.000008 - momentum: 0.000000 2023-10-18 16:03:47,599 epoch 9 - iter 180/304 - loss 0.23735791 - time (sec): 2.67 - samples/sec: 6954.33 - lr: 0.000008 - momentum: 0.000000 2023-10-18 16:03:48,008 epoch 9 - iter 210/304 - loss 0.23759934 - time (sec): 3.08 - samples/sec: 7045.42 - lr: 0.000007 - momentum: 0.000000 2023-10-18 16:03:48,420 epoch 9 - iter 
240/304 - loss 0.23977260 - time (sec): 3.49 - samples/sec: 7077.08 - lr: 0.000007 - momentum: 0.000000 2023-10-18 16:03:48,831 epoch 9 - iter 270/304 - loss 0.23720473 - time (sec): 3.90 - samples/sec: 7102.98 - lr: 0.000006 - momentum: 0.000000 2023-10-18 16:03:49,244 epoch 9 - iter 300/304 - loss 0.23414113 - time (sec): 4.32 - samples/sec: 7097.23 - lr: 0.000006 - momentum: 0.000000 2023-10-18 16:03:49,293 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:49,293 EPOCH 9 done: loss 0.2349 - lr: 0.000006 2023-10-18 16:03:49,810 DEV : loss 0.25176048278808594 - f1-score (micro avg) 0.5022 2023-10-18 16:03:49,815 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:03:50,217 epoch 10 - iter 30/304 - loss 0.19932105 - time (sec): 0.40 - samples/sec: 7766.16 - lr: 0.000005 - momentum: 0.000000 2023-10-18 16:03:50,634 epoch 10 - iter 60/304 - loss 0.21529830 - time (sec): 0.82 - samples/sec: 7479.45 - lr: 0.000005 - momentum: 0.000000 2023-10-18 16:03:51,068 epoch 10 - iter 90/304 - loss 0.22266289 - time (sec): 1.25 - samples/sec: 7348.86 - lr: 0.000004 - momentum: 0.000000 2023-10-18 16:03:51,513 epoch 10 - iter 120/304 - loss 0.22854304 - time (sec): 1.70 - samples/sec: 7313.37 - lr: 0.000003 - momentum: 0.000000 2023-10-18 16:03:51,963 epoch 10 - iter 150/304 - loss 0.22340855 - time (sec): 2.15 - samples/sec: 7277.31 - lr: 0.000003 - momentum: 0.000000 2023-10-18 16:03:52,434 epoch 10 - iter 180/304 - loss 0.21915768 - time (sec): 2.62 - samples/sec: 7089.75 - lr: 0.000002 - momentum: 0.000000 2023-10-18 16:03:52,883 epoch 10 - iter 210/304 - loss 0.22579287 - time (sec): 3.07 - samples/sec: 7079.04 - lr: 0.000002 - momentum: 0.000000 2023-10-18 16:03:53,328 epoch 10 - iter 240/304 - loss 0.23270705 - time (sec): 3.51 - samples/sec: 7064.11 - lr: 0.000001 - momentum: 0.000000 2023-10-18 16:03:53,767 epoch 10 - iter 270/304 - loss 
0.24213316 - time (sec): 3.95 - samples/sec: 6994.89 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:03:54,207 epoch 10 - iter 300/304 - loss 0.23692054 - time (sec): 4.39 - samples/sec: 6960.55 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:03:54,264 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:54,264 EPOCH 10 done: loss 0.2349 - lr: 0.000000
2023-10-18 16:03:54,773 DEV : loss 0.24823446571826935 - f1-score (micro avg) 0.5054
2023-10-18 16:03:54,807 ----------------------------------------------------------------------------------------------------
2023-10-18 16:03:54,807 Loading model from best epoch ...
2023-10-18 16:03:54,879 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-18 16:03:55,341 Results:
- F-score (micro) 0.5115
- F-score (macro) 0.3161
- Accuracy 0.3617

By class:
             precision    recall  f1-score   support

       scope    0.4526    0.5695    0.5044       151
        work    0.3690    0.6526    0.4715        95
        pers    0.6842    0.5417    0.6047        96
         loc    0.0000    0.0000    0.0000         3
        date    0.0000    0.0000    0.0000         3

   micro avg    0.4608    0.5747    0.5115       348
   macro avg    0.3012    0.3528    0.3161       348
weighted avg    0.4859    0.5747    0.5144       348

2023-10-18 16:03:55,341 ----------------------------------------------------------------------------------------------------
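The summary rows of the report can be cross-checked against the per-class rows. The sketch below is not part of the training run. It copies the rounded precision, recall, F1, and support values from the report above and recomputes the three averaged F1 scores, assuming the usual definitions: micro F1 as the harmonic mean of the micro-averaged precision and recall, macro F1 as the unweighted mean of the class F1 scores, and weighted F1 as the support-weighted mean. The names `by_class` and `harmonic_f1` are illustrative only, and because the inputs are rounded to four decimals the results should match the log only to about that precision.

```python
# Per-class rows copied from the "By class" report: (precision, recall, f1, support).
by_class = {
    "scope": (0.4526, 0.5695, 0.5044, 151),
    "work": (0.3690, 0.6526, 0.4715, 95),
    "pers": (0.6842, 0.5417, 0.6047, 96),
    "loc": (0.0000, 0.0000, 0.0000, 3),
    "date": (0.0000, 0.0000, 0.0000, 3),
}

# Micro-averaged precision and recall copied from the "micro avg" row.
micro_precision, micro_recall = 0.4608, 0.5747


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for _, _, _, support in by_class.values())

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_f1 = harmonic_f1(micro_precision, micro_recall)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by class support.
weighted_f1 = sum(f1 * support for _, _, f1, support in by_class.values()) / total_support

print(f"support={total_support} micro={micro_f1:.4f} macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The macro score sits well below the micro score because `loc` and `date` each have only 3 test entities and an F1 of 0, which pulls down the unweighted mean while barely affecting the support-weighted figures.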