2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:17,330 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
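Note: the architecture printed above is a 2-layer, 128-dimensional BERT encoder feeding a locked-dropout layer and a 13-way linear head (no RNN, no CRF, plain CrossEntropyLoss). A minimal sketch of how such a tagger can be assembled in Flair follows; the checkpoint name is read off the base path logged further down, and the corpus/label-dictionary calls assume a recent Flair release (0.12+), so exact argument names may differ between versions.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Corpus as reported below: 6183 train / 680 dev / 2113 test sentences
# (argument names are an assumption based on the cached dataset path).
corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")

# Tiny historic multilingual BERT; last layer only and first-subtoken pooling,
# as suggested by "layers-1" and "poolingfirst" in the base path (assumption).
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-tiny-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
    tag_type="ner",
    use_rnn=False,               # the printout shows no recurrent layer
    use_crf=False,               # plain CrossEntropyLoss head, as logged
    reproject_embeddings=False,  # linear head maps 128 -> 13 directly
)

The LockedDropout(p=0.5) in the printout is the SequenceTagger default, so it does not need to be passed explicitly.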
2023-10-20 09:56:17,330 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:17,330 Train: 6183 sentences
2023-10-20 09:56:17,330 (train_with_dev=False, train_with_test=False)
2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:17,330 Training Params:
2023-10-20 09:56:17,330 - learning_rate: "5e-05"
2023-10-20 09:56:17,330 - mini_batch_size: "4"
2023-10-20 09:56:17,330 - max_epochs: "10"
2023-10-20 09:56:17,330 - shuffle: "True"
2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:17,330 Plugins:
2023-10-20 09:56:17,331 - TensorboardLogger
2023-10-20 09:56:17,331 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
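Note: the LinearScheduler plugin above implements a linear warmup/decay schedule: the learning rate climbs from 0 to the peak of 5e-05 over the first 10% of all batch steps (warmup_fraction 0.1) and then decays linearly back to 0. With 1546 batches per epoch and 10 epochs (about 15,460 steps), warmup spans roughly epoch 1, which matches the lr column ramping 0.000005 -> 0.000050 in the epoch 1 lines below and falling towards 0.000000 by epoch 10. A minimal sketch of that shape (not Flair's internal implementation):

def linear_warmup_decay_lr(step, total_steps=1546 * 10, peak_lr=5e-5, warmup_fraction=0.1):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# linear_warmup_decay_lr(154)  ~= 5e-06  (start of epoch 1, as logged below)
# linear_warmup_decay_lr(1540) ~= 5e-05  (end of epoch 1, the peak)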
2023-10-20 09:56:17,331 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 09:56:17,331 - metric: "('micro avg', 'f1-score')"
2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:17,331 Computation:
2023-10-20 09:56:17,331 - compute on device: cuda:0
2023-10-20 09:56:17,331 - embedding storage: none
2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:17,331 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:17,331 Logging anything other than scalars to TensorBoard is currently not supported.
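Note: taken together, the header above corresponds to a single fine-tuning call in Flair, sketched below with the logged hyper-parameters. This assumes a recent Flair release in which fine_tune attaches the linear warmup/decay scheduler itself; the TensorboardLogger listed under "Plugins" is left out of the sketch because its import path differs between releases.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# Hyper-parameters exactly as logged under "Training Params".
trainer.fine_tune(
    "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
)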
2023-10-20 09:56:19,693 epoch 1 - iter 154/1546 - loss 3.62102707 - time (sec): 2.36 - samples/sec: 5236.15 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:56:22,114 epoch 1 - iter 308/1546 - loss 3.13477517 - time (sec): 4.78 - samples/sec: 5182.14 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:56:24,433 epoch 1 - iter 462/1546 - loss 2.46338633 - time (sec): 7.10 - samples/sec: 5146.78 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:56:26,539 epoch 1 - iter 616/1546 - loss 1.93374896 - time (sec): 9.21 - samples/sec: 5337.75 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:56:28,673 epoch 1 - iter 770/1546 - loss 1.61995330 - time (sec): 11.34 - samples/sec: 5337.89 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:56:30,909 epoch 1 - iter 924/1546 - loss 1.39359053 - time (sec): 13.58 - samples/sec: 5362.23 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:56:33,199 epoch 1 - iter 1078/1546 - loss 1.23035252 - time (sec): 15.87 - samples/sec: 5350.81 - lr: 0.000035 - momentum: 0.000000
2023-10-20 09:56:35,625 epoch 1 - iter 1232/1546 - loss 1.09712317 - time (sec): 18.29 - samples/sec: 5368.14 - lr: 0.000040 - momentum: 0.000000
2023-10-20 09:56:37,924 epoch 1 - iter 1386/1546 - loss 0.99421232 - time (sec): 20.59 - samples/sec: 5403.80 - lr: 0.000045 - momentum: 0.000000
2023-10-20 09:56:40,172 epoch 1 - iter 1540/1546 - loss 0.91571471 - time (sec): 22.84 - samples/sec: 5422.50 - lr: 0.000050 - momentum: 0.000000
2023-10-20 09:56:40,282 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:40,282 EPOCH 1 done: loss 0.9134 - lr: 0.000050
2023-10-20 09:56:41,270 DEV : loss 0.11716283857822418 - f1-score (micro avg) 0.0247
2023-10-20 09:56:41,282 saving best model
2023-10-20 09:56:41,311 ----------------------------------------------------------------------------------------------------
2023-10-20 09:56:43,719 epoch 2 - iter 154/1546 - loss 0.19306759 - time (sec): 2.41 - samples/sec: 5487.96 - lr: 0.000049 - momentum: 0.000000
2023-10-20 09:56:46,138 epoch 2 - iter 308/1546 - loss 0.19527201 - time (sec): 4.83 - samples/sec: 5486.99 - lr: 0.000049 - momentum: 0.000000
2023-10-20 09:56:48,446 epoch 2 - iter 462/1546 - loss 0.19787880 - time (sec): 7.13 - samples/sec: 5214.32 - lr: 0.000048 - momentum: 0.000000
2023-10-20 09:56:50,833 epoch 2 - iter 616/1546 - loss 0.19678261 - time (sec): 9.52 - samples/sec: 5154.17 - lr: 0.000048 - momentum: 0.000000
2023-10-20 09:56:53,210 epoch 2 - iter 770/1546 - loss 0.19605129 - time (sec): 11.90 - samples/sec: 5121.25 - lr: 0.000047 - momentum: 0.000000
2023-10-20 09:56:55,564 epoch 2 - iter 924/1546 - loss 0.19154674 - time (sec): 14.25 - samples/sec: 5156.29 - lr: 0.000047 - momentum: 0.000000
2023-10-20 09:56:57,968 epoch 2 - iter 1078/1546 - loss 0.18535945 - time (sec): 16.66 - samples/sec: 5202.28 - lr: 0.000046 - momentum: 0.000000
2023-10-20 09:57:00,295 epoch 2 - iter 1232/1546 - loss 0.18750537 - time (sec): 18.98 - samples/sec: 5175.91 - lr: 0.000046 - momentum: 0.000000
2023-10-20 09:57:02,650 epoch 2 - iter 1386/1546 - loss 0.18188318 - time (sec): 21.34 - samples/sec: 5162.78 - lr: 0.000045 - momentum: 0.000000
2023-10-20 09:57:05,066 epoch 2 - iter 1540/1546 - loss 0.18010251 - time (sec): 23.75 - samples/sec: 5209.21 - lr: 0.000044 - momentum: 0.000000
2023-10-20 09:57:05,155 ----------------------------------------------------------------------------------------------------
2023-10-20 09:57:05,155 EPOCH 2 done: loss 0.1797 - lr: 0.000044
2023-10-20 09:57:06,238 DEV : loss 0.09568006545305252 - f1-score (micro avg) 0.4505
2023-10-20 09:57:06,250 saving best model
2023-10-20 09:57:06,289 ----------------------------------------------------------------------------------------------------
2023-10-20 09:57:08,762 epoch 3 - iter 154/1546 - loss 0.15246759 - time (sec): 2.47 - samples/sec: 5045.79 - lr: 0.000044 - momentum: 0.000000
2023-10-20 09:57:11,131 epoch 3 - iter 308/1546 - loss 0.15171707 - time (sec): 4.84 - samples/sec: 4969.29 - lr: 0.000043 - momentum: 0.000000
2023-10-20 09:57:13,473 epoch 3 - iter 462/1546 - loss 0.14819470 - time (sec): 7.18 - samples/sec: 5146.44 - lr: 0.000043 - momentum: 0.000000
2023-10-20 09:57:15,811 epoch 3 - iter 616/1546 - loss 0.14757401 - time (sec): 9.52 - samples/sec: 5193.61 - lr: 0.000042 - momentum: 0.000000
2023-10-20 09:57:18,159 epoch 3 - iter 770/1546 - loss 0.14865086 - time (sec): 11.87 - samples/sec: 5251.53 - lr: 0.000042 - momentum: 0.000000
2023-10-20 09:57:20,503 epoch 3 - iter 924/1546 - loss 0.15129778 - time (sec): 14.21 - samples/sec: 5149.30 - lr: 0.000041 - momentum: 0.000000
2023-10-20 09:57:22,888 epoch 3 - iter 1078/1546 - loss 0.14983046 - time (sec): 16.60 - samples/sec: 5184.46 - lr: 0.000041 - momentum: 0.000000
2023-10-20 09:57:25,248 epoch 3 - iter 1232/1546 - loss 0.14979943 - time (sec): 18.96 - samples/sec: 5220.74 - lr: 0.000040 - momentum: 0.000000
2023-10-20 09:57:27,623 epoch 3 - iter 1386/1546 - loss 0.15151253 - time (sec): 21.33 - samples/sec: 5212.66 - lr: 0.000039 - momentum: 0.000000
2023-10-20 09:57:30,024 epoch 3 - iter 1540/1546 - loss 0.14973869 - time (sec): 23.73 - samples/sec: 5220.95 - lr: 0.000039 - momentum: 0.000000
2023-10-20 09:57:30,119 ----------------------------------------------------------------------------------------------------
2023-10-20 09:57:30,119 EPOCH 3 done: loss 0.1500 - lr: 0.000039
2023-10-20 09:57:31,211 DEV : loss 0.09354293346405029 - f1-score (micro avg) 0.5438
2023-10-20 09:57:31,223 saving best model
2023-10-20 09:57:31,257 ----------------------------------------------------------------------------------------------------
2023-10-20 09:57:33,586 epoch 4 - iter 154/1546 - loss 0.15714911 - time (sec): 2.33 - samples/sec: 5661.53 - lr: 0.000038 - momentum: 0.000000
2023-10-20 09:57:35,970 epoch 4 - iter 308/1546 - loss 0.14974834 - time (sec): 4.71 - samples/sec: 5257.99 - lr: 0.000038 - momentum: 0.000000
2023-10-20 09:57:38,295 epoch 4 - iter 462/1546 - loss 0.14721188 - time (sec): 7.04 - samples/sec: 5145.14 - lr: 0.000037 - momentum: 0.000000
2023-10-20 09:57:40,675 epoch 4 - iter 616/1546 - loss 0.14478087 - time (sec): 9.42 - samples/sec: 5248.76 - lr: 0.000037 - momentum: 0.000000
2023-10-20 09:57:43,001 epoch 4 - iter 770/1546 - loss 0.14146413 - time (sec): 11.74 - samples/sec: 5205.67 - lr: 0.000036 - momentum: 0.000000
2023-10-20 09:57:45,386 epoch 4 - iter 924/1546 - loss 0.13681608 - time (sec): 14.13 - samples/sec: 5194.75 - lr: 0.000036 - momentum: 0.000000
2023-10-20 09:57:47,777 epoch 4 - iter 1078/1546 - loss 0.13568701 - time (sec): 16.52 - samples/sec: 5199.00 - lr: 0.000035 - momentum: 0.000000
2023-10-20 09:57:50,225 epoch 4 - iter 1232/1546 - loss 0.13598660 - time (sec): 18.97 - samples/sec: 5200.73 - lr: 0.000034 - momentum: 0.000000
2023-10-20 09:57:52,616 epoch 4 - iter 1386/1546 - loss 0.13476872 - time (sec): 21.36 - samples/sec: 5213.71 - lr: 0.000034 - momentum: 0.000000
2023-10-20 09:57:54,877 epoch 4 - iter 1540/1546 - loss 0.13332730 - time (sec): 23.62 - samples/sec: 5236.93 - lr: 0.000033 - momentum: 0.000000
2023-10-20 09:57:54,964 ----------------------------------------------------------------------------------------------------
2023-10-20 09:57:54,965 EPOCH 4 done: loss 0.1335 - lr: 0.000033
2023-10-20 09:57:56,336 DEV : loss 0.08910585939884186 - f1-score (micro avg) 0.5781
2023-10-20 09:57:56,348 saving best model
2023-10-20 09:57:56,389 ----------------------------------------------------------------------------------------------------
2023-10-20 09:57:58,684 epoch 5 - iter 154/1546 - loss 0.11074956 - time (sec): 2.29 - samples/sec: 5207.64 - lr: 0.000033 - momentum: 0.000000
2023-10-20 09:58:01,079 epoch 5 - iter 308/1546 - loss 0.12438368 - time (sec): 4.69 - samples/sec: 5411.83 - lr: 0.000032 - momentum: 0.000000
2023-10-20 09:58:03,466 epoch 5 - iter 462/1546 - loss 0.13150006 - time (sec): 7.08 - samples/sec: 5360.25 - lr: 0.000032 - momentum: 0.000000
2023-10-20 09:58:05,767 epoch 5 - iter 616/1546 - loss 0.12690522 - time (sec): 9.38 - samples/sec: 5263.95 - lr: 0.000031 - momentum: 0.000000
2023-10-20 09:58:08,159 epoch 5 - iter 770/1546 - loss 0.12550895 - time (sec): 11.77 - samples/sec: 5297.28 - lr: 0.000031 - momentum: 0.000000
2023-10-20 09:58:10,480 epoch 5 - iter 924/1546 - loss 0.12198991 - time (sec): 14.09 - samples/sec: 5273.19 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:58:12,851 epoch 5 - iter 1078/1546 - loss 0.12094948 - time (sec): 16.46 - samples/sec: 5285.15 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:58:15,325 epoch 5 - iter 1232/1546 - loss 0.12368525 - time (sec): 18.94 - samples/sec: 5235.62 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:58:17,724 epoch 5 - iter 1386/1546 - loss 0.12543156 - time (sec): 21.33 - samples/sec: 5235.20 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:58:20,123 epoch 5 - iter 1540/1546 - loss 0.12463983 - time (sec): 23.73 - samples/sec: 5213.99 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:58:20,217 ----------------------------------------------------------------------------------------------------
2023-10-20 09:58:20,217 EPOCH 5 done: loss 0.1245 - lr: 0.000028
2023-10-20 09:58:21,313 DEV : loss 0.09435312449932098 - f1-score (micro avg) 0.5978
2023-10-20 09:58:21,326 saving best model
2023-10-20 09:58:21,360 ----------------------------------------------------------------------------------------------------
2023-10-20 09:58:23,747 epoch 6 - iter 154/1546 - loss 0.11053617 - time (sec): 2.39 - samples/sec: 4737.34 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:58:26,136 epoch 6 - iter 308/1546 - loss 0.11721437 - time (sec): 4.78 - samples/sec: 4972.77 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:58:28,544 epoch 6 - iter 462/1546 - loss 0.11731031 - time (sec): 7.18 - samples/sec: 5050.51 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:58:30,911 epoch 6 - iter 616/1546 - loss 0.11093231 - time (sec): 9.55 - samples/sec: 5101.05 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:58:33,258 epoch 6 - iter 770/1546 - loss 0.11841360 - time (sec): 11.90 - samples/sec: 5111.39 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:58:35,581 epoch 6 - iter 924/1546 - loss 0.11998999 - time (sec): 14.22 - samples/sec: 5129.85 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:58:37,939 epoch 6 - iter 1078/1546 - loss 0.11704457 - time (sec): 16.58 - samples/sec: 5155.77 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:58:40,326 epoch 6 - iter 1232/1546 - loss 0.11636736 - time (sec): 18.97 - samples/sec: 5193.59 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:58:42,726 epoch 6 - iter 1386/1546 - loss 0.11459251 - time (sec): 21.37 - samples/sec: 5200.43 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:58:45,072 epoch 6 - iter 1540/1546 - loss 0.11638062 - time (sec): 23.71 - samples/sec: 5211.77 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:58:45,178 ----------------------------------------------------------------------------------------------------
2023-10-20 09:58:45,179 EPOCH 6 done: loss 0.1158 - lr: 0.000022
2023-10-20 09:58:46,284 DEV : loss 0.09260376542806625 - f1-score (micro avg) 0.6344
2023-10-20 09:58:46,296 saving best model
2023-10-20 09:58:46,333 ----------------------------------------------------------------------------------------------------
2023-10-20 09:58:48,697 epoch 7 - iter 154/1546 - loss 0.10982928 - time (sec): 2.36 - samples/sec: 5341.51 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:58:51,116 epoch 7 - iter 308/1546 - loss 0.10926222 - time (sec): 4.78 - samples/sec: 5195.27 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:58:53,492 epoch 7 - iter 462/1546 - loss 0.10878928 - time (sec): 7.16 - samples/sec: 5171.38 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:58:55,841 epoch 7 - iter 616/1546 - loss 0.11396563 - time (sec): 9.51 - samples/sec: 5221.24 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:58:58,270 epoch 7 - iter 770/1546 - loss 0.11096524 - time (sec): 11.94 - samples/sec: 5316.49 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:59:00,633 epoch 7 - iter 924/1546 - loss 0.10959747 - time (sec): 14.30 - samples/sec: 5299.06 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:59:03,007 epoch 7 - iter 1078/1546 - loss 0.11069285 - time (sec): 16.67 - samples/sec: 5260.74 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:59:05,348 epoch 7 - iter 1232/1546 - loss 0.11156839 - time (sec): 19.01 - samples/sec: 5246.88 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:59:07,702 epoch 7 - iter 1386/1546 - loss 0.11104971 - time (sec): 21.37 - samples/sec: 5237.77 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:59:10,067 epoch 7 - iter 1540/1546 - loss 0.10993191 - time (sec): 23.73 - samples/sec: 5217.63 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:59:10,167 ----------------------------------------------------------------------------------------------------
2023-10-20 09:59:10,167 EPOCH 7 done: loss 0.1098 - lr: 0.000017
2023-10-20 09:59:11,270 DEV : loss 0.09885262697935104 - f1-score (micro avg) 0.6147
2023-10-20 09:59:11,281 ----------------------------------------------------------------------------------------------------
2023-10-20 09:59:13,620 epoch 8 - iter 154/1546 - loss 0.10457091 - time (sec): 2.34 - samples/sec: 5371.89 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:59:16,035 epoch 8 - iter 308/1546 - loss 0.09573930 - time (sec): 4.75 - samples/sec: 5244.20 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:59:18,416 epoch 8 - iter 462/1546 - loss 0.09757855 - time (sec): 7.13 - samples/sec: 5222.20 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:59:20,846 epoch 8 - iter 616/1546 - loss 0.10232804 - time (sec): 9.56 - samples/sec: 5196.35 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:59:23,246 epoch 8 - iter 770/1546 - loss 0.10450911 - time (sec): 11.96 - samples/sec: 5213.17 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:59:25,609 epoch 8 - iter 924/1546 - loss 0.10466820 - time (sec): 14.33 - samples/sec: 5201.41 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:59:27,944 epoch 8 - iter 1078/1546 - loss 0.10690035 - time (sec): 16.66 - samples/sec: 5204.45 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:59:30,324 epoch 8 - iter 1232/1546 - loss 0.10601181 - time (sec): 19.04 - samples/sec: 5206.44 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:59:32,724 epoch 8 - iter 1386/1546 - loss 0.10561856 - time (sec): 21.44 - samples/sec: 5197.08 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:59:35,091 epoch 8 - iter 1540/1546 - loss 0.10482255 - time (sec): 23.81 - samples/sec: 5198.72 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:59:35,188 ----------------------------------------------------------------------------------------------------
2023-10-20 09:59:35,188 EPOCH 8 done: loss 0.1048 - lr: 0.000011
2023-10-20 09:59:36,269 DEV : loss 0.1057317703962326 - f1-score (micro avg) 0.6261
2023-10-20 09:59:36,281 ----------------------------------------------------------------------------------------------------
2023-10-20 09:59:38,590 epoch 9 - iter 154/1546 - loss 0.11356698 - time (sec): 2.31 - samples/sec: 5151.09 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:59:40,915 epoch 9 - iter 308/1546 - loss 0.10406692 - time (sec): 4.63 - samples/sec: 5176.55 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:59:43,279 epoch 9 - iter 462/1546 - loss 0.10401387 - time (sec): 7.00 - samples/sec: 5196.44 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:59:45,663 epoch 9 - iter 616/1546 - loss 0.09614905 - time (sec): 9.38 - samples/sec: 5264.89 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:59:48,049 epoch 9 - iter 770/1546 - loss 0.09323588 - time (sec): 11.77 - samples/sec: 5219.67 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:59:50,422 epoch 9 - iter 924/1546 - loss 0.09618676 - time (sec): 14.14 - samples/sec: 5189.75 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:59:52,822 epoch 9 - iter 1078/1546 - loss 0.09686063 - time (sec): 16.54 - samples/sec: 5188.16 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:59:55,194 epoch 9 - iter 1232/1546 - loss 0.09966681 - time (sec): 18.91 - samples/sec: 5213.59 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:59:57,578 epoch 9 - iter 1386/1546 - loss 0.09979067 - time (sec): 21.30 - samples/sec: 5253.50 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:59:59,886 epoch 9 - iter 1540/1546 - loss 0.10020466 - time (sec): 23.60 - samples/sec: 5245.93 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:59:59,978 ----------------------------------------------------------------------------------------------------
2023-10-20 09:59:59,978 EPOCH 9 done: loss 0.1004 - lr: 0.000006
2023-10-20 10:00:01,073 DEV : loss 0.10489093512296677 - f1-score (micro avg) 0.6565
2023-10-20 10:00:01,085 saving best model
2023-10-20 10:00:01,123 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:03,228 epoch 10 - iter 154/1546 - loss 0.11356945 - time (sec): 2.10 - samples/sec: 5568.14 - lr: 0.000005 - momentum: 0.000000
2023-10-20 10:00:05,574 epoch 10 - iter 308/1546 - loss 0.10220205 - time (sec): 4.45 - samples/sec: 5477.44 - lr: 0.000004 - momentum: 0.000000
2023-10-20 10:00:07,947 epoch 10 - iter 462/1546 - loss 0.09862619 - time (sec): 6.82 - samples/sec: 5456.90 - lr: 0.000004 - momentum: 0.000000
2023-10-20 10:00:10,344 epoch 10 - iter 616/1546 - loss 0.09815601 - time (sec): 9.22 - samples/sec: 5398.30 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:00:12,711 epoch 10 - iter 770/1546 - loss 0.09919748 - time (sec): 11.59 - samples/sec: 5350.37 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:00:15,063 epoch 10 - iter 924/1546 - loss 0.10078099 - time (sec): 13.94 - samples/sec: 5288.38 - lr: 0.000002 - momentum: 0.000000
2023-10-20 10:00:17,441 epoch 10 - iter 1078/1546 - loss 0.10013561 - time (sec): 16.32 - samples/sec: 5287.68 - lr: 0.000002 - momentum: 0.000000
2023-10-20 10:00:19,854 epoch 10 - iter 1232/1546 - loss 0.09648546 - time (sec): 18.73 - samples/sec: 5296.75 - lr: 0.000001 - momentum: 0.000000
2023-10-20 10:00:22,219 epoch 10 - iter 1386/1546 - loss 0.09661131 - time (sec): 21.10 - samples/sec: 5270.05 - lr: 0.000001 - momentum: 0.000000
2023-10-20 10:00:24,599 epoch 10 - iter 1540/1546 - loss 0.09632277 - time (sec): 23.48 - samples/sec: 5267.14 - lr: 0.000000 - momentum: 0.000000
2023-10-20 10:00:24,695 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:24,696 EPOCH 10 done: loss 0.0962 - lr: 0.000000
2023-10-20 10:00:25,790 DEV : loss 0.10527437180280685 - f1-score (micro avg) 0.6468
2023-10-20 10:00:25,836 ----------------------------------------------------------------------------------------------------
2023-10-20 10:00:25,836 Loading model from best epoch ...
2023-10-20 10:00:25,912 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 10:00:28,836
Results:
- F-score (micro) 0.5986
- F-score (macro) 0.342
- Accuracy 0.4367

By class:
              precision    recall  f1-score   support

         LOC     0.6831    0.6723    0.6777       946
    BUILDING     0.2317    0.1027    0.1423       185
      STREET     0.5833    0.1250    0.2059        56

   micro avg     0.6459    0.5577    0.5986      1187
   macro avg     0.4994    0.3000    0.3420      1187
weighted avg     0.6081    0.5577    0.5720      1187

2023-10-20 10:00:28,836 ----------------------------------------------------------------------------------------------------
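Note: the final test evaluation above gives micro-F1 0.5986, driven almost entirely by LOC (F1 0.6777), while BUILDING and STREET remain weak. The best checkpoint is saved as best-model.pt inside the training base path and can be loaded for inference roughly as follows; the example sentence is purely illustrative and the label type is assumed to be "ner".

from flair.data import Sentence
from flair.models import SequenceTagger

# best-model.pt is written into the base path logged at the top of this run.
tagger = SequenceTagger.load(
    "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("The new station was built on Euston Road in London .")
tagger.predict(sentence)

# Predicted spans carry the LOC / BUILDING / STREET labels from the 13-tag BIOES dictionary above.
for span in sentence.get_spans("ner"):
    print(span)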