flair-icdar-fr / training.log
2023-10-12 08:04:12,035 ----------------------------------------------------------------------------------------------------
2023-10-12 08:04:12,037 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-12 08:04:12,038 ----------------------------------------------------------------------------------------------------
2023-10-12 08:04:12,038 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-12 08:04:12,038 ----------------------------------------------------------------------------------------------------
2023-10-12 08:04:12,038 Train: 7936 sentences
2023-10-12 08:04:12,038 (train_with_dev=False, train_with_test=False)
2023-10-12 08:04:12,038 ----------------------------------------------------------------------------------------------------
2023-10-12 08:04:12,038 Training Params:
2023-10-12 08:04:12,039 - learning_rate: "0.00015"
2023-10-12 08:04:12,039 - mini_batch_size: "8"
2023-10-12 08:04:12,039 - max_epochs: "10"
2023-10-12 08:04:12,039 - shuffle: "True"
2023-10-12 08:04:12,039 ----------------------------------------------------------------------------------------------------
2023-10-12 08:04:12,039 Plugins:
2023-10-12 08:04:12,039 - TensorboardLogger
2023-10-12 08:04:12,039 - LinearScheduler | warmup_fraction: '0.1'
2023-10-12 08:04:12,039 ----------------------------------------------------------------------------------------------------
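Note on the schedule: the `lr` values in the iteration lines of this log are consistent with a linear warmup to the peak rate followed by a linear decay to zero. The sketch below is an illustration only, not Flair's own scheduler code. It assumes 992 optimizer steps per epoch (7936 training sentences at mini-batch size 8), 10 epochs, and a warmup covering 10% of all steps; the function name `linear_schedule_lr` is hypothetical.

```python
def linear_schedule_lr(step, peak_lr=0.00015, steps_per_epoch=992,
                       max_epochs=10, warmup_fraction=0.1):
    """Learning rate after `step` completed optimizer steps (assumed schedule)."""
    total_steps = steps_per_epoch * max_epochs            # 9920 in this run
    warmup_steps = round(total_steps * warmup_fraction)   # 992, i.e. one epoch
    if step <= warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Linear decay from the peak down to 0 at the last step.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


def global_step(epoch, iteration, steps_per_epoch=992):
    """Convert a 1-based (epoch, iter) pair from the log into a step count."""
    return (epoch - 1) * steps_per_epoch + iteration


if __name__ == "__main__":
    for epoch, iteration in [(1, 99), (1, 990), (2, 99), (5, 495), (10, 990)]:
        lr = linear_schedule_lr(global_step(epoch, iteration))
        print(f"epoch {epoch} - iter {iteration}/992 - lr: {lr:.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1, which matches the peak `lr: 0.000150` reported when epoch 1 completes.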
2023-10-12 08:04:12,039 Final evaluation on model from best epoch (best-model.pt)
2023-10-12 08:04:12,039 - metric: "('micro avg', 'f1-score')"
2023-10-12 08:04:12,039 ----------------------------------------------------------------------------------------------------
2023-10-12 08:04:12,039 Computation:
2023-10-12 08:04:12,040 - compute on device: cuda:0
2023-10-12 08:04:12,040 - embedding storage: none
2023-10-12 08:04:12,040 ----------------------------------------------------------------------------------------------------
2023-10-12 08:04:12,040 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1"
2023-10-12 08:04:12,040 ----------------------------------------------------------------------------------------------------
2023-10-12 08:04:12,040 ----------------------------------------------------------------------------------------------------
2023-10-12 08:04:12,040 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-12 08:05:06,449 epoch 1 - iter 99/992 - loss 2.58555570 - time (sec): 54.41 - samples/sec: 284.28 - lr: 0.000015 - momentum: 0.000000
2023-10-12 08:05:56,282 epoch 1 - iter 198/992 - loss 2.53686260 - time (sec): 104.24 - samples/sec: 302.44 - lr: 0.000030 - momentum: 0.000000
2023-10-12 08:06:46,233 epoch 1 - iter 297/992 - loss 2.34204530 - time (sec): 154.19 - samples/sec: 310.69 - lr: 0.000045 - momentum: 0.000000
2023-10-12 08:07:34,661 epoch 1 - iter 396/992 - loss 2.08169120 - time (sec): 202.62 - samples/sec: 317.45 - lr: 0.000060 - momentum: 0.000000
2023-10-12 08:08:24,602 epoch 1 - iter 495/992 - loss 1.82694454 - time (sec): 252.56 - samples/sec: 317.85 - lr: 0.000075 - momentum: 0.000000
2023-10-12 08:09:14,534 epoch 1 - iter 594/992 - loss 1.59247411 - time (sec): 302.49 - samples/sec: 320.69 - lr: 0.000090 - momentum: 0.000000
2023-10-12 08:10:03,550 epoch 1 - iter 693/992 - loss 1.39639771 - time (sec): 351.51 - samples/sec: 325.34 - lr: 0.000105 - momentum: 0.000000
2023-10-12 08:10:59,316 epoch 1 - iter 792/992 - loss 1.25466641 - time (sec): 407.27 - samples/sec: 321.81 - lr: 0.000120 - momentum: 0.000000
2023-10-12 08:11:53,277 epoch 1 - iter 891/992 - loss 1.14679909 - time (sec): 461.23 - samples/sec: 319.44 - lr: 0.000135 - momentum: 0.000000
2023-10-12 08:12:42,482 epoch 1 - iter 990/992 - loss 1.05527957 - time (sec): 510.44 - samples/sec: 320.66 - lr: 0.000150 - momentum: 0.000000
2023-10-12 08:12:43,887 ----------------------------------------------------------------------------------------------------
2023-10-12 08:12:43,888 EPOCH 1 done: loss 1.0537 - lr: 0.000150
2023-10-12 08:13:10,955 DEV : loss 0.18303227424621582 - f1-score (micro avg) 0.355
2023-10-12 08:13:11,005 saving best model
2023-10-12 08:13:11,967 ----------------------------------------------------------------------------------------------------
2023-10-12 08:14:02,123 epoch 2 - iter 99/992 - loss 0.24094875 - time (sec): 50.15 - samples/sec: 328.44 - lr: 0.000148 - momentum: 0.000000
2023-10-12 08:14:55,551 epoch 2 - iter 198/992 - loss 0.20476026 - time (sec): 103.58 - samples/sec: 315.61 - lr: 0.000147 - momentum: 0.000000
2023-10-12 08:15:49,201 epoch 2 - iter 297/992 - loss 0.19182712 - time (sec): 157.23 - samples/sec: 311.94 - lr: 0.000145 - momentum: 0.000000
2023-10-12 08:16:42,272 epoch 2 - iter 396/992 - loss 0.18719035 - time (sec): 210.30 - samples/sec: 312.52 - lr: 0.000143 - momentum: 0.000000
2023-10-12 08:17:36,327 epoch 2 - iter 495/992 - loss 0.18025399 - time (sec): 264.36 - samples/sec: 311.59 - lr: 0.000142 - momentum: 0.000000
2023-10-12 08:18:27,193 epoch 2 - iter 594/992 - loss 0.17568004 - time (sec): 315.22 - samples/sec: 312.61 - lr: 0.000140 - momentum: 0.000000
2023-10-12 08:19:21,888 epoch 2 - iter 693/992 - loss 0.17273836 - time (sec): 369.92 - samples/sec: 311.10 - lr: 0.000138 - momentum: 0.000000
2023-10-12 08:20:16,613 epoch 2 - iter 792/992 - loss 0.16679007 - time (sec): 424.64 - samples/sec: 308.04 - lr: 0.000137 - momentum: 0.000000
2023-10-12 08:21:07,673 epoch 2 - iter 891/992 - loss 0.16155770 - time (sec): 475.70 - samples/sec: 308.99 - lr: 0.000135 - momentum: 0.000000
2023-10-12 08:22:02,055 epoch 2 - iter 990/992 - loss 0.15729141 - time (sec): 530.08 - samples/sec: 308.45 - lr: 0.000133 - momentum: 0.000000
2023-10-12 08:22:03,276 ----------------------------------------------------------------------------------------------------
2023-10-12 08:22:03,276 EPOCH 2 done: loss 0.1570 - lr: 0.000133
2023-10-12 08:22:30,282 DEV : loss 0.0930715873837471 - f1-score (micro avg) 0.7059
2023-10-12 08:22:30,322 saving best model
2023-10-12 08:22:33,238 ----------------------------------------------------------------------------------------------------
2023-10-12 08:23:31,054 epoch 3 - iter 99/992 - loss 0.09366490 - time (sec): 57.81 - samples/sec: 272.24 - lr: 0.000132 - momentum: 0.000000
2023-10-12 08:24:26,452 epoch 3 - iter 198/992 - loss 0.09371713 - time (sec): 113.21 - samples/sec: 282.27 - lr: 0.000130 - momentum: 0.000000
2023-10-12 08:25:20,392 epoch 3 - iter 297/992 - loss 0.09379383 - time (sec): 167.15 - samples/sec: 291.97 - lr: 0.000128 - momentum: 0.000000
2023-10-12 08:26:13,353 epoch 3 - iter 396/992 - loss 0.09391945 - time (sec): 220.11 - samples/sec: 296.14 - lr: 0.000127 - momentum: 0.000000
2023-10-12 08:27:04,613 epoch 3 - iter 495/992 - loss 0.09225266 - time (sec): 271.37 - samples/sec: 299.92 - lr: 0.000125 - momentum: 0.000000
2023-10-12 08:27:58,875 epoch 3 - iter 594/992 - loss 0.09074357 - time (sec): 325.63 - samples/sec: 299.29 - lr: 0.000123 - momentum: 0.000000
2023-10-12 08:28:48,568 epoch 3 - iter 693/992 - loss 0.09041959 - time (sec): 375.33 - samples/sec: 301.62 - lr: 0.000122 - momentum: 0.000000
2023-10-12 08:29:39,210 epoch 3 - iter 792/992 - loss 0.08851234 - time (sec): 425.97 - samples/sec: 307.21 - lr: 0.000120 - momentum: 0.000000
2023-10-12 08:30:29,714 epoch 3 - iter 891/992 - loss 0.08699505 - time (sec): 476.47 - samples/sec: 310.07 - lr: 0.000118 - momentum: 0.000000
2023-10-12 08:31:24,309 epoch 3 - iter 990/992 - loss 0.08693648 - time (sec): 531.07 - samples/sec: 308.23 - lr: 0.000117 - momentum: 0.000000
2023-10-12 08:31:25,444 ----------------------------------------------------------------------------------------------------
2023-10-12 08:31:25,445 EPOCH 3 done: loss 0.0869 - lr: 0.000117
2023-10-12 08:31:51,548 DEV : loss 0.09189649671316147 - f1-score (micro avg) 0.7402
2023-10-12 08:31:51,594 saving best model
2023-10-12 08:31:54,213 ----------------------------------------------------------------------------------------------------
2023-10-12 08:32:43,918 epoch 4 - iter 99/992 - loss 0.06142325 - time (sec): 49.70 - samples/sec: 344.89 - lr: 0.000115 - momentum: 0.000000
2023-10-12 08:33:36,309 epoch 4 - iter 198/992 - loss 0.06089639 - time (sec): 102.09 - samples/sec: 333.62 - lr: 0.000113 - momentum: 0.000000
2023-10-12 08:34:25,710 epoch 4 - iter 297/992 - loss 0.06267594 - time (sec): 151.49 - samples/sec: 329.91 - lr: 0.000112 - momentum: 0.000000
2023-10-12 08:35:15,366 epoch 4 - iter 396/992 - loss 0.06005022 - time (sec): 201.15 - samples/sec: 326.69 - lr: 0.000110 - momentum: 0.000000
2023-10-12 08:36:07,476 epoch 4 - iter 495/992 - loss 0.05963130 - time (sec): 253.26 - samples/sec: 323.68 - lr: 0.000108 - momentum: 0.000000
2023-10-12 08:36:57,931 epoch 4 - iter 594/992 - loss 0.05891404 - time (sec): 303.71 - samples/sec: 322.82 - lr: 0.000107 - momentum: 0.000000
2023-10-12 08:37:46,481 epoch 4 - iter 693/992 - loss 0.05887390 - time (sec): 352.26 - samples/sec: 325.28 - lr: 0.000105 - momentum: 0.000000
2023-10-12 08:38:34,872 epoch 4 - iter 792/992 - loss 0.05887289 - time (sec): 400.65 - samples/sec: 326.32 - lr: 0.000103 - momentum: 0.000000
2023-10-12 08:39:29,526 epoch 4 - iter 891/992 - loss 0.05755141 - time (sec): 455.31 - samples/sec: 324.49 - lr: 0.000102 - momentum: 0.000000
2023-10-12 08:40:24,825 epoch 4 - iter 990/992 - loss 0.05782701 - time (sec): 510.61 - samples/sec: 320.69 - lr: 0.000100 - momentum: 0.000000
2023-10-12 08:40:25,816 ----------------------------------------------------------------------------------------------------
2023-10-12 08:40:25,816 EPOCH 4 done: loss 0.0578 - lr: 0.000100
2023-10-12 08:40:51,595 DEV : loss 0.09931203722953796 - f1-score (micro avg) 0.7623
2023-10-12 08:40:51,635 saving best model
2023-10-12 08:40:57,442 ----------------------------------------------------------------------------------------------------
2023-10-12 08:41:49,152 epoch 5 - iter 99/992 - loss 0.04436841 - time (sec): 51.71 - samples/sec: 312.44 - lr: 0.000098 - momentum: 0.000000
2023-10-12 08:42:42,375 epoch 5 - iter 198/992 - loss 0.03706072 - time (sec): 104.93 - samples/sec: 308.20 - lr: 0.000097 - momentum: 0.000000
2023-10-12 08:43:35,907 epoch 5 - iter 297/992 - loss 0.03821323 - time (sec): 158.46 - samples/sec: 307.02 - lr: 0.000095 - momentum: 0.000000
2023-10-12 08:44:31,403 epoch 5 - iter 396/992 - loss 0.03912269 - time (sec): 213.96 - samples/sec: 304.26 - lr: 0.000093 - momentum: 0.000000
2023-10-12 08:45:22,108 epoch 5 - iter 495/992 - loss 0.03917046 - time (sec): 264.66 - samples/sec: 307.19 - lr: 0.000092 - momentum: 0.000000
2023-10-12 08:46:11,499 epoch 5 - iter 594/992 - loss 0.04022926 - time (sec): 314.05 - samples/sec: 311.81 - lr: 0.000090 - momentum: 0.000000
2023-10-12 08:46:59,993 epoch 5 - iter 693/992 - loss 0.03989622 - time (sec): 362.55 - samples/sec: 315.65 - lr: 0.000088 - momentum: 0.000000
2023-10-12 08:47:58,985 epoch 5 - iter 792/992 - loss 0.04056929 - time (sec): 421.54 - samples/sec: 311.55 - lr: 0.000087 - momentum: 0.000000
2023-10-12 08:48:50,892 epoch 5 - iter 891/992 - loss 0.04088817 - time (sec): 473.45 - samples/sec: 312.27 - lr: 0.000085 - momentum: 0.000000
2023-10-12 08:49:38,976 epoch 5 - iter 990/992 - loss 0.04156030 - time (sec): 521.53 - samples/sec: 313.73 - lr: 0.000083 - momentum: 0.000000
2023-10-12 08:49:40,070 ----------------------------------------------------------------------------------------------------
2023-10-12 08:49:40,071 EPOCH 5 done: loss 0.0415 - lr: 0.000083
2023-10-12 08:50:06,253 DEV : loss 0.11372340470552444 - f1-score (micro avg) 0.756
2023-10-12 08:50:06,293 ----------------------------------------------------------------------------------------------------
2023-10-12 08:50:55,831 epoch 6 - iter 99/992 - loss 0.02534475 - time (sec): 49.54 - samples/sec: 316.24 - lr: 0.000082 - momentum: 0.000000
2023-10-12 08:51:50,289 epoch 6 - iter 198/992 - loss 0.02728538 - time (sec): 103.99 - samples/sec: 307.91 - lr: 0.000080 - momentum: 0.000000
2023-10-12 08:52:44,094 epoch 6 - iter 297/992 - loss 0.02693384 - time (sec): 157.80 - samples/sec: 305.30 - lr: 0.000078 - momentum: 0.000000
2023-10-12 08:53:36,994 epoch 6 - iter 396/992 - loss 0.02900133 - time (sec): 210.70 - samples/sec: 309.30 - lr: 0.000077 - momentum: 0.000000
2023-10-12 08:54:29,419 epoch 6 - iter 495/992 - loss 0.02831503 - time (sec): 263.12 - samples/sec: 308.36 - lr: 0.000075 - momentum: 0.000000
2023-10-12 08:55:19,177 epoch 6 - iter 594/992 - loss 0.02808324 - time (sec): 312.88 - samples/sec: 312.68 - lr: 0.000073 - momentum: 0.000000
2023-10-12 08:56:07,273 epoch 6 - iter 693/992 - loss 0.02892834 - time (sec): 360.98 - samples/sec: 317.73 - lr: 0.000072 - momentum: 0.000000
2023-10-12 08:56:55,649 epoch 6 - iter 792/992 - loss 0.03069744 - time (sec): 409.35 - samples/sec: 319.35 - lr: 0.000070 - momentum: 0.000000
2023-10-12 08:57:50,900 epoch 6 - iter 891/992 - loss 0.03124301 - time (sec): 464.60 - samples/sec: 317.11 - lr: 0.000068 - momentum: 0.000000
2023-10-12 08:58:43,506 epoch 6 - iter 990/992 - loss 0.03144417 - time (sec): 517.21 - samples/sec: 316.34 - lr: 0.000067 - momentum: 0.000000
2023-10-12 08:58:44,493 ----------------------------------------------------------------------------------------------------
2023-10-12 08:58:44,493 EPOCH 6 done: loss 0.0314 - lr: 0.000067
2023-10-12 08:59:08,717 DEV : loss 0.13427288830280304 - f1-score (micro avg) 0.7743
2023-10-12 08:59:08,761 saving best model
2023-10-12 08:59:11,805 ----------------------------------------------------------------------------------------------------
2023-10-12 09:00:02,976 epoch 7 - iter 99/992 - loss 0.01884774 - time (sec): 51.17 - samples/sec: 318.71 - lr: 0.000065 - momentum: 0.000000
2023-10-12 09:00:51,080 epoch 7 - iter 198/992 - loss 0.02301048 - time (sec): 99.27 - samples/sec: 332.12 - lr: 0.000063 - momentum: 0.000000
2023-10-12 09:01:38,861 epoch 7 - iter 297/992 - loss 0.02287236 - time (sec): 147.05 - samples/sec: 332.81 - lr: 0.000062 - momentum: 0.000000
2023-10-12 09:02:26,775 epoch 7 - iter 396/992 - loss 0.02348912 - time (sec): 194.97 - samples/sec: 336.71 - lr: 0.000060 - momentum: 0.000000
2023-10-12 09:03:14,307 epoch 7 - iter 495/992 - loss 0.02304502 - time (sec): 242.50 - samples/sec: 336.33 - lr: 0.000058 - momentum: 0.000000
2023-10-12 09:04:01,858 epoch 7 - iter 594/992 - loss 0.02320461 - time (sec): 290.05 - samples/sec: 337.37 - lr: 0.000057 - momentum: 0.000000
2023-10-12 09:04:51,398 epoch 7 - iter 693/992 - loss 0.02410117 - time (sec): 339.59 - samples/sec: 337.92 - lr: 0.000055 - momentum: 0.000000
2023-10-12 09:05:37,198 epoch 7 - iter 792/992 - loss 0.02467991 - time (sec): 385.39 - samples/sec: 336.38 - lr: 0.000053 - momentum: 0.000000
2023-10-12 09:06:24,113 epoch 7 - iter 891/992 - loss 0.02411616 - time (sec): 432.30 - samples/sec: 338.45 - lr: 0.000052 - momentum: 0.000000
2023-10-12 09:07:11,294 epoch 7 - iter 990/992 - loss 0.02385416 - time (sec): 479.48 - samples/sec: 341.20 - lr: 0.000050 - momentum: 0.000000
2023-10-12 09:07:12,268 ----------------------------------------------------------------------------------------------------
2023-10-12 09:07:12,268 EPOCH 7 done: loss 0.0238 - lr: 0.000050
2023-10-12 09:07:37,616 DEV : loss 0.16945815086364746 - f1-score (micro avg) 0.7625
2023-10-12 09:07:37,658 ----------------------------------------------------------------------------------------------------
2023-10-12 09:08:25,636 epoch 8 - iter 99/992 - loss 0.01826581 - time (sec): 47.98 - samples/sec: 352.82 - lr: 0.000048 - momentum: 0.000000
2023-10-12 09:09:15,818 epoch 8 - iter 198/992 - loss 0.01797263 - time (sec): 98.16 - samples/sec: 328.55 - lr: 0.000047 - momentum: 0.000000
2023-10-12 09:10:08,897 epoch 8 - iter 297/992 - loss 0.01946032 - time (sec): 151.24 - samples/sec: 315.74 - lr: 0.000045 - momentum: 0.000000
2023-10-12 09:10:58,922 epoch 8 - iter 396/992 - loss 0.01952599 - time (sec): 201.26 - samples/sec: 316.86 - lr: 0.000043 - momentum: 0.000000
2023-10-12 09:11:52,850 epoch 8 - iter 495/992 - loss 0.01857797 - time (sec): 255.19 - samples/sec: 315.56 - lr: 0.000042 - momentum: 0.000000
2023-10-12 09:12:45,735 epoch 8 - iter 594/992 - loss 0.02004084 - time (sec): 308.08 - samples/sec: 317.70 - lr: 0.000040 - momentum: 0.000000
2023-10-12 09:13:36,327 epoch 8 - iter 693/992 - loss 0.02014586 - time (sec): 358.67 - samples/sec: 317.68 - lr: 0.000038 - momentum: 0.000000
2023-10-12 09:14:30,138 epoch 8 - iter 792/992 - loss 0.01980906 - time (sec): 412.48 - samples/sec: 316.70 - lr: 0.000037 - momentum: 0.000000
2023-10-12 09:15:21,623 epoch 8 - iter 891/992 - loss 0.02006957 - time (sec): 463.96 - samples/sec: 316.10 - lr: 0.000035 - momentum: 0.000000
2023-10-12 09:16:09,180 epoch 8 - iter 990/992 - loss 0.01945183 - time (sec): 511.52 - samples/sec: 320.12 - lr: 0.000033 - momentum: 0.000000
2023-10-12 09:16:10,068 ----------------------------------------------------------------------------------------------------
2023-10-12 09:16:10,068 EPOCH 8 done: loss 0.0194 - lr: 0.000033
2023-10-12 09:16:34,568 DEV : loss 0.1777261346578598 - f1-score (micro avg) 0.7603
2023-10-12 09:16:34,606 ----------------------------------------------------------------------------------------------------
2023-10-12 09:17:22,132 epoch 9 - iter 99/992 - loss 0.02532699 - time (sec): 47.52 - samples/sec: 362.18 - lr: 0.000032 - momentum: 0.000000
2023-10-12 09:18:10,391 epoch 9 - iter 198/992 - loss 0.02156891 - time (sec): 95.78 - samples/sec: 351.32 - lr: 0.000030 - momentum: 0.000000
2023-10-12 09:18:56,755 epoch 9 - iter 297/992 - loss 0.01810471 - time (sec): 142.15 - samples/sec: 356.12 - lr: 0.000028 - momentum: 0.000000
2023-10-12 09:19:44,249 epoch 9 - iter 396/992 - loss 0.01799543 - time (sec): 189.64 - samples/sec: 348.96 - lr: 0.000027 - momentum: 0.000000
2023-10-12 09:20:31,198 epoch 9 - iter 495/992 - loss 0.01649332 - time (sec): 236.59 - samples/sec: 349.44 - lr: 0.000025 - momentum: 0.000000
2023-10-12 09:21:19,342 epoch 9 - iter 594/992 - loss 0.01544692 - time (sec): 284.73 - samples/sec: 347.94 - lr: 0.000023 - momentum: 0.000000
2023-10-12 09:22:07,048 epoch 9 - iter 693/992 - loss 0.01478198 - time (sec): 332.44 - samples/sec: 348.11 - lr: 0.000022 - momentum: 0.000000
2023-10-12 09:22:55,379 epoch 9 - iter 792/992 - loss 0.01567478 - time (sec): 380.77 - samples/sec: 344.64 - lr: 0.000020 - momentum: 0.000000
2023-10-12 09:23:42,926 epoch 9 - iter 891/992 - loss 0.01599589 - time (sec): 428.32 - samples/sec: 344.01 - lr: 0.000018 - momentum: 0.000000
2023-10-12 09:24:31,114 epoch 9 - iter 990/992 - loss 0.01558416 - time (sec): 476.51 - samples/sec: 343.42 - lr: 0.000017 - momentum: 0.000000
2023-10-12 09:24:32,095 ----------------------------------------------------------------------------------------------------
2023-10-12 09:24:32,095 EPOCH 9 done: loss 0.0156 - lr: 0.000017
2023-10-12 09:24:57,467 DEV : loss 0.18520045280456543 - f1-score (micro avg) 0.7619
2023-10-12 09:24:57,511 ----------------------------------------------------------------------------------------------------
2023-10-12 09:25:46,427 epoch 10 - iter 99/992 - loss 0.01050991 - time (sec): 48.91 - samples/sec: 341.19 - lr: 0.000015 - momentum: 0.000000
2023-10-12 09:26:34,054 epoch 10 - iter 198/992 - loss 0.01124717 - time (sec): 96.54 - samples/sec: 339.42 - lr: 0.000013 - momentum: 0.000000
2023-10-12 09:27:22,983 epoch 10 - iter 297/992 - loss 0.01103351 - time (sec): 145.47 - samples/sec: 339.57 - lr: 0.000012 - momentum: 0.000000
2023-10-12 09:28:15,052 epoch 10 - iter 396/992 - loss 0.01233967 - time (sec): 197.54 - samples/sec: 333.82 - lr: 0.000010 - momentum: 0.000000
2023-10-12 09:29:10,897 epoch 10 - iter 495/992 - loss 0.01168010 - time (sec): 253.38 - samples/sec: 325.72 - lr: 0.000008 - momentum: 0.000000
2023-10-12 09:30:07,424 epoch 10 - iter 594/992 - loss 0.01179804 - time (sec): 309.91 - samples/sec: 317.53 - lr: 0.000007 - momentum: 0.000000
2023-10-12 09:31:02,969 epoch 10 - iter 693/992 - loss 0.01193022 - time (sec): 365.46 - samples/sec: 312.37 - lr: 0.000005 - momentum: 0.000000
2023-10-12 09:31:59,360 epoch 10 - iter 792/992 - loss 0.01236357 - time (sec): 421.85 - samples/sec: 309.60 - lr: 0.000004 - momentum: 0.000000
2023-10-12 09:32:52,374 epoch 10 - iter 891/992 - loss 0.01270245 - time (sec): 474.86 - samples/sec: 310.06 - lr: 0.000002 - momentum: 0.000000
2023-10-12 09:33:40,701 epoch 10 - iter 990/992 - loss 0.01318705 - time (sec): 523.19 - samples/sec: 313.02 - lr: 0.000000 - momentum: 0.000000
2023-10-12 09:33:41,597 ----------------------------------------------------------------------------------------------------
2023-10-12 09:33:41,597 EPOCH 10 done: loss 0.0134 - lr: 0.000000
2023-10-12 09:34:08,746 DEV : loss 0.19303283095359802 - f1-score (micro avg) 0.7562
2023-10-12 09:34:09,762 ----------------------------------------------------------------------------------------------------
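Note on best-epoch selection: the run evaluates the final model from the epoch with the highest dev micro F1 (see the "Final evaluation on model from best epoch" header). A minimal sketch for recovering that epoch from a log like this one follows. It assumes exactly one `DEV :` line per epoch, in order; the names `DEV_LINE`, `dev_scores`, and `best_epoch` are hypothetical helpers, not part of Flair.

```python
import re

# Matches lines such as:
# "... DEV : loss 0.0930715873837471 - f1-score (micro avg) 0.7059"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\) (?P<f1>[\d.]+)"
)


def dev_scores(log_lines):
    """Return a list of (dev_loss, dev_f1) tuples, one per DEV line found."""
    scores = []
    for line in log_lines:
        match = DEV_LINE.search(line)
        if match:
            scores.append((float(match.group("loss")), float(match.group("f1"))))
    return scores


def best_epoch(scores):
    """1-based index of the first epoch with the highest dev F1."""
    f1_values = [f1 for _, f1 in scores]
    return f1_values.index(max(f1_values)) + 1


if __name__ == "__main__":
    # The path is an assumption; point it at a local copy of this log.
    with open("training.log", encoding="utf-8") as handle:
        scores = dev_scores(handle)
    print("best epoch:", best_epoch(scores))
```

Dev loss rises from epoch 3 onward while dev F1 keeps fluctuating upward until epoch 6, so selecting on loss instead of F1 would pick a different checkpoint.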
2023-10-12 09:34:09,764 Loading model from best epoch ...
2023-10-12 09:34:15,360 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-12 09:34:40,369
Results:
- F-score (micro) 0.7486
- F-score (macro) 0.6567
- Accuracy 0.6255
By class:
              precision    recall  f1-score   support

         LOC     0.8082    0.8107    0.8095       655
         PER     0.6795    0.7892    0.7303       223
         ORG     0.5000    0.3780    0.4305       127

   micro avg     0.7460    0.7512    0.7486      1005
   macro avg     0.6626    0.6593    0.6567      1005
weighted avg     0.7407    0.7512    0.7440      1005
2023-10-12 09:34:40,369 ----------------------------------------------------------------------------------------------------
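Note on the averages: the macro, weighted, and micro rows can be cross-checked against the per-class rows. The sketch below uses only the rounded figures printed in this log, so it reproduces the reported averages to within rounding (about 1e-4), not exactly. The helper names are hypothetical.

```python
# (precision, recall, f1, support) per class, copied from the report above.
PER_CLASS = {
    "LOC": (0.8082, 0.8107, 0.8095, 655),
    "PER": (0.6795, 0.7892, 0.7303, 223),
    "ORG": (0.5000, 0.3780, 0.4305, 127),
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_f1(rows):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1, _ in rows.values()) / len(rows)


def weighted_f1(rows):
    """Mean of the per-class F1 scores weighted by class support."""
    total = sum(support for _, _, _, support in rows.values())
    return sum(f1 * support for _, _, f1, support in rows.values()) / total


if __name__ == "__main__":
    print(f"macro F1    ~ {macro_f1(PER_CLASS):.4f}")
    print(f"weighted F1 ~ {weighted_f1(PER_CLASS):.4f}")
    # Micro F1 from the reported micro precision and recall.
    print(f"micro F1    ~ {f1_from(0.7460, 0.7512):.4f}")
```

The gap between micro F1 (0.7486) and macro F1 (0.6567) comes mainly from ORG, which has the smallest support (127) and the lowest F1 (0.4305): macro averaging gives it the same weight as LOC.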