2023-10-19 10:38:20,282 ----------------------------------------------------------------------------------------------------
2023-10-19 10:38:20,282 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 10:38:20,282 ----------------------------------------------------------------------------------------------------
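The module summary above is a Flair SequenceTagger whose only embedding is a tiny (2-layer, 128-dim) BERT, followed directly by a Linear(128 -> 17) head with no RNN and no CRF. Below is a minimal sketch of how such a tagger could be assembled; the checkpoint name and the constructor arguments are assumptions inferred from the module print and from the base path logged further down ("poolingfirst", "layers-1", "crfFalse"), not taken from the actual training script.

```python
# Hypothetical reconstruction of the tagger printed above (not the original script).
from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# 17-tag BIOES dictionary, copied from the "SequenceTagger predicts" line at the end of this log
label_dict = Dictionary(add_unk=False)
for tag in [
    "O",
    "S-LOC", "B-LOC", "E-LOC", "I-LOC",
    "S-PER", "B-PER", "E-PER", "I-PER",
    "S-ORG", "B-ORG", "E-ORG", "I-ORG",
    "S-HumanProd", "B-HumanProd", "E-HumanProd", "I-HumanProd",
]:
    label_dict.add_item(tag)

embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-tiny-historic-multilingual-cased",  # assumed from the base path; 2 layers, hidden size 128
    layers="-1",                # "layers-1" in the base path
    subtoken_pooling="first",   # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,            # required argument; unused here since no RNN/CRF is configured
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,              # "crfFalse" in the base path
    use_rnn=False,              # matches the bare Linear(128 -> 17) head in the summary
    reproject_embeddings=False,
)
print(tagger)                   # prints a module summary like the one above
```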
2023-10-19 10:38:20,282 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-19 10:38:20,282 ----------------------------------------------------------------------------------------------------
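The MultiCorpus line refers to the German NewsEye split of HIPE-2022, which the cache path above (.../ner_hipe_2022/v2.1/newseye/de/with_doc_seperator) suggests was loaded through Flair's built-in NER_HIPE_2022 dataset class. A loading sketch under that assumption; the keyword arguments are guesses derived from the cache path, not from the original code.

```python
# Hypothetical corpus loading matching the MultiCorpus summary above.
# Constructor arguments are inferred from the cache path and may differ from the actual run.
from flair.data import MultiCorpus
from flair.datasets import NER_HIPE_2022

hipe_corpus = NER_HIPE_2022(
    dataset_name="newseye",        # HIPE-2022 sub-dataset
    language="de",                 # German split: 20847 train / 1123 dev / 3350 test sentences
    add_document_separator=True,   # matches the "with_doc_seperator" cache folder
)
corpus = MultiCorpus([hipe_corpus])  # the log prints a MultiCorpus wrapping this single corpus
label_dict = corpus.make_label_dictionary(label_type="ner")
```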
2023-10-19 10:38:20,282 Train: 20847 sentences
2023-10-19 10:38:20,282 (train_with_dev=False, train_with_test=False)
2023-10-19 10:38:20,282 ----------------------------------------------------------------------------------------------------
2023-10-19 10:38:20,282 Training Params:
2023-10-19 10:38:20,282 - learning_rate: "3e-05"
2023-10-19 10:38:20,283 - mini_batch_size: "4"
2023-10-19 10:38:20,283 - max_epochs: "10"
2023-10-19 10:38:20,283 - shuffle: "True"
2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
2023-10-19 10:38:20,283 Plugins:
2023-10-19 10:38:20,283 - TensorboardLogger
2023-10-19 10:38:20,283 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
2023-10-19 10:38:20,283 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 10:38:20,283 - metric: "('micro avg', 'f1-score')"
2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
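The Training Params, Plugins, and final-evaluation blocks map directly onto Flair's fine-tuning entry point. The sketch below shows the corresponding trainer call with the logged hyperparameters; the plugin import path and the variable names (tagger, corpus) are assumptions carried over from the sketches above, and the exact invocation in the original script may differ.

```python
# Hypothetical trainer call matching the logged settings (lr 3e-05, batch size 4,
# 10 epochs, shuffling on). fine_tune() applies a linear schedule with warmup by default;
# the TensorboardLogger import path is an assumption about the installed Flair version.
from flair.trainers import ModelTrainer
from flair.trainers.plugins import TensorboardLogger

trainer = ModelTrainer(tagger, corpus)  # tagger and corpus as sketched above

trainer.fine_tune(
    "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=3e-05,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    main_evaluation_metric=("micro avg", "f1-score"),  # metric used to select best-model.pt
    plugins=[TensorboardLogger()],
)
```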
2023-10-19 10:38:20,283 Computation:
2023-10-19 10:38:20,283 - compute on device: cuda:0
2023-10-19 10:38:20,283 - embedding storage: none
2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
2023-10-19 10:38:20,283 Model training base path: "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
2023-10-19 10:38:20,283 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 10:38:29,211 epoch 1 - iter 521/5212 - loss 2.72371222 - time (sec): 8.93 - samples/sec: 4096.81 - lr: 0.000003 - momentum: 0.000000
2023-10-19 10:38:37,636 epoch 1 - iter 1042/5212 - loss 2.05502782 - time (sec): 17.35 - samples/sec: 4193.38 - lr: 0.000006 - momentum: 0.000000
2023-10-19 10:38:45,775 epoch 1 - iter 1563/5212 - loss 1.58883158 - time (sec): 25.49 - samples/sec: 4304.23 - lr: 0.000009 - momentum: 0.000000
2023-10-19 10:38:54,422 epoch 1 - iter 2084/5212 - loss 1.32512439 - time (sec): 34.14 - samples/sec: 4326.46 - lr: 0.000012 - momentum: 0.000000
2023-10-19 10:39:02,797 epoch 1 - iter 2605/5212 - loss 1.17407642 - time (sec): 42.51 - samples/sec: 4404.91 - lr: 0.000015 - momentum: 0.000000
2023-10-19 10:39:11,106 epoch 1 - iter 3126/5212 - loss 1.08043972 - time (sec): 50.82 - samples/sec: 4390.07 - lr: 0.000018 - momentum: 0.000000
2023-10-19 10:39:19,406 epoch 1 - iter 3647/5212 - loss 1.01154260 - time (sec): 59.12 - samples/sec: 4394.12 - lr: 0.000021 - momentum: 0.000000
2023-10-19 10:39:27,380 epoch 1 - iter 4168/5212 - loss 0.94868657 - time (sec): 67.10 - samples/sec: 4397.60 - lr: 0.000024 - momentum: 0.000000
2023-10-19 10:39:35,900 epoch 1 - iter 4689/5212 - loss 0.88980347 - time (sec): 75.62 - samples/sec: 4380.26 - lr: 0.000027 - momentum: 0.000000
2023-10-19 10:39:44,202 epoch 1 - iter 5210/5212 - loss 0.84260434 - time (sec): 83.92 - samples/sec: 4377.91 - lr: 0.000030 - momentum: 0.000000
2023-10-19 10:39:44,234 ----------------------------------------------------------------------------------------------------
2023-10-19 10:39:44,234 EPOCH 1 done: loss 0.8426 - lr: 0.000030
2023-10-19 10:39:46,475 DEV : loss 0.1433752328157425 - f1-score (micro avg) 0.0291
2023-10-19 10:39:46,497 saving best model
2023-10-19 10:39:46,526 ----------------------------------------------------------------------------------------------------
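The lr column behaves exactly as a LinearScheduler with warmup_fraction 0.1 predicts: with 5212 iterations per epoch and 10 epochs, the first 10% of the 52120 total steps (roughly epoch 1) ramp the rate from 0 up to 3e-05, after which it decays linearly back to 0 by the end of epoch 10. A small sanity-check sketch, assuming that generic warmup-then-linear-decay formula (not copied from Flair's implementation):

```python
# Sanity check of the lr column against a linear warmup + linear decay schedule.
STEPS_PER_EPOCH = 5212
TOTAL_STEPS = STEPS_PER_EPOCH * 10          # max_epochs = 10
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)       # 5212 steps, i.e. roughly epoch 1
PEAK_LR = 3e-05

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    if step <= WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(f"{lr_at(521):.6f}")                   # 0.000003 -> matches epoch 1, iter 521/5212
print(f"{lr_at(5210):.6f}")                  # 0.000030 -> matches the end of epoch 1
print(f"{lr_at(2 * STEPS_PER_EPOCH):.6f}")   # 0.000027 -> matches the end of epoch 2
print(f"{lr_at(TOTAL_STEPS):.6f}")           # 0.000000 -> matches the end of epoch 10
```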
2023-10-19 10:39:54,611 epoch 2 - iter 521/5212 - loss 0.40879602 - time (sec): 8.09 - samples/sec: 4230.25 - lr: 0.000030 - momentum: 0.000000
2023-10-19 10:40:02,957 epoch 2 - iter 1042/5212 - loss 0.41373491 - time (sec): 16.43 - samples/sec: 4320.03 - lr: 0.000029 - momentum: 0.000000
2023-10-19 10:40:11,068 epoch 2 - iter 1563/5212 - loss 0.41110618 - time (sec): 24.54 - samples/sec: 4287.77 - lr: 0.000029 - momentum: 0.000000
2023-10-19 10:40:19,346 epoch 2 - iter 2084/5212 - loss 0.39790099 - time (sec): 32.82 - samples/sec: 4380.84 - lr: 0.000029 - momentum: 0.000000
2023-10-19 10:40:27,777 epoch 2 - iter 2605/5212 - loss 0.39498142 - time (sec): 41.25 - samples/sec: 4415.85 - lr: 0.000028 - momentum: 0.000000
2023-10-19 10:40:36,056 epoch 2 - iter 3126/5212 - loss 0.38607733 - time (sec): 49.53 - samples/sec: 4425.77 - lr: 0.000028 - momentum: 0.000000
2023-10-19 10:40:44,415 epoch 2 - iter 3647/5212 - loss 0.37832413 - time (sec): 57.89 - samples/sec: 4438.26 - lr: 0.000028 - momentum: 0.000000
2023-10-19 10:40:53,047 epoch 2 - iter 4168/5212 - loss 0.37306540 - time (sec): 66.52 - samples/sec: 4423.81 - lr: 0.000027 - momentum: 0.000000
2023-10-19 10:41:01,276 epoch 2 - iter 4689/5212 - loss 0.36784622 - time (sec): 74.75 - samples/sec: 4421.81 - lr: 0.000027 - momentum: 0.000000
2023-10-19 10:41:09,649 epoch 2 - iter 5210/5212 - loss 0.36564089 - time (sec): 83.12 - samples/sec: 4419.15 - lr: 0.000027 - momentum: 0.000000
2023-10-19 10:41:09,684 ----------------------------------------------------------------------------------------------------
2023-10-19 10:41:09,684 EPOCH 2 done: loss 0.3656 - lr: 0.000027
2023-10-19 10:41:14,780 DEV : loss 0.1391393542289734 - f1-score (micro avg) 0.2928
2023-10-19 10:41:14,804 saving best model
2023-10-19 10:41:14,840 ----------------------------------------------------------------------------------------------------
2023-10-19 10:41:23,076 epoch 3 - iter 521/5212 - loss 0.32466550 - time (sec): 8.24 - samples/sec: 4448.41 - lr: 0.000026 - momentum: 0.000000
2023-10-19 10:41:31,416 epoch 3 - iter 1042/5212 - loss 0.31821062 - time (sec): 16.58 - samples/sec: 4451.66 - lr: 0.000026 - momentum: 0.000000
2023-10-19 10:41:39,462 epoch 3 - iter 1563/5212 - loss 0.32728696 - time (sec): 24.62 - samples/sec: 4451.18 - lr: 0.000026 - momentum: 0.000000
2023-10-19 10:41:47,786 epoch 3 - iter 2084/5212 - loss 0.31607836 - time (sec): 32.95 - samples/sec: 4487.01 - lr: 0.000025 - momentum: 0.000000
2023-10-19 10:41:56,154 epoch 3 - iter 2605/5212 - loss 0.31480478 - time (sec): 41.31 - samples/sec: 4494.25 - lr: 0.000025 - momentum: 0.000000
2023-10-19 10:42:04,737 epoch 3 - iter 3126/5212 - loss 0.31038250 - time (sec): 49.90 - samples/sec: 4473.56 - lr: 0.000025 - momentum: 0.000000
2023-10-19 10:42:12,965 epoch 3 - iter 3647/5212 - loss 0.31026995 - time (sec): 58.12 - samples/sec: 4453.95 - lr: 0.000024 - momentum: 0.000000
2023-10-19 10:42:21,291 epoch 3 - iter 4168/5212 - loss 0.31015311 - time (sec): 66.45 - samples/sec: 4444.22 - lr: 0.000024 - momentum: 0.000000
2023-10-19 10:42:29,486 epoch 3 - iter 4689/5212 - loss 0.31207638 - time (sec): 74.65 - samples/sec: 4429.21 - lr: 0.000024 - momentum: 0.000000
2023-10-19 10:42:37,697 epoch 3 - iter 5210/5212 - loss 0.31193129 - time (sec): 82.86 - samples/sec: 4433.61 - lr: 0.000023 - momentum: 0.000000
2023-10-19 10:42:37,733 ----------------------------------------------------------------------------------------------------
2023-10-19 10:42:37,733 EPOCH 3 done: loss 0.3119 - lr: 0.000023
2023-10-19 10:42:42,852 DEV : loss 0.13720223307609558 - f1-score (micro avg) 0.311
2023-10-19 10:42:42,876 saving best model
2023-10-19 10:42:42,916 ----------------------------------------------------------------------------------------------------
2023-10-19 10:42:51,333 epoch 4 - iter 521/5212 - loss 0.25709768 - time (sec): 8.42 - samples/sec: 4514.44 - lr: 0.000023 - momentum: 0.000000
2023-10-19 10:42:59,518 epoch 4 - iter 1042/5212 - loss 0.26137660 - time (sec): 16.60 - samples/sec: 4367.48 - lr: 0.000023 - momentum: 0.000000
2023-10-19 10:43:07,636 epoch 4 - iter 1563/5212 - loss 0.27629460 - time (sec): 24.72 - samples/sec: 4311.54 - lr: 0.000022 - momentum: 0.000000
2023-10-19 10:43:15,953 epoch 4 - iter 2084/5212 - loss 0.27504062 - time (sec): 33.04 - samples/sec: 4377.33 - lr: 0.000022 - momentum: 0.000000
2023-10-19 10:43:24,401 epoch 4 - iter 2605/5212 - loss 0.27571513 - time (sec): 41.48 - samples/sec: 4448.97 - lr: 0.000022 - momentum: 0.000000
2023-10-19 10:43:32,741 epoch 4 - iter 3126/5212 - loss 0.27824346 - time (sec): 49.82 - samples/sec: 4441.02 - lr: 0.000021 - momentum: 0.000000
2023-10-19 10:43:41,037 epoch 4 - iter 3647/5212 - loss 0.27812047 - time (sec): 58.12 - samples/sec: 4436.80 - lr: 0.000021 - momentum: 0.000000
2023-10-19 10:43:49,174 epoch 4 - iter 4168/5212 - loss 0.28260826 - time (sec): 66.26 - samples/sec: 4408.59 - lr: 0.000021 - momentum: 0.000000
2023-10-19 10:43:57,417 epoch 4 - iter 4689/5212 - loss 0.28076208 - time (sec): 74.50 - samples/sec: 4423.38 - lr: 0.000020 - momentum: 0.000000
2023-10-19 10:44:05,823 epoch 4 - iter 5210/5212 - loss 0.27822833 - time (sec): 82.91 - samples/sec: 4430.02 - lr: 0.000020 - momentum: 0.000000
2023-10-19 10:44:05,856 ----------------------------------------------------------------------------------------------------
2023-10-19 10:44:05,856 EPOCH 4 done: loss 0.2782 - lr: 0.000020
2023-10-19 10:44:11,009 DEV : loss 0.14805111289024353 - f1-score (micro avg) 0.2656
2023-10-19 10:44:11,033 ----------------------------------------------------------------------------------------------------
2023-10-19 10:44:19,265 epoch 5 - iter 521/5212 - loss 0.25458992 - time (sec): 8.23 - samples/sec: 4663.96 - lr: 0.000020 - momentum: 0.000000
2023-10-19 10:44:27,618 epoch 5 - iter 1042/5212 - loss 0.23812930 - time (sec): 16.58 - samples/sec: 4645.30 - lr: 0.000019 - momentum: 0.000000
2023-10-19 10:44:35,825 epoch 5 - iter 1563/5212 - loss 0.23909797 - time (sec): 24.79 - samples/sec: 4518.74 - lr: 0.000019 - momentum: 0.000000
2023-10-19 10:44:44,069 epoch 5 - iter 2084/5212 - loss 0.24733360 - time (sec): 33.04 - samples/sec: 4498.14 - lr: 0.000019 - momentum: 0.000000
2023-10-19 10:44:52,390 epoch 5 - iter 2605/5212 - loss 0.24607385 - time (sec): 41.36 - samples/sec: 4479.65 - lr: 0.000018 - momentum: 0.000000
2023-10-19 10:45:00,692 epoch 5 - iter 3126/5212 - loss 0.25227478 - time (sec): 49.66 - samples/sec: 4455.06 - lr: 0.000018 - momentum: 0.000000
2023-10-19 10:45:08,913 epoch 5 - iter 3647/5212 - loss 0.25331088 - time (sec): 57.88 - samples/sec: 4439.80 - lr: 0.000018 - momentum: 0.000000
2023-10-19 10:45:17,372 epoch 5 - iter 4168/5212 - loss 0.25226388 - time (sec): 66.34 - samples/sec: 4444.21 - lr: 0.000017 - momentum: 0.000000
2023-10-19 10:45:25,748 epoch 5 - iter 4689/5212 - loss 0.25437720 - time (sec): 74.71 - samples/sec: 4442.17 - lr: 0.000017 - momentum: 0.000000
2023-10-19 10:45:34,004 epoch 5 - iter 5210/5212 - loss 0.25346983 - time (sec): 82.97 - samples/sec: 4428.05 - lr: 0.000017 - momentum: 0.000000
2023-10-19 10:45:34,030 ----------------------------------------------------------------------------------------------------
2023-10-19 10:45:34,030 EPOCH 5 done: loss 0.2535 - lr: 0.000017
2023-10-19 10:45:39,162 DEV : loss 0.1490069180727005 - f1-score (micro avg) 0.2855
2023-10-19 10:45:39,197 ----------------------------------------------------------------------------------------------------
2023-10-19 10:45:47,655 epoch 6 - iter 521/5212 - loss 0.26624853 - time (sec): 8.46 - samples/sec: 3991.54 - lr: 0.000016 - momentum: 0.000000
2023-10-19 10:45:56,023 epoch 6 - iter 1042/5212 - loss 0.25453009 - time (sec): 16.82 - samples/sec: 4277.48 - lr: 0.000016 - momentum: 0.000000
2023-10-19 10:46:04,383 epoch 6 - iter 1563/5212 - loss 0.24457171 - time (sec): 25.18 - samples/sec: 4370.58 - lr: 0.000016 - momentum: 0.000000
2023-10-19 10:46:12,858 epoch 6 - iter 2084/5212 - loss 0.23560982 - time (sec): 33.66 - samples/sec: 4419.97 - lr: 0.000015 - momentum: 0.000000
2023-10-19 10:46:21,179 epoch 6 - iter 2605/5212 - loss 0.23329802 - time (sec): 41.98 - samples/sec: 4434.39 - lr: 0.000015 - momentum: 0.000000
2023-10-19 10:46:29,142 epoch 6 - iter 3126/5212 - loss 0.23211029 - time (sec): 49.94 - samples/sec: 4480.71 - lr: 0.000015 - momentum: 0.000000
2023-10-19 10:46:37,489 epoch 6 - iter 3647/5212 - loss 0.23718572 - time (sec): 58.29 - samples/sec: 4445.74 - lr: 0.000014 - momentum: 0.000000
2023-10-19 10:46:45,775 epoch 6 - iter 4168/5212 - loss 0.23673427 - time (sec): 66.58 - samples/sec: 4422.59 - lr: 0.000014 - momentum: 0.000000
2023-10-19 10:46:53,971 epoch 6 - iter 4689/5212 - loss 0.23213792 - time (sec): 74.77 - samples/sec: 4423.57 - lr: 0.000014 - momentum: 0.000000
2023-10-19 10:47:02,844 epoch 6 - iter 5210/5212 - loss 0.23638178 - time (sec): 83.65 - samples/sec: 4391.69 - lr: 0.000013 - momentum: 0.000000
2023-10-19 10:47:02,878 ----------------------------------------------------------------------------------------------------
2023-10-19 10:47:02,879 EPOCH 6 done: loss 0.2364 - lr: 0.000013
2023-10-19 10:47:07,421 DEV : loss 0.1647791564464569 - f1-score (micro avg) 0.2693
2023-10-19 10:47:07,444 ----------------------------------------------------------------------------------------------------
2023-10-19 10:47:15,620 epoch 7 - iter 521/5212 - loss 0.23817423 - time (sec): 8.18 - samples/sec: 4509.93 - lr: 0.000013 - momentum: 0.000000
2023-10-19 10:47:23,915 epoch 7 - iter 1042/5212 - loss 0.22435065 - time (sec): 16.47 - samples/sec: 4529.78 - lr: 0.000013 - momentum: 0.000000
2023-10-19 10:47:32,101 epoch 7 - iter 1563/5212 - loss 0.22347997 - time (sec): 24.66 - samples/sec: 4503.56 - lr: 0.000012 - momentum: 0.000000
2023-10-19 10:47:40,302 epoch 7 - iter 2084/5212 - loss 0.22299657 - time (sec): 32.86 - samples/sec: 4508.96 - lr: 0.000012 - momentum: 0.000000
2023-10-19 10:47:49,223 epoch 7 - iter 2605/5212 - loss 0.21747021 - time (sec): 41.78 - samples/sec: 4483.66 - lr: 0.000012 - momentum: 0.000000
2023-10-19 10:47:57,646 epoch 7 - iter 3126/5212 - loss 0.21824678 - time (sec): 50.20 - samples/sec: 4446.54 - lr: 0.000011 - momentum: 0.000000
2023-10-19 10:48:05,716 epoch 7 - iter 3647/5212 - loss 0.22182997 - time (sec): 58.27 - samples/sec: 4432.18 - lr: 0.000011 - momentum: 0.000000
2023-10-19 10:48:14,327 epoch 7 - iter 4168/5212 - loss 0.22075907 - time (sec): 66.88 - samples/sec: 4401.35 - lr: 0.000011 - momentum: 0.000000
2023-10-19 10:48:22,791 epoch 7 - iter 4689/5212 - loss 0.22067660 - time (sec): 75.35 - samples/sec: 4403.88 - lr: 0.000010 - momentum: 0.000000
2023-10-19 10:48:31,053 epoch 7 - iter 5210/5212 - loss 0.22219262 - time (sec): 83.61 - samples/sec: 4391.66 - lr: 0.000010 - momentum: 0.000000
2023-10-19 10:48:31,096 ----------------------------------------------------------------------------------------------------
2023-10-19 10:48:31,096 EPOCH 7 done: loss 0.2220 - lr: 0.000010
2023-10-19 10:48:35,618 DEV : loss 0.16794627904891968 - f1-score (micro avg) 0.2714
2023-10-19 10:48:35,641 ----------------------------------------------------------------------------------------------------
2023-10-19 10:48:44,018 epoch 8 - iter 521/5212 - loss 0.24024885 - time (sec): 8.38 - samples/sec: 4188.92 - lr: 0.000010 - momentum: 0.000000
2023-10-19 10:48:52,244 epoch 8 - iter 1042/5212 - loss 0.23380355 - time (sec): 16.60 - samples/sec: 4239.48 - lr: 0.000009 - momentum: 0.000000
2023-10-19 10:49:00,438 epoch 8 - iter 1563/5212 - loss 0.22497629 - time (sec): 24.80 - samples/sec: 4319.11 - lr: 0.000009 - momentum: 0.000000
2023-10-19 10:49:08,803 epoch 8 - iter 2084/5212 - loss 0.23065970 - time (sec): 33.16 - samples/sec: 4388.00 - lr: 0.000009 - momentum: 0.000000
2023-10-19 10:49:17,008 epoch 8 - iter 2605/5212 - loss 0.22269701 - time (sec): 41.37 - samples/sec: 4419.51 - lr: 0.000008 - momentum: 0.000000
2023-10-19 10:49:25,407 epoch 8 - iter 3126/5212 - loss 0.21867285 - time (sec): 49.77 - samples/sec: 4421.06 - lr: 0.000008 - momentum: 0.000000
2023-10-19 10:49:33,743 epoch 8 - iter 3647/5212 - loss 0.21580111 - time (sec): 58.10 - samples/sec: 4448.90 - lr: 0.000008 - momentum: 0.000000
2023-10-19 10:49:42,116 epoch 8 - iter 4168/5212 - loss 0.21326174 - time (sec): 66.47 - samples/sec: 4465.53 - lr: 0.000007 - momentum: 0.000000
2023-10-19 10:49:50,650 epoch 8 - iter 4689/5212 - loss 0.21523254 - time (sec): 75.01 - samples/sec: 4435.87 - lr: 0.000007 - momentum: 0.000000
2023-10-19 10:49:58,898 epoch 8 - iter 5210/5212 - loss 0.21610124 - time (sec): 83.26 - samples/sec: 4412.87 - lr: 0.000007 - momentum: 0.000000
2023-10-19 10:49:58,930 ----------------------------------------------------------------------------------------------------
2023-10-19 10:49:58,930 EPOCH 8 done: loss 0.2161 - lr: 0.000007
2023-10-19 10:50:04,152 DEV : loss 0.17187514901161194 - f1-score (micro avg) 0.266
2023-10-19 10:50:04,179 ----------------------------------------------------------------------------------------------------
2023-10-19 10:50:12,398 epoch 9 - iter 521/5212 - loss 0.21708792 - time (sec): 8.22 - samples/sec: 4102.45 - lr: 0.000006 - momentum: 0.000000
2023-10-19 10:50:20,747 epoch 9 - iter 1042/5212 - loss 0.19390674 - time (sec): 16.57 - samples/sec: 4312.08 - lr: 0.000006 - momentum: 0.000000
2023-10-19 10:50:29,009 epoch 9 - iter 1563/5212 - loss 0.20054262 - time (sec): 24.83 - samples/sec: 4341.33 - lr: 0.000006 - momentum: 0.000000
2023-10-19 10:50:37,304 epoch 9 - iter 2084/5212 - loss 0.21128490 - time (sec): 33.12 - samples/sec: 4338.21 - lr: 0.000005 - momentum: 0.000000
2023-10-19 10:50:45,779 epoch 9 - iter 2605/5212 - loss 0.21144254 - time (sec): 41.60 - samples/sec: 4433.69 - lr: 0.000005 - momentum: 0.000000
2023-10-19 10:50:54,001 epoch 9 - iter 3126/5212 - loss 0.20955816 - time (sec): 49.82 - samples/sec: 4423.37 - lr: 0.000005 - momentum: 0.000000
2023-10-19 10:51:02,390 epoch 9 - iter 3647/5212 - loss 0.21385600 - time (sec): 58.21 - samples/sec: 4438.87 - lr: 0.000004 - momentum: 0.000000
2023-10-19 10:51:10,644 epoch 9 - iter 4168/5212 - loss 0.21026236 - time (sec): 66.46 - samples/sec: 4440.93 - lr: 0.000004 - momentum: 0.000000
2023-10-19 10:51:19,014 epoch 9 - iter 4689/5212 - loss 0.20931466 - time (sec): 74.83 - samples/sec: 4411.30 - lr: 0.000004 - momentum: 0.000000
2023-10-19 10:51:27,385 epoch 9 - iter 5210/5212 - loss 0.20947495 - time (sec): 83.20 - samples/sec: 4414.82 - lr: 0.000003 - momentum: 0.000000
2023-10-19 10:51:27,418 ----------------------------------------------------------------------------------------------------
2023-10-19 10:51:27,418 EPOCH 9 done: loss 0.2094 - lr: 0.000003
2023-10-19 10:51:32,597 DEV : loss 0.1816394329071045 - f1-score (micro avg) 0.272
2023-10-19 10:51:32,621 ----------------------------------------------------------------------------------------------------
2023-10-19 10:51:41,163 epoch 10 - iter 521/5212 - loss 0.20993504 - time (sec): 8.54 - samples/sec: 4199.80 - lr: 0.000003 - momentum: 0.000000
2023-10-19 10:51:49,336 epoch 10 - iter 1042/5212 - loss 0.19721566 - time (sec): 16.71 - samples/sec: 4408.14 - lr: 0.000003 - momentum: 0.000000
2023-10-19 10:51:57,596 epoch 10 - iter 1563/5212 - loss 0.19886392 - time (sec): 24.97 - samples/sec: 4399.09 - lr: 0.000002 - momentum: 0.000000
2023-10-19 10:52:06,090 epoch 10 - iter 2084/5212 - loss 0.20201347 - time (sec): 33.47 - samples/sec: 4376.06 - lr: 0.000002 - momentum: 0.000000
2023-10-19 10:52:14,589 epoch 10 - iter 2605/5212 - loss 0.20351706 - time (sec): 41.97 - samples/sec: 4365.97 - lr: 0.000002 - momentum: 0.000000
2023-10-19 10:52:23,017 epoch 10 - iter 3126/5212 - loss 0.20748962 - time (sec): 50.40 - samples/sec: 4427.54 - lr: 0.000001 - momentum: 0.000000
2023-10-19 10:52:31,397 epoch 10 - iter 3647/5212 - loss 0.20967361 - time (sec): 58.78 - samples/sec: 4432.91 - lr: 0.000001 - momentum: 0.000000
2023-10-19 10:52:39,724 epoch 10 - iter 4168/5212 - loss 0.20825590 - time (sec): 67.10 - samples/sec: 4426.18 - lr: 0.000001 - momentum: 0.000000
2023-10-19 10:52:48,049 epoch 10 - iter 4689/5212 - loss 0.20604718 - time (sec): 75.43 - samples/sec: 4423.79 - lr: 0.000000 - momentum: 0.000000
2023-10-19 10:52:56,230 epoch 10 - iter 5210/5212 - loss 0.20716771 - time (sec): 83.61 - samples/sec: 4391.22 - lr: 0.000000 - momentum: 0.000000
2023-10-19 10:52:56,272 ----------------------------------------------------------------------------------------------------
2023-10-19 10:52:56,272 EPOCH 10 done: loss 0.2072 - lr: 0.000000
2023-10-19 10:53:01,417 DEV : loss 0.17811782658100128 - f1-score (micro avg) 0.2638
2023-10-19 10:53:01,469 ----------------------------------------------------------------------------------------------------
2023-10-19 10:53:01,470 Loading model from best epoch ...
2023-10-19 10:53:01,547 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 10:53:07,781
Results:
- F-score (micro) 0.299
- F-score (macro) 0.1455
- Accuracy 0.177

By class:
              precision    recall  f1-score   support

         LOC     0.4586    0.4605    0.4595      1214
         PER     0.1394    0.0718    0.0948       808
         ORG     0.0470    0.0198    0.0279       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3498    0.2611    0.2990      2390
   macro avg     0.1612    0.1380    0.1455      2390
weighted avg     0.2870    0.2611    0.2696      2390

2023-10-19 10:53:07,782 ----------------------------------------------------------------------------------------------------
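For completeness, a minimal usage sketch for the checkpoint evaluated above, assuming the standard Flair loading and prediction API; the example sentence is made up, and the path simply mirrors the base path from this log.

```python
# Minimal inference sketch for the saved checkpoint (hypothetical usage, not part of this log).
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

sentence = Sentence("Die Versammlung fand gestern in Wien statt .")  # made-up example sentence
tagger.predict(sentence)

# Print each predicted span with its label and confidence
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, f"{span.score:.4f}")
```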