2023-10-13 08:28:05,695 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:05,696 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 08:28:05,696 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:05,696 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:28:05,696 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:05,696 Train: 1100 sentences
2023-10-13 08:28:05,696 (train_with_dev=False, train_with_test=False)
2023-10-13 08:28:05,696 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:05,696 Training Params:
2023-10-13 08:28:05,697 - learning_rate: "5e-05"
2023-10-13 08:28:05,697 - mini_batch_size: "8"
2023-10-13 08:28:05,697 - max_epochs: "10"
2023-10-13 08:28:05,697 - shuffle: "True"
2023-10-13 08:28:05,697 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:05,697 Plugins:
2023-10-13 08:28:05,697 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:28:05,697 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:05,697 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:28:05,697 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:28:05,697 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:05,697 Computation:
2023-10-13 08:28:05,697 - compute on device: cuda:0
2023-10-13 08:28:05,697 - embedding storage: none
2023-10-13 08:28:05,697 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:05,697 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 08:28:05,697 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:05,697 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:06,456 epoch 1 - iter 13/138 - loss 3.66066974 - time (sec): 0.76 - samples/sec: 3113.44 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:28:07,223 epoch 1 - iter 26/138 - loss 3.34555311 - time (sec): 1.53 - samples/sec: 2865.97 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:28:07,934 epoch 1 - iter 39/138 - loss 2.78497240 - time (sec): 2.24 - samples/sec: 2906.17 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:28:08,739 epoch 1 - iter 52/138 - loss 2.24907326 - time (sec): 3.04 - samples/sec: 2971.95 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:28:09,429 epoch 1 - iter 65/138 - loss 2.01501516 - time (sec): 3.73 - samples/sec: 2931.15 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:28:10,145 epoch 1 - iter 78/138 - loss 1.82175107 - time (sec): 4.45 - samples/sec: 2926.58 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:28:10,861 epoch 1 - iter 91/138 - loss 1.67513840 - time (sec): 5.16 - samples/sec: 2905.88 - lr: 0.000033 - momentum: 0.000000
2023-10-13 08:28:11,595 epoch 1 - iter 104/138 - loss 1.51839005 - time (sec): 5.90 - samples/sec: 2965.95 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:28:12,324 epoch 1 - iter 117/138 - loss 1.40443512 - time (sec): 6.63 - samples/sec: 2956.37 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:28:13,027 epoch 1 - iter 130/138 - loss 1.31422310 - time (sec): 7.33 - samples/sec: 2942.17 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:28:13,455 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:13,455 EPOCH 1 done: loss 1.2682 - lr: 0.000047
2023-10-13 08:28:14,151 DEV : loss 0.2636374831199646 - f1-score (micro avg) 0.6659
2023-10-13 08:28:14,155 saving best model
2023-10-13 08:28:14,512 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:15,242 epoch 2 - iter 13/138 - loss 0.26846597 - time (sec): 0.73 - samples/sec: 3308.51 - lr: 0.000050 - momentum: 0.000000
2023-10-13 08:28:15,904 epoch 2 - iter 26/138 - loss 0.27120212 - time (sec): 1.39 - samples/sec: 3039.89 - lr: 0.000049 - momentum: 0.000000
2023-10-13 08:28:16,677 epoch 2 - iter 39/138 - loss 0.26006371 - time (sec): 2.16 - samples/sec: 2985.06 - lr: 0.000048 - momentum: 0.000000
2023-10-13 08:28:17,416 epoch 2 - iter 52/138 - loss 0.24659558 - time (sec): 2.90 - samples/sec: 2925.83 - lr: 0.000048 - momentum: 0.000000
2023-10-13 08:28:18,184 epoch 2 - iter 65/138 - loss 0.22354409 - time (sec): 3.67 - samples/sec: 2898.61 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:28:18,890 epoch 2 - iter 78/138 - loss 0.22828453 - time (sec): 4.38 - samples/sec: 2921.98 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:28:19,606 epoch 2 - iter 91/138 - loss 0.21803767 - time (sec): 5.09 - samples/sec: 2915.35 - lr: 0.000046 - momentum: 0.000000
2023-10-13 08:28:20,349 epoch 2 - iter 104/138 - loss 0.21482014 - time (sec): 5.84 - samples/sec: 2929.71 - lr: 0.000046 - momentum: 0.000000
2023-10-13 08:28:21,120 epoch 2 - iter 117/138 - loss 0.21421039 - time (sec): 6.61 - samples/sec: 2915.97 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:28:21,868 epoch 2 - iter 130/138 - loss 0.20951797 - time (sec): 7.36 - samples/sec: 2934.82 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:28:22,277 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:22,277 EPOCH 2 done: loss 0.2061 - lr: 0.000045
2023-10-13 08:28:22,993 DEV : loss 0.14220675826072693 - f1-score (micro avg) 0.7941
2023-10-13 08:28:23,000 saving best model
2023-10-13 08:28:23,454 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:24,118 epoch 3 - iter 13/138 - loss 0.14146311 - time (sec): 0.66 - samples/sec: 2976.67 - lr: 0.000044 - momentum: 0.000000
2023-10-13 08:28:24,821 epoch 3 - iter 26/138 - loss 0.12153795 - time (sec): 1.36 - samples/sec: 2994.78 - lr: 0.000043 - momentum: 0.000000
2023-10-13 08:28:25,591 epoch 3 - iter 39/138 - loss 0.10515299 - time (sec): 2.13 - samples/sec: 3022.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 08:28:26,289 epoch 3 - iter 52/138 - loss 0.10537670 - time (sec): 2.83 - samples/sec: 3050.55 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:28:27,079 epoch 3 - iter 65/138 - loss 0.10512733 - time (sec): 3.62 - samples/sec: 2999.95 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:28:27,818 epoch 3 - iter 78/138 - loss 0.10309178 - time (sec): 4.36 - samples/sec: 2945.67 - lr: 0.000041 - momentum: 0.000000
2023-10-13 08:28:28,586 epoch 3 - iter 91/138 - loss 0.09858725 - time (sec): 5.13 - samples/sec: 2946.08 - lr: 0.000041 - momentum: 0.000000
2023-10-13 08:28:29,299 epoch 3 - iter 104/138 - loss 0.09748195 - time (sec): 5.84 - samples/sec: 2921.99 - lr: 0.000040 - momentum: 0.000000
2023-10-13 08:28:30,040 epoch 3 - iter 117/138 - loss 0.10218262 - time (sec): 6.58 - samples/sec: 2909.74 - lr: 0.000040 - momentum: 0.000000
2023-10-13 08:28:30,791 epoch 3 - iter 130/138 - loss 0.10322278 - time (sec): 7.33 - samples/sec: 2914.55 - lr: 0.000039 - momentum: 0.000000
2023-10-13 08:28:31,252 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:31,252 EPOCH 3 done: loss 0.0998 - lr: 0.000039
2023-10-13 08:28:31,944 DEV : loss 0.15421868860721588 - f1-score (micro avg) 0.8192
2023-10-13 08:28:31,949 saving best model
2023-10-13 08:28:32,405 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:33,118 epoch 4 - iter 13/138 - loss 0.04874159 - time (sec): 0.71 - samples/sec: 2959.18 - lr: 0.000038 - momentum: 0.000000
2023-10-13 08:28:33,841 epoch 4 - iter 26/138 - loss 0.06713360 - time (sec): 1.43 - samples/sec: 3098.14 - lr: 0.000038 - momentum: 0.000000
2023-10-13 08:28:34,613 epoch 4 - iter 39/138 - loss 0.06554658 - time (sec): 2.21 - samples/sec: 2920.46 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:28:35,349 epoch 4 - iter 52/138 - loss 0.06616042 - time (sec): 2.94 - samples/sec: 2928.02 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:28:36,106 epoch 4 - iter 65/138 - loss 0.07177299 - time (sec): 3.70 - samples/sec: 2914.25 - lr: 0.000036 - momentum: 0.000000
2023-10-13 08:28:36,784 epoch 4 - iter 78/138 - loss 0.06695069 - time (sec): 4.38 - samples/sec: 2926.56 - lr: 0.000036 - momentum: 0.000000
2023-10-13 08:28:37,483 epoch 4 - iter 91/138 - loss 0.07222990 - time (sec): 5.08 - samples/sec: 2935.33 - lr: 0.000035 - momentum: 0.000000
2023-10-13 08:28:38,230 epoch 4 - iter 104/138 - loss 0.07333500 - time (sec): 5.82 - samples/sec: 2903.09 - lr: 0.000035 - momentum: 0.000000
2023-10-13 08:28:39,001 epoch 4 - iter 117/138 - loss 0.06943752 - time (sec): 6.59 - samples/sec: 2918.20 - lr: 0.000034 - momentum: 0.000000
2023-10-13 08:28:39,795 epoch 4 - iter 130/138 - loss 0.06896700 - time (sec): 7.39 - samples/sec: 2921.26 - lr: 0.000034 - momentum: 0.000000
2023-10-13 08:28:40,241 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:40,241 EPOCH 4 done: loss 0.0703 - lr: 0.000034
2023-10-13 08:28:40,941 DEV : loss 0.11516644060611725 - f1-score (micro avg) 0.8669
2023-10-13 08:28:40,946 saving best model
2023-10-13 08:28:41,436 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:42,167 epoch 5 - iter 13/138 - loss 0.03479877 - time (sec): 0.73 - samples/sec: 3007.48 - lr: 0.000033 - momentum: 0.000000
2023-10-13 08:28:42,891 epoch 5 - iter 26/138 - loss 0.04109845 - time (sec): 1.45 - samples/sec: 3029.21 - lr: 0.000032 - momentum: 0.000000
2023-10-13 08:28:43,633 epoch 5 - iter 39/138 - loss 0.05555495 - time (sec): 2.19 - samples/sec: 2949.93 - lr: 0.000032 - momentum: 0.000000
2023-10-13 08:28:44,388 epoch 5 - iter 52/138 - loss 0.05444566 - time (sec): 2.95 - samples/sec: 2882.77 - lr: 0.000031 - momentum: 0.000000
2023-10-13 08:28:45,120 epoch 5 - iter 65/138 - loss 0.05631263 - time (sec): 3.68 - samples/sec: 2910.39 - lr: 0.000031 - momentum: 0.000000
2023-10-13 08:28:45,824 epoch 5 - iter 78/138 - loss 0.05349107 - time (sec): 4.39 - samples/sec: 2935.54 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:28:46,596 epoch 5 - iter 91/138 - loss 0.05367029 - time (sec): 5.16 - samples/sec: 2911.64 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:28:47,323 epoch 5 - iter 104/138 - loss 0.05086712 - time (sec): 5.89 - samples/sec: 2893.45 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:28:48,065 epoch 5 - iter 117/138 - loss 0.04647247 - time (sec): 6.63 - samples/sec: 2890.67 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:28:48,815 epoch 5 - iter 130/138 - loss 0.04966325 - time (sec): 7.38 - samples/sec: 2895.02 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:28:49,265 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:49,265 EPOCH 5 done: loss 0.0487 - lr: 0.000028
2023-10-13 08:28:50,004 DEV : loss 0.13216754794120789 - f1-score (micro avg) 0.879
2023-10-13 08:28:50,010 saving best model
2023-10-13 08:28:50,466 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:51,217 epoch 6 - iter 13/138 - loss 0.04398162 - time (sec): 0.75 - samples/sec: 2873.78 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:28:51,968 epoch 6 - iter 26/138 - loss 0.04122476 - time (sec): 1.50 - samples/sec: 2943.01 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:28:52,738 epoch 6 - iter 39/138 - loss 0.03461267 - time (sec): 2.27 - samples/sec: 2887.58 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:28:53,455 epoch 6 - iter 52/138 - loss 0.03086480 - time (sec): 2.99 - samples/sec: 2864.22 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:28:54,189 epoch 6 - iter 65/138 - loss 0.04069526 - time (sec): 3.72 - samples/sec: 2862.68 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:28:54,898 epoch 6 - iter 78/138 - loss 0.03871127 - time (sec): 4.43 - samples/sec: 2884.43 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:28:55,628 epoch 6 - iter 91/138 - loss 0.03856114 - time (sec): 5.16 - samples/sec: 2900.31 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:28:56,299 epoch 6 - iter 104/138 - loss 0.03853103 - time (sec): 5.83 - samples/sec: 2916.76 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:28:57,045 epoch 6 - iter 117/138 - loss 0.03855877 - time (sec): 6.58 - samples/sec: 2922.06 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:28:57,816 epoch 6 - iter 130/138 - loss 0.03662516 - time (sec): 7.35 - samples/sec: 2923.47 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:28:58,257 ----------------------------------------------------------------------------------------------------
2023-10-13 08:28:58,257 EPOCH 6 done: loss 0.0364 - lr: 0.000023
2023-10-13 08:28:58,953 DEV : loss 0.1272154450416565 - f1-score (micro avg) 0.887
2023-10-13 08:28:58,959 saving best model
2023-10-13 08:28:59,437 ----------------------------------------------------------------------------------------------------
2023-10-13 08:29:00,164 epoch 7 - iter 13/138 - loss 0.00877141 - time (sec): 0.73 - samples/sec: 2847.01 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:29:00,941 epoch 7 - iter 26/138 - loss 0.01397465 - time (sec): 1.50 - samples/sec: 2864.84 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:29:01,669 epoch 7 - iter 39/138 - loss 0.01140872 - time (sec): 2.23 - samples/sec: 2798.42 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:29:02,417 epoch 7 - iter 52/138 - loss 0.01542022 - time (sec): 2.98 - samples/sec: 2894.58 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:29:03,153 epoch 7 - iter 65/138 - loss 0.01698683 - time (sec): 3.72 - samples/sec: 2892.26 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:29:03,863 epoch 7 - iter 78/138 - loss 0.02508462 - time (sec): 4.42 - samples/sec: 2918.67 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:29:04,589 epoch 7 - iter 91/138 - loss 0.02618363 - time (sec): 5.15 - samples/sec: 2927.40 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:29:05,321 epoch 7 - iter 104/138 - loss 0.03015213 - time (sec): 5.88 - samples/sec: 2947.69 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:29:06,042 epoch 7 - iter 117/138 - loss 0.02745988 - time (sec): 6.60 - samples/sec: 2937.67 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:29:06,771 epoch 7 - iter 130/138 - loss 0.02731669 - time (sec): 7.33 - samples/sec: 2922.38 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:29:07,231 ----------------------------------------------------------------------------------------------------
2023-10-13 08:29:07,231 EPOCH 7 done: loss 0.0260 - lr: 0.000017
2023-10-13 08:29:07,936 DEV : loss 0.13966915011405945 - f1-score (micro avg) 0.8801
2023-10-13 08:29:07,941 ----------------------------------------------------------------------------------------------------
2023-10-13 08:29:08,658 epoch 8 - iter 13/138 - loss 0.01104765 - time (sec): 0.72 - samples/sec: 2978.80 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:29:09,410 epoch 8 - iter 26/138 - loss 0.00752305 - time (sec): 1.47 - samples/sec: 2883.98 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:29:10,155 epoch 8 - iter 39/138 - loss 0.01898034 - time (sec): 2.21 - samples/sec: 2934.28 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:29:10,835 epoch 8 - iter 52/138 - loss 0.01860073 - time (sec): 2.89 - samples/sec: 2894.55 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:29:11,586 epoch 8 - iter 65/138 - loss 0.02049879 - time (sec): 3.64 - samples/sec: 2925.06 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:29:12,319 epoch 8 - iter 78/138 - loss 0.02039417 - time (sec): 4.38 - samples/sec: 2930.02 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:29:13,053 epoch 8 - iter 91/138 - loss 0.02209402 - time (sec): 5.11 - samples/sec: 2961.45 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:29:13,790 epoch 8 - iter 104/138 - loss 0.02010574 - time (sec): 5.85 - samples/sec: 2961.88 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:29:14,510 epoch 8 - iter 117/138 - loss 0.01915344 - time (sec): 6.57 - samples/sec: 2948.41 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:29:15,365 epoch 8 - iter 130/138 - loss 0.01886829 - time (sec): 7.42 - samples/sec: 2926.98 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:29:15,788 ----------------------------------------------------------------------------------------------------
2023-10-13 08:29:15,788 EPOCH 8 done: loss 0.0193 - lr: 0.000012
2023-10-13 08:29:16,490 DEV : loss 0.15436452627182007 - f1-score (micro avg) 0.887
2023-10-13 08:29:16,498 ----------------------------------------------------------------------------------------------------
2023-10-13 08:29:17,209 epoch 9 - iter 13/138 - loss 0.00300260 - time (sec): 0.71 - samples/sec: 3095.22 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:29:17,941 epoch 9 - iter 26/138 - loss 0.00663967 - time (sec): 1.44 - samples/sec: 2986.00 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:29:18,686 epoch 9 - iter 39/138 - loss 0.00627940 - time (sec): 2.19 - samples/sec: 2867.04 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:29:19,480 epoch 9 - iter 52/138 - loss 0.00791959 - time (sec): 2.98 - samples/sec: 2922.84 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:29:20,205 epoch 9 - iter 65/138 - loss 0.01291701 - time (sec): 3.71 - samples/sec: 2931.70 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:29:20,980 epoch 9 - iter 78/138 - loss 0.01251097 - time (sec): 4.48 - samples/sec: 2923.83 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:29:21,723 epoch 9 - iter 91/138 - loss 0.01631549 - time (sec): 5.22 - samples/sec: 2895.60 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:29:22,455 epoch 9 - iter 104/138 - loss 0.01584708 - time (sec): 5.96 - samples/sec: 2915.99 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:29:23,156 epoch 9 - iter 117/138 - loss 0.01623051 - time (sec): 6.66 - samples/sec: 2927.31 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:29:23,826 epoch 9 - iter 130/138 - loss 0.01611756 - time (sec): 7.33 - samples/sec: 2919.85 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:29:24,288 ----------------------------------------------------------------------------------------------------
2023-10-13 08:29:24,289 EPOCH 9 done: loss 0.0160 - lr: 0.000006
2023-10-13 08:29:25,032 DEV : loss 0.15071909129619598 - f1-score (micro avg) 0.8952
2023-10-13 08:29:25,038 saving best model
2023-10-13 08:29:25,513 ----------------------------------------------------------------------------------------------------
2023-10-13 08:29:26,284 epoch 10 - iter 13/138 - loss 0.01094785 - time (sec): 0.77 - samples/sec: 3073.99 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:29:27,003 epoch 10 - iter 26/138 - loss 0.00729644 - time (sec): 1.49 - samples/sec: 2938.90 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:29:27,769 epoch 10 - iter 39/138 - loss 0.00604709 - time (sec): 2.25 - samples/sec: 2859.24 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:29:28,495 epoch 10 - iter 52/138 - loss 0.00925022 - time (sec): 2.98 - samples/sec: 2876.88 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:29:29,231 epoch 10 - iter 65/138 - loss 0.00835002 - time (sec): 3.72 - samples/sec: 2837.02 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:29:30,024 epoch 10 - iter 78/138 - loss 0.00757364 - time (sec): 4.51 - samples/sec: 2832.91 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:29:30,793 epoch 10 - iter 91/138 - loss 0.01036881 - time (sec): 5.28 - samples/sec: 2857.81 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:29:31,529 epoch 10 - iter 104/138 - loss 0.00965364 - time (sec): 6.01 - samples/sec: 2888.98 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:29:32,219 epoch 10 - iter 117/138 - loss 0.01076463 - time (sec): 6.70 - samples/sec: 2882.85 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:29:32,956 epoch 10 - iter 130/138 - loss 0.01145230 - time (sec): 7.44 - samples/sec: 2876.07 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:29:33,405 ----------------------------------------------------------------------------------------------------
2023-10-13 08:29:33,405 EPOCH 10 done: loss 0.0115 - lr: 0.000000
2023-10-13 08:29:34,090 DEV : loss 0.15256953239440918 - f1-score (micro avg) 0.8929
2023-10-13 08:29:34,451 ----------------------------------------------------------------------------------------------------
2023-10-13 08:29:34,453 Loading model from best epoch ...
2023-10-13 08:29:36,011 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:29:36,732
Results:
- F-score (micro) 0.9174
- F-score (macro) 0.8813
- Accuracy 0.8621
By class:
              precision    recall  f1-score   support

       scope     0.8944    0.9148    0.9045       176
        pers     0.9760    0.9531    0.9644       128
        work     0.8767    0.8649    0.8707        74
      object     1.0000    1.0000    1.0000         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.9186    0.9162    0.9174       382
   macro avg     0.9494    0.8466    0.8813       382
weighted avg     0.9194    0.9162    0.9173       382
2023-10-13 08:29:36,733 ----------------------------------------------------------------------------------------------------
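The summary scores in the results block can be cross-checked against the per-class table. The sketch below is not part of the training run. It assumes the usual definitions: F1 is the harmonic mean of precision and recall, and macro F1 is the unweighted mean of the per-class F1 scores. The helper name `f1` and the dictionary are illustrative, and the inputs are the four-decimal values printed in the log, so agreement is only expected up to rounding.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class F1 scores copied from the "By class" table in the log.
per_class_f1 = {
    "scope": 0.9045,
    "pers": 0.9644,
    "work": 0.8707,
    "object": 1.0000,
    "loc": 0.6667,
}

# Micro F1 recomputed from the logged micro-averaged precision and recall.
micro_f1 = f1(0.9186, 0.9162)

# Macro F1 as the unweighted mean of the per-class F1 scores
# (not the F1 of the macro-averaged precision and recall).
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"micro F1 (recomputed): {micro_f1:.4f}")
print(f"macro F1 (recomputed): {macro_f1:.4f}")
```

Both recomputed values should match the logged 0.9174 and 0.8813 to four decimals. The two classes with a support of 2 (object, loc) carry the same weight in the macro average as scope and pers, which is why the macro score sits well below the micro score.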