2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 Train:  1100 sentences
2023-10-18 14:42:40,217         (train_with_dev=False, train_with_test=False)
2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 Training Params:
2023-10-18 14:42:40,217  - learning_rate: "3e-05" 
2023-10-18 14:42:40,217  - mini_batch_size: "4"
2023-10-18 14:42:40,217  - max_epochs: "10"
2023-10-18 14:42:40,217  - shuffle: "True"
2023-10-18 14:42:40,217 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,217 Plugins:
2023-10-18 14:42:40,217  - TensorboardLogger
2023-10-18 14:42:40,217  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:42:40,218  - metric: "('micro avg', 'f1-score')"
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 Computation:
2023-10-18 14:42:40,218  - compute on device: cuda:0
2023-10-18 14:42:40,218  - embedding storage: none
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:40,218 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:42:40,625 epoch 1 - iter 27/275 - loss 4.04291065 - time (sec): 0.41 - samples/sec: 4876.59 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:42:41,036 epoch 1 - iter 54/275 - loss 3.99046555 - time (sec): 0.82 - samples/sec: 5295.40 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:42:41,450 epoch 1 - iter 81/275 - loss 3.88300995 - time (sec): 1.23 - samples/sec: 5217.03 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:42:41,839 epoch 1 - iter 108/275 - loss 3.72596285 - time (sec): 1.62 - samples/sec: 5451.17 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:42:42,204 epoch 1 - iter 135/275 - loss 3.59012547 - time (sec): 1.99 - samples/sec: 5587.39 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:42:42,567 epoch 1 - iter 162/275 - loss 3.42139254 - time (sec): 2.35 - samples/sec: 5723.76 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:42:42,940 epoch 1 - iter 189/275 - loss 3.22434273 - time (sec): 2.72 - samples/sec: 5769.38 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:43,317 epoch 1 - iter 216/275 - loss 3.00442121 - time (sec): 3.10 - samples/sec: 5893.76 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:43,695 epoch 1 - iter 243/275 - loss 2.83483712 - time (sec): 3.48 - samples/sec: 5821.97 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:44,062 epoch 1 - iter 270/275 - loss 2.67508372 - time (sec): 3.84 - samples/sec: 5835.49 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:44,125 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:44,125 EPOCH 1 done: loss 2.6478 - lr: 0.000029
2023-10-18 14:42:44,372 DEV : loss 0.9118794202804565 - f1-score (micro avg)  0.0
2023-10-18 14:42:44,376 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:44,748 epoch 2 - iter 27/275 - loss 0.87662447 - time (sec): 0.37 - samples/sec: 6845.47 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:42:45,110 epoch 2 - iter 54/275 - loss 0.91374560 - time (sec): 0.73 - samples/sec: 6393.62 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:45,483 epoch 2 - iter 81/275 - loss 0.95889295 - time (sec): 1.11 - samples/sec: 6284.21 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:45,845 epoch 2 - iter 108/275 - loss 0.98239975 - time (sec): 1.47 - samples/sec: 6046.67 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:46,212 epoch 2 - iter 135/275 - loss 0.96439524 - time (sec): 1.84 - samples/sec: 5984.52 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:42:46,584 epoch 2 - iter 162/275 - loss 0.95106604 - time (sec): 2.21 - samples/sec: 6042.69 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:42:46,944 epoch 2 - iter 189/275 - loss 0.94723193 - time (sec): 2.57 - samples/sec: 5948.24 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:42:47,317 epoch 2 - iter 216/275 - loss 0.94445831 - time (sec): 2.94 - samples/sec: 6021.22 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:42:47,685 epoch 2 - iter 243/275 - loss 0.93548787 - time (sec): 3.31 - samples/sec: 6070.65 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:42:48,054 epoch 2 - iter 270/275 - loss 0.91241355 - time (sec): 3.68 - samples/sec: 6096.45 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:42:48,125 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:48,125 EPOCH 2 done: loss 0.9173 - lr: 0.000027
2023-10-18 14:42:48,490 DEV : loss 0.7475361227989197 - f1-score (micro avg)  0.0
2023-10-18 14:42:48,496 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:48,902 epoch 3 - iter 27/275 - loss 0.84635988 - time (sec): 0.41 - samples/sec: 5704.90 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:49,321 epoch 3 - iter 54/275 - loss 0.86258624 - time (sec): 0.82 - samples/sec: 5841.02 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:49,724 epoch 3 - iter 81/275 - loss 0.82020005 - time (sec): 1.23 - samples/sec: 5773.89 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:50,139 epoch 3 - iter 108/275 - loss 0.77919393 - time (sec): 1.64 - samples/sec: 5707.72 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:42:50,534 epoch 3 - iter 135/275 - loss 0.76811019 - time (sec): 2.04 - samples/sec: 5700.03 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:42:50,947 epoch 3 - iter 162/275 - loss 0.74306539 - time (sec): 2.45 - samples/sec: 5612.75 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:42:51,356 epoch 3 - iter 189/275 - loss 0.73905325 - time (sec): 2.86 - samples/sec: 5540.48 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:42:51,754 epoch 3 - iter 216/275 - loss 0.73527931 - time (sec): 3.26 - samples/sec: 5553.83 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:42:52,169 epoch 3 - iter 243/275 - loss 0.72744124 - time (sec): 3.67 - samples/sec: 5554.37 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:42:52,563 epoch 3 - iter 270/275 - loss 0.73156120 - time (sec): 4.07 - samples/sec: 5514.56 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:52,639 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:52,639 EPOCH 3 done: loss 0.7276 - lr: 0.000023
2023-10-18 14:42:52,997 DEV : loss 0.5736738443374634 - f1-score (micro avg)  0.0998
2023-10-18 14:42:53,001 saving best model
2023-10-18 14:42:53,036 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:53,436 epoch 4 - iter 27/275 - loss 0.66304226 - time (sec): 0.40 - samples/sec: 5203.57 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:53,839 epoch 4 - iter 54/275 - loss 0.67760661 - time (sec): 0.80 - samples/sec: 5109.62 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:54,238 epoch 4 - iter 81/275 - loss 0.67789303 - time (sec): 1.20 - samples/sec: 5267.91 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:42:54,648 epoch 4 - iter 108/275 - loss 0.67021179 - time (sec): 1.61 - samples/sec: 5332.48 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:42:55,055 epoch 4 - iter 135/275 - loss 0.65333439 - time (sec): 2.02 - samples/sec: 5361.47 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:42:55,468 epoch 4 - iter 162/275 - loss 0.65800262 - time (sec): 2.43 - samples/sec: 5419.86 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:55,871 epoch 4 - iter 189/275 - loss 0.65164776 - time (sec): 2.83 - samples/sec: 5427.11 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:56,277 epoch 4 - iter 216/275 - loss 0.64308123 - time (sec): 3.24 - samples/sec: 5493.28 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:56,700 epoch 4 - iter 243/275 - loss 0.62933789 - time (sec): 3.66 - samples/sec: 5518.39 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:42:57,100 epoch 4 - iter 270/275 - loss 0.62240563 - time (sec): 4.06 - samples/sec: 5504.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:42:57,175 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:57,175 EPOCH 4 done: loss 0.6153 - lr: 0.000020
2023-10-18 14:42:57,655 DEV : loss 0.4884113371372223 - f1-score (micro avg)  0.2163
2023-10-18 14:42:57,659 saving best model
2023-10-18 14:42:57,700 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:58,108 epoch 5 - iter 27/275 - loss 0.57991470 - time (sec): 0.41 - samples/sec: 5953.53 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:42:58,511 epoch 5 - iter 54/275 - loss 0.53287747 - time (sec): 0.81 - samples/sec: 5885.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:42:58,934 epoch 5 - iter 81/275 - loss 0.52636296 - time (sec): 1.23 - samples/sec: 5630.94 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:42:59,347 epoch 5 - iter 108/275 - loss 0.54229407 - time (sec): 1.65 - samples/sec: 5617.38 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:42:59,771 epoch 5 - iter 135/275 - loss 0.53641215 - time (sec): 2.07 - samples/sec: 5556.48 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:43:00,171 epoch 5 - iter 162/275 - loss 0.54184928 - time (sec): 2.47 - samples/sec: 5497.97 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:43:00,583 epoch 5 - iter 189/275 - loss 0.54976097 - time (sec): 2.88 - samples/sec: 5537.77 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:43:00,985 epoch 5 - iter 216/275 - loss 0.54676931 - time (sec): 3.28 - samples/sec: 5481.02 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:43:01,389 epoch 5 - iter 243/275 - loss 0.53924318 - time (sec): 3.69 - samples/sec: 5420.89 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:43:01,804 epoch 5 - iter 270/275 - loss 0.53859676 - time (sec): 4.10 - samples/sec: 5448.37 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:43:01,878 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:01,878 EPOCH 5 done: loss 0.5378 - lr: 0.000017
2023-10-18 14:43:02,242 DEV : loss 0.4137643575668335 - f1-score (micro avg)  0.3669
2023-10-18 14:43:02,246 saving best model
2023-10-18 14:43:02,279 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:02,688 epoch 6 - iter 27/275 - loss 0.50724664 - time (sec): 0.41 - samples/sec: 5132.05 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:43:03,089 epoch 6 - iter 54/275 - loss 0.49264158 - time (sec): 0.81 - samples/sec: 5192.29 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:43:03,494 epoch 6 - iter 81/275 - loss 0.48482614 - time (sec): 1.21 - samples/sec: 5061.78 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:43:03,914 epoch 6 - iter 108/275 - loss 0.48640583 - time (sec): 1.63 - samples/sec: 5233.22 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:43:04,315 epoch 6 - iter 135/275 - loss 0.49124330 - time (sec): 2.04 - samples/sec: 5313.77 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:43:04,734 epoch 6 - iter 162/275 - loss 0.49211512 - time (sec): 2.45 - samples/sec: 5412.89 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:43:05,139 epoch 6 - iter 189/275 - loss 0.48077587 - time (sec): 2.86 - samples/sec: 5433.05 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:43:05,534 epoch 6 - iter 216/275 - loss 0.48715522 - time (sec): 3.25 - samples/sec: 5387.09 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:43:05,942 epoch 6 - iter 243/275 - loss 0.50096569 - time (sec): 3.66 - samples/sec: 5386.51 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:43:06,355 epoch 6 - iter 270/275 - loss 0.49391955 - time (sec): 4.08 - samples/sec: 5466.99 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:43:06,438 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:06,438 EPOCH 6 done: loss 0.4934 - lr: 0.000013
2023-10-18 14:43:06,811 DEV : loss 0.38504037261009216 - f1-score (micro avg)  0.4258
2023-10-18 14:43:06,816 saving best model
2023-10-18 14:43:06,850 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:07,252 epoch 7 - iter 27/275 - loss 0.54413041 - time (sec): 0.40 - samples/sec: 4764.45 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:43:07,667 epoch 7 - iter 54/275 - loss 0.50300859 - time (sec): 0.82 - samples/sec: 5186.24 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:43:08,091 epoch 7 - iter 81/275 - loss 0.46386285 - time (sec): 1.24 - samples/sec: 5448.61 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:43:08,495 epoch 7 - iter 108/275 - loss 0.47033759 - time (sec): 1.65 - samples/sec: 5491.03 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:43:08,907 epoch 7 - iter 135/275 - loss 0.47671658 - time (sec): 2.06 - samples/sec: 5404.26 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:43:09,315 epoch 7 - iter 162/275 - loss 0.47678839 - time (sec): 2.46 - samples/sec: 5373.05 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:43:09,737 epoch 7 - iter 189/275 - loss 0.47287308 - time (sec): 2.89 - samples/sec: 5432.45 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:43:10,149 epoch 7 - iter 216/275 - loss 0.47324651 - time (sec): 3.30 - samples/sec: 5403.72 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:43:10,558 epoch 7 - iter 243/275 - loss 0.47351055 - time (sec): 3.71 - samples/sec: 5457.27 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:43:10,974 epoch 7 - iter 270/275 - loss 0.46992356 - time (sec): 4.12 - samples/sec: 5429.40 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:43:11,045 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:11,045 EPOCH 7 done: loss 0.4690 - lr: 0.000010
2023-10-18 14:43:11,424 DEV : loss 0.3652515113353729 - f1-score (micro avg)  0.4913
2023-10-18 14:43:11,428 saving best model
2023-10-18 14:43:11,464 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:11,885 epoch 8 - iter 27/275 - loss 0.48504152 - time (sec): 0.42 - samples/sec: 5742.71 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:43:12,284 epoch 8 - iter 54/275 - loss 0.47425987 - time (sec): 0.82 - samples/sec: 5788.13 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:43:12,697 epoch 8 - iter 81/275 - loss 0.49146730 - time (sec): 1.23 - samples/sec: 5823.19 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:43:13,118 epoch 8 - iter 108/275 - loss 0.47302956 - time (sec): 1.65 - samples/sec: 5631.56 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:43:13,530 epoch 8 - iter 135/275 - loss 0.47714840 - time (sec): 2.07 - samples/sec: 5559.89 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:43:13,943 epoch 8 - iter 162/275 - loss 0.46515601 - time (sec): 2.48 - samples/sec: 5455.87 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:43:14,350 epoch 8 - iter 189/275 - loss 0.46293655 - time (sec): 2.89 - samples/sec: 5408.67 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:43:14,754 epoch 8 - iter 216/275 - loss 0.45905739 - time (sec): 3.29 - samples/sec: 5412.04 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:43:15,169 epoch 8 - iter 243/275 - loss 0.45136845 - time (sec): 3.71 - samples/sec: 5420.68 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:43:15,589 epoch 8 - iter 270/275 - loss 0.44211449 - time (sec): 4.12 - samples/sec: 5430.32 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:43:15,662 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:15,662 EPOCH 8 done: loss 0.4418 - lr: 0.000007
2023-10-18 14:43:16,034 DEV : loss 0.35667097568511963 - f1-score (micro avg)  0.5222
2023-10-18 14:43:16,037 saving best model
2023-10-18 14:43:16,073 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:16,475 epoch 9 - iter 27/275 - loss 0.46086074 - time (sec): 0.40 - samples/sec: 5520.10 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:43:16,880 epoch 9 - iter 54/275 - loss 0.44215721 - time (sec): 0.81 - samples/sec: 5428.44 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:43:17,284 epoch 9 - iter 81/275 - loss 0.43449692 - time (sec): 1.21 - samples/sec: 5342.81 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:43:17,695 epoch 9 - iter 108/275 - loss 0.43572935 - time (sec): 1.62 - samples/sec: 5257.18 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:43:18,106 epoch 9 - iter 135/275 - loss 0.44037196 - time (sec): 2.03 - samples/sec: 5185.35 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:43:18,504 epoch 9 - iter 162/275 - loss 0.43832233 - time (sec): 2.43 - samples/sec: 5265.56 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:43:18,920 epoch 9 - iter 189/275 - loss 0.44870956 - time (sec): 2.85 - samples/sec: 5342.73 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:43:19,346 epoch 9 - iter 216/275 - loss 0.43530574 - time (sec): 3.27 - samples/sec: 5358.24 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:43:19,757 epoch 9 - iter 243/275 - loss 0.43530429 - time (sec): 3.68 - samples/sec: 5466.45 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:43:20,171 epoch 9 - iter 270/275 - loss 0.43049081 - time (sec): 4.10 - samples/sec: 5463.51 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:43:20,242 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:20,243 EPOCH 9 done: loss 0.4341 - lr: 0.000003
2023-10-18 14:43:20,612 DEV : loss 0.3517986536026001 - f1-score (micro avg)  0.5435
2023-10-18 14:43:20,616 saving best model
2023-10-18 14:43:20,652 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:21,060 epoch 10 - iter 27/275 - loss 0.44948570 - time (sec): 0.41 - samples/sec: 5415.35 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:43:21,472 epoch 10 - iter 54/275 - loss 0.44181683 - time (sec): 0.82 - samples/sec: 5634.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:43:21,886 epoch 10 - iter 81/275 - loss 0.47116250 - time (sec): 1.23 - samples/sec: 5627.40 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:43:22,289 epoch 10 - iter 108/275 - loss 0.44182285 - time (sec): 1.64 - samples/sec: 5505.07 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:43:22,707 epoch 10 - iter 135/275 - loss 0.44570033 - time (sec): 2.05 - samples/sec: 5632.63 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:43:23,122 epoch 10 - iter 162/275 - loss 0.43757809 - time (sec): 2.47 - samples/sec: 5552.83 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:43:23,543 epoch 10 - iter 189/275 - loss 0.43587334 - time (sec): 2.89 - samples/sec: 5475.08 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:43:23,949 epoch 10 - iter 216/275 - loss 0.43145224 - time (sec): 3.30 - samples/sec: 5462.81 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:43:24,352 epoch 10 - iter 243/275 - loss 0.43014733 - time (sec): 3.70 - samples/sec: 5453.66 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:43:24,769 epoch 10 - iter 270/275 - loss 0.42830775 - time (sec): 4.12 - samples/sec: 5433.27 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:43:24,848 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:24,848 EPOCH 10 done: loss 0.4263 - lr: 0.000000
2023-10-18 14:43:25,217 DEV : loss 0.3487985134124756 - f1-score (micro avg)  0.5408
2023-10-18 14:43:25,251 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:25,251 Loading model from best epoch ...
2023-10-18 14:43:25,333 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:43:25,632 
Results:
- F-score (micro) 0.5514
- F-score (macro) 0.3272
- Accuracy 0.3903

By class:
              precision    recall  f1-score   support

       scope     0.5460    0.5398    0.5429       176
        pers     0.7826    0.5625    0.6545       128
        work     0.4198    0.4595    0.4387        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.5793    0.5262    0.5514       382
   macro avg     0.3497    0.3123    0.3272       382
weighted avg     0.5951    0.5262    0.5544       382

2023-10-18 14:43:25,632 ----------------------------------------------------------------------------------------------------
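
The averaged rows in the final report can be cross-checked from the per-class rows. The sketch below is not part of the training run; it re-derives the macro, weighted, and micro F-scores from the four-decimal values printed in the table, so it can only agree with the log to within rounding. The names `by_class` and `f1` are illustrative and do not come from the log.

```python
# Per-class rows copied from the "By class" table above:
# (precision, recall, f1-score, support).
by_class = {
    "scope":  (0.5460, 0.5398, 0.5429, 176),
    "pers":   (0.7826, 0.5625, 0.6545, 128),
    "work":   (0.4198, 0.4595, 0.4387, 74),
    "object": (0.0000, 0.0000, 0.0000, 2),
    "loc":    (0.0000, 0.0000, 0.0000, 2),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


rows = list(by_class.values())
total_support = sum(support for _, _, _, support in rows)

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(score for _, _, score, _ in rows) / len(rows)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(score * support for _, _, score, support in rows) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.5793, 0.5262)

print(f"support={total_support}")
print(f"macro F1    ~ {macro_f1:.4f}")
print(f"weighted F1 ~ {weighted_f1:.4f}")
print(f"micro F1    ~ {micro_f1:.4f}")
```

The macro score sits well below the micro score because `object` and `loc` each have a support of 2 and an F1 of 0, yet count as much as `scope` or `pers` in the unweighted mean.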