Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697711900.46dc0c540dd0.4319.4 +3 -0
- test.tsv +0 -0
- training.log +241 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:57fc736ac5a97c6c3ce431c66d549d3255a68216c3e541edd3679a6e7ad50dd8
+size 19048098
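The three added lines for best-model.pt are a Git LFS pointer rather than the model weights themselves: the actual file is stored out of band and identified by its SHA-256 digest and byte size. As a minimal sketch (the function name `parse_lfs_pointer` and the returned dictionary keys are illustrative choices, not part of any library), such a pointer could be read like this:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its fields.

    Each pointer line has the form "<key> <value>"; the oid value is
    "<hash algorithm>:<hex digest>".
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the best-model.pt diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:57fc736ac5a97c6c3ce431c66d549d3255a68216c3e541edd3679a6e7ad50dd8
size 19048098
"""

info = parse_lfs_pointer(pointer)
print(f"{info['size_bytes'] / 1e6:.1f} MB, {info['hash_algorithm']}")
```

The size field shows that the checkpoint is roughly 19 MB, which fits the small two-layer encoder described in training.log.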
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      10:39:46   0.0000         0.8426      0.1434    0.3810         0.0152      0.0291  0.0148
+2      10:41:14   0.0000         0.3656      0.1391    0.3213         0.2689      0.2928  0.1721
+3      10:42:42   0.0000         0.3119      0.1372    0.2820         0.3466      0.3110  0.1848
+4      10:44:11   0.0000         0.2782      0.1481    0.2199         0.3352      0.2656  0.1540
+5      10:45:39   0.0000         0.2535      0.1490    0.2427         0.3466      0.2855  0.1676
+6      10:47:07   0.0000         0.2364      0.1648    0.2011         0.4072      0.2693  0.1567
+7      10:48:35   0.0000         0.2220      0.1679    0.2145         0.3693      0.2714  0.1583
+8      10:50:04   0.0000         0.2161      0.1719    0.2079         0.3693      0.2660  0.1545
+9      10:51:32   0.0000         0.2094      0.1816    0.2071         0.3958      0.2720  0.1583
+10     10:53:01   0.0000         0.2072      0.1781    0.2076         0.3617      0.2638  0.1528
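The per-epoch rows of loss.tsv can be scanned to see which epoch the "best model" corresponds to. The sketch below is only an illustration: it embeds the rows as a whitespace-separated string (the file on disk is a .tsv and would normally be read from its path), and the helper name `read_rows` is made up for this example.

```python
# Rows copied from loss.tsv; columns are separated by whitespace here.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 10:39:46 0.0000 0.8426 0.1434 0.3810 0.0152 0.0291 0.0148
2 10:41:14 0.0000 0.3656 0.1391 0.3213 0.2689 0.2928 0.1721
3 10:42:42 0.0000 0.3119 0.1372 0.2820 0.3466 0.3110 0.1848
4 10:44:11 0.0000 0.2782 0.1481 0.2199 0.3352 0.2656 0.1540
5 10:45:39 0.0000 0.2535 0.1490 0.2427 0.3466 0.2855 0.1676
6 10:47:07 0.0000 0.2364 0.1648 0.2011 0.4072 0.2693 0.1567
7 10:48:35 0.0000 0.2220 0.1679 0.2145 0.3693 0.2714 0.1583
8 10:50:04 0.0000 0.2161 0.1719 0.2079 0.3693 0.2660 0.1545
9 10:51:32 0.0000 0.2094 0.1816 0.2071 0.3958 0.2720 0.1583
10 10:53:01 0.0000 0.2072 0.1781 0.2076 0.3617 0.2638 0.1528
"""


def read_rows(text: str) -> list[dict]:
    """Turn the header line plus data lines into a list of dicts."""
    header, *lines = text.strip().splitlines()
    names = header.split()
    return [dict(zip(names, line.split())) for line in lines]


rows = read_rows(LOSS_TSV)

# Epoch with the highest dev F1 and epoch with the lowest dev loss.
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
lowest_dev_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))

# Training loss keeps falling even after dev metrics stop improving.
train_losses = [float(r["TRAIN_LOSS"]) for r in rows]
train_loss_decreasing = all(a > b for a, b in zip(train_losses, train_losses[1:]))

print(best_f1["EPOCH"], best_f1["DEV_F1"])
print(lowest_dev_loss["EPOCH"], lowest_dev_loss["DEV_LOSS"])
print(train_loss_decreasing)
```

Both criteria point to epoch 3, while the training loss continues to fall through epoch 10. That pattern is consistent with training.log, where "saving best model" is last reported after epoch 3.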
runs/events.out.tfevents.1697711900.46dc0c540dd0.4319.4
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d6e0ab3e878c2cfd0c06ff68e3191663621e8a9c98f7fcffcdde7cd827b2e6f
+size 2923780
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,241 @@
+2023-10-19 10:38:20,282 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,282 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 128)
+        (position_embeddings): Embedding(512, 128)
+        (token_type_embeddings): Embedding(2, 128)
+        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-1): 2 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=128, out_features=128, bias=True)
+                (key): Linear(in_features=128, out_features=128, bias=True)
+                (value): Linear(in_features=128, out_features=128, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=128, out_features=128, bias=True)
+                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=128, out_features=512, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=512, out_features=128, bias=True)
+              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=128, out_features=128, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=128, out_features=17, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-19 10:38:20,282 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,282 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
+ - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
+2023-10-19 10:38:20,282 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,282 Train: 20847 sentences
+2023-10-19 10:38:20,282 (train_with_dev=False, train_with_test=False)
+2023-10-19 10:38:20,282 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,282 Training Params:
+2023-10-19 10:38:20,282 - learning_rate: "3e-05"
+2023-10-19 10:38:20,283 - mini_batch_size: "4"
+2023-10-19 10:38:20,283 - max_epochs: "10"
+2023-10-19 10:38:20,283 - shuffle: "True"
+2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,283 Plugins:
+2023-10-19 10:38:20,283 - TensorboardLogger
+2023-10-19 10:38:20,283 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,283 Final evaluation on model from best epoch (best-model.pt)
+2023-10-19 10:38:20,283 - metric: "('micro avg', 'f1-score')"
+2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,283 Computation:
+2023-10-19 10:38:20,283 - compute on device: cuda:0
+2023-10-19 10:38:20,283 - embedding storage: none
+2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,283 Model training base path: "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
+2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,283 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:38:20,283 Logging anything other than scalars to TensorBoard is currently not supported.
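The training parameters (20847 training sentences, mini_batch_size 4, max_epochs 10, learning_rate 3e-05, LinearScheduler with warmup_fraction 0.1) are enough to reproduce the `lr` values printed in the per-iteration log lines. The sketch below assumes the usual shape of such a schedule, a linear ramp from 0 to the peak rate followed by a linear decay to 0; it is not Flair's actual scheduler code, and the names `linear_warmup_decay_lr` and `logged_lr` are hypothetical.

```python
import math

TRAIN_SENTENCES = 20847
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1

# 20847 / 4 = 5211.75, so the last batch of each epoch is partial.
steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS


def linear_warmup_decay_lr(step: int) -> float:
    """Assumed schedule: linear warmup to PEAK_LR, then linear decay to 0."""
    warmup_steps = round(total_steps * WARMUP_FRACTION)
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = (total_steps - step) / (total_steps - warmup_steps)
    return PEAK_LR * max(0.0, remaining)


def logged_lr(epoch: int, iteration: int) -> str:
    """Format the rate the way the log prints it (six decimals)."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return f"{linear_warmup_decay_lr(step):.6f}"


print(steps_per_epoch, total_steps)
print(logged_lr(1, 521), logged_lr(1, 5210), logged_lr(2, 5210))
```

Under this assumption the warmup covers exactly the first epoch (5212 of 52120 steps), which matches the log: the rate climbs to 0.000030 by the end of epoch 1 and then falls steadily to 0.000000 in epoch 10.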
+2023-10-19 10:38:29,211 epoch 1 - iter 521/5212 - loss 2.72371222 - time (sec): 8.93 - samples/sec: 4096.81 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 10:38:37,636 epoch 1 - iter 1042/5212 - loss 2.05502782 - time (sec): 17.35 - samples/sec: 4193.38 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 10:38:45,775 epoch 1 - iter 1563/5212 - loss 1.58883158 - time (sec): 25.49 - samples/sec: 4304.23 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 10:38:54,422 epoch 1 - iter 2084/5212 - loss 1.32512439 - time (sec): 34.14 - samples/sec: 4326.46 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 10:39:02,797 epoch 1 - iter 2605/5212 - loss 1.17407642 - time (sec): 42.51 - samples/sec: 4404.91 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 10:39:11,106 epoch 1 - iter 3126/5212 - loss 1.08043972 - time (sec): 50.82 - samples/sec: 4390.07 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 10:39:19,406 epoch 1 - iter 3647/5212 - loss 1.01154260 - time (sec): 59.12 - samples/sec: 4394.12 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 10:39:27,380 epoch 1 - iter 4168/5212 - loss 0.94868657 - time (sec): 67.10 - samples/sec: 4397.60 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 10:39:35,900 epoch 1 - iter 4689/5212 - loss 0.88980347 - time (sec): 75.62 - samples/sec: 4380.26 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 10:39:44,202 epoch 1 - iter 5210/5212 - loss 0.84260434 - time (sec): 83.92 - samples/sec: 4377.91 - lr: 0.000030 - momentum: 0.000000
+2023-10-19 10:39:44,234 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:39:44,234 EPOCH 1 done: loss 0.8426 - lr: 0.000030
+2023-10-19 10:39:46,475 DEV : loss 0.1433752328157425 - f1-score (micro avg) 0.0291
+2023-10-19 10:39:46,497 saving best model
+2023-10-19 10:39:46,526 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:39:54,611 epoch 2 - iter 521/5212 - loss 0.40879602 - time (sec): 8.09 - samples/sec: 4230.25 - lr: 0.000030 - momentum: 0.000000
+2023-10-19 10:40:02,957 epoch 2 - iter 1042/5212 - loss 0.41373491 - time (sec): 16.43 - samples/sec: 4320.03 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 10:40:11,068 epoch 2 - iter 1563/5212 - loss 0.41110618 - time (sec): 24.54 - samples/sec: 4287.77 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 10:40:19,346 epoch 2 - iter 2084/5212 - loss 0.39790099 - time (sec): 32.82 - samples/sec: 4380.84 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 10:40:27,777 epoch 2 - iter 2605/5212 - loss 0.39498142 - time (sec): 41.25 - samples/sec: 4415.85 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 10:40:36,056 epoch 2 - iter 3126/5212 - loss 0.38607733 - time (sec): 49.53 - samples/sec: 4425.77 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 10:40:44,415 epoch 2 - iter 3647/5212 - loss 0.37832413 - time (sec): 57.89 - samples/sec: 4438.26 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 10:40:53,047 epoch 2 - iter 4168/5212 - loss 0.37306540 - time (sec): 66.52 - samples/sec: 4423.81 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 10:41:01,276 epoch 2 - iter 4689/5212 - loss 0.36784622 - time (sec): 74.75 - samples/sec: 4421.81 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 10:41:09,649 epoch 2 - iter 5210/5212 - loss 0.36564089 - time (sec): 83.12 - samples/sec: 4419.15 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 10:41:09,684 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:41:09,684 EPOCH 2 done: loss 0.3656 - lr: 0.000027
+2023-10-19 10:41:14,780 DEV : loss 0.1391393542289734 - f1-score (micro avg) 0.2928
+2023-10-19 10:41:14,804 saving best model
+2023-10-19 10:41:14,840 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:41:23,076 epoch 3 - iter 521/5212 - loss 0.32466550 - time (sec): 8.24 - samples/sec: 4448.41 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 10:41:31,416 epoch 3 - iter 1042/5212 - loss 0.31821062 - time (sec): 16.58 - samples/sec: 4451.66 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 10:41:39,462 epoch 3 - iter 1563/5212 - loss 0.32728696 - time (sec): 24.62 - samples/sec: 4451.18 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 10:41:47,786 epoch 3 - iter 2084/5212 - loss 0.31607836 - time (sec): 32.95 - samples/sec: 4487.01 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 10:41:56,154 epoch 3 - iter 2605/5212 - loss 0.31480478 - time (sec): 41.31 - samples/sec: 4494.25 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 10:42:04,737 epoch 3 - iter 3126/5212 - loss 0.31038250 - time (sec): 49.90 - samples/sec: 4473.56 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 10:42:12,965 epoch 3 - iter 3647/5212 - loss 0.31026995 - time (sec): 58.12 - samples/sec: 4453.95 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 10:42:21,291 epoch 3 - iter 4168/5212 - loss 0.31015311 - time (sec): 66.45 - samples/sec: 4444.22 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 10:42:29,486 epoch 3 - iter 4689/5212 - loss 0.31207638 - time (sec): 74.65 - samples/sec: 4429.21 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 10:42:37,697 epoch 3 - iter 5210/5212 - loss 0.31193129 - time (sec): 82.86 - samples/sec: 4433.61 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 10:42:37,733 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:42:37,733 EPOCH 3 done: loss 0.3119 - lr: 0.000023
+2023-10-19 10:42:42,852 DEV : loss 0.13720223307609558 - f1-score (micro avg) 0.311
+2023-10-19 10:42:42,876 saving best model
+2023-10-19 10:42:42,916 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:42:51,333 epoch 4 - iter 521/5212 - loss 0.25709768 - time (sec): 8.42 - samples/sec: 4514.44 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 10:42:59,518 epoch 4 - iter 1042/5212 - loss 0.26137660 - time (sec): 16.60 - samples/sec: 4367.48 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 10:43:07,636 epoch 4 - iter 1563/5212 - loss 0.27629460 - time (sec): 24.72 - samples/sec: 4311.54 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 10:43:15,953 epoch 4 - iter 2084/5212 - loss 0.27504062 - time (sec): 33.04 - samples/sec: 4377.33 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 10:43:24,401 epoch 4 - iter 2605/5212 - loss 0.27571513 - time (sec): 41.48 - samples/sec: 4448.97 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 10:43:32,741 epoch 4 - iter 3126/5212 - loss 0.27824346 - time (sec): 49.82 - samples/sec: 4441.02 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 10:43:41,037 epoch 4 - iter 3647/5212 - loss 0.27812047 - time (sec): 58.12 - samples/sec: 4436.80 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 10:43:49,174 epoch 4 - iter 4168/5212 - loss 0.28260826 - time (sec): 66.26 - samples/sec: 4408.59 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 10:43:57,417 epoch 4 - iter 4689/5212 - loss 0.28076208 - time (sec): 74.50 - samples/sec: 4423.38 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 10:44:05,823 epoch 4 - iter 5210/5212 - loss 0.27822833 - time (sec): 82.91 - samples/sec: 4430.02 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 10:44:05,856 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:44:05,856 EPOCH 4 done: loss 0.2782 - lr: 0.000020
+2023-10-19 10:44:11,009 DEV : loss 0.14805111289024353 - f1-score (micro avg) 0.2656
+2023-10-19 10:44:11,033 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:44:19,265 epoch 5 - iter 521/5212 - loss 0.25458992 - time (sec): 8.23 - samples/sec: 4663.96 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 10:44:27,618 epoch 5 - iter 1042/5212 - loss 0.23812930 - time (sec): 16.58 - samples/sec: 4645.30 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 10:44:35,825 epoch 5 - iter 1563/5212 - loss 0.23909797 - time (sec): 24.79 - samples/sec: 4518.74 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 10:44:44,069 epoch 5 - iter 2084/5212 - loss 0.24733360 - time (sec): 33.04 - samples/sec: 4498.14 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 10:44:52,390 epoch 5 - iter 2605/5212 - loss 0.24607385 - time (sec): 41.36 - samples/sec: 4479.65 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 10:45:00,692 epoch 5 - iter 3126/5212 - loss 0.25227478 - time (sec): 49.66 - samples/sec: 4455.06 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 10:45:08,913 epoch 5 - iter 3647/5212 - loss 0.25331088 - time (sec): 57.88 - samples/sec: 4439.80 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 10:45:17,372 epoch 5 - iter 4168/5212 - loss 0.25226388 - time (sec): 66.34 - samples/sec: 4444.21 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 10:45:25,748 epoch 5 - iter 4689/5212 - loss 0.25437720 - time (sec): 74.71 - samples/sec: 4442.17 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 10:45:34,004 epoch 5 - iter 5210/5212 - loss 0.25346983 - time (sec): 82.97 - samples/sec: 4428.05 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 10:45:34,030 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:45:34,030 EPOCH 5 done: loss 0.2535 - lr: 0.000017
+2023-10-19 10:45:39,162 DEV : loss 0.1490069180727005 - f1-score (micro avg) 0.2855
+2023-10-19 10:45:39,197 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:45:47,655 epoch 6 - iter 521/5212 - loss 0.26624853 - time (sec): 8.46 - samples/sec: 3991.54 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 10:45:56,023 epoch 6 - iter 1042/5212 - loss 0.25453009 - time (sec): 16.82 - samples/sec: 4277.48 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 10:46:04,383 epoch 6 - iter 1563/5212 - loss 0.24457171 - time (sec): 25.18 - samples/sec: 4370.58 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 10:46:12,858 epoch 6 - iter 2084/5212 - loss 0.23560982 - time (sec): 33.66 - samples/sec: 4419.97 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 10:46:21,179 epoch 6 - iter 2605/5212 - loss 0.23329802 - time (sec): 41.98 - samples/sec: 4434.39 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 10:46:29,142 epoch 6 - iter 3126/5212 - loss 0.23211029 - time (sec): 49.94 - samples/sec: 4480.71 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 10:46:37,489 epoch 6 - iter 3647/5212 - loss 0.23718572 - time (sec): 58.29 - samples/sec: 4445.74 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 10:46:45,775 epoch 6 - iter 4168/5212 - loss 0.23673427 - time (sec): 66.58 - samples/sec: 4422.59 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 10:46:53,971 epoch 6 - iter 4689/5212 - loss 0.23213792 - time (sec): 74.77 - samples/sec: 4423.57 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 10:47:02,844 epoch 6 - iter 5210/5212 - loss 0.23638178 - time (sec): 83.65 - samples/sec: 4391.69 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 10:47:02,878 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:47:02,879 EPOCH 6 done: loss 0.2364 - lr: 0.000013
+2023-10-19 10:47:07,421 DEV : loss 0.1647791564464569 - f1-score (micro avg) 0.2693
+2023-10-19 10:47:07,444 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:47:15,620 epoch 7 - iter 521/5212 - loss 0.23817423 - time (sec): 8.18 - samples/sec: 4509.93 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 10:47:23,915 epoch 7 - iter 1042/5212 - loss 0.22435065 - time (sec): 16.47 - samples/sec: 4529.78 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 10:47:32,101 epoch 7 - iter 1563/5212 - loss 0.22347997 - time (sec): 24.66 - samples/sec: 4503.56 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 10:47:40,302 epoch 7 - iter 2084/5212 - loss 0.22299657 - time (sec): 32.86 - samples/sec: 4508.96 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 10:47:49,223 epoch 7 - iter 2605/5212 - loss 0.21747021 - time (sec): 41.78 - samples/sec: 4483.66 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 10:47:57,646 epoch 7 - iter 3126/5212 - loss 0.21824678 - time (sec): 50.20 - samples/sec: 4446.54 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 10:48:05,716 epoch 7 - iter 3647/5212 - loss 0.22182997 - time (sec): 58.27 - samples/sec: 4432.18 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 10:48:14,327 epoch 7 - iter 4168/5212 - loss 0.22075907 - time (sec): 66.88 - samples/sec: 4401.35 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 10:48:22,791 epoch 7 - iter 4689/5212 - loss 0.22067660 - time (sec): 75.35 - samples/sec: 4403.88 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 10:48:31,053 epoch 7 - iter 5210/5212 - loss 0.22219262 - time (sec): 83.61 - samples/sec: 4391.66 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 10:48:31,096 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:48:31,096 EPOCH 7 done: loss 0.2220 - lr: 0.000010
+2023-10-19 10:48:35,618 DEV : loss 0.16794627904891968 - f1-score (micro avg) 0.2714
+2023-10-19 10:48:35,641 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:48:44,018 epoch 8 - iter 521/5212 - loss 0.24024885 - time (sec): 8.38 - samples/sec: 4188.92 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 10:48:52,244 epoch 8 - iter 1042/5212 - loss 0.23380355 - time (sec): 16.60 - samples/sec: 4239.48 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 10:49:00,438 epoch 8 - iter 1563/5212 - loss 0.22497629 - time (sec): 24.80 - samples/sec: 4319.11 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 10:49:08,803 epoch 8 - iter 2084/5212 - loss 0.23065970 - time (sec): 33.16 - samples/sec: 4388.00 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 10:49:17,008 epoch 8 - iter 2605/5212 - loss 0.22269701 - time (sec): 41.37 - samples/sec: 4419.51 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 10:49:25,407 epoch 8 - iter 3126/5212 - loss 0.21867285 - time (sec): 49.77 - samples/sec: 4421.06 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 10:49:33,743 epoch 8 - iter 3647/5212 - loss 0.21580111 - time (sec): 58.10 - samples/sec: 4448.90 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 10:49:42,116 epoch 8 - iter 4168/5212 - loss 0.21326174 - time (sec): 66.47 - samples/sec: 4465.53 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 10:49:50,650 epoch 8 - iter 4689/5212 - loss 0.21523254 - time (sec): 75.01 - samples/sec: 4435.87 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 10:49:58,898 epoch 8 - iter 5210/5212 - loss 0.21610124 - time (sec): 83.26 - samples/sec: 4412.87 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 10:49:58,930 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:49:58,930 EPOCH 8 done: loss 0.2161 - lr: 0.000007
+2023-10-19 10:50:04,152 DEV : loss 0.17187514901161194 - f1-score (micro avg) 0.266
+2023-10-19 10:50:04,179 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:50:12,398 epoch 9 - iter 521/5212 - loss 0.21708792 - time (sec): 8.22 - samples/sec: 4102.45 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 10:50:20,747 epoch 9 - iter 1042/5212 - loss 0.19390674 - time (sec): 16.57 - samples/sec: 4312.08 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 10:50:29,009 epoch 9 - iter 1563/5212 - loss 0.20054262 - time (sec): 24.83 - samples/sec: 4341.33 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 10:50:37,304 epoch 9 - iter 2084/5212 - loss 0.21128490 - time (sec): 33.12 - samples/sec: 4338.21 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 10:50:45,779 epoch 9 - iter 2605/5212 - loss 0.21144254 - time (sec): 41.60 - samples/sec: 4433.69 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 10:50:54,001 epoch 9 - iter 3126/5212 - loss 0.20955816 - time (sec): 49.82 - samples/sec: 4423.37 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 10:51:02,390 epoch 9 - iter 3647/5212 - loss 0.21385600 - time (sec): 58.21 - samples/sec: 4438.87 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 10:51:10,644 epoch 9 - iter 4168/5212 - loss 0.21026236 - time (sec): 66.46 - samples/sec: 4440.93 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 10:51:19,014 epoch 9 - iter 4689/5212 - loss 0.20931466 - time (sec): 74.83 - samples/sec: 4411.30 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 10:51:27,385 epoch 9 - iter 5210/5212 - loss 0.20947495 - time (sec): 83.20 - samples/sec: 4414.82 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 10:51:27,418 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:51:27,418 EPOCH 9 done: loss 0.2094 - lr: 0.000003
+2023-10-19 10:51:32,597 DEV : loss 0.1816394329071045 - f1-score (micro avg) 0.272
+2023-10-19 10:51:32,621 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:51:41,163 epoch 10 - iter 521/5212 - loss 0.20993504 - time (sec): 8.54 - samples/sec: 4199.80 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 10:51:49,336 epoch 10 - iter 1042/5212 - loss 0.19721566 - time (sec): 16.71 - samples/sec: 4408.14 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 10:51:57,596 epoch 10 - iter 1563/5212 - loss 0.19886392 - time (sec): 24.97 - samples/sec: 4399.09 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 10:52:06,090 epoch 10 - iter 2084/5212 - loss 0.20201347 - time (sec): 33.47 - samples/sec: 4376.06 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 10:52:14,589 epoch 10 - iter 2605/5212 - loss 0.20351706 - time (sec): 41.97 - samples/sec: 4365.97 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 10:52:23,017 epoch 10 - iter 3126/5212 - loss 0.20748962 - time (sec): 50.40 - samples/sec: 4427.54 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 10:52:31,397 epoch 10 - iter 3647/5212 - loss 0.20967361 - time (sec): 58.78 - samples/sec: 4432.91 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 10:52:39,724 epoch 10 - iter 4168/5212 - loss 0.20825590 - time (sec): 67.10 - samples/sec: 4426.18 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 10:52:48,049 epoch 10 - iter 4689/5212 - loss 0.20604718 - time (sec): 75.43 - samples/sec: 4423.79 - lr: 0.000000 - momentum: 0.000000
+2023-10-19 10:52:56,230 epoch 10 - iter 5210/5212 - loss 0.20716771 - time (sec): 83.61 - samples/sec: 4391.22 - lr: 0.000000 - momentum: 0.000000
+2023-10-19 10:52:56,272 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:52:56,272 EPOCH 10 done: loss 0.2072 - lr: 0.000000
+2023-10-19 10:53:01,417 DEV : loss 0.17811782658100128 - f1-score (micro avg) 0.2638
+2023-10-19 10:53:01,469 ----------------------------------------------------------------------------------------------------
+2023-10-19 10:53:01,470 Loading model from best epoch ...
+2023-10-19 10:53:01,547 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
+2023-10-19 10:53:07,781
+Results:
+- F-score (micro) 0.299
+- F-score (macro) 0.1455
+- Accuracy 0.177
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.4586    0.4605    0.4595      1214
+         PER     0.1394    0.0718    0.0948       808
+         ORG     0.0470    0.0198    0.0279       353
+   HumanProd     0.0000    0.0000    0.0000        15
+
+   micro avg     0.3498    0.2611    0.2990      2390
+   macro avg     0.1612    0.1380    0.1455      2390
+weighted avg     0.2870    0.2611    0.2696      2390
+
+2023-10-19 10:53:07,782 ----------------------------------------------------------------------------------------------------
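The aggregate rows of the final test report follow from the per-class rows, which makes the gap between micro F1 (0.2990) and macro F1 (0.1455) easier to read. The sketch below recomputes them from the rounded values printed in the log, so it only agrees to about three decimals; the variable names are illustrative, and this is not the evaluation code that produced the report.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, f1-score, support) copied from the "By class" table.
per_class = {
    "LOC": (0.4586, 0.4605, 0.4595, 1214),
    "PER": (0.1394, 0.0718, 0.0948, 808),
    "ORG": (0.0470, 0.0198, 0.0279, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}

total_support = sum(row[3] for row in per_class.values())

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: each class F1 is weighted by its support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1 recomputed from the reported micro precision and recall.
micro_f1 = f1(0.3498, 0.2611)

print(total_support)
print(round(macro_f1, 3), round(weighted_f1, 3), round(micro_f1, 3))
```

Because the macro average gives the 15-mention HumanProd class and the weak ORG class the same weight as LOC, it lands well below the micro average, which is dominated by the 1214 LOC mentions.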