Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- test.tsv +0 -0
- training.log +239 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:467f72e144796a1d43624d0c8583661e68ba29d7f7b556588c6b7083144bbe53
+size 443323527
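The three added lines are a Git LFS pointer, not the model weights themselves. As a minimal sketch (the helper name `parse_lfs_pointer` is hypothetical and not part of any library), the pointer can be read as space-separated key/value pairs to recover the hash algorithm, the digest, and the size of the real file:

```python
# Pointer text copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:467f72e144796a1d43624d0c8583661e68ba29d7f7b556588c6b7083144bbe53
size 443323527
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER)
algorithm, _, digest = pointer["oid"].partition(":")
size_bytes = int(pointer["size"])
size_mib = size_bytes / 2**20

print(f"hash algorithm: {algorithm}")
print(f"digest length:  {len(digest)} hex characters")
print(f"object size:    {size_bytes} bytes ({size_mib:.1f} MiB)")
```

The size field describes the stored `best-model.pt` object, so the checkpoint is roughly 423 MiB.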
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+1 19:53:53 0.0000 0.3708 0.0987 0.2859 0.3958 0.3320 0.1998
+2 19:57:15 0.0000 0.1524 0.1584 0.2472 0.5473 0.3406 0.2067
+3 20:00:40 0.0000 0.1078 0.1524 0.3169 0.5739 0.4084 0.2585
+4 20:03:59 0.0000 0.0776 0.2422 0.2974 0.5644 0.3895 0.2437
+5 20:07:18 0.0000 0.0584 0.3234 0.2523 0.5720 0.3501 0.2137
+6 20:10:38 0.0000 0.0442 0.3603 0.2762 0.6250 0.3831 0.2383
+7 20:13:58 0.0000 0.0326 0.3720 0.2851 0.5966 0.3858 0.2406
+8 20:17:17 0.0000 0.0231 0.3844 0.2919 0.5777 0.3878 0.2421
+9 20:20:35 0.0000 0.0153 0.4430 0.2738 0.5947 0.3749 0.2321
+10 20:23:53 0.0000 0.0100 0.4735 0.2795 0.6004 0.3815 0.2366
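The per-epoch table in `loss.tsv` is easiest to read programmatically. The sketch below embeds the rows shown above and splits on whitespace (an assumption: the real file is probably tab-separated, which `str.split()` also handles) to find the epoch with the best dev F1 and the epoch with the lowest dev loss. The function name `parse_loss_table` is hypothetical.

```python
# Rows copied from loss.tsv as shown in the diff above.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 19:53:53 0.0000 0.3708 0.0987 0.2859 0.3958 0.3320 0.1998
2 19:57:15 0.0000 0.1524 0.1584 0.2472 0.5473 0.3406 0.2067
3 20:00:40 0.0000 0.1078 0.1524 0.3169 0.5739 0.4084 0.2585
4 20:03:59 0.0000 0.0776 0.2422 0.2974 0.5644 0.3895 0.2437
5 20:07:18 0.0000 0.0584 0.3234 0.2523 0.5720 0.3501 0.2137
6 20:10:38 0.0000 0.0442 0.3603 0.2762 0.6250 0.3831 0.2383
7 20:13:58 0.0000 0.0326 0.3720 0.2851 0.5966 0.3858 0.2406
8 20:17:17 0.0000 0.0231 0.3844 0.2919 0.5777 0.3878 0.2421
9 20:20:35 0.0000 0.0153 0.4430 0.2738 0.5947 0.3749 0.2321
10 20:23:53 0.0000 0.0100 0.4735 0.2795 0.6004 0.3815 0.2366
"""


def parse_loss_table(text):
    """Return one dict per epoch, keyed by the header names (hypothetical helper)."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = parse_loss_table(LOSS_TSV)
best_f1_row = max(rows, key=lambda row: float(row["DEV_F1"]))
lowest_dev_loss_row = min(rows, key=lambda row: float(row["DEV_LOSS"]))

print("best dev F1:     epoch", best_f1_row["EPOCH"], best_f1_row["DEV_F1"])
print("lowest dev loss: epoch", lowest_dev_loss_row["EPOCH"], lowest_dev_loss_row["DEV_LOSS"])
```

On these rows the best dev F1 is at epoch 3 and the lowest dev loss is at epoch 1. This is consistent with `training.log`, where "saving best model" appears only after epochs 1, 2 and 3, because the run selects on micro-averaged F1.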
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,239 @@
+2023-10-15 19:50:34,450 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:34,451 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=17, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-15 19:50:34,451 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:34,451 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
+- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
+2023-10-15 19:50:34,451 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:34,451 Train: 20847 sentences
+2023-10-15 19:50:34,451 (train_with_dev=False, train_with_test=False)
+2023-10-15 19:50:34,451 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:34,451 Training Params:
+2023-10-15 19:50:34,451 - learning_rate: "5e-05"
+2023-10-15 19:50:34,451 - mini_batch_size: "8"
+2023-10-15 19:50:34,451 - max_epochs: "10"
+2023-10-15 19:50:34,451 - shuffle: "True"
+2023-10-15 19:50:34,451 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:34,451 Plugins:
+2023-10-15 19:50:34,451 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-15 19:50:34,451 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:34,451 Final evaluation on model from best epoch (best-model.pt)
+2023-10-15 19:50:34,451 - metric: "('micro avg', 'f1-score')"
+2023-10-15 19:50:34,451 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:34,451 Computation:
+2023-10-15 19:50:34,451 - compute on device: cuda:0
+2023-10-15 19:50:34,451 - embedding storage: none
+2023-10-15 19:50:34,451 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:34,452 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
+2023-10-15 19:50:34,452 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:34,452 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:50:52,747 epoch 1 - iter 260/2606 - loss 1.59028644 - time (sec): 18.29 - samples/sec: 1909.58 - lr: 0.000005 - momentum: 0.000000
+2023-10-15 19:51:12,202 epoch 1 - iter 520/2606 - loss 0.98518790 - time (sec): 37.75 - samples/sec: 1945.90 - lr: 0.000010 - momentum: 0.000000
+2023-10-15 19:51:30,947 epoch 1 - iter 780/2606 - loss 0.76092996 - time (sec): 56.49 - samples/sec: 1924.76 - lr: 0.000015 - momentum: 0.000000
+2023-10-15 19:51:50,886 epoch 1 - iter 1040/2606 - loss 0.63417270 - time (sec): 76.43 - samples/sec: 1893.42 - lr: 0.000020 - momentum: 0.000000
+2023-10-15 19:52:10,166 epoch 1 - iter 1300/2606 - loss 0.55309956 - time (sec): 95.71 - samples/sec: 1915.70 - lr: 0.000025 - momentum: 0.000000
+2023-10-15 19:52:29,070 epoch 1 - iter 1560/2606 - loss 0.49758201 - time (sec): 114.62 - samples/sec: 1905.72 - lr: 0.000030 - momentum: 0.000000
+2023-10-15 19:52:48,118 epoch 1 - iter 1820/2606 - loss 0.45440578 - time (sec): 133.67 - samples/sec: 1904.65 - lr: 0.000035 - momentum: 0.000000
+2023-10-15 19:53:07,758 epoch 1 - iter 2080/2606 - loss 0.41855391 - time (sec): 153.30 - samples/sec: 1900.97 - lr: 0.000040 - momentum: 0.000000
+2023-10-15 19:53:26,472 epoch 1 - iter 2340/2606 - loss 0.39560327 - time (sec): 172.02 - samples/sec: 1905.23 - lr: 0.000045 - momentum: 0.000000
+2023-10-15 19:53:46,488 epoch 1 - iter 2600/2606 - loss 0.37125077 - time (sec): 192.03 - samples/sec: 1909.56 - lr: 0.000050 - momentum: 0.000000
+2023-10-15 19:53:46,912 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:53:46,912 EPOCH 1 done: loss 0.3708 - lr: 0.000050
+2023-10-15 19:53:53,679 DEV : loss 0.09866511821746826 - f1-score (micro avg) 0.332
+2023-10-15 19:53:53,707 saving best model
+2023-10-15 19:53:54,083 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:54:13,897 epoch 2 - iter 260/2606 - loss 0.16979246 - time (sec): 19.81 - samples/sec: 1913.91 - lr: 0.000049 - momentum: 0.000000
+2023-10-15 19:54:33,718 epoch 2 - iter 520/2606 - loss 0.15396119 - time (sec): 39.63 - samples/sec: 1921.04 - lr: 0.000049 - momentum: 0.000000
+2023-10-15 19:54:52,579 epoch 2 - iter 780/2606 - loss 0.15278405 - time (sec): 58.49 - samples/sec: 1938.50 - lr: 0.000048 - momentum: 0.000000
+2023-10-15 19:55:11,345 epoch 2 - iter 1040/2606 - loss 0.15627393 - time (sec): 77.26 - samples/sec: 1914.12 - lr: 0.000048 - momentum: 0.000000
+2023-10-15 19:55:30,228 epoch 2 - iter 1300/2606 - loss 0.15531609 - time (sec): 96.14 - samples/sec: 1926.26 - lr: 0.000047 - momentum: 0.000000
+2023-10-15 19:55:48,789 epoch 2 - iter 1560/2606 - loss 0.15197189 - time (sec): 114.70 - samples/sec: 1925.11 - lr: 0.000047 - momentum: 0.000000
+2023-10-15 19:56:08,319 epoch 2 - iter 1820/2606 - loss 0.14969900 - time (sec): 134.23 - samples/sec: 1933.14 - lr: 0.000046 - momentum: 0.000000
+2023-10-15 19:56:26,763 epoch 2 - iter 2080/2606 - loss 0.15359140 - time (sec): 152.68 - samples/sec: 1928.53 - lr: 0.000046 - momentum: 0.000000
+2023-10-15 19:56:46,738 epoch 2 - iter 2340/2606 - loss 0.15309562 - time (sec): 172.65 - samples/sec: 1927.59 - lr: 0.000045 - momentum: 0.000000
+2023-10-15 19:57:05,105 epoch 2 - iter 2600/2606 - loss 0.15242289 - time (sec): 191.02 - samples/sec: 1921.72 - lr: 0.000044 - momentum: 0.000000
+2023-10-15 19:57:05,409 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:57:05,409 EPOCH 2 done: loss 0.1524 - lr: 0.000044
+2023-10-15 19:57:15,626 DEV : loss 0.1584077775478363 - f1-score (micro avg) 0.3406
+2023-10-15 19:57:15,657 saving best model
+2023-10-15 19:57:16,211 ----------------------------------------------------------------------------------------------------
+2023-10-15 19:57:35,187 epoch 3 - iter 260/2606 - loss 0.11299931 - time (sec): 18.97 - samples/sec: 1921.95 - lr: 0.000044 - momentum: 0.000000
+2023-10-15 19:57:54,309 epoch 3 - iter 520/2606 - loss 0.10767341 - time (sec): 38.09 - samples/sec: 1918.22 - lr: 0.000043 - momentum: 0.000000
+2023-10-15 19:58:15,019 epoch 3 - iter 780/2606 - loss 0.10899205 - time (sec): 58.80 - samples/sec: 1875.68 - lr: 0.000043 - momentum: 0.000000
+2023-10-15 19:58:35,731 epoch 3 - iter 1040/2606 - loss 0.11063881 - time (sec): 79.52 - samples/sec: 1865.12 - lr: 0.000042 - momentum: 0.000000
+2023-10-15 19:58:55,613 epoch 3 - iter 1300/2606 - loss 0.10698292 - time (sec): 99.40 - samples/sec: 1857.48 - lr: 0.000042 - momentum: 0.000000
+2023-10-15 19:59:14,663 epoch 3 - iter 1560/2606 - loss 0.10679134 - time (sec): 118.45 - samples/sec: 1858.88 - lr: 0.000041 - momentum: 0.000000
+2023-10-15 19:59:33,448 epoch 3 - iter 1820/2606 - loss 0.10893641 - time (sec): 137.23 - samples/sec: 1867.67 - lr: 0.000041 - momentum: 0.000000
+2023-10-15 19:59:53,465 epoch 3 - iter 2080/2606 - loss 0.10710962 - time (sec): 157.25 - samples/sec: 1877.55 - lr: 0.000040 - momentum: 0.000000
+2023-10-15 20:00:12,483 epoch 3 - iter 2340/2606 - loss 0.10687887 - time (sec): 176.27 - samples/sec: 1883.24 - lr: 0.000039 - momentum: 0.000000
+2023-10-15 20:00:30,949 epoch 3 - iter 2600/2606 - loss 0.10778076 - time (sec): 194.74 - samples/sec: 1882.60 - lr: 0.000039 - momentum: 0.000000
+2023-10-15 20:00:31,324 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:00:31,324 EPOCH 3 done: loss 0.1078 - lr: 0.000039
+2023-10-15 20:00:40,591 DEV : loss 0.15240149199962616 - f1-score (micro avg) 0.4084
+2023-10-15 20:00:40,626 saving best model
+2023-10-15 20:00:41,208 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:01:00,629 epoch 4 - iter 260/2606 - loss 0.08035620 - time (sec): 19.41 - samples/sec: 1887.95 - lr: 0.000038 - momentum: 0.000000
+2023-10-15 20:01:18,964 epoch 4 - iter 520/2606 - loss 0.07715406 - time (sec): 37.75 - samples/sec: 1884.39 - lr: 0.000038 - momentum: 0.000000
+2023-10-15 20:01:37,159 epoch 4 - iter 780/2606 - loss 0.08039817 - time (sec): 55.94 - samples/sec: 1918.31 - lr: 0.000037 - momentum: 0.000000
+2023-10-15 20:01:56,554 epoch 4 - iter 1040/2606 - loss 0.07833131 - time (sec): 75.34 - samples/sec: 1914.94 - lr: 0.000037 - momentum: 0.000000
+2023-10-15 20:02:14,628 epoch 4 - iter 1300/2606 - loss 0.07761709 - time (sec): 93.41 - samples/sec: 1924.78 - lr: 0.000036 - momentum: 0.000000
+2023-10-15 20:02:32,876 epoch 4 - iter 1560/2606 - loss 0.07793021 - time (sec): 111.66 - samples/sec: 1930.25 - lr: 0.000036 - momentum: 0.000000
+2023-10-15 20:02:52,296 epoch 4 - iter 1820/2606 - loss 0.07911638 - time (sec): 131.08 - samples/sec: 1936.60 - lr: 0.000035 - momentum: 0.000000
+2023-10-15 20:03:10,776 epoch 4 - iter 2080/2606 - loss 0.07888385 - time (sec): 149.56 - samples/sec: 1932.16 - lr: 0.000034 - momentum: 0.000000
+2023-10-15 20:03:29,996 epoch 4 - iter 2340/2606 - loss 0.07940196 - time (sec): 168.78 - samples/sec: 1941.28 - lr: 0.000034 - momentum: 0.000000
+2023-10-15 20:03:50,163 epoch 4 - iter 2600/2606 - loss 0.07772595 - time (sec): 188.95 - samples/sec: 1939.73 - lr: 0.000033 - momentum: 0.000000
+2023-10-15 20:03:50,679 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:03:50,679 EPOCH 4 done: loss 0.0776 - lr: 0.000033
+2023-10-15 20:03:59,788 DEV : loss 0.24223537743091583 - f1-score (micro avg) 0.3895
+2023-10-15 20:03:59,816 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:04:17,856 epoch 5 - iter 260/2606 - loss 0.05320156 - time (sec): 18.04 - samples/sec: 1901.00 - lr: 0.000033 - momentum: 0.000000
+2023-10-15 20:04:36,525 epoch 5 - iter 520/2606 - loss 0.06004399 - time (sec): 36.71 - samples/sec: 1892.24 - lr: 0.000032 - momentum: 0.000000
+2023-10-15 20:04:55,064 epoch 5 - iter 780/2606 - loss 0.06361788 - time (sec): 55.25 - samples/sec: 1900.04 - lr: 0.000032 - momentum: 0.000000
+2023-10-15 20:05:13,980 epoch 5 - iter 1040/2606 - loss 0.06308322 - time (sec): 74.16 - samples/sec: 1916.92 - lr: 0.000031 - momentum: 0.000000
+2023-10-15 20:05:33,591 epoch 5 - iter 1300/2606 - loss 0.06120360 - time (sec): 93.77 - samples/sec: 1917.50 - lr: 0.000031 - momentum: 0.000000
+2023-10-15 20:05:52,267 epoch 5 - iter 1560/2606 - loss 0.06016703 - time (sec): 112.45 - samples/sec: 1913.29 - lr: 0.000030 - momentum: 0.000000
+2023-10-15 20:06:11,407 epoch 5 - iter 1820/2606 - loss 0.06035800 - time (sec): 131.59 - samples/sec: 1920.61 - lr: 0.000029 - momentum: 0.000000
+2023-10-15 20:06:30,790 epoch 5 - iter 2080/2606 - loss 0.05910496 - time (sec): 150.97 - samples/sec: 1928.30 - lr: 0.000029 - momentum: 0.000000
+2023-10-15 20:06:50,194 epoch 5 - iter 2340/2606 - loss 0.05814657 - time (sec): 170.38 - samples/sec: 1929.74 - lr: 0.000028 - momentum: 0.000000
+2023-10-15 20:07:09,957 epoch 5 - iter 2600/2606 - loss 0.05838664 - time (sec): 190.14 - samples/sec: 1927.76 - lr: 0.000028 - momentum: 0.000000
+2023-10-15 20:07:10,400 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:07:10,401 EPOCH 5 done: loss 0.0584 - lr: 0.000028
+2023-10-15 20:07:18,725 DEV : loss 0.3234298527240753 - f1-score (micro avg) 0.3501
+2023-10-15 20:07:18,755 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:07:38,043 epoch 6 - iter 260/2606 - loss 0.04968909 - time (sec): 19.29 - samples/sec: 1943.44 - lr: 0.000027 - momentum: 0.000000
+2023-10-15 20:07:58,295 epoch 6 - iter 520/2606 - loss 0.04591580 - time (sec): 39.54 - samples/sec: 1911.03 - lr: 0.000027 - momentum: 0.000000
+2023-10-15 20:08:16,820 epoch 6 - iter 780/2606 - loss 0.04497115 - time (sec): 58.06 - samples/sec: 1917.34 - lr: 0.000026 - momentum: 0.000000
+2023-10-15 20:08:36,677 epoch 6 - iter 1040/2606 - loss 0.04222920 - time (sec): 77.92 - samples/sec: 1932.37 - lr: 0.000026 - momentum: 0.000000
+2023-10-15 20:08:55,221 epoch 6 - iter 1300/2606 - loss 0.04176238 - time (sec): 96.46 - samples/sec: 1935.86 - lr: 0.000025 - momentum: 0.000000
+2023-10-15 20:09:14,166 epoch 6 - iter 1560/2606 - loss 0.04201919 - time (sec): 115.41 - samples/sec: 1930.35 - lr: 0.000024 - momentum: 0.000000
+2023-10-15 20:09:32,322 epoch 6 - iter 1820/2606 - loss 0.04211214 - time (sec): 133.57 - samples/sec: 1928.39 - lr: 0.000024 - momentum: 0.000000
+2023-10-15 20:09:51,338 epoch 6 - iter 2080/2606 - loss 0.04276177 - time (sec): 152.58 - samples/sec: 1919.57 - lr: 0.000023 - momentum: 0.000000
+2023-10-15 20:10:09,800 epoch 6 - iter 2340/2606 - loss 0.04345221 - time (sec): 171.04 - samples/sec: 1915.62 - lr: 0.000023 - momentum: 0.000000
+2023-10-15 20:10:29,580 epoch 6 - iter 2600/2606 - loss 0.04434131 - time (sec): 190.82 - samples/sec: 1918.41 - lr: 0.000022 - momentum: 0.000000
+2023-10-15 20:10:30,220 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:10:30,220 EPOCH 6 done: loss 0.0442 - lr: 0.000022
+2023-10-15 20:10:38,461 DEV : loss 0.360347181558609 - f1-score (micro avg) 0.3831
+2023-10-15 20:10:38,489 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:10:57,550 epoch 7 - iter 260/2606 - loss 0.03605703 - time (sec): 19.06 - samples/sec: 1967.84 - lr: 0.000022 - momentum: 0.000000
+2023-10-15 20:11:17,462 epoch 7 - iter 520/2606 - loss 0.03233024 - time (sec): 38.97 - samples/sec: 1942.69 - lr: 0.000021 - momentum: 0.000000
+2023-10-15 20:11:36,061 epoch 7 - iter 780/2606 - loss 0.03422854 - time (sec): 57.57 - samples/sec: 1930.25 - lr: 0.000021 - momentum: 0.000000
+2023-10-15 20:11:55,421 epoch 7 - iter 1040/2606 - loss 0.03509936 - time (sec): 76.93 - samples/sec: 1887.52 - lr: 0.000020 - momentum: 0.000000
+2023-10-15 20:12:15,268 epoch 7 - iter 1300/2606 - loss 0.03481716 - time (sec): 96.78 - samples/sec: 1892.73 - lr: 0.000019 - momentum: 0.000000
+2023-10-15 20:12:33,527 epoch 7 - iter 1560/2606 - loss 0.03453842 - time (sec): 115.04 - samples/sec: 1897.80 - lr: 0.000019 - momentum: 0.000000
+2023-10-15 20:12:53,305 epoch 7 - iter 1820/2606 - loss 0.03453028 - time (sec): 134.81 - samples/sec: 1910.27 - lr: 0.000018 - momentum: 0.000000
+2023-10-15 20:13:11,312 epoch 7 - iter 2080/2606 - loss 0.03351666 - time (sec): 152.82 - samples/sec: 1910.11 - lr: 0.000018 - momentum: 0.000000
+2023-10-15 20:13:30,700 epoch 7 - iter 2340/2606 - loss 0.03294077 - time (sec): 172.21 - samples/sec: 1917.56 - lr: 0.000017 - momentum: 0.000000
+2023-10-15 20:13:49,306 epoch 7 - iter 2600/2606 - loss 0.03257111 - time (sec): 190.82 - samples/sec: 1920.47 - lr: 0.000017 - momentum: 0.000000
+2023-10-15 20:13:49,775 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:13:49,775 EPOCH 7 done: loss 0.0326 - lr: 0.000017
+2023-10-15 20:13:57,989 DEV : loss 0.37199294567108154 - f1-score (micro avg) 0.3858
+2023-10-15 20:13:58,018 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:14:16,046 epoch 8 - iter 260/2606 - loss 0.02342308 - time (sec): 18.03 - samples/sec: 1889.83 - lr: 0.000016 - momentum: 0.000000
+2023-10-15 20:14:35,427 epoch 8 - iter 520/2606 - loss 0.02364228 - time (sec): 37.41 - samples/sec: 1945.44 - lr: 0.000016 - momentum: 0.000000
+2023-10-15 20:14:54,448 epoch 8 - iter 780/2606 - loss 0.02356941 - time (sec): 56.43 - samples/sec: 1926.14 - lr: 0.000015 - momentum: 0.000000
+2023-10-15 20:15:13,454 epoch 8 - iter 1040/2606 - loss 0.02491977 - time (sec): 75.44 - samples/sec: 1918.27 - lr: 0.000014 - momentum: 0.000000
+2023-10-15 20:15:32,549 epoch 8 - iter 1300/2606 - loss 0.02428659 - time (sec): 94.53 - samples/sec: 1932.58 - lr: 0.000014 - momentum: 0.000000
+2023-10-15 20:15:52,160 epoch 8 - iter 1560/2606 - loss 0.02366101 - time (sec): 114.14 - samples/sec: 1931.16 - lr: 0.000013 - momentum: 0.000000
+2023-10-15 20:16:10,784 epoch 8 - iter 1820/2606 - loss 0.02343244 - time (sec): 132.77 - samples/sec: 1938.37 - lr: 0.000013 - momentum: 0.000000
+2023-10-15 20:16:30,328 epoch 8 - iter 2080/2606 - loss 0.02351442 - time (sec): 152.31 - samples/sec: 1925.81 - lr: 0.000012 - momentum: 0.000000
+2023-10-15 20:16:49,792 epoch 8 - iter 2340/2606 - loss 0.02300749 - time (sec): 171.77 - samples/sec: 1923.52 - lr: 0.000012 - momentum: 0.000000
+2023-10-15 20:17:08,499 epoch 8 - iter 2600/2606 - loss 0.02311972 - time (sec): 190.48 - samples/sec: 1924.21 - lr: 0.000011 - momentum: 0.000000
+2023-10-15 20:17:08,968 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:17:08,968 EPOCH 8 done: loss 0.0231 - lr: 0.000011
+2023-10-15 20:17:17,209 DEV : loss 0.38435670733451843 - f1-score (micro avg) 0.3878
+2023-10-15 20:17:17,237 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:17:36,880 epoch 9 - iter 260/2606 - loss 0.01429002 - time (sec): 19.64 - samples/sec: 2021.11 - lr: 0.000011 - momentum: 0.000000
+2023-10-15 20:17:56,464 epoch 9 - iter 520/2606 - loss 0.01483258 - time (sec): 39.23 - samples/sec: 1987.77 - lr: 0.000010 - momentum: 0.000000
+2023-10-15 20:18:15,715 epoch 9 - iter 780/2606 - loss 0.01447465 - time (sec): 58.48 - samples/sec: 1973.65 - lr: 0.000009 - momentum: 0.000000
+2023-10-15 20:18:34,051 epoch 9 - iter 1040/2606 - loss 0.01472942 - time (sec): 76.81 - samples/sec: 1971.06 - lr: 0.000009 - momentum: 0.000000
+2023-10-15 20:18:52,832 epoch 9 - iter 1300/2606 - loss 0.01528011 - time (sec): 95.59 - samples/sec: 1965.67 - lr: 0.000008 - momentum: 0.000000
+2023-10-15 20:19:10,916 epoch 9 - iter 1560/2606 - loss 0.01518566 - time (sec): 113.68 - samples/sec: 1940.69 - lr: 0.000008 - momentum: 0.000000
+2023-10-15 20:19:30,000 epoch 9 - iter 1820/2606 - loss 0.01575329 - time (sec): 132.76 - samples/sec: 1940.91 - lr: 0.000007 - momentum: 0.000000
+2023-10-15 20:19:48,598 epoch 9 - iter 2080/2606 - loss 0.01545732 - time (sec): 151.36 - samples/sec: 1943.70 - lr: 0.000007 - momentum: 0.000000
+2023-10-15 20:20:07,608 epoch 9 - iter 2340/2606 - loss 0.01535373 - time (sec): 170.37 - samples/sec: 1939.88 - lr: 0.000006 - momentum: 0.000000
+2023-10-15 20:20:26,963 epoch 9 - iter 2600/2606 - loss 0.01534271 - time (sec): 189.72 - samples/sec: 1933.83 - lr: 0.000006 - momentum: 0.000000
+2023-10-15 20:20:27,294 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:20:27,295 EPOCH 9 done: loss 0.0153 - lr: 0.000006
+2023-10-15 20:20:35,652 DEV : loss 0.44299206137657166 - f1-score (micro avg) 0.3749
+2023-10-15 20:20:35,697 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:20:53,742 epoch 10 - iter 260/2606 - loss 0.01007017 - time (sec): 18.04 - samples/sec: 1966.73 - lr: 0.000005 - momentum: 0.000000
+2023-10-15 20:21:11,849 epoch 10 - iter 520/2606 - loss 0.00872718 - time (sec): 36.15 - samples/sec: 1916.23 - lr: 0.000004 - momentum: 0.000000
+2023-10-15 20:21:30,620 epoch 10 - iter 780/2606 - loss 0.00877023 - time (sec): 54.92 - samples/sec: 1938.58 - lr: 0.000004 - momentum: 0.000000
+2023-10-15 20:21:48,942 epoch 10 - iter 1040/2606 - loss 0.00900546 - time (sec): 73.24 - samples/sec: 1944.56 - lr: 0.000003 - momentum: 0.000000
+2023-10-15 20:22:07,349 epoch 10 - iter 1300/2606 - loss 0.00925451 - time (sec): 91.65 - samples/sec: 1947.96 - lr: 0.000003 - momentum: 0.000000
+2023-10-15 20:22:26,240 epoch 10 - iter 1560/2606 - loss 0.00955119 - time (sec): 110.54 - samples/sec: 1945.38 - lr: 0.000002 - momentum: 0.000000
+2023-10-15 20:22:45,017 epoch 10 - iter 1820/2606 - loss 0.00958418 - time (sec): 129.32 - samples/sec: 1949.64 - lr: 0.000002 - momentum: 0.000000
+2023-10-15 20:23:04,810 epoch 10 - iter 2080/2606 - loss 0.00978376 - time (sec): 149.11 - samples/sec: 1952.87 - lr: 0.000001 - momentum: 0.000000
+2023-10-15 20:23:23,622 epoch 10 - iter 2340/2606 - loss 0.00995251 - time (sec): 167.92 - samples/sec: 1946.93 - lr: 0.000001 - momentum: 0.000000
+2023-10-15 20:23:43,809 epoch 10 - iter 2600/2606 - loss 0.01001595 - time (sec): 188.11 - samples/sec: 1948.65 - lr: 0.000000 - momentum: 0.000000
+2023-10-15 20:23:44,199 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:23:44,200 EPOCH 10 done: loss 0.0100 - lr: 0.000000
+2023-10-15 20:23:53,258 DEV : loss 0.4734904170036316 - f1-score (micro avg) 0.3815
+2023-10-15 20:23:53,688 ----------------------------------------------------------------------------------------------------
+2023-10-15 20:23:53,689 Loading model from best epoch ...
+2023-10-15 20:23:55,224 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
+2023-10-15 20:24:11,965
+Results:
+- F-score (micro) 0.446
+- F-score (macro) 0.2767
+- Accuracy 0.2916
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.4970    0.6063    0.5462      1214
+         PER     0.4008    0.3651    0.3821       808
+         ORG     0.2091    0.1558    0.1786       353
+   HumanProd     0.0000    0.0000    0.0000        15
+
+   micro avg     0.4379    0.4544    0.4460      2390
+   macro avg     0.2767    0.2818    0.2767      2390
+weighted avg     0.4188    0.4544    0.4330      2390
+
+2023-10-15 20:24:11,966 ----------------------------------------------------------------------------------------------------