Upload ./training.log with huggingface_hub
training.log
ADDED
@@ -0,0 +1,266 @@
2024-03-26 11:28:53,846 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 11:28:53,847 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 11:28:53,847 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 Train: 758 sentences
2024-03-26 11:28:53,847 (train_with_dev=False, train_with_test=False)
2024-03-26 11:28:53,847 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 Training Params:
2024-03-26 11:28:53,847 - learning_rate: "3e-05"
2024-03-26 11:28:53,847 - mini_batch_size: "16"
2024-03-26 11:28:53,847 - max_epochs: "10"
2024-03-26 11:28:53,847 - shuffle: "True"
2024-03-26 11:28:53,847 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 Plugins:
2024-03-26 11:28:53,847 - TensorboardLogger
2024-03-26 11:28:53,847 - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 11:28:53,847 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 11:28:53,847 - metric: "('micro avg', 'f1-score')"
2024-03-26 11:28:53,847 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 Computation:
2024-03-26 11:28:53,847 - compute on device: cuda:0
2024-03-26 11:28:53,847 - embedding storage: none
2024-03-26 11:28:53,847 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 Model training base path: "flair-co-funer-german_bert_base-bs16-e10-lr3e-05-3"
2024-03-26 11:28:53,847 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 ----------------------------------------------------------------------------------------------------
2024-03-26 11:28:53,847 Logging anything other than scalars to TensorBoard is currently not supported.
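Note on the learning-rate column: the training parameters above (758 training sentences, mini_batch_size 16, max_epochs 10, learning_rate 3e-05, LinearScheduler with warmup_fraction 0.1) are enough to sketch the expected schedule. The following is a minimal sketch, assuming a linear warmup to the peak rate followed by a linear decay to zero; the function name `linear_schedule_lr` is hypothetical and this is not taken from Flair's source. The `lr` values printed in the log are rounded to six decimals, so they can only be compared approximately.

```python
import math


def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` updates (assumed shape: linear warmup, linear decay)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup: ramp from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: ramp from peak_lr down to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log header above.
train_sentences = 758
mini_batch_size = 16
max_epochs = 10
peak_lr = 3e-05

steps_per_epoch = math.ceil(train_sentences / mini_batch_size)  # matches "iter x/48"
total_steps = steps_per_epoch * max_epochs

for step in (0, 24, 48, 264, 480):
    lr = linear_schedule_lr(step, peak_lr, total_steps, 0.1)
    print(f"step {step:3d}: lr {lr:.6f}")
```

Under these assumptions the warmup spans exactly the first epoch (48 of 480 steps), which is consistent with the log reaching lr 0.000029 at the end of epoch 1 and 0.000000 at the end of epoch 10.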
2024-03-26 11:28:55,168 epoch 1 - iter 4/48 - loss 3.07964766 - time (sec): 1.32 - samples/sec: 2084.33 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:28:57,277 epoch 1 - iter 8/48 - loss 3.11459541 - time (sec): 3.43 - samples/sec: 1698.14 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:28:58,788 epoch 1 - iter 12/48 - loss 3.01247506 - time (sec): 4.94 - samples/sec: 1694.91 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:29:01,798 epoch 1 - iter 16/48 - loss 2.90095188 - time (sec): 7.95 - samples/sec: 1458.76 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:29:03,518 epoch 1 - iter 20/48 - loss 2.77027282 - time (sec): 9.67 - samples/sec: 1489.31 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:29:04,982 epoch 1 - iter 24/48 - loss 2.67348040 - time (sec): 11.13 - samples/sec: 1541.94 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:29:06,354 epoch 1 - iter 28/48 - loss 2.56554049 - time (sec): 12.51 - samples/sec: 1561.39 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:29:08,476 epoch 1 - iter 32/48 - loss 2.46046261 - time (sec): 14.63 - samples/sec: 1553.68 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:29:10,048 epoch 1 - iter 36/48 - loss 2.36040139 - time (sec): 16.20 - samples/sec: 1575.19 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:29:12,298 epoch 1 - iter 40/48 - loss 2.25501999 - time (sec): 18.45 - samples/sec: 1570.45 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:29:14,214 epoch 1 - iter 44/48 - loss 2.15905715 - time (sec): 20.37 - samples/sec: 1574.46 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:29:15,874 epoch 1 - iter 48/48 - loss 2.07940884 - time (sec): 22.03 - samples/sec: 1565.04 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:29:15,874 ----------------------------------------------------------------------------------------------------
2024-03-26 11:29:15,874 EPOCH 1 done: loss 2.0794 - lr: 0.000029
2024-03-26 11:29:16,722 DEV : loss 0.8170562386512756 - f1-score (micro avg) 0.4639
2024-03-26 11:29:16,723 saving best model
2024-03-26 11:29:16,986 ----------------------------------------------------------------------------------------------------
2024-03-26 11:29:18,437 epoch 2 - iter 4/48 - loss 1.02754611 - time (sec): 1.45 - samples/sec: 1720.69 - lr: 0.000030 - momentum: 0.000000
2024-03-26 11:29:19,922 epoch 2 - iter 8/48 - loss 0.84369377 - time (sec): 2.93 - samples/sec: 1663.09 - lr: 0.000030 - momentum: 0.000000
2024-03-26 11:29:21,389 epoch 2 - iter 12/48 - loss 0.81565184 - time (sec): 4.40 - samples/sec: 1746.21 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:29:23,337 epoch 2 - iter 16/48 - loss 0.74966979 - time (sec): 6.35 - samples/sec: 1694.05 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:29:25,667 epoch 2 - iter 20/48 - loss 0.71542244 - time (sec): 8.68 - samples/sec: 1633.66 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:29:27,750 epoch 2 - iter 24/48 - loss 0.67118474 - time (sec): 10.76 - samples/sec: 1611.61 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:29:30,603 epoch 2 - iter 28/48 - loss 0.64754457 - time (sec): 13.62 - samples/sec: 1536.84 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:29:32,828 epoch 2 - iter 32/48 - loss 0.62155564 - time (sec): 15.84 - samples/sec: 1506.72 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:29:34,583 epoch 2 - iter 36/48 - loss 0.60558902 - time (sec): 17.60 - samples/sec: 1501.29 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:29:36,341 epoch 2 - iter 40/48 - loss 0.59758423 - time (sec): 19.35 - samples/sec: 1507.28 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:29:38,656 epoch 2 - iter 44/48 - loss 0.57874089 - time (sec): 21.67 - samples/sec: 1495.35 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:29:40,218 epoch 2 - iter 48/48 - loss 0.56587701 - time (sec): 23.23 - samples/sec: 1483.88 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:29:40,218 ----------------------------------------------------------------------------------------------------
2024-03-26 11:29:40,218 EPOCH 2 done: loss 0.5659 - lr: 0.000027
2024-03-26 11:29:41,146 DEV : loss 0.3203175961971283 - f1-score (micro avg) 0.767
2024-03-26 11:29:41,147 saving best model
2024-03-26 11:29:41,560 ----------------------------------------------------------------------------------------------------
2024-03-26 11:29:43,209 epoch 3 - iter 4/48 - loss 0.37148736 - time (sec): 1.65 - samples/sec: 1486.34 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:29:46,032 epoch 3 - iter 8/48 - loss 0.31752787 - time (sec): 4.47 - samples/sec: 1281.51 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:29:47,322 epoch 3 - iter 12/48 - loss 0.31729253 - time (sec): 5.76 - samples/sec: 1414.04 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:29:48,736 epoch 3 - iter 16/48 - loss 0.29219495 - time (sec): 7.18 - samples/sec: 1538.33 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:29:50,239 epoch 3 - iter 20/48 - loss 0.29329130 - time (sec): 8.68 - samples/sec: 1557.00 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:29:53,048 epoch 3 - iter 24/48 - loss 0.28339544 - time (sec): 11.49 - samples/sec: 1454.15 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:29:55,047 epoch 3 - iter 28/48 - loss 0.28604906 - time (sec): 13.49 - samples/sec: 1469.63 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:29:57,569 epoch 3 - iter 32/48 - loss 0.27422515 - time (sec): 16.01 - samples/sec: 1424.34 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:29:59,547 epoch 3 - iter 36/48 - loss 0.27654979 - time (sec): 17.99 - samples/sec: 1422.10 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:30:01,932 epoch 3 - iter 40/48 - loss 0.26611635 - time (sec): 20.37 - samples/sec: 1402.18 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:30:04,421 epoch 3 - iter 44/48 - loss 0.27617654 - time (sec): 22.86 - samples/sec: 1392.02 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:30:06,866 epoch 3 - iter 48/48 - loss 0.26829778 - time (sec): 25.30 - samples/sec: 1362.29 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:30:06,866 ----------------------------------------------------------------------------------------------------
2024-03-26 11:30:06,866 EPOCH 3 done: loss 0.2683 - lr: 0.000023
2024-03-26 11:30:07,798 DEV : loss 0.2340826541185379 - f1-score (micro avg) 0.841
2024-03-26 11:30:07,799 saving best model
2024-03-26 11:30:08,237 ----------------------------------------------------------------------------------------------------
2024-03-26 11:30:09,655 epoch 4 - iter 4/48 - loss 0.21797697 - time (sec): 1.42 - samples/sec: 1769.28 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:30:11,585 epoch 4 - iter 8/48 - loss 0.20290756 - time (sec): 3.35 - samples/sec: 1601.65 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:30:14,300 epoch 4 - iter 12/48 - loss 0.19289447 - time (sec): 6.06 - samples/sec: 1392.35 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:30:16,254 epoch 4 - iter 16/48 - loss 0.19394381 - time (sec): 8.01 - samples/sec: 1412.90 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:30:18,739 epoch 4 - iter 20/48 - loss 0.18292450 - time (sec): 10.50 - samples/sec: 1399.42 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:30:21,660 epoch 4 - iter 24/48 - loss 0.17189080 - time (sec): 13.42 - samples/sec: 1358.68 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:30:22,837 epoch 4 - iter 28/48 - loss 0.16802568 - time (sec): 14.60 - samples/sec: 1391.56 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:30:25,937 epoch 4 - iter 32/48 - loss 0.16476876 - time (sec): 17.70 - samples/sec: 1334.53 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:30:27,713 epoch 4 - iter 36/48 - loss 0.16884874 - time (sec): 19.47 - samples/sec: 1366.23 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:30:30,604 epoch 4 - iter 40/48 - loss 0.17341879 - time (sec): 22.36 - samples/sec: 1336.25 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:30:31,516 epoch 4 - iter 44/48 - loss 0.17612870 - time (sec): 23.28 - samples/sec: 1382.42 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:30:33,007 epoch 4 - iter 48/48 - loss 0.17762477 - time (sec): 24.77 - samples/sec: 1391.84 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:30:33,007 ----------------------------------------------------------------------------------------------------
2024-03-26 11:30:33,007 EPOCH 4 done: loss 0.1776 - lr: 0.000020
2024-03-26 11:30:33,938 DEV : loss 0.21595697104930878 - f1-score (micro avg) 0.8703
2024-03-26 11:30:33,939 saving best model
2024-03-26 11:30:34,355 ----------------------------------------------------------------------------------------------------
2024-03-26 11:30:36,824 epoch 5 - iter 4/48 - loss 0.10128394 - time (sec): 2.47 - samples/sec: 1288.85 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:30:38,283 epoch 5 - iter 8/48 - loss 0.12636567 - time (sec): 3.92 - samples/sec: 1450.48 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:30:39,795 epoch 5 - iter 12/48 - loss 0.12705709 - time (sec): 5.44 - samples/sec: 1515.78 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:30:42,025 epoch 5 - iter 16/48 - loss 0.12982090 - time (sec): 7.67 - samples/sec: 1435.07 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:30:44,163 epoch 5 - iter 20/48 - loss 0.14095353 - time (sec): 9.81 - samples/sec: 1436.03 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:30:46,773 epoch 5 - iter 24/48 - loss 0.13087999 - time (sec): 12.42 - samples/sec: 1417.95 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:30:49,376 epoch 5 - iter 28/48 - loss 0.12372680 - time (sec): 15.02 - samples/sec: 1401.54 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:30:51,372 epoch 5 - iter 32/48 - loss 0.12363549 - time (sec): 17.01 - samples/sec: 1399.62 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:30:53,267 epoch 5 - iter 36/48 - loss 0.12028316 - time (sec): 18.91 - samples/sec: 1398.55 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:30:55,711 epoch 5 - iter 40/48 - loss 0.12087516 - time (sec): 21.35 - samples/sec: 1383.09 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:30:57,735 epoch 5 - iter 44/48 - loss 0.12372093 - time (sec): 23.38 - samples/sec: 1381.72 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:30:58,831 epoch 5 - iter 48/48 - loss 0.12208489 - time (sec): 24.47 - samples/sec: 1408.53 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:30:58,832 ----------------------------------------------------------------------------------------------------
2024-03-26 11:30:58,832 EPOCH 5 done: loss 0.1221 - lr: 0.000017
2024-03-26 11:30:59,770 DEV : loss 0.1940278708934784 - f1-score (micro avg) 0.8874
2024-03-26 11:30:59,771 saving best model
2024-03-26 11:31:00,182 ----------------------------------------------------------------------------------------------------
2024-03-26 11:31:02,827 epoch 6 - iter 4/48 - loss 0.07889472 - time (sec): 2.64 - samples/sec: 1202.24 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:31:04,838 epoch 6 - iter 8/48 - loss 0.08598492 - time (sec): 4.65 - samples/sec: 1261.32 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:31:06,425 epoch 6 - iter 12/48 - loss 0.09225964 - time (sec): 6.24 - samples/sec: 1413.33 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:31:08,439 epoch 6 - iter 16/48 - loss 0.08541448 - time (sec): 8.26 - samples/sec: 1408.53 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:31:09,557 epoch 6 - iter 20/48 - loss 0.08708355 - time (sec): 9.37 - samples/sec: 1490.55 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:31:11,470 epoch 6 - iter 24/48 - loss 0.08587520 - time (sec): 11.29 - samples/sec: 1480.88 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:31:12,641 epoch 6 - iter 28/48 - loss 0.08689022 - time (sec): 12.46 - samples/sec: 1526.88 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:31:14,427 epoch 6 - iter 32/48 - loss 0.08360013 - time (sec): 14.24 - samples/sec: 1546.66 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:31:16,926 epoch 6 - iter 36/48 - loss 0.09308529 - time (sec): 16.74 - samples/sec: 1516.61 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:31:19,044 epoch 6 - iter 40/48 - loss 0.08998376 - time (sec): 18.86 - samples/sec: 1506.52 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:31:20,955 epoch 6 - iter 44/48 - loss 0.09427633 - time (sec): 20.77 - samples/sec: 1514.39 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:31:22,622 epoch 6 - iter 48/48 - loss 0.09601652 - time (sec): 22.44 - samples/sec: 1536.29 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:31:22,622 ----------------------------------------------------------------------------------------------------
2024-03-26 11:31:22,622 EPOCH 6 done: loss 0.0960 - lr: 0.000014
2024-03-26 11:31:23,637 DEV : loss 0.16633208096027374 - f1-score (micro avg) 0.9057
2024-03-26 11:31:23,638 saving best model
2024-03-26 11:31:24,064 ----------------------------------------------------------------------------------------------------
2024-03-26 11:31:26,319 epoch 7 - iter 4/48 - loss 0.09949447 - time (sec): 2.25 - samples/sec: 1227.36 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:31:28,067 epoch 7 - iter 8/48 - loss 0.08834209 - time (sec): 4.00 - samples/sec: 1436.31 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:31:30,171 epoch 7 - iter 12/48 - loss 0.07538884 - time (sec): 6.10 - samples/sec: 1403.52 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:31:32,848 epoch 7 - iter 16/48 - loss 0.07115551 - time (sec): 8.78 - samples/sec: 1345.96 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:31:35,609 epoch 7 - iter 20/48 - loss 0.07329202 - time (sec): 11.54 - samples/sec: 1354.16 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:31:37,148 epoch 7 - iter 24/48 - loss 0.07469969 - time (sec): 13.08 - samples/sec: 1377.79 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:31:39,322 epoch 7 - iter 28/48 - loss 0.07052829 - time (sec): 15.25 - samples/sec: 1394.57 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:31:41,517 epoch 7 - iter 32/48 - loss 0.07200985 - time (sec): 17.45 - samples/sec: 1401.00 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:31:43,783 epoch 7 - iter 36/48 - loss 0.07551991 - time (sec): 19.72 - samples/sec: 1388.47 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:31:45,405 epoch 7 - iter 40/48 - loss 0.07178417 - time (sec): 21.34 - samples/sec: 1397.27 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:31:47,155 epoch 7 - iter 44/48 - loss 0.07612817 - time (sec): 23.09 - samples/sec: 1412.76 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:31:48,494 epoch 7 - iter 48/48 - loss 0.07589320 - time (sec): 24.43 - samples/sec: 1411.18 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:31:48,495 ----------------------------------------------------------------------------------------------------
2024-03-26 11:31:48,495 EPOCH 7 done: loss 0.0759 - lr: 0.000010
2024-03-26 11:31:49,434 DEV : loss 0.1753850281238556 - f1-score (micro avg) 0.9099
2024-03-26 11:31:49,435 saving best model
2024-03-26 11:31:49,852 ----------------------------------------------------------------------------------------------------
2024-03-26 11:31:52,193 epoch 8 - iter 4/48 - loss 0.04971259 - time (sec): 2.34 - samples/sec: 1256.69 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:31:54,792 epoch 8 - iter 8/48 - loss 0.04272157 - time (sec): 4.94 - samples/sec: 1339.53 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:31:56,850 epoch 8 - iter 12/48 - loss 0.04580288 - time (sec): 7.00 - samples/sec: 1315.34 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:31:58,964 epoch 8 - iter 16/48 - loss 0.04476270 - time (sec): 9.11 - samples/sec: 1316.14 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:32:00,486 epoch 8 - iter 20/48 - loss 0.04843952 - time (sec): 10.63 - samples/sec: 1344.78 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:32:02,983 epoch 8 - iter 24/48 - loss 0.04775011 - time (sec): 13.13 - samples/sec: 1320.97 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:32:05,218 epoch 8 - iter 28/48 - loss 0.04750025 - time (sec): 15.36 - samples/sec: 1311.92 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:32:07,589 epoch 8 - iter 32/48 - loss 0.05742224 - time (sec): 17.73 - samples/sec: 1323.77 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:32:10,891 epoch 8 - iter 36/48 - loss 0.05911377 - time (sec): 21.04 - samples/sec: 1275.71 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:32:12,982 epoch 8 - iter 40/48 - loss 0.06240256 - time (sec): 23.13 - samples/sec: 1279.67 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:32:13,802 epoch 8 - iter 44/48 - loss 0.06115800 - time (sec): 23.95 - samples/sec: 1326.62 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:32:15,674 epoch 8 - iter 48/48 - loss 0.06146428 - time (sec): 25.82 - samples/sec: 1335.10 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:32:15,674 ----------------------------------------------------------------------------------------------------
2024-03-26 11:32:15,674 EPOCH 8 done: loss 0.0615 - lr: 0.000007
2024-03-26 11:32:16,618 DEV : loss 0.1729622334241867 - f1-score (micro avg) 0.9217
2024-03-26 11:32:16,619 saving best model
2024-03-26 11:32:17,053 ----------------------------------------------------------------------------------------------------
2024-03-26 11:32:19,812 epoch 9 - iter 4/48 - loss 0.02513346 - time (sec): 2.76 - samples/sec: 1195.60 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:32:21,527 epoch 9 - iter 8/48 - loss 0.03936410 - time (sec): 4.47 - samples/sec: 1282.75 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:32:23,745 epoch 9 - iter 12/48 - loss 0.04345948 - time (sec): 6.69 - samples/sec: 1341.51 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:32:25,919 epoch 9 - iter 16/48 - loss 0.04929510 - time (sec): 8.87 - samples/sec: 1365.99 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:32:28,305 epoch 9 - iter 20/48 - loss 0.04329415 - time (sec): 11.25 - samples/sec: 1344.62 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:32:30,275 epoch 9 - iter 24/48 - loss 0.04316844 - time (sec): 13.22 - samples/sec: 1339.34 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:32:33,584 epoch 9 - iter 28/48 - loss 0.04521706 - time (sec): 16.53 - samples/sec: 1292.32 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:32:35,023 epoch 9 - iter 32/48 - loss 0.04632784 - time (sec): 17.97 - samples/sec: 1329.46 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:32:37,559 epoch 9 - iter 36/48 - loss 0.04576751 - time (sec): 20.51 - samples/sec: 1314.24 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:32:39,087 epoch 9 - iter 40/48 - loss 0.04672450 - time (sec): 22.03 - samples/sec: 1330.69 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:32:40,686 epoch 9 - iter 44/48 - loss 0.05077131 - time (sec): 23.63 - samples/sec: 1346.09 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:32:42,147 epoch 9 - iter 48/48 - loss 0.05019774 - time (sec): 25.09 - samples/sec: 1373.75 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:32:42,147 ----------------------------------------------------------------------------------------------------
2024-03-26 11:32:42,147 EPOCH 9 done: loss 0.0502 - lr: 0.000004
2024-03-26 11:32:43,109 DEV : loss 0.1699032187461853 - f1-score (micro avg) 0.9231
2024-03-26 11:32:43,110 saving best model
2024-03-26 11:32:43,558 ----------------------------------------------------------------------------------------------------
2024-03-26 11:32:45,998 epoch 10 - iter 4/48 - loss 0.03662752 - time (sec): 2.44 - samples/sec: 1349.96 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:32:47,962 epoch 10 - iter 8/48 - loss 0.03037126 - time (sec): 4.40 - samples/sec: 1327.43 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:32:49,206 epoch 10 - iter 12/48 - loss 0.04628769 - time (sec): 5.65 - samples/sec: 1477.30 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:32:50,823 epoch 10 - iter 16/48 - loss 0.04875219 - time (sec): 7.26 - samples/sec: 1544.68 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:32:52,553 epoch 10 - iter 20/48 - loss 0.04803083 - time (sec): 8.99 - samples/sec: 1587.29 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:32:54,697 epoch 10 - iter 24/48 - loss 0.04545013 - time (sec): 11.14 - samples/sec: 1530.21 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:32:56,857 epoch 10 - iter 28/48 - loss 0.04259668 - time (sec): 13.30 - samples/sec: 1500.05 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:32:59,062 epoch 10 - iter 32/48 - loss 0.04332711 - time (sec): 15.50 - samples/sec: 1504.92 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:33:00,477 epoch 10 - iter 36/48 - loss 0.04375102 - time (sec): 16.92 - samples/sec: 1507.29 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:33:03,190 epoch 10 - iter 40/48 - loss 0.04177886 - time (sec): 19.63 - samples/sec: 1470.02 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:33:05,735 epoch 10 - iter 44/48 - loss 0.04534989 - time (sec): 22.18 - samples/sec: 1449.98 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:33:07,424 epoch 10 - iter 48/48 - loss 0.04586777 - time (sec): 23.87 - samples/sec: 1444.45 - lr: 0.000000 - momentum: 0.000000
2024-03-26 11:33:07,425 ----------------------------------------------------------------------------------------------------
2024-03-26 11:33:07,425 EPOCH 10 done: loss 0.0459 - lr: 0.000000
2024-03-26 11:33:08,375 DEV : loss 0.17032475769519806 - f1-score (micro avg) 0.9201
2024-03-26 11:33:08,648 ----------------------------------------------------------------------------------------------------
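Note on best-epoch selection: epoch 10 is the only epoch without a "saving best model" line, because its dev F1 (0.9201) falls below the epoch 9 score (0.9231). A minimal sketch of how the per-epoch dev scores could be pulled out of a log like this one follows. The regular expressions and the helper name `dev_scores` are assumptions written against the line shapes visible in this file, not a Flair API, and the three sample epochs are copied from the log above.

```python
import re

# Patterns written against the "EPOCH n done" and "DEV :" lines of this log.
EPOCH_RE = re.compile(r"EPOCH (\d+) done: loss ([\d.]+)")
DEV_RE = re.compile(r"DEV : loss ([\d.]+) - f1-score \(micro avg\)\s+([\d.]+)")


def dev_scores(lines):
    """Map epoch number to dev micro-F1, pairing each DEV line with the preceding EPOCH line."""
    scores = {}
    epoch = None
    for line in lines:
        epoch_match = EPOCH_RE.search(line)
        if epoch_match:
            epoch = int(epoch_match.group(1))
            continue
        dev_match = DEV_RE.search(line)
        if dev_match and epoch is not None:
            scores[epoch] = float(dev_match.group(2))
    return scores


sample = [
    "2024-03-26 11:32:15,674 EPOCH 8 done: loss 0.0615 - lr: 0.000007",
    "2024-03-26 11:32:16,618 DEV : loss 0.1729622334241867 - f1-score (micro avg) 0.9217",
    "2024-03-26 11:32:42,147 EPOCH 9 done: loss 0.0502 - lr: 0.000004",
    "2024-03-26 11:32:43,109 DEV : loss 0.1699032187461853 - f1-score (micro avg) 0.9231",
    "2024-03-26 11:33:07,425 EPOCH 10 done: loss 0.0459 - lr: 0.000000",
    "2024-03-26 11:33:08,375 DEV : loss 0.17032475769519806 - f1-score (micro avg) 0.9201",
]

scores = dev_scores(sample)
best_epoch = max(scores, key=scores.get)
print(scores, "best epoch:", best_epoch)
```

To process the whole file, the same function could be given the lines of training.log instead of the sample list.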
2024-03-26 11:33:08,648 Loading model from best epoch ...
2024-03-26 11:33:09,467 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
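Note on the tag count: the 17 tags listed above are one `O` tag plus the four span prefixes S, B, E and I for each of the four entity types, which matches the `out_features=17` of the final linear layer in the model printout. The sketch below only illustrates that arithmetic; the helper name `bioes_tags` is hypothetical and this is not how Flair builds its dictionary internally.

```python
def bioes_tags(entity_types):
    """Build a tag list in the order shown in the log: O, then S/B/E/I per entity type."""
    tags = ["O"]
    for entity in entity_types:
        tags.extend(f"{prefix}-{entity}" for prefix in ("S", "B", "E", "I"))
    return tags


# Entity types as they appear in the log (German labels kept as-is).
tags = bioes_tags(["Unternehmen", "Auslagerung", "Ort", "Software"])
print(len(tags), tags)
```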
2024-03-26 11:33:10,236
Results:
- F-score (micro) 0.899
- F-score (macro) 0.6855
- Accuracy 0.8188

By class:
              precision    recall  f1-score   support

 Unternehmen     0.8826    0.8759    0.8792       266
 Auslagerung     0.8496    0.9076    0.8777       249
         Ort     0.9779    0.9925    0.9852       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.8862    0.9122    0.8990       649
   macro avg     0.6775    0.6940    0.6855       649
weighted avg     0.8896    0.9122    0.9005       649

2024-03-26 11:33:10,236 ----------------------------------------------------------------------------------------------------
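Note on the gap between micro and macro F-score: the Software class has zero support in the test split and is scored 0.0000, and the macro average weights all four classes equally, so that row pulls the macro figure down to 0.6855 while the micro and weighted averages stay near 0.90. The sketch below recomputes the averages from the rounded per-class rows of the table. It is an illustration only: results can differ in the last digit from a computation on raw counts, and the variable names are hypothetical.

```python
# (precision, recall, f1, support) per class, copied from the table above.
per_class = {
    "Unternehmen": (0.8826, 0.8759, 0.8792, 266),
    "Auslagerung": (0.8496, 0.9076, 0.8777, 249),
    "Ort": (0.9779, 0.9925, 0.9852, 134),
    "Software": (0.0000, 0.0000, 0.0000, 0),
}

rows = list(per_class.values())
total_support = sum(row[3] for row in rows)

# Macro average: unweighted mean over all classes, including the empty one.
macro_f1 = sum(row[2] for row in rows) / len(rows)

# Weighted average: mean weighted by support, so the empty class contributes nothing.
weighted_f1 = sum(row[2] * row[3] for row in rows) / total_support

# Macro average restricted to classes that occur in the test split.
supported = [row for row in rows if row[3] > 0]
macro_f1_supported = sum(row[2] for row in supported) / len(supported)

# Micro F1 as the harmonic mean of the reported micro precision and recall.
micro_p, micro_r = 0.8862, 0.9122
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro F1 (all classes):      {macro_f1:.4f}")
print(f"macro F1 (supported only):   {macro_f1_supported:.4f}")
print(f"weighted F1:                 {weighted_f1:.4f}")
print(f"micro F1:                    {micro_f1:.4f}")
```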