Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697539067.bce904bcef33.2023.0 +3 -0
- test.tsv +0 -0
- training.log +237 -0
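The uploaded folder can be pulled back down with the same library. A minimal sketch, assuming `huggingface_hub` is installed; the repository id below is a placeholder, not the actual repo name:

```python
# Hypothetical sketch: download this upload locally with huggingface_hub.
# "user/repo-name" is a placeholder for the actual repository id.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="user/repo-name")
print(local_dir)  # folder containing best-model.pt, loss.tsv, training.log, runs/, ...
```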
best-model.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6665bdf1354c734899cad0677ba13d4a117896976079a585c16b5fae9224232d
+size 440941957
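best-model.pt is stored through Git LFS, so the diff above only shows the pointer (hash and size, roughly 441 MB). Once the file is available locally (for example via the snapshot_download sketch above or `git lfs pull`), it can be loaded with Flair. A minimal sketch, assuming Flair is installed; the local path and the example sentence are made up:

```python
# Hypothetical sketch: load the Flair checkpoint and tag one sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("best-model.pt")  # path to the downloaded file

sentence = Sentence("Le journal de Lyon publie une lettre de M. Dupont .")
tagger.predict(sentence)

# Entity spans use the PER / LOC / ORG labels listed in training.log below.
for span in sentence.get_spans("ner"):
    print(span)
```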
dev.tsv ADDED
The diff for this file is too large to render. See raw diff.
loss.tsv ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      10:39:22   0.0000         0.3695      0.0982    0.6690         0.7681      0.7151  0.5764
+2      10:40:58   0.0000         0.1142      0.0969    0.7367         0.7658      0.7510  0.6143
+3      10:42:35   0.0000         0.0841      0.1118    0.7370         0.7828      0.7592  0.6297
+4      10:44:15   0.0000         0.0653      0.1471    0.7637         0.7896      0.7764  0.6499
+5      10:45:50   0.0000         0.0493      0.2094    0.7414         0.7817      0.7610  0.6293
+6      10:47:25   0.0000         0.0380      0.2023    0.7517         0.7704      0.7609  0.6294
+7      10:49:02   0.0000         0.0282      0.2101    0.7382         0.7817      0.7593  0.6265
+8      10:50:35   0.0000         0.0191      0.2338    0.7479         0.7885      0.7676  0.6389
+9      10:52:10   0.0000         0.0132      0.2385    0.7593         0.7851      0.7720  0.6420
+10     10:53:45   0.0000         0.0095      0.2461    0.7492         0.7873      0.7678  0.6379
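loss.tsv is Flair's per-epoch training summary; the file itself is tab-separated even though the diff view collapses the whitespace. A minimal sketch for inspecting it, assuming pandas is installed and the file was downloaded next to the script:

```python
# Hypothetical sketch: find the epoch with the best dev F1 in loss.tsv.
import pandas as pd

df = pd.read_csv("loss.tsv", sep="\t")  # Flair writes loss.tsv tab-separated

best = df.loc[df["DEV_F1"].idxmax()]
print(f"best dev F1 {best['DEV_F1']:.4f} at epoch {int(best['EPOCH'])}")  # epoch 4 here

# Optional: compare train vs. dev loss across epochs.
# df.plot(x="EPOCH", y=["TRAIN_LOSS", "DEV_LOSS"])
```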
runs/events.out.tfevents.1697539067.bce904bcef33.2023.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cce1a3dfd2895bafcf0df839fc1e9ea292721d061dc71232c5b39e0f7873a1c9
+size 1108164
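This runs/ file is the TensorBoard event log written by the TensorboardLogger plugin listed in training.log. Besides pointing `tensorboard --logdir runs` at the downloaded folder, the scalars can be read programmatically. A minimal sketch, assuming the tensorboard package is installed; the scalar tag names are whatever the logger wrote, so the code lists them rather than assuming any:

```python
# Hypothetical sketch: read the scalar series from the TensorBoard event file.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

ea = EventAccumulator("runs")  # directory (or full path of the events file)
ea.Reload()

scalar_tags = ea.Tags()["scalars"]
print(scalar_tags)

for tag in scalar_tags:
    for event in ea.Scalars(tag):
        print(tag, event.step, event.value)
```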
test.tsv ADDED
The diff for this file is too large to render. See raw diff.
training.log ADDED
@@ -0,0 +1,237 @@
+2023-10-17 10:37:47,590 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,591 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): ElectraModel(
+      (embeddings): ElectraEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): ElectraEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x ElectraLayer(
+            (attention): ElectraAttention(
+              (self): ElectraSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): ElectraSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): ElectraIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): ElectraOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-17 10:37:47,591 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,591 MultiCorpus: 7936 train + 992 dev + 992 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
+2023-10-17 10:37:47,591 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,591 Train: 7936 sentences
+2023-10-17 10:37:47,591 (train_with_dev=False, train_with_test=False)
+2023-10-17 10:37:47,592 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,592 Training Params:
+2023-10-17 10:37:47,592 - learning_rate: "3e-05"
+2023-10-17 10:37:47,592 - mini_batch_size: "4"
+2023-10-17 10:37:47,592 - max_epochs: "10"
+2023-10-17 10:37:47,592 - shuffle: "True"
+2023-10-17 10:37:47,592 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,592 Plugins:
+2023-10-17 10:37:47,592 - TensorboardLogger
+2023-10-17 10:37:47,592 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-17 10:37:47,592 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,592 Final evaluation on model from best epoch (best-model.pt)
+2023-10-17 10:37:47,592 - metric: "('micro avg', 'f1-score')"
+2023-10-17 10:37:47,592 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,592 Computation:
+2023-10-17 10:37:47,592 - compute on device: cuda:0
+2023-10-17 10:37:47,592 - embedding storage: none
+2023-10-17 10:37:47,592 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,592 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+2023-10-17 10:37:47,592 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,592 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:37:47,592 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-17 10:37:57,473 epoch 1 - iter 198/1984 - loss 1.96262558 - time (sec): 9.88 - samples/sec: 1637.74 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 10:38:06,276 epoch 1 - iter 396/1984 - loss 1.14353968 - time (sec): 18.68 - samples/sec: 1779.11 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 10:38:15,247 epoch 1 - iter 594/1984 - loss 0.85110538 - time (sec): 27.65 - samples/sec: 1794.33 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 10:38:23,912 epoch 1 - iter 792/1984 - loss 0.69733723 - time (sec): 36.32 - samples/sec: 1800.50 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 10:38:32,916 epoch 1 - iter 990/1984 - loss 0.59220786 - time (sec): 45.32 - samples/sec: 1805.64 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 10:38:41,867 epoch 1 - iter 1188/1984 - loss 0.51720323 - time (sec): 54.27 - samples/sec: 1817.61 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 10:38:50,819 epoch 1 - iter 1386/1984 - loss 0.46529206 - time (sec): 63.23 - samples/sec: 1815.63 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 10:38:59,939 epoch 1 - iter 1584/1984 - loss 0.42634007 - time (sec): 72.35 - samples/sec: 1814.85 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 10:39:09,082 epoch 1 - iter 1782/1984 - loss 0.39543817 - time (sec): 81.49 - samples/sec: 1806.96 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 10:39:18,133 epoch 1 - iter 1980/1984 - loss 0.37002480 - time (sec): 90.54 - samples/sec: 1807.55 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 10:39:18,315 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:39:18,315 EPOCH 1 done: loss 0.3695 - lr: 0.000030
+2023-10-17 10:39:22,176 DEV : loss 0.09817007929086685 - f1-score (micro avg) 0.7151
+2023-10-17 10:39:22,205 saving best model
+2023-10-17 10:39:22,659 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:39:32,314 epoch 2 - iter 198/1984 - loss 0.11703805 - time (sec): 9.65 - samples/sec: 1751.29 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 10:39:41,399 epoch 2 - iter 396/1984 - loss 0.12135474 - time (sec): 18.74 - samples/sec: 1767.42 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 10:39:50,558 epoch 2 - iter 594/1984 - loss 0.12116589 - time (sec): 27.90 - samples/sec: 1778.79 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 10:39:59,873 epoch 2 - iter 792/1984 - loss 0.11894017 - time (sec): 37.21 - samples/sec: 1785.54 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 10:40:08,857 epoch 2 - iter 990/1984 - loss 0.11639961 - time (sec): 46.20 - samples/sec: 1787.34 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 10:40:18,317 epoch 2 - iter 1188/1984 - loss 0.11443201 - time (sec): 55.66 - samples/sec: 1784.94 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 10:40:27,227 epoch 2 - iter 1386/1984 - loss 0.11365206 - time (sec): 64.57 - samples/sec: 1796.15 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 10:40:36,279 epoch 2 - iter 1584/1984 - loss 0.11390623 - time (sec): 73.62 - samples/sec: 1788.69 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 10:40:45,274 epoch 2 - iter 1782/1984 - loss 0.11339437 - time (sec): 82.61 - samples/sec: 1782.61 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 10:40:54,345 epoch 2 - iter 1980/1984 - loss 0.11417783 - time (sec): 91.68 - samples/sec: 1784.96 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 10:40:54,529 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:40:54,529 EPOCH 2 done: loss 0.1142 - lr: 0.000027
+2023-10-17 10:40:58,257 DEV : loss 0.09690196067094803 - f1-score (micro avg) 0.751
+2023-10-17 10:40:58,286 saving best model
+2023-10-17 10:40:58,864 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:41:08,525 epoch 3 - iter 198/1984 - loss 0.08243740 - time (sec): 9.66 - samples/sec: 1715.82 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 10:41:17,969 epoch 3 - iter 396/1984 - loss 0.08405210 - time (sec): 19.10 - samples/sec: 1720.39 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 10:41:28,471 epoch 3 - iter 594/1984 - loss 0.08612073 - time (sec): 29.60 - samples/sec: 1675.84 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 10:41:37,701 epoch 3 - iter 792/1984 - loss 0.08283051 - time (sec): 38.83 - samples/sec: 1710.41 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 10:41:46,586 epoch 3 - iter 990/1984 - loss 0.08274479 - time (sec): 47.72 - samples/sec: 1733.22 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 10:41:55,624 epoch 3 - iter 1188/1984 - loss 0.08251098 - time (sec): 56.76 - samples/sec: 1734.90 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 10:42:04,447 epoch 3 - iter 1386/1984 - loss 0.08323712 - time (sec): 65.58 - samples/sec: 1750.52 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 10:42:13,596 epoch 3 - iter 1584/1984 - loss 0.08296027 - time (sec): 74.73 - samples/sec: 1755.28 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 10:42:22,756 epoch 3 - iter 1782/1984 - loss 0.08247111 - time (sec): 83.89 - samples/sec: 1763.01 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 10:42:31,763 epoch 3 - iter 1980/1984 - loss 0.08393188 - time (sec): 92.90 - samples/sec: 1761.21 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 10:42:31,952 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:42:31,952 EPOCH 3 done: loss 0.0841 - lr: 0.000023
+2023-10-17 10:42:35,645 DEV : loss 0.11184526234865189 - f1-score (micro avg) 0.7592
+2023-10-17 10:42:35,669 saving best model
+2023-10-17 10:42:36,275 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:42:45,911 epoch 4 - iter 198/1984 - loss 0.06238332 - time (sec): 9.63 - samples/sec: 1704.13 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 10:42:55,157 epoch 4 - iter 396/1984 - loss 0.06005836 - time (sec): 18.88 - samples/sec: 1808.83 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 10:43:04,798 epoch 4 - iter 594/1984 - loss 0.06300783 - time (sec): 28.52 - samples/sec: 1785.86 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 10:43:15,294 epoch 4 - iter 792/1984 - loss 0.06418959 - time (sec): 39.02 - samples/sec: 1728.50 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 10:43:25,402 epoch 4 - iter 990/1984 - loss 0.06410527 - time (sec): 49.12 - samples/sec: 1703.66 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 10:43:34,556 epoch 4 - iter 1188/1984 - loss 0.06627853 - time (sec): 58.28 - samples/sec: 1704.06 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 10:43:43,828 epoch 4 - iter 1386/1984 - loss 0.06493629 - time (sec): 67.55 - samples/sec: 1701.29 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 10:43:53,244 epoch 4 - iter 1584/1984 - loss 0.06437724 - time (sec): 76.97 - samples/sec: 1702.26 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 10:44:02,367 epoch 4 - iter 1782/1984 - loss 0.06554549 - time (sec): 86.09 - samples/sec: 1715.79 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 10:44:11,335 epoch 4 - iter 1980/1984 - loss 0.06532216 - time (sec): 95.06 - samples/sec: 1722.76 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 10:44:11,528 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:44:11,528 EPOCH 4 done: loss 0.0653 - lr: 0.000020
+2023-10-17 10:44:15,291 DEV : loss 0.14710469543933868 - f1-score (micro avg) 0.7764
+2023-10-17 10:44:15,314 saving best model
+2023-10-17 10:44:15,834 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:44:24,991 epoch 5 - iter 198/1984 - loss 0.05310506 - time (sec): 9.16 - samples/sec: 1776.04 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 10:44:33,692 epoch 5 - iter 396/1984 - loss 0.05550943 - time (sec): 17.86 - samples/sec: 1876.49 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 10:44:42,752 epoch 5 - iter 594/1984 - loss 0.05128920 - time (sec): 26.92 - samples/sec: 1865.39 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 10:44:51,952 epoch 5 - iter 792/1984 - loss 0.05178507 - time (sec): 36.12 - samples/sec: 1875.12 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 10:45:01,073 epoch 5 - iter 990/1984 - loss 0.05015438 - time (sec): 45.24 - samples/sec: 1860.06 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 10:45:10,073 epoch 5 - iter 1188/1984 - loss 0.05003515 - time (sec): 54.24 - samples/sec: 1843.52 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 10:45:19,019 epoch 5 - iter 1386/1984 - loss 0.05017555 - time (sec): 63.18 - samples/sec: 1842.13 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 10:45:27,581 epoch 5 - iter 1584/1984 - loss 0.05031975 - time (sec): 71.75 - samples/sec: 1842.93 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 10:45:36,697 epoch 5 - iter 1782/1984 - loss 0.05007302 - time (sec): 80.86 - samples/sec: 1832.85 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 10:45:46,443 epoch 5 - iter 1980/1984 - loss 0.04916707 - time (sec): 90.61 - samples/sec: 1805.91 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 10:45:46,629 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:45:46,629 EPOCH 5 done: loss 0.0493 - lr: 0.000017
+2023-10-17 10:45:50,160 DEV : loss 0.20937786996364594 - f1-score (micro avg) 0.761
+2023-10-17 10:45:50,185 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:46:00,578 epoch 6 - iter 198/1984 - loss 0.03889353 - time (sec): 10.39 - samples/sec: 1582.61 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 10:46:09,604 epoch 6 - iter 396/1984 - loss 0.04124372 - time (sec): 19.42 - samples/sec: 1681.97 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 10:46:18,836 epoch 6 - iter 594/1984 - loss 0.04058583 - time (sec): 28.65 - samples/sec: 1753.98 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 10:46:27,818 epoch 6 - iter 792/1984 - loss 0.04104349 - time (sec): 37.63 - samples/sec: 1770.74 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 10:46:36,799 epoch 6 - iter 990/1984 - loss 0.03959461 - time (sec): 46.61 - samples/sec: 1789.31 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 10:46:45,771 epoch 6 - iter 1188/1984 - loss 0.03893122 - time (sec): 55.58 - samples/sec: 1793.74 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 10:46:54,632 epoch 6 - iter 1386/1984 - loss 0.03835506 - time (sec): 64.45 - samples/sec: 1789.43 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 10:47:03,663 epoch 6 - iter 1584/1984 - loss 0.03800712 - time (sec): 73.48 - samples/sec: 1785.14 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 10:47:12,792 epoch 6 - iter 1782/1984 - loss 0.03800912 - time (sec): 82.61 - samples/sec: 1783.20 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 10:47:21,770 epoch 6 - iter 1980/1984 - loss 0.03804788 - time (sec): 91.58 - samples/sec: 1786.04 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 10:47:21,950 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:47:21,950 EPOCH 6 done: loss 0.0380 - lr: 0.000013
+2023-10-17 10:47:25,502 DEV : loss 0.20233358442783356 - f1-score (micro avg) 0.7609
+2023-10-17 10:47:25,529 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:47:35,922 epoch 7 - iter 198/1984 - loss 0.02769256 - time (sec): 10.39 - samples/sec: 1568.34 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 10:47:45,345 epoch 7 - iter 396/1984 - loss 0.02809458 - time (sec): 19.81 - samples/sec: 1656.25 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 10:47:54,754 epoch 7 - iter 594/1984 - loss 0.02925218 - time (sec): 29.22 - samples/sec: 1696.02 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 10:48:03,765 epoch 7 - iter 792/1984 - loss 0.02875332 - time (sec): 38.23 - samples/sec: 1717.27 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 10:48:12,807 epoch 7 - iter 990/1984 - loss 0.02726148 - time (sec): 47.28 - samples/sec: 1733.61 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 10:48:21,730 epoch 7 - iter 1188/1984 - loss 0.02657124 - time (sec): 56.20 - samples/sec: 1751.52 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 10:48:30,737 epoch 7 - iter 1386/1984 - loss 0.02702982 - time (sec): 65.21 - samples/sec: 1759.96 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 10:48:39,970 epoch 7 - iter 1584/1984 - loss 0.02650602 - time (sec): 74.44 - samples/sec: 1750.60 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 10:48:49,162 epoch 7 - iter 1782/1984 - loss 0.02844952 - time (sec): 83.63 - samples/sec: 1762.76 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 10:48:58,295 epoch 7 - iter 1980/1984 - loss 0.02825356 - time (sec): 92.76 - samples/sec: 1764.63 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 10:48:58,477 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:48:58,477 EPOCH 7 done: loss 0.0282 - lr: 0.000010
+2023-10-17 10:49:01,991 DEV : loss 0.21014443039894104 - f1-score (micro avg) 0.7593
+2023-10-17 10:49:02,014 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:49:10,679 epoch 8 - iter 198/1984 - loss 0.01310734 - time (sec): 8.66 - samples/sec: 1889.48 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 10:49:19,332 epoch 8 - iter 396/1984 - loss 0.01617564 - time (sec): 17.32 - samples/sec: 1870.85 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 10:49:28,226 epoch 8 - iter 594/1984 - loss 0.01587113 - time (sec): 26.21 - samples/sec: 1908.46 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 10:49:36,913 epoch 8 - iter 792/1984 - loss 0.01491232 - time (sec): 34.90 - samples/sec: 1893.99 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 10:49:45,623 epoch 8 - iter 990/1984 - loss 0.01558421 - time (sec): 43.61 - samples/sec: 1907.85 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 10:49:54,766 epoch 8 - iter 1188/1984 - loss 0.01606079 - time (sec): 52.75 - samples/sec: 1894.03 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 10:50:03,951 epoch 8 - iter 1386/1984 - loss 0.01694365 - time (sec): 61.94 - samples/sec: 1866.52 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 10:50:13,283 epoch 8 - iter 1584/1984 - loss 0.01809506 - time (sec): 71.27 - samples/sec: 1832.48 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 10:50:22,600 epoch 8 - iter 1782/1984 - loss 0.01857775 - time (sec): 80.58 - samples/sec: 1826.78 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 10:50:31,758 epoch 8 - iter 1980/1984 - loss 0.01906275 - time (sec): 89.74 - samples/sec: 1823.40 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 10:50:31,947 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:50:31,947 EPOCH 8 done: loss 0.0191 - lr: 0.000007
+2023-10-17 10:50:35,465 DEV : loss 0.2338264435529709 - f1-score (micro avg) 0.7676
+2023-10-17 10:50:35,489 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:50:44,512 epoch 9 - iter 198/1984 - loss 0.01453751 - time (sec): 9.02 - samples/sec: 1755.79 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 10:50:53,683 epoch 9 - iter 396/1984 - loss 0.01196746 - time (sec): 18.19 - samples/sec: 1825.23 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 10:51:03,022 epoch 9 - iter 594/1984 - loss 0.01160464 - time (sec): 27.53 - samples/sec: 1831.16 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 10:51:12,214 epoch 9 - iter 792/1984 - loss 0.01106266 - time (sec): 36.72 - samples/sec: 1815.83 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 10:51:21,261 epoch 9 - iter 990/1984 - loss 0.01089225 - time (sec): 45.77 - samples/sec: 1803.84 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 10:51:30,381 epoch 9 - iter 1188/1984 - loss 0.01111371 - time (sec): 54.89 - samples/sec: 1798.55 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 10:51:39,368 epoch 9 - iter 1386/1984 - loss 0.01130474 - time (sec): 63.88 - samples/sec: 1796.78 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 10:51:48,561 epoch 9 - iter 1584/1984 - loss 0.01217727 - time (sec): 73.07 - samples/sec: 1800.21 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 10:51:57,852 epoch 9 - iter 1782/1984 - loss 0.01235461 - time (sec): 82.36 - samples/sec: 1793.51 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 10:52:07,034 epoch 9 - iter 1980/1984 - loss 0.01325613 - time (sec): 91.54 - samples/sec: 1788.07 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 10:52:07,211 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:52:07,212 EPOCH 9 done: loss 0.0132 - lr: 0.000003
+2023-10-17 10:52:10,636 DEV : loss 0.2385382354259491 - f1-score (micro avg) 0.772
+2023-10-17 10:52:10,663 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:52:19,890 epoch 10 - iter 198/1984 - loss 0.00637340 - time (sec): 9.23 - samples/sec: 1789.65 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 10:52:28,935 epoch 10 - iter 396/1984 - loss 0.00651409 - time (sec): 18.27 - samples/sec: 1759.71 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 10:52:38,136 epoch 10 - iter 594/1984 - loss 0.00741839 - time (sec): 27.47 - samples/sec: 1769.92 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 10:52:47,372 epoch 10 - iter 792/1984 - loss 0.00875355 - time (sec): 36.71 - samples/sec: 1765.52 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 10:52:56,770 epoch 10 - iter 990/1984 - loss 0.00887012 - time (sec): 46.11 - samples/sec: 1747.58 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 10:53:06,020 epoch 10 - iter 1188/1984 - loss 0.00809607 - time (sec): 55.36 - samples/sec: 1760.41 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 10:53:15,196 epoch 10 - iter 1386/1984 - loss 0.00842195 - time (sec): 64.53 - samples/sec: 1776.32 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 10:53:24,107 epoch 10 - iter 1584/1984 - loss 0.00846139 - time (sec): 73.44 - samples/sec: 1784.87 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 10:53:32,767 epoch 10 - iter 1782/1984 - loss 0.00886546 - time (sec): 82.10 - samples/sec: 1789.89 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 10:53:41,465 epoch 10 - iter 1980/1984 - loss 0.00947778 - time (sec): 90.80 - samples/sec: 1802.94 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 10:53:41,637 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:53:41,637 EPOCH 10 done: loss 0.0095 - lr: 0.000000
+2023-10-17 10:53:45,532 DEV : loss 0.24614956974983215 - f1-score (micro avg) 0.7678
+2023-10-17 10:53:45,963 ----------------------------------------------------------------------------------------------------
+2023-10-17 10:53:45,964 Loading model from best epoch ...
+2023-10-17 10:53:47,827 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
+2023-10-17 10:53:51,361
+Results:
+- F-score (micro) 0.7855
+- F-score (macro) 0.7063
+- Accuracy 0.6667
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.8135    0.8656    0.8388       655
+         PER     0.7333    0.7892    0.7603       223
+         ORG     0.5900    0.4646    0.5198       127
+
+   micro avg     0.7734    0.7980    0.7855      1005
+   macro avg     0.7123    0.7065    0.7063      1005
+weighted avg     0.7675    0.7980    0.7810      1005
+
+2023-10-17 10:53:51,362 ----------------------------------------------------------------------------------------------------
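For reference, the configuration recorded in this log (embeddings from what appears to be the hmteams/teams-base-historic-multilingual-discriminator backbone named in the base path, first-subtoken pooling, last layer only, no CRF, learning rate 3e-05, mini-batch size 4, 10 epochs, linear warmup fraction 0.1) roughly corresponds to the Flair script sketched below. This is a hedged sketch, not the script that produced the log: the NER_ICDAR_EUROPEANA constructor argument, the output path, and the hidden_size value are assumptions, and argument names can differ between Flair versions.

```python
# Hypothetical reconstruction of the training setup described in training.log.
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus shown in the log: 7936 train / 992 dev / 992 test French sentences.
corpus = NER_ICDAR_EUROPEANA(language="fr")  # argument name is an assumption
label_dict = corpus.make_label_dictionary(label_type="ner")

# "poolingfirst-layers-1" in the base path: first-subtoken pooling, last layer only.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# "crfFalse" in the base path: plain linear tag head, no CRF and no RNN.
tagger = SequenceTagger(
    hidden_size=256,  # unused when use_rnn=False; value is an assumption
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() applies a linear scheduler with warmup by default, matching the
# LinearScheduler plugin (warmup_fraction 0.1) listed in the log.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "resources/taggers/icdar-europeana-fr",  # placeholder output path
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)
```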