adrianeboyd committed
Commit d94404d
1 Parent(s): c125aa9

Update spaCy pipeline
README.md CHANGED
@@ -14,72 +14,72 @@ model-index:
     metrics:
     - name: NER Precision
       type: precision
-      value: 0.8841463415
+      value: 0.9316770186
     - name: NER Recall
       type: recall
-      value: 0.917721519
+      value: 0.9493670886
     - name: NER F Score
       type: f_score
-      value: 0.900621118
+      value: 0.9404388715
   - task:
       name: TAG
       type: token-classification
     metrics:
     - name: TAG (XPOS) Accuracy
       type: accuracy
-      value: 0.9770274602
+      value: 0.9802329309
   - task:
       name: POS
       type: token-classification
     metrics:
     - name: POS (UPOS) Accuracy
       type: accuracy
-      value: 0.9899348221
+      value: 0.9908323539
   - task:
       name: MORPH
       type: token-classification
     metrics:
     - name: Morph (UFeats) Accuracy
       type: accuracy
-      value: 0.9788652634
+      value: 0.981857036
   - task:
       name: LEMMA
       type: token-classification
     metrics:
     - name: Lemma Accuracy
       type: accuracy
-      value: 0.9679666631
+      value: 0.9694198098
   - task:
       name: UNLABELED_DEPENDENCIES
       type: token-classification
     metrics:
     - name: Unlabeled Attachment Score (UAS)
       type: f_score
-      value: 0.93363503
+      value: 0.9361807921
   - task:
       name: LABELED_DEPENDENCIES
       type: token-classification
     metrics:
     - name: Labeled Attachment Score (LAS)
       type: f_score
-      value: 0.9170251386
+      value: 0.9223081322
   - task:
       name: SENTS
       type: token-classification
     metrics:
     - name: Sentences F-Score
       type: f_score
-      value: 0.9373219373
+      value: 0.9325789722
 ---
 ### Details: https://spacy.io/models/sl#sl_core_news_trf
 
-Slovenian transformer pipeline (EMBEDDIA/sloberta). Components: transformer, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), ner.
+Slovenian transformer pipeline (Transformer(name='EMBEDDIA/sloberta', piece_encoder='camembert-sentencepiece', stride=128, type='camembert', width=768, window=168, vocab_size=32005)). Components: transformer, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), ner.
 
 | Feature | Description |
 | --- | --- |
 | **Name** | `sl_core_news_trf` |
-| **Version** | `3.6.1` |
-| **spaCy** | `>=3.6.0,<3.7.0` |
+| **Version** | `3.7.2` |
+| **spaCy** | `>=3.7.0,<3.8.0` |
 | **Default Pipeline** | `transformer`, `tagger`, `morphologizer`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
 | **Components** | `transformer`, `tagger`, `morphologizer`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
 | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
@@ -110,18 +110,18 @@ Slovenian transformer pipeline (EMBEDDIA/sloberta). Components: transformer, tag
 | `TOKEN_P` | 99.81 |
 | `TOKEN_R` | 99.57 |
 | `TOKEN_F` | 99.69 |
-| `TAG_ACC` | 97.70 |
-| `POS_ACC` | 98.99 |
-| `MORPH_ACC` | 97.89 |
-| `MORPH_MICRO_P` | 99.01 |
-| `MORPH_MICRO_R` | 98.89 |
-| `MORPH_MICRO_F` | 98.95 |
-| `SENTS_P` | 93.20 |
-| `SENTS_R` | 94.27 |
-| `SENTS_F` | 93.73 |
-| `DEP_UAS` | 93.36 |
-| `DEP_LAS` | 91.70 |
-| `LEMMA_ACC` | 96.80 |
-| `ENTS_P` | 88.41 |
-| `ENTS_R` | 91.77 |
-| `ENTS_F` | 90.06 |
+| `TAG_ACC` | 98.02 |
+| `POS_ACC` | 99.08 |
+| `MORPH_ACC` | 98.19 |
+| `MORPH_MICRO_P` | 99.15 |
+| `MORPH_MICRO_R` | 99.04 |
+| `MORPH_MICRO_F` | 99.09 |
+| `SENTS_P` | 92.09 |
+| `SENTS_R` | 94.46 |
+| `SENTS_F` | 93.26 |
+| `DEP_UAS` | 93.62 |
+| `DEP_LAS` | 92.23 |
+| `LEMMA_ACC` | 96.94 |
+| `ENTS_P` | 93.17 |
+| `ENTS_R` | 94.94 |
+| `ENTS_F` | 94.04 |
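For readers checking the updated card, a minimal usage sketch; this assumes the package from this repo's wheel is installed into an environment with spaCy `>=3.7.0,<3.8.0`, and the example sentence is illustrative only:

```python
import spacy

# Load the installed sl_core_news_trf package (pulls in
# spacy-curated-transformers per the updated metadata).
nlp = spacy.load("sl_core_news_trf")

# Illustrative Slovenian sentence; any text works.
doc = nlp("France Prešeren je bil slovenski pesnik.")
print([(t.text, t.pos_, t.lemma_) for t in doc])  # tagger / morphologizer / lemmatizer
print([(e.text, e.label_) for e in doc.ents])     # ner
```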
accuracy.json CHANGED
@@ -3,57 +3,57 @@
   "token_p": 0.9980762654,
   "token_r": 0.9956925964,
   "token_f": 0.996883006,
-  "tag_acc": 0.9770274602,
-  "pos_acc": 0.9899348221,
-  "morph_acc": 0.9788652634,
-  "morph_micro_p": 0.9900582551,
-  "morph_micro_r": 0.9888854047,
-  "morph_micro_f": 0.9894714824,
+  "tag_acc": 0.9802329309,
+  "pos_acc": 0.9908323539,
+  "morph_acc": 0.981857036,
+  "morph_micro_p": 0.9914889164,
+  "morph_micro_r": 0.9903661893,
+  "morph_micro_f": 0.9909272349,
   "morph_per_feat": {
     "Case": {
-      "p": 0.9839550079,
-      "r": 0.9840363937,
-      "f": 0.9839956991
+      "p": 0.9859446052,
+      "r": 0.9863523573,
+      "f": 0.9861484391
     },
     "Gender": {
-      "p": 0.9841089671,
-      "r": 0.9844814534,
-      "f": 0.984295175
+      "p": 0.986196464,
+      "r": 0.9870363361,
+      "f": 0.9866162213
     },
     "Number": {
-      "p": 0.9930539296,
-      "r": 0.9932864949,
-      "f": 0.9931701986
+      "p": 0.9937587767,
+      "r": 0.9943793911,
+      "f": 0.994068987
     },
     "Aspect": {
-      "p": 0.9868204283,
-      "r": 0.9933665008,
-      "f": 0.9900826446
+      "p": 0.9906439185,
+      "r": 0.9950248756,
+      "f": 0.9928295643
     },
     "Mood": {
-      "p": 0.9951520494,
-      "r": 0.9947136564,
-      "f": 0.9949328046
+      "p": 0.9955927721,
+      "r": 0.995154185,
+      "f": 0.9953734303
     },
     "Person": {
-      "p": 0.9972166998,
-      "r": 0.9964243147,
-      "f": 0.9968203498
+      "p": 0.9984095427,
+      "r": 0.9976162098,
+      "f": 0.9980127186
     },
     "Tense": {
-      "p": 0.9980582524,
-      "r": 0.9951597289,
-      "f": 0.9966068832
+      "p": 0.9985436893,
+      "r": 0.9956437561,
+      "f": 0.9970916142
     },
     "VerbForm": {
-      "p": 0.997615894,
-      "r": 0.9949801849,
-      "f": 0.9962962963
+      "p": 0.9986772487,
+      "r": 0.9973579921,
+      "f": 0.9980171844
     },
     "PronType": {
-      "p": 0.9929214929,
+      "p": 0.9948420374,
       "r": 0.995483871,
-      "f": 0.9942010309
+      "f": 0.9951628507
     },
     "Variant": {
       "p": 0.9981412639,
@@ -62,18 +62,18 @@
     },
     "NumForm": {
       "p": 1.0,
-      "r": 0.9617834395,
-      "f": 0.9805194805
+      "r": 0.9681528662,
+      "f": 0.9838187702
     },
     "NumType": {
       "p": 1.0,
-      "r": 0.9653179191,
-      "f": 0.9823529412
+      "r": 0.9710982659,
+      "f": 0.9853372434
     },
     "Degree": {
-      "p": 0.9925989183,
-      "r": 0.9923164485,
-      "f": 0.9924576633
+      "p": 0.9937268321,
+      "r": 0.9917472965,
+      "f": 0.9927360775
     },
     "Polarity": {
       "p": 1.0,
@@ -81,14 +81,14 @@
       "f": 0.9992902768
     },
     "Number[psor]": {
-      "p": 0.9375,
-      "r": 0.9375,
-      "f": 0.9375
+      "p": 0.9583333333,
+      "r": 0.9583333333,
+      "f": 0.9583333333
     },
     "Poss": {
-      "p": 1.0,
-      "r": 0.995,
-      "f": 0.9974937343
+      "p": 0.9949748744,
+      "r": 0.99,
+      "f": 0.992481203
     },
     "Reflex": {
       "p": 0.9975,
@@ -101,9 +101,9 @@
       "f": 0.9896551724
     },
     "Animacy": {
-      "p": 0.9634831461,
-      "r": 0.9346049046,
-      "f": 0.948824343
+      "p": 0.9630681818,
+      "r": 0.9237057221,
+      "f": 0.9429763561
     },
     "Abbr": {
       "p": 1.0,
@@ -111,186 +111,186 @@
       "f": 1.0
     },
     "Foreign": {
-      "p": 0.8947368421,
-      "r": 0.4473684211,
-      "f": 0.5964912281
+      "p": 1.0,
+      "r": 0.3947368421,
+      "f": 0.5660377358
     },
     "Gender[psor]": {
-      "p": 0.9090909091,
+      "p": 0.9375,
       "r": 0.9090909091,
-      "f": 0.9090909091
+      "f": 0.9230769231
     }
   },
-  "sents_p": 0.9320113314,
-  "sents_r": 0.9426934097,
-  "sents_f": 0.9373219373,
-  "dep_uas": 0.93363503,
-  "dep_las": 0.9170251386,
+  "sents_p": 0.9208566108,
+  "sents_r": 0.9446036294,
+  "sents_f": 0.9325789722,
+  "dep_uas": 0.9361807921,
+  "dep_las": 0.9223081322,
   "dep_las_per_type": {
     "nsubj": {
-      "p": 0.942804428,
-      "r": 0.9560336763,
-      "f": 0.949372968
+      "p": 0.9579831933,
+      "r": 0.9597754911,
+      "f": 0.9588785047
     },
     "nmod": {
-      "p": 0.8998951782,
-      "r": 0.8850515464,
-      "f": 0.8924116424
+      "p": 0.8976744186,
+      "r": 0.8953608247,
+      "f": 0.896516129
     },
     "root": {
-      "p": 0.9437022901,
-      "r": 0.9446036294,
-      "f": 0.9441527446
+      "p": 0.9408960915,
+      "r": 0.9426934097,
+      "f": 0.9417938931
     },
     "obj": {
-      "p": 0.9335899904,
-      "r": 0.9371980676,
-      "f": 0.9353905497
+      "p": 0.9537126326,
+      "r": 0.9555555556,
+      "f": 0.9546332046
     },
     "iobj": {
-      "p": 0.8732394366,
-      "r": 0.7209302326,
-      "f": 0.7898089172
+      "p": 0.8441558442,
+      "r": 0.7558139535,
+      "f": 0.7975460123
     },
     "case": {
-      "p": 0.9897260274,
-      "r": 0.9863481229,
-      "f": 0.988034188
+      "p": 0.9897128161,
+      "r": 0.9850682594,
+      "f": 0.9873850759
     },
     "cc": {
-      "p": 0.9503546099,
-      "r": 0.9447708578,
-      "f": 0.947554508
+      "p": 0.9610849057,
+      "r": 0.9576968273,
+      "f": 0.9593878752
     },
     "conj": {
-      "p": 0.8023483366,
-      "r": 0.8,
-      "f": 0.8011724475
+      "p": 0.820909971,
+      "r": 0.8273170732,
+      "f": 0.824101069
     },
     "mark": {
-      "p": 0.9721115538,
-      "r": 0.9721115538,
-      "f": 0.9721115538
+      "p": 0.9654255319,
+      "r": 0.9641434263,
+      "f": 0.9647840532
     },
     "obl": {
-      "p": 0.9013806706,
-      "r": 0.8984272608,
-      "f": 0.8999015425
+      "p": 0.923128793,
+      "r": 0.8971166448,
+      "f": 0.9099368561
     },
     "nummod": {
-      "p": 0.9423076923,
-      "r": 0.9171122995,
-      "f": 0.9295392954
+      "p": 0.9453551913,
+      "r": 0.9251336898,
+      "f": 0.9351351351
     },
     "acl": {
-      "p": 0.8353221957,
-      "r": 0.8177570093,
-      "f": 0.826446281
+      "p": 0.8486997636,
+      "r": 0.8387850467,
+      "f": 0.8437132785
     },
     "amod": {
-      "p": 0.9797830375,
-      "r": 0.9895418327,
-      "f": 0.9846382557
+      "p": 0.9846000994,
+      "r": 0.9870517928,
+      "f": 0.9858244218
     },
     "aux": {
-      "p": 0.9900552486,
-      "r": 0.9900552486,
-      "f": 0.9900552486
+      "p": 0.9911406423,
+      "r": 0.9889502762,
+      "f": 0.9900442478
     },
     "det": {
-      "p": 0.9719222462,
-      "r": 0.9846827133,
-      "f": 0.9782608696
+      "p": 0.9739696312,
+      "r": 0.9824945295,
+      "f": 0.9782135076
     },
     "advmod": {
-      "p": 0.8700209644,
-      "r": 0.8718487395,
-      "f": 0.870933893
+      "p": 0.8706004141,
+      "r": 0.8834033613,
+      "f": 0.8769551616
     },
     "parataxis": {
-      "p": 0.7,
+      "p": 0.6892307692,
       "r": 0.6549707602,
-      "f": 0.6767371601
+      "f": 0.6716641679
     },
     "flat:name": {
-      "p": 0.9681528662,
-      "r": 0.987012987,
-      "f": 0.9774919614
+      "p": 0.9741935484,
+      "r": 0.9805194805,
+      "f": 0.9773462783
     },
     "cop": {
-      "p": 0.9393939394,
-      "r": 0.9482352941,
-      "f": 0.943793911
+      "p": 0.9443155452,
+      "r": 0.9576470588,
+      "f": 0.9509345794
     },
     "csubj": {
-      "p": 0.8863636364,
-      "r": 0.8764044944,
-      "f": 0.8813559322
+      "p": 0.9101123596,
+      "r": 0.9101123596,
+      "f": 0.9101123596
     },
     "expl": {
-      "p": 0.9272151899,
-      "r": 0.9391025641,
-      "f": 0.9331210191
+      "p": 0.961414791,
+      "r": 0.9583333333,
+      "f": 0.9598715891
     },
     "xcomp": {
-      "p": 0.9496855346,
-      "r": 0.94375,
-      "f": 0.9467084639
+      "p": 0.9440993789,
+      "r": 0.95,
+      "f": 0.9470404984
     },
     "ccomp": {
-      "p": 0.9438202247,
-      "r": 0.9130434783,
-      "f": 0.9281767956
+      "p": 0.9184782609,
+      "r": 0.9184782609,
+      "f": 0.9184782609
+    },
+    "dep": {
+      "p": 0.0465116279,
+      "r": 0.3333333333,
+      "f": 0.0816326531
     },
     "appos": {
-      "p": 0.5921052632,
-      "r": 0.625,
-      "f": 0.6081081081
+      "p": 0.6258503401,
+      "r": 0.6388888889,
+      "f": 0.6323024055
     },
     "flat": {
-      "p": 0.6086956522,
-      "r": 0.7368421053,
-      "f": 0.6666666667
+      "p": 0.7647058824,
+      "r": 0.6842105263,
+      "f": 0.7222222222
     },
     "orphan": {
-      "p": 0.6470588235,
-      "r": 0.4583333333,
-      "f": 0.5365853659
+      "p": 0.6478873239,
+      "r": 0.4791666667,
+      "f": 0.5508982036
     },
     "advcl": {
-      "p": 0.75,
-      "r": 0.7285714286,
-      "f": 0.7391304348
+      "p": 0.7486910995,
+      "r": 0.680952381,
+      "f": 0.7132169576
     },
     "fixed": {
-      "p": 0.9090909091,
-      "r": 0.9090909091,
-      "f": 0.9090909091
+      "p": 0.9393939394,
+      "r": 0.9393939394,
+      "f": 0.9393939394
     },
     "list": {
-      "p": 0.7717391304,
+      "p": 0.7553191489,
       "r": 0.8658536585,
-      "f": 0.816091954
+      "f": 0.8068181818
     },
     "flat:foreign": {
-      "p": 0.625,
-      "r": 0.625,
-      "f": 0.625
-    },
-    "dep": {
-      "p": 0.0344827586,
-      "r": 0.1666666667,
-      "f": 0.0571428571
+      "p": 0.6666666667,
+      "r": 0.5,
+      "f": 0.5714285714
     },
     "discourse": {
-      "p": 0.6,
+      "p": 0.9,
       "r": 0.6428571429,
-      "f": 0.6206896552
+      "f": 0.75
     },
     "vocative": {
-      "p": 1.0,
+      "p": 0.5,
       "r": 0.2,
-      "f": 0.3333333333
+      "f": 0.2857142857
     },
     "cc:preconj": {
       "p": 1.0,
@@ -298,25 +298,25 @@
       "f": 0.8421052632
     }
   },
-  "lemma_acc": 0.9679666631,
-  "ents_p": 0.8841463415,
-  "ents_r": 0.917721519,
-  "ents_f": 0.900621118,
+  "lemma_acc": 0.9694198098,
+  "ents_p": 0.9316770186,
+  "ents_r": 0.9493670886,
+  "ents_f": 0.9404388715,
   "ents_per_type": {
     "ORG": {
-      "p": 0.7692307692,
-      "r": 0.75,
-      "f": 0.7594936709
+      "p": 0.9189189189,
+      "r": 0.85,
+      "f": 0.8831168831
     },
     "PER": {
       "p": 1.0,
-      "r": 0.9833333333,
-      "f": 0.9915966387
+      "r": 1.0,
+      "f": 1.0
     },
     "LOC": {
-      "p": 0.8666666667,
+      "p": 0.8965517241,
       "r": 0.962962963,
-      "f": 0.9122807018
+      "f": 0.9285714286
     },
     "MISC": {
       "p": 0.75,
@@ -329,5 +329,5 @@
       "f": 0.6666666667
     }
   },
-  "speed": 1211.6699495749
+  "speed": 1008.1462815944
 }
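The hunks above touch many nested scores; a small comparison sketch like the following reproduces the headline deltas (the file names are hypothetical, standing for accuracy.json from the parent commit and from this commit):

```python
import json

# Hypothetical paths: the old and new snapshots of accuracy.json.
with open("accuracy_old.json", encoding="utf8") as f:
    old = json.load(f)
with open("accuracy_new.json", encoding="utf8") as f:
    new = json.load(f)

# Headline metrics changed by this commit (e.g. ents_f: 0.9006 -> 0.9404).
for key in ("tag_acc", "pos_acc", "morph_acc", "dep_uas", "dep_las",
            "lemma_acc", "sents_f", "ents_f", "speed"):
    print(f"{key}: {old[key]:.4f} -> {new[key]:.4f} ({new[key] - old[key]:+.4f})")
```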
config.cfg CHANGED
@@ -17,6 +17,7 @@ after_creation = null
 after_pipeline_creation = null
 batch_size = 64
 tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
+vectors = {"@vectors":"spacy.Vectors.v1"}
 
 [components]
 
@@ -39,10 +40,11 @@ nO = null
 normalize = false
 
 [components.lemmatizer.model.tok2vec]
-@architectures = "spacy-transformers.TransformerListener.v1"
-grad_factor = 1.0
+@architectures = "spacy-curated-transformers.LastTransformerLayerListener.v1"
+width = ${components.transformer.model.hidden_width}
 upstream = "transformer"
 pooling = {"@layers":"reduce_mean.v1"}
+grad_factor = 1.0
 
 [components.morphologizer]
 factory = "morphologizer"
@@ -57,10 +59,11 @@ nO = null
 normalize = false
 
 [components.morphologizer.model.tok2vec]
-@architectures = "spacy-transformers.TransformerListener.v1"
-grad_factor = 1.0
+@architectures = "spacy-curated-transformers.LastTransformerLayerListener.v1"
+width = ${components.transformer.model.hidden_width}
 upstream = "transformer"
 pooling = {"@layers":"reduce_mean.v1"}
+grad_factor = 1.0
 
 [components.ner]
 factory = "ner"
@@ -79,10 +82,11 @@ use_upper = false
 nO = null
 
 [components.ner.model.tok2vec]
-@architectures = "spacy-transformers.TransformerListener.v1"
-grad_factor = 1.0
+@architectures = "spacy-curated-transformers.LastTransformerLayerListener.v1"
+width = ${components.transformer.model.hidden_width}
 upstream = "transformer"
 pooling = {"@layers":"reduce_mean.v1"}
+grad_factor = 1.0
 
 [components.parser]
 factory = "parser"
@@ -102,10 +106,11 @@ use_upper = false
 nO = null
 
 [components.parser.model.tok2vec]
-@architectures = "spacy-transformers.TransformerListener.v1"
-grad_factor = 1.0
+@architectures = "spacy-curated-transformers.LastTransformerLayerListener.v1"
+width = ${components.transformer.model.hidden_width}
 upstream = "transformer"
 pooling = {"@layers":"reduce_mean.v1"}
+grad_factor = 1.0
 
 [components.tagger]
 factory = "tagger"
@@ -120,32 +125,44 @@ nO = null
 normalize = false
 
 [components.tagger.model.tok2vec]
-@architectures = "spacy-transformers.TransformerListener.v1"
-grad_factor = 1.0
+@architectures = "spacy-curated-transformers.LastTransformerLayerListener.v1"
+width = ${components.transformer.model.hidden_width}
 upstream = "transformer"
 pooling = {"@layers":"reduce_mean.v1"}
+grad_factor = 1.0
 
 [components.transformer]
-factory = "transformer"
-max_batch_items = 4096
-set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
+factory = "curated_transformer"
+all_layer_outputs = false
+frozen = false
 
 [components.transformer.model]
-name = "EMBEDDIA/sloberta"
-@architectures = "spacy-transformers.TransformerModel.v3"
+@architectures = "spacy-curated-transformers.CamembertTransformer.v1"
+vocab_size = 32005
+hidden_width = 768
+piece_encoder = {"@architectures":"spacy-curated-transformers.CamembertSentencepieceEncoder.v1"}
+attention_probs_dropout_prob = 0.1
+hidden_act = "gelu"
+hidden_dropout_prob = 0.1
+intermediate_width = 3072
+layer_norm_eps = 0.00001
+max_position_embeddings = 514
+model_max_length = 512
+num_attention_heads = 12
+num_hidden_layers = 12
+padding_idx = 1
+type_vocab_size = 1
 mixed_precision = false
-
-[components.transformer.model.get_spans]
-@span_getters = "spacy-transformers.strided_spans.v1"
-window = 128
-stride = 96
+torchscript = false
+wrapped_listener = null
 
 [components.transformer.model.grad_scaler_config]
 
-[components.transformer.model.tokenizer_config]
-use_fast = true
-
-[components.transformer.model.transformer_config]
+[components.transformer.model.with_spans]
+@architectures = "spacy-curated-transformers.WithStridedSpans.v1"
+stride = 128
+window = 168
+batch_size = 384
 
 [corpora]
 
@@ -182,11 +199,11 @@ annotating_components = []
 before_update = null
 
 [training.batcher]
-@batchers = "spacy.batch_by_padded.v1"
-discard_oversize = true
-get_length = null
+@batchers = "spacy.batch_by_words.v1"
+discard_oversize = false
 size = 2000
-buffer = 256
+tolerance = 0.2
+get_length = null
 
 [training.logger]
 @loggers = "spacy.ConsoleLogger.v1"
@@ -272,6 +289,18 @@ require = false
 path = "corpus/labels/tagger.json"
 require = false
 
+[initialize.components.transformer]
+
+[initialize.components.transformer.encoder_loader]
+@model_loaders = "spacy-curated-transformers.HFTransformerEncoderLoader.v1"
+name = "EMBEDDIA/sloberta"
+revision = "main"
+
+[initialize.components.transformer.piecer_loader]
+@model_loaders = "spacy-curated-transformers.HFPieceEncoderLoader.v1"
+name = "EMBEDDIA/sloberta"
+revision = "main"
+
 [initialize.lookups]
 @misc = "spacy.LookupsDataLoader.v1"
 lang = ${nlp.lang}
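The config hunks above switch every listener from spacy-transformers to spacy-curated-transformers and move the model settings inline. A sketch for inspecting the resulting settings (the path is hypothetical, pointing at an unpacked copy of the installed pipeline):

```python
from spacy.util import load_config

# Hypothetical path to the pipeline's config.cfg inside the installed package.
config = load_config("sl_core_news_trf/config.cfg")

transformer = config["components"]["transformer"]
print(transformer["factory"])                  # "curated_transformer"
print(transformer["model"]["@architectures"])  # "spacy-curated-transformers.CamembertTransformer.v1"
print(transformer["model"]["hidden_width"])    # 768

# Each per-component listener now interpolates that width via
# ${components.transformer.model.hidden_width}:
tagger_tok2vec = config["components"]["tagger"]["model"]["tok2vec"]
print(tagger_tok2vec["@architectures"])        # "...LastTransformerLayerListener.v1"
```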
lemmatizer/cfg CHANGED
Only the changed entries of the file's numeric id list are shown below; unchanged ids and the surrounding JSON are omitted.

@@ -102,31 +102,31 @@
- 208, 210, 212, 215, 217, 222, 224, 226, 232, 234, 237, 239, 243, 248, 252, 253, 254, 258
+ 207, 209, 211, 213, 216, 218, 223, 225, 229, 233, 235, 236, 242, 247, 249, 251, 255, 257

@@ -135,418 +135,422 @@
- 272, 276, 277, 281, 283, 285, 292, 295, 302, 306, 310, 311, 316, 321, 324, 330, 335, 340, 342, 345, 347, 350, 356, 359, 365, 366, 372, 374, 379, 381, 384, 389, 396, 398, 400, 407, 408, 414, 415, 417, 421, 423, 427, 430, 436, 439, 441, 446, 448, 451, 457, 462, 463, 465, 467, 473, 475, 481, 488, 490, 498, 502, 504, 508, 512, 513, 516, 519, 529, 545, 549, 552, 554, 556, 560, 570, 575, 579, 584, 587, 590, 591, 600, 606, 617, 620, 625, 633, 639, 642, 644, 653, 655, 658, 662, 664, 671, 673, 675, 678, 680, 682, 690, 693, 697, 700, 703, 713, 107, 722, 724, 728, 730, 731, 733, 735, 741, 747, 753, 758, 760, 770, 775, 779, 783, 790, 791, 795, 796, 799, 803, 813, 815, 817, 484, 820, 822, 824, 827, 829, 831, 836, 841, 845, 848, 852, 856, 862, 866, 872, 874, 879, 884, 886, 888, 892, 901, 166, 906, 907, 908, 912, 927, 932, 936, 941, 943, 949, 954, 961, 973
+ 271, 273, 278, 280, 282, 286, 290, 293, 298, 299, 300, 305, 309, 312, 315, 320, 322, 329, 334, 336, 339, 346, 349, 351, 352, 360, 362, 370, 375, 376, 377, 382, 385, 388, 395, 399, 406, 410, 416, 419, 420, 422, 428, 431, 435, 437, 443, 444, 449, 450, 456, 459, 461, 466, 468, 471, 474, 482, 484, 495, 496, 505, 507, 509, 511, 524, 526, 528, 530, 534, 540, 553, 559, 561, 569, 572, 576, 578, 580, 582, 588, 608, 612, 616, 626, 627, 629, 630, 632, 645, 654, 656, 659, 661, 663, 674, 677, 679, 683, 689, 696, 699, 701, 704, 107, 708, 714, 717, 721, 734, 740, 742, 743, 745, 746, 748, 752, 754, 756, 762, 765, 772, 782, 789, 797, 798, 802, 804, 805, 809, 812, 816, 821, 823, 826, 828, 830, 832, 833, 835, 840, 842, 844, 849, 861, 863, 865, 875, 877, 882, 883, 885, 889, 890, 891, 166, 897, 899, 916, 919, 923, 928, 929, 931, 933, 942, 947, 953, 955, 959, 960, 962, 969, 972, 979

@@ -557,42 +561,38 @@
- 993, 995, 997, 999, 1001, 1004, 1007, 1009, 1015, 1017, 1024, 1026, 1028, 1031, 1033, 1035, 1038, 1040, 1042, 48
+ 996, 1003, 1008, 1010, 1014, 1016, 1021, 1025, 1027, 1030, 1032, 1034, 1037, 1039, 1041, 48

@@ -618,7 +618,6 @@
- 1072

@@ -647,313 +646,312 @@
- 1101, 1103, 1104, 1106, 1108, 1110, 1112, 1118, 1120, 1127, 1131, 1133, 1137, 1139, 1141, 1145, 1148, 1150, 1152, 1156, 1159, 1167, 1171, 1176, 1178, 1181, 1184, 1188, 1190, 1191, 1193, 1205, 1218, 140, 1223, 1236, 1242, 1250, 1251, 1253, 1254, 1258, 1262, 746, 1264, 1269, 1278, 1281, 1285, 1291, 1295, 1297, 1302, 1305, 1310, 1313, 1316, 1319, 1321, 1324, 1330, 1333, 445, 1338, 1365, 1366, 1370, 1372, 1374, 1375, 1383, 1391, 1393, 1394, 1396, 1401, 1409, 1413, 1421, 1427, 1429, 1431, 1435, 1437, 1445, 1447, 1449, 1459, 1461, 1467, 1475, 1487, 1493, 1496, 1498, 1499, 1504, 1510, 1513, 1519, 1520, 1524, 1537, 1540
+ 1122, 1126, 1128, 1132, 1134, 1136, 1138, 1144, 1149, 1151, 1154, 1155, 1157, 1163, 1165, 1166, 1170, 1173, 1175, 1182, 1189, 1192, 140, 1199, 1207, 1208, 1213, 1219, 1224, 1225, 1228, 1229, 1234, 1239, 731, 1241, 1256, 1261, 1265, 1270, 1274, 1277, 1282, 1284, 1286, 1293, 1296, 1301, 1304, 1308, 1311, 441, 1320, 1325, 1328, 1331, 1332, 1335, 1337, 1340, 1342, 1348, 1354, 1358, 1378, 1379, 1386, 1388, 1398, 1399, 1403, 1406, 1408, 1414, 1417, 1426, 1428, 1436, 1440, 1443, 1448, 1451, 1458, 1460, 1466, 1474, 1478, 1480, 1484, 1485, 1495, 1500, 1503, 1507, 1512, 1514, 1521, 1525, 1532, 1536, 1539, 1542, 1546

@@ -1006,7 +1004,7 @@
- 1599
+ 1600

@@ -1025,98 +1023,89 @@
- 1620, 1622, 1627, 1634, 1637, 1642, 1646, 1647, 1652, 1656, 1658, 1659, 1668, 1674, 1677, 1682, 1685, 1692, 1694, 1695, 1702, 1703, 1707, 1709, 1712, 1716, 1718, 1720, 1733, 1735, 1737, 1739, 1745, 1747, 1749, 1751, 1752, 1753, 1754
+ 1621, 1624, 1630, 1635, 1639, 1644, 1648, 1651, 1654, 1661, 1663, 1664, 1673, 1680, 1684, 1691, 1693, 1698, 1700, 1701, 1706, 1715, 1721, 1730, 1731, 1734, 1736, 1738, 1740, 1744
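The removed/added ids listed above can be recomputed by diffing the two versions of the file. A sketch under stated assumptions: the file paths are hypothetical, and `LIST_KEY` stands in for whichever JSON key holds the id list, which the rendered diff does not show.

```python
import json

LIST_KEY = "trees"  # assumption: the diff above does not show the actual key name

# Hypothetical paths to the old and new versions of lemmatizer/cfg.
with open("lemmatizer_cfg_old.json", encoding="utf8") as f:
    old_ids = set(json.load(f)[LIST_KEY])
with open("lemmatizer_cfg_new.json", encoding="utf8") as f:
    new_ids = set(json.load(f)[LIST_KEY])

# Note: ids that only change position in the list (e.g. 48, 107, 140, 166)
# cancel out in a set difference and will not be reported here.
print("removed:", sorted(old_ids - new_ids))
print("added:  ", sorted(new_ids - old_ids))
```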
lemmatizer/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e1127147ab65ad6929e0f4685f452de94c481da040eb40213d48f37da216d810
-size 3439621
+oid sha256:aa40645b19dddf7b29ed1769770b9676aa676751efc953ff281f0441cfe97b8e
+size 3405869
lemmatizer/trees CHANGED
Binary files a/lemmatizer/trees and b/lemmatizer/trees differ
 
meta.json CHANGED
@@ -1,14 +1,14 @@
 {
   "lang":"sl",
   "name":"core_news_trf",
-  "version":"3.6.1",
-  "description":"Slovenian transformer pipeline (EMBEDDIA/sloberta). Components: transformer, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), ner.",
+  "version":"3.7.2",
+  "description":"Slovenian transformer pipeline (Transformer(name='EMBEDDIA/sloberta', piece_encoder='camembert-sentencepiece', stride=128, type='camembert', width=768, window=168, vocab_size=32005)). Components: transformer, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), ner.",
   "author":"Explosion",
   "email":"[email protected]",
   "url":"https://explosion.ai",
   "license":"CC BY-SA 4.0",
-  "spacy_version":">=3.6.0,<3.7.0",
-  "spacy_git_version":"c067b5264",
+  "spacy_version":">=3.7.0,<3.8.0",
+  "spacy_git_version":"6b4f77441",
   "vectors":{
     "width":0,
     "vectors":0,

(The embedded "accuracy" block, hunks @@ -2419,57 +2419,57 @@ through @@ -2745,7 +2745,7 @@, changes exactly as in accuracy.json above, including "speed":1211.6699495749 -> 1008.1462815944.)

@@ -2762,6 +2762,8 @@
     }
   ],
   "requirements":[
-    "spacy-transformers>=1.2.2,<1.3.0"
+    "spacy-curated-transformers>=0.2.0,<0.3.0",
+    "sentencepiece>=0.1.91,!=0.1.92",
+    "protobuf<3.21.0"
   ]
 }
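The requirements change means the package now depends on spacy-curated-transformers instead of spacy-transformers. A sketch that checks an installed copy against the new metadata (the meta.json path is hypothetical):

```python
import json

# Hypothetical path to the unpacked pipeline's meta.json.
with open("sl_core_news_trf/meta.json", encoding="utf8") as f:
    meta = json.load(f)

assert meta["version"] == "3.7.2"
assert meta["spacy_version"] == ">=3.7.0,<3.8.0"
# Runtime dependencies recorded by this commit:
print(meta["requirements"])
# ['spacy-curated-transformers>=0.2.0,<0.3.0', 'sentencepiece>=0.1.91,!=0.1.92', 'protobuf<3.21.0']
```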
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f127e8db060afb095771948c1f1169cdd75ad5cc151cbe45eef8108fee2d6c64
-size 3685701
+oid sha256:9fd5c4e3dfa2bf6ac7368e61cc7e7f34c447fe5af9972651441bc88cf783d0c4
+size 3685785
ner/model CHANGED
Binary files a/ner/model and b/ner/model differ
 
parser/model CHANGED
Binary files a/parser/model and b/parser/model differ
 
sl_core_news_trf-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8094b2c2b4d2024a05f1795e2a63606e8bb221b4c7bb6e44ee45745553beb622
-size 419759778
+oid sha256:bcfce7fb9b5f08c87011d7ff65856a1ce19a53a75f5ac7570d0394037d1b4324
+size 416542321
tagger/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9025c664e957b190cd343d24b394f2116951af02d47114ac192841ca6bf08fa1
-size 3458077
+oid sha256:2916216093dd41cd67a8cbebb98b4c38fed6b3705bc36bd039f7c62ce82524b7
+size 3458161
transformer/cfg CHANGED
@@ -1,3 +1,3 @@
 {
-  "max_batch_items":4096
+
 }
transformer/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:87995fccfad76c91bee71fe2efd21082db30dfbc59ce9dbe5bfc80b4d99972b1
-size 445707181
+oid sha256:72f89ebe7ab0349486b2a256def9def699e785b561ea1058882674d230ecffc0
+size 440984165
vocab/strings.json CHANGED
The diff for this file is too large to render. See raw diff