KennethEnevoldsen committed on
Commit
91c9fac
1 Parent(s): 631be60

update dacy pipeline

.gitattributes CHANGED
@@ -15,3 +15,8 @@
15
  *.pt filter=lfs diff=lfs merge=lfs -text
16
  *.pth filter=lfs diff=lfs merge=lfs -text
17
  *tfevents* filter=lfs diff=lfs merge=lfs -text
18
+ *.whl filter=lfs diff=lfs merge=lfs -text
19
+ *.npz filter=lfs diff=lfs merge=lfs -text
20
+ *strings.json filter=lfs diff=lfs merge=lfs -text
21
+ vectors filter=lfs diff=lfs merge=lfs -text
22
+ model filter=lfs diff=lfs merge=lfs -text
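The five added patterns route wheels, `.npz` archives, files ending in `strings.json`, and the `vectors` and `model` files through Git LFS. As a rough illustration, the sketch below checks which file names those patterns would pick up. It uses Python's `fnmatch` as a stand-in for Git's own pattern matching, which is an assumption: Git's rules differ in details such as directory handling. The helper name `tracked_by_lfs` and the example paths `vocab/strings.json` and `ner/model` are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Patterns added to .gitattributes in this commit.
LFS_PATTERNS = ["*.whl", "*.npz", "*strings.json", "vectors", "model"]


def tracked_by_lfs(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: match the file's base name against each pattern.

    Patterns without a slash are compared with the base name only, which
    mirrors Git's behaviour for simple cases but not for every edge case.
    """
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in [
        "da_dacy_medium_trf-any-py3-none-any.whl",
        "vocab/strings.json",  # hypothetical path
        "ner/model",           # hypothetical path
        "config.cfg",
    ]:
        print(candidate, tracked_by_lfs(candidate))
```

For an authoritative answer on a real checkout, `git check-attr filter -- <path>` reports which filter Git itself applies.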
README.md ADDED
@@ -0,0 +1,193 @@
1
+ ---
2
+ tags:
3
+ - spacy
4
+ - token-classification
5
+ language:
6
+ - da
7
+ license: Apache-2.0-License
8
+ model-index:
9
+ - name: da_dacy_medium_trf
10
+ results:
11
+ - tasks:
12
+ name: NER
13
+ type: token-classification
14
+ metrics:
15
+ - name: Precision
16
+ type: precision
17
+ value: 0.817047817
18
+ - name: Recall
19
+ type: recall
20
+ value: 0.81875
21
+ - name: F Score
22
+ type: f_score
23
+ value: 0.8178980229
24
+ - tasks:
25
+ name: SENTER
26
+ type: token-classification
27
+ metrics:
28
+ - name: Precision
29
+ type: precision
30
+ value: 0.873015873
31
+ - name: Recall
32
+ type: recall
33
+ value: 0.8776595745
34
+ - name: F Score
35
+ type: f_score
36
+ value: 0.875331565
37
+ - tasks:
38
+ name: UNLABELED_DEPENDENCIES
39
+ type: token-classification
40
+ metrics:
41
+ - name: Accuracy
42
+ type: accuracy
43
+ value: 0.8714971531
44
+ - tasks:
45
+ name: LABELED_DEPENDENCIES
46
+ type: token-classification
47
+ metrics:
48
+ - name: Accuracy
49
+ type: accuracy
50
+ value: 0.8714971531
51
+ ---
52
+
53
+ <a href="https://github.com/centre-for-humanities-computing/Dacy"><img src="https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png" width="175" height="175" align="right" /></a>
54
+
55
+ # DaCy medium transformer
56
+
57
+ DaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines.
58
+ DaCy's largest pipeline has achieved state-of-the-art performance on named entity recognition, part-of-speech tagging and dependency
59
+ parsing for Danish on the DaNE dataset. Check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results.
60
+ DaCy also contains guides on usage of the package as well as behavioural tests for biases and robustness of Danish NLP pipelines.
61
+
62
+
63
+ | Feature | Description |
64
+ | --- | --- |
65
+ | **Name** | `da_dacy_medium_trf` |
66
+ | **Version** | `0.1.0` |
67
+ | **spaCy** | `>=3.1.1,<3.2.0` |
68
+ | **Default Pipeline** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
+ | **Components** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
70
+ | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
71
+ | **Sources** | [UD Danish DDT v2.5](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[Maltehb/danish-bert-botxo](https://huggingface.co/Maltehb/danish-bert-botxo) (BotXO.ai) |
72
+ | **License** | `Apache-2.0 License` |
73
+ | **Author** | [Centre for Humanities Computing Aarhus](https://chcaa.io/#/) |
74
+
75
+ ### Label Scheme
76
+
77
+ <details>
78
+
79
+ <summary>View label scheme (192 labels for 3 components)</summary>
80
+
81
+ | Component | Labels |
82
+ | --- | --- |
83
+ | **`morphologizer`** | `AdpType=Prep\|POS=ADP`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=AUX\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=PROPN`, `Definite=Ind\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=SCONJ`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=ADV`, `Number=Plur\|POS=DET\|PronType=Dem`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=PUNCT`, `POS=CCONJ`, `Definite=Ind\|Degree=Cmp\|Number=Sing\|POS=ADJ`, `Degree=Cmp\|POS=ADJ`, `POS=PRON\|PartType=Inf`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Definite=Ind\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Neut\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Dem`, `Degree=Pos\|POS=ADV`, `Definite=Def\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|PronType=Dem`, `NumType=Card\|POS=NUM`, `Definite=Ind\|Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `NumType=Ord\|POS=ADJ`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Mood=Ind\|POS=AUX\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=VERB\|VerbForm=Inf\|Voice=Act`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Pass`, `POS=ADP\|PartType=Inf`, `Degree=Pos\|POS=ADJ`, `Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, 
`Case=Gen\|Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `POS=AUX\|VerbForm=Inf\|Voice=Act`, `Definite=Ind\|Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Number=Plur\|POS=DET\|PronType=Ind`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Ind`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `POS=PART\|PartType=Inf`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Acc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Nom\|Gender=Com\|POS=PRON\|PronType=Ind`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Ind`, `Mood=Imp\|POS=VERB`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Number=Plur\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|VerbForm=Inf\|Voice=Pass`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Degree=Cmp\|POS=ADV`, `POS=ADV\|PartType=Inf`, `Degree=Sup\|POS=ADV`, `Number=Plur\|POS=PRON\|PronType=Dem`, `Number=Plur\|POS=PRON\|PronType=Ind`, `Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|POS=PROPN`, `POS=ADP`, `Degree=Cmp\|Number=Plur\|POS=ADJ`, `Definite=Def\|Degree=Sup\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Gender=Com\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, 
`Number=Plur\|POS=PRON\|PronType=Rcp`, `Case=Gen\|Degree=Cmp\|POS=ADJ`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=INTJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Number=Plur\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Definite=Def\|Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `POS=SYM`, `Case=Nom\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Degree=Sup\|POS=ADJ`, `Number=Plur\|POS=DET\|PronType=Ind\|Style=Arch`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Foreign=Yes\|POS=X`, `POS=DET\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Dem`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Gen\|POS=PRON\|PronType=Int,Rel`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Dem`, `Abbr=Yes\|POS=X`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Abs\|POS=ADJ`, `Definite=Ind\|Degree=Sup\|Number=Sing\|POS=ADJ`, 
`Definite=Ind\|POS=NOUN`, `Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Degree=Abs\|POS=ADV`, `POS=VERB\|VerbForm=Ger`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Gen\|Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Gen\|Degree=Pos\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|Tense=Pres`, `Case=Gen\|Number=Plur\|POS=DET\|PronType=Ind`, `Number[psor]=Plur\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=PRON\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Pass`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Plur\|POS=NOUN`, `Case=Gen\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Mood=Imp\|POS=AUX`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Gen\|POS=NOUN`, `Number[psor]=Plur\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=DET\|PronType=Dem`, 
`Definite=Def\|Number=Plur\|POS=NOUN` |
84
+ | **`parser`** | `ROOT`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `dep`, `det`, `expl`, `fixed`, `flat`, `iobj`, `list`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nummod`, `obj`, `obl`, `obl:loc`, `obl:tmod`, `punct`, `xcomp` |
85
+ | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
86
+
87
+ </details>
88
+
89
+ ### Accuracy
90
+
91
+ | Type | Score |
92
+ | --- | --- |
93
+ | `POS_ACC` | 97.44 |
94
+ | `MORPH_ACC` | 97.24 |
95
+ | `DEP_UAS` | 87.15 |
96
+ | `DEP_LAS` | 83.97 |
97
+ | `SENTS_P` | 87.30 |
98
+ | `SENTS_R` | 87.77 |
99
+ | `SENTS_F` | 87.53 |
100
+ | `LEMMA_ACC` | 84.91 |
101
+ | `ENTS_F` | 81.79 |
102
+ | `ENTS_P` | 81.70 |
103
+ | `ENTS_R` | 81.88 |
104
+ | `TRANSFORMER_LOSS` | 1224302.39 |
105
+ | `MORPHOLOGIZER_LOSS` | 388869.90 |
106
+ | `PARSER_LOSS` | 7861802.70 |
107
+ | `NER_LOSS` | 68503.20 |
108
+
109
+
110
+ ## Bias and Robustness
111
+
112
+ Besides the validation done by spaCy on the DaNE test set, DaCy also provides a series of augmentations to the DaNE test set to see how well the models deal with these types of augmentations.
113
+ These can be seen as behavioural probes akin to the NLP checklist.
114
+
115
+ ### Deterministic Augmentations
116
+ Deterministic augmentations are augmentations that always yield the same result.
117
+
118
+ | Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) | Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |
119
+ | --- | --- | --- | --- | --- | --- | --- | --- |
120
+ | No augmentation | 0.98 | 0.975 | 0.888 | 0.857 | 0.936 | 0.844 | 0.765 |
121
+ | Æøå Augmentation | 0.963 | 0.955 | 0.88 | 0.844 | 0.944 | 0.754 | 0.712 |
122
+ | Lowercase | 0.98 | 0.975 | 0.888 | 0.857 | 0.936 | 0.848 | 0.765 |
123
+ | No Spacing | 0.229 | 0.229 | 0.004 | 0.004 | 0.683 | 0.225 | 0.058 |
124
+ | Abbreviated first names | 0.976 | 0.974 | 0.885 | 0.854 | 0.934 | 0.845 | 0.741 |
125
+ | Input size augmentation 5 sentences | 0.978 | 0.973 | 0.88 | 0.85 | 0.883 | 0.844 | 0.77 |
126
+ | Input size augmentation 10 sentences | 0.977 | 0.973 | 0.878 | 0.847 | 0.872 | 0.844 | 0.768 |
127
+
128
+
129
+
130
+ ### Stochastic Augmentations
131
+ Stochastic augmentations are augmentations that are repeated multiple times to estimate the effect of the augmentation.
132
+
133
+ | Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) | Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |
134
+ | --- | --- | --- | --- | --- | --- | --- | --- |
135
+ | Keystroke errors 2% | 0.936 (0.002) | 0.934 (0.002) | 0.836 (0.002) | 0.795 (0.002) | 0.889 (0.002) | 0.773 (0.002) | 0.627 (0.002) |
136
+ | Keystroke errors 5% | 0.869 (0.003) | 0.873 (0.003) | 0.753 (0.003) | 0.696 (0.003) | 0.829 (0.003) | 0.68 (0.003) | 0.487 (0.003) |
137
+ | Keystroke errors 15% | 0.647 (0.007) | 0.684 (0.007) | 0.5 (0.007) | 0.417 (0.007) | 0.664 (0.007) | 0.46 (0.007) | 0.256 (0.007) |
138
+ | Danish names | 0.978 (0.0) | 0.975 (0.0) | 0.885 (0.0) | 0.855 (0.0) | 0.934 (0.0) | 0.847 (0.0) | 0.771 (0.0) |
139
+ | Muslim names | 0.978 (0.0) | 0.975 (0.0) | 0.886 (0.0) | 0.855 (0.0) | 0.935 (0.0) | 0.847 (0.0) | 0.749 (0.0) |
140
+ | Female names | 0.979 (0.0) | 0.975 (0.0) | 0.886 (0.0) | 0.856 (0.0) | 0.933 (0.0) | 0.847 (0.0) | 0.775 (0.0) |
141
+ | Male names | 0.978 (0.0) | 0.975 (0.0) | 0.885 (0.0) | 0.855 (0.0) | 0.933 (0.0) | 0.847 (0.0) | 0.773 (0.0) |
142
+ | Spacing Augmentation 5% | 0.941 (0.002) | 0.937 (0.002) | 0.78 (0.002) | 0.751 (0.002) | 0.905 (0.002) | 0.812 (0.002) | 0.701 (0.002) |
143
+
144
+ <details>
145
+
146
+ <summary> Description of Augmenters </summary>
147
+
148
+
149
+
150
+ **No augmentation:**
151
+ Applies no augmentation to the DaNE test set.
152
+
153
+ **Æøå Augmentation:**
154
+ This augmentation replaces æ, ø, and å with their spelling variations ae, oe and aa respectively.
155
+
156
+ **Lowercase:**
157
+ This augmentation lowercases all text.
158
+
159
+ **No Spacing:**
160
+ This augmentation removes all spacing from the text.
161
+
162
+ **Abbreviated first names:**
163
+ This augmentation abbreviates the first names of entities. For instance, 'Kenneth Enevoldsen' would turn into 'K. Enevoldsen'.
164
+
165
+ **Keystroke errors 2%:**
166
+ This augmentation simulates keystroke errors by replacing 2% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.
167
+
168
+ **Keystroke errors 5%:**
169
+ This augmentation simulates keystroke errors by replacing 5% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.
170
+
171
+ **Keystroke errors 15%:**
172
+ This augmentation simulates keystroke errors by replacing 15% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.
173
+
174
+ **Danish names:**
175
+ This augmentation replaces all names with Danish names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.
176
+
177
+ **Muslim names:**
178
+ This augmentation replaces all names with Muslim names derived from Meldgaard (2005). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.
179
+
180
+ **Female names:**
181
+ This augmentation replaces all names with Danish female names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.
182
+
183
+ **Male names:**
184
+ This augmentation replaces all names with Danish male names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.
185
+
186
+ **Spacing Augmentation 5%:**
187
+ This augmentation replaces all names with Danish male names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.
188
+ </details>
189
+ <br />
190
+
191
+
192
+ ### Hardware
193
+ This was run and trained on a Quadro RTX 8000 GPU.
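The deterministic augmenters described in the README above are simple string transformations. The following is a minimal sketch of three of them (Æøå replacement, no spacing, abbreviated first names) written from those descriptions alone. The function names are hypothetical and this is not DaCy's own implementation; the handling of upper-case Æ, Ø and Å is an added assumption, since the README only mentions the lower-case letters.

```python
# Illustrative re-implementations based on the README descriptions;
# these are not DaCy's own functions.

# Upper-case mappings are an assumption not stated in the README.
AEOEAA = {"æ": "ae", "ø": "oe", "å": "aa", "Æ": "Ae", "Ø": "Oe", "Å": "Aa"}


def augment_aeoeaa(text: str) -> str:
    """Replace æ, ø and å with the spelling variants ae, oe and aa."""
    return "".join(AEOEAA.get(char, char) for char in text)


def remove_spacing(text: str) -> str:
    """Remove all whitespace, as in the 'No Spacing' augmentation."""
    return "".join(text.split())


def abbreviate_first_name(full_name: str) -> str:
    """Turn 'Kenneth Enevoldsen' into 'K. Enevoldsen'."""
    first, _, rest = full_name.partition(" ")
    if not first or not rest:
        return full_name  # single-token names are left unchanged
    return f"{first[0]}. {rest}"


if __name__ == "__main__":
    print(augment_aeoeaa("Rødgrød med fløde"))
    print(remove_spacing("Hej med dig"))
    print(abbreviate_first_name("Kenneth Enevoldsen"))
```

Because these functions always return the same output for the same input, a single evaluation pass is enough, which is why the deterministic table reports no standard deviations.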
attribute_ruler/patterns ADDED
@@ -0,0 +1 @@
1
+
config.cfg ADDED
@@ -0,0 +1,226 @@
1
+ [paths]
2
+ train = "corpus/dane/train.spacy"
3
+ dev = "corpus/dane/dev.spacy"
4
+ vectors = null
5
+ raw = null
6
+ init_tok2vec = null
7
+ vocab_data = null
8
+
9
+ [system]
10
+ gpu_allocator = "pytorch"
11
+ seed = 1
12
+
13
+ [nlp]
14
+ lang = "da"
15
+ pipeline = ["transformer","morphologizer","parser","attribute_ruler","lemmatizer","ner"]
16
+ disabled = []
17
+ before_creation = null
18
+ after_creation = null
19
+ after_pipeline_creation = null
20
+ batch_size = 64
21
+ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
22
+
23
+ [components]
24
+
25
+ [components.attribute_ruler]
26
+ factory = "attribute_ruler"
27
+ validate = false
28
+
29
+ [components.lemmatizer]
30
+ factory = "lemmatizer"
31
+ mode = "lookup"
32
+ model = null
33
+ overwrite = false
34
+
35
+ [components.morphologizer]
36
+ factory = "morphologizer"
37
+
38
+ [components.morphologizer.model]
39
+ @architectures = "spacy.Tagger.v1"
40
+ nO = null
41
+
42
+ [components.morphologizer.model.tok2vec]
43
+ @architectures = "spacy-transformers.TransformerListener.v1"
44
+ grad_factor = 1.0
45
+ upstream = "transformer"
46
+ pooling = {"@layers":"reduce_mean.v1"}
47
+
48
+ [components.ner]
49
+ factory = "ner"
50
+ incorrect_spans_key = null
51
+ moves = null
52
+ update_with_oracle_cut_size = 100
53
+
54
+ [components.ner.model]
55
+ @architectures = "spacy.TransitionBasedParser.v2"
56
+ state_type = "ner"
57
+ extra_state_tokens = false
58
+ hidden_width = 64
59
+ maxout_pieces = 2
60
+ use_upper = false
61
+ nO = null
62
+
63
+ [components.ner.model.tok2vec]
64
+ @architectures = "spacy-transformers.TransformerListener.v1"
65
+ grad_factor = 1.0
66
+ upstream = "transformer"
67
+ pooling = {"@layers":"reduce_mean.v1"}
68
+
69
+ [components.parser]
70
+ factory = "parser"
71
+ learn_tokens = false
72
+ min_action_freq = 30
73
+ moves = null
74
+ update_with_oracle_cut_size = 100
75
+
76
+ [components.parser.model]
77
+ @architectures = "spacy.TransitionBasedParser.v2"
78
+ state_type = "parser"
79
+ extra_state_tokens = false
80
+ hidden_width = 64
81
+ maxout_pieces = 2
82
+ use_upper = false
83
+ nO = null
84
+
85
+ [components.parser.model.tok2vec]
86
+ @architectures = "spacy-transformers.TransformerListener.v1"
87
+ grad_factor = 1.0
88
+ upstream = "transformer"
89
+ pooling = {"@layers":"reduce_mean.v1"}
90
+
91
+ [components.transformer]
92
+ factory = "transformer"
93
+ max_batch_items = 4096
94
+ set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
95
+
96
+ [components.transformer.model]
97
+ @architectures = "spacy-transformers.TransformerModel.v1"
98
+ name = "Maltehb/danish-bert-botxo"
99
+
100
+ [components.transformer.model.get_spans]
101
+ @span_getters = "spacy-transformers.strided_spans.v1"
102
+ window = 128
103
+ stride = 96
104
+
105
+ [components.transformer.model.tokenizer_config]
106
+ use_fast = true
107
+ strip_accents = false
108
+
109
+ [corpora]
110
+
111
+ [corpora.dev]
112
+ @readers = "spacy.Corpus.v1"
113
+ limit = 0
114
+ max_length = 0
115
+ path = ${paths:dev}
116
+ gold_preproc = false
117
+ augmenter = null
118
+
119
+ [corpora.train]
120
+ @readers = "spacy.Corpus.v1"
121
+ path = ${paths:train}
122
+ max_length = 500
123
+ gold_preproc = false
124
+ limit = 0
125
+
126
+ [corpora.train.augmenter]
127
+ @augmenters = "spacy.lower_case.v1"
128
+ level = 0.1
129
+
130
+ [training]
131
+ train_corpus = "corpora.train"
132
+ dev_corpus = "corpora.dev"
133
+ seed = ${system:seed}
134
+ gpu_allocator = ${system:gpu_allocator}
135
+ dropout = 0.1
136
+ accumulate_gradient = 3
137
+ patience = 5000
138
+ max_epochs = 0
139
+ max_steps = 20000
140
+ eval_frequency = 1000
141
+ frozen_components = []
142
+ before_to_disk = null
143
+ annotating_components = []
144
+
145
+ [training.batcher]
146
+ @batchers = "spacy.batch_by_padded.v1"
147
+ discard_oversize = true
148
+ get_length = null
149
+ size = 2000
150
+ buffer = 256
151
+
152
+ [training.logger]
153
+ @loggers = "spacy.WandbLogger.v1"
154
+ project_name = "dacy-an-efficient-pipeline-for-danish"
155
+ remove_config_values = []
156
+
157
+ [training.optimizer]
158
+ @optimizers = "Adam.v1"
159
+ beta1 = 0.9
160
+ beta2 = 0.999
161
+ L2_is_weight_decay = true
162
+ L2 = 0.01
163
+ grad_clip = 1.0
164
+ use_averages = true
165
+ eps = 0.00000001
166
+
167
+ [training.optimizer.learn_rate]
168
+ @schedules = "warmup_linear.v1"
169
+ warmup_steps = 250
170
+ total_steps = 20000
171
+ initial_rate = 0.00005
172
+
173
+ [training.score_weights]
174
+ pos_acc = 0.08
175
+ morph_acc = 0.08
176
+ morph_per_feat = null
177
+ dep_uas = 0.0
178
+ dep_las = 0.16
179
+ dep_las_per_type = null
180
+ sents_p = null
181
+ sents_r = null
182
+ sents_f = 0.02
183
+ lemma_acc = 0.5
184
+ ents_f = 0.16
185
+ ents_p = 0.0
186
+ ents_r = 0.0
187
+ ents_per_type = null
188
+
189
+ [pretraining]
190
+
191
+ [initialize]
192
+ vocab_data = ${paths.vocab_data}
193
+ vectors = ${paths.vectors}
194
+ init_tok2vec = ${paths.init_tok2vec}
195
+ before_init = null
196
+ after_init = null
197
+
198
+ [initialize.components]
199
+
200
+ [initialize.components.morphologizer]
201
+
202
+ [initialize.components.morphologizer.labels]
203
+ @readers = "spacy.read_labels.v1"
204
+ path = "corpus/labels/morphologizer.json"
205
+ require = false
206
+
207
+ [initialize.components.ner]
208
+
209
+ [initialize.components.ner.labels]
210
+ @readers = "spacy.read_labels.v1"
211
+ path = "corpus/labels/ner.json"
212
+ require = false
213
+
214
+ [initialize.components.parser]
215
+
216
+ [initialize.components.parser.labels]
217
+ @readers = "spacy.read_labels.v1"
218
+ path = "corpus/labels/parser.json"
219
+ require = false
220
+
221
+ [initialize.lookups]
222
+ @misc = "spacy.LookupsDataLoader.v1"
223
+ lang = ${nlp.lang}
224
+ tables = ["lexeme_norm"]
225
+
226
+ [initialize.tokenizer]
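The `[training.optimizer.learn_rate]` block above configures a `warmup_linear.v1` schedule with `warmup_steps = 250`, `total_steps = 20000` and `initial_rate = 0.00005`. The sketch below shows what such a schedule would look like under the assumption that the rate ramps linearly from zero to `initial_rate` during warm-up and then decays linearly to zero at `total_steps`. It is a standalone illustration with a hypothetical function name, not the library's code, so the exact behaviour should be checked against the thinc documentation.

```python
def warmup_linear_rate(
    step: int,
    initial_rate: float = 0.00005,
    warmup_steps: int = 250,
    total_steps: int = 20000,
) -> float:
    """Learning rate at a given step: linear warm-up, then linear decay.

    Assumed shape of the schedule; defaults are taken from config.cfg.
    """
    if step < warmup_steps:
        # Ramp up from 0 to initial_rate over the warm-up phase.
        factor = step / max(1, warmup_steps)
    else:
        # Decay from initial_rate to 0 between warmup_steps and total_steps.
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return initial_rate * factor


if __name__ == "__main__":
    for step in (0, 125, 250, 10125, 20000):
        print(step, warmup_linear_rate(step))
```

Under this assumption the peak rate is reached at step 250 and training ends with a rate of zero at step 20000, which matches `max_steps = 20000` in the `[training]` block.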
da_dacy_medium_trf-any-py3-none-any.whl ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:18148d5a7c83d6842645d471d6ea6973c11b1e039bdc3a1a704f4c2c7c9ea7b4
3
+ size 417787063
lemmatizer/lookups/lookups.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6864ce8705293ba1b6dcf349ec133cdc33db3ba57f6e9337458cfe5073b6f103
3
+ size 11537995
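The two files above are stored as Git LFS pointers: three `key value` lines giving the spec version, the object id (hash algorithm and digest) and the size in bytes. As a small illustration, the sketch below parses such a pointer into a dictionary. The function name `parse_lfs_pointer` and the returned field names are hypothetical, and the parser only handles the three keys shown here.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents for lemmatizer/lookups/lookups.bin, copied from the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6864ce8705293ba1b6dcf349ec133cdc33db3ba57f6e9337458cfe5073b6f103
size 11537995
"""

if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["hash_algorithm"], info["size_bytes"])
```

The digest can be used to verify a downloaded file by comparing it with the SHA-256 hash of the file's contents.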
meta.json ADDED
@@ -0,0 +1,581 @@
1
+ {
2
+ "lang":"da",
3
+ "name":"dacy_medium_trf",
4
+ "version":"0.1.0",
5
+ "description":"\n<a href=\"https://github.com/centre-for-humanities-computing/Dacy\"><img src=\"https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png\" width=\"175\" height=\"175\" align=\"right\" /></a>\n\n# DaCy medium transformer\n\nDaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines.\nDaCy's largest pipeline has achieved State-of-the-Art performance on Named entity recognition, part-of-speech tagging and dependency \nparsing for Danish on the DaNE dataset. Check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results. \nDaCy also contains guides on usage of the package as well as behavioural test for biases and robustness of Danish NLP pipelines.\n ",
6
+ "author":"Centre for Humanities Computing Aarhus",
7
+ "email":"[email protected]",
8
+ "url":"https://chcaa.io/#/",
9
+ "license":"Apache-2.0 License",
10
+ "spacy_version":">=3.1.1,<3.2.0",
11
+ "spacy_git_version":"ffaead8fe",
12
+ "vectors":{
13
+ "width":0,
14
+ "vectors":0,
15
+ "keys":0,
16
+ "name":null
17
+ },
18
+ "labels":{
19
+ "transformer":[
20
+
21
+ ],
22
+ "morphologizer":[
23
+ "AdpType=Prep|POS=ADP",
24
+ "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN",
25
+ "Mood=Ind|POS=AUX|Tense=Pres|VerbForm=Fin|Voice=Act",
26
+ "POS=PROPN",
27
+ "Definite=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part",
28
+ "Definite=Def|Gender=Neut|Number=Sing|POS=NOUN",
29
+ "POS=SCONJ",
30
+ "Definite=Def|Gender=Com|Number=Sing|POS=NOUN",
31
+ "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Act",
32
+ "POS=ADV",
33
+ "Number=Plur|POS=DET|PronType=Dem",
34
+ "Degree=Pos|Number=Plur|POS=ADJ",
35
+ "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN",
36
+ "POS=PUNCT",
37
+ "POS=CCONJ",
38
+ "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ",
39
+ "Degree=Cmp|POS=ADJ",
40
+ "POS=PRON|PartType=Inf",
41
+ "Gender=Com|Number=Sing|POS=DET|PronType=Ind",
42
+ "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ",
43
+ "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs",
44
+ "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN",
45
+ "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ",
46
+ "Gender=Neut|Number=Sing|POS=DET|PronType=Dem",
47
+ "Degree=Pos|POS=ADV",
48
+ "Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part",
49
+ "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN",
50
+ "POS=PRON|PronType=Dem",
51
+ "NumType=Card|POS=NUM",
52
+ "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ",
53
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs",
54
+ "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ",
55
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs",
56
+ "NumType=Ord|POS=ADJ",
57
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
58
+ "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act",
59
+ "POS=VERB|VerbForm=Inf|Voice=Act",
60
+ "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act",
61
+ "POS=NOUN",
62
+ "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass",
63
+ "POS=ADP|PartType=Inf",
64
+ "Degree=Pos|POS=ADJ",
65
+ "Definite=Def|Gender=Com|Number=Plur|POS=NOUN",
66
+ "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs",
67
+ "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN",
68
+ "POS=AUX|VerbForm=Inf|Voice=Act",
69
+ "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ",
70
+ "Gender=Com|Number=Sing|POS=DET|PronType=Dem",
71
+ "Number=Plur|POS=DET|PronType=Ind",
72
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Ind",
73
+ "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes",
74
+ "POS=PART|PartType=Inf",
75
+ "Gender=Neut|Number=Sing|POS=DET|PronType=Ind",
76
+ "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs",
77
+ "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN",
78
+ "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs",
79
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs",
80
+ "Case=Nom|Gender=Com|POS=PRON|PronType=Ind",
81
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind",
82
+ "Mood=Imp|POS=VERB",
83
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs",
84
+ "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part",
85
+ "POS=X",
86
+ "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs",
87
+ "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN",
88
+ "POS=VERB|Tense=Pres|VerbForm=Part",
89
+ "Number=Plur|POS=PRON|PronType=Int,Rel",
90
+ "POS=VERB|VerbForm=Inf|Voice=Pass",
91
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN",
92
+ "Degree=Cmp|POS=ADV",
93
+ "POS=ADV|PartType=Inf",
94
+ "Degree=Sup|POS=ADV",
95
+ "Number=Plur|POS=PRON|PronType=Dem",
96
+ "Number=Plur|POS=PRON|PronType=Ind",
97
+ "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN",
98
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs",
99
+ "Case=Gen|POS=PROPN",
100
+ "POS=ADP",
101
+ "Degree=Cmp|Number=Plur|POS=ADJ",
102
+ "Definite=Def|Degree=Sup|POS=ADJ",
103
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs",
104
+ "Degree=Pos|Number=Sing|POS=ADJ",
105
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
106
+ "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form",
107
+ "Number=Plur|POS=PRON|PronType=Rcp",
108
+ "Case=Gen|Degree=Cmp|POS=ADJ",
109
+ "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN",
110
+ "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs",
111
+ "POS=INTJ",
112
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs",
113
+ "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ",
114
+ "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form",
115
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs",
116
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs",
117
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN",
118
+ "Number=Sing|POS=PRON|PronType=Int,Rel",
119
+ "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form",
120
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel",
121
+ "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ",
122
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs",
123
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
124
+ "Definite=Ind|Number=Sing|POS=NOUN",
125
+ "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
126
+ "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
127
+ "POS=SYM",
128
+ "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs",
129
+ "Degree=Sup|POS=ADJ",
130
+ "Number=Plur|POS=DET|PronType=Ind|Style=Arch",
131
+ "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Dem",
132
+ "Foreign=Yes|POS=X",
133
+ "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs",
134
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem",
135
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs",
136
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN",
137
+ "Case=Gen|POS=PRON|PronType=Int,Rel",
138
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Dem",
139
+ "Abbr=Yes|POS=X",
140
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN",
141
+ "Definite=Def|Degree=Abs|POS=ADJ",
142
+ "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ",
143
+ "Definite=Ind|POS=NOUN",
144
+ "Gender=Com|Number=Plur|POS=NOUN",
145
+ "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs",
146
+ "Gender=Com|POS=PRON|PronType=Int,Rel",
147
+ "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs",
148
+ "Degree=Abs|POS=ADV",
149
+ "POS=VERB|VerbForm=Ger",
150
+ "POS=VERB|Tense=Past|VerbForm=Part",
151
+ "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ",
152
+ "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form",
153
+ "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ",
154
+ "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ",
155
+ "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs",
156
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel",
157
+ "POS=VERB|Tense=Pres",
158
+ "Case=Gen|Number=Plur|POS=DET|PronType=Ind",
159
+ "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs",
160
+ "POS=PRON|Person=2|Polite=Form|Poss=Yes|PronType=Prs",
161
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs",
162
+ "POS=AUX|Tense=Pres|VerbForm=Part",
163
+ "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass",
164
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
165
+ "Degree=Sup|Number=Plur|POS=ADJ",
166
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs",
167
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
168
+ "Definite=Ind|Number=Plur|POS=NOUN",
169
+ "Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
170
+ "Mood=Imp|POS=AUX",
171
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs",
172
+ "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
173
+ "Definite=Def|Gender=Com|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part",
174
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs",
175
+ "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind",
176
+ "Case=Gen|POS=NOUN",
177
+ "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
178
+ "POS=DET|PronType=Dem",
179
+ "Definite=Def|Number=Plur|POS=NOUN"
180
+ ],
181
+ "parser":[
182
+ "ROOT",
183
+ "acl:relcl",
184
+ "advcl",
185
+ "advmod",
186
+ "amod",
187
+ "appos",
188
+ "aux",
189
+ "case",
190
+ "cc",
191
+ "ccomp",
192
+ "compound:prt",
193
+ "conj",
194
+ "cop",
195
+ "dep",
196
+ "det",
197
+ "expl",
198
+ "fixed",
199
+ "flat",
200
+ "iobj",
201
+ "list",
202
+ "mark",
203
+ "nmod",
204
+ "nmod:poss",
205
+ "nsubj",
206
+ "nummod",
207
+ "obj",
208
+ "obl",
209
+ "obl:loc",
210
+ "obl:tmod",
211
+ "punct",
212
+ "xcomp"
213
+ ],
214
+ "attribute_ruler":[
215
+
216
+ ],
217
+ "lemmatizer":[
218
+
219
+ ],
220
+ "ner":[
221
+ "LOC",
222
+ "MISC",
223
+ "ORG",
224
+ "PER"
225
+ ]
226
+ },
227
+ "pipeline":[
228
+ "transformer",
229
+ "morphologizer",
230
+ "parser",
231
+ "attribute_ruler",
232
+ "lemmatizer",
233
+ "ner"
234
+ ],
235
+ "components":[
236
+ "transformer",
237
+ "morphologizer",
238
+ "parser",
239
+ "attribute_ruler",
240
+ "lemmatizer",
241
+ "ner"
242
+ ],
243
+ "disabled":[
244
+
245
+ ],
246
+ "_sourced_vectors_hashes":{
247
+
248
+ },
249
+ "performance":{
250
+ "pos_acc":0.9744285161,
251
+ "morph_acc":0.9723944208,
252
+ "morph_per_feat":{
253
+ "Mood":{
254
+ "p":0.9942473634,
255
+ "r":0.9885605338,
256
+ "f":0.9913957935
257
+ },
258
+ "Tense":{
259
+ "p":0.9841029523,
260
+ "r":0.9789156627,
261
+ "f":0.9815024538
262
+ },
263
+ "VerbForm":{
264
+ "p":0.9852125693,
265
+ "r":0.9785801714,
266
+ "f":0.9818851704
267
+ },
268
+ "Voice":{
269
+ "p":0.9947407964,
270
+ "r":0.9895366218,
271
+ "f":0.9921318846
272
+ },
273
+ "Definite":{
274
+ "p":0.9879711307,
275
+ "r":0.9735282497,
276
+ "f":0.9806965174
277
+ },
278
+ "Gender":{
279
+ "p":0.9828686597,
280
+ "r":0.9724160851,
281
+ "f":0.9776144337
282
+ },
283
+ "Number":{
284
+ "p":0.986803906,
285
+ "r":0.9752217006,
286
+ "f":0.9809786173
287
+ },
288
+ "AdpType":{
289
+ "p":0.9946714032,
290
+ "r":0.9902740937,
291
+ "f":0.9924678777
292
+ },
293
+ "PartType":{
294
+ "p":1.0,
295
+ "r":0.9967532468,
296
+ "f":0.9983739837
297
+ },
298
+ "Case":{
299
+ "p":0.9951923077,
300
+ "r":0.981042654,
301
+ "f":0.9880668258
302
+ },
303
+ "Person":{
304
+ "p":0.9875,
305
+ "r":0.9822380107,
306
+ "f":0.9848619768
307
+ },
308
+ "PronType":{
309
+ "p":0.9950413223,
310
+ "r":0.9901315789,
311
+ "f":0.9925803792
312
+ },
313
+ "NumType":{
314
+ "p":0.9798657718,
315
+ "r":0.9668874172,
316
+ "f":0.9733333333
317
+ },
318
+ "Degree":{
319
+ "p":0.9754901961,
320
+ "r":0.9590361446,
321
+ "f":0.9671931956
322
+ },
323
+ "Reflex":{
324
+ "p":1.0,
325
+ "r":1.0,
326
+ "f":1.0
327
+ },
328
+ "Polite":{
329
+ "p":0.0,
330
+ "r":0.0,
331
+ "f":0.0
332
+ },
333
+ "Number[psor]":{
334
+ "p":0.9770114943,
335
+ "r":0.988372093,
336
+ "f":0.9826589595
337
+ },
338
+ "Poss":{
339
+ "p":1.0,
340
+ "r":0.9886363636,
341
+ "f":0.9942857143
342
+ },
343
+ "Foreign":{
344
+ "p":1.0,
345
+ "r":0.4,
346
+ "f":0.5714285714
347
+ },
348
+ "Abbr":{
349
+ "p":1.0,
350
+ "r":0.4,
351
+ "f":0.5714285714
352
+ },
353
+ "Style":{
354
+ "p":1.0,
355
+ "r":1.0,
356
+ "f":1.0
357
+ }
358
+ },
359
+ "dep_uas":0.8714971531,
360
+ "dep_las":0.8396963608,
361
+ "dep_las_per_type":{
362
+ "advmod":{
363
+ "p":0.793006993,
364
+ "r":0.8008474576,
365
+ "f":0.796907941
366
+ },
367
+ "root":{
368
+ "p":0.8450704225,
369
+ "r":0.8510638298,
370
+ "f":0.8480565371
371
+ },
372
+ "nsubj":{
373
+ "p":0.9174603175,
374
+ "r":0.914556962,
375
+ "f":0.9160063391
376
+ },
377
+ "case":{
378
+ "p":0.9192422732,
379
+ "r":0.9110671937,
380
+ "f":0.9151364764
381
+ },
382
+ "obl":{
383
+ "p":0.7719568567,
384
+ "r":0.7791601866,
385
+ "f":0.7755417957
386
+ },
387
+ "cc":{
388
+ "p":0.851744186,
389
+ "r":0.851744186,
390
+ "f":0.851744186
391
+ },
392
+ "conj":{
393
+ "p":0.7320954907,
394
+ "r":0.736,
395
+ "f":0.7340425532
396
+ },
397
+ "obj":{
398
+ "p":0.8736263736,
399
+ "r":0.9262135922,
400
+ "f":0.8991517436
401
+ },
402
+ "aux":{
403
+ "p":0.8796561605,
404
+ "r":0.8950437318,
405
+ "f":0.887283237
406
+ },
407
+ "acl:relcl":{
408
+ "p":0.729281768,
409
+ "r":0.7135135135,
410
+ "f":0.7213114754
411
+ },
412
+ "obl:loc":{
413
+ "p":0.7285714286,
414
+ "r":0.7285714286,
415
+ "f":0.7285714286
416
+ },
417
+ "det":{
418
+ "p":0.9339933993,
419
+ "r":0.9324546952,
420
+ "f":0.933223413
421
+ },
422
+ "amod":{
423
+ "p":0.8799313894,
424
+ "r":0.8754266212,
425
+ "f":0.877673225
426
+ },
427
+ "nmod:poss":{
428
+ "p":0.702970297,
429
+ "r":0.702970297,
430
+ "f":0.702970297
431
+ },
432
+ "ccomp":{
433
+ "p":0.75,
434
+ "r":0.7741935484,
435
+ "f":0.7619047619
436
+ },
437
+ "nummod":{
438
+ "p":0.808,
439
+ "r":0.8416666667,
440
+ "f":0.8244897959
441
+ },
442
+ "flat":{
443
+ "p":0.8881578947,
444
+ "r":0.8940397351,
445
+ "f":0.8910891089
446
+ },
447
+ "compound:prt":{
448
+ "p":0.7,
449
+ "r":0.512195122,
450
+ "f":0.5915492958
451
+ },
452
+ "advcl":{
453
+ "p":0.7280701754,
454
+ "r":0.7155172414,
455
+ "f":0.7217391304
456
+ },
457
+ "mark":{
458
+ "p":0.9145833333,
459
+ "r":0.9014373717,
460
+ "f":0.9079627715
461
+ },
462
+ "cop":{
463
+ "p":0.8850574713,
464
+ "r":0.88,
465
+ "f":0.88252149
466
+ },
467
+ "dep":{
468
+ "p":0.1855670103,
469
+ "r":0.3396226415,
470
+ "f":0.24
471
+ },
472
+ "nmod":{
473
+ "p":0.7370600414,
474
+ "r":0.6953125,
475
+ "f":0.7155778894
476
+ },
477
+ "iobj":{
478
+ "p":1.0,
479
+ "r":0.6363636364,
480
+ "f":0.7777777778
481
+ },
482
+ "xcomp":{
483
+ "p":0.625,
484
+ "r":0.4237288136,
485
+ "f":0.5050505051
486
+ },
487
+ "appos":{
488
+ "p":0.6486486486,
489
+ "r":0.7272727273,
490
+ "f":0.6857142857
491
+ },
492
+ "list":{
493
+ "p":0.4,
494
+ "r":0.3333333333,
495
+ "f":0.3636363636
496
+ },
497
+ "vocative":{
498
+ "p":0.0,
499
+ "r":0.0,
500
+ "f":0.0
501
+ },
502
+ "fixed":{
503
+ "p":0.8947368421,
504
+ "r":0.8095238095,
505
+ "f":0.85
506
+ },
507
+ "expl":{
508
+ "p":0.9117647059,
509
+ "r":0.9117647059,
510
+ "f":0.9117647059
511
+ },
512
+ "obl:tmod":{
513
+ "p":0.8333333333,
514
+ "r":0.5555555556,
515
+ "f":0.6666666667
516
+ },
517
+ "discourse":{
518
+ "p":0.0,
519
+ "r":0.0,
520
+ "f":0.0
521
+ }
522
+ },
523
+ "sents_p":0.873015873,
524
+ "sents_r":0.8776595745,
525
+ "sents_f":0.875331565,
526
+ "lemma_acc":0.8491041162,
527
+ "ents_f":0.8178980229,
528
+ "ents_p":0.817047817,
529
+ "ents_r":0.81875,
530
+ "ents_per_type":{
531
+ "PER":{
532
+ "p":0.896969697,
533
+ "r":0.8915662651,
534
+ "f":0.8942598187
535
+ },
536
+ "ORG":{
537
+ "p":0.7228915663,
538
+ "r":0.6666666667,
539
+ "f":0.6936416185
540
+ },
541
+ "MISC":{
542
+ "p":0.7190082645,
543
+ "r":0.7699115044,
544
+ "f":0.7435897436
545
+ },
546
+ "LOC":{
547
+ "p":0.875,
548
+ "r":0.8828828829,
549
+ "f":0.8789237668
550
+ }
551
+ },
552
+ "transformer_loss":12243.0238996088,
553
+ "morphologizer_loss":3888.699029932,
554
+ "parser_loss":78618.0269862204,
555
+ "ner_loss":685.0320498052
556
+ },
557
+ "sources":[
558
+ {
559
+ "name":"UD Danish DDT v2.5",
560
+ "url":"https://github.com/UniversalDependencies/UD_Danish-DDT",
561
+ "license":"CC BY-SA 4.0",
562
+ "author":"Johannsen, Anders; Mart\u00ednez Alonso, H\u00e9ctor; Plank, Barbara"
563
+ },
564
+ {
565
+ "name":"DaNE",
566
+ "url":"https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane",
567
+ "license":"CC BY-SA 4.0",
568
+ "author":"Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders S\u00f8gaard"
569
+ },
570
+ {
571
+ "name":"Maltehb/danish-bert-botxo",
572
+ "author":"BotXO.ai",
573
+ "url":"https://huggingface.co/Maltehb/danish-bert-botxo",
574
+ "license":"CC BY 4.0"
575
+ }
576
+ ],
577
+ "requirements":[
578
+ "spacy-transformers>=1.0.3,<1.1.0"
579
+ ],
580
+ "notes":"\n## Bias and Robustness\n\nBesides the validation done by spaCy on the DaNE test set, DaCy also provides a series of augmentations to the DaNE test set to see how well the models deal with these types of augmentations.\nThese can be seen as behavioural probes akin to the NLP checklist.\n\n### Deterministic Augmentations\nDeterministic augmentations are augmentations that always yield the same result.\n\n| Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) |\u00a0Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |\n| --- | --- | --- | --- | --- | --- | --- | --- |\n| No augmentation | 0.98 | 0.975 | 0.888 | 0.857 | 0.936 | 0.844 | 0.765 |\n| \u00c6\u00f8\u00e5 Augmentation | 0.963 | 0.955 | 0.88 | 0.844 | 0.944 | 0.754 | 0.712 |\n| Lowercase | 0.98 | 0.975 | 0.888 | 0.857 | 0.936 | 0.848 | 0.765 |\n| No Spacing | 0.229 | 0.229 | 0.004 | 0.004 | 0.683 | 0.225 | 0.058 |\n| Abbreviated first names | 0.976 | 0.974 | 0.885 | 0.854 | 0.934 | 0.845 | 0.741 |\n| Input size augmentation 5 sentences | 0.978 | 0.973 | 0.88 | 0.85 | 0.883 | 0.844 | 0.77 |\n| Input size augmentation 10 sentences | 0.977 | 0.973 | 0.878 | 0.847 | 0.872 | 0.844 | 0.768 |\n\n\n\n### Stochastic Augmentations\nStochastic augmentations are augmentations that are repeated multiple times to estimate the effect of the augmentation.\n\n| Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) |\u00a0Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |\n| --- | --- | --- | --- | --- | --- | --- | --- |\n| Keystroke errors 2% | 0.936 (0.002) | 0.934 (0.002) | 0.836 (0.002) | 0.795 (0.002) | 0.889 (0.002) | 0.773 (0.002) | 0.627 (0.002) |\n| Keystroke errors 5% | 0.869 (0.003) | 0.873 (0.003) | 0.753 (0.003) | 0.696 (0.003) | 0.829 (0.003) | 0.68 (0.003) | 0.487 (0.003) |\n| Keystroke errors 15% | 0.647 (0.007) | 0.684 (0.007) | 0.5 (0.007) | 0.417 (0.007) | 0.664 (0.007) | 0.46 (0.007) | 0.256 (0.007) |\n| Danish names | 0.978 (0.0) | 0.975 (0.0) | 0.885 (0.0) | 0.855 (0.0) | 0.934 (0.0) | 0.847 (0.0) | 0.771 (0.0) |\n| Muslim names | 0.978 (0.0) | 0.975 (0.0) | 0.886 (0.0) | 0.855 (0.0) | 0.935 (0.0) | 0.847 (0.0) | 0.749 (0.0) |\n| Female names | 0.979 (0.0) | 0.975 (0.0) | 0.886 (0.0) | 0.856 (0.0) | 0.933 (0.0) | 0.847 (0.0) | 0.775 (0.0) |\n| Male names | 0.978 (0.0) | 0.975 (0.0) | 0.885 (0.0) | 0.855 (0.0) | 0.933 (0.0) | 0.847 (0.0) | 0.773 (0.0) |\n| Spacing Augmentation 5% | 0.941 (0.002) | 0.937 (0.002) | 0.78 (0.002) | 0.751 (0.002) | 0.905 (0.002) | 0.812 (0.002) | 0.701 (0.002) |\n\n<details>\n\n<summary> Description of Augmenters </summary>\n\n \n\n**No augmentation:**\nApplies no augmentation to the DaNE test set.\n\n**\u00c6\u00f8\u00e5 Augmentation:**\nThis augmentation replaces \u00e6, \u00f8, and \u00e5 with their spelling variations ae, oe, and aa, respectively.\n\n**Lowercase:**\nThis augmentation lowercases all text.\n\n**No Spacing:**\nThis augmentation removes all spacing from the text.\n\n**Abbreviated first names:**\nThis augmentation abbreviates the first names of entities. For instance, 'Kenneth Enevoldsen' would turn into 'K. Enevoldsen'.\n\n**Keystroke errors 2%:**\nThis augmentation simulates keystroke errors by replacing 2% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.\n\n**Keystroke errors 5%:**\nThis augmentation simulates keystroke errors by replacing 5% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.\n\n**Keystroke errors 15%:**\nThis augmentation simulates keystroke errors by replacing 15% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.\n\n**Danish names:**\nThis augmentation replaces all names with Danish names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.\n\n**Muslim names:**\nThis augmentation replaces all names with Muslim names derived from Meldgaard (2005). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.\n\n**Female names:**\nThis augmentation replaces all names with Danish female names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.\n\n**Male names:**\nThis augmentation replaces all names with Danish male names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.\n\n**Spacing Augmentation 5%:**\nThis augmentation replaces all names with Danish male names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.\n </details> \n <br /> \n\n\n### Hardware\nThis was run and trained on a Quadro RTX 8000 GPU."
581
+ }
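
The notes in the metadata above describe the Æøå augmentation as replacing æ, ø, and å with the spellings ae, oe, and aa. A minimal sketch of that substitution follows; it is not DaCy's own implementation, the function name `aeoeaa_augment` is hypothetical, and the handling of uppercase letters (Æ to Ae, and so on) is an assumption, since the notes only mention the lowercase forms.

```python
# Hypothetical illustration of the Æøå augmentation described in the notes.
# The uppercase mappings are an assumption; the notes only list æ, ø, and å.
AEOEAA_TABLE = str.maketrans(
    {
        "æ": "ae",
        "ø": "oe",
        "å": "aa",
        "Æ": "Ae",
        "Ø": "Oe",
        "Å": "Aa",
    }
)


def aeoeaa_augment(text: str) -> str:
    """Replace Danish æ/ø/å with their ASCII spelling variants."""
    return text.translate(AEOEAA_TABLE)


if __name__ == "__main__":
    print(aeoeaa_augment("blåbærgrød"))
    print(aeoeaa_augment("Århus"))
```

Because the substitution is deterministic, a single pass over the test set is enough, which is consistent with the table reporting no standard deviation for this augmentation.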
morphologizer/cfg ADDED
@@ -0,0 +1,320 @@
1
+ {
2
+ "labels_morph":{
3
+ "AdpType=Prep|POS=ADP":"AdpType=Prep",
4
+ "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":"Definite=Ind|Gender=Com|Number=Sing",
5
+ "Mood=Ind|POS=AUX|Tense=Pres|VerbForm=Fin|Voice=Act":"Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Act",
6
+ "POS=PROPN":"",
7
+ "Definite=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":"Definite=Ind|Number=Sing|Tense=Past|VerbForm=Part",
8
+ "Definite=Def|Gender=Neut|Number=Sing|POS=NOUN":"Definite=Def|Gender=Neut|Number=Sing",
9
+ "POS=SCONJ":"",
10
+ "Definite=Def|Gender=Com|Number=Sing|POS=NOUN":"Definite=Def|Gender=Com|Number=Sing",
11
+ "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Act":"Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Act",
12
+ "POS=ADV":"",
13
+ "Number=Plur|POS=DET|PronType=Dem":"Number=Plur|PronType=Dem",
14
+ "Degree=Pos|Number=Plur|POS=ADJ":"Degree=Pos|Number=Plur",
15
+ "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":"Definite=Ind|Gender=Com|Number=Plur",
16
+ "POS=PUNCT":"",
17
+ "POS=CCONJ":"",
18
+ "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Cmp|Number=Sing",
19
+ "Degree=Cmp|POS=ADJ":"Degree=Cmp",
20
+ "POS=PRON|PartType=Inf":"PartType=Inf",
21
+ "Gender=Com|Number=Sing|POS=DET|PronType=Ind":"Gender=Com|Number=Sing|PronType=Ind",
22
+ "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Pos|Number=Sing",
23
+ "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Case=Acc|Gender=Neut|Number=Sing|Person=3|PronType=Prs",
24
+ "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":"Definite=Ind|Gender=Neut|Number=Plur",
25
+ "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":"Definite=Def|Degree=Pos|Number=Sing",
26
+ "Gender=Neut|Number=Sing|POS=DET|PronType=Dem":"Gender=Neut|Number=Sing|PronType=Dem",
27
+ "Degree=Pos|POS=ADV":"Degree=Pos",
28
+ "Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":"Definite=Def|Number=Sing|Tense=Past|VerbForm=Part",
29
+ "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":"Definite=Ind|Gender=Neut|Number=Sing",
30
+ "POS=PRON|PronType=Dem":"PronType=Dem",
31
+ "NumType=Card|POS=NUM":"NumType=Card",
32
+ "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing",
33
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs",
34
+ "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":"Degree=Pos|Gender=Com|Number=Sing",
35
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Case=Nom|Gender=Com|Number=Sing|Person=3|PronType=Prs",
36
+ "NumType=Ord|POS=ADJ":"NumType=Ord",
37
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
38
+ "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act":"Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Act",
39
+ "POS=VERB|VerbForm=Inf|Voice=Act":"VerbForm=Inf|Voice=Act",
40
+ "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act":"Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Act",
41
+ "POS=NOUN":"",
42
+ "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass":"Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Pass",
43
+ "POS=ADP|PartType=Inf":"PartType=Inf",
44
+ "Degree=Pos|POS=ADJ":"Degree=Pos",
45
+ "Definite=Def|Gender=Com|Number=Plur|POS=NOUN":"Definite=Def|Gender=Com|Number=Plur",
46
+ "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs",
47
+ "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN":"Case=Gen|Definite=Def|Gender=Com|Number=Sing",
48
+ "POS=AUX|VerbForm=Inf|Voice=Act":"VerbForm=Inf|Voice=Act",
49
+ "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Pos|Gender=Com|Number=Sing",
50
+ "Gender=Com|Number=Sing|POS=DET|PronType=Dem":"Gender=Com|Number=Sing|PronType=Dem",
51
+ "Number=Plur|POS=DET|PronType=Ind":"Number=Plur|PronType=Ind",
52
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Ind":"Gender=Com|Number=Sing|PronType=Ind",
53
+ "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":"Case=Acc|Person=3|PronType=Prs|Reflex=Yes",
54
+ "POS=PART|PartType=Inf":"PartType=Inf",
55
+ "Gender=Neut|Number=Sing|POS=DET|PronType=Ind":"Gender=Neut|Number=Sing|PronType=Ind",
56
+ "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs":"Case=Acc|Number=Plur|Person=3|PronType=Prs",
57
+ "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN":"Case=Gen|Definite=Def|Gender=Neut|Number=Sing",
58
+ "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs":"Case=Nom|Number=Plur|Person=3|PronType=Prs",
59
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":"Case=Nom|Gender=Com|Number=Sing|Person=1|PronType=Prs",
60
+ "Case=Nom|Gender=Com|POS=PRON|PronType=Ind":"Case=Nom|Gender=Com|PronType=Ind",
61
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind":"Gender=Neut|Number=Sing|PronType=Ind",
62
+ "Mood=Imp|POS=VERB":"Mood=Imp",
63
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
64
+ "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part":"Definite=Ind|Number=Sing|Tense=Past|VerbForm=Part",
65
+ "POS=X":"",
66
+ "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":"Case=Nom|Gender=Com|Number=Plur|Person=1|PronType=Prs",
67
+ "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN":"Case=Gen|Definite=Def|Gender=Com|Number=Plur",
68
+ "POS=VERB|Tense=Pres|VerbForm=Part":"Tense=Pres|VerbForm=Part",
69
+ "Number=Plur|POS=PRON|PronType=Int,Rel":"Number=Plur|PronType=Int,Rel",
70
+ "POS=VERB|VerbForm=Inf|Voice=Pass":"VerbForm=Inf|Voice=Pass",
71
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Com|Number=Sing",
72
+ "Degree=Cmp|POS=ADV":"Degree=Cmp",
73
+ "POS=ADV|PartType=Inf":"PartType=Inf",
74
+ "Degree=Sup|POS=ADV":"Degree=Sup",
75
+ "Number=Plur|POS=PRON|PronType=Dem":"Number=Plur|PronType=Dem",
76
+ "Number=Plur|POS=PRON|PronType=Ind":"Number=Plur|PronType=Ind",
77
+ "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":"Definite=Def|Gender=Neut|Number=Plur",
78
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":"Case=Acc|Gender=Com|Number=Sing|Person=1|PronType=Prs",
79
+ "Case=Gen|POS=PROPN":"Case=Gen",
80
+ "POS=ADP":"",
81
+ "Degree=Cmp|Number=Plur|POS=ADJ":"Degree=Cmp|Number=Plur",
82
+ "Definite=Def|Degree=Sup|POS=ADJ":"Definite=Def|Degree=Sup",
83
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
84
+ "Degree=Pos|Number=Sing|POS=ADJ":"Degree=Pos|Number=Sing",
85
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Number=Plur|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
86
+ "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Gender=Com|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
87
+ "Number=Plur|POS=PRON|PronType=Rcp":"Number=Plur|PronType=Rcp",
88
+ "Case=Gen|Degree=Cmp|POS=ADJ":"Case=Gen|Degree=Cmp",
89
+ "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":"Case=Gen|Definite=Def|Gender=Neut|Number=Plur",
90
+ "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=3|Poss=Yes|PronType=Prs",
91
+ "POS=INTJ":"",
92
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Number=Plur|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
93
+ "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":"Degree=Pos|Gender=Neut|Number=Sing",
94
+ "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Gender=Neut|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
95
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Gender=Com|Number=Sing|Person=2|PronType=Prs",
96
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs",
97
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Neut|Number=Plur",
98
+ "Number=Sing|POS=PRON|PronType=Int,Rel":"Number=Sing|PronType=Int,Rel",
99
+ "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Number=Plur|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
100
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel":"Gender=Neut|Number=Sing|PronType=Int,Rel",
101
+ "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ":"Definite=Def|Degree=Sup|Number=Plur",
102
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Gender=Com|Number=Sing|Person=2|PronType=Prs",
103
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
104
+ "Definite=Ind|Number=Sing|POS=NOUN":"Definite=Ind|Number=Sing",
105
+ "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Number=Plur|Tense=Past|VerbForm=Part",
106
+ "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Number=Plur|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
107
+ "POS=SYM":"",
108
+ "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":"Case=Nom|Gender=Com|Person=2|Polite=Form|PronType=Prs",
109
+ "Degree=Sup|POS=ADJ":"Degree=Sup",
110
+ "Number=Plur|POS=DET|PronType=Ind|Style=Arch":"Number=Plur|PronType=Ind|Style=Arch",
111
+ "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Dem":"Case=Gen|Gender=Com|Number=Sing|PronType=Dem",
112
+ "Foreign=Yes|POS=X":"Foreign=Yes",
113
+ "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs":"Person=2|Polite=Form|Poss=Yes|PronType=Prs",
114
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem":"Gender=Neut|Number=Sing|PronType=Dem",
115
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":"Case=Acc|Gender=Com|Number=Plur|Person=1|PronType=Prs",
116
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Neut|Number=Sing",
117
+ "Case=Gen|POS=PRON|PronType=Int,Rel":"Case=Gen|PronType=Int,Rel",
118
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Dem":"Gender=Com|Number=Sing|PronType=Dem",
119
+ "Abbr=Yes|POS=X":"Abbr=Yes",
120
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Com|Number=Plur",
121
+ "Definite=Def|Degree=Abs|POS=ADJ":"Definite=Def|Degree=Abs",
122
+ "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Sup|Number=Sing",
123
+ "Definite=Ind|POS=NOUN":"Definite=Ind",
124
+ "Gender=Com|Number=Plur|POS=NOUN":"Gender=Com|Number=Plur",
125
+ "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs",
126
+ "Gender=Com|POS=PRON|PronType=Int,Rel":"Gender=Com|PronType=Int,Rel",
127
+ "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Gender=Com|Number=Plur|Person=2|PronType=Prs",
128
+ "Degree=Abs|POS=ADV":"Degree=Abs",
129
+ "POS=VERB|VerbForm=Ger":"VerbForm=Ger",
130
+ "POS=VERB|Tense=Past|VerbForm=Part":"Tense=Past|VerbForm=Part",
131
+ "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ":"Definite=Def|Degree=Sup|Number=Sing",
132
+ "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Number=Plur|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
133
+ "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":"Case=Gen|Definite=Def|Degree=Pos|Number=Sing",
134
+ "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ":"Case=Gen|Degree=Pos|Number=Plur",
135
+ "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":"Case=Acc|Gender=Com|Person=2|Polite=Form|PronType=Prs",
136
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel":"Gender=Com|Number=Sing|PronType=Int,Rel",
137
+ "POS=VERB|Tense=Pres":"Tense=Pres",
138
+ "Case=Gen|Number=Plur|POS=DET|PronType=Ind":"Case=Gen|Number=Plur|PronType=Ind",
139
+ "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=2|Poss=Yes|PronType=Prs",
140
+ "POS=PRON|Person=2|Polite=Form|Poss=Yes|PronType=Prs":"Person=2|Polite=Form|Poss=Yes|PronType=Prs",
141
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs",
142
+ "POS=AUX|Tense=Pres|VerbForm=Part":"Tense=Pres|VerbForm=Part",
143
+ "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass":"Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Pass",
144
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
145
+ "Degree=Sup|Number=Plur|POS=ADJ":"Degree=Sup|Number=Plur",
146
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Gender=Com|Number=Plur|Person=2|PronType=Prs",
147
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
+ "Definite=Ind|Number=Plur|POS=NOUN":"Definite=Ind|Number=Plur",
+ "Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Case=Gen|Number=Plur|Tense=Past|VerbForm=Part",
+ "Mood=Imp|POS=AUX":"Mood=Imp",
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
+ "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs",
+ "Definite=Def|Gender=Com|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":"Definite=Def|Gender=Com|Number=Sing|Tense=Past|VerbForm=Part",
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Number=Plur|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs",
+ "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind":"Case=Gen|Gender=Com|Number=Sing|PronType=Ind",
+ "Case=Gen|POS=NOUN":"Case=Gen",
+ "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=3|Poss=Yes|PronType=Prs",
+ "POS=DET|PronType=Dem":"PronType=Dem",
+ "Definite=Def|Number=Plur|POS=NOUN":"Definite=Def|Number=Plur"
+ },
+ "labels_pos":{
+ "AdpType=Prep|POS=ADP":85,
+ "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":92,
+ "Mood=Ind|POS=AUX|Tense=Pres|VerbForm=Fin|Voice=Act":87,
+ "POS=PROPN":96,
+ "Definite=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":100,
+ "Definite=Def|Gender=Neut|Number=Sing|POS=NOUN":92,
+ "POS=SCONJ":98,
+ "Definite=Def|Gender=Com|Number=Sing|POS=NOUN":92,
+ "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Act":100,
+ "POS=ADV":86,
+ "Number=Plur|POS=DET|PronType=Dem":90,
+ "Degree=Pos|Number=Plur|POS=ADJ":84,
+ "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":92,
+ "POS=PUNCT":97,
+ "POS=CCONJ":89,
+ "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ":84,
+ "Degree=Cmp|POS=ADJ":84,
+ "POS=PRON|PartType=Inf":95,
+ "Gender=Com|Number=Sing|POS=DET|PronType=Ind":90,
+ "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ":84,
+ "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
+ "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":92,
+ "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":84,
+ "Gender=Neut|Number=Sing|POS=DET|PronType=Dem":90,
+ "Degree=Pos|POS=ADV":86,
+ "Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":100,
+ "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":92,
+ "POS=PRON|PronType=Dem":95,
+ "NumType=Card|POS=NUM":93,
+ "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":84,
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
+ "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":84,
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
+ "NumType=Ord|POS=ADJ":84,
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":90,
+ "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act":87,
+ "POS=VERB|VerbForm=Inf|Voice=Act":100,
+ "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act":100,
+ "POS=NOUN":92,
+ "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass":100,
+ "POS=ADP|PartType=Inf":85,
+ "Degree=Pos|POS=ADJ":84,
+ "Definite=Def|Gender=Com|Number=Plur|POS=NOUN":92,
+ "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs":90,
+ "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN":92,
+ "POS=AUX|VerbForm=Inf|Voice=Act":87,
+ "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":84,
+ "Gender=Com|Number=Sing|POS=DET|PronType=Dem":90,
+ "Number=Plur|POS=DET|PronType=Ind":90,
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Ind":95,
+ "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":95,
+ "POS=PART|PartType=Inf":94,
+ "Gender=Neut|Number=Sing|POS=DET|PronType=Ind":90,
+ "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs":95,
+ "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN":92,
+ "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs":95,
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":95,
+ "Case=Nom|Gender=Com|POS=PRON|PronType=Ind":95,
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind":95,
+ "Mood=Imp|POS=VERB":100,
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
+ "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part":87,
+ "POS=X":101,
+ "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":95,
+ "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN":92,
+ "POS=VERB|Tense=Pres|VerbForm=Part":100,
+ "Number=Plur|POS=PRON|PronType=Int,Rel":95,
+ "POS=VERB|VerbForm=Inf|Voice=Pass":100,
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":92,
+ "Degree=Cmp|POS=ADV":86,
+ "POS=ADV|PartType=Inf":86,
+ "Degree=Sup|POS=ADV":86,
+ "Number=Plur|POS=PRON|PronType=Dem":95,
+ "Number=Plur|POS=PRON|PronType=Ind":95,
+ "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":92,
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":95,
+ "Case=Gen|POS=PROPN":96,
+ "POS=ADP":85,
+ "Degree=Cmp|Number=Plur|POS=ADJ":84,
+ "Definite=Def|Degree=Sup|POS=ADJ":84,
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
+ "Degree=Pos|Number=Sing|POS=ADJ":84,
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":90,
+ "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":90,
+ "Number=Plur|POS=PRON|PronType=Rcp":95,
+ "Case=Gen|Degree=Cmp|POS=ADJ":84,
+ "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":92,
+ "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs":90,
+ "POS=INTJ":91,
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
+ "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":84,
+ "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":90,
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":95,
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":92,
+ "Number=Sing|POS=PRON|PronType=Int,Rel":95,
+ "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":90,
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel":95,
+ "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ":84,
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":95,
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":90,
+ "Definite=Ind|Number=Sing|POS=NOUN":92,
+ "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
+ "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":95,
+ "POS=SYM":99,
+ "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":95,
+ "Degree=Sup|POS=ADJ":84,
+ "Number=Plur|POS=DET|PronType=Ind|Style=Arch":90,
+ "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Dem":90,
+ "Foreign=Yes|POS=X":101,
+ "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs":90,
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem":95,
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":95,
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":92,
+ "Case=Gen|POS=PRON|PronType=Int,Rel":95,
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Dem":95,
+ "Abbr=Yes|POS=X":101,
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":92,
+ "Definite=Def|Degree=Abs|POS=ADJ":84,
+ "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ":84,
+ "Definite=Ind|POS=NOUN":92,
+ "Gender=Com|Number=Plur|POS=NOUN":92,
+ "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
+ "Gender=Com|POS=PRON|PronType=Int,Rel":95,
+ "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":95,
+ "Degree=Abs|POS=ADV":86,
+ "POS=VERB|VerbForm=Ger":100,
+ "POS=VERB|Tense=Past|VerbForm=Part":100,
+ "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ":84,
+ "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form":95,
+ "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":84,
+ "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ":84,
+ "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":95,
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel":95,
+ "POS=VERB|Tense=Pres":100,
+ "Case=Gen|Number=Plur|POS=DET|PronType=Ind":90,
+ "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
+ "POS=PRON|Person=2|Polite=Form|Poss=Yes|PronType=Prs":95,
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
+ "POS=AUX|Tense=Pres|VerbForm=Part":87,
+ "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass":100,
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":95,
+ "Degree=Sup|Number=Plur|POS=ADJ":84,
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":95,
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":95,
+ "Definite=Ind|Number=Plur|POS=NOUN":92,
+ "Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
+ "Mood=Imp|POS=AUX":87,
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs":95,
+ "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
+ "Definite=Def|Gender=Com|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":100,
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
+ "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind":90,
+ "Case=Gen|POS=NOUN":92,
+ "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
+ "POS=DET|PronType=Dem":90,
+ "Definite=Def|Number=Plur|POS=NOUN":92
+ }
+ }
morphologizer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ff4ac19d2ea3ccfe10ad70e8cbcf15b22916b18b08d7d8efa3990be0323bf78
+ size 483528
ner/cfg ADDED
@@ -0,0 +1,13 @@
+ {
+ "moves":null,
+ "update_with_oracle_cut_size":100,
+ "multitasks":[
+
+ ],
+ "min_action_freq":1,
+ "learn_tokens":false,
+ "beam_width":1,
+ "beam_density":0.0,
+ "beam_update_prob":0.0,
+ "incorrect_spans_key":null
+ }
ner/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a3ecfb5d8bd519dd27a53f8f5eb525d46bd9fe2d60fd3173db955f0896317979
+ size 225962
ner/moves ADDED
@@ -0,0 +1 @@
+ ��moves��{"0":{},"1":{"PER":2146,"MISC":1273,"ORG":1267,"LOC":1144},"2":{"PER":2146,"MISC":1273,"ORG":1267,"LOC":1144},"3":{"PER":2146,"MISC":1273,"ORG":1267,"LOC":1144},"4":{"PER":2146,"MISC":1273,"ORG":1267,"LOC":1144,"":1},"5":{"":1}}�cfg��neg_key�
parser/cfg ADDED
@@ -0,0 +1,13 @@
+ {
+ "moves":null,
+ "update_with_oracle_cut_size":100,
+ "multitasks":[
+
+ ],
+ "min_action_freq":30,
+ "learn_tokens":false,
+ "beam_width":1,
+ "beam_density":0.0,
+ "beam_update_prob":0.0,
+ "incorrect_spans_key":null
+ }
parser/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:91770681cc7c9adf240c0d7dda5b09a3a62ccf798e939366da0c351d0f080680
+ size 456157
parser/moves ADDED
@@ -0,0 +1 @@
+ ��moves�2{"0":{"":41514},"1":{"":34292},"2":{"case":7489,"nsubj":6009,"det":4334,"amod":3968,"advmod":3657,"mark":3529,"aux":2432,"cc":2261,"punct":2182,"cop":1329,"obl":894,"nummod":799,"nmod:poss":651,"nmod":460,"expl":291,"ccomp":202,"obj":195,"xcomp":122,"case||nmod":73,"obl:tmod":53,"dep":49,"acl:relcl":43},"3":{"punct":8600,"obl":3949,"obj":3758,"nmod":3565,"conj":2743,"advmod":2095,"flat":1294,"nsubj":1172,"acl:relcl":1131,"advcl":808,"amod":629,"obl:loc":467,"fixed":390,"dep":322,"xcomp":272,"appos":268,"compound:prt":261,"ccomp":252,"acl:relcl||nsubj":237,"case":202,"nummod":167,"list":161,"nmod:poss":156,"punct||conj":151,"mark":137,"cc":135,"iobj":107,"expl":77,"cop":69,"nmod||case":60,"aux":48,"obl:tmod":45,"cc||case":43,"advcl||advmod":43,"cc||conj":40,"case||obl":38,"punct||case":33},"4":{"ROOT":4367}}�cfg��neg_key�
tokenizer ADDED
@@ -0,0 +1,3 @@
+ ��prefix_search� ~^§|^%|^=|^—|^–|^\+(?![0-9])|^…|^……|^,|^:|^;|^\!|^\?|^¿|^؟|^¡|^\(|^\)|^\[|^\]|^\{|^\}|^<|^>|^_|^#|^\*|^&|^。|^?|^!|^,|^、|^;|^:|^~|^·|^।|^،|^۔|^؛|^٪|^\.\.+|^…|^\'|^"|^”|^“|^`|^‘|^´|^’|^‚|^,|^„|^»|^«|^「|^」|^『|^』|^(|^)|^〔|^〕|^【|^】|^《|^》|^〈|^〉|^\$|^£|^€|^¥|^฿|^US\$|^C\$|^A\$|^₽|^﷼|^₴|^₠|^₡|^₢|^₣|^₤|^₥|^₦|^₧|^₨|^₩|^₪|^₫|^€|^₭|^₮|^₯|^₰|^₱|^₲|^₳|^₴|^₵|^₶|^₷|^₸|^₹|^₺|^₻|^₼|^₽|^₾|^₿|^[\u00A6\u00A9\u00AE\u00B0\u0482\u058D\u058E\u060E\u060F\u06DE\u06E9\u06FD\u06FE\u07F6\u09FA\u0B70\u0BF3-\u0BF8\u0BFA\u0C7F\u0D4F\u0D79\u0F01-\u0F03\u0F13\u0F15-\u0F17\u0F1A-\u0F1F\u0F34\u0F36\u0F38\u0FBE-\u0FC5\u0FC7-\u0FCC\u0FCE\u0FCF\u0FD5-\u0FD8\u109E\u109F\u1390-\u1399\u1940\u19DE-\u19FF\u1B61-\u1B6A\u1B74-\u1B7C\u2100\u2101\u2103-\u2106\u2108\u2109\u2114\u2116\u2117\u211E-\u2123\u2125\u2127\u2129\u212E\u213A\u213B\u214A\u214C\u214D\u214F\u218A\u218B\u2195-\u2199\u219C-\u219F\u21A1\u21A2\u21A4\u21A5\u21A7-\u21AD\u21AF-\u21CD\u21D0\u21D1\u21D3\u21D5-\u21F3\u2300-\u2307\u230C-\u231F\u2322-\u2328\u232B-\u237B\u237D-\u239A\u23B4-\u23DB\u23E2-\u2426\u2440-\u244A\u249C-\u24E9\u2500-\u25B6\u25B8-\u25C0\u25C2-\u25F7\u2600-\u266E\u2670-\u2767\u2794-\u27BF\u2800-\u28FF\u2B00-\u2B2F\u2B45\u2B46\u2B4D-\u2B73\u2B76-\u2B95\u2B98-\u2BC8\u2BCA-\u2BFE\u2CE5-\u2CEA\u2E80-\u2E99\u2E9B-\u2EF3\u2F00-\u2FD5\u2FF0-\u2FFB\u3004\u3012\u3013\u3020\u3036\u3037\u303E\u303F\u3190\u3191\u3196-\u319F\u31C0-\u31E3\u3200-\u321E\u322A-\u3247\u3250\u3260-\u327F\u328A-\u32B0\u32C0-\u32FE\u3300-\u33FF\u4DC0-\u4DFF\uA490-\uA4C6\uA828-\uA82B\uA836\uA837\uA839\uAA77-\uAA79\uFDFD\uFFE4\uFFE8\uFFED\uFFEE\uFFFC\uFFFD\U00010137-\U0001013F\U00010179-\U00010189\U0001018C-\U0001018E\U00010190-\U0001019B\U000101A0\U000101D0-\U000101FC\U00010877\U00010878\U00010AC8\U0001173F\U00016B3C-\U00016B3F\U00016B45\U0001BC9C\U0001D000-\U0001D0F5\U0001D100-\U0001D126\U0001D129-\U0001D164\U0001D16A-\U0001D16C\U0001D183\U0001D184\U0001D18C-\U0001D1A9\U0001D1AE-\U0001D1E8\U0001D200-\U0001D241\U0001D245\U0001D300-\U0001D356\U0001D800-\U
0001D9FF\U0001DA37-\U0001DA3A\U0001DA6D-\U0001DA74\U0001DA76-\U0001DA83\U0001DA85\U0001DA86\U0001ECAC\U0001F000-\U0001F02B\U0001F030-\U0001F093\U0001F0A0-\U0001F0AE\U0001F0B1-\U0001F0BF\U0001F0C1-\U0001F0CF\U0001F0D1-\U0001F0F5\U0001F110-\U0001F16B\U0001F170-\U0001F1AC\U0001F1E6-\U0001F202\U0001F210-\U0001F23B\U0001F240-\U0001F248\U0001F250\U0001F251\U0001F260-\U0001F265\U0001F300-\U0001F3FA\U0001F400-\U0001F6D4\U0001F6E0-\U0001F6EC\U0001F6F0-\U0001F6F9\U0001F700-\U0001F773\U0001F780-\U0001F7D8\U0001F800-\U0001F80B\U0001F810-\U0001F847\U0001F850-\U0001F859\U0001F860-\U0001F887\U0001F890-\U0001F8AD\U0001F900-\U0001F90B\U0001F910-\U0001F93E\U0001F940-\U0001F970\U0001F973-\U0001F976\U0001F97A\U0001F97C-\U0001F9A2\U0001F9B0-\U0001F9B9\U0001F9C0-\U0001F9C2\U0001F9D0-\U0001F9FF\U0001FA60-\U0001FA6D]�suffix_search�2…$|……$|,$|:$|;$|\!$|\?$|¿$|؟$|¡$|\($|\)$|\[$|\]$|\{$|\}$|<$|>$|_$|#$|\*$|&$|。$|?$|!$|,$|、$|;$|:$|~$|·$|।$|،$|۔$|؛$|٪$|\.\.+$|…$|"$|”$|“$|`$|‘$|´$|’$|‚$|,$|„$|»$|«$|「$|」$|『$|』$|($|)$|〔$|〕$|【$|】$|《$|》$|〈$|〉$|[\u00A6\u00A9\u00AE\u00B0\u0482\u058D\u058E\u060E\u060F\u06DE\u06E9\u06FD\u06FE\u07F6\u09FA\u0B70\u0BF3-\u0BF8\u0BFA\u0C7F\u0D4F\u0D79\u0F01-\u0F03\u0F13\u0F15-\u0F17\u0F1A-\u0F1F\u0F34\u0F36\u0F38\u0FBE-\u0FC5\u0FC7-\u0FCC\u0FCE\u0FCF\u0FD5-\u0FD8\u109E\u109F\u1390-\u1399\u1940\u19DE-\u19FF\u1B61-\u1B6A\u1B74-\u1B7C\u2100\u2101\u2103-\u2106\u2108\u2109\u2114\u2116\u2117\u211E-\u2123\u2125\u2127\u2129\u212E\u213A\u213B\u214A\u214C\u214D\u214F\u218A\u218B\u2195-\u2199\u219C-\u219F\u21A1\u21A2\u21A4\u21A5\u21A7-\u21AD\u21AF-\u21CD\u21D0\u21D1\u21D3\u21D5-\u21F3\u2300-\u2307\u230C-\u231F\u2322-\u2328\u232B-\u237B\u237D-\u239A\u23B4-\u23DB\u23E2-\u2426\u2440-\u244A\u249C-\u24E9\u2500-\u25B6\u25B8-\u25C0\u25C2-\u25F7\u2600-\u266E\u2670-\u2767\u2794-\u27BF\u2800-\u28FF\u2B00-\u2B2F\u2B45\u2B46\u2B4D-\u2B73\u2B76-\u2B95\u2B98-\u2BC8\u2BCA-\u2BFE\u2CE5-\u2CEA\u2E80-\u2E99\u2E9B-\u2EF3\u2F00-\u2FD5\u2FF0-\u2FFB\u3004\u3012\u3013\u3020\u3036\u3037\u303E\u303F\u3190\u319
1\u3196-\u319F\u31C0-\u31E3\u3200-\u321E\u322A-\u3247\u3250\u3260-\u327F\u328A-\u32B0\u32C0-\u32FE\u3300-\u33FF\u4DC0-\u4DFF\uA490-\uA4C6\uA828-\uA82B\uA836\uA837\uA839\uAA77-\uAA79\uFDFD\uFFE4\uFFE8\uFFED\uFFEE\uFFFC\uFFFD\U00010137-\U0001013F\U00010179-\U00010189\U0001018C-\U0001018E\U00010190-\U0001019B\U000101A0\U000101D0-\U000101FC\U00010877\U00010878\U00010AC8\U0001173F\U00016B3C-\U00016B3F\U00016B45\U0001BC9C\U0001D000-\U0001D0F5\U0001D100-\U0001D126\U0001D129-\U0001D164\U0001D16A-\U0001D16C\U0001D183\U0001D184\U0001D18C-\U0001D1A9\U0001D1AE-\U0001D1E8\U0001D200-\U0001D241\U0001D245\U0001D300-\U0001D356\U0001D800-\U0001D9FF\U0001DA37-\U0001DA3A\U0001DA6D-\U0001DA74\U0001DA76-\U0001DA83\U0001DA85\U0001DA86\U0001ECAC\U0001F000-\U0001F02B\U0001F030-\U0001F093\U0001F0A0-\U0001F0AE\U0001F0B1-\U0001F0BF\U0001F0C1-\U0001F0CF\U0001F0D1-\U0001F0F5\U0001F110-\U0001F16B\U0001F170-\U0001F1AC\U0001F1E6-\U0001F202\U0001F210-\U0001F23B\U0001F240-\U0001F248\U0001F250\U0001F251\U0001F260-\U0001F265\U0001F300-\U0001F3FA\U0001F400-\U0001F6D4\U0001F6E0-\U0001F6EC\U0001F6F0-\U0001F6F9\U0001F700-\U0001F773\U0001F780-\U0001F7D8\U0001F800-\U0001F80B\U0001F810-\U0001F847\U0001F850-\U0001F859\U0001F860-\U0001F887\U0001F890-\U0001F8AD\U0001F900-\U0001F90B\U0001F910-\U0001F93E\U0001F940-\U0001F970\U0001F973-\U0001F976\U0001F97A\U0001F97C-\U0001F9A2\U0001F9B0-\U0001F9B9\U0001F9C0-\U0001F9C2\U0001F9D0-\U0001F9FF\U0001FA60-\U0001FA6D]$|—$|–$|(?<=[0-9])\+$|(?<=°[FfCcKk])\.$|(?<=[0-9])(?:\$|£|€|¥|฿|US\$|C\$|A\$|₽|﷼|₴|₠|₡|₢|₣|₤|₥|₦|₧|₨|₩|₪|₫|€|₭|₮|₯|₰|₱|₲|₳|₴|₵|₶|₷|₸|₹|₺|₻|₼|₽|₾|₿)$|(?<=[0-9])(?:km|km²|km³|m|m²|m³|dm|dm²|dm³|cm|cm²|cm³|mm|mm²|mm³|ha|µm|nm|yd|in|ft|kg|g|mg|µg|t|lb|oz|m/s|km/h|kmh|mph|hPa|Pa|mbar|mb|MB|kb|KB|gb|GB|tb|TB|T|G|M|K|%|км|км²|км³|м|м²|м³|дм|дм²|дм³|см|см²|см³|мм|мм²|мм³|нм|кг|г|мг|м/с|км/ч|кПа|Па|мбар|Кб|КБ|кб|Мб|МБ|мб|Гб|ГБ|гб|Тб|ТБ|тбكم|كم²|كم³|م|م²|م³|سم|سم²|سم³|مم|مم²|مم³|كم|غرام|جرام|جم|كغ|ملغ|كوب|اكواب)$|(?<=[0-9a-z\uFF41-\uFF5A\u00DF-\u00F6\u00F8-\u00FF\u0101\
u0103\u0105\u0107\u0109\u010B\u010D\u010F\u0111\u0113\u0115\u0117\u0119\u011B\u011D\u011F\u0121\u0123\u0125\u0127\u0129\u012B\u012D\u012F\u0131\u0133\u0135\u0137\u0138\u013A\u013C\u013E\u0140\u0142\u0144\u0146\u0148\u0149\u014B\u014D\u014F\u0151\u0153\u0155\u0157\u0159\u015B\u015D\u015F\u0161\u0163\u0165\u0167\u0169\u016B\u016D\u016F\u0171\u0173\u0175\u0177\u017A\u017C\u017E\u017F\u0180\u0183\u0185\u0188\u018C\u018D\u0192\u0195\u0199-\u019B\u019E\u01A1\u01A3\u01A5\u01A8\u01AA\u01AB\u01AD\u01B0\u01B4\u01B6\u01B9\u01BA\u01BD-\u01BF\u01C6\u01C9\u01CC\u01CE\u01D0\u01D2\u01D4\u01D6\u01D8\u01DA\u01DC\u01DD\u01DF\u01E1\u01E3\u01E5\u01E7\u01E9\u01EB\u01ED\u01EF\u01F0\u01F3\u01F5\u01F9\u01FB\u01FD\u01FF\u0201\u0203\u0205\u0207\u0209\u020B\u020D\u020F\u0211\u0213\u0215\u0217\u0219\u021B\u021D\u021F\u0221\u0223\u0225\u0227\u0229\u022B\u022D\u022F\u0231\u0233-\u0239\u023C\u023F\u0240\u0242\u0247\u0249\u024B\u024D\u024F\u2C61\u2C65\u2C66\u2C68\u2C6A\u2C6C\u2C71\u2C73\u2C74\u2C76-\u2C7B\uA723\uA725\uA727\uA729\uA72B\uA72D\uA72F-\uA731\uA733\uA735\uA737\uA739\uA73B\uA73D\uA73F\uA741\uA743\uA745\uA747\uA749\uA74B\uA74D\uA74F\uA751\uA753\uA755\uA757\uA759\uA75B\uA75D\uA75F\uA761\uA763\uA765\uA767\uA769\uA76B\uA76D\uA76F\uA771-\uA778\uA77A\uA77C\uA77F\uA781\uA783\uA785\uA787\uA78C\uA78E\uA791\uA793-\uA795\uA797\uA799\uA79B\uA79D\uA79F\uA7A1\uA7A3\uA7A5\uA7A7\uA7A9\uA7AF\uA7B5\uA7B7\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E01\u1E03\u1E05\u1E07\u1E09\u1E0B\u1E0D\u1E0F\u1E11\u1E13\u1E15\u1E17\u1E19\u1E1B\u1E1D\u1E1F\u1E21\u1E23\u1E25\u1E27\u1E29\u1E2B\u1E2D\u1E2F\u1E31\u1E33\u1E35\u1E37\u1E39\u1E3B\u1E3D\u1E3F\u1E41\u1E43\u1E45\u1E47\u1E49\u1E4B\u1E4D\u1E4F\u1E51\u1E53\u1E55\u1E57\u1E59\u1E5B\u1E5D\u1E5F\u1E61\u1E63\u1E65\u1E67\u1E69\u1E6B\u1E6D\u1E6F\u1E71\u1E73\u1E75\u1E77\u1E79\u1E7B\u1E7D\u1E7F\u1E81\u1E83\u1E85\u1E87\u1E89\u1E8B\u1E8D\u1E8F\u1E91\u1E93\u1E95-\u1E9D\u1E9F\u1EA1\u1EA3\u1EA5\u1EA7\u1EA9\u1EAB\u1EAD\u1EAF\u1EB1\u1EB3\
u1EB5\u1EB7\u1EB9\u1EBB\u1EBD\u1EBF\u1EC1\u1EC3\u1EC5\u1EC7\u1EC9\u1ECB\u1ECD\u1ECF\u1ED1\u1ED3\u1ED5\u1ED7\u1ED9\u1EDB\u1EDD\u1EDF\u1EE1\u1EE3\u1EE5\u1EE7\u1EE9\u1EEB\u1EED\u1EEF\u1EF1\u1EF3\u1EF5\u1EF7\u1EF9\u1EFB\u1EFD\u1EFFёа-яәөүҗңһα-ωάέίόώήύа-щюяіїєґѓѕјљњќѐѝ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F%²\-\+…|……|,|:|;|\!|\?|¿|؟|¡|\(|\)|\[|\]|\{|\}|<|>|_|#|\*|&|。|?|!|,|、|;|:|~|·|।|،|۔|؛|٪(?:\'"”“`‘´’‚,„»«「」『』()〔〕【】《》〈〉)])\.$|(?<=[A-Z\uFF21-\uFF3A\u00C0-\u00D6\u00D8-\u00DE\u0100\u0102\u0104\u0106\u0108\u010A\u010C\u010E\u0110\u0112\u0114\u0116\u0118\u011A\u011C\u011E\u0120\u0122\u0124\u0126\u0128\u012A\u012C\u012E\u0130\u0132\u0134\u0136\u0139\u013B\u013D\u013F\u0141\u0143\u0145\u0147\u014A\u014C\u014E\u0150\u0152\u0154\u0156\u0158\u015A\u015C\u015E\u0160\u0162\u0164\u0166\u0168\u016A\u016C\u016E\u0170\u0172\u0174\u0176\u0178\u0179\u017B\u017D\u0181\u0182\u0184\u0186\u0187\u0189-\u018B\u018E-\u0191\u0193\u0194\u0196-\u0198\u019C\u019D\u019F\u01A0\u01A2\u01A4\u01A6\u01A7\u01A9\u01AC\u01AE\u01AF\u01B1-\u01B3\u01B5\u01B7\u01B8\u01BC\u01C4\u01C7\u01CA\u01CD\u01CF\u01D1\u01D3\u01D5\u01D7\u01D9\u01DB\u01DE\u01E0\u01E2\u01E4\u01E6\u01E8\u01EA\u01EC\u01EE\u01F1\u01F4\u01F6-\u01F8\u01FA\u01FC\u01FE\u0200\u0202\u0204\u0206\u0208\u020A\u020C\u020E\u0210\u0212\u0214\u0216\u0218\u021A
\u021C\u021E\u0220\u0222\u0224\u0226\u0228\u022A\u022C\u022E\u0230\u0232\u023A\u023B\u023D\u023E\u0241\u0243-\u0246\u0248\u024A\u024C\u024E\u2C60\u2C62-\u2C64\u2C67\u2C69\u2C6B\u2C6D-\u2C70\u2C72\u2C75\u2C7E\u2C7F\uA722\uA724\uA726\uA728\uA72A\uA72C\uA72E\uA732\uA734\uA736\uA738\uA73A\uA73C\uA73E\uA740\uA742\uA744\uA746\uA748\uA74A\uA74C\uA74E\uA750\uA752\uA754\uA756\uA758\uA75A\uA75C\uA75E\uA760\uA762\uA764\uA766\uA768\uA76A\uA76C\uA76E\uA779\uA77B\uA77D\uA77E\uA780\uA782\uA784\uA786\uA78B\uA78D\uA790\uA792\uA796\uA798\uA79A\uA79C\uA79E\uA7A0\uA7A2\uA7A4\uA7A6\uA7A8\uA7AA-\uA7AE\uA7B0-\uA7B4\uA7B6\uA7B8\u1E00\u1E02\u1E04\u1E06\u1E08\u1E0A\u1E0C\u1E0E\u1E10\u1E12\u1E14\u1E16\u1E18\u1E1A\u1E1C\u1E1E\u1E20\u1E22\u1E24\u1E26\u1E28\u1E2A\u1E2C\u1E2E\u1E30\u1E32\u1E34\u1E36\u1E38\u1E3A\u1E3C\u1E3E\u1E40\u1E42\u1E44\u1E46\u1E48\u1E4A\u1E4C\u1E4E\u1E50\u1E52\u1E54\u1E56\u1E58\u1E5A\u1E5C\u1E5E\u1E60\u1E62\u1E64\u1E66\u1E68\u1E6A\u1E6C\u1E6E\u1E70\u1E72\u1E74\u1E76\u1E78\u1E7A\u1E7C\u1E7E\u1E80\u1E82\u1E84\u1E86\u1E88\u1E8A\u1E8C\u1E8E\u1E90\u1E92\u1E94\u1E9E\u1EA0\u1EA2\u1EA4\u1EA6\u1EA8\u1EAA\u1EAC\u1EAE\u1EB0\u1EB2\u1EB4\u1EB6\u1EB8\u1EBA\u1EBC\u1EBE\u1EC0\u1EC2\u1EC4\u1EC6\u1EC8\u1ECA\u1ECC\u1ECE\u1ED0\u1ED2\u1ED4\u1ED6\u1ED8\u1EDA\u1EDC\u1EDE\u1EE0\u1EE2\u1EE4\u1EE6\u1EE8\u1EEA\u1EEC\u1EEE\u1EF0\u1EF2\u1EF4\u1EF6\u1EF8\u1EFA\u1EFC\u1EFEЁА-ЯӘӨҮҖҢҺΑ-ΩΆΈΊΌΏΉΎА-ЩЮЯІЇЄҐЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F0
0-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F][A-Z\uFF21-\uFF3A\u00C0-\u00D6\u00D8-\u00DE\u0100\u0102\u0104\u0106\u0108\u010A\u010C\u010E\u0110\u0112\u0114\u0116\u0118\u011A\u011C\u011E\u0120\u0122\u0124\u0126\u0128\u012A\u012C\u012E\u0130\u0132\u0134\u0136\u0139\u013B\u013D\u013F\u0141\u0143\u0145\u0147\u014A\u014C\u014E\u0150\u0152\u0154\u0156\u0158\u015A\u015C\u015E\u0160\u0162\u0164\u0166\u0168\u016A\u016C\u016E\u0170\u0172\u0174\u0176\u0178\u0179\u017B\u017D\u0181\u0182\u0184\u0186\u0187\u0189-\u018B\u018E-\u0191\u0193\u0194\u0196-\u0198\u019C\u019D\u019F\u01A0\u01A2\u01A4\u01A6\u01A7\u01A9\u01AC\u01AE\u01AF\u01B1-\u01B3\u01B5\u01B7\u01B8\u01BC\u01C4\u01C7\u01CA\u01CD\u01CF\u01D1\u01D3\u01D5\u01D7\u01D9\u01DB\u01DE\u01E0\u01E2\u01E4\u01E6\u01E8\u01EA\u01EC\u01EE\u01F1\u01F4\u01F6-\u01F8\u01FA\u01FC\u01FE\u0200\u0202\u0204\u0206\u0208\u020A\u020C\u020E\u0210\u0212\u0214\u0216\u0218\u021A\u021C\u021E\u0220\u0222\u0224\u0226\u0228\u022A\u022C\u022E\u0230\u0232\u023A\u023B\u023D\u023E\u0241\u0243-\u0246\u0248\u024A\u024C\u024E\u2C60\u2C62-\u2C64\u2C67\u2C69\u2C6B\u2C6D-\u2C70\u2C72\u2C75\u2C7E\u2C7F\uA722\uA724\uA726\uA728\uA72A\uA72C\uA72E\uA732\uA734\uA736\uA738\uA73A\uA73C\uA73E\uA740\uA742\uA744\uA746\uA748\uA74A\uA74C\uA74E\uA750\uA752\uA754\uA756\uA758\uA75A\uA75C\uA75E\uA760\uA762\uA764\uA766\uA768\uA76A\uA76C\uA76E\uA779\uA77B\uA77D\uA77E\uA780\uA782\uA784\uA786\uA78B\uA78D\uA790\uA792\uA796\uA798\uA79A\uA79C\uA79E\uA7A0\uA7A2\uA7A4\uA7A6\uA7A8\uA7AA-\uA7AE\uA7B0-\uA7B4\uA7B6\uA7B8\u1E00\u1E02\u1E04\u1E06\u1E08\u1E0A\u1E0C\u1E0E\u1E10\u1E12\u1E14\u1E16\u1E18\u1E1A\u1E1C\u1E1E\u1E20\u1E22\u1E24\u1E26\u1E28\u1E2A\u1E2C\u1E2E\u1E30\u1E32\u1E34\u1E36\u1E38\u1E3A\u1E3C\u1E3E\u1E40\u1E42\u1E44\u1E46\u1E48\u1E4A\u1E4C\u1E4E\u1E50\u1E52\u1E54\u1E56\u1E58\u1E5A\u1E5C\u1E5E\u1E60\u1E62\u1E64\u1E66\u1E68\u1E6A\u1E6C\u1E6E\u1E70\u1E72\u1E74\u1E76\u1E78\u1E7A\u1E7C\u1E7E\u1E8
0\u1E82\u1E84\u1E86\u1E88\u1E8A\u1E8C\u1E8E\u1E90\u1E92\u1E94\u1E9E\u1EA0\u1EA2\u1EA4\u1EA6\u1EA8\u1EAA\u1EAC\u1EAE\u1EB0\u1EB2\u1EB4\u1EB6\u1EB8\u1EBA\u1EBC\u1EBE\u1EC0\u1EC2\u1EC4\u1EC6\u1EC8\u1ECA\u1ECC\u1ECE\u1ED0\u1ED2\u1ED4\u1ED6\u1ED8\u1EDA\u1EDC\u1EDE\u1EE0\u1EE2\u1EE4\u1EE6\u1EE8\u1EEA\u1EEC\u1EEE\u1EF0\u1EF2\u1EF4\u1EF6\u1EF8\u1EFA\u1EFC\u1EFEЁА-ЯӘӨҮҖҢҺΑ-ΩΆΈΊΌΏΉΎА-ЩЮЯІЇЄҐЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])\.$|(?<=[^sSxXzZ])\'$�infix_finditer�YD\.\.+|…|[\u00A6\u00A9\u00AE\u00B0\u0482\u058D\u058E\u060E\u060F\u06DE\u06E9\u06FD\u06FE\u07F6\u09FA\u0B70\u0BF3-\u0BF8\u0BFA\u0C7F\u0D4F\u0D79\u0F01-\u0F03\u0F13\u0F15-\u0F17\u0F1A-\u0F1F\u0F34\u0F36\u0F38\u0FBE-\u0FC5\u0FC7-\u0FCC\u0FCE\u0FCF\u0FD5-\u0FD8\u109E\u109F\u1390-\u1399\u1940\u19DE-\u19FF\u1B61-\u1B6A\u1B74-\u1B7C\u2100\u2101\u2103-\u2106\u2108\u2109\u2114\u2116\u2117\u211E-\u2123\u2125\u2127\u2129\u212E\u213A\u213B\u214A\u214C\u214D\u214F\u218A\u218B\u2195-\u2199\u219C-\u219F\u21A1\u21A2\u21A4\u21A5\u21A7-\u21AD\u21AF-\u21CD\u21D0\u21D1\u21D3\u21D5-\u21F3\u2300-\u2307\u230C-\u231F\u2322-\u2328\u232B-\u237B\u237D-\u239A\u23B4-\u23DB\u23E2-\u2426\u2440-\u244A\u249C-\u24E9\u2500-\u25B6\u25B8-\u25C0\u25C2-\u25F7\u2600-\u266E\u2670-\u2767\u2794-\u27BF\u2800-\u28FF\u2B00-\u2B2F\u2B45\u2B46\u2B4D-\u2B7
3\u2B76-\u2B95\u2B98-\u2BC8\u2BCA-\u2BFE\u2CE5-\u2CEA\u2E80-\u2E99\u2E9B-\u2EF3\u2F00-\u2FD5\u2FF0-\u2FFB\u3004\u3012\u3013\u3020\u3036\u3037\u303E\u303F\u3190\u3191\u3196-\u319F\u31C0-\u31E3\u3200-\u321E\u322A-\u3247\u3250\u3260-\u327F\u328A-\u32B0\u32C0-\u32FE\u3300-\u33FF\u4DC0-\u4DFF\uA490-\uA4C6\uA828-\uA82B\uA836\uA837\uA839\uAA77-\uAA79\uFDFD\uFFE4\uFFE8\uFFED\uFFEE\uFFFC\uFFFD\U00010137-\U0001013F\U00010179-\U00010189\U0001018C-\U0001018E\U00010190-\U0001019B\U000101A0\U000101D0-\U000101FC\U00010877\U00010878\U00010AC8\U0001173F\U00016B3C-\U00016B3F\U00016B45\U0001BC9C\U0001D000-\U0001D0F5\U0001D100-\U0001D126\U0001D129-\U0001D164\U0001D16A-\U0001D16C\U0001D183\U0001D184\U0001D18C-\U0001D1A9\U0001D1AE-\U0001D1E8\U0001D200-\U0001D241\U0001D245\U0001D300-\U0001D356\U0001D800-\U0001D9FF\U0001DA37-\U0001DA3A\U0001DA6D-\U0001DA74\U0001DA76-\U0001DA83\U0001DA85\U0001DA86\U0001ECAC\U0001F000-\U0001F02B\U0001F030-\U0001F093\U0001F0A0-\U0001F0AE\U0001F0B1-\U0001F0BF\U0001F0C1-\U0001F0CF\U0001F0D1-\U0001F0F5\U0001F110-\U0001F16B\U0001F170-\U0001F1AC\U0001F1E6-\U0001F202\U0001F210-\U0001F23B\U0001F240-\U0001F248\U0001F250\U0001F251\U0001F260-\U0001F265\U0001F300-\U0001F3FA\U0001F400-\U0001F6D4\U0001F6E0-\U0001F6EC\U0001F6F0-\U0001F6F9\U0001F700-\U0001F773\U0001F780-\U0001F7D8\U0001F800-\U0001F80B\U0001F810-\U0001F847\U0001F850-\U0001F859\U0001F860-\U0001F887\U0001F890-\U0001F8AD\U0001F900-\U0001F90B\U0001F910-\U0001F93E\U0001F940-\U0001F970\U0001F973-\U0001F976\U0001F97A\U0001F97C-\U0001F9A2\U0001F9B0-\U0001F9B9\U0001F9C0-\U0001F9C2\U0001F9D0-\U0001F9FF\U0001FA60-\U0001FA6D]|(?<=[a-z\uFF41-\uFF5A\u00DF-\u00F6\u00F8-\u00FF\u0101\u0103\u0105\u0107\u0109\u010B\u010D\u010F\u0111\u0113\u0115\u0117\u0119\u011B\u011D\u011F\u0121\u0123\u0125\u0127\u0129\u012B\u012D\u012F\u0131\u0133\u0135\u0137\u0138\u013A\u013C\u013E\u0140\u0142\u0144\u0146\u0148\u0149\u014B\u014D\u014F\u0151\u0153\u0155\u0157\u0159\u015B\u015D\u015F\u0161\u0163\u0165\u0167\u0169\u016B\u016D\u016F\u0171\u0173
\u0175\u0177\u017A\u017C\u017E\u017F\u0180\u0183\u0185\u0188\u018C\u018D\u0192\u0195\u0199-\u019B\u019E\u01A1\u01A3\u01A5\u01A8\u01AA\u01AB\u01AD\u01B0\u01B4\u01B6\u01B9\u01BA\u01BD-\u01BF\u01C6\u01C9\u01CC\u01CE\u01D0\u01D2\u01D4\u01D6\u01D8\u01DA\u01DC\u01DD\u01DF\u01E1\u01E3\u01E5\u01E7\u01E9\u01EB\u01ED\u01EF\u01F0\u01F3\u01F5\u01F9\u01FB\u01FD\u01FF\u0201\u0203\u0205\u0207\u0209\u020B\u020D\u020F\u0211\u0213\u0215\u0217\u0219\u021B\u021D\u021F\u0221\u0223\u0225\u0227\u0229\u022B\u022D\u022F\u0231\u0233-\u0239\u023C\u023F\u0240\u0242\u0247\u0249\u024B\u024D\u024F\u2C61\u2C65\u2C66\u2C68\u2C6A\u2C6C\u2C71\u2C73\u2C74\u2C76-\u2C7B\uA723\uA725\uA727\uA729\uA72B\uA72D\uA72F-\uA731\uA733\uA735\uA737\uA739\uA73B\uA73D\uA73F\uA741\uA743\uA745\uA747\uA749\uA74B\uA74D\uA74F\uA751\uA753\uA755\uA757\uA759\uA75B\uA75D\uA75F\uA761\uA763\uA765\uA767\uA769\uA76B\uA76D\uA76F\uA771-\uA778\uA77A\uA77C\uA77F\uA781\uA783\uA785\uA787\uA78C\uA78E\uA791\uA793-\uA795\uA797\uA799\uA79B\uA79D\uA79F\uA7A1\uA7A3\uA7A5\uA7A7\uA7A9\uA7AF\uA7B5\uA7B7\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E01\u1E03\u1E05\u1E07\u1E09\u1E0B\u1E0D\u1E0F\u1E11\u1E13\u1E15\u1E17\u1E19\u1E1B\u1E1D\u1E1F\u1E21\u1E23\u1E25\u1E27\u1E29\u1E2B\u1E2D\u1E2F\u1E31\u1E33\u1E35\u1E37\u1E39\u1E3B\u1E3D\u1E3F\u1E41\u1E43\u1E45\u1E47\u1E49\u1E4B\u1E4D\u1E4F\u1E51\u1E53\u1E55\u1E57\u1E59\u1E5B\u1E5D\u1E5F\u1E61\u1E63\u1E65\u1E67\u1E69\u1E6B\u1E6D\u1E6F\u1E71\u1E73\u1E75\u1E77\u1E79\u1E7B\u1E7D\u1E7F\u1E81\u1E83\u1E85\u1E87\u1E89\u1E8B\u1E8D\u1E8F\u1E91\u1E93\u1E95-\u1E9D\u1E9F\u1EA1\u1EA3\u1EA5\u1EA7\u1EA9\u1EAB\u1EAD\u1EAF\u1EB1\u1EB3\u1EB5\u1EB7\u1EB9\u1EBB\u1EBD\u1EBF\u1EC1\u1EC3\u1EC5\u1EC7\u1EC9\u1ECB\u1ECD\u1ECF\u1ED1\u1ED3\u1ED5\u1ED7\u1ED9\u1EDB\u1EDD\u1EDF\u1EE1\u1EE3\u1EE5\u1EE7\u1EE9\u1EEB\u1EED\u1EEF\u1EF1\u1EF3\u1EF5\u1EF7\u1EF9\u1EFB\u1EFD\u1EFFёа-яәөүҗңһα-ωάέίόώήύа-щюяіїєґѓѕјљњќѐѝ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E
5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])\.(?=[A-Z\uFF21-\uFF3A\u00C0-\u00D6\u00D8-\u00DE\u0100\u0102\u0104\u0106\u0108\u010A\u010C\u010E\u0110\u0112\u0114\u0116\u0118\u011A\u011C\u011E\u0120\u0122\u0124\u0126\u0128\u012A\u012C\u012E\u0130\u0132\u0134\u0136\u0139\u013B\u013D\u013F\u0141\u0143\u0145\u0147\u014A\u014C\u014E\u0150\u0152\u0154\u0156\u0158\u015A\u015C\u015E\u0160\u0162\u0164\u0166\u0168\u016A\u016C\u016E\u0170\u0172\u0174\u0176\u0178\u0179\u017B\u017D\u0181\u0182\u0184\u0186\u0187\u0189-\u018B\u018E-\u0191\u0193\u0194\u0196-\u0198\u019C\u019D\u019F\u01A0\u01A2\u01A4\u01A6\u01A7\u01A9\u01AC\u01AE\u01AF\u01B1-\u01B3\u01B5\u01B7\u01B8\u01BC\u01C4\u01C7\u01CA\u01CD\u01CF\u01D1\u01D3\u01D5\u01D7\u01D9\u01DB\u01DE\u01E0\u01E2\u01E4\u01E6\u01E8\u01EA\u01EC\u01EE\u01F1\u01F4\u01F6-\u01F8\u01FA\u01FC\u01FE\u0200\u0202\u0204\u0206\u0208\u020A\u020C\u020E\u0210\u0212\u0214\u0216\u0218\u021A\u021C\u021E\u0220\u0222\u0224\u0226\u0228\u022A\u022C\u022E\u0230\u0232\u023A\u023B\u023D\u023E\u0241\u0243-\u0246\u0248\u024A\u024C\u024E\u2C60\u2C62-\u2C64\u2C67\u2C69\u2C6B\u2C6D-\u2C70\u2C72\u2C75\u2C7E\u2C7F\uA722\uA724\uA726\uA728\uA72A\uA72C\uA72E\uA732\uA734\uA736\uA738\uA73A\uA73C\uA73E\uA740\uA742\uA744\uA746\uA748\uA74A\uA74C\uA74E\uA750\uA752\uA754\uA756\uA758\uA75A\uA75C\uA75E\uA760\uA762\uA764\uA766\uA768\uA76A\uA76C\uA76E\uA779\uA77B\uA77D\uA77E\uA7
80\uA782\uA784\uA786\uA78B\uA78D\uA790\uA792\uA796\uA798\uA79A\uA79C\uA79E\uA7A0\uA7A2\uA7A4\uA7A6\uA7A8\uA7AA-\uA7AE\uA7B0-\uA7B4\uA7B6\uA7B8\u1E00\u1E02\u1E04\u1E06\u1E08\u1E0A\u1E0C\u1E0E\u1E10\u1E12\u1E14\u1E16\u1E18\u1E1A\u1E1C\u1E1E\u1E20\u1E22\u1E24\u1E26\u1E28\u1E2A\u1E2C\u1E2E\u1E30\u1E32\u1E34\u1E36\u1E38\u1E3A\u1E3C\u1E3E\u1E40\u1E42\u1E44\u1E46\u1E48\u1E4A\u1E4C\u1E4E\u1E50\u1E52\u1E54\u1E56\u1E58\u1E5A\u1E5C\u1E5E\u1E60\u1E62\u1E64\u1E66\u1E68\u1E6A\u1E6C\u1E6E\u1E70\u1E72\u1E74\u1E76\u1E78\u1E7A\u1E7C\u1E7E\u1E80\u1E82\u1E84\u1E86\u1E88\u1E8A\u1E8C\u1E8E\u1E90\u1E92\u1E94\u1E9E\u1EA0\u1EA2\u1EA4\u1EA6\u1EA8\u1EAA\u1EAC\u1EAE\u1EB0\u1EB2\u1EB4\u1EB6\u1EB8\u1EBA\u1EBC\u1EBE\u1EC0\u1EC2\u1EC4\u1EC6\u1EC8\u1ECA\u1ECC\u1ECE\u1ED0\u1ED2\u1ED4\u1ED6\u1ED8\u1EDA\u1EDC\u1EDE\u1EE0\u1EE2\u1EE4\u1EE6\u1EE8\u1EEA\u1EEC\u1EEE\u1EF0\u1EF2\u1EF4\u1EF6\u1EF8\u1EFA\u1EFC\u1EFEЁА-ЯӘӨҮҖҢҺΑ-ΩΆΈΊΌΏΉΎА-ЩЮЯІЇЄҐЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊ
ΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])[,!?](?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0
180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])[:<>=](?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-
\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F]),(?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇ��ҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF
\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])([\"”“`‘´’‚,„»«「」『』()〔〕【】《》〈〉\)\]\(\[])(?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u
08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])--(?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u
1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F0-9])[:<>=/](?=[A-Z
a-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])�token_match��url_match�
2
+ ��A�
3
+ � ��A� �'��A�'�''��A�''�(*_*)��A�(*_*)�(-8��A�(-8�(-:��A�(-:�(-;��A�(-;�(-_-)��A�(-_-)�(._.)��A�(._.)�(:��A�(:�(;��A�(;�(=��A�(=�(>_<)��A�(>_<)�(^_^)��A�(^_^)�(o:��A�(o:�(¬_¬)��A�(¬_¬)�(ಠ_ಠ)��A�(ಠ_ಠ)�(╯°□°)╯︵┻━┻��A�(╯°□°)╯︵┻━┻�)-:��A�)-:�):��A�):�-_-��A�-_-�-__-��A�-__-�._.��A�._.�0.0��A�0.0�0.o��A�0.o�0_0��A�0_0�0_o��A�0_o�1.��A�1.�10.��A�10.�11.��A�11.�12.��A�12.�13.��A�13.�14.��A�14.�15.��A�15.�16.��A�16.�17.��A�17.�18.��A�18.�19.��A�19.�2.��A�2.�20.��A�20.�21.��A�21.�22.��A�22.�23.��A�23.�24.��A�24.�25.��A�25.�26.��A�26.�27.��A�27.�28.��A�28.�29.��A�29.�3.��A�3.�30.��A�30.�31.��A�31.�4.��A�4.�5.��A�5.�6.��A�6.�7.��A�7.�8)��A�8)�8-)��A�8-)�8-D��A�8-D�8.��A�8.�8D��A�8D�9.��A�9.�:'(��A�:'(�:')��A�:')�:'-(��A�:'-(�:'-)��A�:'-)�:(��A�:(�:((��A�:((�:(((��A�:(((�:()��A�:()�:)��A�:)�:))��A�:))�:)))��A�:)))�:*��A�:*�:-(��A�:-(�:-((��A�:-((�:-(((��A�:-(((�:-)��A�:-)�:-))��A�:-))�:-)))��A�:-)))�:-*��A�:-*�:-/��A�:-/�:-0��A�:-0�:-3��A�:-3�:->��A�:->�:-D��A�:-D�:-O��A�:-O�:-P��A�:-P�:-X��A�:-X�:-]��A�:-]�:-o��A�:-o�:-p��A�:-p�:-x��A�:-x�:-|��A�:-|�:-}��A�:-}�:/��A�:/�:0��A�:0�:1��A�:1�:3��A�:3�:>��A�:>�:D��A�:D�:O��A�:O�:P��A�:P�:X��A�:X�:]��A�:]�:o��A�:o�:o)��A�:o)�:p��A�:p�:x��A�:x�:|��A�:|�:}��A�:}�:’(��A�:’(�:’)��A�:’)�:’-(��A�:’-(�:’-)��A�:’-)�;)��A�;)�;-)��A�;-)�;-D��A�;-D�;D��A�;D�;_;��A�;_;�<.<��A�<.<�</3��A�</3�<3��A�<3�<33��A�<33�<333��A�<333�<space>��A�<space>�=(��A�=(�=)��A�=)�=/��A�=/�=3��A�=3�=D��A�=D�=[��A�=[�=]��A�=]�=|��A�=|�>.<��A�>.<�>.>��A�>.>�>:(��A�>:(�>:o��A�>:o�><(((*>��A�><(((*>�@_@��A�@_@�A.D.��A�A.D.�A/B��A�A/B�A/S��A�A/S�Aarh.��A�Aarh.�Ac.��A�Ac.�Adj.��A�Adj.�Adr.��A�Adr.�Adsk.��A�Adsk.�Adv.��A�Adv.�Afb.��A�Afb.�Afd.��A�Afd.�Afg.��A�Afg.�Afk.��A�Afk.�Afs.��A�Afs.�Aht.��A�Aht.�Alg.��A�Alg.�Alk.��A�Alk.�Alm.��A�Alm.�Amer.��A�Amer.�Ang.��A�Ang.�Ank.��A�Ank.�Anl.��A�Anl.�Anv.��A�Anv.�Apr.��A�Apr.C�april�Arb.��A�Arb.�Arr.��A�Arr.�Att.��A�Att.�Aug.��A�Aug.C�august�B.C.��A�B.C.�B.T.��A�B.T.�BK.��A�BK.�Bd.��A�Bd.�Bdt.��A�Bdt.�Beg.��A�Beg.�Begr.��A�Begr.
�Beh.��A�Beh.�Bet.��A�Bet.�Bev.��A�Bev.�Bhk.��A�Bhk.�Bib.��A�Bib.�Bibl.��A�Bibl.�Bidr.��A�Bidr.�Bildl.��A�Bildl.�Bill.��A�Bill.�Biol.��A�Biol.�Bk.��A�Bk.�Bl.��A�Bl.�Bl.a.��A�Bl.a.�Borgm.��A�Borgm.�Boul.��A�Boul.�Br.��A�Br.�Brolægn.��A�Brolægn.�Bto.��A�Bto.�Bygn.��A�Bygn.�C++��A�C++�C/o��A�C/o�Ca.��A�Ca.�Cand.��A�Cand.�Chr.��A�Chr.�Cm.��A�Cm.�D.d.��A�D.d.�D.m.��A�D.m.�D.s.��A�D.s.�D.s.s.��A�D.s.s.�D.v.s.��A�D.v.s.�D.y.��A�D.y.�D.å.��A�D.å.�D.æ.��A�D.æ.�Dagl.��A�Dagl.�Dat.��A�Dat.�Dav.��A�Dav.�Dec.��A�Dec.C�december�Def.��A�Def.�Dek.��A�Dek.�Dep.��A�Dep.�Desl.��A�Desl.�Dir.��A�Dir.�Disp.��A�Disp.�Distr.��A�Distr.�Div.��A�Div.�Dkr.��A�Dkr.�Dl.��A�Dl.�Do.��A�Do.�Dobb.��A�Dobb.�Dr.��A�Dr.�Dr.h.c��A�Dr.h.c�Dr.phil.��A�Dr.phil.�Dronn.��A�Dronn.�Ds.��A�Ds.�Dvs.��A�Dvs.�E.b.��A�E.b.�E.l.��A�E.l.�E.o.��A�E.o.�E.v.t.��A�E.v.t.�Eftf.��A�Eftf.�Eftm.��A�Eftm.�Egl.��A�Egl.�Eks.��A�Eks.�Eksam.��A�Eksam.�Ekskl.��A�Ekskl.�Eksp.��A�Eksp.�Ekspl.��A�Ekspl.�El.lign.��A�El.lign.�Emer.��A�Emer.�Endv.��A�Endv.�Eng.��A�Eng.�Enk.��A�Enk.�Etc.��A�Etc.�Etym.��A�Etym.�Eur.��A�Eur.�Evt.��A�Evt.�Exam.��A�Exam.�F.eks.��A�F.eks.�F.m.��A�F.m.�F.n.��A�F.n.�F.o.��A�F.o.�F.o.m.��A�F.o.m.�F.s.v.��A�F.s.v.�F.t.��A�F.t.�F.v.t.��A�F.v.t.�F.å.��A�F.å.�Fa.��A�Fa.�Fakt.��A�Fakt.�Fam.��A�Fam.�Feb.��A�Feb.C�februar�Febr.��A�Febr.C�februar�Ff.��A�Ff.�Fg.��A�Fg.�Fhv.��A�Fhv.�Fig.��A�Fig.�Filol.��A�Filol.�Filos.��A�Filos.�Fl.��A�Fl.�Flg.��A�Flg.�Fm.��A�Fm.�Fmd.��A�Fmd.�Fol.��A�Fol.�Forb.��A�Forb.�Foreg.��A�Foreg.�Foren.��A�Foren.�Forf.��A�Forf.�Fork.��A�Fork.�Forr.��A�Forr.�Fors.��A�Fors.�Forsk.��A�Forsk.�Forts.��A�Forts.�Fr.��A�Fr.�Fr.u.��A�Fr.u.�Fre.��A�Fre.C�fredag�Frk.��A�Frk.�Fsva.��A�Fsva.�Fuldm.��A�Fuldm.�Fung.��A�Fung.�Fx.��A�Fx.�Fys.��A�Fys.�Fær.��A�Fær.�G.d.��A�G.d.�G.m.��A�G.m.�Gd.��A�Gd.�Gdr.��A�Gdr.�Genuds.��A�Genuds.�Gi'��A�Gi'C�giv�Gi’��A�Gi’C�giv�Gl.��A�Gl.�Gn.��A�Gn.�Gns.��A�Gns.�Gr.��A�Gr.�Grdl.��A�Grdl.�Gross.��A�Gross.�H.K.H.��A�H.K.H.�H.M.��A�H.M.�H.a.��A�H.a.�H.c.��A�H.c.�Ha'��A�Ha'C�have�Ha’�
�A�Ha’C�have�Hdl.��A�Hdl.�Henv.��A�Henv.�Hf.��A�Hf.�Hhv.��A�Hhv.�Hj.hj.��A�Hj.hj.�Hj.spl.��A�Hj.spl.�Hort.��A�Hort.�Hosp.��A�Hosp.�Hpl.��A�Hpl.�Hr.��A�Hr.�Hrs.��A�Hrs.�Hum.��A�Hum.�Hvp.��A�Hvp.�I.e.��A�I.e.�I/S��A�I/S�Id.��A�Id.�If.��A�If.�Iflg.��A�Iflg.�Ifm.��A�Ifm.�Ift.��A�Ift.�Iht.��A�Iht.�Ik'��A�Ik'C�ikke�Ik’��A�Ik’C�ikke�Ill.��A�Ill.�Inc.��A�Inc.�Indb.��A�Indb.�Indreg.��A�Indreg.�Inf.��A�Inf.�Ing.��A�Ing.�Inh.��A�Inh.�Inj.��A�Inj.�Inkl.��A�Inkl.�Insp.��A�Insp.�Instr.��A�Instr.�Isl.��A�Isl.�Istf.��A�Istf.�It.��A�It.�Ital.��A�Ital.�Iv.��A�Iv.�J.nr.��A�J.nr.�Jan.��A�Jan.C�januar�Jap.��A�Jap.�Jf.��A�Jf.�Jfr.��A�Jfr.�Jnr.��A�Jnr.�Jr.��A�Jr.�Jun.��A�Jun.C�juni�Jur.��A�Jur.�Jvf.��A�Jvf.�Ka'��A�Ka'C�kan�Kap.��A�Kap.�Ka’��A�Ka’C�kan�Kbh.��A�Kbh.�Kem.��A�Kem.�Kg.��A�Kg.�Kgl.��A�Kgl.�Kgs.��A�Kgs.�Kl.��A�Kl.�Kld.��A�Kld.�Km.��A�Km.�Km/t��A�Km/t�Km/t.��A�Km/t.�Knsp.��A�Knsp.�Komm.��A�Komm.�Kons.��A�Kons.�Korr.��A�Korr.�Kp.��A�Kp.�Kprs.��A�Kprs.�Kr.��A�Kr.�Kst.��A�Kst.�Kt.��A�Kt.�Ktr.��A�Ktr.�Ku'��A�Ku'C�kunne�Ku’��A�Ku’C�kunne�Kv.��A�Kv.�Kvm.��A�Kvm.�Kvt.��A�Kvt.�L.A.��A�L.A.�L.c.��A�L.c.�Lab.��A�Lab.�Lat.��A�Lat.�Lb.m.��A�Lb.m.�Lb.nr.��A�Lb.nr.�Lejl.��A�Lejl.�Lgd.��A�Lgd.�Lic.��A�Lic.�Lign.��A�Lign.�Lin.��A�Lin.�Ling.merc.��A�Ling.merc.�Litt.��A�Litt.�Ll.��A�Ll.�Loc.cit.��A�Loc.cit.�Lok.��A�Lok.�Lrs.��A�Lrs.�Ltr.��A�Ltr.�Lør.��A�Lør.C�lørdag�M.a.o.��A�M.a.o.�M.fl.��A�M.fl.�M.h.p.��A�M.h.p.�M.h.t.��A�M.h.t.�M.m.��A�M.m.�M.v.��A�M.v.�M.v.h.��A�M.v.h.�M/S��A�M/S�Mag.��A�Mag.�Maks.��A�Maks.�Man.��A�Man.C�mandag�Mar.��A�Mar.C�marts�Md.��A�Md.�Mdr.��A�Mdr.�Mdtl.��A�Mdtl.�Mezz.��A�Mezz.�Mfl.��A�Mfl.�Mht.��A�Mht.�Mill.��A�Mill.�Mio.��A�Mio.�Modt.��A�Modt.�Mr.��A�Mr.�Mrk.��A�Mrk.�Mul.��A�Mul.�Mv.��A�Mv.�N.br.��A�N.br.�N.f.��A�N.f.�Nb.��A�Nb.�Ndr.��A�Ndr.�Nedenst.��A�Nedenst.�Nl.��A�Nl.�Nov.��A�Nov.C�november�Nr.��A�Nr.�Nto.��A�Nto.�Nuv.��A�Nuv.�O.O��A�O.O�O.a.��A�O.a.�O.fl.��A�O.fl.�O.h.��A�O.h.�O.l.��A�O.l.�O.lign.��A�O.lign.�O.m.a.��A�O.m.a.�O.o��A�O.o�O.s.fr.��A�O.s.fr.�O/m��A�O
/m�O/m.��A�O/m.�O_O��A�O_O�O_o��A�O_o�Obl.��A�Obl.�Obs.��A�Obs.�Odont.��A�Odont.�Oecon.��A�Oecon.�Off.��A�Off.�Ofl.��A�Ofl.�Okt.��A�Okt.C�oktober�Omg.��A�Omg.�Omkr.��A�Omkr.�Omr.��A�Omr.�Omtr.��A�Omtr.�Ons.��A�Ons.C�onsdag�Opg.��A�Opg.�Opl.��A�Opl.�Opr.��A�Opr.�Org.��A�Org.�Orig.��A�Orig.�Osv.��A�Osv.�Ovenst.��A�Ovenst.�Overs.��A�Overs.�Ovf.��A�Ovf.�P.a.��A�P.a.�P.b.a��A�P.b.a�P.b.v��A�P.b.v�P.c.��A�P.c.�P.m.��A�P.m.�P.m.v.��A�P.m.v.�P.n.��A�P.n.�P.p.��A�P.p.�P.p.s.��A�P.p.s.�P.s.��A�P.s.�P.t.��A�P.t.�P.v.a.��A�P.v.a.�P.v.c.��A�P.v.c.�Pag.��A�Pag.�Pass.��A�Pass.�Pcs.��A�Pcs.�Pct.��A�Pct.�Pd.��A�Pd.�Pens.��A�Pens.�Pft.��A�Pft.�Pg.��A�Pg.�Pga.��A�Pga.�Pgl.��A�Pgl.�Ph.d.��A�Ph.d.�Pinx.��A�Pinx.�Pk.��A�Pk.�Pkt.��A�Pkt.�Polit.��A�Polit.�Polyt.��A�Polyt.�Pos.��A�Pos.�Pp.��A�Pp.�Ppm.��A�Ppm.�Pr.��A�Pr.�Prc.��A�Prc.�Priv.��A�Priv.�Prod.��A�Prod.�Prof.��A�Prof.�Pron.��A�Pron.�Prs.��A�Prs.�Præd.��A�Præd.�Præf.��A�Præf.�Præt.��A�Præt.�Psych.��A�Psych.�Pt.��A�Pt.�Pæd.��A�Pæd.�Q.e.d.��A�Q.e.d.�Rad.��A�Rad.�Rcp.��A�Rcp.�Red.��A�Red.�Ref.��A�Ref.�Reg.��A�Reg.�Regn.��A�Regn.�Rel.��A�Rel.�Rep.��A�Rep.�Repr.��A�Repr.�Resp.��A�Resp.�Rest.��A�Rest.�Rm.��A�Rm.�Rtg.��A�Rtg.�Russ.��A�Russ.�S'gu��A�S'guC�s'gu�S.br.��A�S.br.�S.d.��A�S.d.�S.f.��A�S.f.�S.m.b.a.��A�S.m.b.a.�S.u.��A�S.u.�S.å.��A�S.å.�Sa.��A�Sa.�Sb.��A�Sb.�Sc.��A�Sc.�Scient.��A�Scient.�Scil.��A�Scil.�Sdr.��A�Sdr.�Sek.��A�Sek.�Sekr.��A�Sekr.�Self.��A�Self.�Sem.��A�Sem.�Sep.��A�Sep.C�september�Sept.��A�Sept.C�september�Sgu'��A�Sgu'C�s'gu�Sgu’��A�Sgu’C�s'gu�Shj.��A�Shj.�Sign.��A�Sign.�Sing.��A�Sing.�Sj.��A�Sj.�Skr.��A�Skr.�Skt.��A�Skt.�Slutn.��A�Slutn.�Sml.��A�Sml.�Smp.��A�Smp.�Snr.��A�Snr.�Soc.��A�Soc.�Soc.dem.��A�Soc.dem.�Sp.��A�Sp.�Spec.��A�Spec.�Spl.��A�Spl.�Spm.��A�Spm.�Spr.��A�Spr.�Spsk.��A�Spsk.�St.��A�St.�Statsaut.��A�Statsaut.�Stk.��A�Stk.�Str.��A�Str.�Stud.��A�Stud.�Subj.��A�Subj.�Subst.��A�Subst.�Suff.��A�Suff.�Sup.��A�Sup.�Suppl.��A�Suppl.�Sv.��A�Sv.�Såk.��A�Såk.�Sædv.��A�Sædv.�S’gu��A�S’guC�s'gu�T.h.��A�T.h.�T.o.��A�T.
o.�T.o.m.��A�T.o.m.�T.v.��A�T.v.�T/r��A�T/r�TCP/IP��A�TCP/IP�Tbl.��A�Tbl.�Tcp/ip��A�Tcp/ip�Td.��A�Td.�Tdl.��A�Tdl.�Tdr.��A�Tdr.�Techn.��A�Techn.�Tekn.��A�Tekn.�Temp.��A�Temp.�Th.��A�Th.�Theol.��A�Theol.�Tidl.��A�Tidl.�Tilf.��A�Tilf.�Tilh.��A�Tilh.�Till.��A�Till.�Tilsv.��A�Tilsv.�Tirs.��A�Tirs.C�tirsdag�Tjg.��A�Tjg.�Tkr.��A�Tkr.�Tlf.��A�Tlf.�Tlgr.��A�Tlgr.�Tr.��A�Tr.�Trp.��A�Trp.�Tsk.��A�Tsk.�Tv.��A�Tv.�Ty.��A�Ty.�U/b��A�U/b�Udb.��A�Udb.�Udbet.��A�Udbet.�Ugtl.��A�Ugtl.�Undt.��A�Undt.�V.V��A�V.V�V.f.��A�V.f.�V_V��A�V_V�Vb.��A�Vb.�Vedk.��A�Vedk.�Vedl.��A�Vedl.�Vedr.��A�Vedr.�Vejl.��A�Vejl.�Vg.��A�Vg.�Vh.��A�Vh.�Vha.��A�Vha.�Vind.��A�Vind.�Vs.��A�Vs.�Vsa.��A�Vsa.�Vær.��A�Vær.�XD��A�XD�XDD��A�XDD�Zool.��A�Zool.�[-:��A�[-:�[:��A�[:�[=��A�[=�\")��A�\")�\n��A�\n�\t��A�\t�]=��A�]=�^_^��A�^_^�^__^��A�^__^�^___^��A�^___^�a.��A�a.�a/s��A�a/s�aarh.��A�aarh.�ac.��A�ac.�adj.��A�adj.�adr.��A�adr.�adsk.��A�adsk.�adv.��A�adv.�afb.��A�afb.�afd.��A�afd.�afg.��A�afg.�afk.��A�afk.�afs.��A�afs.�aht.��A�aht.�alg.��A�alg.�alk.��A�alk.�alm.��A�alm.�amer.��A�amer.�ang.��A�ang.�ank.��A�ank.�anl.��A�anl.�anv.��A�anv.�apr.��A�apr.C�april�arb.��A�arb.�arr.��A�arr.�att.��A�att.�aug.��A�aug.C�august�b.��A�b.�bd.��A�bd.�bdt.��A�bdt.�beg.��A�beg.�begr.��A�begr.�beh.��A�beh.�bet.��A�bet.�bev.��A�bev.�bhk.��A�bhk.�bib.��A�bib.�bibl.��A�bibl.�bidr.��A�bidr.�bildl.��A�bildl.�bill.��A�bill.�biol.��A�biol.�bk.��A�bk.�bl.��A�bl.�bl.a.��A�bl.a.�borgm.��A�borgm.�br.��A�br.�brolægn.��A�brolægn.�bto.��A�bto.�bygn.��A�bygn.�c.��A�c.�c/o��A�c/o�ca.��A�ca.�cand.��A�cand.�cm.��A�cm.�d.��A�d.�d.d.��A�d.d.�d.m.��A�d.m.�d.s.��A�d.s.�d.s.s.��A�d.s.s.�d.v.s.��A�d.v.s.�d.y.��A�d.y.�d.å.��A�d.å.�d.æ.��A�d.æ.�dagl.��A�dagl.�dat.��A�dat.�dav.��A�dav.�dec.��A�dec.C�december�def.��A�def.�dek.��A�dek.�dep.��A�dep.�desl.��A�desl.�diam.��A�diam.�dir.��A�dir.�disp.��A�disp.�distr.��A�distr.�div.��A�div.�dkr.��A�dkr.�dl.��A�dl.�do.��A�do.�dobb.��A�dobb.�dr.��A�dr.�dr.h.c��A�dr.h.c�dr.phil.��A�dr.phil.�ds.��A�ds.�dvs.��A�dvs.�e.��A
�e.�e.b.��A�e.b.�e.l.��A�e.l.�e.o.��A�e.o.�e.v.t.��A�e.v.t.�eftf.��A�eftf.�eftm.��A�eftm.�egl.��A�egl.�eks.��A�eks.�eksam.��A�eksam.�ekskl.��A�ekskl.�eksp.��A�eksp.�ekspl.��A�ekspl.�el.lign.��A�el.lign.�emer.��A�emer.�endv.��A�endv.�eng.��A�eng.�enk.��A�enk.�etc.��A�etc.�etym.��A�etym.�eur.��A�eur.�evt.��A�evt.�exam.��A�exam.�f.��A�f.�f.eks.��A�f.eks.�f.m.��A�f.m.�f.n.��A�f.n.�f.o.��A�f.o.�f.o.m.��A�f.o.m.�f.s.v.��A�f.s.v.�f.t.��A�f.t.�f.v.t.��A�f.v.t.�f.å.��A�f.å.�fa.��A�fa.�fakt.��A�fakt.�fam.��A�fam.�feb.��A�feb.C�februar�febr.��A�febr.C�februar�ff.��A�ff.�fg.��A�fg.�fhv.��A�fhv.�fig.��A�fig.�filol.��A�filol.�filos.��A�filos.�fl.��A�fl.�flg.��A�flg.�fm.��A�fm.�fmd.��A�fmd.�fol.��A�fol.�forb.��A�forb.�foreg.��A�foreg.�foren.��A�foren.�forf.��A�forf.�fork.��A�fork.�forr.��A�forr.�fors.��A�fors.�forsk.��A�forsk.�forts.��A�forts.�fr.��A�fr.�fr.u.��A�fr.u.�fre.��A�fre.C�fredag�frk.��A�frk.�fsva.��A�fsva.�fuldm.��A�fuldm.�fung.��A�fung.�fx.��A�fx.�fys.��A�fys.�fær.��A�fær.�g.��A�g.�g.d.��A�g.d.�g.m.��A�g.m.�gd.��A�gd.�gdr.��A�gdr.�genuds.��A�genuds.�gi'��A�gi'C�giv�gi’��A�gi’C�giv�gl.��A�gl.�gn.��A�gn.�gns.��A�gns.�gr.��A�gr.�grdl.��A�grdl.�gross.��A�gross.�h.��A�h.�h.a.��A�h.a.�h.c.��A�h.c.�ha'��A�ha'C�have�ha’��A�ha’C�have�hdl.��A�hdl.�henv.��A�henv.�hhv.��A�hhv.�hj.hj.��A�hj.hj.�hj.spl.��A�hj.spl.�hort.��A�hort.�hosp.��A�hosp.�hpl.��A�hpl.�hr.��A�hr.�hrs.��A�hrs.�hum.��A�hum.�hvp.��A�hvp.�i.��A�iC�i�A�.�i.e.��A�i.e.�i/s��A�i/s�ib.��A�ib.�id.��A�id.�if.��A�if.�iflg.��A�iflg.�ifm.��A�ifm.�ift.��A�ift.�iht.��A�iht.�ik'��A�ik'C�ikke�ik’��A�ik’C�ikke�ill.��A�ill.�indb.��A�indb.�indreg.��A�indreg.�inf.��A�inf.�ing.��A�ing.�inh.��A�inh.�inj.��A�inj.�inkl.��A�inkl.�insp.��A�insp.�instr.��A�instr.�isl.��A�isl.�istf.��A�istf.�it.��A�it.�ital.��A�ital.�iv.��A�iv.�j.��A�j.�j.nr.��A�j.nr.�jan.��A�jan.C�januar�jap.��A�jap.�jf.��A�jf.�jfr.��A�jfr.�jnr.��A�jnr.�jr.��A�jr.�jun.��A�jun.C�juni�jur.��A�jur.�jvf.��A�jvf.�k.��A�k.�ka'��A�ka'C�kan�kap.��A�kap.�ka’��A�ka’C�kan�kbh.��A�kbh.
�kem.��A�kem.�kg.��A�kg.�kgl.��A�kgl.�kgs.��A�kgs.�kl.��A�kl.�kld.��A�kld.�km.��A�km.�km/t��A�km/t�km/t.��A�km/t.�knsp.��A�knsp.�komm.��A�komm.�kons.��A�kons.�korr.��A�korr.�kp.��A�kp.�kr.��A�kr.�kst.��A�kst.�kt.��A�kt.�ktr.��A�ktr.�ku'��A�ku'C�kunne�ku’��A�ku’C�kunne�kv.��A�kv.�kvm.��A�kvm.�kvt.��A�kvt.�l.��A�l.�l.c.��A�l.c.�lab.��A�lab.�lat.��A�lat.�lb.m.��A�lb.m.�lb.nr.��A�lb.nr.�lejl.��A�lejl.�lgd.��A�lgd.�li'��A�li'C�lide�lic.��A�lic.�lign.��A�lign.�lin.��A�lin.�ling.merc.��A�ling.merc.�litt.��A�litt.�li’��A�li’C�lide�loc.cit.��A�loc.cit.�lok.��A�lok.�lrs.��A�lrs.�ltr.��A�ltr.�lør.��A�lør.C�lørdag�m.��A�m.�m.a.o.��A�m.a.o.�m.fl.��A�m.fl.�m.h.p.��A�m.h.p.�m.h.t.��A�m.h.t.�m.m.��A�m.m.�m.v.��A�m.v.�m.v.h.��A�m.v.h.�m/k��A�m/k�m/s��A�m/s�m/sek.��A�m/sek.�maks.��A�maks.�man.��A�man.C�mandag�mar.��A�mar.C�marts�md.��A�md.�mdr.��A�mdr.�mdtl.��A�mdtl.�mezz.��A�mezz.�mfl.��A�mfl.�mht.��A�mht.�mia.��A�mia.�mik.��A�mik.�mill.��A�mill.�mio.��A�mio.�modt.��A�modt.�mrk.��A�mrk.�mul.��A�mul.�mv.��A�mv.�n.��A�n.�n.br.��A�n.br.�n.f.��A�n.f.�nb.��A�nb.�nedenst.��A�nedenst.�nl.��A�nl.�nov.��A�nov.C�november�nr.��A�nr.�nto.��A�nto.�nuv.��A�nuv.�o.��A�o.�o.0��A�o.0�o.O��A�o.O�o.a.��A�o.a.�o.fl.��A�o.fl.�o.h.��A�o.h.�o.l.��A�o.l.�o.lign.��A�o.lign.�o.m.a.��A�o.m.a.�o.o��A�o.o�o.s.fr.��A�o.s.fr.�o/m��A�o/m�o/m.��A�o/m.�o_0��A�o_0�o_O��A�o_O�o_o��A�o_o�obl.��A�obl.�obs.��A�obs.�odont.��A�odont.�oecon.��A�oecon.�off.��A�off.�ofl.��A�ofl.�og/eller��A�og/ellerC�og/eller�okt.��A�okt.C�oktober�omg.��A�omg.�omkr.��A�omkr.�omr.��A�omr.�omtr.��A�omtr.�ons.��A�ons.C�onsdag�opg.��A�opg.�opl.��A�opl.�opr.��A�opr.�org.��A�org.�orig.��A�orig.�osv.��A�osv.�ovenst.��A�ovenst.�overs.��A�overs.�ovf.��A�ovf.�p.��A�p.�p.a.��A�p.a.�p.b.a��A�p.b.a�p.b.v��A�p.b.v�p.c.��A�p.c.�p.m.��A�p.m.�p.m.v.��A�p.m.v.�p.n.��A�p.n.�p.p.��A�p.p.�p.p.s.��A�p.p.s.�p.s.��A�p.s.�p.t.��A�p.t.�p.v.a.��A�p.v.a.�p.v.c.��A�p.v.c.�pag.��A�pag.�pass.��A�pass.�pcs.��A�pcs.�pct.��A�pct.�pd.��A�pd.�pens.��A�pens.�pers.��A�pers.�pft.�
�A�pft.�pg.��A�pg.�pga.��A�pga.�pgl.��A�pgl.�pinx.��A�pinx.�pk.��A�pk.�pkt.��A�pkt.�polit.��A�polit.�polyt.��A�polyt.�pos.��A�pos.�pp.��A�pp.�ppm.��A�ppm.�pr.��A�pr.�prc.��A�prc.�priv.��A�priv.�prod.��A�prod.�prof.��A�prof.�pron.��A�pron.�præd.��A�præd.�præf.��A�præf.�præt.��A�præt.�psych.��A�psych.�pt.��A�pt.�pæd.��A�pæd.�q.��A�q.�q.e.d.��A�q.e.d.�r.��A�r.�rad.��A�rad.�red.��A�red.�ref.��A�ref.�reg.��A�reg.�regn.��A�regn.�rel.��A�rel.�rep.��A�rep.�repr.��A�repr.�resp.��A�resp.�rest.��A�rest.�rm.��A�rm.�rtg.��A�rtg.�russ.��A�russ.�s'gu��A�s'guC�s'gu�s.��A�s.�s.br.��A�s.br.�s.d.��A�s.d.�s.f.��A�s.f.�s.m.b.a.��A�s.m.b.a.�s.u.��A�s.u.�s.å.��A�s.å.�sa.��A�sa.�sb.��A�sb.�sc.��A�sc.�scient.��A�scient.�scil.��A�scil.�sek.��A�sek.�sekr.��A�sekr.�self.��A�self.�sem.��A�sem.�sep.��A�sep.C�september�sept.��A�sept.C�september�sgu'��A�sgu'C�s'gu�sgu’��A�sgu’C�s'gu�shj.��A�shj.�sign.��A�sign.�sing.��A�sing.�sj.��A�sj.�skr.��A�skr.�sku'��A�sku'C�skulle�sku’��A�sku’C�skulle�slutn.��A�slutn.�sml.��A�sml.�smp.��A�smp.�snr.��A�snr.�soc.��A�soc.�soc.dem.��A�soc.dem.�sp.��A�sp.�spec.��A�spec.�spm.��A�spm.�spr.��A�spr.�spsk.��A�spsk.�st.��A�st.�statsaut.��A�statsaut.�stk.��A�stk.�str.��A�str.�stud.��A�stud.�subj.��A�subj.�subst.��A�subst.�suff.��A�suff.�sup.��A�sup.�suppl.��A�suppl.�sv.��A�sv.�såk.��A�såk.�sædv.��A�sædv.�s’gu��A�s’guC�s'gu�t.��A�t.�t.h.��A�t.h.�t.o.��A�t.o.�t.o.m.��A�t.o.m.�t.v.��A�t.v.�t/r��A�t/r�tbl.��A�tbl.�tcp/ip��A�tcp/ip�td.��A�td.�tdl.��A�tdl.�tdr.��A�tdr.�techn.��A�techn.�tekn.��A�tekn.�temp.��A�temp.�th.��A�th.�theol.��A�theol.�tidl.��A�tidl.�tilf.��A�tilf.�tilh.��A�tilh.�till.��A�till.�tilsv.��A�tilsv.�tirs.��A�tirs.C�tirsdag�tjg.��A�tjg.�tkr.��A�tkr.�tlf.��A�tlf.�tlgr.��A�tlgr.�tor.��A�tor.C�torsdag�tors.��A�tors.C�torsdag�tr.��A�tr.�trp.��A�trp.�tsk.��A�tsk.�tv.��A�tv.�ty.��A�ty.�u.��A�u.�u/b��A�u/b�udb.��A�udb.�udbet.��A�udbet.�ugtl.��A�ugtl.�undt.��A�undt.�v.��A�v.�v.f.��A�v.f.�v.v��A�v.v�v_v��A�v_v�vb.��A�vb.�vedk.��A�vedk.�vedl.��A�vedl.�vedr.��A�vedr.�ve
jl.��A�vejl.�vh.��A�vh.�vha.��A�vha.�vind.��A�vind.�vs.��A�vs.�vsa.��A�vsa.�vær.��A�vær.�w.��A�w.�x.��A�x.�xD��A�xD�xDD��A�xDD�y.��A�y.�z.��A�z.�zool.��A�zool.� ��A� C� �¯\(ツ)/¯��A�¯\(ツ)/¯�Årg.��A�Årg.�Årh.��A�Årh.�Ø.lgd.��A�Ø.lgd.�Øvr.��A�Øvr.�ä.��A�ä.�årg.��A�årg.�årh.��A�årh.�ö.��A�ö.�ø.lgd.��A�ø.lgd.�øvr.��A�øvr.�ü.��A�ü.�ಠ_ಠ��A�ಠ_ಠ�ಠ︵ಠ��A�ಠ︵ಠ�—��A�—�’��A�’�’’��A�’’
transformer/cfg ADDED
@@ -0,0 +1,3 @@
+ {
+ "max_batch_items":4096
+ }
transformer/model/config.json ADDED
@@ -0,0 +1,30 @@
+ {
+ "_name_or_path": "Maltehb/danish-bert-botxo",
+ "architectures": [
+ "BertForPreTraining"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "directionality": "bidi",
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "model_type": "bert",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 0,
+ "pooler_fc_size": 768,
+ "pooler_num_attention_heads": 12,
+ "pooler_num_fc_layers": 3,
+ "pooler_size_per_head": 128,
+ "pooler_type": "first_token_transform",
+ "position_embedding_type": "absolute",
+ "transformers_version": "4.5.1",
+ "type_vocab_size": 2,
+ "use_cache": true,
+ "vocab_size": 32000
+ }
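As a reading aid for the config above, here is a minimal Python sketch that derives two quantities from the listed fields: the per-head width of the attention layers and the size of the token embedding table. The helper names `head_dim` and `token_embedding_params` are hypothetical and are not part of this repository or of any library; the JSON string is a hand-copied subset of the file.

```python
import json

# Hand-copied subset of the fields shown in transformer/model/config.json.
config = json.loads("""
{
  "hidden_size": 768,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "vocab_size": 32000
}
""")


def head_dim(cfg):
    """Width of one attention head (hypothetical helper).

    Standard multi-head attention splits the hidden vector evenly
    across heads, so the division must leave no remainder.
    """
    hidden, heads = cfg["hidden_size"], cfg["num_attention_heads"]
    if hidden % heads != 0:
        raise ValueError("hidden_size is not divisible by num_attention_heads")
    return hidden // heads


def token_embedding_params(cfg):
    """Number of weights in the token embedding table (hypothetical helper)."""
    return cfg["vocab_size"] * cfg["hidden_size"]


print(head_dim(config))                # 64
print(token_embedding_params(config))  # 24576000
```

These numbers only describe the shape implied by the config; they say nothing about how the pipeline was trained.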
transformer/model/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f657cf1de5ca07b7c7940a3b91f6061a4b5bfafb8c27ba5bd96a853a3ccf4e1b
+ size 442554327
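The three lines above are a Git LFS pointer, not the weights themselves: the repository stores only the spec version, a SHA-256 object id, and the byte size of the real file. A minimal sketch of reading such a pointer in Python follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not a function from Git LFS or from this repository.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f657cf1de5ca07b7c7940a3b91f6061a4b5bfafb8c27ba5bd96a853a3ccf4e1b\n"
    "size 442554327\n"
)
print(pointer["hash_algo"], pointer["size"])
```

The `size` field is in bytes, so the referenced `pytorch_model.bin` is roughly 442.6 MB.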
transformer/model/special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
transformer/model/tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"do_lower_case": true, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": false, "model_max_length": 512, "special_tokens_map_file": null, "name_or_path": "Maltehb/danish-bert-botxo", "do_basic_tokenize": true, "never_split": null}
transformer/model/vocab.txt ADDED
The diff for this file is too large to render. See raw diff
 
vocab/key2row ADDED
@@ -0,0 +1 @@
+
vocab/lookups.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a6f4a94131759bf84baec98b3347bcef57ffb2d6712f7f3b8f611e9ef4b3df35
+ size 20402
vocab/strings.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5b50a86603f748496e4fd87a8aaa203a32bf82d4b3768bf54187ff40de3ca6f9
+ size 460120
vocab/vectors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:14772b683e726436d5948ad3fff2b43d036ef2ebbe3458aafed6004e05a40706
+ size 128