---
tags:
- spacy
- token-classification
language:
- zh
model-index:
- name: zh_lzh_sigtyp_trf
  results:
  - task:
      name: TAG
      type: token-classification
    metrics:
    - name: TAG (XPOS) Accuracy
      type: accuracy
      value: 0.742052984
  - task:
      name: POS
      type: token-classification
    metrics:
    - name: POS (UPOS) Accuracy
      type: accuracy
      value: 0.7949418685
  - task:
      name: MORPH
      type: token-classification
    metrics:
    - name: Morph (UFeats) Accuracy
      type: accuracy
      value: 0.8236478744
  - task:
      name: LEMMA
      type: token-classification
    metrics:
    - name: Lemma Accuracy
      type: accuracy
      value: 0.942007037
  - task:
      name: UNLABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Unlabeled Attachment Score (UAS)
      type: f_score
      value: 0.8228271306
  - task:
      name: LABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Labeled Attachment Score (LAS)
      type: f_score
      value: 0.7703219397
  - task:
      name: SENTS
      type: token-classification
    metrics:
    - name: Sentences F-Score
      type: f_score
      value: 0.9851073655
---
| Feature | Description |
| --- | --- |
| **Name** | `zh_lzh_sigtyp_trf` |
| **Version** | `0.1.0` |
| **spaCy** | `>=3.6.1,<3.7.0` |
| **Default Pipeline** | `transformer`, `parser`, `trainable_lemmatizer`, `tagger`, `morphologizer` |
| **Components** | `transformer`, `parser`, `trainable_lemmatizer`, `tagger`, `morphologizer` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | n/a |
| **License** | n/a |
| **Author** | [n/a]() |
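
The table above lists the pipeline name and its components. As a minimal sketch of how such a pipeline is typically used, the helper below loads it with `spacy.load` and collects the annotations each component writes. It assumes the package `zh_lzh_sigtyp_trf` is already installed in an environment with a compatible spaCy version (`>=3.6.1,<3.7.0`) and `spacy-transformers`; the function name `analyze` and the returned dictionary keys are illustrative, not part of the model.

```python
def analyze(text, model_name="zh_lzh_sigtyp_trf"):
    """Run the pipeline on `text` and return one dict of annotations per token.

    Hypothetical helper: assumes the model package is installed locally.
    """
    # Imported inside the function so this file can be read or imported
    # even where spaCy is not installed.
    import spacy

    nlp = spacy.load(model_name)
    doc = nlp(text)
    return [
        {
            "text": token.text,
            "lemma": token.lemma_,        # set by trainable_lemmatizer
            "xpos": token.tag_,           # set by tagger
            "upos": token.pos_,           # set by morphologizer
            "morph": str(token.morph),    # set by morphologizer
            "dep": token.dep_,            # set by parser
            "head": token.head.text,      # set by parser
        }
        for token in doc
    ]

# Example call (needs the model installed):
# for row in analyze("子曰學而時習之"):
#     print(row)
```

Loading the pipeline once and reusing the `nlp` object is usually preferable when processing many texts; the helper reloads it on every call only to keep the sketch short.
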

### Label Scheme

<details>

<summary>View label scheme (272 labels for 3 components)</summary>

| Component | Labels |
| --- | --- |
| **`parser`** | `ROOT`, `acl`, `advcl`, `advmod`, `amod`, `aux`, `case`, `cc`, `ccomp`, `clf`, `compound`, `compound:redup`, `conj`, `cop`, `csubj`, `csubj:outer`, `dep`, `det`, `discourse`, `discourse:sp`, `dislocated`, `expl`, `fixed`, `flat`, `flat:foreign`, `flat:vv`, `iobj`, `list`, `mark`, `nmod`, `nsubj`, `nsubj:outer`, `nummod`, `obj`, `obl`, `obl:lmod`, `obl:tmod`, `parataxis`, `vocative`, `xcomp` |
| **`tagger`** | `n,代名詞,人称,他__Person=1\|PronType=Prs`, `n,代名詞,人称,他__Person=2\|PronType=Prs`, `n,代名詞,人称,他__Person=3\|PronType=Prs`, `n,代名詞,人称,他__PronType=Prs`, `n,代名詞,人称,他__PronType=Prs\|Reflex=Yes`, `n,代名詞,人称,止格__Person=1\|PronType=Prs`, `n,代名詞,人称,止格__Person=2\|PronType=Prs`, `n,代名詞,人称,止格__Person=3\|PronType=Prs`, `n,代名詞,人称,起格__Person=1\|PronType=Prs`, `n,代名詞,人称,起格__Person=2\|PronType=Prs`, `n,代名詞,人称,起格__Person=3\|PronType=Prs`, `n,代名詞,人称,起格__PronType=Prs`, `n,代名詞,指示,*__PronType=Dem`, `n,代名詞,疑問,*__PronType=Int`, `n,名詞,不可譲,属性`, `n,名詞,不可譲,疾病`, `n,名詞,不可譲,身体`, `n,名詞,主体,動物`, `n,名詞,主体,国名__Case=Loc\|NameType=Nat`, `n,名詞,主体,書物`, `n,名詞,主体,機関`, `n,名詞,主体,神仏`, `n,名詞,主体,集団`, `n,名詞,人,その他の人名__NameType=Prs`, `n,名詞,人,人`, `n,名詞,人,名__NameType=Giv`, `n,名詞,人,姓氏__NameType=Sur`, `n,名詞,人,役割`, `n,名詞,人,複合的人名__NameType=Prs`, `n,名詞,人,関係`, `n,名詞,制度,儀礼`, `n,名詞,制度,場__Case=Loc`, `n,名詞,可搬,乗り物`, `n,名詞,可搬,伝達`, `n,名詞,可搬,成果物`, `n,名詞,可搬,糧食`, `n,名詞,可搬,道具`, `n,名詞,固定物,地名__Case=Loc\|NameType=Geo`, `n,名詞,固定物,地形__Case=Loc`, `n,名詞,固定物,建造物__Case=Loc`, `n,名詞,固定物,樹木`, `n,名詞,固定物,関係__Case=Loc`, `n,名詞,外観,人`, `n,名詞,天象,天文`, `n,名詞,天象,怪異`, `n,名詞,天象,気象`, `n,名詞,度量衡,*__NounType=Clf`, `n,名詞,思考,*`, `n,名詞,思考,思考`, `n,名詞,描写,形質`, `n,名詞,描写,態度`, `n,名詞,数量,*`, `n,名詞,時,*__Case=Tem`, `n,名詞,行為,*`, `n,数詞,干支,*__NumType=Ord`, `n,数詞,数,*`, `n,数詞,数字,*`, `p,助詞,句末,*`, `p,助詞,句頭,*`, `p,助詞,接続,並列`, `p,助詞,接続,体言化`, `p,助詞,接続,属格`, `p,助詞,提示,*`, `p,感嘆詞,*,*`, `p,接尾辞,*,*`, `s,文字,*,*`, `s,記号,一般,*`, `s,記号,括弧開,*`, `s,記号,読点,*`, `v,前置詞,基盤,*`, `v,前置詞,源泉,*`, `v,前置詞,経由,*`, `v,前置詞,関係,*`, `v,副詞,判断,推定`, `v,副詞,判断,確定`, `v,副詞,判断,逆接`, `v,副詞,否定,体言否定__Polarity=Neg`, `v,副詞,否定,有界__Polarity=Neg`, `v,副詞,否定,無界__Polarity=Neg`, `v,副詞,否定,禁止__Polarity=Neg`, `v,副詞,描写,*`, `v,副詞,時相,変化__AdvType=Tim`, `v,副詞,時相,完了__AdvType=Tim\|Aspect=Perf`, `v,副詞,時相,将来__AdvType=Tim\|Tense=Fut`, `v,副詞,時相,恒常__AdvType=Tim`, `v,副詞,時相,現在__AdvType=Tim\|Tense=Pres`, `v,副詞,時相,終局__AdvType=Tim`, `v,副詞,時相,継起__AdvType=Tim`, `v,副詞,時相,緊接__AdvType=Tim`, `v,副詞,時相,過去__AdvType=Tim\|Tense=Past`, `v,副詞,疑問,原因__AdvType=Cau`, `v,副詞,疑問,反語`, `v,副詞,疑問,所在`, `v,副詞,程度,やや高度__AdvType=Deg\|Degree=Cmp`, `v,副詞,程度,極度__AdvType=Deg\|Degree=Sup`, `v,副詞,程度,軽度__AdvType=Deg\|Degree=Pos`, `v,副詞,範囲,共同`, `v,副詞,範囲,総括`, `v,副詞,範囲,限定`, `v,副詞,頻度,偶発`, `v,副詞,頻度,重複`, `v,副詞,頻度,頻繁`, `v,助動詞,受動,*__Voice=Pass`, `v,助動詞,可能,*__Mood=Pot`, `v,助動詞,必要,*__Mood=Nec`, `v,助動詞,願望,*__Mood=Des`, `v,動詞,変化,制度`, `v,動詞,変化,制度__VerbForm=Conv`, `v,動詞,変化,制度__VerbForm=Part`, `v,動詞,変化,性質`, `v,動詞,変化,性質__VerbForm=Conv`, `v,動詞,変化,性質__VerbForm=Part`, `v,動詞,変化,生物`, `v,動詞,変化,生物__VerbForm=Conv`, `v,動詞,変化,生物__VerbForm=Part`, `v,動詞,存在,存在`, `v,動詞,存在,存在__Polarity=Neg`, `v,動詞,存在,存在__Polarity=Neg\|VerbForm=Conv`, `v,動詞,存在,存在__Polarity=Neg\|VerbForm=Part`, `v,動詞,存在,存在__VerbForm=Conv`, `v,動詞,存在,存在__VerbForm=Part`, `v,動詞,存在,存在__VerbType=Cop`, `v,動詞,描写,境遇__Degree=Pos`, `v,動詞,描写,境遇__Degree=Pos\|VerbForm=Conv`, `v,動詞,描写,境遇__Degree=Pos\|VerbForm=Part`, `v,動詞,描写,形質__Degree=Pos`, `v,動詞,描写,形質__Degree=Pos\|VerbForm=Conv`, `v,動詞,描写,形質__Degree=Pos\|VerbForm=Part`, `v,動詞,描写,態度__Degree=Pos`, `v,動詞,描写,態度__Degree=Pos\|VerbForm=Conv`, `v,動詞,描写,態度__Degree=Pos\|VerbForm=Part`, `v,動詞,描写,量__Degree=Pos`, `v,動詞,描写,量__Degree=Pos\|VerbForm=Conv`, `v,動詞,描写,量__Degree=Pos\|VerbForm=Part`, `v,動詞,行為,交流`, `v,動詞,行為,交流__VerbForm=Conv`, `v,動詞,行為,交流__VerbForm=Part`, `v,動詞,行為,伝達`, `v,動詞,行為,伝達__VerbForm=Conv`, `v,動詞,行為,伝達__VerbForm=Part`, `v,動詞,行為,使役`, `v,動詞,行為,使役__VerbForm=Conv`, `v,動詞,行為,使役__VerbForm=Part`, `v,動詞,行為,儀礼`, `v,動詞,行為,儀礼__VerbForm=Conv`, `v,動詞,行為,儀礼__VerbForm=Part`, `v,動詞,行為,分類__Degree=Equ`, `v,動詞,行為,分類__Degree=Equ\|VerbForm=Conv`, `v,動詞,行為,分類__Degree=Equ\|VerbForm=Part`, `v,動詞,行為,動作`, `v,動詞,行為,動作__VerbForm=Conv`, `v,動詞,行為,動作__VerbForm=Part`, `v,動詞,行為,姿勢`, `v,動詞,行為,姿勢__VerbForm=Conv`, `v,動詞,行為,姿勢__VerbForm=Part`, `v,動詞,行為,役割`, `v,動詞,行為,役割__VerbForm=Conv`, `v,動詞,行為,役割__VerbForm=Part`, `v,動詞,行為,得失`, `v,動詞,行為,得失__VerbForm=Conv`, `v,動詞,行為,得失__VerbForm=Part`, `v,動詞,行為,態度`, `v,動詞,行為,態度__VerbForm=Conv`, `v,動詞,行為,態度__VerbForm=Part`, `v,動詞,行為,生産`, `v,動詞,行為,生産__VerbForm=Conv`, `v,動詞,行為,生産__VerbForm=Part`, `v,動詞,行為,移動`, `v,動詞,行為,移動__VerbForm=Conv`, `v,動詞,行為,移動__VerbForm=Part`, `v,動詞,行為,設置`, `v,動詞,行為,設置__VerbForm=Conv`, `v,動詞,行為,設置__VerbForm=Part`, `v,動詞,行為,飲食`, `v,動詞,行為,飲食__VerbForm=Conv`, `v,動詞,行為,飲食__VerbForm=Part` |
| **`morphologizer`** | `POS=VERB`, `Case=Loc\|POS=NOUN`, `POS=VERB\|Polarity=Neg`, `POS=VERB\|VerbForm=Part`, `POS=NOUN`, `POS=ADV`, `AdvType=Tim\|POS=ADV\|Tense=Fut`, `POS=PRON\|Person=3\|PronType=Prs`, `POS=PART`, `POS=NUM`, `POS=CCONJ`, `Case=Tem\|POS=NOUN`, `Degree=Pos\|POS=VERB\|VerbForm=Part`, `Degree=Pos\|POS=VERB`, `NameType=Giv\|POS=PROPN`, `POS=PRON\|Person=2\|PronType=Prs`, `POS=SCONJ`, `POS=ADV\|Polarity=Neg`, `POS=PRON\|PronType=Dem`, `POS=AUX\|VerbType=Cop`, `NameType=Sur\|POS=PROPN`, `Mood=Pot\|POS=AUX`, `POS=ADV\|VerbForm=Conv`, `POS=ADP`, `NameType=Prs\|POS=PROPN`, `AdvType=Tim\|POS=ADV`, `POS=PRON\|Person=1\|PronType=Prs`, `Degree=Pos\|POS=ADV\|VerbForm=Conv`, `Case=Loc\|NameType=Nat\|POS=PROPN`, `POS=INTJ`, `AdvType=Tim\|Aspect=Perf\|POS=ADV`, `POS=PRON\|PronType=Int`, `Case=Loc\|NameType=Geo\|POS=PROPN`, `AdvType=Cau\|POS=ADV`, `POS=PRON\|PronType=Prs\|Reflex=Yes`, `Mood=Des\|POS=AUX`, `Degree=Equ\|POS=VERB`, `AdvType=Tim\|POS=ADV\|Tense=Past`, `POS=PRON\|PronType=Prs`, `POS=SYM`, `AdvType=Deg\|Degree=Cmp\|POS=ADV`, `POS=AUX\|Voice=Pass`, `NounType=Clf\|POS=NOUN`, `POS=ADV\|Polarity=Neg\|VerbForm=Conv`, `NumType=Ord\|POS=NUM`, `POS=VERB\|Polarity=Neg\|VerbForm=Part`, `Degree=Equ\|POS=ADV\|VerbForm=Conv`, `AdvType=Tim\|POS=ADV\|Tense=Pres`, `Mood=Nec\|POS=AUX`, `AdvType=Deg\|Degree=Sup\|POS=ADV`, `Degree=Equ\|POS=VERB\|VerbForm=Part`, `AdvType=Deg\|Degree=Pos\|POS=ADV`, `Degree=Equ\|POS=ADP`, `POS=PUNCT`, `POS=PROPN`, `Degree=Pos\|POS=NOUN` |
</details>
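
The `tagger` labels above appear to pack two things into one string: a four-field, comma-separated XPOS tag, then an optional `__`-separated list of `Key=Value` features joined by `|` (shown as `\|` in the table only because of Markdown escaping). The sketch below splits a label under that assumption; `parse_tagger_label` is a hypothetical helper, not part of spaCy or of this model.

```python
def parse_tagger_label(label):
    """Split a tagger label into (xpos, feats).

    Assumes the layout seen in the label scheme:
    "<xpos>" or "<xpos>__Key=Value|Key=Value".
    """
    xpos, sep, feat_str = label.partition("__")
    feats = {}
    if sep:  # a feature string is present only when "__" occurs
        for pair in feat_str.split("|"):
            key, _, value = pair.partition("=")
            feats[key] = value
    return xpos, feats


xpos, feats = parse_tagger_label("n,代名詞,人称,他__Person=1|PronType=Prs")
print(xpos)   # n,代名詞,人称,他
print(feats)  # {'Person': '1', 'PronType': 'Prs'}

# The XPOS part itself has four comma-separated fields in every label listed.
fields = xpos.split(",")
print(len(fields))  # 4
```

Labels without a feature part, such as `n,名詞,人,人`, come back with an empty feature dictionary.
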

### Accuracy

| Type | Score |
| --- | --- |
| `DEP_UAS` | 82.28 |
| `DEP_LAS` | 77.03 |
| `SENTS_P` | 98.08 |
| `SENTS_R` | 98.94 |
| `SENTS_F` | 98.51 |
| `LEMMA_ACC` | 94.20 |
| `TAG_ACC` | 74.21 |
| `POS_ACC` | 79.49 |
| `MORPH_ACC` | 82.36 |
| `TRANSFORMER_LOSS` | 3733506.66 |
| `PARSER_LOSS` | 1170567.44 |
| `TRAINABLE_LEMMATIZER_LOSS` | 98325.33 |
| `TAGGER_LOSS` | 2806037.87 |
| `MORPHOLOGIZER_LOSS` | 2426650.00 |
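
As a quick consistency check on the table above, `SENTS_F` should be the harmonic mean of `SENTS_P` and `SENTS_R`. The sketch below recomputes it from the two reported values; `f_score` is an illustrative helper, and the standard F1 formula is assumed to be what the evaluation used.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the SENTS_P and SENTS_R rows.
sents_f = f_score(98.08, 98.94)
print(round(sents_f, 2))  # 98.51, matching the SENTS_F row
```
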

### Citation

If you're using this model, please cite:

```
@inproceedings{miranda-2024-allen,
title = "{A}llen Institute for {AI} @ {SIGTYP} 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages",
author = "Miranda, Lester James",
booktitle = "Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP",
month = mar,
year = "2024",
address = "St. Julian's, Malta",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.sigtyp-1.18",
pages = "151--159",
}
```