Update Readme
README.md CHANGED
````diff
@@ -9,7 +9,7 @@ widget:
 ---
 
 ## Token Classification
-
+Classifies Gro's items and metrics
 
 | **tag** | **token** |
 |---------------------------------|-----------|
@@ -17,6 +17,7 @@ widget:
 |I-ITEM | INSIDE ITEM|
 |B-METRIC |BEGINNING METRIC |
 |I-METRIC | INSIDE METRIC|
+|O | OUTSIDE |
 
 ---
 
@@ -25,71 +26,17 @@ widget:
-The following Flair script was used to train this model:
+The following script runs inference with this model:
 
 ```python
-from 
-
-from flair.embeddings import WordEmbeddings, StackedEmbeddings, FlairEmbeddings
-
-# 1. get the corpus
-corpus: Corpus = CONLL_2000()
-
-# 2. what tag do we want to predict?
-tag_type = 'np'
-
-# 3. make the tag dictionary from the corpus
-tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)
-
-# 4. initialize each embedding we use
-embedding_types = [
-
-    # contextual string embeddings, forward
-    FlairEmbeddings('news-forward'),
-
-    # contextual string embeddings, backward
-    FlairEmbeddings('news-backward'),
-]
-
-# embedding stack consists of Flair and GloVe embeddings
-embeddings = StackedEmbeddings(embeddings=embedding_types)
-
-# 5. initialize sequence tagger
-from flair.models import SequenceTagger
-
-tagger = SequenceTagger(hidden_size=256,
-                        embeddings=embeddings,
-                        tag_dictionary=tag_dictionary,
-                        tag_type=tag_type)
+from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline
+tokenizer = AutoTokenizer.from_pretrained("Wanjiru/autotrain_gro_ner")
 
-
-from flair.trainers import ModelTrainer
+model = AutoModelForTokenClassification.from_pretrained("Wanjiru/autotrain_gro_ner")
 
-
-
-
-
-                      train_with_dev=True,
-                      max_epochs=150)
+nlp = pipeline("ner", model=model, tokenizer=tokenizer)
+example = "Wanjru"
+ner_res = nlp(example)
+
 ```
 
 
 
 ---
-
-### Cite
-
-Please cite the following paper when using this model.
-
-```
-@inproceedings{akbik2018coling,
-  title={Contextual String Embeddings for Sequence Labeling},
-  author={Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},
-  booktitle = {{COLING} 2018, 27th International Conference on Computational Linguistics},
-  pages = {1638--1649},
-  year = {2018}
-}
-```
-
----
-
-### Issues?
-
-The Flair issue tracker is available [here](https://github.com/flairNLP/flair/issues/).
````
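The tag table in the new README follows the standard BIO scheme: `B-` marks the first token of an entity, `I-` a continuation, and `O` everything else. As a minimal, self-contained sketch of turning such per-token tags into entity spans (the helper and the sample tokens/tags are illustrative only, not output from `Wanjiru/autotrain_gro_ner`):

```python
# Minimal BIO decoder. The sample tokens/tags below are illustrative;
# they are not predictions from Wanjiru/autotrain_gro_ner.
def bio_to_spans(tokens, tags):
    """Merge B-/I- tagged tokens into (entity_type, text) spans; O tokens are skipped."""
    spans, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:  # "O", or an I- tag with no matching B- before it
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans

print(bio_to_spans(
    ["corn", "planted", "area", "in", "Iowa"],
    ["B-ITEM", "B-METRIC", "I-METRIC", "O", "O"],
))
# [('ITEM', 'corn'), ('METRIC', 'planted area')]
```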
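For reference, `nlp(example)` in the new snippet returns a list of per-token dicts with keys such as `word`, `entity`, and `score`. A small sketch of filtering that output by confidence, using fabricated records in that shape rather than real predictions (recent `transformers` releases can also merge subword tokens into whole entities via `pipeline(..., aggregation_strategy="simple")`):

```python
# Fabricated records mimicking the token-classification pipeline's output
# shape; these are NOT real predictions from Wanjiru/autotrain_gro_ner.
ner_res = [
    {"word": "corn", "entity": "B-ITEM", "score": 0.97},
    {"word": "##meal", "entity": "I-ITEM", "score": 0.62},
    {"word": "yield", "entity": "B-METRIC", "score": 0.91},
]

# Keep only predictions the model is reasonably confident about.
confident = [(r["word"], r["entity"]) for r in ner_res if r["score"] >= 0.8]
print(confident)
# [('corn', 'B-ITEM'), ('yield', 'B-METRIC')]
```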