model update
README.md
CHANGED
@@ -73,7 +73,7 @@ model-index:
|
|
73 |
|
74 |
pipeline_tag: token-classification
|
75 |
widget:
|
76 |
-
- text: "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from {
|
77 |
  example_title: "NER Example 1"
|
78 |
---
|
79 |
# tner/roberta-large-tweetner7-selflabel2020
|
@@ -112,15 +112,34 @@ Full evaluation can be found at [metric file of NER](https://huggingface.co/tner
|
|
112 |
and [metric file of entity span](https://huggingface.co/tner/roberta-large-tweetner7-selflabel2020/raw/main/eval/metric_span.json).
|
113 |
|
114 |
### Usage
|
115 |
-
This model can be used through the [tner library](https://github.com/asahi417/tner). Install the library via pip
|
116 |
```shell
|
117 |
pip install tner
|
118 |
```
|
119 |
-
and
|
120 |
```python
|
121 |
from tner import TransformersNER
|
122 |
model = TransformersNER("tner/roberta-large-tweetner7-selflabel2020")
|
123 |
-
model.predict([
|
124 |
```
|
125 |
It can be used via the transformers library, but this is not recommended because the CRF layer is not supported at the moment.
|
126 |
|
@@ -166,3 +185,4 @@ If you use any resource from T-NER, please consider to cite our [paper](https://
|
|
166 |
}
|
167 |
|
168 |
```
|
|
|
|
73 |
|
74 |
pipeline_tag: token-classification
|
75 |
widget:
|
76 |
+
- text: "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from {@herbiehancock@} via {@bluenoterecords@} link below: {{URL}}"
|
77 |
  example_title: "NER Example 1"
|
78 |
---
|
79 |
# tner/roberta-large-tweetner7-selflabel2020
|
|
|
112 |
and [metric file of entity span](https://huggingface.co/tner/roberta-large-tweetner7-selflabel2020/raw/main/eval/metric_span.json).
|
113 |
|
114 |
### Usage
|
115 |
+
This model can be used through the [tner library](https://github.com/asahi417/tner). Install the library via pip.
|
116 |
```shell
|
117 |
pip install tner
|
118 |
```
|
119 |
+
The [TweetNER7](https://huggingface.co/datasets/tner/tweetner7) dataset pre-processes tweets so that account names and URLs are
|
120 |
+
converted into special formats (see the dataset page for more detail), so we format tweets the same way before running the model prediction, as below.
|
121 |
+
|
122 |
```python
|
123 |
+
import re
|
124 |
+
from urlextract import URLExtract
|
125 |
from tner import TransformersNER
|
126 |
+
|
127 |
+
extractor = URLExtract()
|
128 |
+
|
129 |
+
def format_tweet(tweet):
|
130 |
+
    # mask web urls
|
131 |
+
    urls = extractor.find_urls(tweet)
|
132 |
+
    for url in urls:
|
133 |
+
        tweet = tweet.replace(url, "{{URL}}")
|
134 |
+
    # format twitter account
|
135 |
+
    tweet = re.sub(r"\b(\s*)(@[\S]+)\b", r'\1{\2@}', tweet)
|
136 |
+
    return tweet
|
137 |
+
|
138 |
+
|
139 |
+
text = "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from @herbiehancock via @bluenoterecords link below: http://bluenote.lnk.to/AlbumOfTheWeek"
|
140 |
+
text_format = format_tweet(text)
|
141 |
model = TransformersNER("tner/roberta-large-tweetner7-selflabel2020")
|
142 |
+
model.predict([text_format])
|
143 |
```
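To check the formatting step on its own, without installing `tner` or `urlextract`, a minimal standard-library sketch is given below. It is not the code from this README: `format_tweet_sketch`, `URL_PATTERN`, and `MENTION_PATTERN` are hypothetical names, the URL pattern is a simplified stand-in that only catches `http://` and `https://` links, and the mention pattern is an assumption that handles are made of word characters. Unlike the `\b`-anchored regex above, which appears to need a word character before the whitespace that precedes the handle, this variant also formats a mention at the very start of a tweet.

```python
import re

# Assumption: a simplified stand-in for URLExtract that only matches http(s) links.
URL_PATTERN = re.compile(r"https?://\S+")
# Assumption: a handle is "@" plus word characters, not preceded by a word
# character (so e-mail addresses such as "a@b.com" are left untouched).
MENTION_PATTERN = re.compile(r"(?<!\w)@(\w+)")


def format_tweet_sketch(tweet):
    """Mask URLs as {{URL}} and wrap account names as {@name@}."""
    tweet = URL_PATTERN.sub("{{URL}}", tweet)
    tweet = MENTION_PATTERN.sub(r"{@\1@}", tweet)
    return tweet


text = (
    "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from "
    "@herbiehancock via @bluenoterecords link below: "
    "http://bluenote.lnk.to/AlbumOfTheWeek"
)
print(format_tweet_sketch(text))
```

For the example tweet, the result should match the widget text in the model card metadata, which makes it a convenient sanity check before passing the formatted string to `model.predict`.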
|
144 |
It can be used via the transformers library, but this is not recommended because the CRF layer is not supported at the moment.
|
145 |
|
|
|
185 |
}
|
186 |
|
187 |
```
|
188 |
+
|