danlou committed on
Commit c990c11 • 1 Parent(s): 3b3eca6

Update README.md

Files changed (1)
  1. README.md +22 -22
README.md CHANGED
@@ -36,7 +36,7 @@ def preprocess(text):
  ```python
  from transformers import pipeline, AutoTokenizer

- MODEL = "cardiffnlp/twitter-roberta-base-jun2022-15M-incr"
+ MODEL = "cardiffnlp/twitter-roberta-base-mar2022-15M-incr"
  fill_mask = pipeline("fill-mask", model=MODEL, tokenizer=MODEL)
  tokenizer = AutoTokenizer.from_pretrained(MODEL)

@@ -63,25 +63,25 @@ Output:
  ```
  ------------------------------
  So glad I'm <mask> vaccinated.
- 1) 0.48904 not
- 2) 0.19832 fully
- 3) 0.13791 getting
- 4) 0.02852 still
- 5) 0.01900 triple
+ 1) 0.35668 not
+ 2) 0.27636 fully
+ 3) 0.18418 getting
+ 4) 0.03197 still
+ 5) 0.02259 triple
  ------------------------------
  I keep forgetting to bring a <mask>.
- 1) 0.05997 backpack
- 2) 0.05158 charger
- 3) 0.05071 book
- 4) 0.04741 lighter
- 5) 0.03621 bag
+ 1) 0.04261 book
+ 2) 0.04233 backpack
+ 3) 0.04161 charger
+ 4) 0.03892 mask
+ 5) 0.03636 lighter
  ------------------------------
  Looking forward to watching <mask> Game tonight!
- 1) 0.54114 the
- 2) 0.23145 The
- 3) 0.01682 this
- 4) 0.01435 Squid
- 5) 0.01300 End
+ 1) 0.55292 the
+ 2) 0.17813 The
+ 3) 0.03052 this
+ 4) 0.01565 Championship
+ 5) 0.01391 End
  ```

  ## Example Tweet Embeddings
@@ -99,7 +99,7 @@ def get_embedding(text): # naive approach for demonstration
      return np.mean(features[0], axis=0)


- MODEL = "cardiffnlp/twitter-roberta-base-jun2022-15M-incr"
+ MODEL = "cardiffnlp/twitter-roberta-base-mar2022-15M-incr"
  tokenizer = AutoTokenizer.from_pretrained(MODEL)
  model = AutoModel.from_pretrained(MODEL)

@@ -124,10 +124,10 @@ Output:
  ```
  Most similar to: The book was awesome
  ------------------------------
- 1) 0.98878 The movie was great
- 2) 0.96100 Just finished reading 'Embeddings in NLP'
- 3) 0.94927 I just ordered fried chicken 🐣
- 4) 0.94668 What time is the next game?
+ 1) 0.98951 The movie was great
+ 2) 0.96042 Just finished reading 'Embeddings in NLP'
+ 3) 0.95454 I just ordered fried chicken 🐣
+ 4) 0.95148 What time is the next game?
  ```

  ## Example Feature Extraction
@@ -136,7 +136,7 @@ Most similar to: The book was awesome
  from transformers import AutoTokenizer, AutoModel, TFAutoModel
  import numpy as np

- MODEL = "cardiffnlp/twitter-roberta-base-jun2022-15M-incr"
+ MODEL = "cardiffnlp/twitter-roberta-base-mar2022-15M-incr"
  tokenizer = AutoTokenizer.from_pretrained(MODEL)

  text = "Good night 😊"
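
To make the effect of this commit easier to read, here is a minimal sketch that tabulates how the top-5 fill-mask scores for the first prompt shift between the two checkpoints named in the diff. The scores are copied from the removed and added lines of the diff. The helper `score_deltas` and the variable names are hypothetical and are not part of the README.

```python
# Top-5 fill-mask scores for "So glad I'm <mask> vaccinated.",
# copied from the removed (-) and added (+) lines of this commit.
OLD_MODEL = "cardiffnlp/twitter-roberta-base-jun2022-15M-incr"
NEW_MODEL = "cardiffnlp/twitter-roberta-base-mar2022-15M-incr"

old_scores = {"not": 0.48904, "fully": 0.19832, "getting": 0.13791,
              "still": 0.02852, "triple": 0.01900}
new_scores = {"not": 0.35668, "fully": 0.27636, "getting": 0.18418,
              "still": 0.03197, "triple": 0.02259}


def score_deltas(old, new):
    """Return {token: new - old} for tokens present in both rankings.

    Hypothetical helper for illustration; it is not part of the README.
    """
    return {tok: round(new[tok] - old[tok], 5) for tok in old if tok in new}


deltas = score_deltas(old_scores, new_scores)

print(f"{OLD_MODEL} -> {NEW_MODEL}")
# List tokens from the largest drop to the largest gain.
for token, delta in sorted(deltas.items(), key=lambda kv: kv[1]):
    print(f"{token:>8} {delta:+.5f}")
```

For this prompt the score of `not` drops while `fully` and `getting` gain, and the order of the five tokens is the same in both outputs.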