hoan committed on
Commit
747c0b9
1 Parent(s): f3ea02d

Update README.md

Files changed (1)
  1. README.md +35 -326
README.md CHANGED
@@ -5,330 +5,39 @@ tags:
  - toxicity
  - toxic detection
  ---
- <div align="center">
-
- This model card is extracted from the original model: [https://huggingface.co/unitary/multilingual-toxic-xlm-roberta](https://huggingface.co/unitary/multilingual-toxic-xlm-roberta)
-
- **⚠️ Disclaimer:**
- The Hugging Face models currently give different results from the detoxify library (see the issue [here](https://github.com/unitaryai/detoxify/issues/15)). For the most up-to-date models, we recommend using the models from https://github.com/unitaryai/detoxify
-
- # 🙊 Detoxify
- ## Toxic Comment Classification with ⚡ Pytorch Lightning and 🤗 Transformers
-
- ![CI testing](https://github.com/unitaryai/detoxify/workflows/CI%20testing/badge.svg)
- ![Lint](https://github.com/unitaryai/detoxify/workflows/Lint/badge.svg)
-
- </div>
-
- ![Examples image](examples.png)
-
- ## Description
-
- Trained models & code to predict toxic comments on 3 Jigsaw challenges: Toxic comment classification, Unintended Bias in Toxic comments, Multilingual toxic comment classification.
-
- Built by [Laura Hanu](https://laurahanu.github.io/) at [Unitary](https://www.unitary.ai/), where we are working to stop harmful content online by interpreting visual content in context.
-
- Dependencies:
- For inference:
- 🤗 Transformers
- Pytorch lightning
- For training, you will also need:
- Kaggle API (to download data)
-
-
- | Challenge | Year | Goal | Original Data Source | Detoxify Model Name | Top Kaggle Leaderboard Score | Detoxify Score |
- |-|-|-|-|-|-|-|
- | [Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge) | 2018 | build a multi-headed model that’s capable of detecting different types of toxicity like threats, obscenity, insults, and identity-based hate. | Wikipedia Comments | `original` | 0.98856 | 0.98636 |
- | [Jigsaw Unintended Bias in Toxicity Classification](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification) | 2019 | build a model that recognizes toxicity and minimizes this type of unintended bias with respect to mentions of identities. You'll be using a dataset labeled for identity mentions and optimizing a metric designed to measure unintended bias. | Civil Comments | `unbiased` | 0.94734 | 0.93639 |
- | [Jigsaw Multilingual Toxic Comment Classification](https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification) | 2020 | build effective multilingual models | Wikipedia Comments + Civil Comments | `multilingual` | 0.9536 | 0.91655* |
-
- *Score not directly comparable since it is obtained on the validation set provided and not on the test set. To be updated when the test labels are made available.
-
- It is also noteworthy to mention that the top leaderboard scores have been achieved using model ensembles. The purpose of this library was to build something user-friendly and straightforward to use.
-
- ## Limitations and ethical considerations
-
- If words that are associated with swearing, insults or profanity are present in a comment, it is likely that it will be classified as toxic, regardless of the tone or the intent of the author, e.g. humorous/self-deprecating. This could present some biases towards already vulnerable minority groups.
-
- The intended use of this library is for research purposes, fine-tuning on carefully constructed datasets that reflect real-world demographics, and/or to aid content moderators in flagging harmful content more quickly.
-
- Some useful resources about the risk of different biases in toxicity or hate speech detection are:
- [The Risk of Racial Bias in Hate Speech Detection](https://homes.cs.washington.edu/~msap/pdfs/sap2019risk.pdf)
- [Automated Hate Speech Detection and the Problem of Offensive Language](https://arxiv.org/pdf/1703.04009.pdf%201.pdf)
- [Racial Bias in Hate Speech and Abusive Language Detection Datasets](https://arxiv.org/pdf/1905.12516.pdf)
-
- ## Quick prediction
-
-
- The `multilingual` model has been trained on 7 different languages, so it should only be tested on: `english`, `french`, `spanish`, `italian`, `portuguese`, `turkish` or `russian`.
-
- ```bash
- # install detoxify
-
- pip install detoxify
-
- ```
- ```python
-
- from detoxify import Detoxify
-
- # each model takes in either a string or a list of strings
-
- results = Detoxify('original').predict('example text')
-
- results = Detoxify('unbiased').predict(['example text 1','example text 2'])
-
- input_text = ['example text','exemple de texte','texto de ejemplo','testo di esempio','texto de exemplo','örnek metin','пример текста']
- results = Detoxify('multilingual').predict(input_text)
-
- # optional to display results nicely (will need to pip install pandas)
-
- import pandas as pd
-
- print(pd.DataFrame(results, index=input_text).round(5))
-
- ```
- For more details, check the Prediction section.
-
-
- ## Labels
- All challenges have a toxicity label. The toxicity labels represent the aggregate ratings of up to 10 annotators according to the following schema:
- **Very Toxic** (a very hateful, aggressive, or disrespectful comment that is very likely to make you leave a discussion or give up on sharing your perspective)
- **Toxic** (a rude, disrespectful, or unreasonable comment that is somewhat likely to make you leave a discussion or give up on sharing your perspective)
- **Hard to Say**
- **Not Toxic**
-
- More information about the labelling schema can be found [here](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data).
-
- ### Toxic Comment Classification Challenge
- This challenge includes the following labels:
-
- `toxic`
- `severe_toxic`
- `obscene`
- `threat`
- `insult`
- `identity_hate`
-
- ### Jigsaw Unintended Bias in Toxicity Classification
- This challenge has 2 types of labels: the main toxicity labels and some additional identity labels that represent the identities mentioned in the comments.
-
- Only identities with more than 500 examples in the test set (combined public and private) are included during training as additional labels and in the evaluation calculation.
-
- `toxicity`
- `severe_toxicity`
- `obscene`
- `threat`
- `insult`
- `identity_attack`
- `sexual_explicit`
-
- Identity labels used:
- `male`
- `female`
- `homosexual_gay_or_lesbian`
- `christian`
- `jewish`
- `muslim`
- `black`
- `white`
- `psychiatric_or_mental_illness`
-
- A complete list of all the identity labels available can be found [here](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data).
-
-
- ### Jigsaw Multilingual Toxic Comment Classification
-
- Since this challenge combines the data from the previous 2 challenges, it includes all labels from above; however, the final evaluation is only on:
-
- `toxicity`
-
- ## How to run
-
- First, install dependencies:
- ```bash
- # clone project
-
- git clone https://github.com/unitaryai/detoxify
-
- # create virtual env
-
- python3 -m venv toxic-env
- source toxic-env/bin/activate
-
- # install project
-
- pip install -e detoxify
- cd detoxify
-
- # for training
- pip install -r requirements.txt
-
- ```
-
- ## Prediction
-
- Trained models summary:
-
- |Model name| Transformer type| Data from|
- |:--:|:--:|:--:|
- |`original`| `bert-base-uncased` | Toxic Comment Classification Challenge|
- |`unbiased`| `roberta-base`| Unintended Bias in Toxicity Classification|
- |`multilingual`| `xlm-roberta-base`| Multilingual Toxic Comment Classification|
-
- For a quick prediction, you can run the example script on a comment directly or on a txt file containing a list of comments.
- ```bash
-
- # load model via torch.hub
-
- python run_prediction.py --input 'example' --model_name original
-
- # load model from checkpoint path
-
- python run_prediction.py --input 'example' --from_ckpt_path model_path
-
- # save results to a .csv file
-
- python run_prediction.py --input test_set.txt --model_name original --save_to results.csv
-
- # to see usage
-
- python run_prediction.py --help
-
- ```
-
- Checkpoints can be downloaded from the latest release or via the Pytorch hub API with the following names:
- `toxic_bert`
- `unbiased_toxic_roberta`
- `multilingual_toxic_xlm_r`
- ```python
- import torch
- model = torch.hub.load('unitaryai/detoxify', 'toxic_bert')
- ```
-
- Importing detoxify in Python:
-
- ```python
-
- from detoxify import Detoxify
-
- results = Detoxify('original').predict('some text')
-
- results = Detoxify('unbiased').predict(['example text 1','example text 2'])
-
- input_text = ['example text','exemple de texte','texto de ejemplo','testo di esempio','texto de exemplo','örnek metin','пример текста']
- results = Detoxify('multilingual').predict(input_text)
-
- # to display results nicely
-
- import pandas as pd
-
- print(pd.DataFrame(results, index=input_text).round(5))
-
- ```
-
-
- ## Training
-
- If you do not already have a Kaggle account:
- you need to create one to be able to download the data
-
- go to My Account and click on Create New API Token - this will download a kaggle.json file
-
- make sure this file is located in ~/.kaggle
-
- ```bash
-
- # create data directory
-
- mkdir jigsaw_data
- cd jigsaw_data
-
- # download data
-
- kaggle competitions download -c jigsaw-toxic-comment-classification-challenge
-
- kaggle competitions download -c jigsaw-unintended-bias-in-toxicity-classification
-
- kaggle competitions download -c jigsaw-multilingual-toxic-comment-classification
-
- ```
- ## Start Training
- ### Toxic Comment Classification Challenge
-
- ```bash
-
- python create_val_set.py
-
- python train.py --config configs/Toxic_comment_classification_BERT.json
- ```
- ### Unintended Bias in Toxicity Challenge
-
- ```bash
-
- python train.py --config configs/Unintended_bias_toxic_comment_classification_RoBERTa.json
-
- ```
- ### Multilingual Toxic Comment Classification
-
- This is trained in 2 stages. First, train on all available data, and second, train only on the translated versions of the first challenge.
-
- The [translated data](https://www.kaggle.com/miklgr500/jigsaw-train-multilingual-coments-google-api) can be downloaded from Kaggle in French, Spanish, Italian, Portuguese, Turkish, and Russian (the languages available in the test set).
-
- ```bash
-
- # stage 1
-
- python train.py --config configs/Multilingual_toxic_comment_classification_XLMR.json
-
- # stage 2
-
- python train.py --config configs/Multilingual_toxic_comment_classification_XLMR_stage2.json
-
- ```
- ### Monitor progress with tensorboard
-
- ```bash
-
- tensorboard --logdir=./saved
-
- ```
- ## Model Evaluation
-
- ### Toxic Comment Classification Challenge
-
- This challenge is evaluated on the mean AUC score of all the labels.
-
- ```bash
-
- python evaluate.py --checkpoint saved/lightning_logs/checkpoints/example_checkpoint.pth --test_csv test.csv
-
- ```
- ### Unintended Bias in Toxicity Challenge
-
- This challenge is evaluated on a novel bias metric that combines different AUC scores to balance overall performance. More information on this metric can be found [here](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/overview/evaluation).
-
- ```bash
-
- python evaluate.py --checkpoint saved/lightning_logs/checkpoints/example_checkpoint.pth --test_csv test.csv
-
- # to get the final bias metric
- python model_eval/compute_bias_metric.py
-
- ```
- ### Multilingual Toxic Comment Classification
-
- This challenge is evaluated on the AUC score of the main toxic label.
-
- ```bash
-
- python evaluate.py --checkpoint saved/lightning_logs/checkpoints/example_checkpoint.pth --test_csv test.csv
-
- ```
-
- ### Citation
  ```
- @misc{Detoxify,
- title={Detoxify},
- author={Hanu, Laura and {Unitary team}},
- howpublished={Github. https://github.com/unitaryai/detoxify},
- year={2020}
- }
- ```
+ This ONNX model is a dynamically quantized version of the original model: [https://huggingface.co/unitary/multilingual-toxic-xlm-roberta](https://huggingface.co/unitary/multilingual-toxic-xlm-roberta)
+
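+ For reference, below is a minimal sketch of how this kind of dynamic int8 quantization can be produced with the `optimum` library. The export and quantization settings shown are illustrative assumptions, not necessarily the exact configuration used for this repository:
+
+ ```python
+ from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
+ from optimum.onnxruntime.configuration import AutoQuantizationConfig
+
+ # export the original PyTorch checkpoint to ONNX
+ onnx_model = ORTModelForSequenceClassification.from_pretrained(
+     "unitary/multilingual-toxic-xlm-roberta", export=True)
+ onnx_model.save_pretrained("onnx-toxic-xlm-roberta")
+
+ # apply dynamic quantization (assumed target: AVX2 CPUs)
+ quantizer = ORTQuantizer.from_pretrained("onnx-toxic-xlm-roberta")
+ qconfig = AutoQuantizationConfig.avx2(is_static=False, per_channel=False)
+ quantizer.quantize(save_dir="onnx-toxic-xlm-roberta-quantized", quantization_config=qconfig)
+ ```
+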
+ ### Usage
+ Using `pipeline` from the `optimum` library:
+ ```python
+ from optimum.pipelines import pipeline as pipeline_onnx
+
+ quantized_pipeline = pipeline_onnx(
+     "text-classification",
+     model="hoan/multilingual-toxic-xlm-roberta-dynamic-quantized",
+     accelerator="ort",
+     top_k=None)
+
+ text = """Artificial intelligence (AI), frequently depicted in mainstream media as a harbinger of both groundbreaking innovation and understandable concern, has seamlessly permeated and embedded itself within a multitude of diverse sectors that constitute the intricate tapestry of our contemporary society. This relentless integration spans a wide spectrum, extending from the realms of healthcare, where AI is catalyzing transformative breakthroughs in disease diagnosis, treatment planning, and medical research, to the intricate domain of finance, where algorithms are reshaping the landscape of investment strategies, risk assessment, and market predictions."""
+ result = quantized_pipeline(text)
+ ```
+
+ The result is:
+ ```
+ [[{'label': 'LABEL_0', 'score': 0.20129913091659546},
+ {'label': 'LABEL_7', 'score': 0.15916699171066284},
+ {'label': 'LABEL_8', 'score': 0.11170786619186401},
+ {'label': 'LABEL_4', 'score': 0.08739417046308517},
+ {'label': 'LABEL_14', 'score': 0.07075755298137665},
+ {'label': 'LABEL_10', 'score': 0.06712991744279861},
+ {'label': 'LABEL_2', 'score': 0.06049202010035515},
+ {'label': 'LABEL_13', 'score': 0.05316930636763573},
+ {'label': 'LABEL_3', 'score': 0.04292014613747597},
+ {'label': 'LABEL_12', 'score': 0.041316740214824677},
+ {'label': 'LABEL_15', 'score': 0.026307238265872},
+ {'label': 'LABEL_9', 'score': 0.024316439405083656},
+ {'label': 'LABEL_11', 'score': 0.017735131084918976},
+ {'label': 'LABEL_5', 'score': 0.015682993456721306},
+ {'label': 'LABEL_6', 'score': 0.01122154202312231},
+ {'label': 'LABEL_1', 'score': 0.009382889606058598}]]
  ```
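+
+ The scores above use the generic `LABEL_0` … `LABEL_15` names, which suggests the exported config does not carry human-readable label names. As a minimal sketch (assuming the original checkpoint's config still exposes an `id2label` mapping with the readable names), they can be recovered from the source model and applied to the output of the usage example above:
+
+ ```python
+ from transformers import AutoConfig
+
+ # assumption: the original (non-quantized) checkpoint keeps readable label names in id2label
+ source_config = AutoConfig.from_pretrained("unitary/multilingual-toxic-xlm-roberta")
+
+ # map e.g. 'LABEL_0' -> source_config.id2label[0] in the pipeline output `result`
+ readable = [
+     {"label": source_config.id2label[int(item["label"].split("_")[-1])], "score": item["score"]}
+     for item in result[0]
+ ]
+ print(readable)
+ ```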