---
license: openrail++
datasets:
- ukr-detect/ukr-toxicity-dataset-seminatural
language:
- uk
widget:
- text: Ти неймовірна!
---
## Binary toxicity classifier for Ukrainian
This is an [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) model fine-tuned on the semi-automatically collected [Ukrainian toxicity classification dataset](https://huggingface.co/datasets/ukr-detect/ukr-toxicity-dataset).
On the test set, the model achieves the following binary toxicity classification metrics:
| Metric | Value |
|-----------|-------|
| F1-score | 0.99 |
| Precision | 0.99 |
| Recall | 0.99 |
| Accuracy | 0.99 |
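
For readers who want to compute the same four metrics on their own predictions, the following is a minimal sketch in plain Python. It assumes that the toxic class is treated as the positive class (encoded here as `1`); the card does not state the label encoding or the averaging scheme, so both are assumptions. The function name `binary_metrics` and the toy labels are purely illustrative and are not the model's outputs.

```python
from typing import Dict, Sequence


def binary_metrics(gold: Sequence[int], pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Compute precision, recall, F1 and accuracy for one positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels, made up for illustration only (1 = toxic, 0 = non-toxic).
toy_gold = [1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 0, 1, 1]
toy_scores = binary_metrics(toy_gold, toy_pred)
print(toy_scores)
```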
## How to use
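```
from transformers import pipeline
# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification",
                      model="ukr-detect/ukr-toxicity-classifier")
# Each result is a dict with "label" and "score" keys.
print(classifier("Ти неймовірна!"))
```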
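
To classify several texts at once, the pipeline can be called with a list of strings. The sketch below wraps that call in a small helper; `classify_batch` and `main` are hypothetical names introduced here for illustration. The label strings the model returns are not documented in this card, so the helper passes them through unchanged rather than assuming their values.

```python
from typing import Callable, Dict, List, Tuple


def classify_batch(
    classifier: Callable[[List[str]], List[Dict]],
    texts: List[str],
) -> List[Tuple[str, str, float]]:
    """Pair each input text with the label and score the classifier returns."""
    # A text-classification pipeline called with a list of strings returns
    # one {"label": ..., "score": ...} dict per input, in the same order.
    results = classifier(texts)
    return [(text, res["label"], res["score"]) for text, res in zip(texts, results)]


def main() -> None:
    # Imported here so the helper above can be used without transformers installed.
    from transformers import pipeline

    clf = pipeline("text-classification",
                   model="ukr-detect/ukr-toxicity-classifier")
    for text, label, score in classify_batch(clf, ["Ти неймовірна!"]):
        print(f"{label}\t{score:.3f}\t{text}")
```

Calling `main()` downloads the model from the Hugging Face Hub on first use and prints one tab-separated line per input text.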
## Citation
```
@article{dementieva2024toxicity,
  title={Toxicity Classification in Ukrainian},
  author={Dementieva, Daryna and Khylenko, Valeriia and Babakov, Nikolay and Groh, Georg},
  journal={arXiv preprint arXiv:2404.17841},
  year={2024}
}
```