license: openrail++
datasets:
  - ukr-detect/ukr-toxicity-dataset-seminatural
language:
  - uk
widget:
  - text: Ти неймовірна!

Binary toxicity classifier for Ukrainian

This is an "xlm-roberta-base" instance fine-tuned on the semi-automatically collected Ukrainian toxicity classification dataset.

The evaluation metrics for binary toxicity classification on a test set are:

| Metric    | Value |
|-----------|-------|
| F1-score  | 0.99  |
| Precision | 0.99  |
| Recall    | 0.99  |
| Accuracy  | 0.99  |

How to use:

from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification",
                      model="ukr-detect/ukr-toxicity-classifier")
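
Once the pipeline is loaded, it can be called on one string or a list of strings. The sketch below is one possible way to run a batch and collect the results. The helper names `load_classifier` and `classify_texts` are hypothetical and not part of this repository. The label strings come from the model's own configuration and are not documented here, so the code passes them through unchanged rather than assuming their names.

```python
from typing import Callable, List, Sequence, Tuple


def load_classifier():
    """Build the text-classification pipeline for this model (hypothetical helper)."""
    # Imported here so that classify_texts can be used and tested
    # without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification",
                    model="ukr-detect/ukr-toxicity-classifier")


def classify_texts(classifier: Callable, texts: Sequence[str]) -> List[Tuple[str, str, float]]:
    """Pair each input text with the top label and score returned for it.

    Assumes `classifier` behaves like a text-classification pipeline:
    given a list of strings, it returns one {"label", "score"} dict per string.
    """
    results = classifier(list(texts))
    return [(text, res["label"], float(res["score"]))
            for text, res in zip(texts, results)]
```

For example, `classify_texts(load_classifier(), ["Ти неймовірна!"])` should return a single `(text, label, score)` tuple for the widget sentence above. The first call downloads the model weights from the Hub.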

Citation

@article{dementieva2024toxicity,
  title={Toxicity Classification in Ukrainian},
  author={Dementieva, Daryna and Khylenko, Valeriia and Babakov, Nikolay and Groh, Georg},
  journal={arXiv preprint arXiv:2404.17841},
  year={2024}
}