license: apache-2.0
language:
- en
library_name: transformers
This model has been trained and released as part of the MilaNLP solution to the EDOS Shared Task.
For further details, please see the paper "MilaNLP at SemEval-2023 Task 10: Ensembling Domain-Adapted and Regularized Pretrained Language Models for Robust Sexism Detection".
Adaptation Details
We domain-adapted a pretrained DeBERTa model using the standard masked language modeling (MLM) objective on two unlabeled corpora: the Reddit corpus (1M posts) provided by the task organizers (Kirk et al., 2023) and the Gab Hate Corpus (87K posts) (Kennedy et al., 2022). After concatenating and shuffling the two datasets, we held out 5% as validation data, stratifying on the data source. The final training dataset contained around 20M words.
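The hold-out split described above can be illustrated with a minimal sketch. This is not the authors' code: the function name `stratified_holdout`, the `(text, source)` record layout, the seed, and the toy data are all assumptions made for illustration. It only shows one way to hold out 5% of a combined corpus while preserving the proportion of each data source.

```python
import random
from collections import defaultdict


def stratified_holdout(records, holdout_fraction=0.05, seed=0):
    """Split (text, source) records into train/validation sets.

    Hypothetical helper: each source contributes the same fraction of
    its records to the validation set (stratification on the source).
    """
    rng = random.Random(seed)

    # Group records by their data source.
    by_source = defaultdict(list)
    for text, source in records:
        by_source[source].append((text, source))

    train, validation = [], []
    for source in sorted(by_source):
        group = by_source[source]
        rng.shuffle(group)
        n_val = round(len(group) * holdout_fraction)
        validation.extend(group[:n_val])
        train.extend(group[n_val:])

    # Shuffle again so the sources are interleaved in each split.
    rng.shuffle(train)
    rng.shuffle(validation)
    return train, validation


# Toy stand-ins for the two corpora (not the real data).
reddit = [(f"reddit post {i}", "reddit") for i in range(1000)]
gab = [(f"gab post {i}", "gab") for i in range(100)]

train, validation = stratified_holdout(reddit + gab)
print(len(train), len(validation))
```

With the toy data, each source gives up 5% of its posts to the validation set, so the ratio between the two sources is the same in both splits.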
Please refer to the paper for full details.
Reference
If you use the model, please consider citing:
@inproceedings{cercas-curry-etal-2023-milanlp,
title = "{M}ila{NLP} at {S}em{E}val-2023 Task 10: Ensembling Domain-Adapted and Regularized Pretrained Language Models for Robust Sexism Detection",
author = "Cercas Curry, Amanda and
Attanasio, Giuseppe and
Nozza, Debora and
Hovy, Dirk",
booktitle = "Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.semeval-1.285",
doi = "10.18653/v1/2023.semeval-1.285",
pages = "2067--2074",
abstract = "We present the system proposed by the MilaNLP team for the Explainable Detection of Online Sexism (EDOS) shared task. We propose an ensemble modeling approach to combine different classifiers trained with domain adaptation objectives and standard fine-tuning. Our results show that the ensemble is more robust than individual models and that regularized models generate more {``}conservative{''} predictions, mitigating the effects of lexical overfitting. However, our error analysis also finds that many of the misclassified instances are debatable, raising questions about the objective annotatability of hate speech data.",
}