How can we know which label represents what?
Hello,
Sorry for the general untidiness in this project; I uploaded it mainly for internal use.
This BERT model was trained on the UCC dataset: https://github.com/conversationai/unhealthy-conversations
The labels are the following:
{"sarcastic" : 0, "antagonize" : 1, "condescending" : 2, "dismissive" : 3, "generalisation" : 4, "healthy" : 5, "hostile" : 6}
For example, if you were to do:
import torch

outputs = self.model(**tokens)
logits = outputs.logits
# multi-label head: apply an independent sigmoid per label (not a softmax)
probabilities = torch.sigmoid(logits)[0]
healthy = probabilities[5].item()  # "healthy" is index 5 in the label map above
You would get the probability of an input being healthy.
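If it helps, here is a minimal self-contained sketch of the same thing; the model id is a placeholder (substitute this repo's checkpoint), and I'm assuming it loads as a standard transformers sequence-classification model:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "path/to/this-checkpoint"  # placeholder: use this repo's model id
LABELS = ["sarcastic", "antagonize", "condescending", "dismissive",
          "generalisation", "healthy", "hostile"]

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)

tokens = tokenizer("Oh sure, that will definitely work.", return_tensors="pt")
with torch.no_grad():
    logits = model(**tokens).logits
probabilities = torch.sigmoid(logits)[0]  # independent probability per label
for label, p in zip(LABELS, probabilities.tolist()):
    print(f"{label}: {p:.3f}")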
Hi again! Thank you so much for your answer! Do you know of any good model that detects racism, xenophobia, or homophobia? I'm writing my thesis on hate speech in football fans' content on social media, and such a model would be really helpful, but I haven't found the right one yet. I found unitary/toxic-bert, but it doesn't feel like a perfect fit.
Hello,
It seems like you are doing some interesting work! There are people in my lab doing polarization work on football. Are you doing a PhD or a bachelor's? Maybe you could get in touch with them.
That being said, have you checked out this work:
https://github.com/microsoft/TOXIGEN
TOXIGEN is a machine-generated dataset that can be used to train hate-speech classifiers for specific target groups (homophobia and xenophobia are among the categories it covers). You would have to train the group-specific model yourself, though; the paper only ships a binary classifier for toxic messages in general. A rough sketch of what that training could look like is below.
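As an illustration only: the field names and labels here are assumptions (check the actual TOXIGEN schema on the repo), and the base model and hyperparameters are placeholders:

import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Suppose you have already extracted the TOXIGEN rows for one target group
# (e.g. statements about LGBTQ+ people, for a homophobia classifier) into
# parallel lists of strings and 0/1 labels -- the extraction is up to you.
texts = ["example benign statement", "example hateful statement"]
labels = [0, 1]

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # placeholder base model

class ToxDataset(Dataset):
    def __init__(self, texts, labels):
        # pad to a uniform length so the default collator can stack tensors
        self.enc = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
args = TrainingArguments(output_dir="toxigen-group-classifier",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=ToxDataset(texts, labels)).train()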
Good luck!