eevvgg/StanceBERTa
This model is a fine-tuned version of the distilroberta-base model that predicts three categories of stance (negative, positive, neutral) towards an entity mentioned in the text. It was fine-tuned on a larger and more balanced data sample than the previous version, eevvgg/Stance-Tw.
- Developed by: Ewelina Gajewska
- Model type: RoBERTa for stance classification
- Language(s) (NLP): English social media data from Twitter and Reddit
- Finetuned from model: distilroberta-base
Uses
from transformers import pipeline

model_path = "eevvgg/StanceBERTa"
cls_task = pipeline(task="text-classification", model=model_path, tokenizer=model_path)  # add device=0 to run on a GPU

sequence = ["user The fact is that she still doesn’t change her ways and still stays non environmental friendly",
            "user The criteria for these awards dont seem to be very high."]

result = cls_task(sequence)
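The pipeline returns one prediction per input text as a dictionary with a "label" field (the predicted stance class) and a "score" field (its probability).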
The model is suited for stance classification in short texts. It was fine-tuned on a balanced corpus of 5.6k examples, part of which was semi-annotated. It is also suitable for further fine-tuning on hate/offensive language detection.
Model Sources
- Repository: training procedure available in Colab notebook
- Paper: tba
Training Details
Preprocessing
Normalization of user mentions and hyperlinks to "@user" and "http" tokens, respectively.
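The exact preprocessing code is not included in the card; a minimal sketch of this normalization step, assuming simple regular expressions (the patterns below are an assumption, not the authors' exact rules), could look like this:

import re

def normalize(text):
    # Replace @-mentions with the "@user" token
    text = re.sub(r"@\w+", "@user", text)
    # Replace hyperlinks with the "http" token
    text = re.sub(r"https?://\S+", "http", text)
    return text

normalize("@alice check https://example.com")  # -> "@user check http"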
Training Hyperparameters
- trained for 3 epochs with a mini-batch size of 8 (see the configuration sketch below)
- loss: 0.509
- learning_rate: 5e-5; weight_decay: 1e-2
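The full training procedure is available in the Colab notebook linked above; as a rough, hypothetical sketch of how these hyperparameters map onto the Hugging Face Trainer API (dataset preparation omitted), it might look like:

from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModelForSequenceClassification.from_pretrained("distilroberta-base", num_labels=3)

training_args = TrainingArguments(
    output_dir="stanceberta",          # hypothetical output directory
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    weight_decay=1e-2,
)

# A Trainer would then be built with the tokenized stance corpus (not shown here):
# Trainer(model=model, args=training_args, train_dataset=..., eval_dataset=...).train()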
Evaluation
Results
Evaluation on a held-out 15% of the data.
accuracy: 0.785
macro avg:
- f1: 0.778
- precision: 0.779
- recall: 0.778
weighted avg:
- f1: 0.786
- precision: 0.786
- recall: 0.785
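These aggregate figures are the kind of summary produced by sklearn's classification_report; a minimal sketch of how such an evaluation could be reproduced on a held-out split (y_true and y_pred below are hypothetical placeholders for the gold and predicted stance labels):

from sklearn.metrics import classification_report

# y_true / y_pred: gold and predicted stance labels for the held-out split (placeholder examples)
y_true = ["negative", "neutral", "positive"]
y_pred = ["negative", "neutral", "negative"]

print(classification_report(y_true, y_pred, digits=3))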
Citation
BibTeX: tba