DistilBERT Base Model for Lithuanian Reviews Sentiment Analysis

Overview

This repository contains a version of the distilbert/distilbert-base-multilingual-cased model fine-tuned for sentiment classification. It was trained on Lithuanian online reviews from various domains as part of a master's degree research project titled "Sentiment Analysis of Lithuanian Online Reviews Using Deep Language Models".

DistilBERT is a smaller, faster version of BERT that retains 97% of BERT’s language understanding while being 60% faster and 40% smaller. The base multilingual DistilBERT model was pre-trained on Wikipedia text in 104 languages, including Lithuanian. Because the model is case-sensitive, it treats 'labai nepatiko' and 'LABAI nepatiko' as different inputs. For more architectural details, refer to the distilbert/distilbert-base-multilingual-cased model description.

Model Details

Model Description

Developed by: Brigita Vileikytė
Model type: Transformer-based language model
Language(s) (NLP): fine-tuned for Lithuanian, pre-trained on 104 languages
License: Apache 2.0
Finetuned from model: distilbert/distilbert-base-multilingual-cased

Bias, Risks, and Limitations

While the fine-tuned DistilBERT model shows promising results in classifying sentiments from Lithuanian reviews, it is important to be aware of potential biases and limitations:

Dataset Bias

Imbalance in Sentiment Distribution: The dataset contains more positive reviews than negative or neutral ones. This imbalance can lead the model to perform better on positive sentiments and less accurately on neutral or negative ones.
Source Bias: Reviews were collected from specific sources (Pigu.lt, Atsiliepimai.lt, Google Maps). These sources may not represent the full spectrum of sentiments expressed across all Lithuanian internet domains.

Practical Considerations

Interpretation of Sentiments: Sentiments are subjective, and the model's classification might not always align with human judgment. Users should consider the model's predictions as one of several tools for sentiment analysis.
Updates and Maintenance: The model's performance may degrade as language usage evolves. Regular updates and retraining with new data can help maintain accuracy.

Training Details

Training Data

The dataset for fine-tuning the model was collected from three sources:

Pigu.lt - 5993 reviews
Atsiliepimai.lt - 3212 reviews
Google Maps - 122795 reviews

The reviews were classified into five categories based on a 5-star rating system:

5 stars: Emotionally positive sentiment (Category 4)
4 stars: Rationally positive sentiment (Category 3)
3 stars: Neutral sentiment (Category 2)
2 stars: Rationally negative sentiment (Category 1)
1 star: Emotionally negative sentiment (Category 0)
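
The mapping above amounts to subtracting one from the star rating. A minimal sketch, assuming ratings arrive as integers from 1 to 5; the names `CATEGORY_NAMES` and `stars_to_category` are illustrative and not part of the released code:

```python
# Category names indexed by category id (0 to 4), as listed above.
CATEGORY_NAMES = [
    "Emotionally negative",  # Category 0, 1 star
    "Rationally negative",   # Category 1, 2 stars
    "Neutral",               # Category 2, 3 stars
    "Rationally positive",   # Category 3, 4 stars
    "Emotionally positive",  # Category 4, 5 stars
]


def stars_to_category(stars: int) -> int:
    """Map a 1-5 star rating to a category id in the range 0-4."""
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating must be between 1 and 5, got {stars}")
    return stars - 1


for stars in range(1, 6):
    category = stars_to_category(stars)
    print(stars, category, CATEGORY_NAMES[category])
```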

Evaluation

Performance Metrics

Model	Accuracy	Overall F1 Score	F1 Score by Category (0 to 4)
DistilBERT	0.6845	0.6751	0.7601, 0.3556, 0.4938, 0.4513, 0.8354

Results

The model's performance was evaluated using a confusion matrix and various metrics. The confusion matrix below presents the results for all five sentiment categories; each row is a true category and each column a predicted category:

True Category	Emotionally Negative	Rationally Negative	Neutral	Rationally Positive	Emotionally Positive
Emotionally Negative	2135 (80.74%)	248 (9.38%)	197 (7.45%)	82 (3.10%)	83 (3.14%)
Rationally Negative	362 (26.32%)	402 (29.20%)	232 (16.85%)	71 (5.15%)	40 (2.91%)
Neutral	237 (12.76%)	217 (11.69%)	984 (53.00%)	396 (21.31%)	280 (15.08%)
Rationally Positive	48 (2.63%)	32 (1.75%)	299 (16.41%)	1030 (56.51%)	978 (53.60%)
Emotionally Positive	71 (1.14%)	25 (0.40%)	149 (2.37%)	590 (9.39%)	5645 (89.61%)
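
As an illustration of how accuracy and per-category F1 relate to a confusion matrix, the sketch below recomputes them from the raw counts in the table above. It assumes rows are true categories and columns are predicted categories, and it assumes a support-weighted average for the overall F1; neither the function name `metrics_from_confusion` nor the averaging choice comes from the original project, and values computed this way need not coincide exactly with the reported figures.

```python
# Raw counts copied from the five-category table (rows: true, columns: predicted).
CONFUSION = [
    [2135, 248, 197, 82, 83],    # Emotionally negative
    [362, 402, 232, 71, 40],     # Rationally negative
    [237, 217, 984, 396, 280],   # Neutral
    [48, 32, 299, 1030, 978],    # Rationally positive
    [71, 25, 149, 590, 5645],    # Emotionally positive
]


def metrics_from_confusion(matrix):
    """Return (accuracy, per-class F1 list, support-weighted F1)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(n)) / total

    f1_scores, supports = [], []
    for i in range(n):
        true_positive = matrix[i][i]
        support = sum(matrix[i])                          # examples whose true class is i
        predicted = sum(matrix[r][i] for r in range(n))   # examples predicted as class i
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / support if support else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        supports.append(support)

    # Assumption: the overall F1 is averaged with class supports as weights.
    weighted_f1 = sum(f * s for f, s in zip(f1_scores, supports)) / total
    return accuracy, f1_scores, weighted_f1


accuracy, f1_scores, weighted_f1 = metrics_from_confusion(CONFUSION)
print(f"accuracy={accuracy:.4f} weighted_f1={weighted_f1:.4f}")
print([round(f, 4) for f in f1_scores])
```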

The table below presents the results for three sentiment categories, with the two negative categories and the two positive categories each merged into one:

True Category	Negative	Neutral	Positive
Negative	3147 (75.79%)	429 (10.34%)	276 (6.65%)
Neutral	454 (14.90%)	984 (32.18%)	676 (22.09%)
Positive	217 (2.98%)	445 (6.11%)	8243 (91.01%)
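
One way to derive a three-category matrix from a five-category one is to sum the cells belonging to each merged group. The sketch below assumes categories 0 and 1 form "negative", category 2 is "neutral", and categories 3 and 4 form "positive"; this grouping is inferred from the category names, and `GROUPS` and `collapse_matrix` are hypothetical names rather than the author's code.

```python
# Assumed grouping of five category ids into negative, neutral, positive.
GROUPS = [[0, 1], [2], [3, 4]]


def collapse_matrix(matrix, groups=GROUPS):
    """Sum the cells of a confusion matrix over groups of rows and columns."""
    return [
        [sum(matrix[r][c] for r in rows for c in cols) for cols in groups]
        for rows in groups
    ]


# Toy 5x5 matrix with one example in every cell, to show the shape of the result.
toy = [[1] * 5 for _ in range(5)]
print(collapse_matrix(toy))
```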

Getting Started

Model Usage

To use the fine-tuned model for sentiment analysis, load the model and tokenizer, then wrap them in a text-classification pipeline:

from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned model and tokenizer
model_output_dir = "brivil1/lithuanian-sentiment-analysis-ByT5"
trained_model = AutoModelForSequenceClassification.from_pretrained(model_output_dir)
trained_tokenizer = AutoTokenizer.from_pretrained(model_output_dir)

# Create a sentiment analysis pipeline
sentiment_pipeline = pipeline("text-classification", model=trained_model, tokenizer=trained_tokenizer)

Example

print(sentiment_pipeline("Blogai. ziauru ir nepatiko"))
print(sentiment_pipeline("Labai puiku"))
print(sentiment_pipeline("Nežinau, visai nepatinka"))

Results:

[{'label': 'negative', 'score': 0.9424479007720947}]
[{'label': 'positive', 'score': 0.8821539282798767}]
[{'label': 'neutral', 'score': 0.9761189222335815}]
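
Because each prediction carries a confidence score, low-confidence results can be set aside for manual review rather than taken at face value. The sketch below post-processes dictionaries in the format shown above; the helper `flag_uncertain` and the 0.9 threshold are illustrative assumptions, not part of the model or the transformers library.

```python
def flag_uncertain(predictions, threshold=0.9):
    """Relabel predictions whose score is below the threshold as 'uncertain'."""
    flagged = []
    for pred in predictions:
        label = pred["label"] if pred["score"] >= threshold else "uncertain"
        flagged.append({"label": label, "score": pred["score"]})
    return flagged


# Outputs copied from the results above, one dictionary per input text.
outputs = [
    {"label": "negative", "score": 0.9424479007720947},
    {"label": "positive", "score": 0.8821539282798767},
    {"label": "neutral", "score": 0.9761189222335815},
]

for item in flag_uncertain(outputs):
    print(item["label"], round(item["score"], 4))
```

With the assumed threshold of 0.9, the second prediction (score 0.88) would be flagged while the other two keep their labels; a suitable threshold depends on the application.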