Model Card: Dreamuno/distilbert-base-uncased-finetuned-imdb-accelerate
Model Details
Model Name: distilbert-base-uncased-finetuned-imdb-accelerate
Model Type: DistilBERT
Model Version: 1.0
Model URL: Dreamuno/distilbert-base-uncased-finetuned-imdb-accelerate
License: Apache 2.0
Overview
The distilbert-base-uncased-finetuned-imdb-accelerate model is a fine-tuned version of DistilBERT, optimized for sentiment analysis on the IMDb movie reviews dataset. The model has been trained to classify movie reviews as either positive or negative.
Model Architecture
Base Model: distilbert-base-uncased
Fine-tuning Dataset: IMDb movie reviews dataset
Number of Labels: 2 (positive, negative)
Intended Use
Primary Use Case
The primary use case for this model is sentiment analysis of movie reviews. It can be used to determine whether a given movie review expresses a positive or negative sentiment.
Applications
- Analyzing customer feedback on movie streaming platforms
- Sentiment analysis of movie reviews in social media posts
- Automated moderation of user-generated content related to movie reviews
Limitations
- The model was trained only on the IMDb dataset, so it may not generalize well to other types of text or to domains outside of movie reviews.
- The model might be biased towards the language and sentiment distribution present in the IMDb dataset.
Training Details
Training Data
Dataset: IMDb movie reviews
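Size: 50,000 labeled reviews in total (25,000 positive, 25,000 negative), which the standard release divides into a 25,000-review training split and a 25,000-review test split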
Training Procedure
The model was fine-tuned using the Hugging Face transformers library with the accelerate framework for efficient distributed training. The training involved the following steps:
- Tokenization: Text data was tokenized using the DistilBERT tokenizer with padding and truncation to a maximum length of 512 tokens.
- Training Configuration:
  - Optimizer: AdamW
  - Learning Rate: 2e-5
  - Batch Size: 16
  - Number of Epochs: 3
  - Evaluation Strategy: Epoch
- Hardware: Training was conducted using multiple GPUs for acceleration.
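The configuration above can be turned into a rough estimate of how many optimizer steps the run takes. The sketch below is illustrative only: `FineTuneConfig` and `total_optimizer_steps` are hypothetical names, the batch size of 16 is assumed to be per process, no gradient accumulation is assumed, and 25,000 is the size of the standard IMDb training split rather than a figure confirmed for this run.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters copied from the training configuration listed above."""
    learning_rate: float = 2e-5
    batch_size: int = 16   # assumption: this is the per-process batch size
    num_epochs: int = 3
    max_length: int = 512  # tokenizer truncation length


def total_optimizer_steps(num_examples: int, config: FineTuneConfig,
                          num_processes: int = 1) -> int:
    """Estimate optimizer steps for the whole run (no gradient accumulation)."""
    # In data-parallel training each process handles its own batch per step,
    # so the effective batch grows with the number of processes.
    effective_batch = config.batch_size * num_processes
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * config.num_epochs


config = FineTuneConfig()
# 25,000 examples: the standard IMDb training split (an assumption here).
single_gpu_steps = total_optimizer_steps(25_000, config)
two_gpu_steps = total_optimizer_steps(25_000, config, num_processes=2)
print(single_gpu_steps, two_gpu_steps)
```

Under these assumptions, doubling the number of processes roughly halves the number of steps per epoch, which matters when sizing a learning-rate schedule.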
Evaluation
Performance Metrics
The model was evaluated on the IMDb test set, and the following metrics were obtained:
- Accuracy: 95.0%
- Precision: 94.8%
- Recall: 95.2%
- F1 Score: 95.0%
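As a quick consistency check, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two figures above. The helper name `f1_score` below is illustrative and not part of the original evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in this model card.
recomputed_f1 = f1_score(precision=0.948, recall=0.952)
print(f"{recomputed_f1:.1%}")  # prints 95.0%
```

The recomputed value agrees with the reported F1 score to one decimal place.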
Evaluation Dataset
Dataset: IMDb movie reviews (test split)
Size: 25,000 reviews (12,500 positive, 12,500 negative)
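The reported percentages can be related to approximate prediction counts on this test split. The sketch below assumes that "positive" is the class for which precision and recall were reported, and the rounded counts are an illustration derived from the published figures, not values taken from the actual evaluation run. `implied_confusion_counts` is a hypothetical helper.

```python
def implied_confusion_counts(n_pos: int, n_neg: int,
                             precision: float, recall: float):
    """Back out approximate (tp, fp, fn, tn) from precision and recall."""
    tp = round(recall * n_pos)       # recall = tp / (tp + fn)
    fn = n_pos - tp
    fp = round(tp / precision) - tp  # precision = tp / (tp + fp)
    tn = n_neg - fp
    return tp, fp, fn, tn


# Test split sizes and metrics as stated in this model card.
tp, fp, fn, tn = implied_confusion_counts(12_500, 12_500,
                                          precision=0.948, recall=0.952)
implied_accuracy = (tp + tn) / (tp + fp + fn + tn)
print(tp, fp, fn, tn, f"{implied_accuracy:.1%}")
```

Under that assumption, the implied accuracy rounds to the reported 95.0%, so the four published metrics are mutually consistent.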
How to Use
Inference
To use the model for inference, you can use the Hugging Face transformers library as shown below:
from transformers import pipeline
# Load the fine-tuned model
sentiment_analyzer = pipeline("sentiment-analysis", model="Dreamuno/distilbert-base-uncased-finetuned-imdb-accelerate")
# Analyze sentiment of a movie review
review = "This movie was fantastic! I really enjoyed it."
result = sentiment_analyzer(review)
print(result)
Example Output
[
{
"label": "POSITIVE",
"score": 0.98
}
]
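Because the output is a list of dictionaries with a label and a score, low-confidence predictions can be filtered before they are acted on. The following is a minimal sketch, assuming output in the shape shown above. The helper `flag_uncertain`, the 0.9 threshold, and the sample predictions are all hypothetical and are not part of the transformers library or of this model's results.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def flag_uncertain(predictions: List[Prediction],
                   threshold: float = 0.9) -> List[str]:
    """Keep the predicted label only when its score reaches the threshold."""
    labels = []
    for prediction in predictions:
        if float(prediction["score"]) >= threshold:
            labels.append(str(prediction["label"]))
        else:
            labels.append("UNCERTAIN")
    return labels


# Hand-written sample in the same shape as the pipeline output above.
sample_output = [
    {"label": "POSITIVE", "score": 0.98},
    {"label": "NEGATIVE", "score": 0.61},
]
flagged = flag_uncertain(sample_output)
print(flagged)  # prints ['POSITIVE', 'UNCERTAIN']
```

A suitable threshold depends on the application and is best chosen on held-out data rather than fixed in advance.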
Ethical Considerations
- Bias: The model may exhibit bias based on the data it was trained on. Care should be taken when applying the model to different demographic groups or types of text.
- Misuse: The model is intended for sentiment analysis of movie reviews. Using it for other purposes may produce inaccurate or harmful predictions and should be avoided.
Contact
For further information, please contact the model creator or visit the model page on Hugging Face.
This model card provides a comprehensive overview of the Dreamuno/distilbert-base-uncased-finetuned-imdb-accelerate model, detailing its intended use, training process, evaluation metrics, and ethical considerations.