Dataset

The following Turkish dataset was used for fine-tuning: https://huggingface.co/datasets/maydogan/TRSAv1

The TRSAv1 (Turkish Sentiment Analysis Version 1) dataset was produced to contribute to Turkish NLP studies. It consists of 150,000 samples in total: 50,000 negative, 50,000 positive, and 50,000 neutral. It can be used in text classification and sentiment analysis studies, provided the related study is cited.
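The class balance described above can be checked after loading the dataset from the Hub. The following is a minimal sketch, not the authors' code: `class_balance` and `load_label_balance` are hypothetical helper names, and the name of the label column is not stated in this card, so it must be supplied by the caller after inspecting the dataset.

```python
from collections import Counter

DATASET_ID = "maydogan/TRSAv1"


def class_balance(labels):
    """Return {label: share of samples} for any iterable of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def load_label_balance(label_column, dataset_id=DATASET_ID):
    """Load every available split and report its class balance.

    `label_column` is an assumption supplied by the caller; this card
    does not document the dataset's column names.
    """
    # Imported lazily so class_balance works without the datasets package.
    from datasets import load_dataset

    splits = load_dataset(dataset_id)  # a DatasetDict keyed by split name
    return {
        name: class_balance(split[label_column])
        for name, split in splits.items()
    }
```

If the dataset matches the description, each of the three classes should account for roughly one third of the samples.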

Load model directly

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Marzu39/bert-turkish-text-classification")
model = AutoModelForSequenceClassification.from_pretrained("Marzu39/bert-turkish-text-classification")
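Once the tokenizer and model are loaded, a prediction can be obtained by taking the highest-scoring class from the model's logits. The sketch below is illustrative only: `softmax`, `top_label`, and `classify` are hypothetical helper names, and the label names are read from the checkpoint's own `config.id2label` rather than assumed.

```python
import math

MODEL_ID = "Marzu39/bert-turkish-text-classification"


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text, model_id=MODEL_ID):
    """Download the model and classify a single sentence."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # Label names come from the checkpoint config; none are assumed here.
    return top_label(logits, model.config.id2label)
```

For example, `classify("Bu ürün harika!")` returns a `(label, probability)` pair; which label string comes back depends on how the checkpoint's config names its classes.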

Training hyperparameters

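from transformers import TrainingArguments

training_args = TrainingArguments(
    do_train=True,
    do_eval=True,
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    warmup_steps=100,
    weight_decay=0.01,
    logging_strategy="steps",
    logging_steps=50,
    evaluation_strategy="epoch",  # evaluate once at the end of each epoch
    eval_steps=50,  # only takes effect when evaluation_strategy="steps"
    save_strategy="epoch",  # matches evaluation_strategy, as load_best_model_at_end requires
    fp16=False,
    load_best_model_at_end=True
)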
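The arguments above set `warmup_steps=100` but leave the learning rate and scheduler unspecified, so `Trainer` would fall back to the library defaults: a peak learning rate of 5e-5 and a linear schedule with warmup. The sketch below reproduces that schedule in plain Python to show what the warmup does. It assumes those defaults applied during fine-tuning, which this card does not confirm; `lr_multiplier` and `learning_rate_at` are hypothetical helper names, and the total step count must be supplied by the caller because the training split size is not stated.

```python
DEFAULT_PEAK_LR = 5e-5  # TrainingArguments default, assumed to have been used
WARMUP_STEPS = 100      # from the training arguments above


def lr_multiplier(step, warmup_steps, total_steps):
    """Linear warmup from 0 to 1, then linear decay back to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    remaining = total_steps - step
    return max(0.0, remaining / max(1, total_steps - warmup_steps))


def learning_rate_at(step, total_steps, peak_lr=DEFAULT_PEAK_LR,
                     warmup_steps=WARMUP_STEPS):
    """Learning rate applied at a given optimizer step."""
    return peak_lr * lr_multiplier(step, warmup_steps, total_steps)
```

With a batch size of 8 on a single device, the total step count is the number of training samples divided by 8 (rounded up), multiplied by the 3 epochs; the 100 warmup steps are therefore only a brief ramp at the start of a run over a dataset of this size.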

Citation

Please cite the following paper if you use this model or dataset:

@article{arzu2023turkcce,
  title={T{\"u}rk{\c{c}}e Duygu S{\i}n{\i}fland{\i}rma {\.I}{\c{c}}in Transformers Tabanl{\i} Mimarilerin Kar{\c{s}}{\i}la{\c{s}}t{\i}r{\i}lmal{\i} Analizi},
  author={Arzu, Mehmet and Aydo{\u{g}}an, Murat},
  journal={Computer Science},
  number={IDAP-2023},
  pages={1--6},
  year={2023},
  publisher={Ali KARCI}
}

Summary

Transformer-based sentiment classification has recently been widely studied in natural language processing and machine learning. It has many applications, including the interpretation and classification of emotional expressions in text, social media analysis, market research, and user experience analysis. This study therefore aims to perform sentiment classification using Transformer-based architectures.