KazSAnDRA is a dataset for Kazakh sentiment analysis and the first and largest publicly available resource of its kind. It contains 180,064 reviews collected from a variety of sources, accompanied by numerical ratings from 1 to 5 that quantify customer sentiment. The project also automates Kazakh sentiment classification by developing and evaluating four machine learning models. The models were trained for both polarity classification and score classification, and their performance was assessed under balanced and imbalanced conditions. The most effective model achieved an F1-score of 0.81 for polarity classification and 0.39 for score classification on the test sets.
The dataset and fine-tuned models are open access and available for download under the Creative Commons Attribution 4.0 International License (CC BY 4.0) through our GitHub repository.
DOI: https://doi.org/10.48550/arXiv.2403.19335
Data: https://github.com/IS2AI/KazSAnDRA
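
The two tasks described above can be illustrated with a minimal sketch. Score classification uses the 1 to 5 rating directly as the label, while polarity classification needs a binary label derived from it. The threshold used here (ratings of 4 and 5 count as positive) is an assumption for illustration only and is not taken from this description; the helper names, the sample ratings, and the choice of a positive-class F1 are likewise hypothetical.

```python
from typing import Sequence


def to_polarity(score: int, threshold: int = 4) -> int:
    """Map a 1-5 rating to a binary polarity label (1 = positive, 0 = negative).

    The default threshold is an assumption, not the dataset's documented rule.
    """
    if not 1 <= score <= 5:
        raise ValueError(f"rating must be between 1 and 5, got {score}")
    return 1 if score >= threshold else 0


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Compute the F1-score for one class from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up ratings and model predictions, for illustration only.
ratings = [5, 4, 1, 3, 2, 5]
predicted = [1, 1, 0, 1, 0, 0]

gold = [to_polarity(r) for r in ratings]
f1 = f1_score(gold, predicted)
print(f"polarity F1 on the toy sample: {f1:.2f}")
```

The description does not state how the reported F1-scores are averaged across classes, so the single-class F1 above should be read as one possible variant rather than the evaluation protocol behind the reported numbers.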