# SentiCSE

This is a RoBERTa-base model trained on the MR dataset and fine-tuned for sentiment analysis on sentiment tasks. The model is intended for English text.

+ Reference Paper: SentiCSE (COLING 2024, main conference).
+ Git Repo: https://github.com/nayohan/SentiCSE.

```python
import torch
from scipy.spatial.distance import cosine
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("DILAB-HYU/SentiCSE")
model = AutoModel.from_pretrained("DILAB-HYU/SentiCSE")

# Tokenize input texts
texts = [
    "The food is delicious.",
    "The atmosphere of the restaurant is good.",
    "The food at the restaurant is devoid of flavor.",
    "The restaurant lacks a good ambiance."
]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Get the embeddings
with torch.no_grad():
    embeddings = model(**inputs, output_hidden_states=True, return_dict=True).pooler_output

# Calculate cosine similarities
# Cosine similarities are in [-1, 1]. Higher means more similar
cosine_sim_0_1 = 1 - cosine(embeddings[0], embeddings[1])
cosine_sim_0_2 = 1 - cosine(embeddings[0], embeddings[2])
cosine_sim_0_3 = 1 - cosine(embeddings[0], embeddings[3])

print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[1], cosine_sim_0_1))
print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[2], cosine_sim_0_2))
print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[3], cosine_sim_0_3))
```

Output:

```
Cosine similarity between "The food is delicious." and "The atmosphere of the restaurant is good." is: 0.942
Cosine similarity between "The food is delicious." and "The food at the restaurant is devoid of flavor." is: 0.703
Cosine similarity between "The food is delicious." and "The restaurant lacks a good ambiance." is: 0.656
```

## BibTeX entry and citation info

Please cite the reference paper if you use this model.

```
@article{2024SentiCES,
  title={SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity},
  author={Kim, Jaemin and Na, Yohan and Kim, Kangmin and Lee, Sangrak and Chae, Dong-Kyu},
  journal={Proceedings of the 30th International Conference on Computational Linguistics (COLING)},
  year={2024},
}
```
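
## Pairwise similarity sketch

The usage example earlier in this card compares the first sentence with each of the others, one pair at a time. As a minimal sketch of what that similarity score means, the block below computes cosine similarity directly from its definition and builds the full pairwise matrix. The helper names `cosine_similarity` and `pairwise_cosine` and the toy vectors are illustrative assumptions, not part of the SentiCSE repository; real sentence embeddings have many more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_cosine(vectors):
    """Return the full n x n similarity matrix as a list of lists."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy 3-dimensional vectors standing in for sentence embeddings (hypothetical values).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],   # same direction as the first, different length
    [0.0, 1.0, 0.0],   # orthogonal to the first
    [-1.0, 0.0, 0.0],  # opposite direction to the first
]

sim = pairwise_cosine(toy_embeddings)
for row in sim:
    print(" ".join("%6.3f" % s for s in row))
```

With the model's output, the same helper should accept `embeddings.tolist()`. The score depends only on the direction of the vectors, not their length, which is why the first two toy vectors score 1.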