---
title: Stance Directed Humanizing Ai
emoji: 🐠
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.31.1
app_file: app.py
pinned: false
license: gpl-3.0
---

# Towards a Programmable Humanizing AI through Scalable Stance-Directed Architecture

Welcome to the **Stance-Directed Humanizing AI** Hugging Face Space! This project aims to reduce the generation of toxic narratives in digital communications by using generative artificial intelligence (AI) fine-tuned on positive human values. Our approach uses language to foster social cohesion and understanding, counteracting the spread of harmful content online.

## About the Project

Our study introduces a novel pipeline for training Large Language Models (LLMs) to generate tweets that are relevant to given aspects and entities and that are also aligned with healthier discourse and constructive sentiment. The pipeline uses a toxic content classifier to check that generated tweets are non-toxic, and a stance-aware aspect-based sentiment analysis (ABSA) model to extract stances from those tweets, promoting more civil and humanized interaction on social media platforms.

### Key Components

1. **Tweet Generator**: Given aspects and entities, this model generates tweets intended to reflect humanized and constructive discourse.
2. **Toxic Content Classifier**: This component classifies each generated tweet as toxic or non-toxic, supporting positive engagement.
3. **Stance-Aware ABSA Model**: This model extracts the stance of the generated tweet towards the specified aspects and entities, giving a fine-grained view of the sentiment expressed.

## Datasets

The study incorporates five datasets:

- **TrainTweetsForHumanizedLLM.csv** and **TrainTweetsForUnrestrictedLLM.csv**: Training data for the humanized and unrestricted LLMs, respectively.
- **ToxicClassifierDataset.csv**: Training data for the toxic content classifier.
- **GoldToxicDataset.csv**: Gold-standard labels for evaluating the toxicity classifier's performance, labelled by three annotators with a nominal Krippendorff's alpha of 0.73, indicating a good level of inter-annotator agreement.
- **GeneratedOutputsWithLabels.csv**: Tweets generated by the humanized and unrestricted LLMs, labelled by three annotators and accompanied by the classifier model's predictions. The nominal Krippendorff's alpha is 0.75, again showing a reliable consensus among annotators.

## Using This Space

To try out the pipeline from our study and see the models in action:

1. **Select Ideology, Aspects, and Entities**: Begin by specifying the ideology along with the aspects and entities you are interested in.
2. **Generate a Tweet**: The tweet generator model produces a tweet based on your input.
3. **View ABSA Outputs**: Inspect the fine-grained sentiments and stances extracted from the generated tweet.
4. **Toxic/Non-Toxic Label**: See whether the classifier considers the generated tweet toxic or non-toxic.

### Example Usage

```plaintext
Ideology: Left
Pro Entities: ['migrant worker rights groups']
Anti Entities: ['labor exploitation']
Neutral Entities: ['agricultural sector']
Pro Aspects: ['fair treatment', 'safety standards']
Anti Aspects: []
Neutral Aspects: ['employment laws', 'worker visas']

Generated Tweet: "the agricultural sector is the single biggest recipient of migrants workers rights groups argue . nearly 90 % of those who come to the us are denied employment due to discriminatory employment laws and safety standards ."

ABSA Outputs:
Aspect: migrants, Sentiment: positive
Aspect: rights, Sentiment: positive
Aspect: laws, Sentiment: positive
Aspect: safety, Sentiment: positive

Toxic/Non-Toxic Label: Non-Toxic
```

## Contributions and Feedback

We encourage contributions and feedback to improve this project.
If you have suggestions or want to contribute, please open an issue or pull request on our GitHub repository at tweetpie/stance-directed-humanizing-ai.

## Citation

If you use our work, please cite our paper:

```bibtex
@article{tweetpie2022,
  title={Towards a Programmable Humanizing AI through Scalable Stance-Directed Architecture},
  author={TweetPie},
  journal={arXiv},
  year={2024},
  volume={}
}
```
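
## Pipeline Sketch

The flow described under Key Components and Using This Space (generate a tweet, extract ABSA stances, assign a toxic/non-toxic label) can be illustrated with a minimal Python sketch. This is not the code behind this Space: `StanceInput`, `PipelineResult`, `run_pipeline`, and the three `toy_*` functions are hypothetical names introduced only for illustration, and the toy functions are trivial stand-ins for the fine-tuned models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StanceInput:
    """User selections from step 1 (field names are illustrative)."""
    ideology: str
    pro_entities: List[str] = field(default_factory=list)
    pro_aspects: List[str] = field(default_factory=list)


@dataclass
class PipelineResult:
    tweet: str
    absa: Dict[str, str]  # aspect -> sentiment
    label: str            # "Toxic" or "Non-Toxic"


def run_pipeline(
    inputs: StanceInput,
    generate: Callable[[StanceInput], str],
    extract_stances: Callable[[str], Dict[str, str]],
    is_toxic: Callable[[str], bool],
) -> PipelineResult:
    """Chain the three components in the order the Space presents them."""
    tweet = generate(inputs)
    absa = extract_stances(tweet)
    label = "Toxic" if is_toxic(tweet) else "Non-Toxic"
    return PipelineResult(tweet=tweet, absa=absa, label=label)


# --- Hypothetical stand-ins for the real models (assumptions, not the study's code) ---

def toy_generate(inputs: StanceInput) -> str:
    # A fixed template instead of a fine-tuned LLM.
    return f"{inputs.pro_entities[0]} call for {inputs.pro_aspects[0]} ."


TOY_LEXICON = {"treatment": "positive", "exploitation": "negative"}


def toy_extract_stances(tweet: str) -> Dict[str, str]:
    # A lexicon lookup instead of a stance-aware ABSA model.
    return {w: TOY_LEXICON[w] for w in tweet.split() if w in TOY_LEXICON}


TOY_BLOCKLIST = {"idiot", "scum"}


def toy_is_toxic(tweet: str) -> bool:
    # A keyword check instead of a trained toxicity classifier.
    return any(w in TOY_BLOCKLIST for w in tweet.lower().split())


result = run_pipeline(
    StanceInput("Left", ["migrant worker rights groups"], ["fair treatment"]),
    toy_generate,
    toy_extract_stances,
    toy_is_toxic,
)
print(result.tweet)
print(result.absa)
print(result.label)
```

Passing the three components as callables keeps the orchestration independent of any particular model, so each stand-in could be swapped for a real generator, ABSA model, or classifier without changing `run_pipeline`.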