TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods
Abstract
Authorship obfuscation aims to disguise the identity of an author within a text by altering the writing style, vocabulary, syntax, and other linguistic features associated with the text's author. This alteration must balance privacy and utility. While strong obfuscation techniques can effectively hide the author's identity, they often degrade the quality and usefulness of the text for its intended purpose. Conversely, maintaining high utility tends to provide insufficient privacy, making it easier for an adversary to de-anonymize the author. Achieving an optimal trade-off between these two conflicting objectives is therefore crucial. In this paper, we propose TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization, a new unsupervised authorship obfuscation method that optimizes the privacy-utility trade-off by regenerating the entire text in light of its downstream utility. Our approach leverages policy optimization as a fine-tuning paradigm over small language models to rewrite texts so that author identity is concealed while downstream task utility is preserved. We show that our approach substantially reduces the accuracy of attackers while preserving utility. We make our code and models publicly available.
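The abstract describes a scalar objective that trades privacy (low attacker accuracy) against downstream-task utility. The sketch below illustrates that idea only; the paper does not specify its reward design here, and every function, scorer, and the trade-off weight `lam` are hypothetical stand-ins, not TAROT's actual components.

```python
# Illustrative sketch of a privacy-utility reward, assuming a simple
# scalarization: reward = utility - lam * attacker_confidence.
# All scorers below are toy stubs, NOT the paper's method.

def attacker_confidence(text: str) -> float:
    """Stub authorship attacker: higher means easier de-anonymization.
    Toy proxy based on distinctive punctuation habits."""
    markers = text.count(";") + text.count("--")
    return min(1.0, markers / 3.0)

def task_utility(text: str, keywords: list[str]) -> float:
    """Stub utility scorer: fraction of task-relevant keywords preserved."""
    kept = sum(1 for k in keywords if k.lower() in text.lower())
    return kept / max(1, len(keywords))

def reward(text: str, keywords: list[str], lam: float = 0.5) -> float:
    """Scalar objective a policy-optimization loop could maximize."""
    return task_utility(text, keywords) - lam * attacker_confidence(text)

original = "The product works; however -- as noted -- shipping was slow."
rewrite = "The product works well, but shipping was slow."
keys = ["product", "shipping", "slow"]

# The rewrite keeps the task signal while dropping stylistic markers,
# so it scores a higher reward than the original.
assert reward(rewrite, keys) > reward(original, keys)
```

In an actual fine-tuning setup, a reward of this shape would be fed to a policy-optimization trainer (e.g., PPO) over the rewriting language model; the weight `lam` then controls where the model lands on the privacy-utility curve.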
Community
Text anonymization is crucial for protecting private information, but it often compromises the meaning of the original text, and striking a balance between privacy and meaning preservation is challenging. To address this, we propose TAROT (Task-Oriented Authorship Obfuscation Using Policy Optimization Methods), a new method leveraging advancements in reinforcement learning. TAROT optimizes a language model to rewrite text, focusing on both preserving meaning and enhancing privacy. This approach aims to maintain the core message while safeguarding sensitive information such as text authorship.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human (2024)
- Robust Utility-Preserving Text Anonymization Based on Large Language Models (2024)
- IDT: Dual-Task Adversarial Attacks for Privacy Protection (2024)
- IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization (2024)
- Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment (2024)
Models citing this paper: 2
Datasets citing this paper: 0
Spaces citing this paper: 0