agentlans's picture
Upload folder using huggingface_hub
4bc841c verified
metadata
language:
  - en
tags:
  - llama
  - llama-3
  - lora
  - content-moderation
  - uncensored
  - text-generation
license: mit
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct

Llama 3.1 Censorship LoRAs

This repository contains LoRA adapters for Meta's Llama 3.1 8B Instruct model, designed for censoring and uncensoring text content.

What are these LoRA adapters?

These LoRA adapters serve as fine-tuning tools for the Llama 3.1 model. They capture the differences between the original, more cautious Llama 3.1 and a version that has been adjusted to be less restrictive, agentlans/Llama3.1-vodka. These adapters adjust how the model handles potentially sensitive content.

The Basics

  • Base Model: Llama 3.1 Instruct 8B
  • Comparison Model: agentlans/Llama3.1-vodka
  • Extraction Method: LoRA (Low-Rank Adaptation)

Adapter Options

Different "strengths" of adaptation are available: 2, 4, 8, 16, 32, and 64. These can be thought of as dials for determining the extent of changes to the model's behaviour.

Applications

  • Customizing Llama 3.1 for specific content needs
  • Adjusting the model's behaviour to align more closely with the censored or uncensored variant
  • Experimenting with various settings to identify the most effective configuration

Tips for Use

  • Starting with lower ranks (2, 4, 8) allows for more subtle changes
  • Higher ranks (32, 64) enable larger adjustments but require more computational resources to apply to the model
  • Use the lowest rank that achieves the desired effect
  • For best results, use system prompts in conjunction with the LoRAs
  • Always use these adapters responsibly and ethically

Uses and Limitations

The Censor-LoRA

Designed for:

  • Maintaining family-friendly content
  • Removing explicit language
  • General content moderation

The Uncensor-LoRA

Intended for:

  • Restoring text that may have been excessively censored
  • Creative writing in more mature contexts
  • Generating realistic dialogue for adult-oriented content

Limitations

  • These adapters may occasionally over-censor or under-censor content
  • They should not be the sole method for content moderation; human oversight remains crucial
  • The uncensoring adapter has the potential to generate inappropriate content, necessitating careful use

Ethical Considerations

The use of these adapters raises several ethical concerns:

  • The censoring adapter may inadvertently suppress legitimate speech or artistic expression
  • The uncensoring adapter could be misused to produce harmful or offensive content
  • Both adapters may reflect and potentially amplify societal biases present in the training data

Careful consideration of the implications of deploying these models is necessary, along with the implementation of appropriate safeguards to ensure responsible usage.