metadata

language:
  - en
tags:
  - llama
  - llama-3
  - lora
  - content-moderation
  - uncensored
  - text-generation
license: mit
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct

Llama 3.1 Censorship LoRAs

This repository contains LoRA adapters for Meta's Llama 3.1 8B Instruct model, designed for censoring and uncensoring text content.

What are these LoRA adapters?

These LoRA adapters serve as fine-tuning tools for the Llama 3.1 model. They capture the differences between the original, more cautious Llama 3.1 and a version that has been adjusted to be less restrictive, agentlans/Llama3.1-vodka. These adapters adjust how the model handles potentially sensitive content.

The Basics

Base Model: Llama 3.1 Instruct 8B
Comparison Model: agentlans/Llama3.1-vodka
Extraction Method: LoRA (Low-Rank Adaptation)

Adapter Options

Different "strengths" of adaptation are available: 2, 4, 8, 16, 32, and 64. These can be thought of as dials for determining the extent of changes to the model's behaviour.

Applications

Customizing Llama 3.1 for specific content needs
Adjusting the model's behaviour to align more closely with the censored or uncensored variant
Experimenting with various settings to identify the most effective configuration

Tips for Use

Starting with lower ranks (2, 4, 8) allows for more subtle changes
Higher ranks (32, 64) enable larger adjustments but require more computational resources to apply to the model
Use the lowest rank that achieves the desired effect
For best results, use system prompts in conjunction with the LoRAs
Always use these adapters responsibly and ethically

Uses and Limitations

The Censor-LoRA

Designed for:

Maintaining family-friendly content
Removing explicit language
General content moderation

The Uncensor-LoRA

Intended for:

Restoring text that may have been excessively censored
Creative writing in more mature contexts
Generating realistic dialogue for adult-oriented content

Limitations

These adapters may occasionally over-censor or under-censor content
They should not be the sole method for content moderation; human oversight remains crucial
The uncensoring adapter has the potential to generate inappropriate content, necessitating careful use

Ethical Considerations

The use of these adapters raises several ethical concerns:

The censoring adapter may inadvertently suppress legitimate speech or artistic expression
The uncensoring adapter could be misused to produce harmful or offensive content
Both adapters may reflect and potentially amplify societal biases present in the training data

Careful consideration of the implications of deploying these models is necessary, along with the implementation of appropriate safeguards to ensure responsible usage.