AI Text Steganography

Description

This is the baseline implementation of AI Text Steganography for our final project in the Software Designing and Applied Information Security courses at HCMUS-VNU.
Our project focuses on hiding data inside a text sequence generated by LLMs (e.g. GPT-2).
We took inspiration from Kirchenbauer et al.

git clone https://github.com/trnKhanh/ai-text-steganography.git
cd ai-text-steganography

conda create -n ai-text-steganography python=3.10
conda activate ai-text-steganography

pip install -r requirements.txt

python demo.py

python api.py

python main.py -h

To access the documentation for the REST API, launch the API and go to http://localhost:6969/docs

config.ini is the project's configuration file. It uses a modified form of the syntax read by the configparser package: every key-value pair is written as key = type:value. Currently, type can only be int, float or str.
Details on config:
- server: parameters for the REST API.
- models.names: names of the allowed LLMs, as defined on Hugging Face.
- models.params: parameters used to load LLMs.
- encrypt.default: default parameters for the encryption algorithm.
- decrypt.default: default parameters for the decryption algorithm.
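
As an illustration of the key = type:value syntax, the sketch below reads a small config with the standard configparser package and casts each value according to its type prefix. It is not the project's actual parser: the helper names (parse_typed_value, load_typed_config) and the sample keys (host, gamma) are hypothetical, and only the port number is taken from the documentation URL above.

```python
import configparser

# Casters for the three type tags the README lists.
CASTERS = {"int": int, "float": float, "str": str}


def parse_typed_value(raw: str):
    """Convert a 'type:value' string into a Python value (hypothetical helper)."""
    type_name, sep, value = raw.partition(":")
    type_name = type_name.strip()
    if not sep or type_name not in CASTERS:
        raise ValueError(f"expected 'type:value', got {raw!r}")
    return CASTERS[type_name](value.strip())


def load_typed_config(text: str) -> dict:
    """Parse INI text and cast every value by its type prefix (hypothetical helper)."""
    # Restrict the key/value delimiter to '=' so the ':' stays inside the value.
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.read_string(text)
    return {
        section: {key: parse_typed_value(val) for key, val in parser[section].items()}
        for section in parser.sections()
    }


# Sample content; 'host' and 'gamma' are made-up keys for illustration only.
SAMPLE = """
[server]
port = int:6969
host = str:0.0.0.0

[encrypt.default]
gamma = float:0.5
"""

config = load_typed_config(SAMPLE)
```

Here config["server"]["port"] is an int and config["encrypt.default"]["gamma"] is a float, while a value without a known type prefix raises ValueError.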

Because our resources are limited, we host multiple models on the same machine (the implementation is in model_factory.py):
- Each model is first loaded onto the load_device (e.g. cpu).
- When a request asks for a specific model, that model is moved to the run_device (e.g. gpu) for inference.
Therefore, only one model can run inference at a time. This lets us offer users a choice of LLMs within our limited resources, but it forces the API to be synchronous.
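
The load_device/run_device scheme described above can be sketched roughly as follows. This is a simplified illustration, not the code in model_factory.py: ModelPool and FakeModel are hypothetical names, FakeModel only records which device it is on so the example runs without a GPU, and moving the model back to the load_device after every request is an assumption.

```python
import threading
from contextlib import contextmanager


class FakeModel:
    """Stand-in for a real LLM; it only tracks which device it is on."""

    def __init__(self, name: str, device: str):
        self.name = name
        self.device = device

    def to(self, device: str) -> "FakeModel":
        self.device = device
        return self


class ModelPool:
    """Keeps every model on load_device and lends one at a time to run_device."""

    def __init__(self, names, load_device="cpu", run_device="cuda"):
        self.load_device = load_device
        self.run_device = run_device
        self._models = {name: FakeModel(name, load_device) for name in names}
        # A single lock serializes requests, so only one model occupies run_device.
        self._lock = threading.Lock()

    @contextmanager
    def use(self, name: str):
        with self._lock:
            model = self._models[name].to(self.run_device)
            try:
                yield model
            finally:
                # Assumption: park the model back on load_device after each request.
                model.to(self.load_device)


pool = ModelPool(["gpt2", "gpt2-medium"])
with pool.use("gpt2") as model:
    active_device = model.device  # device while serving the request
parked_device = pool._models["gpt2"].device  # device after the request
```

The lock is what makes request handling effectively synchronous: a second request blocks until the first one has released the run_device.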