Benjamin Bossan
metadata
title: Gistillery
emoji: 🏭
colorFrom: purple
colorTo: gray
sdk: docker
app_port: 7860

Dump your knowledge, let AI refine it

Installation

Create a Python environment with Python 3.10+. Install the requirements and the package:

python -m pip install -r requirements.txt
python -m pip install .

For development, instead do:

python -m pip install -r requirements.txt
python -m pip install -r requirements-dev.txt
python -m pip install -e .

Starting

Preparing the environment

Set an environment variable called "HF_HUB_TOKEN" to your Hugging Face token, or create a .env file that defines that variable.

Running the app

Run the start.sh script. If it is not already executable, run chmod +x start.sh first.

./start.sh

Alternatively, you can run each part individually. In one terminal, start the background worker:

python src/gistillery/worker.py

In another terminal, start the web server:

uvicorn src.gistillery.webservice:app --reload --port 8080

For example requests, check requests.org.

A very simple web interface is available via gradio. To start it, run:

python demo.py

and navigate to the indicated URL (usually http://127.0.0.1:7860).

Docker

To run everything with Docker, first build the image:

docker build -t gistillery:latest .

Next run the container:

docker run -p 7860:7860 -p 8080:8080 -v $HOME/.cache/huggingface/hub:/home/user/.cache/huggingface/hub gistillery:latest

Note that the Hugging Face cache folder is mounted as a Docker volume so that a locally available model cache is used instead of downloading the transformers models each time the container is started. To disable this mount, remove the -v ... parameter. The database used for storing the results is ephemeral and will be deleted when the Docker container is stopped. The backend server is also exposed directly via port 8080 to enable DB backups (see below).

Backup

To download a backup of the backend DB, visit localhost:8080/backup. If you wish to start the app based on a backup, set the DB_FILE_NAME environment variable to the name of the backup.
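Since the database is ephemeral when running in Docker, it can be handy to script the download. The following is a minimal sketch, assuming the backup endpoint answers a plain GET request with the raw database file (which visiting it in a browser suggests). The function name download_backup, the timestamped file name, and the .db extension are illustrative assumptions.

```python
import shutil
import urllib.request
from datetime import datetime
from pathlib import Path


def download_backup(
    url: str = "http://localhost:8080/backup", dest_dir: str = "."
) -> Path:
    """Save the response of the backup endpoint to a timestamped file."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # The file name pattern is an assumption; any name works.
    dest = Path(dest_dir) / f"gistillery-backup-{stamp}.db"
    # Stream the response to disk instead of loading it into memory.
    with urllib.request.urlopen(url, timeout=30) as response:
        with open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    return dest


# Example (requires the backend to be running on port 8080):
#   path = download_backup()
#   print(f"Backup written to {path}")
```

To start the app from such a file, set DB_FILE_NAME to its name as described above. Whether the app expects a bare file name or a full path is not stated here, so check how DB_FILE_NAME is read in the source first.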

Checks

Running tests

python -m pytest tests/

Other

mypy src/
black src/ && black tests/ && black demo.py
ruff src/ && ruff tests/ && ruff demo.py

TODOs

Tools

i. Reading pdf in general
i. Reading arxiv
i. Generating text from youtube videos using whisper

Deployment

i. Make DB location configurable, mountable when running in docker (otherwise, it will be deleted each time the container is stopped).