Benjamin Bossan committed on
Commit
29eecb6
1 Parent(s): 3d6a12e

Add Docker image

Files changed (4)
  1. Dockerfile +19 -0
  2. README.md +37 -1
  3. src/gistillery/worker.py +2 -1
  4. start.sh +21 -0
Dockerfile ADDED
@@ -0,0 +1,19 @@
+ FROM pytorch/pytorch:latest
+
+ RUN apt update && apt install -y && rm -rf /var/lib/apt/lists/*
+
+ COPY requirements.txt .
+ RUN python3 -m pip install -r requirements.txt
+
+ COPY setup.py VERSION .
+ COPY src ./src
+ RUN python3 -m pip install .
+
+ COPY "demo.py" .
+ EXPOSE 7860
+ COPY start.sh .
+ RUN chmod +x start.sh
+
+ RUN mkdir /data
+ ENV TRANSFORMERS_CACHE=/data
+ CMD ["./start.sh"]
README.md CHANGED
@@ -24,7 +24,15 @@ python -m pip install -e .
  Set an environment variable called "HF_HUB_TOKEN" with your Hugging Face token
  or create a `.env` file with that env var.
 
- In one terminal, start the background worker:
+ ### Running the app
+
+ Run the `start.sh` script. This may require a `chmod +x start.sh` if not already executable.
+
+ ```sh
+ ./start.sh
+ ```
+
+ Instead, you can also run each part individually. For this, in one terminal, start the background worker:
 
  ```sh
  python src/gistillery/worker.py
@@ -46,6 +54,22 @@ python demo.py
 
  and navigate to the indicated URL (usually http://127.0.0.1:7860).
 
+ ## Docker
+
+ To run everything with Docker, first build the image:
+
+ ```sh
+ docker build -t gistillery:latest .
+ ```
+
+ Next run the container:
+
+ ```sh
+ docker run -p 7860:7860 -e GRADIO_SERVER_NAME=0.0.0.0 -v $HOME/.cache/huggingface/hub:/data gistillery:latest
+ ```
+
+ Note that the Hugging Face cache folder is mounted as a docker volume to make use of potentially available local model cache instead of downloading the transformers models each time the container is started. To prevent that, remove the `-v ...` parameter. The database used for storing the results is ephemeral and will be deleted when the docker container is stopped.
+
  ## Checks
 
  ### Running tests
@@ -61,3 +85,15 @@ mypy src/
  black src/ && black tests/
  ruff src/ && ruff tests/
  ```
+
+ ## TODOs
+
+ ### Tools
+
+ i. Reading pdf in general
+ i. Reading arxiv
+ i. Generating text from youtube videos using whisper
+
+ ### Deployment
+
+ i. Make DB location configurable, mountable when running in docker (otherwise, it will be deleted each time the container is stopped).
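
Note on the Deployment TODO: one possible way to make the DB location configurable is to read it from an environment variable, so that a Docker volume can be mounted at that path. The sketch below is not part of this commit. It assumes a file-based SQLite database, which the diff does not confirm, and the names `GISTILLERY_DB_PATH`, `DEFAULT_DB_PATH`, `resolve_db_path` and `connect` are hypothetical.

```python
import os
import sqlite3
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical names: neither the env var nor the default file appears in this commit.
DB_ENV_VAR = "GISTILLERY_DB_PATH"
DEFAULT_DB_PATH = "gistillery.db"


def resolve_db_path(environ: Optional[Mapping[str, str]] = None) -> Path:
    """Return the DB path from the environment, falling back to the default."""
    env = os.environ if environ is None else environ
    return Path(env.get(DB_ENV_VAR, DEFAULT_DB_PATH))


def connect(environ: Optional[Mapping[str, str]] = None) -> sqlite3.Connection:
    """Open the SQLite file at the resolved path, creating parent folders first."""
    path = resolve_db_path(environ)
    path.parent.mkdir(parents=True, exist_ok=True)
    return sqlite3.connect(path)
```

With something like this in place, the container could be started with an extra `-e GISTILLERY_DB_PATH=/db/gistillery.db -v <host dir>:/db`, and the results would then survive a container restart.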
src/gistillery/worker.py CHANGED
@@ -1,4 +1,5 @@
  import time
+ import sys
  from dataclasses import dataclass
 
  from gistillery.base import JobInput
@@ -123,4 +124,4 @@ if __name__ == "__main__":
          main()
      except KeyboardInterrupt:
          print("Shutting down...")
-         exit(0)
+         sys.exit()
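
Note on the `exit(0)` to `sys.exit()` change: the builtin `exit` is added by the `site` module for interactive use and can be missing (for example under `python -S`), while `sys.exit()` is always available. Called without an argument, it raises `SystemExit` with code `None`, which the interpreter reports as exit status 0, so the worker's behavior on Ctrl-C stays the same. A small check, with the helper name `exit_code_of_bare_sys_exit` chosen here for illustration:

```python
import subprocess
import sys


def exit_code_of_bare_sys_exit():
    """Call sys.exit() with no argument and return the SystemExit code."""
    try:
        sys.exit()
    except SystemExit as exc:
        return exc.code


# A child interpreter that calls sys.exit() with no argument ends with status 0.
child = subprocess.run([sys.executable, "-c", "import sys; sys.exit()"])
print(exit_code_of_bare_sys_exit(), child.returncode)  # None 0
```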
start.sh ADDED
@@ -0,0 +1,21 @@
+ #!/usr/bin/env bash
+
+ echo "Starting background worker"
+ python3 src/gistillery/worker.py &
+ pid_worker=$!
+
+ echo "Starting web server"
+ uvicorn src.gistillery.webservice:app --port 8080 --host=0.0.0.0 &
+ pid_webserver=$!
+
+ # kill with ctrl-c
+ trap onexit INT
+ function onexit() {
+     echo "Killing background worker"
+     kill $pid_worker
+     echo "Killing web server"
+     kill $pid_webserver
+ }
+
+ echo "Starting gradio app on default port 7860"
+ python3 demo.py
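
Note on `start.sh`: the trap only fires on SIGINT, so the two background processes are cleaned up by the script only when it is stopped with Ctrl-C. If `demo.py` exits on its own, they keep running. The same start-two-then-run-one pattern can be sketched in Python with cleanup in a `finally` block, which runs on any exit path. This is an illustration rather than part of the commit, and `run_with_background`, `BACKGROUND` and `FOREGROUND` are hypothetical names.

```python
import subprocess
import sys
from typing import List, Sequence

# The commands from start.sh, listed here only for reference.
BACKGROUND = [
    [sys.executable, "src/gistillery/worker.py"],
    ["uvicorn", "src.gistillery.webservice:app", "--port", "8080", "--host=0.0.0.0"],
]
FOREGROUND = [sys.executable, "demo.py"]


def run_with_background(
    background: Sequence[Sequence[str]], foreground: Sequence[str]
) -> int:
    """Start the background commands, run the foreground one, then stop the rest."""
    procs: List[subprocess.Popen] = []
    try:
        for cmd in background:
            procs.append(subprocess.Popen(list(cmd)))
        # Blocks until the foreground command exits and returns its exit status.
        return subprocess.call(list(foreground))
    except KeyboardInterrupt:
        return 130  # conventional shell status for termination by Ctrl-C
    finally:
        # Runs whether the foreground command finished, failed, or was interrupted.
        for proc in procs:
            proc.terminate()
        for proc in procs:
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
```

Calling `run_with_background(BACKGROUND, FOREGROUND)` from the repository root would then mirror what the script does, with the difference that the worker and web server are also stopped when the Gradio app exits by itself.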