kbldcpp-hf-noxinc-gemma2bPTBR

Henk717 committed on Nov 6, 2023

Commit a1992ae (1 parent: 0ebbc54)

Better clone-ability

Files changed (2)

Dockerfile +2 -1
README.md +23 -4

Dockerfile

@@ -1,10 +1,11 @@
 FROM nvidia/cuda:11.8.0-devel-ubuntu22.04
 ARG MODEL
+ARG MODEL_NAME
 RUN mkdir /opt/koboldcpp
 RUN apt update && apt install git build-essential libopenblas-dev wget python3-pip -y
 RUN git clone https://github.com/lostruins/koboldcpp /opt/koboldcpp
 WORKDIR /opt/koboldcpp
 RUN make LLAMA_OPENBLAS=1 LLAMA_CUBLAS=1 LLAMA_PORTABLE=1
 RUN wget -O model.ggml $MODEL
-CMD ["/bin/python3", "./koboldcpp.py", "--model", "model.ggml", "--usecublas", "mmq", "--gpulayers", "99", "--multiuser", "--contextsize", "4096", "--port", "7860", "--hordeconfig", "HF_SPACE_Tiefighter", "1", "1"]
+CMD ["/bin/python3", "./koboldcpp.py", "--model", "model.ggml", "--usecublas", "mmq", "--gpulayers", "99", "--multiuser", "--contextsize", "4096", "--port", "7860", "--hordeconfig", "HF_SPACE_$MODEL_NAME", "1", "1"]
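
A note on the new CMD: per Docker's documented behavior, the exec (JSON array) form does not run a shell, so `$MODEL_NAME` is passed to koboldcpp as a literal string, and `ARG` values only exist at build time. The fragment below is a minimal sketch of one possible workaround, not the change made in this commit. It assumes the build argument should be visible in the running container, so it copies it into an `ENV` and launches through `sh -c`; all koboldcpp arguments are taken unchanged from the commit.

```dockerfile
# Sketch only: persist the build-time ARG as an environment variable
# so it is still defined when the container starts.
ARG MODEL_NAME
ENV MODEL_NAME=${MODEL_NAME}
# The command is now a shell, so ${MODEL_NAME} is expanded at container start.
# "exec" replaces the shell with python3 so signals reach koboldcpp directly.
CMD ["/bin/sh", "-c", "exec /bin/python3 ./koboldcpp.py --model model.ggml --usecublas mmq --gpulayers 99 --multiuser --contextsize 4096 --port 7860 --hordeconfig \"HF_SPACE_${MODEL_NAME}\" 1 1"]
```

If the hosting platform already injects `MODEL_NAME` into the runtime environment, the `ENV` line is likely redundant but harmless.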

README.md

@@ -1,11 +1,30 @@
 ---
 title: Koboldcpp Tiefigther
-emoji: 🚀
-colorFrom: indigo
-colorTo: gray
+emoji: 🦎
+colorFrom: yellow
+colorTo: orange
 sdk: docker
 pinned: false
 license: agpl-3.0
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Koboldcpp in a Space!
+Welcome to the Koboldcpp space! Koboldcpp lets you easily build your own demonstration space for a GGUF model.
+### For the users
+In this space:
+- You can use the KoboldAI Lite UI for Instructions, Writing, Chat and Adventure use.
+- You can use the model shown through a KoboldAI-compatible API (use the instance link that it shows + /api) or through an OpenAI-compatible API (use the instance link that it shows, optionally with /v1 if your solution requires this).
+- In the UI all your data is stored locally without a sign-in.
+- View the API documentation by accessing the frame link + /api in your browser (For example https://koboldai-koboldcpp-tiefighter.hf.space/api)
+### For model / space developers
+This space was designed to be easy to clone. First, make sure you convert your model to the GGUF format and quantize it to a size that fits on the GPU you allocated to your space.
+If you have a GPU available for your space, clone this space and point the MODEL variable to your model's download location, then force a rebuild so it uses your own custom model. You can customize the model name that is displayed by setting the MODEL_NAME variable.
+Want to run on the CPU tier? The following arguments enable multiuser GPU usage:
+, "--usecublas", "mmq", "--gpulayers", "99", "--multiuser", "--contextsize", "4096"
+If you remove these arguments from the CMD in the Dockerfile, your instance will be compatible with CPU-only usage.
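
Applied to the CMD added in this commit, the CPU-only removal described in the README might look like the sketch below. It assumes the whole quoted fragment is dropped, including `--multiuser` and `--contextsize`, exactly as the README text suggests; whether to keep those two on a CPU instance is a separate choice. The remaining arguments are copied verbatim from the commit.

```dockerfile
# CPU-only variant (sketch): the commit's CMD with the quoted
# "--usecublas" ... "4096" fragment removed, nothing else changed.
CMD ["/bin/python3", "./koboldcpp.py", "--model", "model.ggml", "--port", "7860", "--hordeconfig", "HF_SPACE_$MODEL_NAME", "1", "1"]
```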