---
title: Koboldcpp Tiefighter
emoji: 🦎
colorFrom: green
colorTo: blue
sdk: docker
pinned: false
license: agpl-3.0
---
# Koboldcpp in a Space!
Welcome to the Koboldcpp space! Koboldcpp lets you easily build your own demonstration space for a GGUF model.
### For the users
In this space:
- You can use the KoboldAI Lite UI for Instructions, Writing, Chat and Adventure use.
- You can use the model shown through a KoboldAI compatible API (use the instance link that it shows + /api) or an OpenAI compatible API (use the instance link that it shows, optionally with /v1 if your solution requires this).
- In the UI all your data is stored locally without a sign-in.
- View the API documentation by accessing the frame link + /api in your browser (For example https://koboldai-koboldcpp-tiefighter.hf.space/api)
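
As a rough illustration of the KoboldAI compatible API mentioned above, here is a minimal Python sketch that sends a prompt to a space and reads back the generated text. It assumes the generate route is `/api/v1/generate` under the instance link and that the response has the shape `{"results": [{"text": ...}]}`; check the API documentation of your own instance to confirm both. The function names and the `max_length` default are illustrative choices, not part of Koboldcpp.

```python
import json
import urllib.request

# Example instance link taken from this README; replace it with your own space's link.
INSTANCE_URL = "https://koboldai-koboldcpp-tiefighter.hf.space"


def build_generate_request(instance_url, prompt, max_length=80):
    """Build a POST request for the KoboldAI-style text generation route."""
    # Assumption: the generate route lives at <instance link>/api/v1/generate.
    url = instance_url.rstrip("/") + "/api/v1/generate"
    payload = {"prompt": prompt, "max_length": max_length}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_body):
    """Read the generated text from a body shaped like {"results": [{"text": ...}]}."""
    return json.loads(response_body)["results"][0]["text"]


def generate(instance_url, prompt, max_length=80, timeout=120):
    """Send the prompt to the space and return the generated continuation."""
    request = build_generate_request(instance_url, prompt, max_length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_text(response.read().decode("utf-8"))
```

With a running space, `generate(INSTANCE_URL, "Once upon a time,")` should return the continuation as a string. The request building and response parsing are kept in separate functions so they can be checked without network access.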
### For model / space developers
This space was designed to be easy to clone. First, make sure you convert your model to the GGUF format and quantize it to a size that fits on the GPU you allocated to your space.
If you have a GPU available for your space, clone this space and point the MODEL variable to your model's download location, then force a rebuild so it uses your own custom model. You can customize the model name that is displayed by setting MODEL_NAME.
Want to run on the CPU tier? The following line enables multiuser GPU usage.
, "--usecublas", "mmq", "--gpulayers", "99", "--multiuser", "--contextsize", "4096"
If you remove this from the CMD in the Dockerfile, your instance will be compatible with CPU-only usage.
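
To make the effect of that removal concrete, the following Python sketch models the CMD as a list of arguments and cuts out the GPU line. The starting command (`python koboldcpp.py --model model.gguf`) is a hypothetical placeholder, since the real CMD in the Dockerfile may look different; only the removed arguments come from this README.

```python
# Arguments from the README line that enables multiuser GPU usage.
GPU_LINE_ARGS = ["--usecublas", "mmq", "--gpulayers", "99",
                 "--multiuser", "--contextsize", "4096"]


def remove_args(cmd, args_to_remove):
    """Return a copy of cmd with the contiguous run args_to_remove cut out."""
    n = len(args_to_remove)
    for i in range(len(cmd) - n + 1):
        if cmd[i:i + n] == args_to_remove:
            return cmd[:i] + cmd[i + n:]
    return list(cmd)  # run not found: leave the command unchanged


# Hypothetical CMD, for illustration only; the real Dockerfile may differ.
gpu_cmd = ["python", "koboldcpp.py", "--model", "model.gguf"] + GPU_LINE_ARGS
cpu_cmd = remove_args(gpu_cmd, GPU_LINE_ARGS)
print(cpu_cmd)
```

Note that removing the whole line also drops `--multiuser` and `--contextsize`, not just the GPU-specific arguments.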