GGUF version would be appreciated
#1 by nonetrix - opened
A GGUF version would be great; GPTQ and AWQ versions as well.
Actually, it's easy to create a GGUF model yourself. This works:
# assuming your models directory is ~/text-generation-webui/models
docker pull ghcr.io/ggerganov/llama.cpp:full
docker run -v ~/text-generation-webui/models:/models ghcr.io/ggerganov/llama.cpp:full --convert /models/nitky_Swallow-70b-NVE-RP
docker run -v ~/text-generation-webui/models:/models ghcr.io/ggerganov/llama.cpp:full --quantize /models/nitky_Swallow-70b-NVE-RP/ggml-model-f16.gguf /models/nitky_Swallow-70b-NVE-RP-Q4_K_M.gguf Q4_K_M
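For reference, the same two steps (f16 conversion, then Q4_K_M quantization) can be run without Docker from a local llama.cpp checkout. This is a sketch based on older llama.cpp releases; the convert.py script and quantize binary have been renamed in newer versions, so check your checkout before running:
# build llama.cpp and install the conversion script's dependencies
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make
pip install -r requirements.txt
# convert the Hugging Face checkpoint to an f16 GGUF (written into the model directory)
python convert.py ~/text-generation-webui/models/nitky_Swallow-70b-NVE-RP --outtype f16
# quantize the f16 GGUF down to Q4_K_M
./quantize ~/text-generation-webui/models/nitky_Swallow-70b-NVE-RP/ggml-model-f16.gguf ~/text-generation-webui/models/nitky_Swallow-70b-NVE-RP-Q4_K_M.gguf Q4_K_M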
I'll upload it if I have time.