When I use this model to embed a PDF file I get an error: ggml_allocr_alloc: not enough space in the buffer (needed 222784000, largest block available 24166400)

#1
by RajeshkumarV - opened

ggml_allocr_alloc: not enough space in the buffer (needed 222784000, largest block available 24166400)
GGML_ASSERT: C:\Users<name>\AppData\Local\Temp\pip-install-0ohg_aj6\llama-cpp-python_29c4846b4af1471bbb28a41659b32aa3\vendor\llama.cpp\ggml-alloc.c:144: !"not enough space in the buffer"

I tried other models as well, same issue:
huginn-13b-v4.5.Q5_K_M.gguf
LLaMA-2-7B-32K-Q3_K_S.gguf

I think it's probably because you are exceeding your system specs

I have a Windows 11 laptop with 32 GB RAM and an i7 processor. It also has a dedicated NVIDIA T1200 GPU.

That is not nearly enough memory to load this model in a 64k context. You will need nearly 9x more memory.
Read this from TheBloke: https://huggingface.co/TheBloke/Yarn-Llama-2-13B-64K-GGUF/discussions/1#64f1b7d10b27861b2cb6e956
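For a sense of scale, here is a rough sketch of why the 64k context blows past 32 GB. It assumes standard Llama-2-13B dimensions (40 layers, 5120 hidden size) and an fp16 KV cache; older llama.cpp builds stored the cache in fp32, which doubles these numbers, and model weights and compute buffers come on top:

```python
# Back-of-envelope estimate of llama.cpp KV-cache memory for a 13B model.
# Assumptions: Llama-2-13B dims (40 layers, n_embd = 5120, no GQA, so the
# KV width equals n_embd) and a 2-byte (fp16) cache element.

def kv_cache_bytes(n_layers: int, n_embd: int, n_ctx: int,
                   bytes_per_elem: int = 2) -> int:
    """Bytes needed to hold the K and V caches for n_ctx tokens."""
    # 2 = one K tensor + one V tensor per layer
    return 2 * n_layers * n_embd * n_ctx * bytes_per_elem

GIB = 1024 ** 3

# Full 64k context: the cache alone is far beyond a 32 GB laptop.
print(f"64k ctx: {kv_cache_bytes(40, 5120, 65536) / GIB:.1f} GiB")  # 50.0 GiB

# The same model at a 4k context fits comfortably alongside the weights.
print(f" 4k ctx: {kv_cache_bytes(40, 5120, 4096) / GIB:.1f} GiB")   # 3.1 GiB
```

This counts only the KV cache; quantized 13B weights add roughly another 9 GB, and llama.cpp's scratch/compute buffers grow with context length as well.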

Is there a 4k-context version of this model in GGUF format, please?
