The PR adding long-context support for this model in llama.cpp ([ggerganov/llama.cpp#9141](https://github.com/ggerganov/llama.cpp/pull/9141)) has been merged, so these quants are no longer needed as long as you are running a recently compiled llama.cpp build. Official long-context quants are available [here](https://huggingface.co/anthracite-org/magnum-v2-4b-gguf).