bullerwins/DeepSeek-Coder-V2-Instruct-GGUF

bullerwins committed on Jun 18

Commit ddb996c • 1 Parent(s): 4556979

Update README.md

Files changed (1)

README.md +2 -0

@@ -7,6 +7,8 @@ license_link: LICENSE
 <!-- markdownlint-disable html -->
 <!-- markdownlint-disable no-duplicate-header -->
+NOTE: You might need to disable FA (Flash Attention) in llama.cpp to work properly.
 GGUF quantize version of [DeepSeek-Coder-V2-Instruct-GGUF](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct)
 Using [llama.cpp c637fcd](https://github.com/ggerganov/llama.cpp/commit/c637fcd34d135a9ff4f97d3a53ad03a910a4a31f)
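
The note about Flash Attention can be acted on by making sure the flash-attention switch is not passed to llama.cpp. The sketch below is a minimal illustration, not part of this commit. It assumes that `-fa` / `--flash-attn` is a plain boolean switch, as it appears to be in llama.cpp builds from around this commit (newer builds may take a value instead). The helper name `without_flash_attn`, the `./llama-cli` binary path, and `model.gguf` are placeholders chosen for the example.

```python
# Hypothetical helper: drop the flash-attention switch from a llama.cpp
# argument list. Assumes -fa / --flash-attn is a boolean switch with no
# value, which may not hold for newer llama.cpp builds.
FLASH_ATTN_FLAGS = {"-fa", "--flash-attn"}


def without_flash_attn(args):
    """Return a copy of args with any flash-attention switch removed."""
    return [a for a in args if a not in FLASH_ATTN_FLAGS]


# Example command line; the binary name and model path are placeholders.
cmd = without_flash_attn(
    ["./llama-cli", "-m", "model.gguf", "-fa", "-p", "Write a quicksort in Python."]
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` once the placeholders point at a real binary and GGUF file.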