More updates. Going to attempt an imatrix quant at Q3, so if that goes well and runs right on the latest master, I will upload it ASAP!
Not uploading Q6. This quant may not work with your current llama.cpp build, as there were breaking changes in the latest master.
Q4_K_M IS UP!
For those of you who do not know how to reassemble the parts: cat Meta-Llama-3.1-405B-Instruct-Q4_K_M.gguf.part_* > Meta-Llama-3.1-405B-Instruct-Q4_K_M.gguf
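
If cat is not available (for example on Windows), something like the following Python sketch should do the same job. It is only a rough equivalent of the cat command above: the function name reassemble is made up for illustration, and it assumes the part suffixes sort correctly in plain alphabetical order, which is the same assumption the shell glob relies on.

```python
import glob
import shutil


def reassemble(prefix: str, output: str) -> list[str]:
    """Concatenate every file whose name starts with `prefix` into `output`.

    Hypothetical helper, not part of llama.cpp. Parts are joined in sorted
    name order, so the suffixes must sort correctly as plain strings.
    """
    # glob.escape guards against special characters in the prefix itself.
    parts = sorted(glob.glob(glob.escape(prefix) + "*"))
    if not parts:
        raise FileNotFoundError(f"no parts found matching {prefix}*")

    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in 16 MiB chunks so a huge part never sits in RAM.
                shutil.copyfileobj(src, out, length=16 * 1024 * 1024)
    return parts
```

Usage would be reassemble("Meta-Llama-3.1-405B-Instruct-Q4_K_M.gguf.part_", "Meta-Llama-3.1-405B-Instruct-Q4_K_M.gguf"), run from the folder holding the parts. Make sure you have enough free disk space for the joined file on top of the parts.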
An amusing generation to kill the time. Is the instruct tuned to think it can ONLY be run on the cloud?
Why download these GGUFs? I made some modifications to the llama.cpp quantization process to make SURE it uses the right tokenizer, unlike the Smaug BPE GGUFs that are out now.