Please add a GGUF version!

by Anderson452 - opened May 22

Discussion

Anderson452

May 22

Please

bapatra

Microsoft org May 22

Work is currently under way on the llama.cpp side to support this (for example, this and this).

For this model specifically, adding LongRoPE support for the 128k context length and the heterogeneous block-sparse attention makes it a bit tricky, but hopefully it will be there soon :)

bapatra changed discussion status to closed May 22
