Quantized version temporarily unavailable
We saw some performance issues with the quantized version and have taken it down temporarily while we investigate.
Any ETA on this? :)
It turned out we needed to submit a PR to llama.cpp to add support for our tokenizer. We submitted it today, so hopefully this can be fixed soon:
https://github.com/ggerganov/llama.cpp/pull/7713/files
Once the PR is merged we should be able to upload a new version.
Sweet, looking forward to that!
ggerganov approved it
Progress on this?
Should be coming back ~today!
We have a little more testing to do, but it looks good for tomorrow.
It's uploaded; please let us know if you run into any trouble. Make sure you're using a current version of llama.cpp, though!