Any plan for 70B?
Hello, do you plan to release 70B?
I think yes, because the model card says it and the 70B folder was renamed to:
Meta-Llama-3-70B-Instruct-GGUF-old
Yup, just having trouble with the server that was running it. I transferred several off, but then it crashed and I need to get it back up.
I think it would make sense to test perplexity of the models beforehand, as there are allegedly issues with imatrix and I-quants.
There are perplexity issues, but absolutely no generation issues.
Even the exl2 gets weirdly high 7+ PPL, but it runs great. I almost feel the instruct tune is SO sensitive to its prompt template that it goes off the rails if it doesn't have it.
I've found in using it that, unlike other models, which only slightly misbehave when they don't have their template, this one will go absolutely nuts and generate infinitely. That's likely not good for perplexity.
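For reference, this is roughly the format the instruct tune expects (I'm going from memory here, so double-check against the official model card before relying on it):

```python
# Rough sketch of the Llama 3 Instruct prompt format, from memory --
# verify the exact special tokens against the official model card.
def format_llama3_prompt(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(format_llama3_prompt("You are a helpful assistant.", "Hello!"))
```

Feed it raw wiki text with none of that scaffolding and it seems far more likely to wander off than other instruct models.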
Either way, PPL on wiki raw is a weak test of a model's performance; use the model if you like it.
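For context, PPL is just the exponentiated average negative log-likelihood over the test tokens, so a handful of badly mispredicted tokens can inflate the number quite a bit even when normal generations look fine. A rough sketch (not llama.cpp's exact implementation):

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp of the average negative log-likelihood per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# A model that assigns each token probability ~0.5 has PPL ~2; a few tokens
# with tiny probabilities drag the average log-likelihood down fast, which
# is how off-template weirdness can show up as 7+ PPL on wiki raw.
print(perplexity([math.log(0.5)] * 100))  # -> 2.0
```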
As for the 70B version, getting close! Internet is being waaay too slow, been going all day long :') Just a one-off though, won't be doing it this way going forward.
Sorry it took so long! It's up now :)
https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF
Thank you for the great work. Q5_K_M is the best I can use with my CPU/RAM; I think imatrix could have benefits there. For smaller models I use Q6 or even Q8.
These are with imatrix btw :)
Yes, that's why I use them. :-)