GGUF Model

#7 by juanjgit - opened

I converted it to GGUF. This is the first time I've done it, so I might have done something wrong... but it is working fine for me on a 6GB Android phone.
https://huggingface.co/juanjgit/orca_mini_3B-GGUF
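
If you want to try it from Python, here is a minimal sketch using the llama-cpp-python bindings. The GGUF filename and the prompt template are assumptions; substitute whatever file you download from the repo:

```python
# Minimal sketch: load the converted GGUF model with llama-cpp-python.
# Install with: pip install llama-cpp-python
from llama_cpp import Llama

# NOTE: model_path is an assumption; use the actual file downloaded
# from https://huggingface.co/juanjgit/orca_mini_3B-GGUF
llm = Llama(model_path="orca-mini-3b.q4_0.gguf", n_ctx=2048)

# orca_mini-style prompt format (assumed)
prompt = "### User:\nExplain what GGUF is in one sentence.\n\n### Response:\n"
output = llm(prompt, max_tokens=128)
print(output["choices"][0]["text"])
```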

Wow, a 6GB Android phone! Did you measure the token generation speed? How slow/fast is it?

The good news is that I am working on releasing v2, so you could be one of the first to make a GGUF version :) Stay tuned.

Your model is the only 3B that is usable; it gives pretty good responses. And when it hallucinates, it is funny. So a v2 is very good news!
I compiled llama.cpp in Termux and I am getting 1.5-2 tokens/s.
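
In case anyone wants to reproduce that number, here is roughly how tokens/s could be measured with llama-cpp-python (the model path and prompt are assumptions):

```python
# Rough tokens/s measurement sketch with llama-cpp-python.
import time
from llama_cpp import Llama

llm = Llama(model_path="orca-mini-3b.q4_0.gguf")  # path is an assumption

start = time.time()
out = llm("### User:\nCount to ten.\n\n### Response:\n", max_tokens=64)
elapsed = time.time() - start

# The completion result reports how many tokens were actually generated.
n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens / elapsed:.2f} tokens/s")
```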
