
Free and ready-to-use zephyr-7B-beta-GGUF model as an OpenAI-API-compatible endpoint

#2 opened by limcheekin

Hi there,

I deployed the model as an OpenAI-API-compatible endpoint at https://huggingface.co/spaces/limcheekin/zephyr-7B-beta-GGUF.
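
For reference, here's a minimal sketch of querying such an endpoint with the openai Python client. The base URL below just follows the usual Hugging Face Spaces pattern and the model name is a placeholder, so adjust both to whatever the Space actually reports:

```python
# Minimal sketch: query an OpenAI-compatible endpoint hosted on a HF Space.
# The base_url and model name are assumptions; check the Space's own docs
# (e.g. its Swagger UI) for the exact values it exposes.
from openai import OpenAI

client = OpenAI(
    base_url="https://limcheekin-zephyr-7b-beta-gguf.hf.space/v1",  # assumed Space URL
    api_key="sk-no-key-needed",  # such servers typically ignore the key
)

response = client.chat.completions.create(
    model="zephyr-7b-beta",  # placeholder model name
    messages=[{"role": "user", "content": "Give me a one-line summary of GGUF."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```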

If you find this resource valuable, your support in the form of starring the space would be greatly appreciated.

Thank you.

seriously?

It's much better and easier to use a local OpenAI-compatible API via some binding...
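
For example, a minimal sketch of talking to a locally hosted OpenAI-compatible API (this assumes something like llama-cpp-python's server is already running with the GGUF loaded; port 8000 is its default, and the model name is a placeholder):

```python
# Minimal sketch: same OpenAI client, but pointed at a local server.
# Assumes an OpenAI-compatible server (e.g. llama-cpp-python's) was started
# separately; the port and model alias are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="zephyr-7b-beta",  # placeholder; use whatever alias your server exposes
    messages=[{"role": "user", "content": "Hello from a local endpoint."}],
)
print(response.choices[0].message.content)
```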

Yes, but you need a powerful machine to run inference locally.

What? ...that's a 7B model... you can run it on any potato.

LoL, I can't run it on my machine. 🀣


You can almost run a 7B on a phone...

Not on my phone. :)

By the way, one of my many similar spaces, https://huggingface.co/spaces/limcheekin/Mistral-7B-Instruct-v0.1-GGUF, is quite active.

A sign that not everyone can run a 7B model on their machine and do productive work at the same time.

@limcheekin You want to do productive work on your machine and can't run a potato-sized 7B model?
Maybe it's time to buy something newer than a computer from 2010?

BTW: I run this model even on my smartphone, a Redmi 12 Pro... 2 tokens/s.


@limcheekin Seriously though, you can run something like this even on a $100 RK3588 board. No, it won't win any speed records, but it will work. It really doesn't take many resources to run these.
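
To give a rough idea, here's a minimal sketch of loading a quantized GGUF through the llama-cpp-python binding with conservative settings for a small board. The file name, thread count, and context size are assumptions; a Q4-quantized 7B file is on the order of 4 GB, so pick a quant that fits your RAM:

```python
# Minimal sketch: run a quantized 7B GGUF on modest hardware via llama-cpp-python.
# File name, thread count, and context size are assumptions; tune them to your device.
from llama_cpp import Llama

llm = Llama(
    model_path="zephyr-7b-beta.Q4_K_M.gguf",  # assumed local file name
    n_ctx=2048,     # small context keeps memory use down
    n_threads=4,    # match the number of fast cores on the board
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hi in five words."}],
    max_tokens=32,
)
print(out["choices"][0]["message"]["content"])
```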


@mirek190 Cool that it's usable on a 12 (mine is an 11, but I did just get a Pixel 6). Normally I just run this stuff on Linux (even on ARM machines) and haven't tried it on Android. Would you mind telling us what you used to run it on Android? Or a pointer to a page that you used?

Try it via MLCChat.


@mirek190 Thanks, I'll take a look.

Thank you for your work and effort; this helped me shape my thinking about what direction I should take.
