Spaces:
Running
on
Zero
Suggestion
The Hugging Face InferenceClient is a quick and efficient way to obtain model responses, It also provides many models and also didn't require any storage.
Sample Space - https://huggingface.co/spaces/ehristoforu/mixtral-46.7b-chat
By the way Nice Project
Hello @KingNish ,
What a honor!
I found it very amazing you like it our space!
With @Lumpen1 we work on this also we have more spaces ideas to put in here!
Cool thank you for the suggestion we are very noob using gradio. But with this ZeroGPU advantage that HF give us we take the wheels.
client = InferenceClient(
"mistralai/Mixtral-8x7B-Instruct-v0.1"
)
is this support JSON Schema
or Grammars
Nevertheless if wanna be friends or work together we are open to that!
In this scenario, we are letting the llama-cpp-agent
framework handle the client-inference, through the use of a provider selection.
Currently, it supports LlamaCPPPython (as shown in this example), LlamaCPPServer, VLLMServer and TGIServer.
client = InferenceClient( "mistralai/Mixtral-8x7B-Instruct-v0.1" )
is this support
JSON Schema
orGrammars
Sadly, No.
Nevertheless if wanna be friends or work together we are open to that!
I'm definitely interested in exploring ways we can collaborate and share ideas.
Nice we can talk on discord of llama-cpp-agent
if you like https://discord.gg/DwVpftn4
We are always open and friendly.
Also let me know if wanna join this org... The main purpose was to experiment in spaces with ZeroGPU
I would like to join this org.