Inference API (serverless)

#7 by vbaldinger - opened

Hi all!

According to the model card, this model can be loaded on the Inference API (serverless). But if I try to do so, I get the error:
The model Groq/Llama-3-Groq-8B-Tool-Use is too large to be loaded automatically (16GB > 10GB)
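For reference, this is roughly what I'm running. A minimal sketch assuming a plain POST to the serverless endpoint; the HF_TOKEN environment variable and the prompt are placeholders:

```python
import os
import requests

# Serverless Inference API endpoint for this model
API_URL = "https://api-inference.huggingface.co/models/Groq/Llama-3-Groq-8B-Tool-Use"

# HF_TOKEN is a placeholder for a valid Hugging Face access token
headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

response = requests.post(API_URL, headers=headers, json={"inputs": "Hello!"})
print(response.status_code)
print(response.json())  # the "too large to be loaded automatically" error comes back here
```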

Groq org

Is this not just a limitation of the free tier?

I have the Pro subscription and can use e.g. meta-llama/Meta-Llama-3.1-405B-Instruct-FP8, so I don't think so.

Groq org

meta-llama/Meta-Llama-3.1-405B-Instruct-FP8

This is sub 10GB? :O

You might need to save the model as smaller safetensors.
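Something along these lines might work. A sketch assuming the standard transformers save_pretrained path; the 5GB shard size and the output directory are illustrative choices, not requirements:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Groq/Llama-3-Groq-8B-Tool-Use"

# Load the checkpoint once, then re-save it as safetensors split into
# smaller shards via max_shard_size
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

out_dir = "llama-3-groq-8b-tool-use-resharded"
model.save_pretrained(out_dir, safe_serialization=True, max_shard_size="5GB")
tokenizer.save_pretrained(out_dir)
```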
