🚩 403 Forbidden: error: The model CohereForAI/c4ai-command-r-plus is too large to be loaded automatically (207GB > 10GB).

#62

by gbhall - opened Sep 12

Sep 12

Dammit. I'm suddenly now getting a 403 and the following response:

{
    "error": "The model CohereForAI/c4ai-command-r-plus is too large to be loaded automatically (207GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints)."
}

Endpoint: https://api-inference.huggingface.co/models/CohereForAI/c4ai-command-r-plus/v1/chat/completions

gbhall

Sep 12

Is the API inference for this model no longer supported @shivi ? I was relying upon it in a few apps in production with lots of users.

gbhall

Sep 12

I also notice the page now says "Inference API (serverless) has been turned off for this model." 😔 Is this the end for this model?

alexrs

Cohere For AI org Sep 12

Hi @gbhall , the inference endpoint API is provided by Huggingface but we were not officially supporting it. If you are using your model in production, we recommend to check our API -- https://docs.cohere.com/reference/chat

The usage of our models for commercial purposes is not permitted, as per our license

gbhall

Sep 12

•

edited Sep 12

Hi @alexrs , thank you for your response. Do you have any insight into why it has been disabled? I understand it’s out of your control but do you have any contacts at HF you can reach out to?

Thank you. I’m not using the API for commercial purposes. Just useful utilities.

I’ve had a look at your API, unfortunately HF was desirable as it’s directly compatible with the ChatGPT Chat Completions API, using the HF TGI model. This allows you to swap in and out models simply by providing a new endpoint URL.

nbroad

Sep 12

Hi, I work at HF. I believe we switched to support the newer cohere model: https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024

If you need it up 24/7, consider using Inference Endpoints (for non-commercial purposes), or Cohere's API (for commercial).

gbhall

Sep 13

•

edited Sep 13

Hi @nbroad , thank you very much! That is such good news!

Can I ask why not just update this model to the newer version? And do you have any tips how I can keep abreast / informed if a model is deactivated and switched to a newer one?

I tried the Dedicated Inference Endpoints yesterday. Unfortunately 3 requests ended up costing 45 min of compute time with a 15 min cooldown period, for a total of $6 USD. Since I'm doing non-commerical purposes this is untenable for me unfortunately, hence why I pay for the HF Pro subscription.

nbroad

Sep 13

•

edited Sep 13

Can I ask why not just update this model to the newer version? And do you have any tips how I can keep abreast / informed if a model is deactivated and switched to a newer one?

It's better to let users access both models. This was Cohere's decision, not HF's.

I tried the Dedicated Inference Endpoints yesterday. Unfortunately 3 requests ended up costing 45 min of compute time with a 15 min cooldown period, for a total of $6 USD. Since I'm doing non-commerical purposes this is untenable for me unfortunately, hence why I pay for the HF Pro subscription.

Try Cohere's API then.

gbhall

Sep 19

•

edited Sep 19

Hi, I work at HF. I believe we switched to support the newer cohere model: https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024

If you need it up 24/7, consider using Inference Endpoints (for non-commercial purposes), or Cohere's API (for commercial).

Hi @nbroad , damn seems the new model has also been disabled.

For reasons I've listed already, my use case requires the OpenAI compatibility with the Text Generation Inference (TGI) capable Serverless Inference API, which Cohere does not support.

@nbroad , @alexrs is it possible to keep one of these models enabled on HF please. I don't know who's decision and call that is, but this is an extremely useful model to use on HF.

Edit: Nevermind, on the page of https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024 it says it's disabled, but the Serverless Inference API works still.

alexrs changed discussion status to closed Sep 25

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment