Bloom keeps giving me Response [424].

#208
by nicholasKluge - opened

I have been using the Bloom Inference API for a while, but recently my code started failing. It only fails with Bloom; other models, such as gpt2 or distilgpt2, work just fine. A community member proposed a solution, but out of nowhere it stopped working again. Here is my code to reproduce the error:


import os
import requests

API_KEY = "my_api_key"

response = requests.post(
    "https://api-inference.huggingface.co/models/bigscience/bloom",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"inputs": "Python é uma"},
)

print(response)

# >> output: <Response [424]>

Is there something about requesting Bloom that is different from other models? I only have this problem when working with it (and I really like Bloom : ( )

If someone could help, I'll be really thankful!
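
Since `print(response)` only shows the status code (424 is the HTTP "Failed Dependency" status), it may help to print the response body as well when reporting this kind of failure. The helper below is a minimal sketch; `describe_response` is a hypothetical name, and it assumes only that the object has the `status_code` and `text` attributes that `requests` responses provide. What the body actually contains is not something this thread confirms.

```python
def describe_response(response):
    """Return a one-line summary with the status code and the response body.

    Hypothetical helper: works on any object exposing `status_code` and
    `text`, such as the value returned by `requests.post`.
    """
    body = response.text.strip() or "<empty body>"
    return f"HTTP {response.status_code}: {body}"
```

Calling `print(describe_response(response))` instead of `print(response)` in the snippet above would show whatever detail the server sends along with the 424.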

I have the same error too. I've also opened some discussions about it, with no answers yet. Since this only started yesterday, I'm guessing it is just a server problem. I'm sure Hugging Face staff will resolve this within the next 24h. If not, maybe we should email the staff or something similar.

+1
I have been facing this issue for the past week, so the problem seems to be more complex :(

<Response [424]>
That is the only text I have been getting from the Inference API for the last 6 hours.

Same issue happening here.

BigScience Workshop org

Hi, thanks for reporting, I could reproduce. @olivierdehaene is looking into it.

BigScience Workshop org

Bloom is the only model hosted on AzureML infrastructure and we are currently awaiting further investigation from them on recent issues we experienced in the past week.
We are working hard to make sure Bloom is back up as quickly as possible but our hands are somewhat tied.

In the meantime, you can try Bloomz which is hosted on the HF infra.

I will post here as soon as the model is back up.
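
For anyone who wants to automate that fallback, here is a minimal sketch that tries Bloom first and moves on to Bloomz when the response is not a 200. The function name `query_with_fallback`, the injectable `post` parameter, the 60-second timeout, and the model id `bigscience/bloomz` are assumptions made for illustration; the URL pattern and request shape are taken from the snippet in the original post.

```python
API_URL = "https://api-inference.huggingface.co/models/{model}"


def query_with_fallback(
    prompt,
    api_key,
    models=("bigscience/bloom", "bigscience/bloomz"),  # assumed model ids
    post=None,
):
    """Try each model in order; return (model, parsed JSON) for the first 200.

    `post` defaults to `requests.post`; it is injectable so the function
    can be exercised without network access.
    """
    if post is None:
        import requests  # imported lazily so the sketch loads without it

        post = requests.post

    errors = []
    for model in models:
        response = post(
            API_URL.format(model=model),
            headers={"Authorization": f"Bearer {api_key}"},
            json={"inputs": prompt},
            timeout=60,
        )
        if response.status_code == 200:
            return model, response.json()
        # Remember the failure (for example 424) and try the next model.
        errors.append(f"{model}: HTTP {response.status_code}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Usage would look like `model, data = query_with_fallback("Python é uma", API_KEY)`. Keep in mind that Bloomz is a different model from Bloom, so its outputs may not be interchangeable for every use case.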

BigScience Workshop org
edited Mar 7, 2023

The issue is now fixed. Thanks for your patience.

BigScience Workshop org

Thank you @olivierdehaene !

christopher changed discussion status to closed

The service is unavailable again.

christopher changed discussion status to open

BigScience Workshop org

Yes, we are facing the same issue as before. We might host it on our own infra while we wait for better availability guarantees from our partner.
I will keep you updated.

BigScience Workshop org

Azure found that one of the eight A100s we use to serve BLOOM was faulty and replaced our node.
Bloom should be back at full capacity now.

We also added token streaming capabilities to Bloom. If you want to try it out, you can use this client: https://pypi.org/project/text-generation/

from text_generation import InferenceAPIClient

client = InferenceAPIClient("bigscience/bloom")

# Token Streaming
for response in client.generate_stream("Hello Bloom!"):
    if not response.token.special:
        print(response.token.text)
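
If you would rather assemble the streamed tokens into a single string than print them one per line, something along these lines should work. This is a sketch: `collect_stream` is a hypothetical helper, and it relies only on the `token.text` and `token.special` fields used in the snippet above.

```python
def collect_stream(responses):
    """Join the text of all non-special tokens from a token stream.

    `responses` is any iterable of objects shaped like the ones yielded by
    `client.generate_stream(...)` above (each has `token.text` and
    `token.special`).
    """
    return "".join(
        response.token.text
        for response in responses
        if not response.token.special
    )
```

For example: `text = collect_stream(client.generate_stream("Hello Bloom!"))`.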

TimeRobber changed discussion status to closed
