Bloom keeps giving me Response [424].
I have been using Bloom through the Inference API for a while, but recently my code started failing (and it only fails with Bloom; other models such as gpt2 or distilgpt2 work just fine). A solution was proposed by a member of the community, but out of nowhere it stopped working again. Here is my code to reproduce the error:
import requests

API_KEY = "my_api_key"

response = requests.post(
    "https://api-inference.huggingface.co/models/bigscience/bloom",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"inputs": "Python é uma"},
)
print(response)
# >> output: <Response [424]>
Is there something about requesting Bloom that is different than other models? I only have this problem when working with it (and I really like Bloom : ( )
If someone could help, I'd be really thankful!
I have the same error too... I've also opened some discussions about it, with no answers yet. Since this only started yesterday, I'm guessing it is just a server problem. I'm sure Hugging Face staff will resolve this within the next 24h. If not, maybe we should write some emails to the staff or something similar.
+1
I have been facing this issue for the past week, so the problem seems to be more complex :(
<Response [424]>
That is the only output I have been getting from the Inference API for the last 6 hours.
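For anyone else who only sees <Response [424]>: print(response) shows just the status code, not the body. HTTP 424 is "Failed Dependency", which suggests a problem on the serving side rather than in your request. Below is a minimal sketch for getting a bit more detail out of the response. describe_response is a hypothetical helper name, and the assumption that the error body is JSON with an "error" key is mine; if the body has another shape, the raw text is shown instead.

```python
import json


def describe_response(status_code, body_text):
    """Summarise a status code and raw response body in one line.

    Assumption: error bodies look like {"error": "..."}; anything else
    is reported as raw text.
    """
    try:
        payload = json.loads(body_text)
    except ValueError:
        # Body was empty or not valid JSON.
        payload = None

    if isinstance(payload, dict) and "error" in payload:
        detail = payload["error"]
    else:
        detail = body_text.strip() or "<empty body>"
    return f"HTTP {status_code}: {detail}"


# Usage with the snippet from the first post:
# print(describe_response(response.status_code, response.text))
```

Posting that one-line summary in the thread tends to be more useful to the maintainers than the bare status code.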
Same issue happening here.
Hi, thanks for reporting, I could reproduce. @olivierdehaene is looking into it.
Bloom is the only model hosted on AzureML infrastructure and we are currently awaiting further investigation from them on recent issues we experienced in the past week.
We are working hard to make sure Bloom is back up as quickly as possible but our hands are somewhat tied.
In the meantime, you can try Bloomz which is hosted on the HF infra.
I will post here as soon as the model is back up.
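To make the Bloomz suggestion concrete, here is a rough sketch that tries Bloom first and falls back to Bloomz when the response is not a 200. It assumes Bloomz is served under the model id bigscience/bloomz at the same URL pattern as in the original snippet. build_request and first_working_model are hypothetical helper names, and post is whatever callable you pass in (normally requests.post).

```python
API_URL = "https://api-inference.huggingface.co/models/{model_id}"


def build_request(model_id, api_key, prompt):
    """Return keyword arguments for requests.post, same shape as the original snippet."""
    return {
        "url": API_URL.format(model_id=model_id),
        "headers": {"Authorization": f"Bearer {api_key}"},
        "json": {"inputs": prompt},
    }


def first_working_model(
    prompt,
    api_key,
    post,
    model_ids=("bigscience/bloom", "bigscience/bloomz"),  # assumed model ids
):
    """Try each model in order and return (model_id, response) for the first HTTP 200.

    `post` is any callable accepting url=, headers= and json= like requests.post.
    Returns (None, last_response) if every model fails.
    """
    response = None
    for model_id in model_ids:
        response = post(**build_request(model_id, api_key, prompt))
        if response.status_code == 200:
            return model_id, response
    return None, response


# Usage:
# import requests
# model_id, response = first_working_model("Python é uma", API_KEY, requests.post)
# print(model_id, response.status_code)
```

Keep in mind that Bloomz is a fine-tuned variant, so its completions will not necessarily match what Bloom would return for the same prompt.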
The issue is now fixed. Thanks for your patience.
Thank you @olivierdehaene !
The service is unavailable again.
The team is investigating https://huggingface.co/bigscience/bloom/discussions/211#64098eaed5042941df7e416d
Yes, we are facing the same issue as before. We might host it on our own infra while we wait for better availability guarantees from our partner.
I will keep you updated.
Azure found that one of the eight A100s we use to serve BLOOM was faulty and replaced our node.
Bloom should be back at full capacity now.
We also added token streaming capabilities to Bloom. If you want to try it out, you can use this client: https://pypi.org/project/text-generation/
from text_generation import InferenceAPIClient

client = InferenceAPIClient("bigscience/bloom")

# Token Streaming
for response in client.generate_stream("Hello Bloom!"):
    if not response.token.special:
        print(response.token.text)
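The loop above prints every token on its own line. If you would rather end up with the full generated text as a single string, a small sketch along these lines should work. collect_stream is a hypothetical helper name; it relies only on the response.token.special and response.token.text attributes used in the example above.

```python
def collect_stream(responses):
    """Join the text of all non-special tokens yielded by a streaming iterator."""
    pieces = []
    for response in responses:
        # Skip special tokens (e.g. end-of-sequence markers), as in the loop above.
        if not response.token.special:
            pieces.append(response.token.text)
    return "".join(pieces)


# Usage:
# from text_generation import InferenceAPIClient
# client = InferenceAPIClient("bigscience/bloom")
# print(collect_stream(client.generate_stream("Hello Bloom!")))
```

If you want to see the text as it arrives instead, printing with print(response.token.text, end="", flush=True) inside the original loop keeps the tokens on one line.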