---
tags:
- image-to-text
- image-captioning
- endpoints-template
license: bsd-3-clause
library_name: generic
---
# Fork of [salesforce/BLIP](https://github.com/salesforce/BLIP) for an `image-captioning` task on 🤗 Inference Endpoints

This repository implements a `custom` task for `image-captioning` for 🤗 Inference Endpoints. The code for the customized pipeline is in [pipeline.py](https://huggingface.co/florentgbelidji/blip_captioning/blob/main/pipeline.py).

To deploy this model as an Inference Endpoint, you have to select `Custom` as the task so that the `pipeline.py` file is used. -> _double check that it is selected_

### Expected request payload

```json
{
  "image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC...." // base64-encoded image bytes
}
```
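
The base64 string in the payload above can be produced from raw image bytes with Python's standard `base64` module. The following is a minimal sketch, not the repository's own code: `encode_image` and `build_payload` are hypothetical helper names, and the `"image"` key simply mirrors the payload shown above.

```python
import base64
import json


def encode_image(image_bytes: bytes) -> str:
    """Return the image bytes as an ASCII base64 string (hypothetical helper)."""
    return base64.b64encode(image_bytes).decode("utf-8")


def build_payload(image_bytes: bytes) -> dict:
    """Build a request body shaped like the payload shown above (assumed key: "image")."""
    return {"image": encode_image(image_bytes)}


# First bytes of a JPEG file: the SOI marker followed by the start of a JFIF header.
sample = b"\xff\xd8\xff\xe0\x00\x10JFIF"
payload = build_payload(sample)

# The value is a plain str, so the payload serializes to JSON without errors.
body = json.dumps(payload)
print(payload["image"])  # prints "/9j/4AAQSkZJRg=="
```

The encoded sample begins with `/9j/4AAQ`, the same prefix as the example payload, which is a quick way to recognize a base64-encoded JPEG.
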

Below is an example of how to run a request using Python and `requests`.

## Run Request

1. Prepare an image.
```bash
wget https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg
```
2. Run the request.
```python
import base64

import requests as r

ENDPOINT_URL = ""  # URL of your deployed Inference Endpoint
HF_TOKEN = ""  # your Hugging Face access token


def predict(path_to_image: str = None):
    # Raw bytes cannot be serialized to JSON, so send the image as a base64 string,
    # matching the base64 payload described above.
    with open(path_to_image, "rb") as i:
        image = base64.b64encode(i.read()).decode("utf-8")

    payload = {
        "inputs": [image],
        "parameters": {
            "do_sample": True,
            "top_p": 0.9,
            "min_length": 5,
            "max_length": 20,
        },
    }
    response = r.post(
        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
    )
    return response.json()


prediction = predict(path_to_image="palace.jpg")
```

Example parameters depending on the decoding strategy:

1. Beam search
```
"parameters": {
    "num_beams": 5,
    "max_length": 20
}
```
2. Nucleus sampling
```
"parameters": {
    "num_beams": 1,
    "max_length": 20,
    "do_sample": True,
    "top_k": 50,
    "top_p": 0.95
}
```
3. Contrastive search
```
"parameters": {
    "penalty_alpha": 0.6,
    "top_k": 4,
    "max_length": 512
}
```

See the [generate()](https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate) documentation for additional details.
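
The three parameter sets above can be kept in one place and selected by name. This is only a sketch: the strategy names, `DECODING_PRESETS`, and `build_parameters` are hypothetical and not part of `pipeline.py`; the values are copied from the examples above.

```python
from typing import Any, Dict

# Parameter sets copied from the examples above, keyed by a hypothetical strategy name.
DECODING_PRESETS: Dict[str, Dict[str, Any]] = {
    "beam_search": {"num_beams": 5, "max_length": 20},
    "nucleus_sampling": {
        "num_beams": 1,
        "max_length": 20,
        "do_sample": True,
        "top_k": 50,
        "top_p": 0.95,
    },
    "contrastive_search": {"penalty_alpha": 0.6, "top_k": 4, "max_length": 512},
}


def build_parameters(strategy: str, **overrides: Any) -> Dict[str, Any]:
    """Return a copy of the preset for `strategy`, with optional overrides applied."""
    if strategy not in DECODING_PRESETS:
        raise ValueError(f"Unknown decoding strategy: {strategy!r}")
    params = dict(DECODING_PRESETS[strategy])  # copy so the preset stays unchanged
    params.update(overrides)
    return params


# Example: beam search with a longer maximum caption length.
parameters = build_parameters("beam_search", max_length=30)
print(parameters)  # prints {'num_beams': 5, 'max_length': 30}
```

The returned dictionary can be placed under the `"parameters"` key of the request payload.
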

Expected output:

```python
['buckingham palace with flower beds and red flowers']
```
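
Before using the result, it can help to check that the response has the expected shape. The sketch below assumes the endpoint returns a JSON list of caption strings, as in the expected output above; `extract_captions` is a hypothetical helper, and an error response from the endpoint may look different.

```python
from typing import Any, List


def extract_captions(response_json: Any) -> List[str]:
    """Check that the decoded response is a list of strings and return it (hypothetical helper)."""
    if not isinstance(response_json, list) or not all(
        isinstance(caption, str) for caption in response_json
    ):
        raise ValueError(f"Unexpected response shape: {response_json!r}")
    return [caption.strip() for caption in response_json]


# The expected output shown above, used here in place of a live response.
captions = extract_captions(["buckingham palace with flower beds and red flowers"])
print(captions[0])  # prints "buckingham palace with flower beds and red flowers"
```
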