"Use in Transformers" feature is wrong

#1
by smartinezbragado - opened

The right way to load this model is:

```python
import torch
from transformers import Blip2Processor, Blip2ForConditionalGeneration

processor = Blip2Processor.from_pretrained("Salesforce/blip2-flan-t5-xl-coco")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-flan-t5-xl-coco", torch_dtype=torch.float16
)
```
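For completeness, a minimal captioning sketch that continues from the loading snippet above. The COCO image URL, device handling, and generation settings are illustrative, not part of the original report:

```python
import requests
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

processor = Blip2Processor.from_pretrained("Salesforce/blip2-flan-t5-xl-coco")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-flan-t5-xl-coco", torch_dtype=torch.float16
)
# float16 inference is intended for GPU; on CPU you may prefer the default dtype.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Example COCO validation image (illustrative URL).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt").to(device, torch.float16)
generated_ids = model.generate(**inputs, max_new_tokens=30)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
print(caption)
```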

Thanks for flagging. cc @osanseviero, do you know why `AutoModelForVisualQuestionAnswering` is displayed in the "Use in Transformers" UI? That class doesn't exist in the Transformers library.

Ok thanks. Actually, I think we should deprecate BLIP and BLIP-2 for that class and move them to the new "image-text-to-text" pipeline, which was added in https://github.com/huggingface/transformers/pull/29572.
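Once that move lands, usage could go through the high-level `pipeline` API instead of the model class. A sketch, assuming the `image-text-to-text` task accepts this BLIP-2 checkpoint (support for it under that task is an assumption based on the linked PR, not something stated in this thread):

```python
from transformers import pipeline

# Assumption: this checkpoint is routed to the "image-text-to-text" pipeline.
pipe = pipeline("image-text-to-text", model="Salesforce/blip2-flan-t5-xl-coco")

# Illustrative COCO image URL and prompt.
out = pipe(
    "http://images.cocodataset.org/val2017/000000039769.jpg",
    text="Question: what is in the photo? Answer:",
)
print(out)
```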
