Fine-tuned version of MiniCPM-V 2.6 on GQA-it
This is a fine-tuned version of MiniCPM-V 2.6 on GQA-it, designed for Italian Visual Question Answering. The original model is built on SigLip-400M and Qwen2-7B, with a total of 8B parameters.
Usage
For advanced usage, see the base model repository: https://github.com/OpenBMB/MiniCPM-V.
For more details about the dataset, see: https://github.com/crux82/gqa-it
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

# Load the fine-tuned weights; trust_remote_code loads the custom MiniCPM-V model code.
model = AutoModel.from_pretrained(
    'sag-uniroma2/MiniCPM-V-2_6-gqa-it-finetuned',
    trust_remote_code=True,
    attn_implementation='sdpa',
    torch_dtype=torch.bfloat16,
)
model = model.eval().cuda()
# The tokenizer is taken from the base model repository.
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-V-2_6', trust_remote_code=True)

# Path to a local image file.
img = "n346247.jpg"
image = Image.open(img).convert('RGB')
question = "C'è un idrante sull'erba?"  # "Is there a fire hydrant on the grass?"

# The image is passed inside the message content, so `image` is None here.
msgs = [{'role': 'user', 'content': [image, question]}]
answer = model.chat(
    image=None,
    msgs=msgs,
    tokenizer=tokenizer,
)
print(answer)
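To ask several questions, the call above can be wrapped in a small helper. This is a minimal sketch, not part of the original repository: `ask` is a hypothetical name, and it assumes only that `model.chat` accepts the `image`, `msgs`, and `tokenizer` arguments used in the example.

```python
def ask(model, tokenizer, image, question):
    """Ask one question about one image (hypothetical helper).

    Mirrors the usage example: the image and the question travel together
    in the user message, and chat() receives image=None.
    """
    msgs = [{'role': 'user', 'content': [image, question]}]
    return model.chat(image=None, msgs=msgs, tokenizer=tokenizer)
```

With the `model`, `tokenizer`, and `image` objects from the example, a call would look like `ask(model, tokenizer, image, "C'è un idrante sull'erba?")`.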
GQA-it
Italian Question Answering on Image Scene Graphs
GQA-it is a large-scale Italian dataset for Visual Question Answering based on the balanced version of GQA.
GQA-it contains more than 1 million question/answer pairs in Italian over 80K images. The Italian pairs were obtained by applying Neural Machine Translation to the original English data.
Most importantly, a test set of 3,000 question-answer pairs has been manually validated to provide a reliable benchmark in Italian.
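As a rough illustration of how predictions could be scored against the validated test set, the sketch below computes exact-match accuracy. The normalization (lowercasing, trimming whitespace and a trailing period) is an assumption, not the official GQA or GQA-it evaluation protocol, and the function names and sample data are hypothetical.

```python
def normalize(answer):
    """Lowercase and trim whitespace and a trailing period (assumed normalization)."""
    return answer.strip().lower().rstrip('.').strip()


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Illustrative data only, not real model outputs.
preds = ["Destra.", "no", "metallo"]
golds = ["destra", "no", "plastica"]
print(exact_match_accuracy(preds, golds))
```

Here two of the three illustrative predictions match their reference after normalization.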
Example
Language | Question | Answer |
---|---|---|
En | Is the remote to the right or to the left of the book? | right |
It | Il telecomando è a destra o a sinistra del libro? | destra |
En | How thick is the book to the left of the remote? | thick |
It | Quanto è spesso il libro a sinistra del telecomando? | spesso |
En | What device is to the left of the calculator made of plastic? | charger |
It | Quale dispositivo si trova a sinistra della calcolatrice di plastica? | caricabatterie |
En | What's the charger made of? | plastic |
It | Di cosa è fatto il caricabatterie? | plastica |
En | Are there any phones? | no |
It | Ci sono dei telefoni? | no |
Citation
TODO