Visual Question Answering
Transformers
English
qwen2
text-generation
multimodal large language model
large video-language model
Inference Endpoints
Model Upload
7b414f2
0 Bytes