---
license: mit
language:
- en
base_model:
- OpenGVLab/InternVL-Chat-V1-5
pipeline_tag: visual-question-answering
---

## Citation

If you use this fine-tuned model checkpoint in your research, please cite our paper:

```bibtex
@misc{zhang2024visualquestiondecompositionmultimodal,
  title={Visual Question Decomposition on Multimodal Large Language Models},
  author={Haowei Zhang and Jianzhe Liu and Zhen Han and Shuo Chen and Bailan He and Volker Tresp and Zhiqiang Xu and Jindong Gu},
  year={2024},
  eprint={2409.19339},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2409.19339},
}
```