Is visual grounding possible on multiple images?
#48 by echooooooooo - opened
I'd like to fine-tune the model for a visual grounding task with multiple images. If that's possible, could you please share an example? In particular, I want to know how to distinguish the bounding boxes for the first image from those for the second image.