Is visual grounding possible on multiple images?

#48
by echooooooooo - opened

I'd like to fine-tune the model for a visual grounding task with multiple images. If that's possible, could you please share an example? In particular, I want to know how to distinguish bounding boxes in the first image from those in the second image.

