InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning (Paper • arXiv:2305.06500 • Published May 11, 2023)
PaLI-3 Vision Language Models: Smaller, Faster, Stronger (Paper • arXiv:2310.09199 • Published Oct 13, 2023)
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models (Paper • arXiv:2306.05424 • Published Jun 8, 2023)