Qwen2-VL Collection Vision-language model series based on Qwen2 • 15 items • Updated 14 days ago • 129
Flamingo: a Visual Language Model for Few-Shot Learning Paper • 2204.14198 • Published Apr 29, 2022 • 14
view article Article Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model Aug 22, 2023 • 26
ColPali: Efficient Document Retrieval with Vision Language Models Paper • 2407.01449 • Published Jun 27 • 41
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 7 days ago • 585
view article Article Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task By danaaubakirova • May 16 • 17