Multimodal Dataset - a btjhjeon Collection

Multimodal Dataset

updated 19 days ago

SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers

Paper • 2407.09413 • Published Jul 12 • 9
MAVIS: Mathematical Visual Instruction Tuning

Paper • 2407.08739 • Published Jul 11 • 30
Kvasir-VQA: A Text-Image Pair GI Tract Dataset

Paper • 2409.01437 • Published Sep 2 • 70
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

Paper • 2409.05840 • Published Sep 9 • 45
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning

Paper • 2409.12568 • Published Sep 19 • 47
LVD-2M: A Long-take Video Dataset with Temporally Dense Captions

Paper • 2410.10816 • Published 23 days ago • 19
Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published 28 days ago • 69
Harnessing Webpage UIs for Text-Rich Visual Understanding

Paper • 2410.13824 • Published 20 days ago • 29