DocGraphLM: Documental Graph Language Model for Information Extraction Paper • 2401.02823 • Published Jan 5 • 32
Understanding LLMs: A Comprehensive Overview from Training to Inference Paper • 2401.02038 • Published Jan 4 • 60
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 176
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration Paper • 2309.01131 • Published Sep 3, 2023 • 1
LMDX: Language Model-based Document Information Extraction and Localization Paper • 2309.10952 • Published Sep 19, 2023 • 61
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 123
HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models Paper • 2403.13447 • Published Mar 20 • 16
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29 • 116
Block Transformer: Global-to-Local Language Modeling for Fast Inference Paper • 2406.02657 • Published about 1 month ago • 36
Unifying Vision, Text, and Layout for Universal Document Processing Paper • 2212.02623 • Published Dec 5, 2022 • 10
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning Paper • 2406.15334 • Published 14 days ago • 8