Does Prompt Formatting Have Any Impact on LLM Performance? Paper • 2411.10541 • Published 19 days ago • 1
DPO datasets for EN Collection A collection of DPO datasets for the EN language. • 43 items • Updated 14 days ago • 2
Article Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK By davidberenstein1957 • 13 days ago • 34
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated 6 days ago • 52
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published 19 days ago • 61
Continuous Speculative Decoding for Autoregressive Image Generation Paper • 2411.11925 • Published 16 days ago • 14
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering Paper • 2411.11504 • Published 16 days ago • 19
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published 19 days ago • 107
Awesome Document AI Collection A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11 • 74
Direct Preference Optimization Using Sparse Feature-Level Constraints Paper • 2411.07618 • Published 22 days ago • 15
ViDoRe Benchmark Collection Benchmark for document retrieval using visual features, introduced in the ColPali paper. Datasets are using the QA format. • 10 items • Updated 16 days ago • 11
LLM-KT: A Versatile Framework for Knowledge Transfer from Large Language Models to Collaborative Filtering Paper • 2411.00556 • Published Nov 1 • 1
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding Paper • 2411.04952 • Published 27 days ago • 27
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published 27 days ago • 49