Article Llama can now see and run on your device - welcome Llama 3.2 11 days ago • 137
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3 • 78
Qwen2-Math Collection Math-specific model series based on Qwen2 • 8 items • Updated 17 days ago • 44
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 110
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 170
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated 17 days ago • 340
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published May 23 • 32
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 161
Article A Dive into Pretraining Strategies for Vision-Language Models Feb 3, 2023 • 36
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29 • 118
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • Jun 23 • 33
MobileSAMv2: Faster Segment Anything to Everything Paper • 2312.09579 • Published Dec 15, 2023 • 20
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 10 days ago • 676
LMDX: Language Model-based Document Information Extraction and Localization Paper • 2309.10952 • Published Sep 19, 2023 • 65
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15 • 56
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 182
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding Paper • 2307.02499 • Published Jul 4, 2023 • 15
TinyGSM: achieving >80% on GSM8k with small language models Paper • 2312.09241 • Published Dec 14, 2023 • 36
CogVLM: Visual Expert for Pretrained Language Models Paper • 2311.03079 • Published Nov 6, 2023 • 23
OpenChat Collection OpenChat: Advancing Open-source Language Models with Mixed-Quality Data • 7 items • Updated Jul 31 • 33
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 144
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification Paper • 2308.07921 • Published Aug 15, 2023 • 22
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models Paper • 2310.08491 • Published Oct 12, 2023 • 53
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models Paper • 2309.11674 • Published Sep 20, 2023 • 31
Investigating Answerability of LLMs for Long-Form Question Answering Paper • 2309.08210 • Published Sep 15, 2023 • 12
Multimodal Foundation Models: From Specialists to General-Purpose Assistants Paper • 2309.10020 • Published Sep 18, 2023 • 40
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 82
YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023 • 65
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023 • 47
UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition Paper • 2308.03279 • Published Aug 7, 2023 • 21
Med-Flamingo: a Multimodal Medical Few-shot Learner Paper • 2307.15189 • Published Jul 27, 2023 • 22
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition Paper • 2307.13269 • Published Jul 25, 2023 • 31
Semantic-SAM: Segment and Recognize Anything at Any Granularity Paper • 2307.04767 • Published Jul 10, 2023 • 20
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications Paper • 2306.14289 • Published Jun 25, 2023 • 15