The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 602
Common 7B Language Models Already Possess Strong Math Capabilities Paper • 2403.04706 • Published Mar 7 • 16
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 181
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 79
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models Paper • 2401.00788 • Published Jan 1 • 21
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws Paper • 2401.00448 • Published Dec 31, 2023 • 28
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation Paper • 2312.14187 • Published Dec 20, 2023 • 49
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 258
RepoFusion: Training Code Models to Understand Your Repository Paper • 2306.10998 • Published Jun 19, 2023 • 14
PathFinder: Guided Search over Multi-Step Reasoning Paths Paper • 2312.05180 • Published Dec 8, 2023 • 9
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 138
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines Paper • 2310.03714 • Published Oct 5, 2023 • 30