OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published 12 days ago • 105
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 126
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse Paper • 2410.21333 • Published 23 days ago • 9
Counting Ability of Large Language Models and Impact of Tokenization Paper • 2410.19730 • Published 25 days ago • 10
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Paper • 2410.14669 • Published Oct 18 • 35
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published 28 days ago • 88
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch Paper • 2410.18693 • Published 26 days ago • 40
Why Does the Effective Context Length of LLMs Fall Short? Paper • 2410.18745 • Published 26 days ago • 16
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices Paper • 2410.11795 • Published Oct 15 • 16
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 15 items • Updated 17 days ago • 75
Thinking LLMs: General Instruction Following with Thought Generation Paper • 2410.10630 • Published Oct 14 • 16
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models Paper • 2410.07985 • Published Oct 10 • 26
Think While You Generate: Discrete Diffusion with Planned Denoising Paper • 2410.06264 • Published Oct 8 • 9
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging Paper • 2410.01215 • Published Oct 2 • 30
Visual Context Window Extension: A New Perspective for Long Video Understanding Paper • 2409.20018 • Published Sep 30 • 8
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos Paper • 2409.19603 • Published Sep 29 • 17