Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens Paper • 2411.17691 • Published 1 day ago • 4
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning Paper • 2410.06373 • Published Oct 8 • 35
From CISC to RISC: language-model guided assembly transpilation Paper • 2411.16341 • Published 2 days ago • 11
Cautious Optimizers: Improving Training with One Line of Code Paper • 2411.16085 • Published 3 days ago • 10
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published 5 days ago • 50
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Paper • 2411.14199 • Published 6 days ago • 25