LLM Pruning and Distillation in Practice: The Minitron Approach • arXiv:2408.11796 • Published Aug 21
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering • arXiv:2408.09174 • Published Aug 17
To Code, or Not To Code? Exploring Impact of Code in Pre-training • arXiv:2408.10914 • Published Aug 20
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications • arXiv:2408.11878 • Published Aug 20
CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting Mitigation • arXiv:2408.14572 • Published Aug 26
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding • arXiv:2408.15545 • Published Aug 28
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture • arXiv:2409.02889 • Published Sep 4
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA • arXiv:2409.02897 • Published Sep 4
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing • arXiv:2409.01322 • Published Sep 2
Towards a Unified View of Preference Learning for Large Language Models: A Survey • arXiv:2409.02795 • Published Sep 4
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance • arXiv:2409.04593 • Published Sep 6
ProteinBench: A Holistic Evaluation of Protein Foundation Models • arXiv:2409.06744 • Published Sep 10
Training Language Models to Self-Correct via Reinforcement Learning • arXiv:2409.12917 • Published 28 days ago
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models • arXiv:2409.16191 • Published 24 days ago
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices • arXiv:2410.00531 • Published 17 days ago
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging • arXiv:2410.01215 • Published 16 days ago
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning • arXiv:2410.01044 • Published 16 days ago
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis • arXiv:2410.02749 • Published 14 days ago
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration • arXiv:2410.02367 • Published 15 days ago
Addition is All You Need for Energy-efficient Language Models • arXiv:2410.00907 • Published 16 days ago
Agent S: An Open Agentic Framework that Uses Computers Like a Human • arXiv:2410.08164 • Published 7 days ago
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation • arXiv:2410.09584 • Published 5 days ago