Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published 2 days ago • 35
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published 23 days ago • 64
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Paper • 2410.10814 • Published Oct 14 • 48
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization Paper • 2410.08815 • Published Oct 11 • 42
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts Paper • 2409.16040 • Published Sep 24 • 13
dstack: Your LLM Launchpad - From Fine-Tuning to Serving, Simplified Article • By chansung • Aug 22 • 12
CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases Paper • 2408.03910 • Published Aug 7 • 15