LiveMind: Low-latency Large Language Models with Simultaneous Inference • arXiv:2406.14319 • Published Jun 20, 2024 • 14 upvotes
Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps • arXiv:2406.14539 • Published Jun 20, 2024 • 26 upvotes
HARE: HumAn pRiors, a key to small language model Efficiency • arXiv:2406.11410 • Published Jun 17, 2024 • 38 upvotes
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters • arXiv:2406.16758 • Published Jun 24, 2024 • 19 upvotes
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation • arXiv:2406.16855 • Published Jun 24, 2024 • 54 upvotes
Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization • arXiv:2406.16008 • Published Jun 23, 2024 • 6 upvotes
Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations • arXiv:2406.13632 • Published Jun 19, 2024 • 5 upvotes
AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models • arXiv:2406.16714 • Published Jun 24, 2024 • 10 upvotes
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs • arXiv:2406.15927 • Published Jun 22, 2024 • 13 upvotes
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers • arXiv:2406.16747 • Published Jun 24, 2024 • 18 upvotes
Efficient Continual Pre-training by Mitigating the Stability Gap • arXiv:2406.14833 • Published Jun 21, 2024 • 19 upvotes
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions • arXiv:2406.15877 • Published Jun 22, 2024 • 45 upvotes