Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster Paper • 2311.08263 • Published Nov 14, 2023 • 15
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention Paper • 2312.07987 • Published Dec 13, 2023 • 40