SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices Paper • 2406.02532 • Published Jun 4 • 13