Accelerating Speculative Decoding using Dynamic Speculation Length Paper • 2405.04304 • Published May 7
Distributed Speculative Inference of Large Language Models Paper • 2405.14105 • Published May 23