arXiv:2311.08981

Speculative Contrastive Decoding

Published on Nov 15, 2023
Abstract

Large language models (LLMs) have shown extraordinary performance on various language tasks, but their high computational requirements hinder widespread deployment. Speculative decoding, which uses amateur models to predict the generations of expert models, has been proposed to accelerate LLM inference. However, speculative decoding focuses on acceleration and does not make full use of the token distributions from amateur models. We propose Speculative Contrastive Decoding (SCD), an accelerated decoding method that leverages the natural contrast between expert and amateur models already present in speculative decoding. Comprehensive evaluations on four benchmarks show that SCD achieves acceleration comparable to speculative decoding while further improving generation quality, as contrastive decoding does. Analysis of token probabilities further demonstrates the compatibility of speculative and contrastive decoding. Overall, SCD offers an effective way to enhance the decoding quality of LLMs while saving computational resources.
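To make the idea concrete, below is a minimal, self-contained sketch of how speculative drafting can be combined with a contrastive verification target. It is an illustration under stated assumptions, not the paper's exact algorithm: amateur_dist and expert_dist are toy stand-ins for the two models, and the plausibility threshold alpha and contrast weight beta are illustrative hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 8  # toy vocabulary size

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def amateur_dist(prefix):
    # Toy stand-in for the cheap "amateur" model's next-token distribution.
    return softmax(rng.standard_normal(VOCAB))

def expert_dist(prefix):
    # Toy stand-in for the large "expert" model's next-token distribution.
    return softmax(2.0 * rng.standard_normal(VOCAB))

def contrastive_dist(p_exp, p_ama, alpha=0.1, beta=0.5):
    # Contrastive target: favor tokens the expert rates higher than the
    # amateur, restricted to the expert's plausible tokens
    # (p_exp >= alpha * max p_exp). alpha and beta are illustrative.
    mask = p_exp >= alpha * p_exp.max()
    scores = np.where(mask, np.log(p_exp) - beta * np.log(p_ama), -np.inf)
    probs = np.exp(scores - scores.max())
    return probs / probs.sum()

def scd_round(prefix, gamma=4):
    # One speculative round: the amateur drafts up to gamma tokens, and each
    # draft is verified against the contrastive distribution q by the
    # standard speculative rejection-sampling rule.
    out = list(prefix)
    for _ in range(gamma):
        p_ama = amateur_dist(out)
        draft = int(rng.choice(VOCAB, p=p_ama))
        p_exp = expert_dist(out)      # in practice, one batched expert pass
        q = contrastive_dist(p_exp, p_ama)
        if rng.random() < min(1.0, q[draft] / p_ama[draft]):
            out.append(draft)         # draft token accepted
        else:
            # On rejection, resample from the residual max(q - p_ama, 0)
            # and end the round, as in standard speculative sampling.
            residual = np.maximum(q - p_ama, 0.0)
            out.append(int(rng.choice(VOCAB, p=residual / residual.sum())))
            break
    return out

print(scd_round([0], gamma=4))
```

The key design point this sketch tries to capture from the abstract: the acceptance test is the usual speculative-sampling rule, but the verification target is the contrastive distribution rather than the expert distribution alone, so acceleration and quality improvement come from the same amateur-model computation.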
