Let's make it faster!

#130
by emilios - opened

I ran some tests using v3-turbo as the assistant model (speculative decoding) with v3 as the "teacher",
and it is even faster, with fewer words getting lost in "transit".
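
For anyone unfamiliar with the idea, here is a minimal toy sketch of greedy speculative decoding. It is not the Whisper or transformers implementation: `target_next` and `draft_next` are hypothetical stand-ins for the large model (v3) and the assistant model (v3-turbo), and the "tokens" are just integers. The point it illustrates is that the assistant only proposes tokens, the main model checks every one of them, and so the final output matches what the main model would have produced on its own.

```python
from typing import Callable, List

# A "model" here is just a function mapping the tokens so far to the next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(next_token: NextToken, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Plain greedy decoding with a single model, used as the reference."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(next_token(tokens))
    return tokens


def speculative_decode(
    target_next: NextToken,
    draft_next: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    k: int = 4,
) -> List[int]:
    """Greedy speculative decoding: the draft proposes, the target verifies."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        budget = min(k, goal - len(tokens))
        proposal: List[int] = []
        for _ in range(budget):
            proposal.append(draft_next(tokens + proposal))

        # 2. The target model checks each proposed position. A real
        #    implementation scores all positions in one batched forward
        #    pass; this sketch calls the target sequentially for clarity.
        for guess in proposal:
            expected = target_next(tokens)
            tokens.append(expected)  # always keep the target's own token
            if expected != guess:
                break  # first mismatch: discard the rest of the proposal
    return tokens


# Hypothetical toy models: the target counts upward modulo 10,
# and the draft agrees except right after the token 4.
def target_next(tokens: List[int]) -> int:
    return (tokens[-1] + 1) % 10


def draft_next(tokens: List[int]) -> int:
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_decode(target_next, draft_next, [0], 8, k=4))
    print(greedy_decode(target_next, [0], 8))
```

In this sketch the draft's mistake after token 4 is caught and replaced by the target's token, so both calls print the same sequence. The speed-up in practice comes from the target verifying several drafted tokens per forward pass instead of generating them one at a time.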

Will you please implement it here?

@sanchit-gandhi
