LLM Speculative Decoding
A collection of tiny language models meant to serve as draft models for speculative decoding.
Experimental model meant to serve as a draft model for long-context speculative decoding.
Created from Doctor-Shotgun/smol_llama-220M-GQA-32k-theta by finetuning at a 32768-token context length on several instruction datasets.
This variant uses the RoPE theta (RoPE frequency base) method for context extension.
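Below is a minimal, hedged sketch (not part of the original card) of how a draft model like this might be paired with a larger target model via the `assistant_model` argument of transformers' `generate()`, which implements assisted generation / speculative decoding. The target repo id is a placeholder, and assisted generation assumes the draft and target share the same tokenizer/vocabulary.

```python
# Usage sketch only: repo ids are assumptions, not prescribed by this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "your-org/target-llama-model"                   # placeholder target model
draft_id = "Doctor-Shotgun/smol_llama-220M-GQA-32k-theta"   # base of this draft model

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(target_id)
draft = AutoModelForCausalLM.from_pretrained(draft_id)

prompt = "### Instruction:\nSummarize speculative decoding in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt")

# The draft model proposes candidate tokens and the target model verifies them,
# which can speed up decoding, especially at long context lengths.
output = target.generate(**inputs, assistant_model=draft, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```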
The trained instruction format is Alpaca:
### Instruction:
{{instruction}}

### Input:
{{user input}}

### Response:
{{model response}}
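As an illustration (not from the card), a small helper can assemble a prompt in this format; omitting the `### Input:` block when there is no input is an assumption based on common Alpaca conventions.

```python
def build_alpaca_prompt(instruction: str, user_input: str = "") -> str:
    # Assemble an Alpaca-style prompt matching the trained instruction format.
    prompt = f"### Instruction:\n{instruction}\n\n"
    if user_input:
        # Assumption: the Input block is only included when an input is provided.
        prompt += f"### Input:\n{user_input}\n\n"
    prompt += "### Response:\n"
    return prompt

print(build_alpaca_prompt("Summarize the passage.", "Speculative decoding speeds up LLM inference."))
```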