Accurate FP8 quantized models by Neural Magic, ready for use with vLLM!
- neuralmagic/Meta-Llama-3-8B-Instruct-FP8 (Text Generation)
- neuralmagic/Meta-Llama-3-8B-Instruct-FP8-KV (Text Generation)
- neuralmagic/Meta-Llama-3-70B-Instruct-FP8 (Text Generation)
- neuralmagic/Meta-Llama-3-70B-Instruct-FP8-KV (Text Generation)
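The models above can be loaded directly with vLLM, which reads the FP8 quantization settings from each checkpoint's config. A minimal sketch, assuming the `vllm` package is installed and a CUDA GPU with FP8 support is available; the prompt and sampling parameters are illustrative:

```python
# Model IDs from this collection. The -KV variants are presumably the ones
# with an FP8-quantized KV cache as well (assumption based on the suffix).
MODELS = [
    "neuralmagic/Meta-Llama-3-8B-Instruct-FP8",
    "neuralmagic/Meta-Llama-3-8B-Instruct-FP8-KV",
    "neuralmagic/Meta-Llama-3-70B-Instruct-FP8",
    "neuralmagic/Meta-Llama-3-70B-Instruct-FP8-KV",
]


def generate_sample(model_id: str, prompt: str) -> str:
    """Generate one completion from an FP8 checkpoint with vLLM.

    Requires a CUDA-capable GPU; vLLM detects the FP8 quantization
    from the model config, so no extra flags are needed here.
    """
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_id)
    params = SamplingParams(temperature=0.7, max_tokens=64)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```

For example, `generate_sample(MODELS[0], "What is FP8 quantization?")` would serve the 8B instruct model; the 70B variants need correspondingly more GPU memory.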