Will a quantised version be available?
#9, opened by angerhang
Thanks for sharing. What are the recommended ways to quantise this model?
Or will a quantised model be made available, so that inference is less resource-intensive?
Thanks
Did you see this list? https://huggingface.co/models?other=base_model:quantized:nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
Use the model tree section on model pages to see what quantizations are available.
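If you would rather check from a script than from the web page, here is a minimal sketch. It rebuilds the same `base_model:quantized:...` tag that appears in the URL above and passes it to `HfApi.list_models` from the `huggingface_hub` package. The helper names (`quantized_tag`, `listing_url`, `list_quantized`) are made up for this example, and it assumes the API's tag filter matches the same repos as the website's `other=` parameter, which I have not verified for every case.

```python
BASE_MODEL = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"


def quantized_tag(base_model: str) -> str:
    """Build the Hub tag that marks a repo as a quantization of base_model."""
    return f"base_model:quantized:{base_model}"


def listing_url(base_model: str) -> str:
    """Build the model-listing URL that filters on that tag."""
    return "https://huggingface.co/models?other=" + quantized_tag(base_model)


def list_quantized(base_model: str, limit: int = 20) -> list[str]:
    """Return repo ids of quantized variants (needs network access).

    Assumption: filtering the API by this tag mirrors the website listing.
    """
    # Third-party dependency, imported lazily: pip install huggingface_hub
    from huggingface_hub import HfApi

    models = HfApi().list_models(filter=quantized_tag(base_model), limit=limit)
    return [m.id for m in models]


if __name__ == "__main__":
    print(listing_url(BASE_MODEL))
    for repo_id in list_quantized(BASE_MODEL):
        print(repo_id)
```

Each repo id it prints is a community upload, so check the model card for the quantization format and the runtime it targets before downloading.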