Bitwidth for LM Head

#3
by denru - opened

What is the bitwidth used to quantize the LM head? Thanks!

It's the default, 6 bpw (bits per weight). If in doubt, check the "quantization_config" key in config.json, specifically its "head_bits" field.
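
For anyone who wants to check this programmatically, here is a minimal sketch. It assumes you have a local copy of the model's config.json; the helper name `read_head_bits` is made up for illustration and is not part of any library. It only reads the "quantization_config" and "head_bits" keys mentioned above and returns None if either is missing.

```python
import json
from pathlib import Path


def read_head_bits(config_path):
    """Return quantization_config["head_bits"] from a config.json, or None if absent.

    Hypothetical helper for illustration; only the key names come from the thread.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    # "quantization_config" may be missing or null on unquantized models.
    quant_config = config.get("quantization_config") or {}
    return quant_config.get("head_bits")
```

Calling `read_head_bits("path/to/model/config.json")` on this model should return the value described in the reply; a result of None would suggest the key is not present in that file.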
