Why this file size?
#1 · by eurotaku · opened
The FP16 fix only means the model can be used with FP16 AMP (i.e., computation runs in FP16 and the hidden states are FP16).
It does not mean the weights are stored in FP16.
For the original model you could even store the weights in FP8, as long as the computation is done in BF16.
So I provide both FP32 and FP16 weight versions. Both can be used with FP16 computation.
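A minimal PyTorch sketch of this distinction, using a plain `torch.nn.Linear` as a stand-in for the actual checkpoint (the module, shapes, and device here are illustrative assumptions, not from the release): the dtype the weights are stored in is independent of the dtype autocast uses for computation.

```python
import torch

device = "cuda"  # autocast with float16 requires a GPU

# Stand-in module; the point is only that storage dtype and
# computation dtype are two separate choices.
model_fp32 = torch.nn.Linear(1024, 1024).to(device)          # weights stored in FP32
model_fp16 = torch.nn.Linear(1024, 1024).to(device).half()   # weights stored in FP16

x = torch.randn(1, 1024, device=device)

# FP16 AMP: regardless of how the weights are stored, autocast runs the
# matmul in FP16, so the hidden states inside this region are FP16.
with torch.autocast(device_type=device, dtype=torch.float16):
    y32 = model_fp32(x)  # FP32 storage, FP16 compute
    y16 = model_fp16(x)  # FP16 storage, FP16 compute

print(y32.dtype, y16.dtype)  # torch.float16 torch.float16
```

The FP32 version is roughly twice the file size of the FP16 version, which is the size difference being asked about; either one produces FP16 activations under AMP.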
KBlueLeaf changed discussion status to closed