VRAM need for inference
#5
by RizSoto - opened
Needs information about the amount of VRAM required for inference.
Hello,
I want to know how much VRAM is needed to run inference with Phitral 4x2_8.
If I understand correctly, the model has 7.81B parameters in total but only about 4.46B active parameters per token.
So would the VRAM requirement be around 2 × 4.46 = 8.92 GB for FP16 inference?
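The 2-bytes-per-parameter arithmetic can be sketched as follows. One caveat worth checking: for a mixture-of-experts model, the inactive experts' weights usually still have to be resident in VRAM (only compute is reduced, not memory), so the total parameter count may be the relevant figure. The parameter counts below are the ones quoted in the question; the function name is just for illustration.

```python
# Back-of-the-envelope weights-only VRAM estimate for FP16 inference.
# Assumption (not confirmed by the thread): for a MoE model, all
# experts' weights stay resident in VRAM, so total params drive memory.

def fp16_weight_vram_gb(num_params_billion: float) -> float:
    """Weights-only VRAM in decimal GB: 2 bytes per FP16 parameter."""
    bytes_total = num_params_billion * 1e9 * 2
    return bytes_total / 1e9

total_params_b = 7.81   # total parameters (figure from the question)
active_params_b = 4.46  # active parameters per token

print(f"active-only estimate:  {fp16_weight_vram_gb(active_params_b):.2f} GB")
print(f"all weights resident:  {fp16_weight_vram_gb(total_params_b):.2f} GB")
```

Note that this covers weights only; activations and the KV cache add overhead on top, so the measured figure will be somewhat higher.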
I'd be interested in the figures too! You can check the VRAM usage in Colab using the inference notebook provided in the README.