Any idea how much VRAM does this use ?

#12
by Teddydj - opened

Hello,
A 4090 can run WizardLM 33B, but for WizardCoder 34B it doesn't seem to be enough. Does anyone know how much VRAM WizardCoder 34B uses?
I can't find the information.

Thank you

So no point in trying with a 4080? Sounds very sad :)

This repo seems to have a better version / more info: https://huggingface.co/TheBloke/WizardCoder-Python-34B-V1.0-GGUF

Yep, I'm using the GGUF Q4_K_M version, as it fits fully in VRAM (all layers) and has lower perplexity than any of the obsolete GPTQ versions.
With an RTX 3090 I get around 30 t/s.
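For anyone landing here with the same question, a rough back-of-envelope sketch of the VRAM math (the function name and the 2 GB overhead figure are my own assumptions, not official numbers; real usage also depends on context length and KV cache):

```python
# Back-of-envelope VRAM estimate for an LLM: weights take roughly
# params * bits_per_weight / 8 bytes, plus some overhead for the
# KV cache and activations (assumed ~2 GB here, a rough guess).

def estimate_vram_gb(params_billions: float, bits_per_weight: float,
                     overhead_gb: float = 2.0) -> float:
    """Return an approximate VRAM requirement in GB."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes / 1e9 + overhead_gb

# 34B at fp16: far beyond any single 24 GB card
print(estimate_vram_gb(34, 16))   # → 70.0

# 34B at ~4.5 bits/weight (roughly Q4_K_M): fits on a 24 GB 3090/4090
print(estimate_vram_gb(34, 4.5))  # → 21.125
```

This is why the unquantized model won't load on a 4090 (24 GB) but the Q4_K_M GGUF fits with all layers offloaded; a 16 GB 4080 would still need a smaller quant or partial CPU offload.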

WizardLM changed discussion status to closed
