Awesome model!!!
Excellent! Thanks for reporting. That's a good test!
May I know the memory requirement to run this model on GPU?
I assume a minimum of 17 GB of video memory.
Currently I'm on 32 GB RAM with a GTX 1080 Ti (11 GB VRAM).
When I try to load this model in text-generation-webui, it fills the RAM (with a bit of memory swapping) and then crashes.
How much RAM is required to run this, and is it possible to run in CPU-only mode? (Selecting CPU under the model settings in text-generation-webui didn't help.)
The following model works fine on GPU:
TheBloke_stable-vicuna-13B-GPTQ
I'm getting about 8 tokens/sec.
I tried the same question as in this thread.
Thanks
You need to choose a GGML version to run on CPU; GPTQ is GPU-only. This model requires at least 18 GB of VRAM to load, and usage grows to around 21 GB after several chats, so I suggest using a GPU with at least 24 GB of VRAM.
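If you grab a GGML build, a CPU-only launch would look roughly like this (a minimal sketch; the model filename is a hypothetical placeholder, and the flags assume a recent text-generation-webui checkout):

```sh
# CPU-only inference with a GGML model in text-generation-webui.
# The model filename below is a hypothetical placeholder.
python server.py --cpu --model wizard-vicuna-13B.ggml.q4_0.bin --threads 8
```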
You need to confine the usage of VRAM and leave some VRAM for chat to get rid of "CUDA: OUT OF MEMEROY". In oob-webui that's --gpu-memory 8 (8 is an example) This will decrease the generating speed, but decrease the demand of VRAM.
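As a concrete sketch (the model directory name is a hypothetical placeholder):

```sh
# Cap the model's GPU allocation at roughly 8 GiB; weights that don't fit
# stay in system RAM, which is slower but avoids CUDA OOM errors.
# The model name below is a hypothetical placeholder.
python server.py --model TheBloke_some-13B-GPTQ --gpu-memory 8
```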