cognitivecomputations/Wizard-Vicuna-30B-Uncensored · Always getting Segmentation Fault

Hey, I'm trying to run this model with the transformers pipeline API, and I have also tried loading it manually with LlamaForCausalLM. No matter what I try, I always get a segmentation fault.

Here's my current code:

import os

# Set the cache location before importing transformers; it is read at import time.
os.environ["HF_HOME"] = "D:\\transformers_cache"
os.environ["TRANSFORMERS_CACHE"] = "D:\\transformers_cache"

from transformers import pipeline

model = "ehartford/Wizard-Vicuna-30B-Uncensored"

# Build a text-generation pipeline for the model.
generator = pipeline("text-generation", model=model)

prompt = "Who are you?"

print(generator(prompt))

I have CUDA Toolkit installed and everything should be up to date.
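
In case memory is a factor here, this is a rough back-of-the-envelope estimate of how much memory the weights alone would need at different precisions. It is only a sketch: the 30 billion parameter count is assumed from the model name rather than measured, `weight_memory_gib` is a hypothetical helper written for this post, and the figures ignore activations, the KV cache, and any loading overhead.

```python
# Bytes needed to store a single parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in GiB (weights only)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Assumption: roughly 30e9 parameters, taken from the "30B" in the model name.
ASSUMED_PARAMS = 30e9

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weight_memory_gib(ASSUMED_PARAMS, dtype):.0f} GiB")
```

If the float32 figure is well above the RAM and VRAM available on the machine, running out of memory while loading seems like one plausible cause worth ruling out, though I have not confirmed that it explains the segmentation fault.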

Any ideas on how to solve this?