Using the model in ctransformers
Hey bartowski,
I am trying to use your quantized version in ctransformers but I can't get it to load correctly.
```python
from ctransformers import AutoModelForCausalLM

try:
    llm = AutoModelForCausalLM.from_pretrained(
        "bartowski/Llama-3.1-SauerkrautLM-8b-Instruct-GGUF",
        model_file="Llama-3.1-SauerkrautLM-8b-Instruct-Q6_K_L.gguf",
        model_type="llama",
        gpu_layers=0,
        context_length=2048
    )
    print("Model loaded successfully")
except Exception as e:
    print(f"Error loading model: {e}")
```
This will keep throwing: "RuntimeError: Failed to create LLM 'llama2' from '/home/jovyan/.cache/huggingface/hub/models--bartowski--Llama-3.1-SauerkrautLM-8b-Instruct-GGUF/blobs/7a5b0d4528966fd00dd378b8edf40e96dd65e839078feac7d4f4ab383fbe551b'."
So, the model_type seems to be the issue.
It throws the same error for "llama2", "llama3" and None. Google and Copilot could not help me.
What do I have to use as the model type?
Thanks!
it's possible ctransformers was never updated with llama3 support, as there haven't been any commits to it since September of last year :(
maybe try llama-cpp-python?
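something along these lines should work there (untested sketch, same repo and file as above; adjust n_gpu_layers and n_ctx to your setup):

```python
# untested sketch: loading the same GGUF with llama-cpp-python instead of ctransformers
# pip install llama-cpp-python huggingface_hub
from llama_cpp import Llama

# Llama.from_pretrained downloads the file from the Hugging Face Hub cache
llm = Llama.from_pretrained(
    repo_id="bartowski/Llama-3.1-SauerkrautLM-8b-Instruct-GGUF",
    filename="Llama-3.1-SauerkrautLM-8b-Instruct-Q6_K_L.gguf",
    n_ctx=2048,       # context length
    n_gpu_layers=0,   # CPU only; raise this to offload layers to the GPU
)

# chat-style generation; the chat template comes from the GGUF metadata
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, who are you?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```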
All right, thanks for the quick reply. I'll try!