Weight Error in Notebook
Getting this error:
ValueError: Trying to set a tensor of shape torch.Size([1024, 3072]) in "weight" (which has shape torch.Size([768, 3072])), this look incorrect.
while running the default script:
import torch
from transformers import AutoTokenizer, LlamaForCausalLM
# Load the tokenizer and model
model_path = "nvidia/Llama3.1-Minitron-4B-Width-Base"
tokenizer = AutoTokenizer.from_pretrained(model_path)
device = 'cuda'
dtype = torch.bfloat16
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=dtype, device_map=device)
# Prepare the input text
prompt = 'Complete the paragraph: our solar system is'
inputs = tokenizer.encode(prompt, return_tensors='pt').to(model.device)
# Generate the output
outputs = model.generate(inputs, max_length=20)
# Decode and print the output
output_text = tokenizer.decode(outputs[0])
print(output_text)
The issue is probably related to grouped-query attention (GQA).
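For context, here is a minimal sketch of how a [1024, 3072] vs [768, 3072] mismatch on the k_proj/v_proj weights can arise when the library ignores an explicit head_dim in the config. The config values below are assumptions inferred from the shapes in the error message, not read from the actual model card:
# Sketch only: illustrative config values, assumed from the error shapes.
hidden_size = 3072
num_attention_heads = 32
num_key_value_heads = 8
head_dim = 128  # assumed to be set explicitly in the width-pruned config
# Checkpoint weight: k_proj is [num_key_value_heads * head_dim, hidden_size]
print(num_key_value_heads * head_dim, hidden_size)   # 1024 3072
# An older transformers build derives head_dim from hidden_size instead:
derived_head_dim = hidden_size // num_attention_heads  # 3072 // 32 = 96
print(num_key_value_heads * derived_head_dim, hidden_size)  # 768 3072
If those assumptions hold, the library allocates a [768, 3072] tensor but the checkpoint supplies [1024, 3072], which is exactly the reported error.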
The pull requests to support this model in Hugging Face Transformers are currently under review.
Follow the installation instructions below:
Fetch PR 32502
$ git clone -b suhara/llama-kv-channels --single-branch https://github.com/suhara/transformers.git && cd transformers
Fetch changes from PR 32495
$ git fetch https://github.com/suiyoubi/transformers.git aot/head_dim_rope && git cherry-pick FETCH_HEAD --strategy-option theirs
Install transformers
$ pip install -e .
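After installing, a quick sanity check (just a sketch, not an official verification step) is to confirm that the loaded config exposes an explicit head_dim before loading the full model:
from transformers import AutoConfig
# If head_dim prints as None, the installed build likely still derives it
# from hidden_size // num_attention_heads and the error will persist.
config = AutoConfig.from_pretrained("nvidia/Llama3.1-Minitron-4B-Width-Base")
print(getattr(config, "head_dim", None))
print(config.hidden_size, config.num_attention_heads, config.num_key_value_heads)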
I'll subscribe to that.
@atharvanighot are you still getting this error? The installation instructions have been updated - you no longer need to fetch these PRs manually:
pip install git+https://github.com/huggingface/transformers
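If the error still appears after upgrading, it can help to confirm which transformers build the notebook kernel is actually importing (a sketch only, useful when several environments are involved):
import transformers
# Print the version and the install path of the imported package,
# to rule out a stale environment or kernel picking up an old build.
print(transformers.__version__, transformers.__file__)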
I tried it once again without manually fetching the PRs. I made sure to upgrade transformers, but I'm still getting this error:
ValueError: Trying to set a tensor of shape torch.Size([1024, 3072]) in "weight" (which has shape torch.Size([768, 3072])), this look incorrect.
I'll try it again later by fetching PRs manually.
same error
@atharvanighot, @Tiz01
The depth-pruned model works fine, though. You could try using it instead.
Still the same issue today with the latest transformers package.