Weight Error in Notebook

#5
by atharvanighot - opened

Getting this error:

ValueError: Trying to set a tensor of shape torch.Size([1024, 3072]) in "weight" (which has shape torch.Size([768, 3072])), this look incorrect.

while running default script:

import torch
from transformers import AutoTokenizer, LlamaForCausalLM

# Load the tokenizer and model
model_path = "nvidia/Llama3.1-Minitron-4B-Width-Base"
tokenizer = AutoTokenizer.from_pretrained(model_path)

device = 'cuda'
dtype = torch.bfloat16
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=dtype, device_map=device)

# Prepare the input text
prompt = 'Complete the paragraph: our solar system is'
inputs = tokenizer.encode(prompt, return_tensors='pt').to(model.device)

# Generate the output
outputs = model.generate(inputs, max_length=20)

# Decode and print the output
output_text = tokenizer.decode(outputs[0])
print(output_text)

Probably an issue with grouped-query attention (GQA).
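A rough sketch of where the two shapes in the error could come from. The config values below are assumptions for illustration (check the model's config.json for the real ones), and `kv_proj_shape` is a hypothetical helper, not a Transformers function. The idea: if the loading code derives the head size as hidden_size // num_attention_heads instead of reading an explicit head_dim from the config, the key/value projection it builds is smaller than the one stored in the checkpoint.

```python
# Assumed config values for illustration only; verify against config.json.
hidden_size = 3072
num_attention_heads = 32
num_key_value_heads = 8   # GQA: fewer key/value heads than query heads
explicit_head_dim = 128   # assumed head size used when the checkpoint was saved


def kv_proj_shape(hidden_size, num_kv_heads, head_dim):
    """Hypothetical helper: (out_features, in_features) of a k_proj/v_proj weight."""
    return (num_kv_heads * head_dim, hidden_size)


# Head size derived from hidden_size, ignoring any explicit head_dim.
derived_head_dim = hidden_size // num_attention_heads

expected_by_model = kv_proj_shape(hidden_size, num_key_value_heads, derived_head_dim)
stored_in_checkpoint = kv_proj_shape(hidden_size, num_key_value_heads, explicit_head_dim)

print("derived head_dim:", derived_head_dim)    # 96
print("model expects:", expected_by_model)      # (768, 3072)
print("checkpoint has:", stored_in_checkpoint)  # (1024, 3072)
```

Under these assumptions the numbers match the error message: the model allocates a [768, 3072] weight while the checkpoint holds a [1024, 3072] tensor.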

@atharvanighot

The pull requests to support this model in Hugging Face Transformers are currently under review.
Follow the installation instructions below:

Fetch PR 32502

$ git clone -b suhara/llama-kv-channels --single-branch https://github.com/suhara/transformers.git && cd transformers

Fetch changes from PR 32495

$ git fetch https://github.com/suiyoubi/transformers.git aot/head_dim_rope && git cherry-pick FETCH_HEAD --strategy-option theirs

Install transformers

$ pip install -e .

Will subscribe to that

NVIDIA org

@atharvanighot are you still getting this error? The installation instructions have been updated - you no longer need to fetch these PRs manually:

pip install git+https://github.com/huggingface/transformers

@srvm

I tried it once again without manually fetching the PRs. I made sure to upgrade transformers, but I'm still getting this error:

ValueError: Trying to set a tensor of shape torch.Size([1024, 3072]) in "weight" (which has shape torch.Size([768, 3072])), this look incorrect.

I'll try it again later by fetching PRs manually.
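In case it helps anyone debugging the same thing, here is a rough check of which transformers build the running Python process actually sees. It is only a heuristic: it assumes that a build with the fix exposes `head_dim` as a named parameter of `LlamaConfig.__init__`, and `declares_head_dim` is a hypothetical helper written for this post. In a notebook, the kernel has to be restarted after upgrading, otherwise the previously imported version stays loaded.

```python
import inspect


def declares_head_dim(config_cls):
    """Hypothetical helper: True if config_cls.__init__ names a `head_dim` parameter."""
    return "head_dim" in inspect.signature(config_cls.__init__).parameters


try:
    import transformers
    from transformers import LlamaConfig

    # Version and location of the package that is actually imported.
    print("transformers version:", transformers.__version__)
    print("loaded from:", transformers.__file__)
    # Heuristic (assumption): builds with the fix list head_dim explicitly.
    print("LlamaConfig declares head_dim:", declares_head_dim(LlamaConfig))
except ImportError:
    print("transformers is not installed in this environment")
```

If the last line prints False, the interpreter is probably still using an older install, whatever pip reported.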

same error

atharvanighot changed discussion status to closed
atharvanighot changed discussion status to open

@atharvanighot , @Tiz01
The depth-pruned model works fine, though. You could try using it instead.

Still the same issue today with the latest transformers package.
