Inference error: tensor shape mismatch
Hi everyone, and thanks @TheBloke for your great work.
I'm trying to run inference with TheBloke/Llama-2-70B-chat-GPTQ and I get the following error:
out = out + self.bias if self.bias is not None else out
RuntimeError: The size of tensor a (24576) must match the size of tensor b (10240) at non-singleton dimension 2
At first I thought it was an installation problem, but my code works fine with TheBloke/Llama-2-13B-chat-GPTQ... It also occurs with FreeWilly2, maybe because it's based on Llama-2-70B.
Any help would be appreciated.
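For reference, this is roughly how I'm loading the model (a minimal sketch; the prompt and generation settings are illustrative, my actual script is longer):

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name = "TheBloke/Llama-2-70B-chat-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_name,
    use_safetensors=True,
    device="cuda:0",
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to("cuda:0")
# The RuntimeError above is raised inside this forward pass
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```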
Can you check the sha256sum of the .safetensors file, or just try downloading the model again? The download may have terminated early, leaving you with a truncated, invalid file.
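For example (a minimal sketch; compare the printed hash against the SHA256 listed on the model page, and point it at the .safetensors file you actually downloaded):

```python
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 of a file without loading it fully into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Adjust the path to your local copy of the model weights
print(sha256sum("path/to/model.safetensors"))
```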
Also, please confirm you're using Transformers 4.31.0, which is required for 70B.
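Incidentally, the two sizes in your traceback are consistent with missing 70B support: Llama-2-70B uses grouped-query attention, so a fused QKV projection is 8192 + 2 × 1024 = 10240 wide, whereas code without GQA support expects 3 × 8192 = 24576. You can check the installed version from Python:

```python
import transformers
from packaging import version  # packaging ships as a Transformers dependency

installed = version.parse(transformers.__version__)
print(installed)
# Llama-2-70B needs the grouped-query attention support added in 4.31.0
assert installed >= version.parse("4.31.0"), "run: pip install -U transformers"
```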
Thank you so much @TheBloke! It was the Transformers version; I thought I had the newest one!
Regards!
I'm facing a similar problem when fine-tuning with AutoGPTQ. Did you manage to solve it?
@tridungduong16 Make sure you're using AutoGPTQ 0.3.2 + Transformers 4.31.0.
@tridungduong16 I'm confused: you said you had a problem with AutoGPTQ, but your error screenshot shows ExLlama, not AutoGPTQ?
If you're using ExLlama, please make sure it's updated to the latest version. This model definitely works with ExLlama, so you might have an older version that doesn't support 70B.
Sorry, I attached the wrong screenshot. I'm using the fine-tuning script from https://github.com/PanQiWei/AutoGPTQ/blob/main/examples/peft/peft_lora_clm_instruction_tuning.py.
It works well for 13B models such as:
- https://huggingface.co/TheBloke/Llama-2-13B-GPTQ
- https://huggingface.co/TheBloke/OpenAssistant-Llama2-13B-Orca-8K-3319-GGML
but when I fine-tune the 70B model, there are problems.
The library versions I'm using are:
- transformers.__version__: '4.32.0.dev0'
- auto_gptq.__version__: '0.3.2'
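For context, here is the model-loading part of that script, stripped down (a sketch based on the linked example; GPTQLoraConfig and get_gptq_peft_model are AutoGPTQ's peft utilities as I understand them in 0.3.x, the LoRA hyperparameters are illustrative, and the only change versus the working 13B runs is the repo name):

```python
from auto_gptq import AutoGPTQForCausalLM
from auto_gptq.utils.peft_utils import GPTQLoraConfig, get_gptq_peft_model
from peft import TaskType

# Load the quantized base model in trainable mode
model = AutoGPTQForCausalLM.from_quantized(
    "TheBloke/Llama-2-70B-GPTQ",  # swapping in the 70B repo is the only change
    use_triton=True,
    trainable=True,
)

# Wrap the quantized model with LoRA adapters
peft_config = GPTQLoraConfig(
    r=16,                         # illustrative hyperparameters
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False,
)
model = get_gptq_peft_model(
    model,
    peft_config=peft_config,
    auto_find_all_linears=True,   # attach LoRA to all quantized linear layers
    train_mode=True,
)
model.print_trainable_parameters()
```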