Warning during loading of the model
Some weights of ItaliaForCausalLM were not initialized from the model checkpoint at iGeniusAI/Italia-9B-Instruct-v0.1 and are newly initialized: ['embed_out.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
FYI
This should not be a problem, given the order in which the weights are loaded. You can verify that the weights were loaded correctly by downloading the file "model-00008-of-00008.safetensors" (which contains the embed_out weights) and running the following locally:
import torch
from transformers import AutoModelForCausalLM
from safetensors.torch import load_file as load_file_safetensor

# Load the model from the Hub (this is where the warning appears).
model = AutoModelForCausalLM.from_pretrained("iGeniusAI/Italia-9B-Instruct-v0.1", trust_remote_code=True)
# Load the shard that stores the embed_out parameters.
m = load_file_safetensor('model-00008-of-00008.safetensors')
# Element-wise comparison of the shard's bias with the bias actually loaded into the model.
all(torch.eq(m['embed_out.bias'].data, model.embed_out.bias.data))
You should get True despite the warning during the loading process. (PS: tested with transformers 4.41.2)
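If you prefer not to download the shard by hand, it can also be fetched programmatically; this is just a convenience sketch, assuming huggingface_hub is installed (it ships as a transformers dependency):
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file as load_file_safetensor

# Download only the shard that contains embed_out and reuse it in the check above.
shard_path = hf_hub_download("iGeniusAI/Italia-9B-Instruct-v0.1", "model-00008-of-00008.safetensors")
m = load_file_safetensor(shard_path)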
The issue is due to the missing line
"embed_out.bias": "model-00008-of-00008.safetensors"
in model.safetensors.index.json.
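You can confirm this by inspecting the index file yourself; a minimal sketch, assuming model.safetensors.index.json has been downloaded to the working directory:
import json

with open("model.safetensors.index.json") as f:
    weight_map = json.load(f)["weight_map"]

# On the broken index this prints False; with the fixed index it prints True.
print("embed_out.bias" in weight_map)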
However, the missing entry does affect loading with the load_in_8bit and load_in_4bit arguments: in that case, the bias of embed_out is initialized to 0.
Therefore, we have uploaded the correct version of model.safetensors.index.json to eliminate the warning and fix the loading issue when using load_in_8bit and load_in_4bit.
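To double-check the quantized path after the fix, something like the following should show a non-zero bias (a rough sketch, assuming bitsandbytes and a CUDA GPU are available; before the fix the same check would print 0):
import torch
from transformers import AutoModelForCausalLM

# Load in 8-bit; with the fixed index, embed_out.bias comes from the checkpoint
# instead of being zero-initialized.
model_8bit = AutoModelForCausalLM.from_pretrained(
    "iGeniusAI/Italia-9B-Instruct-v0.1",
    trust_remote_code=True,
    load_in_8bit=True,
)
print(torch.count_nonzero(model_8bit.embed_out.bias))  # expected: > 0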
Thank you for bringing this to our attention.
Hi guys,
I am trying to convert the model's weights to the litgpt format, but I ran into the same problem while converting. I think there is something wrong with model-00008-of-00008 that needs to be fixed.
I have copied the error message here:
Processing model-00008-of-00008.bin
Traceback (most recent call last):
File "/home/.../miniconda3/envs/llama3_1/bin/litgpt", line 8, in
sys.exit(main())
^^^^^^
File "/home/..../miniconda3/envs/llama3_1/lib/python3.12/site-packages/litgpt/main.py", line 143, in main
fn(**kwargs)
File "/home/..../miniconda3/envs/llama3_1/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/..../miniconda3/envs/llama3_1/lib/python3.12/site-packages/litgpt/scripts/convert_hf_checkpoint.py", line 348, in convert_hf_checkpoint
copy_fn(sd, hf_weights, saver=saver, dtype=dtype)
File "/home/..../miniconda3/envs/llama3_1/lib/python3.12/site-packages/litgpt/scripts/convert_hf_checkpoint.py", line 53, in copy_weights_gpt_neox
to_name = weight_map[name]
~~~~~~~~~~^^^^^^
KeyError: 'embed_out.bias'