Wrong model upload?
Dear authors,
Thank you for sharing this model and for the scientific contribution of your paper. I have been experimenting with the model, but its outputs are consistently unreliable. Upon investigation, the problem appears to lie in the configuration file, which specifies the architecture as LlamaModel instead of LlamaForCausalLM. As a result, the language-modeling head weights seem to be missing, which would explain the inconsistent outputs.
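For anyone who wants to reproduce the check, here is a minimal sketch of how the declared architecture can be read from the contents of a config.json. The helper names and the example JSON string are illustrative assumptions, not taken from the repository; only the `architectures` field and the two class names come from the messages quoted in this post.

```python
import json


def declared_architectures(config_text: str) -> list[str]:
    """Return the `architectures` list from the text of a config.json."""
    return json.loads(config_text).get("architectures", [])


def declares_causal_lm(config_text: str) -> bool:
    # LlamaForCausalLM is the class that includes lm_head;
    # LlamaModel is only the decoder stack without the output head.
    return "LlamaForCausalLM" in declared_architectures(config_text)


# Hypothetical config contents mirroring the architecture vLLM reports.
example_config = '{"architectures": ["LlamaModel"], "model_type": "llama"}'

print(declared_architectures(example_config))
print(declares_causal_lm(example_config))
```

If the second call prints False for the real config.json, the file declares only the headless model.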
Additionally, when I load the model with the Transformers library, I get the following warning:
Some weights of LlamaForCausalLM were not initialized from the model checkpoint at Dongwei/Rationalyst_reasoning_datasets and are newly initialized: ['lm_head.weight']
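To tell whether lm_head.weight is truly absent from the uploaded weights (rather than stored under a different name), the tensor names can be listed directly from the checkpoint. The sketch below assumes the weights are in safetensors format and available as a local file; I have not confirmed either for this repository, and the function names are hypothetical. It relies only on the safetensors layout: an 8-byte little-endian header length followed by a JSON header keyed by tensor name.

```python
import json
import struct


def safetensors_tensor_names(path: str) -> list[str]:
    """List tensor names stored in a .safetensors file by reading its header."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional non-tensor entry in the header.
    return [name for name in header if name != "__metadata__"]


def has_lm_head(path: str) -> bool:
    """Return True if the file contains the lm_head.weight tensor."""
    return "lm_head.weight" in safetensors_tensor_names(path)
```

For a sharded checkpoint, the same check would need to be run on each shard file.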
When I attempt to load the model using the vLLM library, I encounter the following error:
ValueError: Model architectures ['LlamaModel'] are not supported for now. Supported architectures: ['AquilaModel', 'AquilaForCausalLM', 'BaiChuanForCausalLM', 'BaichuanForCausalLM', 'BloomForCausalLM', 'ChameleonForCausalLM', 'ChameleonForConditionalGeneration', 'ChatGLMModel', 'ChatGLMForConditionalGeneration', 'CohereForCausalLM', 'DbrxForCausalLM', 'DeciLMForCausalLM', 'DeepseekForCausalLM', 'DeepseekV2ForCausalLM', 'FalconForCausalLM', 'FuyuForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTJForCausalLM', 'GPTNeoXForCausalLM', 'InternLMForCausalLM', 'InternLM2ForCausalLM', 'JAISLMHeadModel', 'LlamaForCausalLM', 'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'LLaMAForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MptForCausalLM', 'MPTForCausalLM', 'MiniCPMForCausalLM', 'OlmoForCausalLM', 'OPTForCausalLM', 'OrionForCausalLM', 'PersimmonForCausalLM', 'PaliGemmaForConditionalGeneration', 'PhiForCausalLM', 'Phi3ForCausalLM', 'Phi3VForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'RWForCausalLM', 'StableLMEpochForCausalLM', 'StableLmForCausalLM', 'Starcoder2ForCausalLM', 'ArcticForCausalLM', 'XverseForCausalLM', 'Phi3SmallForCausalLM', 'MedusaModel', 'MLPSpeculatorPreTrainedModel', 'JambaForCausalLM', 'MistralModel']
Has the model been saved incorrectly? Could you provide a code example demonstrating how to load the model correctly?
Thank you for your support.
Best regards