Wrong model upload?

#3
by flaviomerenda - opened

Dear authors,

Thank you for sharing this model and for the scientific contribution of your paper. I have been experimenting with this model, but its outputs are consistently unreliable. The cause appears to be the configuration file, which declares the architecture as LlamaModel instead of LlamaForCausalLM. As a result, the checkpoint seems to be missing the language-modeling head weights, which would explain the inconsistent outputs.
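For anyone who wants to reproduce the configuration check, here is a minimal sketch. It assumes the repository's config.json has been downloaded locally; the helper names and the local path are hypothetical and not part of any library.

```python
import json
from pathlib import Path


def declared_architectures(config_path):
    """Return the `architectures` list from a local config.json (empty if absent)."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return config.get("architectures") or []


def declares_causal_lm(config_path):
    """True if any declared architecture name ends with 'ForCausalLM'."""
    return any(
        name.endswith("ForCausalLM") for name in declared_architectures(config_path)
    )
```

With a config that lists only LlamaModel, `declares_causal_lm("config.json")` returns False, which matches what I see for this checkpoint.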

Additionally, when I load the model with the Transformers library, I get the following warning:

Some weights of LlamaForCausalLM were not initialized from the model checkpoint at Dongwei/Rationalyst_reasoning_datasets and are newly initialized: ['lm_head.weight']
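The message above suggests that lm_head.weight is simply not stored in the checkpoint. A rough way to confirm this without loading the model is to look at the tensor names listed in the shard index. This is only a sketch: it assumes the checkpoint is sharded and ships a model.safetensors.index.json with a `weight_map` entry, which may not hold for this repository, and the helper name is hypothetical.

```python
import json
from pathlib import Path


def lm_head_is_missing(index_path):
    """True if 'lm_head.weight' is not listed in the index's weight_map.

    Assumes `index_path` points to a local model.safetensors.index.json.
    """
    index = json.loads(Path(index_path).read_text(encoding="utf-8"))
    weight_map = index.get("weight_map", {})
    return "lm_head.weight" not in weight_map
```

If this returns True, the head weights were never saved, and Transformers has to initialize them randomly when building LlamaForCausalLM.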

When I attempt to load the model using the vLLM library, I encounter the following error:

ValueError: Model architectures ['LlamaModel'] are not supported for now. Supported architectures: ['AquilaModel', 'AquilaForCausalLM', 'BaiChuanForCausalLM', 'BaichuanForCausalLM', 'BloomForCausalLM', 'ChameleonForCausalLM', 'ChameleonForConditionalGeneration', 'ChatGLMModel', 'ChatGLMForConditionalGeneration', 'CohereForCausalLM', 'DbrxForCausalLM', 'DeciLMForCausalLM', 'DeepseekForCausalLM', 'DeepseekV2ForCausalLM', 'FalconForCausalLM', 'FuyuForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTJForCausalLM', 'GPTNeoXForCausalLM', 'InternLMForCausalLM', 'InternLM2ForCausalLM', 'JAISLMHeadModel', 'LlamaForCausalLM', 'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'LLaMAForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MptForCausalLM', 'MPTForCausalLM', 'MiniCPMForCausalLM', 'OlmoForCausalLM', 'OPTForCausalLM', 'OrionForCausalLM', 'PersimmonForCausalLM', 'PaliGemmaForConditionalGeneration', 'PhiForCausalLM', 'Phi3ForCausalLM', 'Phi3VForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'RWForCausalLM', 'StableLMEpochForCausalLM', 'StableLmForCausalLM', 'Starcoder2ForCausalLM', 'ArcticForCausalLM', 'XverseForCausalLM', 'Phi3SmallForCausalLM', 'MedusaModel', 'MLPSpeculatorPreTrainedModel', 'JambaForCausalLM', 'MistralModel']

Has the model been saved incorrectly? Could you provide a code example demonstrating how to load the model correctly?

Thank you for your support.

Best regards
