Would it be possible to make it compatible with LlavaForConditionalGeneration?

#12
by theblackcat102 - opened

Since transformers now supports LLaVA modeling, would it be possible to make the Yi-VL weights compatible with its weights loader?

At the moment there seem to be some issues loading it with transformers 4.45.2.

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Load the Yi-VL checkpoint through the LLaVA model class
model = LlavaForConditionalGeneration.from_pretrained(
    "01-ai/Yi-VL-6B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="flash_attention_2",
)
Some weights of LlavaForConditionalGeneration were not initialized from the model checkpoint at 01-ai/Yi-VL-6B and are newly initialized:
 ['model.language_model.lm_head.weight', 'model.language_model.model.embed_tokens.weight',
 'model.language_model.model.layers.0.input_layernorm.weight', 'model.language_model.model.layers.0.mlp.down_proj.weight',
 'model.language_model.model.layers.0.mlp.gate_proj.weight', 'model.language_model.model.layers.0.mlp.up_proj.weight',
 'model.language_model.model.layers.0.post_attention_layernorm.weight', 'model.language_model.model.layers.0.self_attn.k_proj.weight',
 'model.language_model.model.layers.0.self_attn.o_proj.weight', 'model.language_model.model.layers.0.self_attn.q_proj.weight',
 'model.language_model.model.layers.0.self_attn.v_proj.weight', 'model.language_model.model.layers.1.input_layernorm.weight',
 'model.language_model.model.layers.1.mlp.down_proj.weight'....

Also, the preprocessor config file is missing, which is a smaller issue.

I assume the mapping is 1:1 from Yi-VL-6B to LLaVA-hf?

  1. Llama module : add a language_model prefix

  2. Vision tower: remove "model." prefix

https://huggingface.co/llava-hf/llava-1.5-7b-hf/blob/main/model.safetensors.index.json
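The two renaming rules above could be sketched roughly as follows. This is only an illustration, not a tested converter: the source key names (`model.layers.`, `model.vision_tower.`, and so on) are assumptions about how the Yi-VL checkpoint is laid out, the helper names `remap_key` and `remap_state_dict` are hypothetical, and the target prefix may need adjusting if the installed transformers version expects a different one (the warning above prints names starting with `model.language_model.`). Keys matching neither rule, such as a projector, are left untouched because the mapping for them is not given here.

```python
# Prefix added to Llama keys (rule 1). Assumed from the post; adjust if needed.
LANGUAGE_PREFIX = "language_model."

# Assumed prefixes of Llama keys in the source checkpoint (not verified).
LLAMA_PREFIXES = ("model.layers.", "model.embed_tokens.", "model.norm.", "lm_head.")

# Assumed prefix of vision tower keys in the source checkpoint (not verified).
VISION_PREFIX = "model.vision_tower."


def remap_key(key: str) -> str:
    """Rename a single checkpoint key according to the two proposed rules."""
    if key.startswith(VISION_PREFIX):
        # Rule 2: drop the leading "model." from vision tower keys.
        return key[len("model."):]
    if key.startswith(LLAMA_PREFIXES):
        # Rule 1: put the language model prefix in front of Llama keys.
        return LANGUAGE_PREFIX + key
    # Anything else is passed through unchanged.
    return key


def remap_state_dict(state_dict: dict) -> dict:
    """Apply remap_key to every key, keeping the tensors as they are."""
    return {remap_key(k): v for k, v in state_dict.items()}
```

Comparing the renamed keys against the index file linked above would show whether the mapping really is 1:1 before any weights are rewritten.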

Hi @theblackcat102 👋, thank you so much for your question. We will conduct an experiment. Thanks again 🙏.
