How to run it?

#1
by guwenyi - opened

Good work, my friend! How do you run it? Would you provide a script for running it? Thank you.

I tried to use the script from https://huggingface.co/mistral-community/pixtral-12b, but got errors. Any suggestions?

Traceback (most recent call last):
  File "/mnt/guwenyi/mistral-community-pixtral/my_eval.py", line 7, in <module>
    model = LlavaForConditionalGeneration.from_pretrained(model_id)
  File "/root/miniconda3/envs/pixtral_8bit/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3354, in from_pretrained
    config, model_kwargs = cls.config_class.from_pretrained(
  File "/root/miniconda3/envs/pixtral_8bit/lib/python3.10/site-packages/transformers/configuration_utils.py", line 610, in from_pretrained
    return cls.from_dict(config_dict, **kwargs)
  File "/root/miniconda3/envs/pixtral_8bit/lib/python3.10/site-packages/transformers/configuration_utils.py", line 772, in from_dict
    config = cls(**config_dict)
  File "/root/miniconda3/envs/pixtral_8bit/lib/python3.10/site-packages/transformers/models/llava/configuration_llava.py", line 104, in __init__
    vision_config = CONFIG_MAPPING[vision_config["model_type"]]
  File "/root/miniconda3/envs/pixtral_8bit/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 695, in __getitem__
    raise KeyError(key)
KeyError: 'pixtral'

Make sure transformers is up to date.
pip install git+https://github.com/huggingface/transformers
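Before re-running, you can check whether your installed transformers build actually registers the pixtral model type. The KeyError in the traceback above means the auto-config mapping predates Pixtral support. A minimal sketch:

```python
# quick sanity check: does the installed transformers build know "pixtral"?
# a KeyError like the one in the traceback above means this prints False
try:
    from transformers.models.auto.configuration_auto import CONFIG_MAPPING
    print("pixtral" in CONFIG_MAPPING)
except ImportError:
    print("transformers is not installed")
```

If it prints False, upgrade with the pip command above and run it again.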

Works great. ~16GB vRAM.

I'm on https://github.com/NielsRogge/transformers/tree/update_pixtral rev 143d012427c78f1798c6a1ebfad487714347afd7

from transformers import LlavaForConditionalGeneration, AutoProcessor

model_id = "DewEfresh/pixtral-12b-8bit"

# inputs for a single-image variant (not used in the multi-image example below)
image_url = "https://m.media-amazon.com/images/I/81tPIA63TrL._AC_SL1500_.jpg"
prompt = "Describe the image in one sentence."

IMG_URLS = [
    "https://picsum.photos/id/237/400/300",
    "https://picsum.photos/id/231/200/300",
    "https://picsum.photos/id/27/500/500",
    "https://picsum.photos/id/17/150/600",
]
# one [IMG] placeholder per entry in IMG_URLS
PROMPT = "<s>[INST]Describe the images.\n[IMG][IMG][IMG][IMG][/INST]"

# if the model ends up on CPU, move it with model.to("cuda")
# (or pass device_map="cuda" to from_pretrained)
model = LlavaForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# the processor fetches the image URLs itself
inputs = processor(images=IMG_URLS, text=PROMPT, return_tensors="pt").to("cuda")
generate_ids = model.generate(**inputs, max_new_tokens=500)
output = processor.batch_decode(
    generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
print(output)
