What is the input payload for a LLaVA predictor on SageMaker? (KeyError: 'input_ids', AWS SageMaker and LLaVA 1.6)
I am trying to deploy this model on SageMaker, something like this:
import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel

sess = sagemaker.Session()
role = sagemaker.get_execution_role()

sagemaker_session_bucket = None
if sagemaker_session_bucket is None and sess is not None:
    sagemaker_session_bucket = sess.default_bucket()

hub = {
    'HF_MODEL_ID': 'llava-hf/llava-v1.6-vicuna-13b-hf',  # model_id from hf.co/models
    'HF_TASK': 'image-to-text',                          # task to use for predictions
    'HF_MODEL_QUANTIZE': 'true'
}

image_uri = "custom_image_uri_with_upgraded_transformers"
instance_type = "ml.p2.xlarge"

huggingface_model = HuggingFaceModel(
    image_uri=image_uri,
    env=hub,
    role=role,
)
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
)
and I am assuming this is how we would run predictions:
image = "https://llava-vl.github.io/static/images/view.jpg"
predictor.predict(data={
    "inputs": image,
    "prompt": "what is this image?"
})
which throws KeyError: 'input_ids'. Below is the complete error log from CloudWatch:
"/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py",
line 258, in handle response = self.transform_fn(*([self.model, input_data,
content_type, accept] + self.transform_extra_arg)) File
"/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py",
line 214, in transform_fn predictions = self.predict(*([processed_data, model] +
self.predict_extra_arg)) File
"/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py",
line 178, in predict prediction = model(inputs) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/image_to_text.py",
line 125, in __call__ return super().__call__(images, **kwargs) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line
1243, in __call__ return self.run_single(inputs, preprocess_params,
forward_params, postprocess_params) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line
1250, in run_single model_outputs = self.forward(model_inputs, **forward_params)
File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py",
line 1150, in forward model_outputs = self._forward(model_inputs,
**forward_params) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/image_to_text.py",
line 180, in _forward inputs = model_inputs.pop(self.model.main_input_name) File
"/opt/conda/lib/python3.10/_collections_abc.py", line 962, in pop value =
self[key] File "/opt/conda/lib/python3.10/collections/__init__.py", line 1106,
in __getitem__ raise KeyError(key) KeyError: 'input_ids'
I am assuming I am sending the wrong payload. I have researched this but couldn't find the exact payload format, though I might be wrong. Any help would be appreciated.
I also tried this payload:
image = "https://llava-vl.github.io/static/images/view.jpg"
predictor.predict(data={
    "inputs": image,
    "prompt": "[INST] <image>\nWhat is shown in this image? [/INST]"
})
This gives the same error as above.
Hi, the "image-to-text" task is not yet officially supported on SageMaker. We plan to move models like LLaVA (and LLaVA-NeXT, Idefics2, PaliGemma, ...) to the "image-text-to-text" task, for which APIs are currently being standardized. For now you'll need a custom deployment on SageMaker (an out-of-the-box deployment is not supported yet).
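In practice, a custom deployment here usually means supplying your own inference script through the Hugging Face inference toolkit's model_fn/predict_fn hooks, which bypasses the image-to-text pipeline entirely. Below is a minimal, untested sketch of such a script; the payload shape ({"image": <base64>, "prompt": ...}) is my own assumption, not an official format, and the LlavaNext* classes require transformers >= 4.39:

# code/inference.py - a sketch of a custom handler, not an official example
import base64
import io

import torch
from PIL import Image
from transformers import LlavaNextForConditionalGeneration, LlavaNextProcessor

MODEL_ID = "llava-hf/llava-v1.6-vicuna-13b-hf"

def model_fn(model_dir):
    # Load the processor and model once per worker
    processor = LlavaNextProcessor.from_pretrained(MODEL_ID)
    model = LlavaNextForConditionalGeneration.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
        device_map="auto",
    )
    return model, processor

def predict_fn(data, model_and_processor):
    model, processor = model_and_processor
    # Assumed payload shape (not an official format):
    # {"image": <base64-encoded image bytes>, "prompt": "[INST] <image>\n... [/INST]"}
    image = Image.open(io.BytesIO(base64.b64decode(data["image"])))
    prompt = data.get("prompt", "[INST] <image>\nWhat is shown in this image? [/INST]")
    inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=200)
    return {"generated_text": processor.decode(output_ids[0], skip_special_tokens=True)}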
Ah, I see. But I think "image-text-to-text" is not yet supported either, right? I tried with that task and it threw an error saying it is not supported. If it is supported, can you share the documentation for its implementation?
@mujammil The "image-text-to-text" pipeline has not been added to transformers yet. I cannot give an exact timeline for adding it; probably a couple of months.
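Until the pipeline lands, the custom-script route above is the workaround. The deployment wiring could look roughly like this (a sketch, assuming the handler is saved as code/inference.py; entry_point and source_dir are standard HuggingFaceModel arguments), and the payload then becomes whatever your predict_fn expects:

from sagemaker.huggingface.model import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    image_uri=image_uri,         # custom image with upgraded transformers, as above
    entry_point="inference.py",  # custom handler replacing the default pipeline path
    source_dir="code",
    role=role,
)
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
)

# Invocation with the assumed payload shape from the handler sketch:
import base64
import requests

img_bytes = requests.get("https://llava-vl.github.io/static/images/view.jpg").content
predictor.predict(data={
    "image": base64.b64encode(img_bytes).decode(),
    "prompt": "[INST] <image>\nWhat is shown in this image? [/INST]",
})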