Error in Colab Notebook

#1
by roydebox - opened

Hi,
Thanks for sharing the model and notebook.
When I tried to run the notebook, I ran into an error as follows:
```
from transformers import LlavaNextVideoProcessor, LlavaNextVideoForConditionalGeneration

ImportError Traceback (most recent call last)
in <cell line: 1>()
----> 1 from transformers import LlavaNextVideoProcessor, LlavaNextVideoForConditionalGeneration

ImportError: cannot import name 'LlavaNextVideoProcessor' from 'transformers' (/usr/local/lib/python3.10/dist-packages/transformers/__init__.py)


NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.


```

Which version of the transformers library includes modules like LlavaNextVideoProcessor?

Please help. Thanks!

Llava Hugging Face org

@roydebox hi, this model is planned to be added in the v4.42 release and thus doesn't work yet. It will probably take one to two weeks; in the meantime, you can try out another video LLM we have, VideoLlava :)
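For reference, here is a minimal sketch of trying VideoLlava in the meantime; the checkpoint id and the dummy clip below are assumptions for illustration, so adapt them to your own setup:

```python
import numpy as np
import torch
from transformers import VideoLlavaProcessor, VideoLlavaForConditionalGeneration

# Checkpoint id is an assumption here; check the VideoLlava model cards for the exact one.
model_id = "LanguageBind/Video-LLaVA-7B-hf"
processor = VideoLlavaProcessor.from_pretrained(model_id)
model = VideoLlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"  # device_map needs accelerate
)

# Dummy clip of 8 RGB frames just to keep the snippet self-contained;
# in practice, decode the frames from your video file instead.
clip = np.random.randint(0, 255, size=(8, 224, 224, 3), dtype=np.uint8)
prompt = "USER: <video>\nDescribe what happens in the video. ASSISTANT:"

inputs = processor(text=prompt, videos=clip, return_tensors="pt").to(model.device, torch.float16)
out = model.generate(**inputs, max_new_tokens=60)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```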

I see. Thank you for your quick response!

roydebox changed discussion status to closed

@roydebox you may use the branch that is in active development. No promises that it will work:

!pip install git+https://github.com/zucchini-nlp/transformers@llava_next_video

Hi, @legraphista .

Thank you very much for your help! That branch has the modules and fixes the error.

When I continued testing the notebook, I ran into another error :) as follows:
```
/usr/local/lib/python3.10/dist-packages/transformers/feature_extraction_utils.py:142: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:274.)
return torch.tensor(value)

ValueError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py in convert_to_tensors(self, tensor_type, prepend_batch_axis)
761 if not is_tensor(value):
--> 762 tensor = as_tensor(value)
763

8 frames
ValueError: expected sequence of length 19 at dim 1 (got 20)

The above exception was the direct cause of the following exception:

ValueError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py in convert_to_tensors(self, tensor_type, prepend_batch_axis)
776 "Please see if a fast version of this tokenizer is available to have this feature available."
777 ) from e
--> 778 raise ValueError(
779 "Unable to create tensor, you should probably activate truncation and/or padding with"
780 " 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your"

ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (input_ids in this case) have excessive nesting (inputs type list where type int is expected).
```

I checked the parameters and the function signature, but couldn't figure out the reason. Could you please help again?



@RaushanTurganbay

BTW, I found two lines that seem to be left over from testing and should be removed:

example = dataset['train'][0]
clip = example["clip"]


roydebox changed discussion status to open

Just add padding=True to the tokenizer call. As for the two lines of code, they seem to be leftovers; you can simply comment them out.
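For anyone hitting the same error, here is a minimal sketch of the processor call with padding enabled; the checkpoint id, prompts, and dummy clips below are placeholders for whatever the notebook actually builds:

```python
import numpy as np
from transformers import LlavaNextVideoProcessor

# Checkpoint id is a placeholder; use the one already loaded in the notebook.
processor = LlavaNextVideoProcessor.from_pretrained("llava-hf/LLaVA-NeXT-Video-7B-hf")

# Two prompts of different token lengths plus matching dummy clips (8 RGB frames each),
# standing in for the real batch built earlier in the notebook.
prompts = [
    "USER: <video>\nWhat is happening? ASSISTANT:",
    "USER: <video>\nDescribe the scene in a few words. ASSISTANT:",
]
clips = [np.random.randint(0, 255, size=(8, 336, 336, 3), dtype=np.uint8) for _ in prompts]

# padding=True pads input_ids to the longest prompt so the batch fits into one tensor,
# which is what the "expected sequence of length 19 ... (got 20)" ValueError is about.
inputs = processor(text=prompts, videos=clips, padding=True, return_tensors="pt")
print(inputs["input_ids"].shape)
```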

Llava Hugging Face org

Right, the notebook might also contain errors, as it hasn't been tested yet.

@legraphista

'padding=True' fixes that error. :)

Thank you very much!

roydebox changed discussion status to closed
