How to increase num_frames?

by khangnguyen2907 - opened Jan 10

Jan 10

Hi, I'm new to video tasks. I noticed that the model processes only 8 frames per video (see config.json). That seems quite limiting, because a lot of information may be lost for videos that are minutes long. Do we need to pre-train the model from scratch to increase the number of frames? Could you please give me some advice? I would appreciate it. Thank you in advance.
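To put the concern in numbers, here is a minimal sketch of how sparse 8 frames are on a long clip. It assumes the frames are sampled uniformly across the video, which this thread does not confirm. The helper name `uniform_frame_indices`, the 3-minute length, and the 30 fps rate are all hypothetical values chosen for illustration.

```python
import numpy as np

def uniform_frame_indices(total_frames: int, num_frames: int = 8) -> np.ndarray:
    """Pick num_frames indices spread evenly over [0, total_frames - 1].

    Assumption: uniform sampling. The actual sampling strategy of the
    model is not stated in this thread.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    return np.linspace(0, total_frames - 1, num_frames).round().astype(int)

# Hypothetical example: a 3-minute clip at 30 fps has 5400 frames.
fps = 30
total = 3 * 60 * fps
idx = uniform_frame_indices(total, num_frames=8)

# Time between two consecutive sampled frames, in seconds.
gap_seconds = (idx[1] - idx[0]) / fps
print(idx)
print(f"about {gap_seconds:.1f} s between sampled frames")
```

Under these assumptions the sampled frames are roughly 26 seconds apart, so anything shorter than that can fall entirely between two samples.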

Wonje

Jan 17

same here.

LanguageBind

Owner Jan 18

Thank you for your interest. We're focusing on short videos at the moment, and long videos are indeed not handled that well. The reason is a trade-off: if the input frames are sampled densely, the model is overwhelmed; if they are sampled sparsely, feature information is lost. One idea I'm considering is to run a text-based query in advance to extract only certain tokens from each frame, which might reduce the complexity of the video model and allow more frames to be input.
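The text-based token extraction idea could look roughly like the sketch below. This is not the LanguageBind implementation, only one possible reading of the reply: score each frame token by cosine similarity to a text embedding and keep the top few per frame. The function name `select_tokens_by_text`, the tensor shapes, and the random inputs are all hypothetical.

```python
import numpy as np

def select_tokens_by_text(frame_tokens: np.ndarray,
                          text_embedding: np.ndarray,
                          keep: int) -> np.ndarray:
    """Keep the `keep` tokens per frame that are most similar to the text.

    frame_tokens:   (num_frames, tokens_per_frame, dim)
    text_embedding: (dim,)
    returns:        (num_frames, keep, dim)
    """
    num_frames, tokens_per_frame, _ = frame_tokens.shape
    if not 0 < keep <= tokens_per_frame:
        raise ValueError("keep must be in [1, tokens_per_frame]")

    # Cosine similarity between every token and the text embedding.
    eps = 1e-8
    tokens_unit = frame_tokens / (np.linalg.norm(frame_tokens, axis=-1, keepdims=True) + eps)
    text_unit = text_embedding / (np.linalg.norm(text_embedding) + eps)
    scores = tokens_unit @ text_unit  # (num_frames, tokens_per_frame)

    # Indices of the highest-scoring tokens in each frame.
    top = np.argsort(-scores, axis=1)[:, :keep]
    # Restore the original token order within each frame.
    top = np.sort(top, axis=1)
    return np.take_along_axis(frame_tokens, top[:, :, None], axis=1)

# Hypothetical sizes: 32 frames, 256 tokens per frame, 64-dim features.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(32, 256, 64))
text = rng.normal(size=64)

reduced = select_tokens_by_text(tokens, text, keep=64)
print(reduced.shape)
```

With these made-up sizes, 32 frames at 64 tokens each give 2048 tokens in total, the same budget as 8 frames at 256 tokens each. Whether a model trained on 8 frames can use such inputs without further training is a separate question that this sketch does not address.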
