How to increase num_frames?
Hi, I'm new to video tasks. I noticed that the model processes only 8 frames per video (see config.json). This seems ineffective to me, because a lot of information may be lost from a minutes-long video. Do we need to pre-train the model from scratch to increase the number of frames? Could you please give me some advice? I would appreciate it. Thank you in advance.
Same here.
Thank you for your attention. We are currently focusing on short videos, and long videos are not handled that well. The reason is a trade-off: if the inputs are dense, the model is overwhelmed; if the inputs are not dense, feature information is lost. One idea I am considering is an advance query based on the text that extracts only certain tokens from each frame, which might reduce the complexity for the video model and allow more frames to be input.
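To make the token-extraction idea a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration of the general idea, not what the model currently does. It assumes each frame is already encoded as a list of token vectors and that the text is encoded as a single query vector of the same dimension; the function names (`select_tokens`, `reduce_video`) and the cosine-similarity scoring are hypothetical choices made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def select_tokens(frame_tokens, text_query, keep):
    """Return the indices of the `keep` tokens most similar to the text query.

    Indices are returned in their original order so the spatial layout
    of the surviving tokens is preserved.
    """
    scores = [cosine(tok, text_query) for tok in frame_tokens]
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:keep]
    return sorted(top)


def reduce_video(video_tokens, text_query, keep):
    """Apply the per-frame selection to every frame of a video."""
    return [
        [frame[i] for i in select_tokens(frame, text_query, keep)]
        for frame in video_tokens
    ]


if __name__ == "__main__":
    # Toy data (made up for illustration): 2 frames, 4 tokens each, dimension 2.
    frame = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
    video = [frame, frame]
    query = [1.0, 0.0]
    print(reduce_video(video, query, keep=2))
```

With a selection step like this, the token budget scales with `keep` rather than with the full number of tokens per frame. For example, keeping a quarter of each frame's tokens would let four times as many frames fit into the same total number of tokens. Whether the model can use those extra frames without further training is a separate question that this sketch does not address.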