Commit d7a5829 by mhan (1 parent: 5bd2e88)

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -19,7 +19,7 @@ pipeline_tag: visual-question-answering
 
  ## Training Dataset
 
- **For video data downloading, please have a look at [this issue](https://github.com/bytedance/Shot2Story/issues/5).**
+ **Please download the multi-shot videos [here](https://1drv.ms/f/s!Ap3OKt6-X52NgXoG4-64N9WZDenS?e=oIHfkZ).**
 
  We are excited to release a new video-text benchmark for multi-shot video understanding. This release contains a 134k version of our dataset. It includes detailed long summaries (human annotated + GPTV generated) for 134k videos and shot captions (human annotated) for 188k video shots. Please check the dataset [here](https://huggingface.co/datasets/mhan/Shot2Story-134K).
 
@@ -37,7 +37,7 @@ We are releasing the checkpoints trained with our [Shot2Story-20K](https://huggi
 
  Our text annotations are licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License](https://creativecommons.org/licenses/by-nc-sa/4.0/). They are available strictly for non-commercial research.
 
- Please note, our dataset does not include the original videos. Users must refer to [HD-VILA-100M](https://github.com/microsoft/XPretrain/blob/main/hd-vila-100m/README.md) for video access. By downloading our annotations, you agree to these terms. Respect for video copyright holders is paramount. Ensure your use of the videos aligns with the original source's terms.
+ Users must refer to [HD-VILA-100M](https://github.com/microsoft/XPretrain/blob/main/hd-vila-100m/README.md) for original video access. By downloading our annotations, you agree to these terms. Respect for video copyright holders is paramount. Ensure your use of the videos aligns with the original source's terms.
 
  ---