---
license: cc-by-4.0
language:
- en
tags:
- text-to-video
---

# <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span>: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback

## 4-step Text-to-video Generation

<table class="center">
<tr>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0273.mp4" type="video/mp4"></video>
</td>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0054.mp4" type="video/mp4"></video>
</td>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0262.mp4" type="video/mp4"></video>
</td>
</tr>
<tr>
<td style="text-align:center;" width="320">With the style of low-poly game art, A majestic, white horse gallops gracefully across a moonlit beach.</td>
<td style="text-align:center;" width="320">medium shot of Christine, a beautiful 25-year-old brunette resembling Selena Gomez, anxiously looking up as she walks down a New York street, cinematic style</td>
<td style="text-align:center;" width="320">a cartoon pig playing his guitar, Andrew Warhol style</td>
</tr>
</table>

<table class="center">
<tr>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0023.mp4" type="video/mp4"></video>
</td>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0021.mp4" type="video/mp4"></video>
</td>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0064.mp4" type="video/mp4"></video>
</td>
</tr>
<tr>
<td style="text-align:center;" width="320">a dog wearing vr goggles on a boat</td>
<td style="text-align:center;" width="320">Pikachu snowboarding</td>
<td style="text-align:center;" width="320">a girl floating underwater</td>
</tr>
</table>

## Model description

This repository contains `unet_lora.pt`, which turns [VideoCrafter2](https://ailab-cvc.github.io/videocrafter2/) into our <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span> (VC2). <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span> (VC2) achieves both fast and high-quality T2V generation. On [VBench](https://vchitect.github.io/VBench-project/), its 4-step generation even outperforms proprietary systems, including [Gen-2](https://research.runwayml.com/gen2) and [Pika](https://pika.art/). Please refer to our [GitHub repo](https://github.com/Ji4chenLi/t2v-turbo) for detailed instructions.
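
The file name `unet_lora.pt` suggests low-rank adaptation (LoRA) weights for the VideoCrafter2 UNet. As a rough illustration of the general LoRA idea only, the sketch below merges a low-rank update into a base weight matrix in plain Python. It is not this repository's loading code (see the GitHub repo for that). The names `merge_lora`, `lora_down`, `lora_up`, `alpha`, and `rank`, as well as the toy values, are assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    return [
        [sum(a[i][p] * b[p][j] for p in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight: Matrix, lora_down: Matrix, lora_up: Matrix,
               alpha: float, rank: int) -> Matrix:
    """Return weight + (alpha / rank) * (lora_up @ lora_down).

    This is the generic LoRA merge rule; the names are hypothetical and
    do not come from the T2V-Turbo code base.
    """
    scale = alpha / rank
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight with a rank-1 update (all values are made up).
base = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 2.0]]   # shape (rank, in_features) = (1, 2)
up = [[0.5], [0.0]]   # shape (out_features, rank) = (2, 1)
merged = merge_lora(base, down, up, alpha=1.0, rank=1)
print(merged)  # [[1.5, 1.0], [0.0, 1.0]]
```

Because the update is stored as two thin matrices rather than a full weight delta, a LoRA file can stay small relative to the base model, which is consistent with shipping a single file on top of VideoCrafter2.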

## Misuse, Malicious Use and Excessive Use

Our model is meant for research purposes.

- It is prohibited to generate content that is demeaning or harmful to people or their environment, culture, religion, etc.
- It is prohibited to generate pornographic, violent, or gory content.
- It is prohibited to generate erroneous or false information.