 ---
 license: cc-by-nc-4.0
+datasets:
+- WenhaoWang/VidProM
+language:
+- en
+pipeline_tag: text-generation
+tags:
+- text-to-video generation
+- VidProM
+- Automatic text-to-video prompt
 ---
+# The first model for automatic text-to-video prompt completion: given a few words as input, the model generates several complete text-to-video prompts.
+# Details
+The model is [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) fine-tuned on the [VidProM](https://huggingface.co/datasets/WenhaoWang/VidProM) dataset using 8 A100 80G GPUs.
+# Usage
+## Download the model
+```
+from transformers import pipeline
+pipe = pipeline("text-generation", model="WenhaoWang/Meta-Llama-3-8B-AutoT2VPrompt")
+```
+## Set the Parameters
+```
+input = "An underwater world"      # Seed text that the model completes into text-to-video prompts.
+max_length = 50                    # Maximum total length in tokens, counting the input as well as the generated text.
+temperature = 1.2                  # Controls the randomness of sampling; higher values give more varied outputs.
+top_k = 8                          # Restricts sampling at each step to the k most likely tokens.
+num_return_sequences = 10          # Number of different prompts to generate from the same input.
+```
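To make `temperature` and `top_k` more concrete, here is a minimal pure-Python sketch of top-k sampling with temperature over a toy next-token distribution. The vocabulary, the scores, and the helper name `top_k_temperature_probs` are invented for illustration; this is not the transformers implementation, only the idea behind the two parameters.

```python
import math
import random

def top_k_temperature_probs(logits, top_k, temperature):
    """Return {token: probability} after top-k filtering and temperature scaling."""
    # Keep only the top_k highest-scoring tokens.
    kept = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Divide scores by the temperature: values above 1 flatten the distribution.
    scaled = [(tok, score / temperature) for tok, score in kept]
    # Softmax over the kept tokens (subtract the max for numerical stability).
    peak = max(score for _, score in scaled)
    exps = [(tok, math.exp(score - peak)) for tok, score in scaled]
    total = sum(value for _, value in exps)
    return {tok: value / total for tok, value in exps}

# Hypothetical next-token scores, not taken from the model.
logits = {"ocean": 3.0, "reef": 2.0, "city": 1.0, "desert": -1.0}

probs = top_k_temperature_probs(logits, top_k=2, temperature=1.2)
print(probs)

# Draw a few tokens from the filtered distribution.
rng = random.Random(0)
tokens, weights = zip(*probs.items())
print(rng.choices(tokens, weights=weights, k=5))
```

With `top_k=2`, only the two most likely tokens can ever be sampled, and raising the temperature moves their probabilities closer together.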
+## Generation
+```
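+all_prompts = pipe(
+    input, max_length=max_length, do_sample=True, temperature=temperature,
+    top_k=top_k, num_return_sequences=num_return_sequences,
+)
+def process(text):
+    """Flatten a generation to one line and cut it at the last complete sentence."""
+    text = text.replace('\n', '.')    # Treat line breaks as sentence boundaries.
+    text = text.replace('  .', '.')   # Drop a double space before a period.
+    end = text.rfind('.')
+    if end != -1:                     # Truncate only when a period exists.
+        text = text[:end]
+    return text + '.'
+for prompt in all_prompts:
+    print(process(prompt['generated_text']))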
+```
+This prints 10 text-to-video prompts; pick the one you like best. Sampling is random, so each run produces different prompts. Example output:
+```
+An underwater world, 25 ye boy, with aqua-green eyes, dk sandy blond hair, from the back, and on his back a fish, 23 ye old, weing glasses,ctoon chacte.
+An underwater world, the video should capture the essence of tranquility and the beauty of nature.. a woman with short hair weing a green dress sitting at the desk.
+An underwater world, the ocean is full of discded items, the water flows, and the light penetrating through the water.
+An underwater world.. a woman with red eyes and red lips  is looking forwd.
+An underwater world.. an old man sitting in a chair, smoking a pipe, a little smoke coming out of the chair, a man is drinking a glass.
+An underwater world. The ocean is filled with bioluminess as the water reflects a soft glow from a bioluminescent phosphorescent light source. The camera slowly moves away and zooms in..
+An underwater world. the girl looks at the camera and smiles with happiness..
+An underwater world, 1960s horror film..
+An underwater world.. 4 men in 1940s style clothes walk ound a gothic castle. night, fe. A girl is running, and there e some flowers along the river.
+An underwater world,  -camera pan up . A girl is playing with her cat on a sunny day in the pk. A man is running and then falling down and dying.
+```
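Some of the sample prompts contain doubled periods or stray double spaces. A doubled period can arise when a generated line already ends with a period and `process` then replaces the following newline with another one. The sketch below shows one optional extra cleanup pass; `tidy` is a hypothetical helper that is not part of the original usage code, and it assumes you do not want to preserve ellipses.

```python
import re

def tidy(prompt):
    """Hypothetical extra cleanup step, not part of the original model card."""
    prompt = re.sub(r'\s+', ' ', prompt)          # Collapse runs of whitespace.
    prompt = re.sub(r'\s+([.,])', r'\1', prompt)  # Remove spaces before . and ,
    prompt = re.sub(r'\.{2,}', '.', prompt)       # Collapse repeated periods.
    return prompt.strip()

# Two of the sample outputs shown above.
samples = [
    "An underwater world. the girl looks at the camera and smiles with happiness..",
    "An underwater world, 1960s horror film..",
]
for sample in samples:
    print(tidy(sample))
```

Applying `tidy` to each string returned by `process` leaves the wording unchanged and only normalizes punctuation and spacing.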
+# License
+The model is licensed under the [CC BY-NC 4.0 license](https://creativecommons.org/licenses/by-nc/4.0/deed.en).
+# Citation
+```
+@article{wang2024vidprom,
+  title={VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models},
+  author={Wang, Wenhao and Yang, Yi},
+  journal={arXiv preprint arXiv:2403.06098},
+  year={2024}
+}
+```
+# Acknowledgment
+The fine-tuning process was carried out with help from [Yaowei Zheng](https://github.com/hiyouga).
+# Contact
+If you have any questions, feel free to contact [Wenhao Wang](https://wangwenhao0716.github.io).