# ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation

This page shares the official model checkpoints of the paper \
*ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation* \
from Microsoft Applied Science Group and UC Berkeley \
by [Yatong Bai](https://bai-yt.github.io),
[Trung Dang](https://www.microsoft.com/applied-sciences/people/trung-dang),
[Kazuhito Koishida](https://www.microsoft.com/applied-sciences/people/kazuhito-koishida),
and [Somayeh Sojoudi](https://people.eecs.berkeley.edu/~sojoudi/).

**[[🤗 Live Demo](https://huggingface.co/spaces/Bai-YT/ConsistencyTTA)]**
**[[Preprint Paper](https://arxiv.org/abs/2309.10740)]**
**[[Project Homepage](https://consistency-tta.github.io)]**
**[[Code](https://github.com/Bai-YT/ConsistencyTTA)]**

## Description

**2024/06 Updates:**

- We have hosted an interactive live demo of ConsistencyTTA at [🤗 Huggingface](https://huggingface.co/spaces/Bai-YT/ConsistencyTTA).
- ConsistencyTTA has been accepted to ***INTERSPEECH 2024***! We look forward to meeting you on Kos Island.

This work proposes a *consistency distillation* framework to train
text-to-audio (TTA) generation models that require only a single neural network query,
reducing the computation of the core step of diffusion-based TTA models by a factor of 400.
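The source of the speedup can be illustrated with a toy sketch: an iterative diffusion sampler queries the network once per denoising step, while a consistency model maps noise to output in a single query. The `network` function below is a placeholder stand-in used only to count queries; it is not the ConsistencyTTA model or its API.

```python
# Toy comparison of network-query counts: iterative diffusion sampling
# vs. single-step consistency sampling. The "network" is a dummy
# placeholder, NOT the actual ConsistencyTTA model.

calls = {"diffusion": 0, "consistency": 0}

def network(x, t, mode):
    """Dummy denoiser; a real model maps (noisy latent, timestep) -> estimate."""
    calls[mode] += 1
    return x * 0.5  # placeholder update

def diffusion_sample(x, num_steps=400):
    # Iterative sampling: one network query per denoising step.
    for t in reversed(range(num_steps)):
        x = network(x, t, "diffusion")
    return x

def consistency_sample(x):
    # A distilled consistency model produces its output in one query.
    return network(x, 0, "consistency")

diffusion_sample(1.0)
consistency_sample(1.0)
print(calls)  # {'diffusion': 400, 'consistency': 1}
```

With a 400-step diffusion sampler as the teacher, collapsing generation to one query is exactly the 400x reduction in core-step computation described above.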