
VRAM requirements

by makovan - opened

Hello,

First of all, amazing work!
I would like to try this out, but it looks like I'm running out of memory. I have a 10 GB RTX 3080. Is there any way I could make it run, or is more memory required?

Thank you

Fixed it by changing this line to load the model on CPU:

tango = Tango("declare-lab/tango", "cpu")

Takes a while but at least it works :)
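The workaround above can be wrapped in a small device-selection helper. This is a hypothetical sketch, not part of the TANGO codebase; the 13 GB threshold comes from the maintainers' reply below, and querying actual free VRAM (e.g. via torch.cuda) is left out so the sketch stays dependency-free.

```python
# Hedged sketch: choose the Tango device string based on available VRAM.
# REQUIRED_VRAM_GB is the full-precision, batch-size-1 figure reported
# by the maintainers; it is an assumption, not a measured constant.
REQUIRED_VRAM_GB = 13


def pick_device(free_vram_gb: float) -> str:
    """Return "cuda:0" if the GPU has enough free memory, else "cpu"."""
    return "cuda:0" if free_vram_gb >= REQUIRED_VRAM_GB else "cpu"


# A 10 GB RTX 3080 falls back to CPU, matching the fix above.
print(pick_device(10))   # -> cpu
print(pick_device(48))   # -> cuda:0
```

The returned string can then be passed as the second argument to the Tango constructor, as in the snippet above.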

Deep Cognition and Language Research (DeCLaRe) Lab org

On our A6000 GPU, the required VRAM is around 13GB for full-precision inference with a batch size of 1. I will add a script for fp16 inference of TANGO to our GitHub repository soon, which should reduce the memory footprint. Stay tuned!
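A rough back-of-envelope estimate of the saving fp16 would bring: if memory is dominated by fp32 tensors, halving the byte width roughly halves the footprint. This is a simplification (activations, the CUDA context, and any layers kept in fp32 are ignored), so the actual number may differ.

```python
# Rough estimate of the fp16 saving mentioned above. The 13 GB figure is
# from the maintainers' reply; the linear scaling with byte width is an
# assumption, not a measurement.
FP32_BYTES = 4
FP16_BYTES = 2

full_precision_gb = 13.0  # reported VRAM, batch size 1, A6000
half_precision_gb = full_precision_gb * FP16_BYTES / FP32_BYTES

print(f"Estimated fp16 footprint: ~{half_precision_gb:.1f} GB")  # ~6.5 GB
```

If that estimate holds, fp16 inference would fit comfortably on a 10 GB card like the RTX 3080 mentioned above.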
