# Duplicated from GuyYariv/AudioToken
diffusers
accelerate
torchvision
transformers>=4.25.1
ftfy
tensorboard
opencv-python
Pillow
pandas
torchaudio
datasets
scipy
xformers
--pre