README.md · bleepybloops/sao_vae_tuned

metadata

license: cc
base_model:
  - stabilityai/stable-audio-open-1.0
tags:
  - VAE

https://twitter.com/_lyraaaa_/status/1819145905972691227

model config is identical to the stock stable_audio_2.0_vae included in the stable-audio-tools repo

finetuned stable audio open's vae for 100k steps to try and fix its habit of colorizing gritty sounds

the blue and orange runs are near-identical (same seed etc), except the orange one had the encoder and bottleneck frozen while blue was a full train. because its encoder and bottleneck are untouched, the orange model has an identical latent space to the original vae and is therefore instantly swappable into any stable audio open model. blue's latent space has changed, so any model using it will require further training, in exchange for slightly higher fidelity.

to use the blue vae, pass it to your train command with --pretransform-ckpt-path. to use the orange vae, you'll need to load the stable audio open model (with original vae), load the new vae, and then replace model.pretransform.model with it.
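a rough sketch of the orange swap described above, not a tested recipe. `load_vae` and `swap_vae` are hypothetical helper names. `load_vae` assumes stable-audio-tools is installed and that the checkpoint is an unwrapped autoencoder state dict, which may not match the files in this repo. the stand-in objects at the bottom only show the attribute swap and run without the library.

```python
from types import SimpleNamespace


def load_vae(model_config: dict, ckpt_path: str):
    """Build an autoencoder from its config dict and load tuned weights into it.

    Assumption: the checkpoint holds an unwrapped autoencoder state dict.
    """
    # Imported lazily so the rest of this sketch runs without the library.
    from stable_audio_tools.models.factory import create_model_from_config
    from stable_audio_tools.models.utils import load_ckpt_state_dict

    vae = create_model_from_config(model_config)
    vae.load_state_dict(load_ckpt_state_dict(ckpt_path))
    # The pretransform is only used for inference here, so freeze it.
    return vae.eval().requires_grad_(False)


def swap_vae(model, new_vae):
    """Replace model.pretransform.model with new_vae and return the old one."""
    old_vae = model.pretransform.model
    model.pretransform.model = new_vae
    return old_vae


# Stand-ins for the real objects, just to show what the swap does.
demo_model = SimpleNamespace(pretransform=SimpleNamespace(model="original vae"))
previous = swap_vae(demo_model, "orange vae")
print(previous, "->", demo_model.pretransform.model)
```

with the real library this would look something like `model, _ = get_pretrained_model("stabilityai/stable-audio-open-1.0")` followed by `swap_vae(model, load_vae(vae_config, "path/to/orange.ckpt"))`, where `vae_config` is the loaded vae config json and the checkpoint path is a placeholder.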

further instructions may be written at some point, but i highly recommend you play with the code and figure it out yourself!