Kandinsky_2.0 / README.md
sberbank-ai
metadata
license: apache-2.0
tags:
  - Kandinsky
  - text-image
  - text2image
  - diffusion
  - latent diffusion
  - mCLIP-XLMR
  - mT5

Kandinsky 2.0

Kandinsky 2.0 is the first multilingual latent diffusion text2image model. Its UNet has 1.2B parameters.

It is a latent diffusion model with two multilingual text encoders:

  • mCLIP-XLMR (344M parameters)
  • mT5-small (300M parameters)
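
As a rough size check, the parameter counts stated above can be summed. This is a minimal sketch, not part of the model's code: the names `STATED_PARAMS`, `total_params`, and `fp16_gigabytes` are hypothetical, the figures are the rounded ones from this card, and the 2-bytes-per-parameter (fp16) storage figure is an assumption. Components not listed in this card, such as the image autoencoder, are not counted.

```python
# Rounded parameter counts exactly as stated in this model card.
STATED_PARAMS = {
    "UNet": 1_200_000_000,
    "mCLIP-XLMR": 344_000_000,
    "mT5-small": 300_000_000,
}


def total_params(counts):
    """Sum the stated parameter counts of all listed components."""
    return sum(counts.values())


def fp16_gigabytes(n_params):
    """Approximate weight size in GB, ASSUMING 2 bytes per parameter (fp16)."""
    return n_params * 2 / 1e9


total = total_params(STATED_PARAMS)

# Print each component with its share of the listed total.
for name, n in STATED_PARAMS.items():
    print(f"{name}: {n / 1e6:.0f}M parameters ({n / total:.1%} of listed total)")

print(f"Listed total: {total / 1e9:.3f}B parameters")
print(f"Approx. fp16 weight size: {fp16_gigabytes(total):.2f} GB")
```

By these figures the UNet accounts for roughly two thirds of the listed parameters, with the two text encoders making up the rest.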

Together with multilingual training datasets, these encoders enable text2image generation directly from prompts written in many languages.

Authors