metadata
license: apple-ascl
tags:
- mdm
Matryoshka Diffusion Models
Matryoshka Diffusion Models (MDM) was introduced in the paper of the same name by Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, and Navdeep Jaitly.
This repository contains the Flickr 64 checkpoint.
Highlights
- This checkpoint was trained on a dataset of 50M text-image pairs collected from Flickr.
- This model was trained using a single UNet (not nested) and generates images with a resolution of 64 × 64.
- Despite being trained on relatively small datasets, MDMs show strong zero-shot capability for generating high-resolution images and videos.
Checkpoints
Model | Dataset | Resolution | Nested UNets |
---|---|---|---|
mdm-flickr-64 | Flickr 50M | 64 × 64 | ❎ |
mdm-flickr-256 | Flickr 50M | 256 × 256 | ✅ |
mdm-flickr-1024 | Flickr 50M | 1024 × 1024 | ✅ |
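For scripts that need to choose among these checkpoints, the table above can be encoded as a small lookup. This is only an illustrative sketch: the `Checkpoint` class and the `pick_checkpoint` helper are hypothetical names, not part of the MDM codebase, and the values are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    """One row of the Checkpoints table (hypothetical helper type)."""
    name: str
    dataset: str
    resolution: int    # images are square: resolution x resolution
    nested_unet: bool  # True where the table marks Nested UNets


# Values copied from the Checkpoints table.
CHECKPOINTS = [
    Checkpoint("mdm-flickr-64", "Flickr 50M", 64, False),
    Checkpoint("mdm-flickr-256", "Flickr 50M", 256, True),
    Checkpoint("mdm-flickr-1024", "Flickr 50M", 1024, True),
]


def pick_checkpoint(resolution: int) -> Checkpoint:
    """Return the checkpoint whose output resolution matches exactly."""
    for ckpt in CHECKPOINTS:
        if ckpt.resolution == resolution:
            return ckpt
    available = sorted(c.resolution for c in CHECKPOINTS)
    raise ValueError(f"no checkpoint for {resolution}; available: {available}")
```

For example, `pick_checkpoint(256).name` evaluates to `"mdm-flickr-256"`, while an unlisted resolution such as 128 raises a `ValueError`.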
How to Use
Please refer to the original repository for training and inference instructions.
Citation
@misc{gu2023matryoshkadiffusionmodels,
title={Matryoshka Diffusion Models},
author={Jiatao Gu and Shuangfei Zhai and Yizhe Zhang and Josh Susskind and Navdeep Jaitly},
year={2023},
eprint={2310.15111},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2310.15111},
}