DeepFloyd
:grapes: [Official Project Page] :apple:[Official Online Demo]
DeepFloyd IF is a novel state-of-the-art open-source text-to-image model with a high degree of photorealism and language understanding.
We've thoughtfully put together some important details for you to keep in mind while using the DeepFloyd models. We sincerely hope this will assist you in creating even more interesting demos with IF. Enjoy your creative journey!
TODO
- Add installation guide (continually updated)
- Test Text-to-Image model
- Test Style-Transfer model
- Add Inpaint demo (does not seem to work well yet)
- Add SAM inpaint and Grounded-SAM inpaint demo
Installation
Detailed installation guide
There are a few more things to take care of when installing DeepFloyd beyond the official guide. You can install DeepFloyd as follows:
- Create a new environment with Python 3.10:
conda create -n floyd python=3.10 -y
conda activate floyd
- DeepFloyd needs xformers to accelerate some attention mechanisms and reduce GPU memory usage, and xformers requires PyTorch 1.12.1, 1.13.1, or 2.0.0 installed with conda.
- If you only have CUDA 11.4 or a lower CUDA version installed, you can only install PyTorch 1.12.1 locally:
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 -c pytorch
- After installing PyTorch, it's highly recommended to install xformers using conda:
conda install xformers -c xformers
Then install DeepFloyd following the official guide:
pip install deepfloyd_if==1.0.2rc0
pip install git+https://github.com/openai/CLIP.git --no-deps
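The CUDA rule from the steps above can be written down as a small lookup, together with a quick check of what ended up in the environment. This is a minimal sketch using only the standard library; `pytorch_candidates` and `installed_version` are hypothetical helper names, and the distribution names `torch`, `xformers`, and `deepfloyd_if` are assumed to match what conda and pip registered.

```python
from importlib import metadata

# PyTorch versions this guide lists as working with xformers.
XFORMERS_PYTORCH = ("1.12.1", "1.13.1", "2.0.0")


def pytorch_candidates(cuda_version):
    """Return the PyTorch versions this guide allows for a CUDA version such as '11.4'."""
    major, minor = (int(part) for part in cuda_version.split(".")[:2])
    if (major, minor) <= (11, 4):
        # CUDA 11.4 or lower: the guide says only PyTorch 1.12.1 can be installed.
        return ("1.12.1",)
    return XFORMERS_PYTORCH


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("torch", "xformers", "deepfloyd_if"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running the script inside the `floyd` environment prints one line per package, which makes a missing xformers install easy to spot before launching a demo.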
Additional notes for bug fixing
- [Attention] To use DeepFloyd with diffusers and save GPU memory, update transformers to at least 4.27.0 and accelerate to 0.17.0:
pip install transformers==4.27.1
pip install accelerate==0.17.0
- As described in DeepFloyd/issue64, the inpainting demos have some bugs: you need protobuf==3.19.0 to load T5Embedder and scikit-image for inpainting:
pip install protobuf==3.19.0
pip install scikit-image
However, this fix has not yet been released in the DeepFloyd Python package, so you should either update the code manually following issue64 or install DeepFloyd locally:
git clone https://github.com/deep-floyd/IF.git
cd IF
pip install -e .
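The version pins in these notes are easy to get wrong after a later `pip install` pulls in something newer. The following is a rough sketch of a check for them, assuming the pins listed above (transformers at least 4.27.0, accelerate at least 0.17.0, protobuf exactly 3.19.0, scikit-image present). The helper names are hypothetical, and the version parser is deliberately simplified: it only reads the first three numeric components.

```python
from importlib import metadata

MINIMUMS = {"transformers": "4.27.0", "accelerate": "0.17.0"}
PINNED = {"protobuf": "3.19.0"}
REQUIRED = ("scikit-image",)


def parse_version(text):
    """Turn '4.27.1' or '2.0.0+cu117' into a 3-tuple of ints (simplified parser)."""
    numbers = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'rc0'
        numbers.append(int(piece))
    return tuple((numbers + [0, 0, 0])[:3])


def meets_minimum(installed, minimum):
    """Compare two version strings using the simplified parser above."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_requirements():
    """Return a list of problems; an empty list means every pin is satisfied."""
    problems = []
    for name, minimum in MINIMUMS.items():
        found = installed_version(name)
        if found is None or not meets_minimum(found, minimum):
            problems.append(f"{name}: need >= {minimum}, found {found}")
    for name, exact in PINNED.items():
        found = installed_version(name)
        if found != exact:
            problems.append(f"{name}: need == {exact}, found {found}")
    for name in REQUIRED:
        if installed_version(name) is None:
            problems.append(f"{name}: not installed")
    return problems
```

Calling `check_requirements()` and printing the result before running the inpainting demos gives a short list of what still needs to be installed or downgraded.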
Requirements before running demos
Before running the DeepFloyd demos, please refer to Integration with Diffusers for the requirements on the pretrained weights.
If you want to download the weights into a specific directory, you can set cache_dir as follows:
- Under diffusers
from diffusers import DiffusionPipeline
from diffusers.utils import pt_to_pil
import torch
cache_dir = "path/to/specific_dir"
# stage 1
stage_1 = DiffusionPipeline.from_pretrained(
"DeepFloyd/IF-I-XL-v1.0",
variant="fp16",
torch_dtype=torch.float16,
cache_dir=cache_dir # loading model from specific dir
)
stage_1.enable_xformers_memory_efficient_attention() # remove line if torch.__version__ >= 2.0.0
stage_1.enable_model_cpu_offload()
- Running locally
from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder
cache_dir = "path/to/cache_dir"
device = 'cuda:0'
if_I = IFStageI('IF-I-XL-v1.0', device=device, cache_dir=cache_dir)
if_II = IFStageII('IF-II-L-v1.0', device=device, cache_dir=cache_dir)
if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device, cache_dir=cache_dir)
t5 = T5Embedder(device="cpu", cache_dir=cache_dir)
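The diffusers snippet above carries the comment "remove line if torch.__version__ >= 2.0.0". Instead of editing the script by hand, that decision could be made at runtime. The sketch below shows one way to do it; `needs_xformers` and `configure_stage` are hypothetical helpers, and the only pipeline methods they call are the two already used in the snippet.

```python
def needs_xformers(torch_version):
    """True when the torch major version is below 2, e.g. for '1.13.1+cu117'."""
    major = int(torch_version.split(".")[0])
    return major < 2


def configure_stage(stage, torch_version):
    """Apply the memory options from the diffusers snippet to a loaded pipeline."""
    if needs_xformers(torch_version):
        # Only enable xformers attention on PyTorch 1.x, as the comment suggests.
        stage.enable_xformers_memory_efficient_attention()
    stage.enable_model_cpu_offload()
    return stage
```

With the pipeline from the diffusers snippet loaded, the call would be `configure_stage(stage_1, torch.__version__)`.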
DeepFloyd Demos
- 16GB vRAM for IF-I-XL (4.3B text to 64x64 base module) & IF-II-L (1.2B to 256x256 upscaler module)
- 24GB vRAM for IF-I-XL (4.3B text to 64x64 base module) & IF-II-L (1.2B to 256x256 upscaler module) & Stable x4 (to 1024x1024 upscaler)
- (Highlight) Use xformers and set the environment variable FORCE_MEM_EFFICIENT_ATTN=1, which may save a lot of GPU memory:
export FORCE_MEM_EFFICIENT_ATTN=1
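The memory figures listed above can be turned into a small lookup that decides which stages to load on a given GPU. This is only a sketch of the list, not a measurement: `stages_for_vram` is a hypothetical helper, the thresholds are copied from the two bullets, and real usage can be higher depending on the demo and settings.

```python
def stages_for_vram(vram_gb):
    """Return the model names this guide suggests for a GPU with vram_gb of memory."""
    base = ["IF-I-XL-v1.0", "IF-II-L-v1.0"]
    if vram_gb >= 24:
        # 24GB: base module, 256x256 upscaler, and the Stable x4 upscaler.
        return base + ["stable-diffusion-x4-upscaler"]
    if vram_gb >= 16:
        # 16GB: base module and 256x256 upscaler only.
        return base
    # Below 16GB the list above gives no configuration.
    return []
```

With PyTorch installed, the total memory of the first GPU in bytes can be read from `torch.cuda.get_device_properties(0).total_memory` and divided by `1024 ** 3` before passing it in.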
Dream
The text-to-image mode for DeepFloyd:
cd playground/DeepFloyd
export FORCE_MEM_EFFICIENT_ATTN=1
python dream.py
This demo uses around 26GB of GPU memory. You can download the generated images from inpaint playground storage.
Style Transfer
Download the original image from here; it is borrowed from the official DeepFloyd images.
cd playground/DeepFloyd
export FORCE_MEM_EFFICIENT_ATTN=1
python style_transfer.py
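Both demos above follow the same three shell steps: change into `playground/DeepFloyd`, export `FORCE_MEM_EFFICIENT_ATTN=1`, and run the script. For anyone launching them from Python instead of a shell, the steps could look roughly like this. `demo_env` and `run_demo` are hypothetical helpers; the script names and directory are taken from the commands above.

```python
import os
import subprocess
import sys


def demo_env(base=None):
    """Return a copy of the environment with FORCE_MEM_EFFICIENT_ATTN=1 set."""
    env = dict(os.environ if base is None else base)
    env["FORCE_MEM_EFFICIENT_ATTN"] = "1"
    return env


def run_demo(script, cwd="playground/DeepFloyd"):
    """Run a demo script (for example 'dream.py') the same way as the shell commands."""
    return subprocess.run([sys.executable, script], cwd=cwd, env=demo_env(), check=True)
```

For example, `run_demo("dream.py")` or `run_demo("style_transfer.py")`, called from the repository root. Passing the variable through a copied environment keeps the current process unchanged, which mirrors how `export` only affects the shell session it runs in.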