---
license: other
license_name: flux-1-dev-non-commercial-license
license_link: >-
  https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
tags:
- text-to-image
- flux
---

# Flux Dev

Run the Flux Dev model with limited VRAM in 8-bit mode. It's possible but impractical, since the downloads alone are "only" 40 GB.

## Setup

```bash
pip install accelerate diffusers optimum-quanto transformers sentencepiece
```

In int4 mode, the pre-trained fp16 weights **overflow** in places, which results in a blank image.
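This README does not say where exactly the overflow occurs, so the following is only an illustration of the underlying limit: fp16 cannot represent anything above 65504. The helper `fits_in_fp16` is a hypothetical name introduced here, and it uses the standard library's half-precision packing instead of a tensor library.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_in_fp16(value: float) -> bool:
    """Return True if `value` can be stored as a finite fp16 number."""
    try:
        struct.pack('<e', value)  # 'e' is the half-precision format code
        return True
    except OverflowError:
        return False


if __name__ == '__main__':
    print(fits_in_fp16(FP16_MAX))       # the largest representable value
    print(fits_in_fp16(300.0 * 300.0))  # 90000 is out of range for fp16
```

A tensor library would typically store such a value as `inf` instead of raising an error, and `inf` tends to turn into `NaN` in later arithmetic, which is one plausible route to an empty image.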

## Inference

```python
from diffusers import FluxPipeline, FluxTransformer2DModel
from optimum.quanto.models import QuantizedDiffusersModel, QuantizedTransformersModel
import torch
from transformers import T5EncoderModel

class Flux2DModel(QuantizedDiffusersModel):
    base_class = FluxTransformer2DModel

class T5Model(QuantizedTransformersModel):
    auto_class = T5EncoderModel

if __name__ == '__main__':
    T5EncoderModel.from_config = lambda c: T5EncoderModel(c).to(dtype=torch.float16)  # Duct tape for Quanto support.
    t5 = T5Model.from_pretrained('./flux-t5')._wrapped
    transformer = Flux2DModel.from_pretrained('./flux-fp8')._wrapped
    pipe = FluxPipeline.from_pretrained('black-forest-labs/FLUX.1-dev',
                                        text_encoder_2=t5,
                                        transformer=transformer)
    # This method moves one whole model at a time to the GPU when it's in forward mode.
    pipe.enable_model_cpu_offload()
    image = pipe('cat playing piano', num_inference_steps=10, output_type='pil').images[0]
    image.save('cat.png')
```
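The script above loads two pre-quantized checkpoints, `./flux-t5` and `./flux-fp8`, but this README does not show how they were created. The following is a minimal sketch of one way to produce them with Quanto, not necessarily the author's method. The function name `quantize_and_save`, the `qfloat8` weight type for both models, and the load dtypes are all assumptions.

```python
def quantize_and_save(repo='black-forest-labs/FLUX.1-dev',
                      t5_dir='./flux-t5',
                      transformer_dir='./flux-fp8'):
    """Quantize the T5 encoder and the Flux transformer, then save both."""
    # Heavy imports are kept local so that defining this function stays cheap.
    import torch
    from diffusers import FluxTransformer2DModel
    from optimum.quanto import qfloat8
    from optimum.quanto.models import QuantizedDiffusersModel, QuantizedTransformersModel
    from transformers import T5EncoderModel

    class Flux2DModel(QuantizedDiffusersModel):
        base_class = FluxTransformer2DModel

    class T5Model(QuantizedTransformersModel):
        auto_class = T5EncoderModel

    # Assumption: float16 for T5, matching the cast in the inference script.
    t5 = T5EncoderModel.from_pretrained(repo, subfolder='text_encoder_2',
                                        torch_dtype=torch.float16)
    T5Model.quantize(t5, weights=qfloat8).save_pretrained(t5_dir)
    del t5  # release memory before loading the larger model

    # Assumption: bfloat16 for the transformer; the source does not state a dtype.
    transformer = FluxTransformer2DModel.from_pretrained(repo, subfolder='transformer',
                                                         torch_dtype=torch.bfloat16)
    Flux2DModel.quantize(transformer, weights=qfloat8).save_pretrained(transformer_dir)


if __name__ == '__main__':
    quantize_and_save()
```

Quantization has to load each model in full precision once, so this step needs considerably more system RAM than the inference script does.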

## Disclaimer

Use of this code and copies of its documentation requires citation and attribution to the author, via a link to their Hugging Face profile, in all resulting work.