---
license: openrail++
base_model: stabilityai/stable-diffusion-xl-base-1.0
language:
- en
tags:
- stable-diffusion
- stable-diffusion-xl
- tensorrt
- text-to-image
---

# Stable Diffusion XL 1.0 TensorRT

## Introduction

This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0**, created in collaboration with [NVIDIA](https://huggingface.co/nvidia). In the benchmarks below (30 steps at 1024x1024), the optimized versions cut latency by roughly 13% to 41% and raise image throughput by roughly 20% to 70%, depending on the accelerator.

![examples](./examples.jpg)

## Model Description

- **Developed by:** Stability AI
- **Model type:** Diffusion-based text-to-image generative model
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/blob/main/LICENSE.md)
- **Model Description:** This is a conversion of the [SDXL base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and [SDXL refiner 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0) models for [NVIDIA TensorRT](https://developer.nvidia.com/tensorrt) optimized inference

## Performance Comparison

#### Timings for 30 steps at 1024x1024

| Accelerator | Baseline (non-optimized) | NVIDIA TensorRT (optimized) | Percentage improvement |
|-------------|--------------------------|-----------------------------|------------------------|
| A10         | 9399 ms                  | 8160 ms                     | ~13%                   |
| A100        | 3704 ms                  | 2742 ms                     | ~26%                   |
| H100        | 2496 ms                  | 1471 ms                     | ~41%                   |

#### Image throughput for 30 steps at 1024x1024

| Accelerator | Baseline (non-optimized) | NVIDIA TensorRT (optimized) | Percentage improvement |
|-------------|--------------------------|-----------------------------|------------------------|
| A10         | 0.10 images/sec          | 0.12 images/sec             | ~20%                   |
| A100        | 0.27 images/sec          | 0.36 images/sec             | ~33%                   |
| H100        | 0.40 images/sec          | 0.68 images/sec             | ~70%                   |

## Usage Example

1. Follow the [setup instructions](https://github.com/rajeevsrao/TensorRT/blob/release/8.6/demo/Diffusion/README.md) to launch a TensorRT NGC container.
```shell
git clone https://github.com/rajeevsrao/TensorRT.git
cd TensorRT
git checkout release/8.6
docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.06-py3 /bin/bash
```

2. Download the SDXL TensorRT files from this repo
```shell
git lfs install
git clone https://huggingface.co/stabilityai/stable-diffusion-xl-1.0-tensorrt
cd stable-diffusion-xl-1.0-tensorrt
git lfs pull
cd ..
```

3. Install libraries and requirements
```shell
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade tensorrt
cd demo/Diffusion
pip3 install -r requirements.txt
```

4. Perform TensorRT optimized inference
```shell
python3 demo_txt2img_xl.py \
  "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" \
  --build-static-batch \
  --use-cuda-graph \
  --num-warmup-runs 1 \
  --width 1024 \
  --height 1024 \
  --denoising-steps 30 \
  --onnx-base-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-base \
  --onnx-refiner-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-refiner
```
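
If you run the demo on your own hardware and want to compare against the Performance Comparison tables, the "Percentage improvement" columns can be reproduced with the short sketch below. It assumes the timing column is computed as the share of baseline time saved and the throughput column as the relative gain in images/sec; the helper names are illustrative and are not part of the TensorRT demo.

```python
# Figures copied from the Performance Comparison tables: (baseline, optimized).
LATENCY_MS = {"A10": (9399, 8160), "A100": (3704, 2742), "H100": (2496, 1471)}
THROUGHPUT_IPS = {"A10": (0.10, 0.12), "A100": (0.27, 0.36), "H100": (0.40, 0.68)}


def latency_reduction_pct(baseline_ms, optimized_ms):
    """Share of baseline time saved by the optimized run (assumed definition)."""
    return 100.0 * (baseline_ms - optimized_ms) / baseline_ms


def throughput_gain_pct(baseline_ips, optimized_ips):
    """Relative increase in images/sec over the baseline (assumed definition)."""
    return 100.0 * (optimized_ips - baseline_ips) / baseline_ips


if __name__ == "__main__":
    for gpu in LATENCY_MS:
        latency = latency_reduction_pct(*LATENCY_MS[gpu])
        throughput = throughput_gain_pct(*THROUGHPUT_IPS[gpu])
        print(f"{gpu}: latency -{latency:.0f}%, throughput +{throughput:.0f}%")
```

To check your own runs, replace the tuples with the timings you measured for the baseline and TensorRT pipelines.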