stabilityai/stable-diffusion-xl-1.0-tensorrt

@@ -12,14 +12,14 @@ tags:
 # Stable Diffusion XL 1.0 TensorRT
-### Introduction
 This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** created in collaboration with [NVIDIA](https://huggingface.co/nvidia). The optimized versions give substantial improvements in speed and efficiency.
 ![examples](./examples.jpg)
-### Model Description
 - **Developed by:** Stability AI
 - **Model type:** Diffusion-based text-to-image generative model
@@ -27,7 +27,7 @@ This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** creat
 - **Model Description:** This is a conversion of the [SDXL base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and [SDXL refiner 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0) models for [NVIDIA TensorRT](https://developer.nvidia.com/tensorrt) optimized inference
-### Performance Comparison
 #### Timings for 30 steps at 1024x1024
@@ -37,7 +37,7 @@ This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** creat
 | A100        | 3704 ms                  | 2742 ms                     | ~26%                   |
 | H100        | 2496 ms                  | 1471 ms                     | ~41%                   |
-#### Image throughput for 30 steps
 | Accelerator | Baseline (non-optimized) | NVIDIA TensorRT (optimized) | Percentage improvement |
 |-------------|--------------------------|-----------------------------|------------------------|
@@ -46,7 +46,7 @@ This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** creat
 | H100        | 0.40 images/sec          | 0.68 images/sec             | ~70%                   |
-### Usage Example
 1. Follow the [setup instructions](https://github.com/rajeevsrao/TensorRT/blob/release/8.6/demo/Diffusion/README.md) for launching a TensorRT NGC container.
 ```shell
@@ -70,8 +70,7 @@ cd ..
 python3 -m pip install --upgrade pip
 python3 -m pip install --upgrade tensorrt
-export TRT_OSSPATH=/workspace
-cd $TRT_OSSPATH/demo/Diffusion
 pip3 install -r requirements.txt
 ```
@@ -79,7 +78,6 @@ pip3 install -r requirements.txt
 ```
 python3 demo_txt2img_xl.py \
   "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" \
-  --hf-token=<Your HF TOKEN> \
   --build-static-batch \
   --use-cuda-graph \
   --num-warmup-runs 1 \

 # Stable Diffusion XL 1.0 TensorRT
+## Introduction
 This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** created in collaboration with [NVIDIA](https://huggingface.co/nvidia). The optimized versions give substantial improvements in speed and efficiency.
 ![examples](./examples.jpg)
+## Model Description
 - **Developed by:** Stability AI
 - **Model type:** Diffusion-based text-to-image generative model
 - **Model Description:** This is a conversion of the [SDXL base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and [SDXL refiner 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0) models for [NVIDIA TensorRT](https://developer.nvidia.com/tensorrt) optimized inference
+## Performance Comparison
 #### Timings for 30 steps at 1024x1024
 | A100        | 3704 ms                  | 2742 ms                     | ~26%                   |
 | H100        | 2496 ms                  | 1471 ms                     | ~41%                   |
+#### Image throughput for 30 steps at 1024x1024
 | Accelerator | Baseline (non-optimized) | NVIDIA TensorRT (optimized) | Percentage improvement |
 |-------------|--------------------------|-----------------------------|------------------------|
 | H100        | 0.40 images/sec          | 0.68 images/sec             | ~70%                   |
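The two tables report different kinds of improvement: the timing table gives the reduction in latency, while the throughput table gives the increase in images per second. The following is a minimal sketch of that arithmetic, assuming one image per run so that throughput is simply the inverse of latency. The function names are illustrative and not part of the repository.

```python
# Sketch only: assumes batch size 1, so throughput = 1000 / latency_ms.
# Function names are hypothetical helpers, not part of the demo scripts.

def latency_reduction_pct(baseline_ms: float, optimized_ms: float) -> float:
    """Percentage by which the optimized latency is lower than the baseline."""
    return (baseline_ms - optimized_ms) / baseline_ms * 100


def throughput_images_per_sec(latency_ms: float) -> float:
    """Images per second for a single-image run that takes latency_ms."""
    return 1000 / latency_ms


def throughput_gain_pct(baseline_ms: float, optimized_ms: float) -> float:
    """Percentage increase in throughput implied by the two latencies."""
    return (baseline_ms / optimized_ms - 1) * 100


# Timings for 30 steps at 1024x1024, taken from the timing table rows shown.
timings_ms = {"A100": (3704, 2742), "H100": (2496, 1471)}

for gpu, (baseline, optimized) in timings_ms.items():
    print(
        f"{gpu}: latency -{latency_reduction_pct(baseline, optimized):.0f}%, "
        f"throughput {throughput_images_per_sec(baseline):.2f} -> "
        f"{throughput_images_per_sec(optimized):.2f} images/sec "
        f"(+{throughput_gain_pct(baseline, optimized):.0f}%)"
    )
```

Under this assumption, the H100 timings reproduce the H100 throughput row, which is why a ~41% latency reduction corresponds to a ~70% throughput gain.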
+## Usage Example
 1. Follow the [setup instructions](https://github.com/rajeevsrao/TensorRT/blob/release/8.6/demo/Diffusion/README.md) for launching a TensorRT NGC container.
 ```shell
 python3 -m pip install --upgrade pip
 python3 -m pip install --upgrade tensorrt
+cd demo/Diffusion
 pip3 install -r requirements.txt
 ```
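After the install steps above, it can help to confirm that the TensorRT Python package is importable before running the demo. This is a small sketch, not part of the official setup; `tensorrt_status` is a hypothetical helper name.

```python
import importlib.util


def tensorrt_status() -> str:
    """Report whether the tensorrt package is installed and, if so, its version."""
    # find_spec returns None when the package cannot be located.
    if importlib.util.find_spec("tensorrt") is None:
        return "tensorrt is not installed"
    import tensorrt

    return f"tensorrt {tensorrt.__version__}"


print(tensorrt_status())
```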
 ```
 python3 demo_txt2img_xl.py \
   "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" \
   --build-static-batch \
   --use-cuda-graph \
   --num-warmup-runs 1 \