adamelliotfields committed on
Commit
9edebae
1 Parent(s): 7f19757

Single-file checkpoints

Files changed (6)
  1. README.md +25 -4
  2. app.py +10 -24
  3. cli.py +2 -6
  4. lib/config.py +38 -12
  5. lib/loader.py +81 -71
  6. usage.md +12 -24
README.md CHANGED
@@ -15,6 +15,7 @@ header: mini
 license: apache-2.0
 models:
 - ai-forever/Real-ESRGAN
+- cyberdelia/CyberRealistic
 - fluently/Fluently-v4
 - h94/IP-Adapter
 - Linaqruf/anything-v3-1
@@ -22,31 +23,38 @@ models:
 - prompthero/openjourney-v4
 - runwayml/stable-diffusion-v1-5
 - SG161222/Realistic_Vision_V5.1_noVAE
+- XpucT/Deliberate
 preload_from_hub:
 - >-
   ai-forever/Real-ESRGAN
   RealESRGAN_x2.pth,RealESRGAN_x4.pth
+- >-
+  cyberdelia/CyberRealistic
+  CyberRealistic_V5_FP16.safetensors
 - >-
   fluently/Fluently-v4
-  text_encoder/model.fp16.safetensors,unet/diffusion_pytorch_model.fp16.safetensors,vae/diffusion_pytorch_model.fp16.safetensors
+  Fluently-v4.safetensors
 - >-
   h94/IP-Adapter
   models/ip-adapter-full-face_sd15.safetensors,models/ip-adapter-plus_sd15.safetensors,models/image_encoder/model.safetensors
 - >-
   Linaqruf/anything-v3-1
-  text_encoder/model.safetensors,unet/diffusion_pytorch_model.safetensors,vae/diffusion_pytorch_model.safetensors
+  anything-v3-2.safetensors
 - >-
   Lykon/dreamshaper-8
   text_encoder/model.fp16.safetensors,unet/diffusion_pytorch_model.fp16.safetensors,vae/diffusion_pytorch_model.fp16.safetensors
 - >-
   prompthero/openjourney-v4
-  text_encoder/model.safetensors,unet/diffusion_pytorch_model.safetensors,vae/diffusion_pytorch_model.safetensors
+  openjourney-v4.ckpt
 - >-
   runwayml/stable-diffusion-v1-5
   text_encoder/model.fp16.safetensors,unet/diffusion_pytorch_model.fp16.safetensors,vae/diffusion_pytorch_model.fp16.safetensors
 - >-
   SG161222/Realistic_Vision_V5.1_noVAE
-  text_encoder/model.safetensors,unet/diffusion_pytorch_model.safetensors,vae/diffusion_pytorch_model.safetensors
+  Realistic_Vision_V5.1_fp16-no-ema.safetensors
+- >-
+  XpucT/Deliberate
+  Deliberate_v6.safetensors
 ---
 
 # diffusion
@@ -85,3 +93,16 @@ python app.py --port 7860
 # cli
 python cli.py 'an astronaut riding a horse on mars'
 ```
+
+## Development
+
+See [pull requests and discussions](https://huggingface.co/docs/hub/en/repositories-pull-requests-discussions).
+
+```sh
+git fetch origin refs/pr/42:pr/42
+git checkout pr/42
+# ...
+git add .
+git commit -m "Commit message"
+git push origin pr/42:refs/pr/42
+```
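Each `preload_from_hub` entry is a repo id followed by a comma-separated list of files, which Spaces downloads into the cache at build time. A minimal sketch of the equivalent manual download (repo id and filename taken from the entries above):

```python
# Fetch one of the single-file checkpoints into the local HF cache,
# mirroring what a `preload_from_hub` entry does at Space build time.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="cyberdelia/CyberRealistic",
    filename="CyberRealistic_V5_FP16.safetensors",
)
print(path)  # local path of the cached checkpoint
```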
app.py CHANGED
@@ -144,7 +144,7 @@ with gr.Blocks(
                 min_width=240,
             )
             scheduler = gr.Dropdown(
-                choices=Config.SCHEDULERS,
+                choices=Config.SCHEDULERS.keys(),
                 value=Config.SCHEDULER,
                 elem_id="scheduler",
                 label="Scheduler",
@@ -245,23 +245,6 @@ with gr.Blocks(
                     maximum=(2**64) - 1,
                 )
 
-                with gr.Row():
-                    increment_seed = gr.Checkbox(
-                        elem_classes=["checkbox"],
-                        label="Autoincrement",
-                        value=True,
-                    )
-                    use_freeu = gr.Checkbox(
-                        elem_classes=["checkbox"],
-                        label="FreeU",
-                        value=False,
-                    )
-                    use_clip_skip = gr.Checkbox(
-                        elem_classes=["checkbox"],
-                        label="Clip skip",
-                        value=False,
-                    )
-
                 with gr.Row():
                     use_karras = gr.Checkbox(
                         elem_classes=["checkbox"],
@@ -273,9 +256,14 @@ with gr.Blocks(
                         label="Tiny VAE",
                         value=False,
                     )
-                    truncate_prompts = gr.Checkbox(
+                    use_freeu = gr.Checkbox(
                         elem_classes=["checkbox"],
-                        label="Truncate prompts",
+                        label="FreeU",
+                        value=False,
+                    )
+                    use_clip_skip = gr.Checkbox(
+                        elem_classes=["checkbox"],
+                        label="Clip skip",
                         value=False,
                     )
 
@@ -468,15 +456,13 @@ with gr.Blocks(
             guidance_scale,
             inference_steps,
             denoising_strength,
+            deepcache_interval,
+            scale,
             num_images,
             use_karras,
             use_taesd,
             use_freeu,
             use_clip_skip,
-            truncate_prompts,
-            increment_seed,
-            deepcache_interval,
-            scale,
         ],
     )
 
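With `Config.SCHEDULERS` now a dict (see `lib/config.py` below), the dropdown lists its keys and the selected label is mapped back to a scheduler class at load time. A minimal sketch of that lookup, assuming the config below is importable:

```python
# Hypothetical standalone lookup; in the app the label comes from the
# gr.Dropdown above and the instantiation happens in lib/loader.py.
from lib.config import Config

label = "DEIS 2M"  # the Config.SCHEDULER default
scheduler_cls = Config.SCHEDULERS[label]  # DEISMultistepScheduler
scheduler = scheduler_cls(beta_schedule="scaled_linear")
```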
cli.py CHANGED
@@ -36,10 +36,8 @@ async def main():
     parser.add_argument("--ip-face", action="store_true")
     parser.add_argument("--taesd", action="store_true")
     parser.add_argument("--clip-skip", action="store_true")
-    parser.add_argument("--truncate", action="store_true")
     parser.add_argument("--karras", action="store_true")
     parser.add_argument("--freeu", action="store_true")
-    parser.add_argument("--no-increment", action="store_false")
     # fmt: on
 
     args = parser.parse_args()
@@ -60,15 +58,13 @@ async def main():
         args.guidance,
         args.steps,
         args.strength,
+        args.deepcache,
+        args.scale,
         args.images,
         args.karras,
         args.taesd,
         args.freeu,
         args.clip_skip,
-        args.truncate,
-        args.no_increment,
-        args.deepcache,
-        args.scale,
     )
     await async_call(save_images, images, args.filename)
 
lib/config.py CHANGED
@@ -1,5 +1,16 @@
 from types import SimpleNamespace
 
+from diffusers import (
+    DDIMScheduler,
+    DEISMultistepScheduler,
+    DPMSolverMultistepScheduler,
+    EulerAncestralDiscreteScheduler,
+    EulerDiscreteScheduler,
+    PNDMScheduler,
+    StableDiffusionImg2ImgPipeline,
+    StableDiffusionPipeline,
+)
+
 Config = SimpleNamespace(
     MONO_FONTS=["monospace"],
     SANS_FONTS=[
@@ -9,30 +20,45 @@ Config = SimpleNamespace(
         "Segoe UI Symbol",
         "Noto Color Emoji",
     ],
+    PIPELINES={
+        "txt2img": StableDiffusionPipeline,
+        "img2img": StableDiffusionImg2ImgPipeline,
+    },
     MODEL="Lykon/dreamshaper-8",
     MODELS=[
+        "cyberdelia/CyberRealistic",
        "fluently/Fluently-v4",
        "Linaqruf/anything-v3-1",
        "Lykon/dreamshaper-8",
        "prompthero/openjourney-v4",
        "runwayml/stable-diffusion-v1-5",
-       "SG161222/Realistic_Vision_V5.1_Novae",
+       "SG161222/Realistic_Vision_V5.1_noVAE",
+       "XpucT/Deliberate",
     ],
+    MODEL_CHECKPOINTS={
+        # keep keys lowercase
+        "cyberdelia/cyberrealistic": "CyberRealistic_V5_FP16.safetensors",
+        "fluently/fluently-v4": "Fluently-v4.safetensors",
+        "linaqruf/anything-v3-1": "anything-v3-2.safetensors",
+        "prompthero/openjourney-v4": "openjourney-v4.ckpt",
+        "sg161222/realistic_vision_v5.1_novae": "Realistic_Vision_V5.1_fp16-no-ema.safetensors",
+        "xpuct/deliberate": "Deliberate_v6.safetensors",
+    },
     SCHEDULER="DEIS 2M",
-    SCHEDULERS=[
-        "DDIM",
-        "DEIS 2M",
-        "DPM++ 2M",
-        "Euler",
-        "Euler a",
-        "PNDM",
-    ],
+    SCHEDULERS={
+        "DDIM": DDIMScheduler,
+        "DEIS 2M": DEISMultistepScheduler,
+        "DPM++ 2M": DPMSolverMultistepScheduler,
+        "Euler": EulerDiscreteScheduler,
+        "Euler a": EulerAncestralDiscreteScheduler,
+        "PNDM": PNDMScheduler,
+    },
     EMBEDDING="fast_negative",
-    EMBEDDINGS={
+    EMBEDDINGS=[
        "bad_dream",
        "fast_negative",
        "unrealistic_dream",
-    },
+    ],
     STYLE="sai-enhance",
     WIDTH=448,
     HEIGHT=576,
@@ -40,7 +66,7 @@ Config = SimpleNamespace(
     SEED=-1,
     GUIDANCE_SCALE=6,
     INFERENCE_STEPS=35,
-    DENOISING_STRENGTH=0.6,
+    DENOISING_STRENGTH=0.7,
     DEEPCACHE_INTERVAL=1,
     SCALE=1,
     SCALES=[1, 2, 4],
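The new `MODEL_CHECKPOINTS` map keys single-file models by lowercase repo id, which is why callers normalize with `.lower()`. A short sketch of how a caller can distinguish single-file checkpoints from the diffusers folder layout, assuming `lib/config.py` above is importable:

```python
from lib.config import Config

for model in Config.MODELS:
    checkpoint = Config.MODEL_CHECKPOINTS.get(model.lower())
    if checkpoint:
        print(f"{model}: single file -> {checkpoint}")
    else:
        print(f"{model}: diffusers folder layout (fp16 variant)")
```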
lib/loader.py CHANGED
@@ -1,27 +1,18 @@
+import gc
+
 import torch
 from DeepCache import DeepCacheSDHelper
-from diffusers import (
-    DDIMScheduler,
-    DEISMultistepScheduler,
-    DPMSolverMultistepScheduler,
-    EulerAncestralDiscreteScheduler,
-    EulerDiscreteScheduler,
-    PNDMScheduler,
-    StableDiffusionImg2ImgPipeline,
-    StableDiffusionPipeline,
-)
+from diffusers import StableDiffusionImg2ImgPipeline, StableDiffusionPipeline
 from diffusers.models import AutoencoderKL, AutoencoderTiny
 from diffusers.models.attention_processor import AttnProcessor2_0, IPAdapterAttnProcessor2_0
 from torch._dynamo import OptimizedModule
 
+from .config import Config
 from .upscaler import RealESRGAN
 
 __import__("warnings").filterwarnings("ignore", category=FutureWarning, module="diffusers")
-
-PIPELINES = {
-    "txt2img": StableDiffusionPipeline,
-    "img2img": StableDiffusionImg2ImgPipeline,
-}
+__import__("warnings").filterwarnings("ignore", category=FutureWarning, module="torch")
+__import__("diffusers").logging.set_verbosity_error()
 
 
 class Loader:
@@ -31,6 +22,7 @@ class Loader:
         if cls._instance is None:
             cls._instance = super(Loader, cls).__new__(cls)
             cls._instance.pipe = None
+            cls._instance.model = None
             cls._instance.upscaler = None
             cls._instance.ip_adapter = None
         return cls._instance
@@ -38,13 +30,13 @@
     def _should_unload_upscaler(self, scale=1):
         return self.upscaler is not None and scale == 1
 
-    def _should_unload_ip_adapter(self, ip_adapter=None):
-        return self.ip_adapter is not None and ip_adapter is None
+    def _should_unload_ip_adapter(self, ip_adapter=""):
+        return self.ip_adapter is not None and not ip_adapter
 
     def _should_unload_pipeline(self, kind="", model=""):
         if self.pipe is None:
             return False
-        if self.pipe.config._name_or_path.lower() != model.lower():
+        if self.model.lower() != model.lower():
             return True
         if kind == "txt2img" and not isinstance(self.pipe, StableDiffusionPipeline):
             return True  # txt2img -> img2img
@@ -52,6 +44,7 @@
             return True  # img2img -> txt2img
         return False
 
+    # https://github.com/huggingface/diffusers/blob/v0.28.0/src/diffusers/loaders/ip_adapter.py#L300
     def _unload_ip_adapter(self):
         print("Unloading IP Adapter...")
         if not isinstance(self.pipe, StableDiffusionImg2ImgPipeline):
@@ -73,7 +66,7 @@
         )
         self.pipe.unet.set_attn_processor(attn_procs)
 
-    def _unload(self, kind="", model="", ip_adapter=None, scale=1):
+    def _unload(self, kind="", model="", ip_adapter="", scale=1):
         to_unload = []
 
         if self._should_unload_upscaler(scale):
@@ -84,27 +77,30 @@
             to_unload.append("ip_adapter")
 
         if self._should_unload_pipeline(kind, model):
+            to_unload.append("model")
             to_unload.append("pipe")
 
         for component in to_unload:
-            if hasattr(self, component):
-                delattr(self, component)
+            delattr(self, component)
 
+        gc.collect()
         torch.cuda.empty_cache()
         torch.cuda.ipc_collect()
+        torch.cuda.reset_max_memory_allocated()
+        torch.cuda.reset_peak_memory_stats()
 
         for component in to_unload:
             setattr(self, component, None)
 
-    def _load_ip_adapter(self, ip_adapter=None):
-        if self.ip_adapter is None and ip_adapter is not None:
+    def _load_ip_adapter(self, ip_adapter=""):
+        if self.ip_adapter is None and ip_adapter:
             print(f"Loading IP Adapter: {ip_adapter}...")
             self.pipe.load_ip_adapter(
                 "h94/IP-Adapter",
                 subfolder="models",
                 weight_name=f"ip-adapter-{ip_adapter}_sd15.safetensors",
             )
-            # TODO: slider for ip_scale
+            # 50% works the best
            self.pipe.set_ip_adapter_scale(0.5)
            self.ip_adapter = ip_adapter
 
@@ -114,24 +110,39 @@
             self.upscaler = RealESRGAN(device=device, scale=scale)
             self.upscaler.load_weights()
 
-    def _load_pipeline(self, kind, model, taesd, device, **kwargs):
-        pipeline = PIPELINES[kind]
+    def _load_pipeline(self, kind, model, device, **kwargs):
+        pipeline = Config.PIPELINES[kind]
         if self.pipe is None:
-            print(f"Loading {model.lower()} with {'Tiny' if taesd else 'KL'} VAE...")
-            self.pipe = pipeline.from_pretrained(model, **kwargs).to(device)
+            print(f"Loading {model}...")
+            try:
+                if model.lower() in Config.MODEL_CHECKPOINTS.keys():
+                    self.pipe = pipeline.from_single_file(
+                        f"https://huggingface.co/{model}/{Config.MODEL_CHECKPOINTS[model.lower()]}",
+                        **kwargs,
+                    ).to(device)
+                else:
+                    self.pipe = pipeline.from_pretrained(model, **kwargs).to(device)
+                self.model = model
+            except Exception as e:
+                print(f"Error loading {model}: {e}")
+                self.model = None
+                self.pipe = None
+                return
+
         if not isinstance(self.pipe, pipeline):
             self.pipe = pipeline.from_pipe(self.pipe).to(device)
+        self.pipe.set_progress_bar_config(disable=True)
 
-    def _load_vae(self, taesd=False, model_name=None, variant=None):
+    def _load_vae(self, taesd=False, model=""):
         vae_type = type(self.pipe.vae)
         is_kl = issubclass(vae_type, (AutoencoderKL, OptimizedModule))
         is_tiny = issubclass(vae_type, AutoencoderTiny)
 
         # by default all models use KL
         if is_kl and taesd:
-            # can't compile tiny VAE
             print("Switching to Tiny VAE...")
             self.pipe.vae = AutoencoderTiny.from_pretrained(
+                # can't compile tiny VAE
                 pretrained_model_name_or_path="madebyollin/taesd",
                 torch_dtype=self.pipe.dtype,
             ).to(self.pipe.device)
@@ -139,16 +150,22 @@
 
         if is_tiny and not taesd:
             print("Switching to KL VAE...")
-            model = AutoencoderKL.from_pretrained(
-                pretrained_model_name_or_path=model_name,
-                torch_dtype=self.pipe.dtype,
-                subfolder="vae",
-                variant=variant,
-            ).to(self.pipe.device)
+            if model.lower() in Config.MODEL_CHECKPOINTS.keys():
+                vae = AutoencoderKL.from_single_file(
+                    f"https://huggingface.co/{model}/{Config.MODEL_CHECKPOINTS[model.lower()]}",
+                    torch_dtype=self.pipe.dtype,
+                ).to(self.pipe.device)
+            else:
+                vae = AutoencoderKL.from_pretrained(
+                    pretrained_model_name_or_path=model,
+                    torch_dtype=self.pipe.dtype,
+                    subfolder="vae",
+                    variant="fp16",
+                ).to(self.pipe.device)
             self.pipe.vae = torch.compile(
                 mode="reduce-overhead",
                 fullgraph=True,
-                model=model,
+                model=vae,
             )
 
     def _load_deepcache(self, interval=1):
@@ -162,8 +179,8 @@
         self.pipe.deepcache.set_params(cache_interval=interval)
         self.pipe.deepcache.enable()
 
+    # https://github.com/ChenyangSi/FreeU
     def _load_freeu(self, freeu=False):
-        # https://github.com/huggingface/diffusers/blob/v0.30.0/src/diffusers/models/unets/unet_2d_condition.py
         block = self.pipe.unet.up_blocks[0]
         attrs = ["b1", "b2", "s1", "s2"]
         has_freeu = all(getattr(block, attr, None) is not None for attr in attrs)
@@ -171,7 +188,6 @@
             print("Disabling FreeU...")
             self.pipe.disable_freeu()
         elif not has_freeu and freeu:
-            # https://github.com/ChenyangSi/FreeU
             print("Enabling FreeU...")
             self.pipe.enable_freeu(b1=1.5, b2=1.6, s1=0.9, s2=0.2)
 
@@ -187,20 +203,7 @@
         deepcache,
         scale,
         device,
-        dtype,
     ):
-        model_lower = model.lower()
-        model_name = self.pipe.config._name_or_path.lower() if self.pipe is not None else ""
-
-        schedulers = {
-            "DDIM": DDIMScheduler,
-            "DEIS 2M": DEISMultistepScheduler,
-            "DPM++ 2M": DPMSolverMultistepScheduler,
-            "Euler": EulerDiscreteScheduler,
-            "Euler a": EulerAncestralDiscreteScheduler,
-            "PNDM": PNDMScheduler,
-        }
-
         scheduler_kwargs = {
             "beta_schedule": "scaled_linear",
             "timestep_spacing": "leading",
@@ -217,45 +220,52 @@
             scheduler_kwargs["clip_sample"] = False
             scheduler_kwargs["set_alpha_to_one"] = False
 
-        # no fp16 variant
-        if model_lower not in [
-            "sg161222/realistic_vision_v5.1_novae",
-            "prompthero/openjourney-v4",
-            "linaqruf/anything-v3-1",
-        ]:
-            variant = "fp16"
-        else:
-            variant = None
-
         pipe_kwargs = {
-            "scheduler": schedulers[scheduler](**scheduler_kwargs),
-            "requires_safety_checker": False,
             "safety_checker": None,
-            "torch_dtype": dtype,
-            "variant": variant,
+            "requires_safety_checker": False,
+            "scheduler": Config.SCHEDULERS[scheduler](**scheduler_kwargs),
         }
 
+        # diffusers fp16 variant
+        if model.lower() not in Config.MODEL_CHECKPOINTS.keys():
+            pipe_kwargs["variant"] = "fp16"
+        else:
+            pipe_kwargs["variant"] = None
+
+        # convert fp32 to bf16/fp16
+        if (
+            model.lower() in ["linaqruf/anything-v3-1"]
+            and torch.cuda.get_device_properties(device).major >= 8
+        ):
+            pipe_kwargs["torch_dtype"] = torch.bfloat16
+        else:
+            pipe_kwargs["torch_dtype"] = torch.float16
+
         self._unload(kind, model, ip_adapter, scale)
-        self._load_pipeline(kind, model, taesd, device, **pipe_kwargs)
+        self._load_pipeline(kind, model, device, **pipe_kwargs)
+
+        # error loading model
+        if self.pipe is None:
+            return self.pipe, self.upscaler
 
-        same_scheduler = isinstance(self.pipe.scheduler, schedulers[scheduler])
+        same_scheduler = isinstance(self.pipe.scheduler, Config.SCHEDULERS[scheduler])
         same_karras = (
             not hasattr(self.pipe.scheduler.config, "use_karras_sigmas")
             or self.pipe.scheduler.config.use_karras_sigmas == karras
         )
 
         # same model, different scheduler
-        if model_name == model_lower:
+        if self.model.lower() == model.lower():
             if not same_scheduler:
                 print(f"Switching to {scheduler}...")
             if not same_karras:
                 print(f"{'Enabling' if karras else 'Disabling'} Karras sigmas...")
             if not same_scheduler or not same_karras:
-                self.pipe.scheduler = schedulers[scheduler](**scheduler_kwargs)
+                self.pipe.scheduler = Config.SCHEDULERS[scheduler](**scheduler_kwargs)
 
         self._load_upscaler(device, scale)
         self._load_ip_adapter(ip_adapter)
-        self._load_vae(taesd, model_lower, variant)
+        self._load_vae(taesd, model)
         self._load_freeu(freeu)
         self._load_deepcache(deepcache)
         return self.pipe, self.upscaler
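The core of this commit is the `from_single_file` branch in `_load_pipeline`. A minimal standalone sketch of that path, building the same URL the loader builds from `Config.MODEL_CHECKPOINTS` (model and checkpoint names are taken from the config above; requires a CUDA device):

```python
# Standalone sketch of the single-file loading path, outside the Loader.
import torch
from diffusers import StableDiffusionPipeline

# Same f"https://huggingface.co/{model}/{checkpoint}" format as the loader
url = "https://huggingface.co/XpucT/Deliberate/Deliberate_v6.safetensors"

pipe = StableDiffusionPipeline.from_single_file(
    url,
    torch_dtype=torch.float16,
    safety_checker=None,
    requires_safety_checker=False,
).to("cuda")
```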
usage.md CHANGED
@@ -4,13 +4,7 @@ Enter a prompt and click `Generate`. Roll the `🎲` for a random prompt.
 
 ### Prompting
 
-Positive and negative prompts are embedded by [Compel](https://github.com/damian0815/compel) for weighting. You can use a float or +/-. For example:
-* `man, portrait, blue+ eyes, close-up`
-* `man, portrait, (blue)1.1 eyes, close-up`
-* `man, portrait, (blue eyes)-, close-up`
-* `man, portrait, (blue eyes)0.9, close-up`
-
-Note that `++` is `1.1^2` (and so on). See [syntax features](https://github.com/damian0815/compel/blob/main/doc/syntax.md) to learn more and read [Civitai](https://civitai.com)'s guide on [prompting](https://education.civitai.com/civitais-prompt-crafting-guide-part-1-basics/) for best practices.
+Positive and negative prompts are embedded by [Compel](https://github.com/damian0815/compel) for weighting. See [syntax features](https://github.com/damian0815/compel/blob/main/doc/syntax.md) to learn more and read [Civitai](https://civitai.com)'s guide on [prompting](https://education.civitai.com/civitais-prompt-crafting-guide-part-1-basics/) for best practices.
 
 #### Arrays
 
@@ -30,22 +24,20 @@ Styles are prompt templates from twri's [sdxl_prompt_styler](https://github.com/
 
 ### Scale
 
-Rescale up to 4x using [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) (Wang et al. 2021).
+Rescale up to 4x using [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) from [ai-forever](https://huggingface.co/ai-forever/Real-ESRGAN).
 
 ### Models
 
 Each model checkpoint has a different aesthetic:
 
-* [lykon/dreamshaper-8](https://huggingface.co/Lykon/dreamshaper-8): general purpose (default)
-* [fluently/fluently-v4](https://huggingface.co/fluently/Fluently-v4): general purpose merge
-* [linaqruf/anything-v3-1](https://huggingface.co/linaqruf/anything-v3-1): anime
+* [cyberdelia/CyberRealistic_v5](https://huggingface.co/cyberdelia/CyberRealistic): photorealistic
+* [Lykon/dreamshaper-8](https://huggingface.co/Lykon/dreamshaper-8): general purpose (default)
+* [fluently/Fluently-v4](https://huggingface.co/fluently/Fluently-v4): general purpose
+* [Linaqruf/anything-v3-1](https://huggingface.co/Linaqruf/anything-v3-1): anime
 * [prompthero/openjourney-v4](https://huggingface.co/prompthero/openjourney-v4): Midjourney-like
 * [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5): base
-* [sg161222/realistic_vision_v5.1](https://huggingface.co/SG161222/Realistic_Vision_V5.1_noVAE): photorealistic
-
-### Schedulers
-
-The default is [DEIS 2M](https://huggingface.co/docs/diffusers/en/api/schedulers/deis) with [Karras](https://arxiv.org/abs/2206.00364) enabled. The other multistep scheduler, [DPM++ 2M](https://huggingface.co/docs/diffusers/en/api/schedulers/multistep_dpm_solver), is also good. For realism, [DDIM](https://huggingface.co/docs/diffusers/en/api/schedulers/ddim) is recommended. [Euler a](https://huggingface.co/docs/diffusers/en/api/schedulers/euler_ancestral) is worth trying for a different look.
+* [SG161222/Realistic_Vision_v5.1](https://huggingface.co/SG161222/Realistic_Vision_V5.1_noVAE): photorealistic
+* [XpucT/Deliberate_v6](https://huggingface.co/XpucT/Deliberate): general purpose
 
 ### Image-to-Image
 
@@ -55,15 +47,15 @@ Denoising strength is essentially how much the generation will differ from the i
 
 ### IP-Adapter
 
-In an image-to-image pipeline, the input image is used as the initial latent. With [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter) (Ye et al. 2023), the input image is processed by a separate image encoder and the encoded features are used as conditioning along with the text prompt.
+In an image-to-image pipeline, the input image is used as the initial latent. With [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter), the input image is processed by a separate image encoder and the encoded features are used as conditioning along with the text prompt.
 
-For capturing faces, enable `IP-Adapter Face` to use the full-face model. You should use an input image that is mostly a face along with the Realistic Vision model.
+For capturing faces, enable `IP-Adapter Face` to use the full-face model. You should use an input image that is mostly a face and it should be high quality. You can generate fake portraits with Realistic Vision to experiment. Note that you'll never get true identity preservation without an advanced pipeline like [InstantID](https://github.com/instantX-research/InstantID), which combines many techniques.
 
 ### Advanced
 
 #### DeepCache
 
-[DeepCache](https://github.com/horseee/DeepCache) (Ma et al. 2023) caches lower UNet layers and reuses them every `Interval` steps. Trade quality for speed:
+[DeepCache](https://github.com/horseee/DeepCache) caches lower UNet layers and reuses them every `Interval` steps. Trade quality for speed:
 * `1`: no caching (default)
 * `2`: more quality
 * `3`: balanced
@@ -71,7 +63,7 @@ For capturing faces, enable `IP-Adapter Face` to use the full-face model. You sh
 
 #### FreeU
 
-[FreeU](https://github.com/ChenyangSi/FreeU) (Si et al. 2023) re-weights the contributions sourced from the UNet’s skip connections and backbone feature maps. Can sometimes improve image quality.
+[FreeU](https://github.com/ChenyangSi/FreeU) re-weights the contributions sourced from the UNet’s skip connections and backbone feature maps. Can sometimes improve image quality.
 
 #### Clip Skip
 
@@ -80,7 +72,3 @@ When enabled, the last CLIP layer is skipped. Can sometimes improve image qualit
 #### Tiny VAE
 
 Enable [madebyollin/taesd](https://github.com/madebyollin/taesd) for near-instant latent decoding with a minor loss in detail. Useful for development.
-
-#### Prompt Truncation
-
-When enabled, prompts will be truncated to CLIP's limit of 77 tokens. By default this is _disabled_, so Compel will chunk prompts into segments rather than cutting them off.
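For reference, a minimal sketch of the Compel weighting described in the Prompting section above (assumes `compel` is installed and `pipe` is an already-loaded `StableDiffusionPipeline`):

```python
# Weighted prompt embedding with Compel; `pipe` is assumed to exist.
from compel import Compel

compel = Compel(tokenizer=pipe.tokenizer, text_encoder=pipe.text_encoder)
embeds = compel("man, portrait, blue+ eyes, close-up")  # `+` upweights by 1.1
image = pipe(prompt_embeds=embeds).images[0]
```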