initialize model card

Files changed (10)

.DS_Store +0 -0
.gitattributes +3 -0
README.md +150 -0
assets/.DS_Store +0 -0
assets/outdomain_preview.png +3 -0
assets/qualitative_real.png +3 -0
assets/teaser_figure.png +3 -0
adapter.pt → models/adapter.pt +0 -0
aggregator.pt → models/aggregator.pt +0 -0
previewer_lora_weights.bin → models/previewer_lora_weights.bin +0 -0

.DS_Store ADDED

Binary file (6.15 kB).

.gitattributes CHANGED

@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+assets/outdomain_preview.png filter=lfs diff=lfs merge=lfs -text
+assets/qualitative_real.png filter=lfs diff=lfs merge=lfs -text
+assets/teaser_figure.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,153 @@
 ---
 license: apache-2.0
+language:
+- en
+library_name: diffusers
+pipeline_tag: image-to-image
 ---
+# InstantIR Model Card
+<!-- > **InstantIR: Blind Image Restoration with Instant Generative Reference**<br>
+> Jen-Yuan Huang<sup>1,2</sup>, Haofan Wang<sup>2</sup>, Qixun Wang<sup>2</sup>, Xu Bai<sup>2</sup>, Hao Ai<sup>2</sup>, Peng Xing<sup>2</sup>, Jen-Tse Huang<sup>3</sup> <br>
+> <sup>1</sup>Peking University, <sup>2</sup>InstantX Team, <sup>3</sup>The Chinese University of Hong Kong -->
+<a href='https://arxiv.org/abs/2410.06551'><img src='https://img.shields.io/badge/arXiv-b31b1b.svg'></a>
+<a href='https://jy-joy.github.io/InstantIR'><img src='https://img.shields.io/badge/Website-informational'></a>
+<a href='https://github.com/JY-Joy/InstantIR'><img src='https://img.shields.io/badge/Github-gray'></a>
+> **InstantIR** is a novel single-image restoration model designed to resurrect your damaged images, delivering extreme-quality yet realistic details. You can further boost **InstantIR** performance with additional text prompts, or even use them to achieve customized editing!
+<div  align="center">
+<img src='assets/teaser_figure.png'>
+</div>
+## Usage
+### 1. Clone the GitHub repo
+```sh
+git clone https://github.com/JY-Joy/InstantIR.git
+cd InstantIR
+```
+### 2. Download model weights
+You can download the InstantIR weights directly from this repository, or
+fetch them with a short Python script:
+```python
+from huggingface_hub import hf_hub_download
+# Files keep their repo-relative path, so local_dir="." places them under ./models/
+hf_hub_download(repo_id="InstantX/InstantIR", filename="models/adapter.pt", local_dir=".")
+hf_hub_download(repo_id="InstantX/InstantIR", filename="models/aggregator.pt", local_dir=".")
+hf_hub_download(repo_id="InstantX/InstantIR", filename="models/previewer_lora_weights.bin", local_dir=".")
+```
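
The loading code in step 3 reads the weights from `./models/adapter.pt`, `./models/aggregator.pt`, and the `./models` directory (for the previewer LoRA). Before loading, it can help to confirm the files actually landed there. The following is a minimal sketch using only the standard library; the file names come from the download snippet, while the helper name `missing_weights` is hypothetical and not part of the InstantIR repository.

```python
from pathlib import Path

# File names taken from the download snippet; the helper itself is hypothetical.
EXPECTED_WEIGHTS = (
    "models/adapter.pt",
    "models/aggregator.pt",
    "models/previewer_lora_weights.bin",
)


def missing_weights(root="."):
    """Return the expected weight files that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_WEIGHTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights(".")
    if missing:
        print("Missing weight files:", ", ".join(missing))
    else:
        print("All InstantIR weight files found.")
```

If any file is reported missing, check whether it was written to a nested directory such as `./models/models/`, since `hf_hub_download` keeps the repository-relative path below `local_dir`.
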
+### 3. Load InstantIR with 🧨 diffusers
+```python
+# !pip install opencv-python transformers accelerate
+import torch
+from PIL import Image
+import diffusers
+from diffusers import DDPMScheduler, StableDiffusionXLPipeline
+from diffusers.utils import load_image
+from schedulers.lcm_single_step_scheduler import LCMSingleStepScheduler
+from transformers import AutoImageProcessor, AutoModel
+from module.ip_adapter.utils import load_ip_adapter_to_pipe, revise_state_dict, init_ip_adapter_in_unet
+from module.ip_adapter.resampler import Resampler
+from module.aggregator import Aggregator
+from pipelines.sdxl_instantir import InstantIRPipeline
+# prepare 'dinov2'
+image_encoder = AutoModel.from_pretrained('facebook/dinov2-large')
+image_processor = AutoImageProcessor.from_pretrained('facebook/dinov2-large')
+# paths to the InstantIR weights downloaded to ./models in step 2
+dcp_adapter = './models/adapter.pt'
+previewer_lora_path = './models'
+instantir_path = './models/aggregator.pt'
+# load SDXL
+sdxl = StableDiffusionXLPipeline.from_pretrained('stabilityai/stable-diffusion-xl-base-1.0', torch_dtype=torch.float16)
+# load adapter
+image_proj_model = Resampler(
+    embedding_dim=image_encoder.config.hidden_size,
+    output_dim=sdxl.unet.config.cross_attention_dim,
+)
+init_ip_adapter_in_unet(
+    sdxl.unet,
+    image_proj_model,
+    dcp_adapter,
+)
+pipe = InstantIRPipeline(
+    sdxl.vae, sdxl.text_encoder, sdxl.text_encoder_2, sdxl.tokenizer, sdxl.tokenizer_2,
+    sdxl.unet, sdxl.scheduler, feature_extractor=image_processor, image_encoder=image_encoder,
+)
+pipe.cuda()
+# load previewer lora
+pipe.prepare_previewers(previewer_lora_path)
+pipe.unet.to(dtype=torch.float16)
+pipe.scheduler = DDPMScheduler.from_pretrained('stabilityai/stable-diffusion-xl-base-1.0', subfolder="scheduler")
+lcm_scheduler = LCMSingleStepScheduler.from_config(pipe.scheduler.config)
+# load aggregator weights
+pretrained_state_dict = torch.load(instantir_path, map_location="cpu")
+pipe.aggregator.load_state_dict(pretrained_state_dict)
+pipe.aggregator.to(dtype=torch.float16)
+```
+Then, you can restore your broken images with:
+```python
+# load a broken image
+image = Image.open('path/to/your-image').convert("RGB")
+# InstantIR restoration
+image = pipe(
+    prompt='',
+    image=image,
+    ip_adapter_image=[image],
+    negative_prompt='',
+    guidance_scale=7.0,
+    previewer_scheduler=lcm_scheduler,
+    return_dict=False,
+)[0]
+```
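
The call above returns the restored output but does not write it to disk. The following is a minimal sketch for saving it. It assumes that the first element of the returned tuple is either a single PIL image or a list of PIL images, which is the usual diffusers convention with `return_dict=False`; the model card does not state which one `InstantIRPipeline` returns. The helper `save_restored`, the `outputs` directory, and the file naming are all hypothetical.

```python
from pathlib import Path


def save_restored(result, out_dir="outputs", stem="restored"):
    """Save one image or a list of images and return the written paths.

    Assumes every image exposes PIL's `save(path)` method.
    """
    images = list(result) if isinstance(result, (list, tuple)) else [result]
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, img in enumerate(images):
        path = out_dir / f"{stem}_{index}.png"
        img.save(path)
        paths.append(path)
    return paths


# Usage with the variable from the restoration snippet (not executed here):
# save_restored(image)
```

Accepting both shapes avoids an `AttributeError` on `image.save(...)` if the pipeline returns a list.
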
+For more details, including text-guided enhancement and editing, please refer to our [GitHub repository](https://github.com/JY-Joy/InstantIR).
+<!-- ## Usage Tips
+1. If you're not satisfied with the similarity, try to increase the weight of "IdentityNet Strength" and "Adapter Strength".
+2. If you feel that the saturation is too high, first decrease the Adapter strength. If it is still too high, then decrease the IdentityNet strength.
+3. If you find that text control is not as expected, decrease Adapter strength.
+4. If you find that realistic style is not good enough, go for our Github repo and use a more realistic base model. -->
+## Examples
+<div  align="center">
+<img src='assets/qualitative_real.png'>
+</div>
+<div  align="center">
+<img src='assets/outdomain_preview.png'>
+</div>
+## Disclaimer
+This project is released under the Apache License 2.0 and aims to positively impact the field of AI-driven image generation. Users are free to create images with this tool, but they are obligated to comply with local laws and to use it responsibly. The developers assume no responsibility for potential misuse by users.
+## Citation
+```bibtex
+@article{huang2024instantir,
+  title={InstantIR: Blind Image Restoration with Instant Generative Reference},
+  author={Huang, Jen-Yuan and Wang, Haofan and Wang, Qixun and Bai, Xu and Ai, Hao and Xing, Peng and Huang, Jen-Tse},
+  journal={arXiv preprint arXiv:2410.06551},
+  year={2024}
+}
+```

assets/.DS_Store ADDED

Binary file (6.15 kB).

assets/outdomain_preview.png ADDED

Git LFS Details

SHA256: cc27a9c5c5ea41785bffd3c0142a83ca87881aeda3604c5052f5eccd5602f5fc
Pointer size: 132 Bytes
Size of remote file: 5.04 MB

assets/qualitative_real.png ADDED

Git LFS Details

SHA256: e8507a8a741a3d6a7a7e4fa2648cb248d88fc6c3102dea0d6f09d016f005a2b8
Pointer size: 132 Bytes
Size of remote file: 6.65 MB

assets/teaser_figure.png ADDED

Git LFS Details

SHA256: 2b0c926a7913faed56e3f5d24967416e283f6d0372c769e24f0f8939f1b50d3f
Pointer size: 132 Bytes
Size of remote file: 5.58 MB
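
The SHA256 digests listed in the Git LFS details can be used to check that a downloaded asset is intact. The following is a minimal sketch using only `hashlib`; the digests are copied from this commit, while the helper names `sha256_of` and `matches_expected` are hypothetical.

```python
import hashlib

# Digests copied from the Git LFS details of this commit.
EXPECTED_SHA256 = {
    "assets/outdomain_preview.png": "cc27a9c5c5ea41785bffd3c0142a83ca87881aeda3604c5052f5eccd5602f5fc",
    "assets/qualitative_real.png": "e8507a8a741a3d6a7a7e4fa2648cb248d88fc6c3102dea0d6f09d016f005a2b8",
    "assets/teaser_figure.png": "2b0c926a7913faed56e3f5d24967416e283f6d0372c769e24f0f8939f1b50d3f",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.lower()


# Usage (not executed here), run from a local copy of the repository:
# matches_expected("assets/teaser_figure.png", EXPECTED_SHA256["assets/teaser_figure.png"])
```

A mismatch most often means the file on disk is still the small LFS pointer file rather than the fetched image.
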

adapter.pt → models/adapter.pt RENAMED

File without changes

aggregator.pt → models/aggregator.pt RENAMED

File without changes

previewer_lora_weights.bin → models/previewer_lora_weights.bin RENAMED

File without changes