import gradio as gr

from convert import run_conversion
from hub_utils import push_to_hub, save_model_card

PRETRAINED_CKPT = "CompVis/stable-diffusion-v1-4"
DESCRIPTION = """
This Space converts KerasCV Stable Diffusion weights to a format compatible with [Diffusers](https://github.com/huggingface/diffusers) 🧨. This lets you fine-tune with KerasCV and then use the fine-tuned weights in Diffusers, taking advantage of its nifty features (like [schedulers](https://huggingface.co/docs/diffusers/main/en/using-diffusers/schedulers), [fast attention](https://huggingface.co/docs/diffusers/optimization/fp16), etc.). Specifically, the Keras weights are first converted to PyTorch and then wrapped into a [`StableDiffusionPipeline`](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/overview). The pipeline is then pushed to the Hugging Face Hub, provided you have supplied `your_hf_token`.

## Notes (important)

* The Space downloads a couple of pre-trained weights and runs a dummy inference. Depending on the machine type, the entire process can take anywhere between 2 and 5 minutes.
* Only Stable Diffusion (v1) is supported as of now. In particular this checkpoint: [`"CompVis/stable-diffusion-v1-4"`](https://huggingface.co/CompVis/stable-diffusion-v1-4).
* Only the text encoder and UNet parameters are converted since only these two elements are generally fine-tuned.
* [This Colab Notebook](https://colab.research.google.com/drive/1RYY077IQbAJldg8FkK8HSEpNILKHEwLb?usp=sharing) was used to develop the conversion utilities initially.
* You can choose NOT to provide `text_encoder_weights` and `unet_weights` in case you don't have any fine-tuned weights. In that case, the original parameters of the respective models (text encoder and UNet) from KerasCV will be used.
* You can provide `text_encoder_weights` alone, `unet_weights` alone, or both.
* When providing the weights' links, ensure they're directly downloadable. Internally, the Space uses [`tf.keras.utils.get_file()`](https://www.tensorflow.org/api_docs/python/tf/keras/utils/get_file) to retrieve the weights locally. 
* If you don't provide `your_hf_token`, the converted pipeline won't be pushed.
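
On the note above about directly downloadable links: a Hub link copied from the file viewer usually contains `/blob/`, which points to a web page rather than the file itself, while the same link with `/resolve/` downloads the file. Below is a minimal sketch of that rewrite. `to_direct_download_url` is a hypothetical helper that is not part of this Space, the repository name is made up, and only model-repo style links are handled.

```python
from urllib.parse import urlsplit, urlunsplit


def to_direct_download_url(url: str) -> str:
    # Hypothetical helper: turn a Hub file-viewer link (/blob/) into a
    # direct-download link (/resolve/). Any other link is returned unchanged.
    parts = urlsplit(url)
    if parts.netloc != "huggingface.co":
        return url
    segments = parts.path.split("/")
    # Model-repo layout: ["", user, repo, "blob", revision, *file_path]
    if len(segments) > 4 and segments[3] == "blob":
        segments[3] = "resolve"
    return urlunsplit(parts._replace(path="/".join(segments)))


# Made-up repository, used only to show the rewrite.
print(to_direct_download_url(
    "https://huggingface.co/some-user/some-repo/blob/main/unet.h5"
))
```

The rewritten link can then be pasted into `text_encoder_weights` or `unet_weights`.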

Check [here](https://github.com/huggingface/diffusers/blob/31be42209ddfdb69d9640a777b32e9b5c6259bf0/examples/dreambooth/train_dreambooth_lora.py#L975) for an example of how to change the scheduler of an already initialized `StableDiffusionPipeline`.
"""


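def run(hf_token, text_encoder_weights, unet_weights, repo_prefix):
    """Convert the KerasCV weights, save the pipeline locally, and hand it to `push_to_hub`."""
    # Empty text boxes arrive as "", so map them to None to mean "not provided".
    # In that case the original KerasCV parameters are used for that model.
    if text_encoder_weights == "":
        text_encoder_weights = None
    if unet_weights == "":
        unet_weights = None
    print(f"unet_weights: {unet_weights}")

    # Build the Diffusers pipeline and serialize it to a local folder.
    pipeline = run_conversion(text_encoder_weights, unet_weights)
    output_path = "kerascv_sd_diffusers_pipeline"
    pipeline.save_pretrained(output_path)

    # Record only the weight links that were actually supplied in the model card.
    weight_paths = []
    if text_encoder_weights is not None:
        weight_paths.append(text_encoder_weights)
    if unet_weights is not None:
        weight_paths.append(unet_weights)
    save_model_card(
        base_model=PRETRAINED_CKPT,
        repo_folder=output_path,
        weight_paths=weight_paths,
    )

    # The returned string is rendered in the Markdown output component.
    push_str = push_to_hub(hf_token, output_path, repo_prefix)
    return push_str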


demo = gr.Interface(
    title="KerasCV Stable Diffusion to Diffusers Stable Diffusion Pipelines 🧨🤗",
    description=DESCRIPTION,
    allow_flagging="never",
    inputs=[
        gr.Text(max_lines=1, label="your_hf_token"),
        gr.Text(max_lines=1, label="text_encoder_weights"),
        gr.Text(max_lines=1, label="unet_weights"),
        gr.Text(max_lines=1, label="output_repo_prefix"),
    ],
    outputs=[gr.Markdown(label="output")],
    fn=run,
)

demo.launch()