Dreambooth using XL

#41
by mauricio-repetto - opened

Hi,

Is this model available for DreamBooth? I tried to run the regular script with this model to see how it compares against the other SD versions, but I'm getting an error :'(

This is the configuration I'm using:

!accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-0.9"  \
  --instance_data_dir="./pneumoconiosis" \
  --class_data_dir="./data/xray" \
  --output_dir="./diffusion/pneumoconiosis/models/stable-diffusion-xl-pneumoconiosis-finetuned" \
  --train_text_encoder \
  --mixed_precision="fp16" \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="image of a pneumoconiosis xray" \
  --class_prompt="image of a xray" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=11400 \
  --checkpointing_steps=4000 \
  --num_validation_images=4 \
  --report_to="wandb" \
  --seed=1337
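
For reference, plain inference with the base checkpoint works fine (the class-image generation step below completes), so loading the model itself isn't the problem. A minimal sketch along these lines (standard diffusers API; the prompt is just an example) runs without issues:

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-0.9",
    torch_dtype=torch.float16,
).to("cuda")

# same kind of prompt used to sample the class images below
image = pipe("image of a xray", num_inference_steps=25).images[0]
image.save("sample.png")

The failure only shows up once the training loop itself starts.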

This is the error:

2023-07-16 01:08:42.735258: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-07-16 01:08:48.477573: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
07/16/2023 01:08:51 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: fp16

Keyword arguments {'safety_checker': None} are not expected by StableDiffusionXLPipeline and will be ignored.
Loading pipeline components...:   0% 0/7 [00:00<?, ?it/s]Loaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of stabilityai/stable-diffusion-xl-base-0.9.
Loading pipeline components...:  14% 1/7 [00:01<00:09,  1.61s/it]{'force_upcast'} was not found in config. Values will be initialized to default values.
Loaded vae as AutoencoderKL from `vae` subfolder of stabilityai/stable-diffusion-xl-base-0.9.
Loading pipeline components...:  29% 2/7 [00:01<00:04,  1.24it/s]Loaded text_encoder as CLIPTextModel from `text_encoder` subfolder of stabilityai/stable-diffusion-xl-base-0.9.
Loading pipeline components...:  43% 3/7 [00:02<00:02,  1.47it/s]Loaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of stabilityai/stable-diffusion-xl-base-0.9.
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of stabilityai/stable-diffusion-xl-base-0.9.
Loading pipeline components...:  71% 5/7 [00:02<00:00,  3.02it/s]Loaded unet as UNet2DConditionModel from `unet` subfolder of stabilityai/stable-diffusion-xl-base-0.9.
Loading pipeline components...:  86% 6/7 [00:07<00:01,  1.72s/it]Loaded scheduler as EulerDiscreteScheduler from `scheduler` subfolder of stabilityai/stable-diffusion-xl-base-0.9.
Loading pipeline components...: 100% 7/7 [00:07<00:00,  1.10s/it]
07/16/2023 01:08:59 - INFO - __main__ - Number of class images to sample: 200.
Generating class images: 100% 50/50 [19:51<00:00, 23.83s/it]
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'variance_type'} was not found in config. Values will be initialized to default values.
{'force_upcast'} was not found in config. Values will be initialized to default values.
wandb: Currently logged in as: amd-repetto (only-my-team). Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.15.5
wandb: Run data is saved locally in /content/wandb/run-20230716_012903-r6j5zx8r
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run azure-shadow-26
wandb: ⭐️ View project at https://wandb.ai/only-my-team/dreambooth
wandb: 🚀 View run at https://wandb.ai/only-my-team/dreambooth/runs/r6j5zx8r
07/16/2023 01:29:03 - INFO - __main__ - ***** Running training *****
07/16/2023 01:29:03 - INFO - __main__ -   Num examples = 200
07/16/2023 01:29:03 - INFO - __main__ -   Num batches each epoch = 200
07/16/2023 01:29:03 - INFO - __main__ -   Num Epochs = 114
07/16/2023 01:29:03 - INFO - __main__ -   Instantaneous batch size per device = 1
07/16/2023 01:29:03 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 2
07/16/2023 01:29:03 - INFO - __main__ -   Gradient Accumulation steps = 2
07/16/2023 01:29:03 - INFO - __main__ -   Total optimization steps = 11400
Steps:   0% 0/11400 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/content/train_dreambooth.py", line 1375, in <module>
    main(args)
  File "/content/train_dreambooth.py", line 1223, in main
    model_pred = unet(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/operations.py", line 581, in forward
    return model_forward(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/operations.py", line 569, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/usr/local/lib/python3.10/dist-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/models/unet_2d_condition.py", line 839, in forward
    if "text_embeds" not in added_cond_kwargs:
TypeError: argument of type 'NoneType' is not iterable
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing.
wandb: 🚀 View run azure-shadow-26 at: https://wandb.ai/only-my-team/dreambooth/runs/r6j5zx8r
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20230716_012903-r6j5zx8r/logs
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 979, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 628, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-0.9', '--instance_data_dir=/content/pneumoconiosis_resized/train/1', '--class_data_dir=/content/data/xray', '--output_dir=/content/drive/MyDrive/ORT/Master/Codes/diffusion/pneumoconiosis/models/stable-diffusion-xl-pneumoconiosis-finetuned', '--train_text_encoder', '--mixed_precision=fp16', '--with_prior_preservation', '--prior_loss_weight=1.0', '--instance_prompt=image of a pneumoconiosis xray', '--class_prompt=image of a xray', '--resolution=768', '--train_batch_size=1', '--gradient_accumulation_steps=2', '--gradient_checkpointing', '--use_8bit_adam', '--learning_rate=5e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=200', '--max_train_steps=11400', '--checkpointing_steps=4000', '--num_validation_images=4', '--report_to=wandb', '--seed=1337']' returned non-zero exit status 1.
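
If I'm reading the traceback right, the root cause is that SDXL's UNet2DConditionModel.forward expects an added_cond_kwargs dict carrying the pooled embedding from text_encoder_2 ("text_embeds") and the micro-conditioning "time_ids", while the regular train_dreambooth.py never builds one, so it arrives as None and the `"text_embeds" not in added_cond_kwargs` check blows up. A sketch of the difference (random tensors as placeholders, with the base model's expected shapes):

import torch
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-0.9", subfolder="unet"
)

latents = torch.randn(1, 4, 64, 64)       # noisy latents (512px / 8)
timesteps = torch.tensor([10])
hidden_states = torch.randn(1, 77, 2048)  # both text encoders, concatenated

# What the SD 1.x/2.x training loop effectively does -- this raises the
# TypeError above, because added_cond_kwargs defaults to None in forward():
# unet(latents, timesteps, hidden_states).sample

# What SDXL needs: the pooled text embedding plus the
# (original_size + crop_coords + target_size) time ids:
added_cond_kwargs = {
    "text_embeds": torch.randn(1, 1280),
    "time_ids": torch.tensor([[512, 512, 0, 0, 512, 512]], dtype=torch.float32),
}
pred = unet(
    latents, timesteps, hidden_states,
    added_cond_kwargs=added_cond_kwargs,
).sample

So the vanilla script would need to compute text_embeds and time_ids per batch (and handle the second text encoder in general), which is why I'm taking this upstream rather than patching it locally.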

Now that I think about it, this is probably something I should file as an issue or a feature request in the diffusers scripts repo.

[UPDATED] I've created an issue here.

mauricio-repetto changed discussion title from "Dreambooth using XL / Use of watermark" to "Dreambooth using XL"
