TRAINING Error running job: black-forest-labs/FLUX.1-dev

#106
by mato22 - opened

After trying everything I could find, still no success... please help!
I tried putting the .env file in all the right places and re-created the token multiple times, but I always get this:

(venv) C:\COMFY\ComfyUI_windows_portable\ComfyUI\ai-toolkit>python run.py train_lora_flux_24gb2.yml
Running 1 job
C:\Users\admin\ai-toolkit\venv\Lib\site-packages\controlnet_aux\segment_anything\modeling\tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_5m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_5m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
C:\Users\admin\ai-toolkit\venv\Lib\site-packages\controlnet_aux\segment_anything\modeling\tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_11m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_11m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
C:\Users\admin\ai-toolkit\venv\Lib\site-packages\controlnet_aux\segment_anything\modeling\tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
C:\Users\admin\ai-toolkit\venv\Lib\site-packages\controlnet_aux\segment_anything\modeling\tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_384 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_384. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
C:\Users\admin\ai-toolkit\venv\Lib\site-packages\controlnet_aux\segment_anything\modeling\tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_512 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_512. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
{
"type": "sd_trainer",
"training_folder": "output",
"device": "cuda:0",
"network": {
"type": "lora",
"linear": 16,
"linear_alpha": 16
},
"save": {
"dtype": "float16",
"save_every": 250,
"max_step_saves_to_keep": 4,
"push_to_hub": false
},
"datasets": [
{
"folder_path": "D:\PICTURES_files\TRAINING\vw_mago1",
"caption_ext": "txt",
"caption_dropout_rate": 0.05,
"shuffle_tokens": false,
"cache_latents_to_disk": true,
"resolution": [
512,
768,
1024
]
}
],
"train": {
"batch_size": 1,
"steps": 2000,
"gradient_accumulation_steps": 1,
"train_unet": true,
"train_text_encoder": false,
"gradient_checkpointing": true,
"noise_scheduler": "flowmatch",
"optimizer": "adamw8bit",
"lr": 0.0001,
"ema_config": {
"use_ema": true,
"ema_decay": 0.99
},
"dtype": "bf16"
},
"model": {
"name_or_path": "black-forest-labs/FLUX.1-dev",
"is_flux": true,
"quantize": true
},
"sample": {
"sampler": "flowmatch",
"sample_every": 250,
"width": 1024,
"height": 1024,
"prompts": [
"woman with red hair, playing chess at the park, bomb going off in the background",
"a woman holding a coffee cup, in a beanie, sitting at a cafe",
"a horse is a DJ at a night club, fish eye lens, smoke machine, lazer lights, holding a martini",
"a man showing off his cool new t shirt at the beach, a shark is jumping out of the water in the background",
"a bear building a log cabin in the snow covered mountains",
"woman playing the guitar, on stage, singing a song, laser lights, punk rocker",
"hipster man with a beard, building a chair, in a wood shop",
"photo of a man, white background, medium shot, modeling clothing, studio lighting, white backdrop",
"a man holding a sign that says, 'this is a sign'",
"a bulldog, in a post apocalyptic world, with a shotgun, in a leather jacket, in a desert, with a motorcycle"
],
"neg": "",
"seed": 42,
"walk_seed": true,
"guidance_scale": 4,
"sample_steps": 20
}
}
Using EMA
C:\COMFY\ComfyUI_windows_portable\ComfyUI\ai-toolkit\extensions_built_in\sd_trainer\SDTrainer.py:61: FutureWarning: torch.cuda.amp.GradScaler(args...) is deprecated. Please use torch.amp.GradScaler('cuda', args...) instead.
self.scaler = torch.cuda.amp.GradScaler()

#############################################

Running job: vw_mago1_lora_v1

#############################################

Running 1 process
Loading Flux model
Loading transformer
Error running job: black-forest-labs/FLUX.1-dev is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with token or log in with huggingface-cli login.

========================================
Result:
- 0 completed jobs
- 1 failure

Traceback (most recent call last):
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\utils_errors.py", line 304, in hf_raise_for_status
response.raise_for_status()
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\requests\models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/transformer/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\diffusers\configuration_utils.py", line 379, in load_config
config_file = hf_hub_download(
^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\utils_deprecation.py", line 101, in inner_f
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\utils_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\file_download.py", line 1240, in hf_hub_download
return _hf_hub_download_to_cache_dir(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\file_download.py", line 1347, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\file_download.py", line 1854, in _raise_on_head_call_error
raise head_call_error
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\file_download.py", line 1751, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\utils_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\file_download.py", line 1673, in get_hf_file_metadata
r = _request_wrapper(
^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\file_download.py", line 376, in _request_wrapper
response = _request_wrapper(
^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\file_download.py", line 400, in _request_wrapper
hf_raise_for_status(response)
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\utils_errors.py", line 321, in hf_raise_for_status
raise GatedRepoError(message, response) from e
huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-66cecd45-1f2cc6a96e9d4e331aa3a1f6;7641e2c7-2f42-4e25-bcc2-b0f18d9f630c)

Cannot access gated repo for url https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/transformer/config.json.
Access to model black-forest-labs/FLUX.1-dev is restricted. You must be authenticated to access it.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\COMFY\ComfyUI_windows_portable\ComfyUI\ai-toolkit\run.py", line 90, in <module>
main()
File "C:\COMFY\ComfyUI_windows_portable\ComfyUI\ai-toolkit\run.py", line 86, in main
raise e
File "C:\COMFY\ComfyUI_windows_portable\ComfyUI\ai-toolkit\run.py", line 78, in main
job.run()
File "C:\COMFY\ComfyUI_windows_portable\ComfyUI\ai-toolkit\jobs\ExtensionJob.py", line 22, in run
process.run()
File "C:\COMFY\ComfyUI_windows_portable\ComfyUI\ai-toolkit\jobs\process\BaseSDTrainProcess.py", line 1233, in run
self.sd.load_model()
File "C:\COMFY\ComfyUI_windows_portable\ComfyUI\ai-toolkit\toolkit\stable_diffusion_model.py", line 488, in load_model
transformer = FluxTransformer2DModel.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\utils_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\diffusers\models\modeling_utils.py", line 612, in from_pretrained
config, unused_kwargs, commit_hash = cls.load_config(
^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\huggingface_hub\utils_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\admin\ai-toolkit\venv\Lib\site-packages\diffusers\configuration_utils.py", line 394, in load_config
raise EnvironmentError(
OSError: black-forest-labs/FLUX.1-dev is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with token or log in with huggingface-cli login.

I had the same problem and the thing that helped me to solve it was to go back to:

https://huggingface.co/black-forest-labs/FLUX.1-dev/

And accept the T&C.

@FridaRuh thanks for your response, but I had already accepted them before, and now I'm trying to find a way to accept them again, but I don't know how :/
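
If the terms are already accepted and the 401 persists, it may be worth checking whether the token is actually being found and whether it can reach the gated repo. The following is only a minimal sketch, assuming the token is stored as HF_TOKEN in a plain KEY=VALUE .env file in the folder that contains run.py. The helper names (read_env_token, find_token, check_gated_access) are made up for this example and are not part of ai-toolkit; hf_hub_download and GatedRepoError come from huggingface_hub.

```python
import os
from pathlib import Path
from typing import Optional

REPO_ID = "black-forest-labs/FLUX.1-dev"
GATED_FILE = "transformer/config.json"  # the file that returned 401 in the log above


def read_env_token(env_path: Path, key: str = "HF_TOKEN") -> Optional[str]:
    """Return the value of `key` from a simple KEY=VALUE .env file, or None."""
    if not env_path.is_file():
        return None
    # utf-8-sig also tolerates a byte-order mark written by some Windows editors.
    for raw in env_path.read_text(encoding="utf-8-sig").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop surrounding whitespace and optional quotes; empty means "not set".
            return value.strip().strip("'\"") or None
    return None


def find_token(project_dir: Path) -> Optional[str]:
    """Look in project_dir/.env first, then in the process environment."""
    return read_env_token(project_dir / ".env") or os.environ.get("HF_TOKEN")


def check_gated_access(token: Optional[str]) -> str:
    """Try to fetch one small gated file and report what happened."""
    # Imported lazily so the .env helpers above work without huggingface_hub.
    from huggingface_hub import hf_hub_download
    from huggingface_hub.utils import GatedRepoError

    try:
        hf_hub_download(REPO_ID, GATED_FILE, token=token)
    except GatedRepoError as err:
        return f"gated: {err}"
    return "ok"
```

Running `print(check_gated_access(find_token(Path.cwd())))` from the same folder and the same venv used for run.py shows whether that environment can see the token. If find_token returns None, the .env file is probably not where the script looks for it. If the result starts with "gated", the token was found but the account or token does not have access to the repo.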

I encountered the same problem and accepted the terms, but it still occurred.

Check the file extension of your images.

Is there something wrong with the dataset?

Thank you. Going back to the model page and accepting the T&C seems to be working.

Yes, I had the same error, and in my case the problem was the image extension: the image had the extension .webp. After changing the extension to .jpg, the training works fine.
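
Since renaming a file changes only its name and not its contents, it can help to list the files in the dataset folder whose extension does not match what is actually inside them. This is a rough sketch using only the standard library. The function names are made up for this example, and it only recognizes JPEG, PNG and WebP by their leading bytes.

```python
from pathlib import Path
from typing import Dict, Optional

# Extensions mapped to the format they are expected to contain.
EXPECTED = {".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png", ".webp": "webp"}


def sniff_format(header: bytes) -> Optional[str]:
    """Guess the image format from the first 12 bytes of a file."""
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    return None


def find_mismatched(folder: Path) -> Dict[str, str]:
    """Map file name -> detected format for files whose extension disagrees."""
    mismatched: Dict[str, str] = {}
    for path in sorted(folder.iterdir()):
        expected = EXPECTED.get(path.suffix.lower())
        if expected is None or not path.is_file():
            continue  # caption files (.txt) and anything else are skipped
        with path.open("rb") as fh:
            actual = sniff_format(fh.read(12))
        if actual != expected:
            mismatched[path.name] = actual or "unknown"
    return mismatched
```

Calling `find_mismatched(Path(r"D:\PICTURES_files\TRAINING\vw_mago1"))` on the dataset folder from the config would return an empty dict if every image matches its extension.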

That makes no difference for me: my images are already .jpg, and the error still occurs.

I fixed the image error I was having. You have to modify a file, and that solves the problem.

Based on your script and error logs, the most likely place to insert the fix for the OSError: image file is truncated would be in the load_and_process_image function, since this is where the images are opened and processed using Pillow's Image.open().

Here’s where you can make the modifications within your script:

  1. File to Modify:
    The relevant function load_and_process_image is located in the ImageProcessingDTOMixin class inside the dataloader_mixins.py file.

  2. Modifications to Handle Truncated Images:
    You can add the fix by letting Pillow load truncated images. Modify the load_and_process_image function so that it sets the Pillow option ImageFile.LOAD_TRUNCATED_IMAGES = True.


Here’s how you can modify that function:


Original code in dataloader_mixins.py:

    def load_and_process_image(self: 'FileItemDTO', transform: Union[None, transforms.Compose], only_load_latents=False):
        try:
            img = Image.open(self.path)
            img = exif_transpose(img)
        except Exception as e:
            print(f"Error: {e}")
            print(f"Error loading image: {self.path}")
        img = img.convert('RGB')
        # Additional processing code here...


Modified code with the fix:

    from PIL import ImageFile  # goes at the top of the file; if PIL is already imported there, just add ImageFile to that import

    def load_and_process_image(self: 'FileItemDTO', transform: Union[None, transforms.Compose], only_load_latents=False):
        try:
            # Allow Pillow to load truncated images (this is a global Pillow setting)
            ImageFile.LOAD_TRUNCATED_IMAGES = True

            # Open and process the image
            img = Image.open(self.path)
            img = exif_transpose(img)
        except Exception as e:
            print(f"Error: {e}")
            print(f"Error loading image: {self.path}")

        img = img.convert('RGB')
        # Additional processing code here...

  3. Steps on Runpod.io to Apply the Changes:
    Access the file: Use the terminal or a file editor (such as nano, vim, or a Jupyter notebook) on Runpod.io to edit the dataloader_mixins.py file located in /workspace/ai-toolkit/toolkit/.
    Save the changes: Save the file and re-run the script to check whether the issue is resolved.

By doing this, you’ll allow Pillow to load truncated images without crashing the script, which should resolve the OSError: image file is truncated error.
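
Before turning on LOAD_TRUNCATED_IMAGES for everything, it might be useful to find out which files are actually cut off, so they can be re-exported or removed from the dataset. The sketch below is only a heuristic that uses the standard library: it checks whether JPEG and PNG files end with their usual end-of-file markers and does not decode any pixel data. The function names are made up for this example.

```python
from pathlib import Path
from typing import List

JPEG_EOI = b"\xff\xd9"          # JPEG end-of-image marker
PNG_IEND = b"IEND\xaeB`\x82"    # PNG IEND chunk type followed by its fixed CRC


def looks_truncated(path: Path) -> bool:
    """Return True if a JPEG or PNG file is missing its end-of-file marker."""
    data = path.read_bytes()
    if data.startswith(b"\xff\xd8"):
        # Some writers pad the file with zero bytes after the marker.
        return not data.rstrip(b"\x00").endswith(JPEG_EOI)
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return not data.endswith(PNG_IEND)
    return False  # other formats are not checked by this sketch


def find_truncated(folder: Path) -> List[str]:
    """List the names of files in `folder` that look truncated."""
    return [p.name for p in sorted(folder.iterdir())
            if p.is_file() and looks_truncated(p)]
```

A file can pass this check and still be damaged in the middle. A more thorough test is to open each file with Pillow and call load() on it, which forces a full decode and raises an error for truncated data while LOAD_TRUNCATED_IMAGES is left at its default.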
