Kijai/flux-fp8 · Silent exit when loading via FluxTransformer2DModel.from_single_file and no GPU use

I'm using the test code below. When FluxTransformer2DModel.from_single_file runs, it consumes around 15GB of RAM (about 7GB is still free), never uses the GPU (a 4070 Ti with 12GB VRAM), and after 1-2 minutes the process silently exits back to the terminal without any error.

I have tried several configurations, none of which worked.
I don't think the problem is the GPU or the CUDA setup in torch: if I run only FluxPipeline.from_pretrained, I can see the VRAM max out until the model runs out of memory.
I'm running this in a VS Code devcontainer (nvcr.io/nvidia/pytorch:24.08-py3); the host machine is Windows 11 with an i7-13700 and 32GB of RAM.

transformer = FluxTransformer2DModel.from_single_file(
    "./flux-text-to-image/flux-fp8/flux1-dev-fp8.safetensors",
    torch_dtype=dtype,
    use_safetensors=True,
    local_files_only=False,
    variant="fp16" if torch.cuda.is_available() else "fp32",
    device_map="auto",
    offload_folder="./offload",
)
pipe = FluxPipeline.from_pretrained(
    "./flux-text-to-image/FLUX.1-dev",
    transformer=transformer,
    local_files_only=False,
    device_map="auto",
    torch_dtype=dtype,
)

Notes:
CUDA available: True
CUDA version: 12.6
GPU: NVIDIA GeForce RTX 4070 Ti
Using device: cuda
Using dtype: torch.float16
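
A process that disappears with no Python traceback is consistent with the kernel's out-of-memory handler killing it, though nothing above confirms that. Below is a minimal sketch for checking that possibility, assuming a Linux container that exposes /proc/meminfo. The helper names (parse_meminfo, estimated_load_bytes, fits_in_ram) are hypothetical. The 1-byte-to-2-byte scaling assumes the fp8 weights are upcast to the requested float16 on load, which has not been verified here.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of byte counts."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue
        try:
            value = int(parts[0])
        except ValueError:
            continue
        # Most entries are reported in kB; a few (e.g. HugePages_Total) are plain counts.
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        info[key.strip()] = value
    return info


def estimated_load_bytes(file_bytes, stored_bytes_per_param=1, target_bytes_per_param=2):
    """Rough in-memory size if stored weights are cast to a wider dtype.

    Assumption: 1 byte per parameter on disk (fp8) and 2 bytes after
    casting to float16. Metadata and temporary copies are ignored.
    """
    return file_bytes * target_bytes_per_param // stored_bytes_per_param


def fits_in_ram(needed_bytes, available_bytes):
    """True if the estimated footprint fits in the memory currently available."""
    return needed_bytes <= available_bytes


def report(checkpoint_path, meminfo_path="/proc/meminfo"):
    """Print the estimate next to the memory the container can actually see."""
    with open(meminfo_path) as f:
        mem = parse_meminfo(f.read())
    needed = estimated_load_bytes(os.path.getsize(checkpoint_path))
    available = mem.get("MemAvailable", 0)
    gib = 1024 ** 3
    print(f"MemTotal seen by container: {mem.get('MemTotal', 0) / gib:.1f} GiB")
    print(f"MemAvailable:               {available / gib:.1f} GiB")
    print(f"Estimated load footprint:   {needed / gib:.1f} GiB")
    print("Fits in RAM:", fits_in_ram(needed, available))
```

Calling report("./flux-text-to-image/flux-fp8/flux1-dev-fp8.safetensors") inside the devcontainer would show how much memory the container can actually see, which may be less than the host's 32GB, and whether the estimate exceeds it.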

Would love to get inputs on what I am missing here.
Thanks!
