Handling Long Prompts and Device Movement in Transformer Models

#3
by black - opened

Thank you for providing this incredible model and source code. I am not very familiar with the intricacies of coding within the transformers framework.

While running your code with a prompt that exceeds 500 characters, I encountered the following message:

```
Token indices sequence length is longer than the specified maximum sequence length for this model (84 > 77). Running this sequence through the model will result in indexing errors.
The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens.
```
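As far as I can tell, this first message is a warning rather than an error. Flux runs the prompt through two text encoders: CLIP, whose context window is hard-limited to 77 tokens and which only contributes a pooled embedding, and T5, which encodes the full prompt (up to `max_sequence_length`, 512 by default in diffusers' `FluxPipeline` call, if I read the signature correctly). So a long prompt is truncated only on the CLIP side, and the warning is usually safe to ignore. A minimal sketch of the truncation being reported:

```python
# Sketch of what the CLIP warning means: CLIP has a fixed context window of
# 77 tokens (including special tokens), and its tokenizer simply drops
# everything beyond that.
CLIP_MAX_TOKENS = 77

def clip_truncate(token_ids, max_len=CLIP_MAX_TOKENS):
    """Keep only the first `max_len` token ids, as CLIP's tokenizer does."""
    return token_ids[:max_len]

# The "84 > 77" in the warning means an 84-token prompt loses its last 7 tokens:
ids = list(range(84))
kept = clip_truncate(ids)
```

If you want to control how much of the prompt the T5 side sees, you can pass `max_sequence_length` to the pipeline call, e.g. `pipe(prompt, max_sequence_length=512, ...)`; the CLIP warning itself cannot be avoided for prompts longer than 77 tokens.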

Additionally, I received the following error:
```
The module 'FluxTransformer2DModel' has been loaded in bitsandbytes 4bit and moving it to CPU via .to() is not supported. Module is still on cuda:0. In most cases, it is recommended to not change the device.
The module 'T5EncoderModel' has been loaded in bitsandbytes 4bit and moving it to CPU via .to() is not supported. Module is still on cuda:0. In most cases, it is recommended to not change the device.
```
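These two messages also aren't fatal: `enable_model_cpu_offload()` works by moving each submodule between CPU and GPU with `.to()`, and modules loaded through bitsandbytes in 4-bit refuse the move and simply stay on `cuda:0`. If everything fits in VRAM, the simplest option is to skip offloading entirely when quantized components are present. A hedged sketch (assuming `is_loaded_in_4bit` is the flag transformers sets on bitsandbytes-loaded models, and using the `pipe.components` dict that diffusers pipelines expose):

```python
def safe_enable_offload(pipe):
    """Enable model CPU offload only when no component is a
    bitsandbytes-quantized module (those cannot be moved via .to()).
    Returns the names of quantized components that blocked offloading."""
    quantized = [
        name for name, comp in pipe.components.items()
        if getattr(comp, "is_loaded_in_4bit", False)
    ]
    if quantized:
        return quantized  # leave the pipeline where it already is
    pipe.enable_model_cpu_offload()
    return []
```

This keeps the convenience of offloading for an unquantized pipeline while silencing the warnings for a quantized one; it is a sketch of the idea, not an official diffusers API.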

Could you please advise on the solution for these issues? I would really appreciate your help.

Thank you!

I have added the following lines of code:

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer
from diffusers import FluxPipeline

# `dtype`, `transformer`, and `text_encoder_2` are defined earlier in the script
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14", torch_dtype=dtype)
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    text_encoder_2=text_encoder_2,
    torch_dtype=dtype,
)
pipe.enable_model_cpu_offload()
```

but nothing changed.
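One thing worth double-checking when wiring the components yourself: in `FluxPipeline`, the `text_encoder`/`tokenizer` slots take the CLIP pair, while `text_encoder_2`/`tokenizer_2` take the T5 pair, so an easy mistake is handing the CLIP model to `text_encoder_2`. A small sketch of the intended mapping (the class names for the T5 slots are my understanding of the FLUX.1-dev layout):

```python
# Which component class belongs in which FluxPipeline slot.
EXPECTED_SLOTS = {
    "text_encoder": "CLIPTextModel",      # pooled prompt embedding, 77-token limit
    "tokenizer": "CLIPTokenizer",
    "text_encoder_2": "T5EncoderModel",   # full prompt, up to max_sequence_length
    "tokenizer_2": "T5TokenizerFast",
}

def check_slot(slot, component):
    """Return True when a component's class name matches its intended slot."""
    return type(component).__name__ == EXPECTED_SLOTS.get(slot)
```

Mixing the slots up will not raise immediately, but the prompt will be encoded by the wrong model, so it is worth verifying before chasing the warnings above.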
