Fuyu-8B Colab

#1
by nengelmann - opened

Just wanted to let you all know that there is a Colab here where you can try the model and get started!
https://github.com/nengelmann/Fuyu-8B---Exploration/tree/main

Any suggestions on how to perform OCR recognition and localization (i.e., text bounding boxes), as shown in the blog?

I don't think there is any documentation yet.
I'd suggest that you take a look at the processing script and try to figure it out.
https://github.com/huggingface/transformers/blob/aa4198a238f915e7ac04bc43d28ddbcb7fe690df/src/transformers/models/fuyu/processing_fuyu.py#L29

Please let me know if you find out which prompts work for your case 🤓
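
For what it's worth, here is a minimal sketch of how I'd experiment with prompts, using the standard `FuyuProcessor` / `FuyuForCausalLM` classes from `transformers`. The loading and generation calls follow the usual API; the bounding-box prompt wording itself is just a guess on my side, not a documented format:

```python
# Minimal sketch for experimenting with Fuyu-8B prompts (assumes transformers >= 4.35).
# The OCR/bounding-box prompt below is an assumption, not a documented format.
import torch
from PIL import Image
from transformers import FuyuProcessor, FuyuForCausalLM

model_id = "adept/fuyu-8b"
processor = FuyuProcessor.from_pretrained(model_id)
model = FuyuForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

image = Image.open("example.png").convert("RGB")

# Guessed prompt -- try variations and check how processing_fuyu.py handles
# <box>/<point>-style coordinate strings to see what the model expects.
prompt = "When provided with text, generate the corresponding bounding box.\nFuyu"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=40)

# Decode only the newly generated tokens (generate() returns the prompt tokens too).
new_tokens = generated_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```
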

Unfortunately, I'm unsure how to implement text-to-box functionality. I'm not even certain whether this base version supports features like text-to-box or box-to-text. I can only wait for further details to be released.

Hi, I was wondering if you know whether this version of the model still has advantages over the original if the GPU is able to load both?

It uses less RAM.

I'd recommend checking it out yourself with the notebook above.
You can simply switch between the sharded and normal versions.
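
To be concrete, switching is just a matter of changing the checkpoint ID passed to `from_pretrained`; everything else stays the same. The sharded repo name below is a placeholder (use the one referenced in the notebook). Sharded checkpoints mainly reduce peak CPU RAM during loading, since the weights are read in smaller pieces:

```python
# Switching between the original and a sharded checkpoint only changes the
# repo ID -- the API is identical. The sharded repo name is a placeholder here.
import torch
from transformers import FuyuProcessor, FuyuForCausalLM

# model_id = "adept/fuyu-8b"          # original checkpoint
model_id = "<sharded-fuyu-8b-repo>"   # placeholder: sharded checkpoint from the notebook

# The processor is the same for both checkpoints.
processor = FuyuProcessor.from_pretrained("adept/fuyu-8b")

model = FuyuForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,  # stream shards instead of materializing all weights in CPU RAM
)
```
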
