adept/fuyu-8b · Discussions

#70 opened 8 months ago by

jchiu1234

How to evaluate it on AI2D dataset?

#69 opened 9 months ago by

boydcheung

Masking the image tokens during training

#68 opened 9 months ago by

jchiu1234

finetune fuyu-8b model

#67 opened 10 months ago by

yinincanada

Is there any way to use image embeddings as input? (similar to input_embeds param)

#66 opened 10 months ago by

sanchd

OCR function

#65 opened 11 months ago by

linxi

Does localization really work?

#64 opened 12 months ago by

Seungyoun

Fine-tuning fuyu-8b for text location with an image size of 1920x1080 always gets OOM, even on A100*8

#63 opened 12 months ago by

Nooodles

Are there special tokens that are ignored during loss computation?

#62 opened 12 months ago by

Nyandwi

Why do the coordinates need to be divided by two in scale_bbox_to_transformed_image?

#61 opened 12 months ago by

Nooodles

Here is a simple multimodal-like training script to see the model working.

#60 opened 12 months ago by

besiktas

GPU requirements

#59 opened 12 months ago by

thightower1

I keep running out of memory. Why don't they just tell us what equipment is required to run these models?

#58 opened 12 months ago by

alquimista888

crash kernel

#57 opened 12 months ago by

simonbrbx

Tips on resolving this typing.Optional error seemingly related to PIL.Image?

#56 opened 12 months ago by

justinwickett

demo of PDF vqa

#55 opened about 1 year ago by

verigle

test

#54 opened about 1 year ago by

Aaronx

Upload 2.jpg

#53 opened about 1 year ago by

Aaronx

test

#52 opened about 1 year ago by

Aaronx

8B? Or 9B?

#51 opened about 1 year ago by

mrfakename

Memory Spikes while Getting Model Logits

#49 opened about 1 year ago by

Nyandwi

Is there a way to run it on an 8GB GPU?

#47 opened about 1 year ago by

bobe94

issue with quantization on windows

#46 opened about 1 year ago by

FantasticMrCat

How does the Fuyu model Get images?

#45 opened about 1 year ago by

VatsaDev

For the vqav2 data set example "fish and carrot", why does the model output a sentence instead of a phrase?

#44 opened about 1 year ago by

changgeli

fine-tuning using FSDP and non 80GB cards?

#43 opened about 1 year ago by

besiktas

Released capabilities

#42 opened about 1 year ago by

ludeksvoboda

Update README.md

#41 opened about 1 year ago by

ybelkada

Colab

#39 opened about 1 year ago by

nengelmann

Is a special instruction needed to trigger the OCR location function?

#38 opened about 1 year ago by

liupei0408

How to get Image embedding using Fuyu

#37 opened about 1 year ago by

oaishi

How to get the detailed description in the fuyu-8b-demo?

#35 opened about 1 year ago by

dwdxdy

The Numbers

#33 opened about 1 year ago by

changgeli

Questions about the examples in the blog

#32 opened about 1 year ago by

AudreyLin

ImportError for FuyuProcessor in Transformers v4.34.1

#30 opened about 1 year ago by

ClaraLovesFunk

hi love it

#29 opened about 1 year ago by

boinc

The 8b model could get correct results for the case shown on the official blog

#28 opened about 1 year ago by

YuntaoChen

long response times

#27 opened about 1 year ago by

FantasticMrCat42

ValueError: Unable to infer channel dimension format

#26 opened about 1 year ago by

vishal1278

A working demo.py for your reference

#25 opened about 1 year ago by

Colderthanice

Using this model as a QA-tool/OCR on a text heavy document?

#24 opened about 1 year ago by

Techie5879

Loading the model on multi-gpu setup?

#23 opened about 1 year ago by

Techie5879

issue with inference

#22 opened about 1 year ago by

zhangchaosunshine

issue with running the model

#21 opened about 1 year ago by

slay

Possible for quantization other than bitsandbytes?

#20 opened about 1 year ago by

Yhyu13

Run on MBP M1

#17 opened about 1 year ago by

sagar-kris

License question

#16 opened about 1 year ago by deleted

Warning output

#15 opened about 1 year ago by

dashesy

Bug when deploying to Inference Endpoints