Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8 • 80
Harnessing Webpage UIs for Text-Rich Visual Understanding Paper • 2410.13824 • Published 15 days ago • 28
DocLayout-YOLO Collection Dataset and model for DocLayout-YOLO • 9 items • Updated 10 days ago • 10
Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation Paper • 2410.00890 • Published about 1 month ago • 17