Emanuele Vivoli

emanuelevivoli

https://emanuelevivoli.github.io

AI & ML interests

Vision-Language models, VQA, DocumentAI

Recent Activity

liked a dataset 1 day ago

rubentito/OCR-IDL

liked a dataset 1 day ago

rubentito/mp-docvqa

liked a dataset 6 days ago

microsoft/orca-agentinstruct-1M-v1

emanuelevivoli's activity

upvoted a paper 29 days ago

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22 • 88

upvoted 2 papers about 1 month ago

xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs

Paper • 2410.16267 • Published Oct 21 • 15

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 42

upvoted 6 collections about 1 month ago

upvoted 5 papers about 1 month ago

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Paper • 2410.10563 • Published Oct 14 • 37

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14 • 52

Baichuan-Omni Technical Report

Paper • 2410.08565 • Published Oct 11 • 84

Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9 • 69

Falcon Mamba: The First Competitive Attention-free 7B Language Model

Paper • 2410.05355 • Published Oct 7 • 29

upvoted 2 papers about 2 months ago

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Paper • 2409.20566 • Published Sep 30 • 52

ComiCap: A VLMs pipeline for dense captioning of Comic Panels

Paper • 2409.16159 • Published Sep 24 • 1

upvoted 2 articles about 2 months ago

Deprecation of Git Authentication using password

Article • Aug 25, 2023 • 18

Llama can now see and run on your device - welcome Llama 3.2

Article • Sep 25 • 169

upvoted an article 2 months ago

Scaling robotics datasets with video encoding

Article • Aug 27 • 34

upvoted a paper 2 months ago

Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models

Paper • 2408.02442 • Published Aug 5 • 21