Frank Sommers PRO

fsommers

Recent Activity

upvoted an article 25 minutes ago

liked a model about 2 hours ago

NexaAIDev/omnivision-968M

liked a Space 3 days ago

naver-clova-ix/donut-base-finetuned-cord-v2

Articles

Document Similarity Search with ColPali

Sep 21 • 47

fsommers's activity

upvoted an article 25 minutes ago

Enjoy the Power of Phi-3 with ONNX Runtime on your device

Article • May 22 • 26

upvoted an article 23 days ago

Visually Multilingual: Introducing mcdse-2b

Article • 25 days ago • 37

upvoted 2 papers 23 days ago

A Survey of Small Language Models

Paper • 2410.20011 • Published 27 days ago • 37

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Paper • 2410.21169 • Published 24 days ago • 29

upvoted a paper 30 days ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 42

upvoted a paper about 1 month ago

From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning

Paper • 2410.06456 • Published Oct 9 • 35

upvoted 3 articles about 2 months ago

Deploying Your FastAPI Applications on Huggingface Via Docker

Article • Dec 11, 2023 • 16

Hosting your Models and Datasets on Hugging Face Spaces using Streamlit

Article • Oct 5, 2021 • 2

Llama can now see and run on your device - welcome Llama 3.2

Article • Sep 25 • 169

upvoted a collection about 2 months ago

ColPali Paper Resources

Collection

Main resources for the paper: "ColPali: Efficient Document Retrieval with Vision Language Models" • 4 items • Updated 4 days ago • 6

upvoted an article about 2 months ago

Document Similarity Search with ColPali

Article • Sep 21 • 47

upvoted an article 2 months ago

Getty Images Brings High-Quality, Commercially Safe Dataset to Hugging Face

Article • Sep 6 • 16

upvoted 3 papers 2 months ago

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published Sep 18 • 74

jina-embeddings-v3: Multilingual Embeddings With Task LoRA

Paper • 2409.10173 • Published Sep 16 • 26

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3 • 82

upvoted a collection 2 months ago

Qwen2

Collection

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Sep 18 • 347

upvoted a paper 3 months ago

Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5 • 88

upvoted an article 3 months ago

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Article • Jul 5 • 161

upvoted a collection 3 months ago

Awesome Document AI

Collection

A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11 • 74

upvoted a paper 3 months ago

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published Aug 28 • 83