Merve Noyan's picture

Merve Noyan

merve

·

AI & ML interests

VLMs, vision & co

Recent Activity

posted an update 3 days ago

liked a model 3 days ago

google/siglip-so400m-patch16-256-i18n

reacted to sayakpaul's post with ❤️ 3 days ago

Articles

Llama can now see and run on your device - welcome Llama 3.2

Preference Optimization for Vision Language Models

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Vision Language Models Explained

Introduction to Quantization cooked in 🤗 with 💗🧑‍🍳

Deploy MusicGen in no time with Inference Endpoints

Open-Source Text Generation & LLM Ecosystem at Hugging Face

Jupyter X Hugging Face

Using Machine Learning to Aid Survivors and Race through Time

Introducing Skops

Announcing the Hugging Face Fellowship Program

Showcase Your Projects in Spaces using Gradio

Hosting your Models and Datasets on Hugging Face Spaces using Streamlit

Organizations

Posts 72

Post

1574

🤗 transformers pipelines now support vision language models for easy local inference 🫰🏻
h/t @yonigozlan for shipping this 🎩👏

you can also use inference API to infer hosted vision LMs (via Python, JS and cURL) https://huggingface.co/docs/api-inference/en/tasks/image-text-to-text

Post

4629

OmniVision-968M: a new local VLM for edge devices, fast & small but performant
💨 a new vision language model with 9x less image tokens, super efficient
📖 aligned with DPO for reducing hallucinations
⚡️ Apache 2.0 license 🔥

Demo hf.co/spaces/NexaAIDev/omnivlm-dpo-demo
Model NexaAIDev/omnivision-968M

Collections 34

spaces 104

Running on Zero

OWLSAM

State-of-the-art open-vocabulary image segmentation ⚡️

Sam2.1

SuperPoint

Running on CPU Upgrade

Gradio Tgi

Vision Papers

OWLSAM2

models 87

merve/paligemma_vqav2

Image-Text-to-Text • Updated 14 days ago • 192 • 10

merve/google-ckpts

Updated about 1 month ago

merve/google-tokenizers

Updated about 1 month ago

merve/idefics3-llama-vqav2

merve/idefics3llama-vqav2

Updated Sep 11 • 8

merve/flux-dreambooth-lora

Updated Aug 16 • 1

merve/trained-flux-lora-lego

Text-to-Image • Updated Aug 16 • 9 • • 1

merve/flux-lego-lora-dreambooth

Text-to-Image • Updated Aug 16 • 2.62k • • 13

merve/sam2-hiera-large

Mask Generation • Updated Aug 2 • 34.7k • 2

merve/sam2-hiera-base-plus

Mask Generation • Updated Aug 2 • 37

datasets 26

merve/model-test-inputs

Updated about 1 month ago • 50

merve/vqav2-small

Viewer • Updated Aug 8 • 21.4k • 1.03k • 8

merve/SGinW

Preview • Updated Jul 11 • 397

merve/pascal-voc

Viewer • Updated Jul 6 • 336k • 496

merve/YouCook2

Viewer • Updated May 28 • 2k • 56

merve/faiss_embeddings

Updated Jan 25 • 22

merve/pokemon-ds-embeddings

Viewer • Updated Jan 10 • 833 • 57 • 4

merve/tr-h4-norobots

Updated Jan 7 • 55 • 10

merve/lego_sets_latest

Viewer • Updated Jan 6 • 61 • 204 • 2

merve/ai-tube-dummy

Updated Dec 1, 2023 • 53