fabiodr (Fabio Dias Rollo)

upvoted a collection 8 days ago

Emu3

Collection

3 items • Updated 9 days ago • 47

upvoted a paper 9 days ago

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Paper • 2409.18042 • Published 9 days ago • 34

upvoted a paper 11 days ago

MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting

Paper • 2409.14393 • Published 14 days ago • 7

upvoted an article 12 days ago

Article

Document Similarity Search with ColPali

14 days ago

• 36

upvoted a collection 13 days ago

Qwen2.5-Coder

Collection

Code-specific model series based on Qwen2.5 • 14 items • Updated 11 days ago • 69

upvoted a paper 13 days ago

Imagine yourself: Tuning-Free Personalized Image Generation

Paper • 2409.13346 • Published 16 days ago • 66

upvoted a paper 15 days ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published 16 days ago • 128

upvoted a collection 16 days ago

Qwen2.5-Math

Collection

Math-specific model series based on Qwen2.5 • 9 items • Updated 13 days ago • 35

upvoted a paper 17 days ago

Kolmogorov-Arnold Transformer

Paper • 2409.10594 • Published 19 days ago • 37

upvoted a collection 17 days ago

Moshi v0.1 Release

Collection

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated 17 days ago • 201

upvoted a paper 17 days ago

Agile Continuous Jumping in Discontinuous Terrains

Paper • 2409.10923 • Published 19 days ago • 11

upvoted a paper 18 days ago

Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs

Paper • 2406.18629 • Published Jun 26 • 40

upvoted a collection 18 days ago

Step-DPO

Collection

Resources for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" • 11 items • Updated Jul 1 • 5

upvoted a paper 18 days ago

Self-Harmonized Chain of Thought

Paper • 2409.04057 • Published 30 days ago • 16

upvoted a paper 19 days ago

Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning

Paper • 2406.12050 • Published Jun 17 • 17

upvoted a collection 23 days ago

DataGemma Release

Collection

A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 24 days ago • 76

upvoted a paper 25 days ago

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

Paper • 2407.14482 • Published Jul 19 • 24

upvoted a collection about 1 month ago

Multi-Vector Retrievers

Collection

2 items • Updated Aug 20 • 3

upvoted a paper about 1 month ago

Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents

Paper • 2408.07199 • Published Aug 13 • 20

upvoted a collection about 1 month ago

Jamba-1.5

Collection

The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated Aug 22 • 75

upvoted a paper about 2 months ago

MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model

Paper • 2408.10198 • Published Aug 19 • 32

upvoted a paper 2 months ago

ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild

Paper • 2407.04172 • Published Jul 4 • 22

upvoted an article 2 months ago

Article

Announcing Finance Commons and the Bad Data Toolbox: Pioneering Open Data and Advanced Document Processing

Jul 19

• 17

upvoted a collection 2 months ago

SAM2

Collection

All the models and demos for SAM2 • 8 items • Updated Aug 2 • 11

upvoted a paper 2 months ago

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31 • 103

upvoted a collection 2 months ago

Palmyra (Writer license)

Collection

Palmyra LLMs under Writer license https://writer.com/legal/open-model-license/ • 8 items • Updated Aug 17 • 6

upvoted 2 papers 2 months ago

UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition

Paper • 2404.15254 • Published Apr 23 • 1

Wolf: Captioning Everything with a World Summarization Framework

Paper • 2407.18908 • Published Jul 26 • 30

upvoted an article 2 months ago

Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Jul 29

• 212

upvoted a collection 2 months ago

InternLM2-Math

Collection

14 items • Updated Jul 30 • 7

upvoted an article 2 months ago

Article

Uncensor any LLM with abliteration

Jun 13

• 337

upvoted a paper 2 months ago

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Paper • 2407.16741 • Published Jul 23 • 67

upvoted a collection 2 months ago

🍃 MINT-1T

Collection

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24 • 50

upvoted a paper 2 months ago

OutfitAnyone: Ultra-high Quality Virtual Try-On for Any Clothing and Any Person

Paper • 2407.16224 • Published Jul 23 • 23

upvoted 2 articles 3 months ago

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24

• 170

Article

Docmatix - a huge dataset for Document Visual Question Answering

Jul 18

• 65

upvoted a collection 3 months ago

distil-large-v3

Collection

This collection contains the model repositories for distil-large-v3, which provides support for the most popular Whisper libraries. • 4 items • Updated Mar 21 • 6

upvoted 4 articles 3 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 244

Article

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Jul 5

• 110

Article

The Rise of Agentic Data Generation

Jul 15

• 74

Article

How to run Gemini Nano locally in your browser

Jul 11

• 42

upvoted 4 papers 3 months ago

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

Paper • 2407.01370 • Published Jul 1 • 85

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22 • 108

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3 • 92

Agentless: Demystifying LLM-based Software Engineering Agents

Paper • 2407.01489 • Published Jul 1 • 42

upvoted a collection 3 months ago

LLM Compiler

Collection

Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 147

upvoted an article 3 months ago

Article

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Apr 22

• 78

upvoted a collection 3 months ago

Transformers.js demos

Collection

A collection of my favorite WebML demos, built with Transformers.js! • 30 items • Updated Jul 11 • 80

upvoted a collection 4 months ago

4M Models

Collection

Multimodal models from https://4m.epfl.ch/ • 14 items • Updated Jun 14 • 29

upvoted a paper 4 months ago

Depth Anything V2

Paper • 2406.09414 • Published Jun 13 • 91

upvoted 4 collections 4 months ago

upvoted 3 papers 4 months ago

Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion

Paper • 2406.03184 • Published Jun 5 • 18

Learning Temporally Consistent Video Depth from Video Diffusion Priors

Paper • 2406.01493 • Published Jun 3 • 17

CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner

Paper • 2405.14979 • Published May 23 • 15

upvoted a collection 5 months ago

Phi-3

Collection

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 17 days ago • 473

upvoted 2 articles 5 months ago

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

May 14

• 201

Article

License to Call: Introducing Transformers Agents 2.0

May 13

• 108
