Dokyoon

leeloolee

·

Eruly

AI & ML interests

ai

Organizations

leeloolee's activity

upvoted a paper 4 days ago

NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published 18 days ago • 66

upvoted 2 papers 25 days ago

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

Paper • 2409.05840 • Published 26 days ago • 45

Towards a Unified View of Preference Learning for Large Language Models: A Survey

Paper • 2409.02795 • Published Sep 4 • 72

upvoted a collection 27 days ago

OCR

5 items • Updated 27 days ago • 1

upvoted a paper 27 days ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3 • 78

upvoted a collection 27 days ago

VisionLM

354 items • Updated 6 days ago • 25

upvoted a paper about 1 month ago

Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing

Paper • 2406.05534 • Published Jun 8 • 3

upvoted 6 collections about 1 month ago

CLAIR and APO

Data and Models for the paper "Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment" • 8 items • Updated Aug 14 • 3

Korean Medical Dataset

Korean medical data (mostly items I edited myself) • 16 items • Updated Aug 21 • 4

Word Sense Linking

Word Sense Linking is the task designed to identify and disambiguate spans of text to their most suitable senses from a reference inventory. • 5 items • Updated Aug 5 • 3

Cerebras DocChat

GPT-4 Level Conversational QA Trained In a Few Hours • 5 items • Updated Aug 21 • 3

Llama Scope

An Open-Source Suite of 416 Sparse Autoencoders on Llama-3.1-8B • 1 item • Updated Sep 3 • 4

Awesome Visual Embedding

9 items • Updated Jul 23 • 4

upvoted a paper about 1 month ago

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Paper • 2401.08417 • Published Jan 16 • 31

upvoted a paper 2 months ago

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

Paper • 2406.11271 • Published Jun 17 • 18

upvoted a paper 3 months ago

BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning

Paper • 2206.08657 • Published Jun 17, 2022 • 2

upvoted an article 3 months ago

Vision Language Models Explained

Article • Apr 11 • 185

upvoted a paper 3 months ago

PaliGemma: A versatile 3B VLM for transfer

Paper • 2407.07726 • Published Jul 10 • 65

upvoted a collection 3 months ago

Korean Datasets I've released so far.

A collection of the Korean datasets I have uploaded so far. • 8 items • Updated May 24 • 16

upvoted a paper 3 months ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24 • 55

upvoted an article 3 months ago

SeeMoE: Implementing a MoE Vision Language Model from Scratch

Article • Jun 23 • 33

upvoted an article 4 months ago

Putting RL back in RLHF

Article • Jun 12 • 60

upvoted an article 5 months ago

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Article • Apr 19 • 102

upvoted a collection 7 months ago

Awesome feedback datasets

A curated list of datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO. • 19 items • Updated Apr 12 • 65

upvoted a paper 8 months ago

ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization

Paper • 2402.09320 • Published Feb 14 • 6

upvoted a collection 8 months ago

Journal Club

Candidate papers to read in the H4 journal club • 54 items • Updated Apr 21 • 26

upvoted a collection 9 months ago

MoEs papers reading list

58 items • Updated 2 days ago • 133

upvoted 2 papers about 1 year ago

InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation

Paper • 2309.06380 • Published Sep 12, 2023 • 32

MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records

Paper • 2308.14089 • Published Aug 27, 2023 • 28