Lingyu Kong

kppkkp

https://www.kppkkp.top/

LingyvKong

Recent Activity

authored a paper about 2 months ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

authored a paper about 2 months ago

Focus Anywhere for Fine-grained Multi-page Document Understanding

kppkkp's activity

authored 2 papers about 2 months ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3 • 82

Focus Anywhere for Fine-grained Multi-page Document Understanding

Paper • 2405.14295 • Published May 23

updated a model 2 months ago

kppkkp/OneChart

Feature Extraction • Updated Sep 17 • 201 • 12

liked a Space 2 months ago

GOT Online

Space • Running on Zero • 303

liked a model 2 months ago

stepfun-ai/GOT-OCR2_0

Image-Text-to-Text • Updated Sep 18 • 617k • 1.21k

upvoted a paper 3 months ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3 • 82

authored 3 papers 7 months ago

updated a dataset 7 months ago

kppkkp/ChartSE

Updated Apr 18 • 65 • 3

liked a model 7 months ago

kppkkp/OneChart

Feature Extraction • Updated Sep 17 • 201 • 12

upvoted a paper 7 months ago

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

Paper • 2404.09987 • Published Apr 15 • 2

liked a model 8 months ago

HaoranWei/Vary-toy

Text Generation • Updated Jan 22 • 31 • 34

upvoted a paper 8 months ago

Small Language Model Meets with Reinforced Vision Vocabulary

Paper • 2401.12503 • Published Jan 23 • 32

upvoted a paper 10 months ago

StableIdentity: Inserting Anybody into Anywhere at First Sight

Paper • 2401.15975 • Published Jan 29 • 17

liked a model 10 months ago

MILVLG/imp-v1-3b

Text Generation • Updated May 26 • 277 • 205

authored a paper 10 months ago

Small Language Model Meets with Reinforced Vision Vocabulary

Paper • 2401.12503 • Published Jan 23 • 32

upvoted 2 papers 12 months ago

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Paper • 2312.06109 • Published Dec 11, 2023 • 20

Merlin:Empowering Multimodal LLMs with Foresight Minds

Paper • 2312.00589 • Published Nov 30, 2023 • 24

upvoted a paper about 1 year ago

DreamLLM: Synergistic Multimodal Comprehension and Creation

Paper • 2309.11499 • Published Sep 20, 2023 • 58