Shuhuai Ren

ShuhuaiRen

https://renshuhuai-andy.github.io/

AI & ML interests

NLP, Multi-modal

ShuhuaiRen's activity

liked a dataset 13 days ago

laion/laion-coco

Viewer • Updated Jul 14 • 641M • 2.86k • 75

updated a collection about 2 months ago

next-block-prediction

Collection

2 items • Updated Oct 9

liked a Space 2 months ago

Qwen2-VL-72B

Running • 513

liked 2 models 3 months ago

GAIR/Anole-7b-v0.1

Updated Jul 14 • 21 • 112

maitrix-org/Pandora

Updated Jun 18 • 61

liked a Space 4 months ago

VBench Leaderboard

Running • 127

liked 2 models 4 months ago

stabilityai/stable-diffusion-3-medium-diffusers

Text-to-Image • Updated Jun 19 • 299k • 355

stabilityai/stable-diffusion-3-medium

Text-to-Image • Updated Aug 12 • 45.2k • 4.59k

liked a Space 4 months ago

Stable Diffusion 2-1

Running on CPU Upgrade • 10.8k

updated a model 6 months ago

ShuhuaiRen/TimeChat-7b-Charades-STA-ft

Updated Jun 6

upvoted a paper 6 months ago

M³IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning

Paper • 2306.04387 • Published Jun 7, 2023 • 8

authored 5 papers 6 months ago

Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition

Paper • 2304.04704 • Published Apr 10, 2023

Delving into the Openness of CLIP

Paper • 2206.01986 • Published Jun 4, 2022

Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond

Paper • 2310.02071 • Published Oct 3, 2023 • 4

TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding

Paper • 2310.19060 • Published Oct 29, 2023

DCA: Diversified Co-Attention towards Informative Live Video Commenting

Paper • 1911.02739 • Published Nov 7, 2019