Rui-Jie Zhu

ridger

AI & ML interests

None yet

ridger's activity

upvoted 2 papers 19 days ago

Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models

Paper • 2411.07140 • Published 20 days ago • 33

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision

Paper • 2411.07199 • Published 20 days ago • 44

upvoted a collection about 1 month ago

Qwen2.5

Collection

Qwen2.5 language models, both pretrained and instruction-tuned, in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated 3 days ago • 395

upvoted a paper about 1 month ago

Gated Linear Attention Transformers with Hardware-Efficient Training

Paper • 2312.06635 • Published Dec 11, 2023 • 6

authored a paper about 2 months ago

ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation

Paper • 2406.18522 • Published Jun 26 • 41

upvoted 2 papers about 2 months ago

Autonomous Driving with Spiking Neural Networks

Paper • 2405.19687 • Published May 30 • 1

SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks

Paper • 2302.13939 • Published Feb 27, 2023 • 1

authored 7 papers about 2 months ago

updated 3 models about 2 months ago

ridger/MMfreeLM-1.3B

Text Generation • Updated Oct 14 • 94 • 6

ridger/MMfreeLM-370M

Text Generation • Updated Oct 14 • 481 • 17

ridger/MMfreeLM-2.7B

Text Generation • Updated Oct 14 • 69 • 34

upvoted 2 papers about 2 months ago

Scalable MatMul-free Language Modeling

Paper • 2406.02528 • Published Jun 4 • 11

Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21 • 27

upvoted a paper 3 months ago

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

Paper • 2409.07146 • Published Sep 11 • 19