Yang Lee

innovation64

https://innovation64.github.io/

AI & ML interests

AGI

Recent Activity

updated a collection 11 days ago

RAG

liked a Space about 1 month ago

lmarena-ai/chatbot-arena-leaderboard

updated a collection about 1 month ago

RAG

innovation64's activity

upvoted a paper about 2 months ago

Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation

Paper • 2409.12941 • Published Sep 19 • 22

upvoted 2 papers 4 months ago

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11 • 43

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1 • 108

upvoted 2 articles 4 months ago

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

Article • May 7 • 39

Sparse Mixture of Experts Language Model from Scratch: Extending makeMoE with Expert Capacity

Article • Mar 18 • 7

upvoted 7 papers 4 months ago

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

Paper • 2407.14482 • Published Jul 19 • 25

Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist

Paper • 2407.08733 • Published Jul 11 • 20

Self-Recognition in Language Models

Paper • 2407.06946 • Published Jul 9 • 24

upvoted 4 papers 5 months ago

From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries

Paper • 2406.12824 • Published Jun 18 • 20

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Paper • 2406.12793 • Published Jun 18 • 31

Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning

Paper • 2406.09170 • Published Jun 13 • 24

Transformers meet Neural Algorithmic Reasoners

Paper • 2406.09308 • Published Jun 13 • 43

upvoted an article 5 months ago

Putting RL back in RLHF

Article • Jun 12 • 62

upvoted 3 papers 5 months ago

CRAG -- Comprehensive RAG Benchmark

Paper • 2406.04744 • Published Jun 7 • 42

WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Paper • 2406.04770 • Published Jun 7 • 27

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning

Paper • 2406.04520 • Published Jun 6 • 11