MisakiWang (Misaki Wang)

upvoted a paper 16 days ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published 17 days ago • 128

upvoted 3 papers 27 days ago

upvoted 5 papers about 1 month ago

Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique

Paper • 2408.10701 • Published Aug 20 • 10

Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification

Paper • 2408.11237 • Published Aug 20 • 4

TrackGo: A Flexible and Efficient Method for Controllable Video Generation

Paper • 2408.11475 • Published Aug 21 • 16

Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs

Paper • 2408.12060 • Published Aug 22 • 4

Foundation Models for Music: A Survey

Paper • 2408.14340 • Published Aug 26 • 38

upvoted a paper about 2 months ago

Text-Driven Neural Collaborative Filtering Model for Paper Source Tracing

Paper • 2407.17722 • Published Jul 25 • 7

upvoted a collection 2 months ago

"Physics of Language Models" series

Collection

6 items • Updated Aug 30 • 31

upvoted 2 papers 4 months ago

Offline Regularised Reinforcement Learning for Large Language Models Alignment

Paper • 2405.19107 • Published May 29 • 13

LLMs achieve adult human performance on higher-order theory of mind tasks

Paper • 2405.18870 • Published May 29 • 16

upvoted 5 papers 7 months ago

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework

Paper • 2403.13248 • Published Mar 20 • 76

VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding

Paper • 2403.11481 • Published Mar 18 • 11

DiPaCo: Distributed Path Composition

Paper • 2403.10616 • Published Mar 15 • 12

PERL: Parameter Efficient Reinforcement Learning from Human Feedback

Paper • 2403.10704 • Published Mar 15 • 56

Gemma: Open Models Based on Gemini Research and Technology

Paper • 2403.08295 • Published Mar 13 • 47

upvoted a collection 7 months ago

Agents

Collection

60 items • Updated Aug 14 • 5

upvoted 10 papers 7 months ago

SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents

Paper • 2403.08715 • Published Mar 13 • 20

VideoAgent: Long-form Video Understanding with Large Language Model as Agent

Paper • 2403.10517 • Published Mar 15 • 30

Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

Paper • 2403.09704 • Published Mar 8 • 31

Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration

Paper • 2307.05300 • Published Jul 11, 2023 • 18

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling

Paper • 2403.03234 • Published Mar 5 • 11

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

Paper • 2403.03950 • Published Mar 6 • 13

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 592

Priority Sampling of Large Language Models for Compilers

Paper • 2402.18734 • Published Feb 28 • 16

OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web

Paper • 2402.17553 • Published Feb 27 • 21

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

Paper • 2402.17177 • Published Feb 27 • 88
