Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published 17 days ago • 128
Attention Heads of Large Language Models: A Survey Paper • 2409.03752 • Published about 1 month ago • 86
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique Paper • 2408.10701 • Published Aug 20 • 10
Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification Paper • 2408.11237 • Published Aug 20 • 4
TrackGo: A Flexible and Efficient Method for Controllable Video Generation Paper • 2408.11475 • Published Aug 21 • 16
Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs Paper • 2408.12060 • Published Aug 22 • 4
Text-Driven Neural Collaborative Filtering Model for Paper Source Tracing Paper • 2407.17722 • Published Jul 25 • 7
Offline Regularised Reinforcement Learning for Large Language Models Alignment Paper • 2405.19107 • Published May 29 • 13
LLMs achieve adult human performance on higher-order theory of mind tasks Paper • 2405.18870 • Published May 29 • 16
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework Paper • 2403.13248 • Published Mar 20 • 76
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding Paper • 2403.11481 • Published Mar 18 • 11
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15 • 56
Gemma: Open Models Based on Gemini Research and Technology Paper • 2403.08295 • Published Mar 13 • 47
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents Paper • 2403.08715 • Published Mar 13 • 20
VideoAgent: Long-form Video Understanding with Large Language Model as Agent Paper • 2403.10517 • Published Mar 15 • 30
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations Paper • 2403.09704 • Published Mar 8 • 31
Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration Paper • 2307.05300 • Published Jul 11, 2023 • 18
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling Paper • 2403.03234 • Published Mar 5 • 11
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL Paper • 2403.03950 • Published Mar 6 • 13
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 592
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web Paper • 2402.17553 • Published Feb 27 • 21
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Paper • 2402.17177 • Published Feb 27 • 88