Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model • Paper • 2411.04496 • Published 2 days ago • 15
Survey of User Interface Design and Interaction Techniques in Generative AI Applications • Paper • 2410.22370 • Published 12 days ago • 11
Unbounded: A Generative Infinite Game of Character Life Simulation • Paper • 2410.18975 • Published 16 days ago • 34
Tracking Universal Features Through Fine-Tuning and Model Merging • Paper • 2410.12391 • Published 24 days ago • 5
HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks • Paper • 2410.12381 • Published 24 days ago • 41
Agent S: An Open Agentic Framework that Uses Computers Like a Human • Paper • 2410.08164 • Published 30 days ago • 24
Inference Scaling for Long-Context Retrieval Augmented Generation • Paper • 2410.04343 • Published Oct 6 • 9
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation • Paper • 2409.18964 • Published Sep 27 • 25
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance • Paper • 2409.04593 • Published Sep 6 • 22
WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild • Paper • 2409.03753 • Published Sep 5 • 18
VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges • Paper • 2409.01071 • Published Sep 2 • 26
Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler • Paper • 2408.13359 • Published Aug 23 • 21
A Web-Based Solution for Federated Learning with LLM-Based Automation • Paper • 2408.13010 • Published Aug 23 • 8
Building and better understanding vision-language models: insights and future directions • Paper • 2408.12637 • Published Aug 22 • 116
The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks • Paper • 2408.10446 • Published Aug 19 • 5
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique • Paper • 2408.10701 • Published Aug 20 • 10
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model • Paper • 2408.11039 • Published Aug 20 • 56