taesiri PRO

https://taesiri.ai/

AI & ML interests

AGI ... one linear layer at a time

Recent Activity

updated a dataset 20 minutes ago

taesiri/DiscordCrawler

updated a dataset about 1 hour ago

taesiri/DiscordCrawler

updated a dataset about 2 hours ago

taesiri/DiscordCrawler

taesiri's activity

upvoted 7 papers about 20 hours ago

VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception of Geometric Information

Paper • 2412.00947 • Published 3 days ago • 6

VLSBench: Unveiling Visual Leakage in Multimodal Safety

Paper • 2411.19939 • Published 5 days ago • 6

SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters

Paper • 2412.00174 • Published 5 days ago • 13

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models

Paper • 2412.01822 • Published 1 day ago • 10

Open-Sora Plan: Open-Source Large Video Generation Model

Paper • 2412.00131 • Published 6 days ago • 22

o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published 5 days ago • 21

X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models

Paper • 2412.01824 • Published 1 day ago • 50

upvoted 3 papers 1 day ago

TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video

Paper • 2411.18671 • Published 7 days ago • 14

On Domain-Specific Post-Training for Multimodal Large Language Models

Paper • 2411.19930 • Published 5 days ago • 23

Video Depth without Video Models

Paper • 2411.19189 • Published 6 days ago • 23

upvoted a paper 5 days ago

Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning

Paper • 2411.18203 • Published 7 days ago • 26

upvoted an article 6 days ago

Use Models from the Hugging Face Hub in LM Studio

Article • 6 days ago • 85

upvoted 4 papers 6 days ago

upvoted 4 papers 7 days ago

SketchAgent: Language-Driven Sequential Sketch Generation

Paper • 2411.17673 • Published 8 days ago • 14

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Paper • 2411.15296 • Published 12 days ago • 18

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published 8 days ago • 42

Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration

Paper • 2411.17686 • Published 8 days ago • 18