samusenps

AI & ML interests

Foundational Architectures, Multi-Modality, Interpretability, Benchmarking w/ simulations, Robotics, Integration with Non-invasive Open Source stack RISC-V BCI. Extremely high quality training data. Fully Open Source ML/AI.

samusenps's activity

upvoted a paper 13 days ago

Balancing Pipeline Parallelism with Vocabulary Parallelism

Paper • 2411.05288 • Published 16 days ago • 19

upvoted 5 papers 15 days ago

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published 16 days ago • 48

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published 16 days ago • 48

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Paper • 2411.04965 • Published 16 days ago • 63

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published 16 days ago • 68

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published 16 days ago • 109

upvoted 14 papers 18 days ago

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Paper • 2410.23218 • Published 24 days ago • 46

SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF

Paper • 2411.01798 • Published 20 days ago • 8

Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models

Paper • 2411.00743 • Published 22 days ago • 6

AutoVFX: Physically Realistic Video Editing from Natural Language Instructions

Paper • 2411.02394 • Published 19 days ago • 17

LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models

Paper • 2411.00918 • Published 22 days ago • 8

PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance

Paper • 2411.02327 • Published 19 days ago • 11

IGOR: Image-GOal Representations are the Atomic Control Units for Foundation Models in Embodied AI

Paper • 2411.00785 • Published Oct 17 • 8

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published 25 days ago • 15

GenXD: Generating Any 3D and 4D Scenes

Paper • 2411.02319 • Published 19 days ago • 20

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Paper • 2411.02265 • Published 19 days ago • 24

How Far is Video Generation from World Model: A Physical Law Perspective

Paper • 2411.02385 • Published 19 days ago • 32