taufiqdp (Taufiq Dwi Purnomo)

upvoted a paper 10 days ago

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI

Paper • 2411.04872 • Published 12 days ago • 4

upvoted a paper 11 days ago

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Paper • 2411.04965 • Published 11 days ago • 63

upvoted a paper 13 days ago

HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Paper • 2411.02959 • Published 14 days ago • 61

upvoted a paper 14 days ago

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Paper • 2411.02265 • Published 15 days ago • 23

upvoted a collection 15 days ago

MobileLLM

Collection

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 12 days ago • 95

upvoted a paper 18 days ago

Stealing User Prompts from Mixture of Experts

Paper • 2410.22884 • Published 20 days ago • 13

upvoted 2 papers 21 days ago

GPT-4o System Card

Paper • 2410.21276 • Published 25 days ago • 79

A Survey of Small Language Models

Paper • 2410.20011 • Published 24 days ago • 37

upvoted an article 23 days ago

Visually Multilingual: Introducing mcdse-2b

Article • 23 days ago • 37

upvoted 2 papers 25 days ago

Unbounded: A Generative Infinite Game of Character Life Simulation

Paper • 2410.18975 • Published 26 days ago • 34

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published 28 days ago • 88

upvoted a paper 27 days ago

Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant

Paper • 2410.15316 • Published 30 days ago • 10

upvoted 2 papers 28 days ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published 29 days ago • 42

AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published 29 days ago • 57

upvoted a collection 29 days ago

Granite 3.0 Language Models

Collection

A series of language models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 8 items • Updated 15 days ago • 88

upvoted a paper 29 days ago

nGPT: Normalized Transformer with Representation Learning on the Hypersphere

Paper • 2410.01131 • Published Oct 1 • 8

upvoted 4 papers about 1 month ago

Movie Gen: A Cast of Media Foundation Models

Paper • 2410.13720 • Published Oct 17 • 88

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Paper • 2410.12628 • Published Oct 16 • 27

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Paper • 2410.10819 • Published Oct 14 • 6

Baichuan-Omni Technical Report

Paper • 2410.08565 • Published Oct 11 • 83
