TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices Paper • 2410.00531 • Published Oct 1 • 27
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Paper • 2408.12528 • Published Aug 22 • 50
Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs Paper • 2408.12060 • Published Aug 22 • 4
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community Paper • 2408.08291 • Published Aug 15 • 9
LLM Circuit Analyses Are Consistent Across Training and Scale Paper • 2407.10827 • Published Jul 15 • 4
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated 17 days ago • 340
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs Paper • 2406.15319 • Published Jun 21 • 60
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17 • 48
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization Paper • 2406.11431 • Published Jun 17 • 4
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling Paper • 2406.07522 • Published Jun 11 • 36
Open-Endedness is Essential for Artificial Superhuman Intelligence Paper • 2406.04268 • Published Jun 6 • 11
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration Paper • 2406.01014 • Published Jun 3 • 30
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Paper • 2405.21060 • Published May 31 • 63
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10 • 103
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement Paper • 2403.15042 • Published Mar 22 • 24
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling Paper • 2402.10466 • Published Feb 16 • 16
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss Paper • 2402.10790 • Published Feb 16 • 40
InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory Paper • 2402.04617 • Published Feb 7 • 4
AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts Paper • 2402.07625 • Published Feb 12 • 11
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement Paper • 2402.07456 • Published Feb 12 • 41
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12 • 45
World Model on Million-Length Video And Language With RingAttention Paper • 2402.08268 • Published Feb 13 • 36
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay Paper • 2402.04858 • Published Feb 7 • 14
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6 • 109
The Impact of Reasoning Step Length on Large Language Models Paper • 2401.04925 • Published Jan 10 • 15