bfuzzy1 (Robin Williams)

upvoted an article 5 days ago

Releasing the largest multilingual open pretraining dataset

Article • 6 days ago • 92

upvoted a paper 7 days ago

Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published 12 days ago • 27

upvoted 2 papers 10 days ago

Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model

Paper • 2411.04496 • Published 12 days ago • 20

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published 12 days ago • 105

upvoted 2 papers 12 days ago

Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge

Paper • 2411.02657 • Published 14 days ago • 5

ATM: Improving Model Merging by Alternating Tuning and Merging

Paper • 2411.03055 • Published 14 days ago • 1

upvoted 4 papers 16 days ago

BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments

Paper • 2410.23918 • Published 19 days ago • 17

SelfCodeAlign: Self-Alignment for Code Generation

Paper • 2410.24198 • Published 19 days ago • 20

What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

Paper • 2410.23743 • Published 19 days ago • 58

Communicative Agents for Software Development

Paper • 2307.07924 • Published Jul 16, 2023 • 3

upvoted an article 18 days ago

Decoding Strategies in Large Language Models

Article • 21 days ago • 37

upvoted a collection 18 days ago

Model2Vec base models

Collection

These are the MinishLab Model2Vec base models. Load them and use them with model2vec (https://github.com/MinishLab/model2vec) or sentence-transformers • 7 items • Updated 21 days ago • 8

upvoted a paper 19 days ago

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22 • 126

upvoted a collection 19 days ago

MobileLLM

Collection

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 12 days ago • 95

upvoted a paper 20 days ago

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25 • 74

upvoted 5 papers 23 days ago

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published 29 days ago • 19

Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs

Paper • 2410.18451 • Published 26 days ago • 13

Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch

Paper • 2410.18693 • Published 26 days ago • 40

LOGO -- Long cOntext aliGnment via efficient preference Optimization

Paper • 2410.18533 • Published 26 days ago • 42

Can Knowledge Editing Really Correct Hallucinations?

Paper • 2410.16251 • Published 29 days ago • 54
