Collections

Discover the best community collections!

Collections including paper arxiv:2402.00838

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 144
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20 • 12
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24 • 51
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24 • 45

Seminal AI Papers

A collection of top AI papers.

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 44
You Only Look Once: Unified, Real-Time Object Detection

Paper • 1506.02640 • Published Jun 8, 2015 • 1
HEp-2 Cell Image Classification with Deep Convolutional Neural Networks

Paper • 1504.02531 • Published Apr 10, 2015
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Paper • 2401.05566 • Published Jan 10 • 26

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

Paper • 2406.06563 • Published Jun 3 • 17

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

Paper • 2405.19327 • Published May 29 • 46
LLM360/K2

Text Generation • Updated Jul 29 • 553 • 80
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80
LLM360: Towards Fully Transparent Open-Source LLMs

Paper • 2312.06550 • Published Dec 11, 2023 • 56

Foundation Models

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Paper • 2403.05530 • Published Mar 8 • 60
StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 30
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Paper • 2312.15166 • Published Dec 23, 2023 • 56

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28 • 104
sDPO: Don't Use Your Data All at Once

Paper • 2403.19270 • Published Mar 28 • 40
ViTAR: Vision Transformer with Any Resolution

Paper • 2403.18361 • Published Mar 27 • 52
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

Paper • 2403.18814 • Published Mar 27 • 44

Issa collection 1

allenai/basic_arithmetic

Viewer • Updated Mar 8 • 3k • 21 • 1
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 80

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 99
How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15 • 38
BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 17
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

Paper • 2402.09727 • Published Feb 15 • 35
