Collections including paper arxiv:2401.18058

- Learn Your Reference Model for Real Good Alignment
  Paper • 2404.09656 • Published • 82
- Aligning Teacher with Student Preferences for Tailored Training Data Generation
  Paper • 2406.19227 • Published • 24
- Self-Play Preference Optimization for Language Model Alignment
  Paper • 2405.00675 • Published • 24
- CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
  Paper • 2404.03820 • Published • 24

- LLoCO: Learning Long Contexts Offline
  Paper • 2404.07979 • Published • 20
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 112
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 16
- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 21

- AtP*: An efficient and scalable method for localizing LLM behaviour to components
  Paper • 2403.00745 • Published • 11
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 602
- MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
  Paper • 2402.16840 • Published • 23
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 112

- Effective pruning of web-scale datasets based on complexity of concept clusters
  Paper • 2401.04578 • Published
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 39
- A Survey on Data Selection for LLM Instruction Tuning
  Paper • 2402.05123 • Published • 3
- LESS: Selecting Influential Data for Targeted Instruction Tuning
  Paper • 2402.04333 • Published • 3

- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
  Paper • 2401.06951 • Published • 25
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
  Paper • 2401.01325 • Published • 26
- Extending LLMs' Context Window with 100 Samples
  Paper • 2401.07004 • Published • 15
- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 21

- AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
  Paper • 2402.00769 • Published • 20
- LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
  Paper • 2311.05556 • Published • 81
- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 21
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 16

- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 21
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 16
- Scavenging Hyena: Distilling Transformers into Long Convolution Models
  Paper • 2401.17574 • Published • 15
- Rethinking Interpretability in the Era of Large Language Models
  Paper • 2402.01761 • Published • 21

- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 52
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 18
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 19
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 6