Collections

Collections including paper arxiv:2410.17215

about 6 hours ago

MiniPLM: Knowledge Distillation for Pre-Training Language Models

Paper • 2410.17215 • Published 10 days ago • 12
LOGO -- Long cOntext aliGnment via efficient preference Optimization

Paper • 2410.18533 • Published 8 days ago • 42
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published 10 days ago • 81
LongReward: Improving Long-context Large Language Models with AI Feedback

Paper • 2410.21252 • Published 4 days ago • 16

Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

Paper • 2410.10814 • Published 18 days ago • 48
MiniPLM: Knowledge Distillation for Pre-Training Language Models

Paper • 2410.17215 • Published 10 days ago • 12
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Paper • 2410.16256 • Published 11 days ago • 56
CCI3.0-HQ: a large-scale Chinese dataset of high quality designed for pre-training large language models

Paper • 2410.18505 • Published 8 days ago • 8

Aligning Teacher with Student Preferences for Tailored Training Data Generation

Paper • 2406.19227 • Published Jun 27 • 24
Pre-training Distillation for Large Language Models: A Design Space Exploration

Paper • 2410.16215 • Published 11 days ago • 15
Baichuan Alignment Technical Report

Paper • 2410.14940 • Published 13 days ago • 46
MiniPLM: Knowledge Distillation for Pre-Training Language Models

Paper • 2410.17215 • Published 10 days ago • 12

PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 53
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 75
Table-GPT: Table-tuned GPT for Diverse Table Tasks

Paper • 2310.09263 • Published Oct 13, 2023 • 39
Context-Aware Meta-Learning

Paper • 2310.10971 • Published Oct 17, 2023 • 16

about 10 hours ago

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 143
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20 • 11
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24 • 49
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24 • 44
