MiniLLM

https://github.com/microsoft/LMOps/tree/main/minillm

t1101675

AI & ML interests

Training efficient language models (MiniLLM, MiniPLM)

Organization Card

Training Small Language Models with Knowledge Distillation

Official pre-trained models and baselines for:

MiniLLM: Knowledge distillation of LLMs during instruction tuning.
MiniPLM: Knowledge distillation of LLMs during pre-training.

Collections 2

Models 50

MiniLLM/init-gpt2-120M

Text Generation • Updated 8 days ago • 552

MiniLLM/teacher-Llama-13B

Text Generation • Updated 23 days ago • 16

MiniLLM/MiniLLM-Llama-7B

Text Generation • Updated 23 days ago • 15

MiniLLM/Ref-Pretrain-Qwen-104M

Text Generation • Updated 25 days ago • 11

MiniLLM/MiniPLM-Mamba-130M

Text Generation • Updated 25 days ago • 21

MiniLLM/MiniPLM-Qwen-1.2B

Text Generation • Updated 25 days ago • 78 • 2

MiniLLM/MiniPLM-Qwen-500M

Text Generation • Updated 25 days ago • 130 • 3

MiniLLM/MiniPLM-Qwen-200M

Text Generation • Updated 25 days ago • 243

MiniLLM/MiniPLM-llama3.1-212M

Text Generation • Updated 25 days ago • 98 • 1

MiniLLM/Pretrain-Qwen-500M

Text Generation • Updated 25 days ago • 17
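
The repository names listed above are Hugging Face Hub identifiers, so a checkpoint can presumably be loaded by name. The following is a minimal sketch, assuming the `transformers` and `torch` packages are installed and that the chosen checkpoint works with the generic `AutoModelForCausalLM` and `AutoTokenizer` classes (not confirmed by this page; the Mamba variant in particular may need extra dependencies). The `generate` helper and its parameters are hypothetical names introduced only for illustration.

```python
# Repo id copied from the model listing above; any other listed id could be substituted.
REPO_ID = "MiniLLM/MiniPLM-Qwen-200M"


def generate(prompt: str, repo_id: str = REPO_ID, max_new_tokens: int = 32) -> str:
    """Hypothetical helper: fetch a checkpoint from the Hub and continue `prompt`."""
    # Imported lazily so that defining this helper does not require the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository ships a tokenizer and a causal LM in a format
    # that the Auto* classes can resolve.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Knowledge distillation is")` would download the weights on first use and return the prompt followed by the model's continuation. The individual model cards remain the authoritative source for loading instructions.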

Datasets 10

MiniLLM/pile-tokenized

Updated 7 days ago • 8

MiniLLM/roberta-corpus-processed

Updated about 1 month ago • 60

MiniLLM/pile-diff_samp-qwen_1.8B-qwen_104M-r0.5

Updated Oct 20 • 108

MiniLLM/openwebtext-processed

Updated Sep 27 • 88

MiniLLM/dolly-processed

Viewer • Updated Sep 26 • 110k • 153 • 1

MiniLLM/sinst

Viewer • Updated Sep 26 • 8.35k • 63

MiniLLM/uinst

Viewer • Updated Sep 26 • 64.8k • 48

MiniLLM/self-inst

Viewer • Updated Sep 26 • 242 • 72 • 1

MiniLLM/Vicuna

Viewer • Updated Sep 26 • 80 • 56

MiniLLM/dolly

Viewer • Updated Sep 26 • 500 • 66