arxiv:2410.14208

Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning

Published on Oct 18 · Submitted by lixiaochuan2020 on Oct 21

Abstract

Synthetic data has been widely used to train large language models, but their generative nature inevitably introduces noisy, non-informative, and misleading learning signals. In this paper, we propose Montessori-Instruct, a novel data synthesis framework that tailors the data synthesis ability of the teacher language model toward the student language model's learning process. Specifically, we utilize local data influence of synthetic training data points on students to characterize students' learning preferences. Then, we train the teacher model with Direct Preference Optimization (DPO) to generate synthetic data tailored toward student learning preferences. Experiments with Llama3-8B-Instruct (teacher) and Llama3-8B (student) on Alpaca Eval and MT-Bench demonstrate that Montessori-Instruct significantly outperforms standard synthesis methods by 18.35% and 46.24% relatively. Our method also beats data synthesized by a stronger teacher model, GPT-4o. Further analysis confirms the benefits of teacher's learning to generate more influential training data in the student's improved learning, the advantages of local data influence in accurately measuring student preferences, and the robustness of Montessori-Instruct across different student models. Our code and data are open-sourced at https://github.com/cxcscmu/Montessori-Instruct.
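To make "local data influence" concrete, here is a minimal sketch on a toy problem. It assumes that influence is measured as the drop in loss on a held-out reference set after the student takes one training step on a single synthetic point, and that the most and least influential candidates for each seed prompt form a preference pair. The one-parameter regression "student" and all function names (`mse`, `one_step`, `local_influence`, `build_preference_pairs`) are hypothetical illustrations, not code from the authors' repository.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) example for a toy student y = w * x


def mse(w: float, data: List[Point]) -> float:
    """Mean squared error of the toy student on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def one_step(w: float, point: Point, lr: float) -> float:
    """One gradient-descent step on a single training point."""
    x, y = point
    grad = 2.0 * (w * x - y) * x
    return w - lr * grad


def local_influence(w: float, point: Point, reference: List[Point],
                    lr: float = 0.1) -> float:
    """Reference-loss drop caused by training on `point` for one step.

    Positive values mean the point helped the student; negative values
    mean it hurt. (Assumed definition for this sketch.)
    """
    return mse(w, reference) - mse(one_step(w, point, lr), reference)


def build_preference_pairs(
    w: float,
    candidates: Dict[str, List[Point]],
    reference: List[Point],
    lr: float = 0.1,
) -> Dict[str, Tuple[Point, Point]]:
    """For each seed prompt, return (chosen, rejected) = (most, least influential)."""
    pairs = {}
    for prompt, points in candidates.items():
        ranked = sorted(points, key=lambda p: local_influence(w, p, reference, lr))
        pairs[prompt] = (ranked[-1], ranked[0])
    return pairs
```

In a real setting the student would be a language model and each candidate a synthetic instruction-response pair, but the ranking logic would have the same shape: score each candidate by its effect on the student, then hand the resulting pairs to preference training of the teacher.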


Community

lixiaochuan2020

Paper submitter 11 days ago

🤔 What can we produce when combining data attribution and data synthesis?
🧑‍🎓 Introducing Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning.
⚙️ Montessori-Instruct is a novel data synthesis framework that:

Tailors the data generation capabilities of the teacher model to align with the student's learning preferences.
Leverages influence functions to precisely capture the student's needs.
Achieves 18.35% and 46.24% relative improvements on Alpaca Eval 2.0 and MT-Bench, outperforming Self-Instruct, Self-Reward, LLM2LLM, and even data synthesized by GPT-4o!
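According to the abstract, the teacher is aligned with the student's preferences using Direct Preference Optimization (DPO). As a rough illustration, the standard per-pair DPO loss can be written as below. The function name `dpo_loss`, its argument names, and the default `beta` are assumptions made for this sketch; the inputs are summed log-probabilities of the chosen and rejected generations under the teacher being trained (policy) and a frozen reference copy.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    # How much more the policy prefers "chosen" over "rejected",
    # relative to the frozen reference model.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # -log(sigmoid(z)) == log(1 + exp(-z)), computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

The loss shrinks as the teacher assigns relatively more probability to the data the student learned more from, which is how the preference signal would steer generation.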

Github Repo: https://github.com/cxcscmu/Montessori-Instruct

