arxiv:2410.13841

A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models

Published on Oct 17 · Submitted by Tigerph on Oct 18

Abstract

Post-training has emerged as a crucial paradigm for adapting large-scale pre-trained models to various tasks, whose effects are fully reflected by delta parameters (i.e., the disparity between post-trained and pre-trained parameters). While numerous studies have explored delta parameter properties via operations like pruning, quantization, low-rank approximation, and extrapolation, a unified framework for systematically examining these characteristics has been lacking. In this paper, we propose a novel perspective based on Riemann sum approximation of the loss function to elucidate delta parameter editing operations. Our analysis categorizes existing methods into three classes based on their post-editing performance: competitive, decreased, and improved, explaining how they are expressed by the Riemann sum approximation term and how they alter the model performance. Extensive experiments on both visual and language models, including ViT, LLaMA 3, Qwen 2, and Mistral, corroborate our theoretical findings. Furthermore, we introduce extensions to existing techniques like DARE and BitDelta, highlighting their limitations in leveraging the properties of delta parameters and reorganizing them into general expressions to enhance the applicability and effectiveness of delta parameter editing in post-trained models.
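
For readers skimming the abstract, the "Riemann sum" framing can be made concrete as follows. The notation below (Δθ, Δθ̃, K) is ours and reflects only one plausible reading of the abstract, not necessarily the paper's exact formulation.

```latex
% Delta parameters and a Riemann-sum approximation of the loss change
% caused by replacing \Delta\theta with an edited version \widetilde{\Delta\theta}
% (notation is illustrative, not taken from the paper):
\Delta\theta = \theta_{\mathrm{post}} - \theta_{\mathrm{pre}}, \qquad
\mathcal{L}(\theta_{\mathrm{pre}} + \widetilde{\Delta\theta}) - \mathcal{L}(\theta_{\mathrm{pre}})
  = \int_{0}^{1} \nabla \mathcal{L}\!\left(\theta_{\mathrm{pre}} + t\,\widetilde{\Delta\theta}\right)^{\!\top} \widetilde{\Delta\theta}\, \mathrm{d}t
  \approx \frac{1}{K} \sum_{k=1}^{K} \nabla \mathcal{L}\!\left(\theta_{\mathrm{pre}} + \tfrac{k}{K}\,\widetilde{\Delta\theta}\right)^{\!\top} \widetilde{\Delta\theta}.
```

Under this reading, an editing operation (pruning, quantization, low-rank approximation, extrapolation) would be "competitive" when the approximated loss change stays near zero, "decreased" when it is noticeably positive, and "improved" when it is negative.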

Community

Paper submitter
  1. Novel Perspective: The paper introduces a new approach based on the Riemann sum approximation of the loss function to explain delta parameter editing operations. This provides a fresh theoretical framework for understanding the changes in parameters after post-training.

  2. Systematic Classification: It categorizes existing delta parameter editing methods into three classes based on their post-editing performance: competitive, decreased, and improved. This classification not only helps in clearly identifying the effectiveness of different methods but also elucidates how these effects are represented through the Riemann sum approximation term and how they impact model performance.

  3. Extensive Validation: The research is supported by extensive experiments conducted on both visual and language models, including ViT, LLaMA 3, Qwen 2, and Mistral. These experiments validate the theoretical findings, enhancing the credibility and generalizability of the results.

  4. Technological Enhancements: The paper extends existing techniques such as DARE and BitDelta, pointing out their limitations in leveraging the properties of delta parameters. By reorganizing these techniques into more general expressions, the authors aim to improve the applicability and effectiveness of delta parameter editing in post-trained models (a minimal sketch of the two baseline operations appears at the end of this comment).

These strengths collectively highlight the paper's significant contributions to the field of post-training adaptation for large-scale pre-trained models.
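
For context on point 4, here is a minimal PyTorch sketch of the core DARE and BitDelta operations as described in their original papers. It is an illustration only, not the authors' implementation or the generalized expressions this paper proposes; the names theta_pre, theta_post and the drop rate p are placeholders.

```python
# Minimal sketch of two delta-parameter editing operations, applied to a
# single weight matrix. Assumes PyTorch; tensor names are illustrative.

import torch

def dare_edit(theta_pre: torch.Tensor, theta_post: torch.Tensor, p: float = 0.9) -> torch.Tensor:
    """DARE-style editing: randomly drop a fraction p of the delta parameters
    and rescale the survivors by 1/(1-p), keeping the expected delta unchanged."""
    delta = theta_post - theta_pre
    keep_mask = (torch.rand_like(delta) >= p).to(delta.dtype)
    edited_delta = delta * keep_mask / (1.0 - p)
    return theta_pre + edited_delta

def bitdelta_edit(theta_pre: torch.Tensor, theta_post: torch.Tensor) -> torch.Tensor:
    """BitDelta-style editing: compress the delta to its sign plus one scalar
    scale per matrix (here the mean absolute value; the original method further
    calibrates this scale via distillation)."""
    delta = theta_post - theta_pre
    scale = delta.abs().mean()
    edited_delta = scale * torch.sign(delta)
    return theta_pre + edited_delta

if __name__ == "__main__":
    torch.manual_seed(0)
    pre = torch.randn(4, 4)
    post = pre + 0.01 * torch.randn(4, 4)  # small post-training delta
    print(dare_edit(pre, post, p=0.9))
    print(bitdelta_edit(pre, post))
```

The contrast between the two is what the paper's unified view targets: DARE preserves the delta in expectation by rescaling the surviving entries, while BitDelta keeps only one sign bit per entry plus a single per-matrix scale.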

