arxiv:2406.06623

Spectrum: Targeted Training on Signal to Noise Ratio

Published on Jun 7

Abstract

Efficiently post-training large language models remains a challenging task due to the vast computational resources required. We present Spectrum, a method that accelerates LLM training by selectively targeting layer modules based on their signal-to-noise ratio (SNR) and freezing the remaining modules. Our approach, which uses an algorithm to compute module SNRs prior to training, has been shown to effectively match the performance of full fine-tuning while reducing GPU memory usage. Experiments comparing Spectrum to existing methods such as QLoRA demonstrate its effectiveness in terms of model quality and VRAM efficiency in distributed environments.
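The selection step described in the abstract (train the high-SNR modules, freeze the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_modules_by_snr`, the `top_fraction` parameter, the module names, and the SNR values are all hypothetical, and the abstract does not say how the SNR itself is computed, so the scores are simply taken as given.

```python
from typing import Dict, List, Tuple


def select_modules_by_snr(
    snr_by_module: Dict[str, float], top_fraction: float = 0.25
) -> Tuple[List[str], List[str]]:
    """Split module names into (trainable, frozen) by descending SNR.

    Hypothetical helper: assumes SNR scores were computed before training.
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    # Rank module names from highest to lowest SNR.
    ranked = sorted(snr_by_module, key=snr_by_module.get, reverse=True)
    # Keep at least one module trainable when any modules are given.
    n_train = max(1, int(len(ranked) * top_fraction))
    return ranked[:n_train], ranked[n_train:]


# Made-up SNR scores for illustration only.
example_snr = {
    "layers.0.mlp.down_proj": 3.2,
    "layers.1.mlp.down_proj": 0.7,
    "layers.2.mlp.down_proj": 1.9,
    "layers.3.mlp.down_proj": 0.4,
}
trainable, frozen = select_modules_by_snr(example_snr, top_fraction=0.5)
```

In a framework such as PyTorch, the parameters of the modules in `frozen` would then typically have gradient tracking disabled, so that optimizer state and gradients are kept only for the modules in `trainable`.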

Community

The same theory was established years ago by https://github.com/CalculatedContent/WeightWatcher
However, there is no attribution to it, which is a letdown.
