Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition
Abstract
We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring. Although pretrained language models (LMs) like BERT have shown superior performance in second-pass rescoring, the high computational cost of scaling up the pretraining stage and adapting the pretrained models to specific domains limits their practical use in rescoring. Here we present a method based on low-rank decomposition to train a rescoring BERT model and adapt it to new domains using only a fraction (0.08%) of the pretrained parameters. The inserted low-rank matrices are optimized through a discriminative training objective together with a correlation-based regularization loss. The proposed low-rank adaptation RescoreBERT (LoRB) architecture is evaluated on LibriSpeech and internal datasets, reducing training times by factors between 3.6 and 5.4.
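The abstract describes freezing the pretrained rescoring BERT and training only small low-rank matrices inserted alongside its weight matrices. As a rough illustration of that mechanism only (not the authors' implementation; the class name, rank `r`, and scaling `alpha` are assumptions), a LoRA-wrapped linear layer in PyTorch could look like this:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        # Only these two small matrices are trained: r*(in+out) parameters per layer.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base path (frozen) plus scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)
```

In a setup like this, only `lora_A` and `lora_B` receive gradients, which is how the trainable parameter count can drop to a small fraction of the pretrained model's size; the exact placement of the adapters and the paper's discriminative and correlation-based losses are defined in the paper itself.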
Community
This is an automated message from the Librarian Bot. The following papers, recommended by the Semantic Scholar API, are similar to this paper:
- Generative Speech Recognition Error Correction with Large Language Models (2023)
- Sparsely Shared LoRA on Whisper for Child Speech Recognition (2023)
- HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models (2023)
- Improved Factorized Neural Transducer Model For text-only Domain Adaptation (2023)
- Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units (2023)