Update README.md
README.md CHANGED
@@ -17,7 +17,7 @@ model-index:
       value: 88.4735
 ---
 ## Model Details: 80% 1x4 Block Sparse BERT-Base (uncased) Fine Tuned on SQuADv1.1
-This model has been fine-tuned for the NLP task of question answering, trained on the SQuAD 1.1 dataset. It is a result of fine-tuning a Prune
+This model has been fine-tuned for the NLP task of question answering, trained on the SQuAD 1.1 dataset. It is a result of fine-tuning a Prune Once For All 80% 1x4 block sparse pre-trained BERT-Base model, combined with knowledge distillation.
 > We present a new method for training sparse pre-trained Transformer language models by integrating weight pruning and model distillation. These sparse pre-trained models can be used to transfer learning for a wide range of tasks while maintaining their sparsity pattern. We show how the compressed sparse pre-trained models we trained transfer their knowledge to five different downstream natural language tasks with minimal accuracy loss. For example, with our sparse pre-trained BERT-Large fine-tuned on SQuADv1.1 and quantized to 8bit we achieve a compression ratio of 40X for the encoder with less than 1% accuracy loss.
 
 
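For reference, below is a minimal sketch of how an extractive question-answering model like the one described in the updated paragraph is typically loaded with the Transformers `pipeline` API. The model ID used here is a placeholder, since the actual Hub repository ID does not appear in this diff; substitute this model's real ID before running.

```python
# Minimal usage sketch for an extractive QA model fine-tuned on SQuAD 1.1.
# NOTE: "your-org/sparse-bert-base-squadv1.1" is a placeholder, not this model's real Hub ID.
from transformers import pipeline

MODEL_ID = "your-org/sparse-bert-base-squadv1.1"  # replace with this repository's ID

# Build a question-answering pipeline from the fine-tuned checkpoint.
qa = pipeline("question-answering", model=MODEL_ID, tokenizer=MODEL_ID)

# Ask a question against a short context passage; the model returns a span from the context.
result = qa(
    question="What dataset was the model fine-tuned on?",
    context=(
        "This model has been fine-tuned for the NLP task of question answering, "
        "trained on the SQuAD 1.1 dataset."
    ),
)
print(result["answer"], round(result["score"], 4))
```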