deepset/roberta-base-squad2-distilled

MichelBartelsDeepset committed on Dec 8, 2021

Commit 18c01a6 • 1 Parent(s): 930271a

Update README.md

Files changed (1)

README.md +2 -12

@@ -1,7 +1,7 @@
 ---
 language: en
 datasets:
-- deepset/germanquad
+- squad_v2
 license: mit
 thumbnail: https://thumb.tildacdn.com/tild3433-3637-4830-a533-353833613061/-/resize/720x/-/format/webp/germanquad.jpg
 tags:
@@ -19,12 +19,7 @@ tags:
 **Published**: Apr 21st, 2021
 ## Details
-- We trained a German question answering model with a gelectra-base model as its basis.
-- The dataset is GermanQuAD, a new, German language dataset, which we hand-annotated and published [online](https://deepset.ai/germanquad).
-- The training dataset is one-way annotated and contains 11518 questions and 11518 answers, while the test dataset is three-way annotated so that there are 2204 questions and with 2204·3−76 = 6536answers, because we removed 76 wrong answers.
-- In addition to the annotations in GermanQuAD, haystack's distillation feature was used for training. deepset/roberta-large-squad2 was used as the teacher model.
-See https://deepset.ai/germanquad for more details and dataset download in SQuAD format.
+- haystack's distillation feature was used for training. deepset/roberta-large-squad2 was used as the teacher model.
 ## Hyperparameters
 ```
@@ -38,11 +33,6 @@ temperature = 1.5
 distillation_loss_weight = 0.75
 ```
 ## Performance
-We evaluated the extractive question answering performance on the SQuAD v2 dev set.
-Model types and training data are included in the model name.
-For finetuning XLM-Roberta, we use the English SQuAD v2.0 dataset.
-The GELECTRA models are warm started on the German translation of SQuAD v1.1 and finetuned on \\\\germanquad.
-The human baseline was computed for the 3-way test set by taking one answer as prediction and the other two as ground truth.
 ```
 "exact": 79.8366040596311
 "f1": 83.916407079888