mdeberta-base-v3 GermanSQuAD Model
This repository hosts the mdeberta-base-v3 model fine-tuned on the GermanSQuAD dataset for extractive question answering (QA). Given a question and a passage of German text, the model extracts the answer as a span of that passage.
Overview
The model takes a German question together with a German context passage and predicts where the answer starts and ends within the context.
- Language model: mdeberta-base-v3
- Language: German
- Downstream-task: Extractive Question Answering
- Training data: GermanSQuAD
- Evaluation data: SQuAD (translated to German for consistent evaluation)
Model Training
The model was trained with the following hyperparameters:
- Batch size: 12
- Number of epochs: 4
- Base language model: mdeberta-base-v3
- Learning rate: 2e-5
- Learning rate schedule: Linear with Warmup
- Warmup proportion: 0.1
These hyperparameters were selected to optimize the model's performance on the extractive QA task, balancing training efficiency with the quality of the resulting model.
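As a rough illustration of the "Linear with Warmup" schedule listed above, the sketch below computes the learning rate at a given training step in plain Python. It assumes the rate decays linearly to zero after the warmup phase, which is a common convention but is not confirmed by this model card. The function name `linear_warmup_lr` and the value of `total_steps` are hypothetical, since the number of training steps is not stated.

```python
def linear_warmup_lr(step, total_steps, base_lr=2e-5, warmup_proportion=0.1):
    """Learning rate at `step`: linear warmup, then linear decay to zero (assumed)."""
    warmup_steps = int(total_steps * warmup_proportion)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 1000  # hypothetical; the real step count is not given in this card
for step in (0, 50, 100, 550, 1000):
    print(step, linear_warmup_lr(step, total_steps))
```

With a warmup proportion of 0.1, the peak rate of 2e-5 is reached after the first 10% of steps and then decreases for the rest of training.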
Results
The model achieved the following results on the evaluation data:
- Exact Match (EM): 64.56%
- F1 Score: 82.51%
Exact Match is the percentage of predictions that match a reference answer exactly. The F1 score measures token-level overlap between the predicted and reference answers, so a prediction that only partially overlaps the reference still receives partial credit.
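To make the two metrics concrete, here is a minimal sketch of SQuAD-style Exact Match and token-level F1 for a single prediction. The normalization is deliberately simplified (lowercasing, removing ASCII punctuation, collapsing whitespace) and is an assumption; the evaluation script used for the numbers above may normalize differently. The function names are illustrative.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace (simplified)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it occurs in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# A prediction that contains the reference plus extra words:
print(exact_match("im Jahr 1989", "1989"), f1_score("im Jahr 1989", "1989"))
```

In this example the prediction is not an exact match, but it shares one of its three tokens with the reference, so it still earns a non-zero F1. Dataset-level scores are typically the average of these per-example values, taking the maximum over reference answers when several are available.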
Usage
To use this model for extractive question answering in German, you can load it using the Hugging Face Transformers library. Below is a quick example of how to do so:
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_name = "adresolo/mdeberta-v3-base-germansquad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

# Example question and context
question = "Was ist das Hauptziel der QA-Aufgabe?"
context = "Die Hauptaufgabe der Fragebeantwortung (QA) ist es, aus einem gegebenen Kontext die genaue Antwort auf eine gestellte Frage zu extrahieren."

# Tokenize the question and context as a single input pair
inputs = tokenizer(question, context, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Most likely start and end token positions (end is made exclusive for slicing)
answer_start = torch.argmax(outputs.start_logits)
answer_end = torch.argmax(outputs.end_logits) + 1

# Decode the predicted token span back into text
answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end], skip_special_tokens=True)
print("Predicted answer:", answer)
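Taking the argmax of the start and end scores independently can produce an end position that lies before the start position, which yields an empty answer. A common remedy is to search for the highest-scoring valid span instead. The sketch below shows the idea on plain Python lists of per-token scores (for example, the model's start and end scores converted with `.tolist()`). The function name `best_span` and the `max_answer_len` default are illustrative assumptions, and the sketch does not restrict the search to context tokens.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return (start, end) token indices, end inclusive, with the highest combined score.

    Only spans with end >= start and at most `max_answer_len` tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_scores):
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Toy scores where independent argmax would pick start=2 and end=1 (an invalid span).
start_scores = [0.1, 0.2, 3.0, 0.3, 0.1]
end_scores = [0.1, 4.0, 0.2, 2.5, 0.1]
print(best_span(start_scores, end_scores))  # (2, 3)
```

Because the returned end index is inclusive, add 1 to it before slicing the input IDs for decoding.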