mdeberta-v3-base GermanSQuAD Model

Welcome to the repository for mdeberta-v3-base fine-tuned on the GermanSQuAD dataset for extractive question answering (QA). The model extracts answers to German questions from German text, building on the multilingual mdeberta-v3-base language model.

Overview

This model is fine-tuned to understand and process German-language questions and contexts, making it well suited for applications that require extractive QA in German.

  • Language model: mdeberta-v3-base
  • Language: German
  • Downstream-task: Extractive Question Answering
  • Training data: GermanSQuAD
  • Evaluation data: SQuAD (translated to German for consistent evaluation)
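
The GermanSQuAD training data can be loaded with the datasets library. The sketch below assumes the corpus is the deepset/germanquad dataset on the Hugging Face Hub; adjust the identifier if you use a different copy.

from datasets import load_dataset

# Load the German QA corpus (assumed here to be deepset/germanquad)
germanquad = load_dataset("deepset/germanquad")

print(germanquad)                          # available splits and fields
print(germanquad["train"][0]["question"])  # one example question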

Model Training

The model was trained with the following hyperparameters:

  • Batch size: 12
  • Number of epochs: 4
  • Base language model: mdeberta-v3-base
  • Learning rate: 2e-5
  • Learning rate schedule: Linear with Warmup
  • Warmup proportion: 0.1

These hyperparameters balance training efficiency against the quality of the resulting model on the extractive QA task; a sketch of how they map onto a Transformers training configuration is shown below.
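
For orientation, the listed hyperparameters correspond roughly to the following TrainingArguments configuration. This is a sketch, not the original training script; the output_dir value is a placeholder.

from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters above onto TrainingArguments;
# output_dir is a placeholder, not the original training setup.
training_args = TrainingArguments(
    output_dir="mdeberta-v3-base-germansquad",  # placeholder
    per_device_train_batch_size=12,
    num_train_epochs=4,
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
)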

Results

The model achieved the following results on the evaluation data:

  • Exact Match (EM): 64.56%
  • F1 Score: 82.51%

Exact Match counts only predictions that match a gold answer string exactly, while F1 gives credit for token-level overlap, so together these metrics capture both precisely extracted spans and answers that are close to, but not identical with, the reference text.
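
To compute EM and F1 for your own predictions in the same way, you can use the SQuAD metric from the evaluate library. The ids and texts below are placeholders, not data from the evaluation set.

import evaluate

squad_metric = evaluate.load("squad")

# Placeholder prediction/reference pair in SQuAD format
predictions = [{"id": "1", "prediction_text": "die genaue Antwort"}]
references = [{"id": "1", "answers": {"text": ["die genaue Antwort"], "answer_start": [0]}}]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}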

Usage

To use this model for extractive question answering in German, you can load it using the Hugging Face Transformers library. Below is a quick example of how to do so:

from transformers import AutoModelForQuestionAnswering, AutoTokenizer
import torch

model_name = "adresolo/mdeberta-v3-base-germansquad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

# Example question and context (German: "What is the main goal of the QA task?" /
# "The main task of question answering (QA) is to extract the exact answer to a
# posed question from a given context.")
question = "Was ist das Hauptziel der QA-Aufgabe?"
context = "Die Hauptaufgabe der Fragebeantwortung (QA) ist es, aus einem gegebenen Kontext die genaue Antwort auf eine gestellte Frage zu extrahieren."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pick the most likely start and end token positions of the answer span
answer_start = torch.argmax(outputs.start_logits)
answer_end = torch.argmax(outputs.end_logits) + 1

# Decode the predicted answer span back into text
answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end])

print("Predicted answer:", answer)