metadata

library_name: transformers
license: apache-2.0
base_model: google-bert/bert-base-uncased
tags:
  - generated_from_trainer
datasets:
  - squad
model-index:
  - name: debug_squad
    results: []

bert-base-uncased-finetuned-squad

This model is a fine-tuned version of google-bert/bert-base-uncased on the SQuAD v1.1 dataset.

Model description

Model Type: BERT for Question Answering
Base Model: bert-base-uncased
Language: English
Task: Question Answering
Dataset: SQuAD v1.1

Training Procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 12
eval_batch_size: 8
seed: 42
optimizer: AdamW_TORCH with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 5
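
To make the linear lr_scheduler_type concrete, here is a minimal sketch of how a linear decay schedule maps a step number to a learning rate. The function name `linear_lr` is hypothetical, and the zero warmup and the total step count are assumptions: the card lists no warmup setting, and the step count is only an estimate for illustration.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 3e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total = 36_885  # assumed total optimizer steps, for illustration only
    for step in (0, total // 2, total):
        print(step, linear_lr(step, total))
```

With no warmup, the rate starts at the configured 3e-05 and reaches zero at the final step.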

Training results

training_loss: 0.6077
eval_exact_match: 79.508
eval_f1: 87.7293
train_runtime: 1:09:34.90
train_samples_per_second: 106.019
train_steps_per_second: 8.835
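
The throughput figures above can be cross-checked against the hyperparameters with a little arithmetic. The sketch below uses only the reported numbers; the helper name `runtime_to_seconds` is hypothetical, and the results are estimates because the reported rates are rounded.

```python
def runtime_to_seconds(runtime: str) -> float:
    """Convert an 'H:MM:SS.ss' string to seconds."""
    hours, minutes, seconds = runtime.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Values copied from the training results above.
train_seconds = runtime_to_seconds("1:09:34.90")
total_steps = 8.835 * train_seconds      # train_steps_per_second * runtime
total_samples = 106.019 * train_seconds  # train_samples_per_second * runtime

samples_per_step = total_samples / total_steps  # should be close to train_batch_size
samples_per_epoch = total_samples / 5           # num_epochs is 5

if __name__ == "__main__":
    print(f"runtime: {train_seconds:.1f} s")
    print(f"estimated steps: {total_steps:.0f}")
    print(f"samples per step: {samples_per_step:.2f}")
    print(f"samples per epoch: {samples_per_epoch:.0f}")
```

The ratio of samples to steps comes out at about 12, matching train_batch_size. The per-epoch sample count is about 88,500, slightly above the roughly 87,600 questions in the SQuAD v1.1 training set. That would be consistent with long contexts being split into several overlapping windows during preprocessing, although the card does not state the windowing settings.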

Intended uses & limitations

This model is intended for English extractive question answering. It performs best on factual questions whose answer is stated explicitly in the provided context. Because it was trained on SQuAD v1.1, which contains no unanswerable questions, it always returns a span from the context, even when the context does not contain the answer.

Usage Example

import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_id = "real-jiakai/bert-base-uncased-finetuned-squad"
model = AutoModelForQuestionAnswering.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Example usage
context = "BERT was developed by Google in 2018."
question = "Who developed BERT?"

# Encode the question and context together as one input pair
inputs = tokenizer(question, context, return_tensors="pt")

# Inference only, so gradient tracking is not needed
with torch.no_grad():
    outputs = model(**inputs)

# Most likely start and end token positions of the answer span
answer_start = int(outputs.start_logits.argmax())
answer_end = int(outputs.end_logits.argmax())

# Decode the selected tokens back to text (the end index is inclusive)
answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end + 1])
print(f"Answer: {answer}")  # Expected output: "google"

Training Infrastructure

Training Device: Single GPU (NVIDIA Tesla V100 16GB)
Training Time: ~70 minutes
Framework: PyTorch
Training Script: Hugging Face Transformers' run_qa.py

Framework versions

Transformers 4.47.0.dev0
PyTorch 2.5.1+cu124
Datasets 3.1.0
Tokenizers 0.20.3