
BERT-Question-Classifier

The BERT-Question-Classifier is a fine-tuned version of bert-base-uncased. It classifies English questions into six types (Description, Entity, Expression, Human, Location, Numeric) and was trained on the TREC question classification dataset.

  • Developed by: phanerozoic
  • Model type: BertForSequenceClassification
  • Source model: bert-base-uncased
  • License: cc-by-nc-4.0
  • Languages: English

Model Details

The BERT-Question-Classifier uses BERT's self-attention layers to weigh each token against every other token in the question. A sequence classification head on top of the encoder maps the resulting representation to one of the six question types.
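To make the self-attention idea concrete, here is a minimal single-head sketch in plain Python. It is an illustration only: the real model applies learned query, key, and value projections, uses 12 heads per layer, and stacks 12 layers. The toy embeddings and the function names are made up for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product self-attention.

    Simplification: queries, keys, and values are the input vectors
    themselves (identity projections), unlike the learned projections in BERT.
    """
    d = len(vectors[0])
    outputs, weights = [], []
    for q in vectors:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all token vectors.
        outputs.append([sum(w_j * v[i] for w_j, v in zip(w, vectors)) for i in range(d)])
    return outputs, weights


# Toy 2-dimensional "embeddings" for three tokens (made-up values).
tokens = ["who", "wrote", "hamlet"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attn_outputs, attn_weights = self_attention(embeddings)
```

Each row of `attn_weights` sums to 1 and says how strongly one token attends to the others.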

Configuration

  • Attention probs dropout prob: 0.1
  • Hidden act: gelu
  • Hidden size: 768
  • Number of attention heads: 12
  • Number of hidden layers: 12
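As a rough sanity check, the configuration above can be turned into a per-head dimension and a parameter count. The sketch below assumes the usual bert-base-uncased defaults for values this card does not list (vocabulary size 30522, 512 positions, 2 token types, intermediate size 3072) and a six-way classification head. Treat it as an estimate under those assumptions.

```python
config = {
    # Values listed in this card.
    "hidden_size": 768,
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    # Assumed bert-base-uncased defaults (not stated in the card).
    "vocab_size": 30522,
    "max_position_embeddings": 512,
    "type_vocab_size": 2,
    "intermediate_size": 3072,
    # Six question types.
    "num_labels": 6,
}


def linear(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


def layer_norm(n):
    """Parameters of a LayerNorm: scale plus shift."""
    return 2 * n


def count_parameters(cfg):
    h = cfg["hidden_size"]
    i = cfg["intermediate_size"]
    embeddings = (
        cfg["vocab_size"] + cfg["max_position_embeddings"] + cfg["type_vocab_size"]
    ) * h + layer_norm(h)
    attention = 4 * linear(h, h) + layer_norm(h)  # query, key, value, output
    feed_forward = linear(h, i) + linear(i, h) + layer_norm(h)
    encoder = cfg["num_hidden_layers"] * (attention + feed_forward)
    pooler = linear(h, h)
    classifier = linear(h, cfg["num_labels"])
    return embeddings + encoder + pooler + classifier


head_dim = config["hidden_size"] // config["num_attention_heads"]  # 64
total_params = count_parameters(config)  # 109,486,854 under these assumptions
```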

Training and Evaluation Data

The model was trained on the TREC question classification dataset, in which each question is labeled with one of six categories: Description, Entity, Expression, Human, Location, or Numeric.

Training Procedure

Training was automated as a hyperparameter search that compared several configurations and kept the best-performing one. It ran in two phases:

  • Initial exploratory training: Various configurations of epochs, batch sizes, and learning rates were tested.
  • Focused refinement training: After the initial tests, the model was trained further with the selected hyperparameters to check for consistent performance and generalization.
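The exploratory phase can be pictured as a simple grid search. The sketch below is an assumption about the general shape of such a search, not the author's actual script. The `evaluate` callback and the example search space are hypothetical; in practice `evaluate` would fine-tune the model with the given settings and return a validation score.

```python
from itertools import product


def grid_search(evaluate, epochs_options, batch_size_options, learning_rate_options):
    """Try every combination and return (best_params, best_score).

    `evaluate` is a caller-supplied function taking epochs, batch_size,
    and learning_rate as keyword arguments and returning a score where
    higher is better.
    """
    best_params, best_score = None, None
    for epochs, batch_size, learning_rate in product(
        epochs_options, batch_size_options, learning_rate_options
    ):
        params = {
            "epochs": epochs,
            "batch_size": batch_size,
            "learning_rate": learning_rate,
        }
        score = evaluate(**params)
        if best_score is None or score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space; the card does not list the values that were tried.
example_space = {
    "epochs_options": [3, 5],
    "batch_size_options": [16, 32, 48],
    "learning_rate_options": [2e-5, 3e-5, 5e-5],
}
```

A call would look like `grid_search(my_evaluate, **example_space)`, where `my_evaluate` is whatever training-and-scoring routine is available.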

Optimal Hyperparameters Identified

  • Epochs: 5
  • Batch size: 48
  • Learning rate: 2e-5
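For readers who want to reproduce a similar run, the values above can be written as keyword arguments whose names match those of `transformers.TrainingArguments`. The step arithmetic below additionally assumes the standard TREC training split of 5,452 questions, a single device, and no gradient accumulation. None of these are stated in this card.

```python
import math

# Hyperparameters reported in this card, keyed by the argument names
# used by transformers.TrainingArguments.
best_hyperparameters = {
    "num_train_epochs": 5,
    "per_device_train_batch_size": 48,
    "learning_rate": 2e-5,
}

# Assumption: the standard TREC training split (5,452 questions).
ASSUMED_TRAIN_EXAMPLES = 5452


def optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer steps for a single device without gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


steps_per_epoch = optimizer_steps(
    ASSUMED_TRAIN_EXAMPLES, best_hyperparameters["per_device_train_batch_size"], 1
)  # 114
total_steps = optimizer_steps(
    ASSUMED_TRAIN_EXAMPLES,
    best_hyperparameters["per_device_train_batch_size"],
    best_hyperparameters["num_train_epochs"],
)  # 570
```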

Performance

After the refinement phase, the model reached the following scores on question type classification:

  • Accuracy: 91%
  • F1 Score: 92%
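The card does not say how the F1 score was averaged across the six classes. As one possible reading, the sketch below computes accuracy and macro-averaged F1 in plain Python. The labels in the example are toy data, not model output.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Toy example (not real evaluation data).
toy_true = ["Human", "Location", "Numeric", "Human"]
toy_pred = ["Human", "Location", "Human", "Human"]
toy_accuracy = accuracy(toy_true, toy_pred)
toy_macro_f1 = macro_f1(toy_true, toy_pred)
```

A weighted average would instead scale each class's F1 by its support, which can give a noticeably different number on an imbalanced label set.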

Usage

The model classifies English questions by type. It suits systems that need to categorize user queries, for example to route each question to the right downstream handler.
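A minimal way to try the model is the `transformers` text-classification pipeline. The sketch below assumes the repository id `phanerozoic/BERT-Question-Classifier` and an installed `transformers` package with a backend such as PyTorch. The helper names are illustrative. The exact label strings returned depend on the model's configuration and may be generic names such as `LABEL_0`.

```python
def load_classifier(model_id="phanerozoic/BERT-Question-Classifier"):
    """Build a text-classification pipeline (downloads weights on first use)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(questions, classifier=None):
    """Return (question, label, score) tuples for each input question.

    `classifier` can be any callable that maps a list of strings to a list
    of {"label": ..., "score": ...} dicts; by default the pipeline is loaded.
    """
    questions = list(questions)
    if classifier is None:
        classifier = load_classifier()
    results = classifier(questions)
    return [(q, r["label"], r["score"]) for q, r in zip(questions, results)]
```

Calling `classify(["Who wrote Hamlet?", "Where is the Eiffel Tower?"])` would return one tuple per question with the predicted label and its confidence score.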

Limitations

The BERT-Question-Classifier performs best on question data similar to that found in the TREC dataset. Performance may vary when applied to different domains or languages.

Acknowledgments

Special thanks to the developers of the BERT architecture and to the Hugging Face team, whose tools and libraries were crucial in the development of this classifier.

Model size: 109M params (Safetensors, tensor type F32)