---
license: cc-by-nc-4.0
language:
- en
tags:
- bert
- question-classification
- trec
widget:
- text: |
    Enter your text to classify its content.
  example_title: Classify Question Type
---
# BERT-Question-Classifier

The BERT-Question-Classifier is a fine-tuned model based on the bert-base-uncased architecture. It classifies questions into one of six types (Description, Entity, Expression, Human, Location, Numeric) and was trained on the TREC question classification dataset.
- Developed by: phanerozoic
- Model type: BertForSequenceClassification
- Source model: bert-base-uncased
- License: cc-by-nc-4.0
- Languages: English
## Model Details

The BERT-Question-Classifier uses BERT's self-attention mechanism to weigh the relevance of each token in the context of a question, with a classification head tuned for categorizing question types.
### Configuration
- Attention probs dropout prob: 0.1
- Hidden act: gelu
- Hidden size: 768
- Number of attention heads: 12
- Number of hidden layers: 12
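These values correspond to fields on the Hugging Face `BertConfig` and match the bert-base-uncased defaults; a minimal sketch of reconstructing the configuration (the `num_labels` value is an assumption based on the six question types):

```python
from transformers import BertConfig

# Rebuild the configuration listed above; since these are the
# bert-base-uncased defaults, BertConfig() alone would give the same result.
config = BertConfig(
    attention_probs_dropout_prob=0.1,
    hidden_act="gelu",
    hidden_size=768,
    num_attention_heads=12,
    num_hidden_layers=12,
    num_labels=6,  # assumed: one label per coarse question type
)
print(config)
```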
## Training and Evaluation Data

This model is trained on the TREC dataset, which contains a diverse set of questions, each labeled with one of six coarse categories: Description, Entity, Expression, Human, Location, and Numeric.
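The TREC dataset is available on the Hugging Face Hub; a minimal loading sketch (column names follow the current `datasets` release of TREC and may differ in older versions):

```python
from datasets import load_dataset

# TREC question classification: ~5.5k training questions, 500 test questions.
trec = load_dataset("trec")

example = trec["train"][0]
print(example["text"])          # the question string
print(example["coarse_label"])  # integer id of the coarse question type
```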
## Training Procedure

Training was automated to sweep over hyperparameters and select the configuration with the best model performance.
- Initial exploratory training: various configurations of epochs, batch sizes, and learning rates were tested.
- Focused refinement training: after the initial sweep, the model was retrained with the selected hyperparameters to confirm consistent performance and generalization.
### Optimal Hyperparameters Identified
- Epochs: 5
- Batch size: 48
- Learning rate: 2e-5
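A minimal fine-tuning sketch using these hyperparameters with the Hugging Face `Trainer`; the tokenization settings and dataset column names are assumptions, not taken from the original training code:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=6)

trec = load_dataset("trec")

def tokenize(batch):
    # max_length=64 is an assumed setting; TREC questions are short.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=64)

trec = trec.map(tokenize, batched=True)
trec = trec.rename_column("coarse_label", "labels")

args = TrainingArguments(
    output_dir="bert-question-classifier",
    num_train_epochs=5,              # optimal epochs from the sweep
    per_device_train_batch_size=48,  # optimal batch size
    learning_rate=2e-5,              # optimal learning rate
)

trainer = Trainer(model=model, args=args,
                  train_dataset=trec["train"],
                  eval_dataset=trec["test"])
trainer.train()
```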
## Performance

After refinement, the model classifies question types with high accuracy:
- Accuracy: 91%
- F1 Score: 92%
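The averaging scheme behind the reported F1 score is not stated; a sketch of how such metrics can be computed, assuming weighted averaging over the six classes:

```python
from sklearn.metrics import accuracy_score, f1_score

# y_true / y_pred are integer class ids for the six question types;
# the values below are placeholders for illustration only.
y_true = [0, 1, 2, 3, 4, 5, 0]
y_pred = [0, 1, 2, 3, 4, 0, 0]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1 (weighted):", f1_score(y_true, y_pred, average="weighted"))
```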
## Usage

This model classifies English question types and suits systems that need to interpret and categorize user queries accurately, as in the sketch below.
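A minimal inference sketch using the `transformers` pipeline; the repository id `phanerozoic/BERT-Question-Classifier` is assumed from the author and model name above, and the label string in the comment is illustrative:

```python
from transformers import pipeline

# Assumed repo id; adjust if the hosted model uses a different name.
classifier = pipeline("text-classification",
                      model="phanerozoic/BERT-Question-Classifier")

print(classifier("Where is the Eiffel Tower located?"))
# Expected: a Location-type prediction, e.g. [{'label': 'LOC', 'score': ...}]
```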
## Limitations
The BERT-Question-Classifier performs best on question data similar to that found in the TREC dataset. Performance may vary when applied to different domains or languages.
## Acknowledgments

Special thanks to the developers of the BERT architecture and to the Hugging Face team, whose tools and libraries were crucial in developing this classifier.