Model Card for pipp-finder-bert-base-cased

This highly idiosyncratic and specific binary classifier is designed for the sole purpose of helping linguists find instances of the English Preposing in PP (PiPP) construction in corpora. PiPPs are unbounded dependency constructions like "Happy though we were with the idea, we decided not to pursue it". This model does a good job of classifying sentences according to whether they contain an instance of the construction.

The model is used as an investigative tool in this paper:

Model Details

The model is a fine-tuned bert-base-cased model. The fine-tuning data are available as annotated/pipp-labels.csv in this project repository. All the annotations were done by Christopher Potts for the project "Characterizing English Preposing in PP constructions".

The model outputs 1 if it predicts the input contains a PiPP, else 0.
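As a rough illustration of that 1/0 output, the sketch below turns a pair of classifier logits into a label. It assumes the common sequence-classification convention in which index 1 holds the positive (PiPP) score; the function name `logits_to_label` and the example logits are hypothetical and are not taken from the project code.

```python
def logits_to_label(logits):
    """Map a two-element logit pair to 1 (PiPP) or 0 (no PiPP).

    Assumption: index 0 is the "no PiPP" score and index 1 is the
    "PiPP" score. Ties are resolved in favor of 0.
    """
    if len(logits) != 2:
        raise ValueError("expected exactly two logits for a binary classifier")
    return int(logits[1] > logits[0])


# Made-up logits for two sentences, for illustration only.
example_logits = [[-1.2, 2.3], [0.8, -0.5]]
labels = [logits_to_label(pair) for pair in example_logits]
print(labels)
```

The usage notebook linked under "How to Get Started with the Model" remains the reference for how predictions are actually produced.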

Model Description

  • Developed by: Christopher Potts
  • Shared by: Christopher Potts
  • Model type: Binary classifier
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from model: bert-base-cased

Uses

The sole purpose of the model is to try to identify sentences containing PiPPs. I assume that one first filters sentences using very general regular expressions, and that this model then helps one find the gems while going through examples by hand.
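A minimal sketch of the kind of permissive prefilter this workflow assumes is given here. The pattern is a hypothetical stand-in, not the regular expression used in the project: it keeps any sentence in which some word is directly followed by "though" or "as", so it will over-generate heavily, and the classifier is then meant to sort through what remains.

```python
import re

# Hypothetical, deliberately broad pattern: a word followed by
# whitespace and then "though" or "as" as a whole word.
CANDIDATE_PATTERN = re.compile(r"\b\w+\s+(?:though|as)\b", re.IGNORECASE)


def prefilter(sentences):
    """Keep only the sentences that match the broad candidate pattern."""
    return [s for s in sentences if CANDIDATE_PATTERN.search(s)]


sentences = [
    "Happy though we were with the idea, we decided not to pursue it.",
    "We decided not to pursue it.",
]
candidates = prefilter(sentences)
print(candidates)
```

Only the sentences that survive this step would be passed to the classifier, which keeps the manual review focused on plausible cases.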

The model is useless for really anything except this linguistically motivated task. And, even from the perspective of theoretical linguistics, this is a highly niche application!

How to Get Started with the Model

See https://github.com/cgpotts/pipps/blob/main/classifiers_usage.ipynb

Training Details

See https://github.com/cgpotts/pipps/blob/main/classifier_training.ipynb

Evaluation

See https://github.com/cgpotts/pipps/blob/main/classifiers_usage.ipynb

Citation

See https://github.com/cgpotts/pipps

Model Card Authors

Christopher Potts

Model Card Contact

Christopher Potts