
SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
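
The first step above can be illustrated with a minimal sketch of how labeled sentences might be turned into contrastive training pairs. It assumes the pairwise setup that goes with CosineSimilarityLoss, where same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0. The function name make_pairs is hypothetical and is not part of the SetFit API; the sentences are taken from the label examples in this card.

```python
from itertools import combinations
from typing import List, Tuple


def make_pairs(texts: List[str], labels: List[bool]) -> List[Tuple[str, str, float]]:
    """Build (sentence_a, sentence_b, target_similarity) triples.

    Illustrative only: same-label pairs get 1.0, different-label pairs get 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


texts = [
    "Guide to learning a new language",
    "Exploring historical landmarks in Europe",
    "Experiencing extreme mood swings not related to external circumstances.",
]
labels = [True, True, False]

pairs = make_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

Three sentences yield three unordered pairs, which is why even a small labeled set gives the embedding model many training signals. The second step (fitting the classification head on the fine-tuned embeddings) is not shown here because it needs the actual model.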

Model Details

Model Description

Model Sources

Model Labels

Label Examples
False
  • 'I have so many issues to address. I have a history of sexual abuse, I’m a breast cancer survivor and I am a lifetime insomniac. I have a long history of depression and I’m beginning to have anxiety. I have low self esteem but I’ve been happily married for almost 35 years.\n I’ve never had counseling about any of this. Do I have too many issues to address in counseling?'
  • 'Experiencing extreme mood swings not related to external circumstances.'
True
  • 'Guide to learning a new language'
  • 'Learning about the historical significance of the Silk Road.'
  • 'Exploring historical landmarks in Europe'

Evaluation

Metrics

Label Accuracy
all 0.9882

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("richie-ghost/setfit-sent-trans-mpnet-base-MH-Topic-Check")
# Run inference
preds = model("Planning a DIY home renovation project.")
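
For more than one input, a small wrapper along the following lines may be convenient. This is a sketch, not part of the model card's official usage: the helper names load_model and classify are hypothetical, and it assumes that SetFitModel.predict accepts a list of strings and returns one prediction per input. The setfit import is deferred so the helpers can be defined without the library installed.

```python
from typing import Any, Iterable, List, Tuple

MODEL_ID = "richie-ghost/setfit-sent-trans-mpnet-base-MH-Topic-Check"


def load_model(model_id: str = MODEL_ID):
    """Download the model from the Hub (requires `pip install setfit`)."""
    from setfit import SetFitModel  # imported lazily on purpose

    return SetFitModel.from_pretrained(model_id)


def classify(texts: Iterable[str], model: Any = None) -> List[Tuple[str, Any]]:
    """Return (text, prediction) pairs for a batch of inputs."""
    texts = list(texts)
    if model is None:
        model = load_model()
    preds = model.predict(texts)
    # Tensor or NumPy scalars expose .item(); plain Python values are kept as-is.
    return [(t, p.item() if hasattr(p, "item") else p) for t, p in zip(texts, preds)]
```

A call such as classify(["Planning a DIY home renovation project.", "Guide to learning a new language"]) would then return each sentence alongside its predicted label.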

Training Details

Training Set Metrics

Training set    Min    Median     Max
Word count      4      33.7092    111

Label    Training Sample Count
True     58
False    138
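
As a rough illustration of what this label split means for contrastive training, the sketch below counts the unique sentence pairs available from 58 True and 138 False samples. It assumes pairs are unordered and that a sample is never paired with itself; count_pairs is an illustrative name, not a SetFit function.

```python
from math import comb
from typing import Dict


def count_pairs(label_counts: Dict[str, int]) -> Dict[str, int]:
    """Count unordered same-label and different-label pairs."""
    total = sum(label_counts.values())
    same = sum(comb(n, 2) for n in label_counts.values())
    all_pairs = comb(total, 2)
    return {"same_label": same, "different_label": all_pairs - same, "all": all_pairs}


counts = count_pairs({"True": 58, "False": 138})
print(counts)
```

Under these assumptions there are 11,106 same-label pairs and 8,004 different-label pairs, so the two pair types are unbalanced before any resampling. The oversampling strategy listed under Training Hyperparameters is described in the SetFit documentation as drawing an even number of positive and negative pairs, which addresses this kind of imbalance.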

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
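
Several of the values above are tuples. In SetFit these are generally read as (embedding fine-tuning phase, classifier training phase). The sketch below makes that reading explicit for a subset of the listed hyperparameters; phase_view and PHASES are hypothetical helpers written for this illustration, not SetFit APIs.

```python
from typing import Any, Dict

# Subset of the hyperparameters listed above, copied verbatim.
HPARAMS: Dict[str, Any] = {
    "batch_size": (16, 16),
    "num_epochs": (3, 3),
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "warmup_proportion": 0.1,
    "seed": 42,
}

PHASES = ("embedding", "classifier")


def phase_view(hparams: Dict[str, Any], phase: str) -> Dict[str, Any]:
    """Pick the per-phase value from tuple settings; scalars apply to both phases."""
    idx = PHASES.index(phase)  # raises ValueError for an unknown phase name
    return {k: (v[idx] if isinstance(v, tuple) else v) for k, v in hparams.items()}


for phase in PHASES:
    print(phase, phase_view(HPARAMS, phase))
```

Read this way, the Sentence Transformer body was fine-tuned with a learning rate of 2e-05 during the embedding phase, while 1e-05 is the value reserved for the classifier phase.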

Training Results

Epoch Step Training Loss Validation Loss
0.0007 1 0.1327 -
0.0354 50 0.094 -
0.0708 100 0.0263 -
0.1062 150 0.0034 -
0.1415 200 0.0008 -
0.1769 250 0.0004 -
0.2123 300 0.0002 -
0.2477 350 0.0003 -
0.2831 400 0.0001 -
0.3185 450 0.0003 -
0.3539 500 0.0 -
0.3892 550 0.0002 -
0.4246 600 0.0002 -
0.4600 650 0.0001 -
0.4954 700 0.0001 -
0.5308 750 0.0001 -
0.5662 800 0.0001 -
0.6016 850 0.0001 -
0.6369 900 0.0001 -
0.6723 950 0.0001 -
0.7077 1000 0.0001 -
0.7431 1050 0.0001 -
0.7785 1100 0.0001 -
0.8139 1150 0.0001 -
0.8493 1200 0.0001 -
0.8846 1250 0.0001 -
0.9200 1300 0.0001 -
0.9554 1350 0.0001 -
0.9908 1400 0.0001 -
1.0 1413 - 0.0092
1.0262 1450 0.0001 -
1.0616 1500 0.0001 -
1.0970 1550 0.0 -
1.1323 1600 0.0001 -
1.1677 1650 0.0001 -
1.2031 1700 0.0001 -
1.2385 1750 0.0 -
1.2739 1800 0.0001 -
1.3093 1850 0.0 -
1.3447 1900 0.0 -
1.3800 1950 0.0001 -
1.4154 2000 0.0 -
1.4508 2050 0.0 -
1.4862 2100 0.0 -
1.5216 2150 0.0001 -
1.5570 2200 0.0 -
1.5924 2250 0.0001 -
1.6277 2300 0.0 -
1.6631 2350 0.0 -
1.6985 2400 0.0001 -
1.7339 2450 0.0 -
1.7693 2500 0.0 -
1.8047 2550 0.0 -
1.8401 2600 0.0 -
1.8754 2650 0.0 -
1.9108 2700 0.0001 -
1.9462 2750 0.0 -
1.9816 2800 0.0 -
2.0 2826 - 0.012
2.0170 2850 0.0 -
2.0524 2900 0.0 -
2.0878 2950 0.0 -
2.1231 3000 0.0 -
2.1585 3050 0.0 -
2.1939 3100 0.0 -
2.2293 3150 0.0 -
2.2647 3200 0.0 -
2.3001 3250 0.0 -
2.3355 3300 0.0 -
2.3708 3350 0.0 -
2.4062 3400 0.0 -
2.4416 3450 0.0 -
2.4770 3500 0.0 -
2.5124 3550 0.0 -
2.5478 3600 0.0 -
2.5832 3650 0.0 -
2.6185 3700 0.0001 -
2.6539 3750 0.0 -
2.6893 3800 0.0 -
2.7247 3850 0.0 -
2.7601 3900 0.0 -
2.7955 3950 0.0 -
2.8309 4000 0.0 -
2.8662 4050 0.0001 -
2.9016 4100 0.0 -
2.9370 4150 0.0 -
2.9724 4200 0.0 -
3.0 4239 - 0.0115
  • The bold row denotes the saved checkpoint.
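
The table can be cross-checked with a little arithmetic. The sketch below reproduces the Epoch column from the Step column, using the 1413 steps per epoch implied by the row "1.0 1413", and then picks the evaluation row with the lowest validation loss. Treating the lowest validation loss as the criterion for the saved checkpoint is an assumption based on load_best_model_at_end: True; the card itself does not state which metric was used.

```python
STEPS_PER_EPOCH = 1413  # from the row "1.0 1413" in the table above

# (step, validation loss) for the three evaluation rows in the table.
EVAL_ROWS = [(1413, 0.0092), (2826, 0.012), (4239, 0.0115)]


def epoch_of(step: int) -> float:
    """Fractional epoch for a given optimizer step, rounded as in the table."""
    return round(step / STEPS_PER_EPOCH, 4)


total_steps = 3 * STEPS_PER_EPOCH  # num_epochs is 3 for the embedding phase
best_step, best_loss = min(EVAL_ROWS, key=lambda row: row[1])

print(epoch_of(50), epoch_of(1450), epoch_of(4200))
print(total_steps, best_step, best_loss)
```

Three epochs of 1413 steps give the 4239 total steps seen in the last row. Under the stated assumption, the checkpoint with the lowest validation loss is the one at step 1413.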

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.7.0
  • Transformers: 4.40.0
  • PyTorch: 2.2.1+cu121
  • Datasets: 2.19.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 109M params (Safetensors, tensor type F32)