---
library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
metrics:
  - accuracy
widget:
  - text: >-
      Guy Cecil, the former head of the Democratic Senatorial Campaign Committee
      and now the boss of a leading Democratic super PAC, voiced his frustration
      with the inadequacy of Franken’s apology on Twitter.
  - text: >-
      Attorney Stephen Le Brocq, who operates a law firm in the North Texas area
      sums up the treatment of Guyger perfectly when he says that “The affidavit
      isn’t written objectively, not at the slightest.
  - text: Phone This field is for validation purposes and should be left unchanged.
  - text: The Twitter suspension caught me by surprise.
  - text: >-
      Popular pages like The AntiMedia (2.1 million fans), The Free Thought
      Project (3.1 million fans), Press for Truth (350K fans), Police the Police
      (1.9 million fans), Cop Block (1.7 million fans), and Punk Rock
      Libertarians (125K fans) are just a few of the ones which were
      unpublished.
pipeline_tag: text-classification
inference: false
base_model: sentence-transformers/paraphrase-mpnet-base-v2
model-index:
  - name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.7083881146463319
            name: Accuracy
---

# SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model. A OneVsRestClassifier instance is used for classification.
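SetFit wires the embedding model to the head internally, but the one-vs-rest idea itself can be illustrated standalone with scikit-learn. The embeddings and labels below are synthetic stand-ins, not this model's data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)

# Synthetic "sentence embeddings": 3 classes, 40 samples each, 16 dims.
centers = rng.normal(size=(3, 16))
X = np.vstack([c + 0.1 * rng.normal(size=(40, 16)) for c in centers])
y = np.repeat([0, 1, 2], 40)

# One binary logistic-regression classifier is fitted per class;
# prediction picks the class whose classifier scores highest.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
print(len(clf.estimators_))  # one fitted estimator per class -> 3
```

In this model the per-class binary classifiers sit on top of the fine-tuned sentence embeddings instead of random vectors, but the mechanics are the same.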

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
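Step 1 trains on sentence pairs: same-label pairs are positives, cross-label pairs are negatives, and `CosineSimilarityLoss` pulls positive pairs together in embedding space. A minimal sketch of that pair generation (the sentences and labels are invented; SetFit's own sampler additionally handles `sampling_strategy` and `num_iterations`):

```python
from itertools import combinations

# Tiny labeled set (invented examples).
data = [
    ("The suspension caught me by surprise.", "doubt"),
    ("I can hardly believe that claim.", "doubt"),
    ("The meeting starts at noon.", "neutral"),
    ("Lunch is served at twelve.", "neutral"),
]

# Positive pairs (same label) get target similarity 1.0;
# negative pairs (different labels) get target similarity 0.0.
pairs = [
    (a, b, 1.0 if la == lb else 0.0)
    for (a, la), (b, lb) in combinations(data, 2)
]

positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(positives), len(negatives))  # -> 2 4
```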

## Model Details

### Model Description

- **Model Type:** SetFit
- **Sentence Transformer body:** sentence-transformers/paraphrase-mpnet-base-v2
- **Classification head:** a OneVsRestClassifier instance

### Model Sources

- **Repository:** https://github.com/huggingface/setfit
- **Paper:** https://arxiv.org/abs/2209.11055

## Evaluation

### Metrics

| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.7084   |
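Accuracy here is simply the fraction of test sentences whose predicted label matches the gold label, e.g. via scikit-learn's `accuracy_score` (the labels below are illustrative, not this model's test set):

```python
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0, 2, 1, 0, 2]  # gold labels (illustrative)
y_pred = [0, 1, 2, 0, 2, 1, 1, 2]  # model predictions (illustrative)

# 6 of the 8 predictions match -> 0.75
print(accuracy_score(y_true, y_pred))
```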

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("anismahmahi/doubt_repetition_with_noPropaganda_with_3_zeros_SetFit")
# Run inference
preds = model("The Twitter suspension caught me by surprise.")
```

## Training Details

### Training Set Metrics

| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 1   | 22.0291 | 129 |
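These statistics are word counts over whitespace-tokenized training sentences; a minimal sketch of how they are computed (the sentences below are placeholders, not the actual training set):

```python
from statistics import median

train_sentences = [
    "The Twitter suspension caught me by surprise.",
    "Phone",
    "Popular pages were unpublished without warning.",
]

# Whitespace tokenization; counts are [7, 1, 6] for these placeholders.
counts = [len(s.split()) for s in train_sentences]
print(min(counts), median(counts), max(counts))  # -> 1 6 7
```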

### Training Hyperparameters

- batch_size: (16, 16)
- num_epochs: (2, 2)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 5
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True
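In SetFit v1.0 these settings correspond to fields of `TrainingArguments`; a sketch of how this run's configuration would be written (assuming the v1.0 API; the mapping is inferred from the parameter names above and is not executed here):

```python
from setfit import TrainingArguments
from sentence_transformers.losses import CosineSimilarityLoss

# Tuple values are (embedding fine-tuning phase, classifier head phase).
args = TrainingArguments(
    batch_size=(16, 16),
    num_epochs=(2, 2),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=5,
    body_learning_rate=(2e-5, 1e-5),
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    # distance_metric and margin only affect triplet-style losses and are
    # left at their defaults (cosine_distance, 0.25), matching the values above.
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=True,
)
```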

### Training Results

| Epoch | Step | Training Loss | Validation Loss |
|:-----:|:----:|:-------------:|:---------------:|
| 0.0003 | 1 | 0.3532 | - |
| 0.0166 | 50 | 0.3413 | - |
| 0.0332 | 100 | 0.2743 | - |
| 0.0498 | 150 | 0.2635 | - |
| 0.0664 | 200 | 0.2444 | - |
| 0.0830 | 250 | 0.1883 | - |
| 0.0996 | 300 | 0.2231 | - |
| 0.1162 | 350 | 0.1763 | - |
| 0.1328 | 400 | 0.1868 | - |
| 0.1494 | 450 | 0.2057 | - |
| 0.1660 | 500 | 0.1734 | - |
| 0.1826 | 550 | 0.2594 | - |
| 0.1992 | 600 | 0.1024 | - |
| 0.2158 | 650 | 0.2351 | - |
| 0.2324 | 700 | 0.1863 | - |
| 0.2490 | 750 | 0.072 | - |
| 0.2656 | 800 | 0.1987 | - |
| 0.2822 | 850 | 0.1511 | - |
| 0.2988 | 900 | 0.0926 | - |
| 0.3154 | 950 | 0.1956 | - |
| 0.3320 | 1000 | 0.1354 | - |
| 0.3486 | 1050 | 0.2038 | - |
| 0.3652 | 1100 | 0.1166 | - |
| 0.3818 | 1150 | 0.3214 | - |
| 0.3984 | 1200 | 0.0703 | - |
| 0.4150 | 1250 | 0.1815 | - |
| 0.4316 | 1300 | 0.124 | - |
| 0.4482 | 1350 | 0.0955 | - |
| 0.4648 | 1400 | 0.1064 | - |
| 0.4814 | 1450 | 0.0429 | - |
| 0.4980 | 1500 | 0.0814 | - |
| 0.5146 | 1550 | 0.1483 | - |
| 0.5312 | 1600 | 0.0856 | - |
| 0.5478 | 1650 | 0.1072 | - |
| 0.5644 | 1700 | 0.0148 | - |
| 0.5810 | 1750 | 0.0571 | - |
| 0.5976 | 1800 | 0.052 | - |
| 0.6142 | 1850 | 0.0532 | - |
| 0.6308 | 1900 | 0.0088 | - |
| 0.6474 | 1950 | 0.1619 | - |
| 0.6640 | 2000 | 0.0618 | - |
| 0.6806 | 2050 | 0.0115 | - |
| 0.6972 | 2100 | 0.1402 | - |
| 0.7138 | 2150 | 0.0637 | - |
| 0.7304 | 2200 | 0.0194 | - |
| 0.7470 | 2250 | 0.0135 | - |
| 0.7636 | 2300 | 0.0109 | - |
| 0.7802 | 2350 | 0.133 | - |
| 0.7968 | 2400 | 0.0565 | - |
| 0.8134 | 2450 | 0.1508 | - |
| 0.8300 | 2500 | 0.0293 | - |
| 0.8466 | 2550 | 0.065 | - |
| 0.8632 | 2600 | 0.0029 | - |
| 0.8798 | 2650 | 0.008 | - |
| 0.8964 | 2700 | 0.0604 | - |
| 0.9130 | 2750 | 0.0074 | - |
| 0.9296 | 2800 | 0.0019 | - |
| 0.9462 | 2850 | 0.0129 | - |
| 0.9628 | 2900 | 0.0838 | - |
| 0.9794 | 2950 | 0.0044 | - |
| 0.9960 | 3000 | 0.0035 | - |
| **1.0** | **3012** | **-** | **0.2514** |
| 1.0126 | 3050 | 0.0086 | - |
| 1.0292 | 3100 | 0.0042 | - |
| 1.0458 | 3150 | 0.0833 | - |
| 1.0624 | 3200 | 0.058 | - |
| 1.0790 | 3250 | 0.013 | - |
| 1.0956 | 3300 | 0.0429 | - |
| 1.1122 | 3350 | 0.0044 | - |
| 1.1288 | 3400 | 0.0699 | - |
| 1.1454 | 3450 | 0.0535 | - |
| 1.1620 | 3500 | 0.0559 | - |
| 1.1786 | 3550 | 0.1459 | - |
| 1.1952 | 3600 | 0.118 | - |
| 1.2118 | 3650 | 0.14 | - |
| 1.2284 | 3700 | 0.0632 | - |
| 1.2450 | 3750 | 0.0026 | - |
| 1.2616 | 3800 | 0.0026 | - |
| 1.2782 | 3850 | 0.0052 | - |
| 1.2948 | 3900 | 0.0058 | - |
| 1.3114 | 3950 | 0.0018 | - |
| 1.3280 | 4000 | 0.0152 | - |
| 1.3446 | 4050 | 0.0186 | - |
| 1.3612 | 4100 | 0.039 | - |
| 1.3778 | 4150 | 0.0022 | - |
| 1.3944 | 4200 | 0.002 | - |
| 1.4110 | 4250 | 0.0032 | - |
| 1.4276 | 4300 | 0.0285 | - |
| 1.4442 | 4350 | 0.0213 | - |
| 1.4608 | 4400 | 0.0009 | - |
| 1.4774 | 4450 | 0.0262 | - |
| 1.4940 | 4500 | 0.0181 | - |
| 1.5106 | 4550 | 0.0629 | - |
| 1.5272 | 4600 | 0.0023 | - |
| 1.5438 | 4650 | 0.003 | - |
| 1.5604 | 4700 | 0.0024 | - |
| 1.5770 | 4750 | 0.049 | - |
| 1.5936 | 4800 | 0.0154 | - |
| 1.6102 | 4850 | 0.0009 | - |
| 1.6268 | 4900 | 0.0015 | - |
| 1.6434 | 4950 | 0.0068 | - |
| 1.6600 | 5000 | 0.057 | - |
| 1.6766 | 5050 | 0.0031 | - |
| 1.6932 | 5100 | 0.0189 | - |
| 1.7098 | 5150 | 0.0317 | - |
| 1.7264 | 5200 | 0.0013 | - |
| 1.7430 | 5250 | 0.0247 | - |
| 1.7596 | 5300 | 0.0062 | - |
| 1.7762 | 5350 | 0.0192 | - |
| 1.7928 | 5400 | 0.0019 | - |
| 1.8094 | 5450 | 0.1007 | - |
| 1.8260 | 5500 | 0.0384 | - |
| 1.8426 | 5550 | 0.0494 | - |
| 1.8592 | 5600 | 0.0615 | - |
| 1.8758 | 5650 | 0.0709 | - |
| 1.8924 | 5700 | 0.0308 | - |
| 1.9090 | 5750 | 0.0107 | - |
| 1.9256 | 5800 | 0.064 | - |
| 1.9422 | 5850 | 0.0009 | - |
| 1.9588 | 5900 | 0.0019 | - |
| 1.9754 | 5950 | 0.0037 | - |
| 1.9920 | 6000 | 0.0826 | - |
| 2.0 | 6024 | - | 0.2614 |
- The bold row denotes the saved checkpoint: with `load_best_model_at_end=True`, the epoch 1.0 checkpoint (validation loss 0.2514, lower than 0.2614 at epoch 2.0) was kept.

## Framework Versions

- Python: 3.10.12
- SetFit: 1.0.1
- Sentence Transformers: 2.2.2
- Transformers: 4.35.2
- PyTorch: 2.1.0+cu121
- Datasets: 2.16.1
- Tokenizers: 0.15.0

## Citation

### BibTeX

```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```