
SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model that can be used for Text Classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
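
As a rough illustration of these two steps, training could be run with the SetFit trainer along the following lines (a minimal sketch assuming the SetFit 1.0 API; the dataset below is hypothetical and reuses examples from the Label Examples table further down, and the hyperparameters are simplified relative to those listed under Training Hyperparameters):

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical few-shot training data with "text" and "label" columns
train_dataset = Dataset.from_dict({
    "text": [
        "Do you enjoy math class? Yeah, it's cool, especially when we do geometry.",
        "What's your favorite subject? Science, because I love experiments.",
        "What did you learn in school today? Nothing much, just the usual stuff.",
        "Do you know the capital of France? Don't know, don't care.",
    ],
    "label": ["positive", "positive", "negative", "negative"],
})

# Start from the base embedding model; the default head is a scikit-learn LogisticRegression
model = SetFitModel.from_pretrained("BAAI/bge-small-en-v1.5", labels=["negative", "positive"])

args = TrainingArguments(batch_size=32, num_epochs=10)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# Step 1: contrastive fine-tuning of the Sentence Transformer body
# Step 2: fitting the classification head on the fine-tuned embeddings
trainer.train()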

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: BAAI/bge-small-en-v1.5
  • Classification head: a LogisticRegression instance
  • Number of Classes: 2 (negative, positive)

Model Sources

  • Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Label      Examples
negative   • "What did you learn in school today? Nothing much, just the usual stuff."
           • "Do you know the capital of France? Don't know, don't care."
           • "Can you tell me what 2 + 2 equals? Guess it's 4, but why does it matter?"
positive   • "What's your favorite subject? Science, because I love experiments."
           • "Can you tell me the planets in order? Sure, Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune. Pluto used to be one, but not anymore."
           • "Do you enjoy math class? Yeah, it's cool, especially when we do geometry."

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("bew/setfit-engagement-model-basic")
# Run inference
preds = model("Do you know how to code? Nope. Sounds complicated.")
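
The model also accepts batches, and per-class probabilities are available through the classification head; a small sketch assuming SetFit's predict and predict_proba methods (the inputs are hypothetical):

# Batch inference on several question/answer turns
texts = [
    "What's your favorite subject? Science, because I love experiments.",
    "Do you know the capital of France? Don't know, don't care.",
]
preds = model.predict(texts)        # e.g. ["positive", "negative"]
probs = model.predict_proba(texts)  # probabilities from the LogisticRegression head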

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     6     15.0470   26

Label      Training Sample Count
negative   79
positive   70

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
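
These names mirror the fields of setfit.TrainingArguments, so the configuration could be reconstructed with something like the sketch below (assuming the SetFit 1.0 API; the loss and distance metric are the sentence-transformers classes of the same names):

from setfit import TrainingArguments
from sentence_transformers.losses import BatchHardTripletLossDistanceFunction, CosineSimilarityLoss

args = TrainingArguments(
    batch_size=(32, 32),            # (embedding phase, classifier phase)
    num_epochs=(10, 10),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)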

Training Results

Epoch Step Training Loss Validation Loss
0.0028 1 0.2418 -
0.1416 50 0.2311 -
0.2833 100 0.2425 -
0.4249 150 0.0572 -
0.5666 200 0.0049 -
0.7082 250 0.0031 -
0.8499 300 0.0019 -
0.9915 350 0.0018 -
1.1331 400 0.0015 -
1.2748 450 0.001 -
1.4164 500 0.0011 -
1.5581 550 0.0008 -
1.6997 600 0.0008 -
1.8414 650 0.0007 -
1.9830 700 0.0008 -
2.1246 750 0.0007 -
2.2663 800 0.0005 -
2.4079 850 0.0006 -
2.5496 900 0.0005 -
2.6912 950 0.0005 -
2.8329 1000 0.0005 -
2.9745 1050 0.0005 -
3.1161 1100 0.0005 -
3.2578 1150 0.0005 -
3.3994 1200 0.0004 -
3.5411 1250 0.0004 -
3.6827 1300 0.0004 -
3.8244 1350 0.0004 -
3.9660 1400 0.0004 -
4.1076 1450 0.0004 -
4.2493 1500 0.0003 -
4.3909 1550 0.0004 -
4.5326 1600 0.0004 -
4.6742 1650 0.0003 -
4.8159 1700 0.0003 -
4.9575 1750 0.0004 -
5.0992 1800 0.0003 -
5.2408 1850 0.0003 -
5.3824 1900 0.0003 -
5.5241 1950 0.0003 -
5.6657 2000 0.0003 -
5.8074 2050 0.0003 -
5.9490 2100 0.0003 -
6.0907 2150 0.0003 -
6.2323 2200 0.0003 -
6.3739 2250 0.0003 -
6.5156 2300 0.0003 -
6.6572 2350 0.0003 -
6.7989 2400 0.0002 -
6.9405 2450 0.0003 -
7.0822 2500 0.0003 -
7.2238 2550 0.0003 -
7.3654 2600 0.0003 -
7.5071 2650 0.0003 -
7.6487 2700 0.0003 -
7.7904 2750 0.0003 -
7.9320 2800 0.0003 -
8.0737 2850 0.0003 -
8.2153 2900 0.0003 -
8.3569 2950 0.0003 -
8.4986 3000 0.0002 -
8.6402 3050 0.0003 -
8.7819 3100 0.0003 -
8.9235 3150 0.0003 -
9.0652 3200 0.0003 -
9.2068 3250 0.0002 -
9.3484 3300 0.0003 -
9.4901 3350 0.0002 -
9.6317 3400 0.0003 -
9.7734 3450 0.0003 -
9.9150 3500 0.0002 -

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.3.1
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.17.0
  • Tokenizers: 0.15.2
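
To reproduce this environment, the listed versions can be pinned at install time (a sketch; newer releases will likely work as well):

pip install setfit==1.0.3 sentence-transformers==2.3.1 transformers==4.35.2 datasets==2.17.0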

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}