
SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/all-MiniLM-L6-v2 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves two stages (a rough code sketch follows the list):

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
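
Roughly, the two stages correspond to the sketch below. This is not the exact SetFit training code, only an illustration of the idea; the four example texts and labels are made up.

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses
from sklearn.linear_model import LogisticRegression

# Hypothetical few-shot data (the real model was trained with 24 examples per label).
texts = [
    "Please invoice in KES",
    "Please go ahead and issue the ticket",
    "Kindly share the booking confirmation",
    "Could you resend the itinerary?",
]
labels = [0, 0, 1, 1]

body = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Stage 1: contrastive fine-tuning of the embedding body.
# Pairs from the same class are labelled 1.0, pairs from different classes 0.0.
pairs = [
    InputExample(texts=[texts[i], texts[j]], label=float(labels[i] == labels[j]))
    for i in range(len(texts))
    for j in range(i + 1, len(texts))
]
train_dataloader = DataLoader(pairs, shuffle=True, batch_size=16)
body.fit(
    train_objectives=[(train_dataloader, losses.CosineSimilarityLoss(body))],
    epochs=1,
)

# Stage 2: fit a classification head on embeddings from the fine-tuned body.
head = LogisticRegression()
head.fit(body.encode(texts), labels)
preds = head.predict(body.encode(["Please invoice in KES"]))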

Model Details

Model Description

Model Sources

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mann2107/BCMPIIRAB_MiniLM_ALLNew")
# Run inference
preds = model("Thank you for your email. Please go ahead and issue. Please invoice in KES")

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     1     25.6577   136

Label   Training Sample Count
0       24
1       24
2       24
3       24
4       24
5       24
6       24
7       24
8       24
9       24
10      24
11      24
12      24
13      24

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 99
  • body_learning_rate: (0.0002733656643765287, 0.0002733656643765287)
  • head_learning_rate: 2.7029049129688732e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • max_length: 512
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
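
These bullets correspond to fields on SetFit's TrainingArguments. As a rough sketch of how a comparable run could be configured (core fields only; the tiny dataset below is made up, and the remaining bullets map to TrainingArguments fields of the same name):

from datasets import Dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical few-shot training data (the real model used 24 examples per label).
train_dataset = Dataset.from_dict({
    "text": [
        "Please invoice in KES",
        "Please go ahead and issue the ticket",
        "Kindly share the booking confirmation",
        "Could you resend the itinerary?",
    ],
    "label": [0, 0, 1, 1],
})

model = SetFitModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

args = TrainingArguments(
    batch_size=(16, 16),                 # (embedding phase, classifier phase)
    num_epochs=(1, 1),
    sampling_strategy="oversampling",
    body_learning_rate=(0.0002733656643765287, 0.0002733656643765287),
    head_learning_rate=2.7029049129688732e-05,
    loss=CosineSimilarityLoss,
    warmup_proportion=0.1,
    max_length=512,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()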

Training Results

Epoch Step Training Loss Validation Loss
0.0002 1 0.2546 -
0.0120 50 0.1667 -
0.0241 100 0.1165 -
0.0361 150 0.0799 -
0.0481 200 0.0212 -
0.0601 250 0.0188 -
0.0722 300 0.0531 -
0.0842 350 0.0273 -
0.0962 400 0.0111 -
0.1082 450 0.0203 -
0.1203 500 0.0397 -
0.1323 550 0.0164 -
0.1443 600 0.0045 -
0.1563 650 0.0032 -
0.1684 700 0.001 -
0.1804 750 0.0011 -
0.1924 800 0.0004 -
0.2044 850 0.0009 -
0.2165 900 0.0006 -
0.2285 950 0.0008 -
0.2405 1000 0.0004 -
0.2525 1050 0.0008 -
0.2646 1100 0.0005 -
0.2766 1150 0.0006 -
0.2886 1200 0.0007 -
0.3006 1250 0.0043 -
0.3127 1300 0.0004 -
0.3247 1350 0.0005 -
0.3367 1400 0.0005 -
0.3487 1450 0.0004 -
0.3608 1500 0.0004 -
0.3728 1550 0.0005 -
0.3848 1600 0.0007 -
0.3968 1650 0.0006 -
0.4089 1700 0.0002 -
0.4209 1750 0.0006 -
0.4329 1800 0.0008 -
0.4449 1850 0.0003 -
0.4570 1900 0.0005 -
0.4690 1950 0.0003 -
0.4810 2000 0.0003 -
0.4930 2050 0.0003 -
0.5051 2100 0.0006 -
0.5171 2150 0.0003 -
0.5291 2200 0.0002 -
0.5411 2250 0.0002 -
0.5532 2300 0.0002 -
0.5652 2350 0.0004 -
0.5772 2400 0.0003 -
0.5892 2450 0.0003 -
0.6013 2500 0.0002 -
0.6133 2550 0.0002 -
0.6253 2600 0.0013 -
0.6373 2650 0.0002 -
0.6494 2700 0.0007 -
0.6614 2750 0.0004 -
0.6734 2800 0.0007 -
0.6854 2850 0.0018 -
0.6975 2900 0.0002 -
0.7095 2950 0.0003 -
0.7215 3000 0.0006 -
0.7335 3050 0.0003 -
0.7456 3100 0.0002 -
0.7576 3150 0.0002 -
0.7696 3200 0.0002 -
0.7816 3250 0.0002 -
0.7937 3300 0.0002 -
0.8057 3350 0.0001 -
0.8177 3400 0.0003 -
0.8297 3450 0.0002 -
0.8418 3500 0.0002 -
0.8538 3550 0.0002 -
0.8658 3600 0.0002 -
0.8778 3650 0.0002 -
0.8899 3700 0.0002 -
0.9019 3750 0.0005 -
0.9139 3800 0.0002 -
0.9259 3850 0.0001 -
0.9380 3900 0.0004 -
0.9500 3950 0.0001 -
0.9620 4000 0.0005 -
0.9740 4050 0.0002 -
0.9861 4100 0.0002 -
0.9981 4150 0.0001 -
1.0 4158 - 0.0302
  • The final row (epoch 1.0, step 4158), the only one with a validation loss, denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0.dev0
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}