
SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L6-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
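Step 1 can be illustrated with a minimal sketch of how labeled examples turn into contrastive training pairs: two texts with the same label get a target similarity of 1.0, and two texts with different labels get 0.0. The helper name `make_contrastive_pairs` and the toy data are hypothetical, and the sketch enumerates every pair rather than reproducing the SetFit library's own sampling.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical toy data, not taken from this model's training set.
texts = ["great film", "loved it", "terrible plot", "waste of time"]
labels = [1, 1, 0, 0]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} / {text_b!r}")
```

A loss such as CosineSimilarityLoss then pushes the cosine similarity of each pair's embeddings toward its target. Step 2 fits the classification head on the embeddings that the fine-tuned model produces.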

Model Details

Model Description

Model Sources

Evaluation

Metrics

Label F1
all 0.5495

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Zlovoblachko/dimension3_setfit")
# Run inference
preds = model("I loved the spiderman movie!")
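The model can also be called with a list of texts to classify a batch in one call. The sketch below pairs each input with its prediction. The helper `classify` is hypothetical, and `dummy_model` is a stand-in with made-up labels so the example runs without downloading anything.

```python
def classify(texts, model):
    """Return (text, predicted_label) pairs for a batch of texts."""
    texts = list(texts)
    preds = model(texts)
    return list(zip(texts, preds))


# Usage with the real model (requires `pip install setfit` and network access):
#   from setfit import SetFitModel
#   model = SetFitModel.from_pretrained("Zlovoblachko/dimension3_setfit")
#   results = classify(["I loved the spiderman movie!"], model)


# Stand-in model so the sketch runs offline; its labels are invented.
def dummy_model(batch):
    return ["long" if len(text) > 10 else "short" for text in batch]


results = classify(["ok", "I loved the spiderman movie!"], dummy_model)
for text, label in results:
    print(label, "<-", text)
```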

Training Details

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2.260895905036282e-05, 2.260895905036282e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0004 1 0.3835 -
0.0177 50 0.3106 -
0.0353 100 0.3232 -
0.0530 150 0.319 -
0.0706 200 0.3146 -
0.0883 250 0.3194 -
0.1059 300 0.3166 -
0.1236 350 0.2941 -
0.1412 400 0.3289 -
0.1589 450 0.3108 -
0.1766 500 0.3099 -
0.1942 550 0.3072 -
0.2119 600 0.2994 -
0.2295 650 0.3062 -
0.2472 700 0.3046 -
0.2648 750 0.3086 -
0.2825 800 0.3039 -
0.3001 850 0.3096 -
0.3178 900 0.3134 -
0.3355 950 0.2965 -
0.3531 1000 0.3147 -
0.3708 1050 0.317 -
0.3884 1100 0.3123 -
0.4061 1150 0.3221 -
0.4237 1200 0.2971 -
0.4414 1250 0.2928 -
0.4590 1300 0.2977 -
0.4767 1350 0.3268 -
0.4944 1400 0.2785 -
0.5120 1450 0.3156 -
0.5297 1500 0.3148 -
0.5473 1550 0.2909 -
0.5650 1600 0.3225 -
0.5826 1650 0.3072 -
0.6003 1700 0.3099 -
0.6179 1750 0.311 -
0.6356 1800 0.3213 -
0.6532 1850 0.2937 -
0.6709 1900 0.3177 -
0.6886 1950 0.3088 -
0.7062 2000 0.3017 -
0.7239 2050 0.3076 -
0.7415 2100 0.3164 -
0.7592 2150 0.295 -
0.7768 2200 0.2957 -
0.7945 2250 0.3064 -
0.8121 2300 0.3146 -
0.8298 2350 0.3114 -
0.8475 2400 0.3151 -
0.8651 2450 0.3033 -
0.8828 2500 0.3039 -
0.9004 2550 0.3152 -
0.9181 2600 0.3185 -
0.9357 2650 0.2927 -
0.9534 2700 0.3174 -
0.9710 2750 0.3003 -
0.9887 2800 0.3157 -
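
The log and the hyperparameters together give a rough picture of the size of the contrastive training run. The short calculation below is an estimate under two assumptions: the logged epoch fraction is rounded to four decimals, and the warmup length is the step count multiplied by warmup_proportion (the library's exact rounding may differ).

```python
# Values taken from the last logged row and from the hyperparameter list.
last_step = 2800
last_epoch_fraction = 0.9887  # rounded to 4 decimals in the log
batch_size = 16
warmup_proportion = 0.1

# With num_epochs = 1, steps per epoch is also the total number of steps.
steps_per_epoch = round(last_step / last_epoch_fraction)
pairs_per_epoch = steps_per_epoch * batch_size
approx_warmup_steps = int(steps_per_epoch * warmup_proportion)

print(steps_per_epoch, pairs_per_epoch, approx_warmup_steps)
```

Under these assumptions the embedding model saw roughly 2,832 batches, or about 45,000 sentence pairs, with learning-rate warmup over approximately the first 283 steps.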

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0
  • Sentence Transformers: 3.2.1
  • Transformers: 4.44.2
  • PyTorch: 2.5.0+cu121
  • Datasets: 3.0.2
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 22.7M params (Safetensors, tensor type F32)