SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
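
The contrastive step works on pairs of sentences rather than single sentences. The following is a minimal sketch of how such pairs could be built from labelled examples, assuming a target of 1.0 for same-label pairs and 0.0 for different-label pairs. The helper name make_contrastive_pairs is hypothetical and is not part of the SetFit API; the library's own pair sampling is more involved.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for every unordered pair.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration only.
    """
    pairs = []
    examples = list(zip(texts, labels))
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Three queries taken from the label table in this card.
texts = [
    "Do you have adidas Superstar shoes?",
    "Do you have any bestseller teas available?",
    "Can I return an item if it was damaged during delivery preparation?",
]
labels = ["product discoverability", "product discoverability", "product policy"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

A similarity loss such as CosineSimilarityLoss then pulls the embeddings of same-label pairs together and pushes different-label pairs apart, after which the classification head is fitted on the resulting embeddings.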

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 5

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
product faq	'Does the Meenakari jal jangla -Rani saree have meenakari?' 'Is the Nike Dunk Low Premium Bacon available in size 7?' 'What is the best way to recycle the packaging boxes for wholesale orders for wholesale orders?'
order tracking	'I ordered the Cake Boards 7 days ago with order no 43210 how long will it take to deliver?' 'I want to deliver bags to Pune, how many days will it take to deliver?' 'I want to deliver packaging to Surat, how many days will it take to deliver?'
product policy	'What is the procedure for returning a product that was part of a special promotion occasion?' 'Can I return an item if it was damaged during delivery preparation?' 'What is the procedure for returning a product that was part of a special occasion promotion?'
general faq	'What are the key factors to consider when developing a personalized diet plan for weight loss?' 'What are some tips for maximizing the antioxidant content when brewing green tea?' 'Can you explain why Mashru silk is considered more comfortable to wear compared to pure silk sarees?'
product discoverability	'Can you show me sarees in bright colors suitable for weddings?' 'Do you have adidas Superstar shoes?' 'Do you have any bestseller teas available?'

Evaluation

Metrics

Label	Accuracy
all	0.8533
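
Accuracy here is the fraction of evaluation examples whose predicted label matches the reference label. A minimal sketch of that computation follows; the example labels are made up for illustration, since the evaluation set itself is not included in this card.

```python
def accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the expected one."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical predictions and references, not taken from the real evaluation set.
predicted = ["product faq", "order tracking", "general faq", "product policy"]
expected = ["product faq", "order tracking", "product faq", "product policy"]

score = accuracy(predicted, expected)
print(round(score, 4))
```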

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Shankhdhar/classifier_woog_firstbud")
# Run inference
preds = model("Variety of cookie boxes")
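
To classify several queries at once, the model can be called with a list of strings. The sketch below wraps that call in a hypothetical helper, classify_batch, which assumes the model returns one label per input in the same order. The setfit import is kept inside load_model so the helper itself has no dependency on the library.

```python
def classify_batch(model, texts):
    """Map each input text to its predicted label.

    `model` is any callable that accepts a list of strings and returns
    one label per string, which is how a SetFitModel is used above.
    """
    texts = list(texts)
    preds = model(texts)
    return {text: str(label) for text, label in zip(texts, preds)}


def load_model():
    # Imported lazily so classify_batch can be used without setfit installed.
    from setfit import SetFitModel

    return SetFitModel.from_pretrained("Shankhdhar/classifier_woog_firstbud")


# Usage (downloads the model from the Hub on first call):
# model = load_model()
# routed = classify_batch(model, [
#     "Do you have adidas Superstar shoes?",
#     "Can I return an item if it was damaged during delivery preparation?",
# ])
# print(routed)
```

Casting each prediction with str assumes the model was saved with string label names, as the label table in this card suggests.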

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	4	12.1961	28

Label	Training Sample Count
general faq	24
order tracking	32
product discoverability	50
product faq	50
product policy	48

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
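
As a rough sketch, the values above could be passed to setfit.TrainingArguments as shown below. This is an illustration under assumptions, not the author's training script: the dataset, output directory, and evaluation or save strategies are not given in this card. load_best_model_at_end is left out of the sketch because it may also require evaluation and save strategy settings that the card does not list.

```python
# Hyperparameters copied from the list above (plain values only).
HYPERPARAMETERS = {
    "batch_size": (16, 16),            # (embedding phase, classifier phase)
    "num_epochs": (2, 2),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "eval_max_steps": -1,
}


def build_training_arguments():
    """Build a setfit.TrainingArguments object from the values above.

    Imports are done lazily so the dictionary can be inspected without
    setfit or sentence-transformers installed.
    """
    from sentence_transformers.losses import (
        BatchHardTripletLossDistanceFunction,
        CosineSimilarityLoss,
    )
    from setfit import TrainingArguments

    return TrainingArguments(
        loss=CosineSimilarityLoss,
        distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
        **HYPERPARAMETERS,
    )
```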

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0005	1	0.2265	-
0.0244	50	0.1831	-
0.0489	100	0.1876	-
0.0733	150	0.1221	-
0.0978	200	0.0228	-
0.1222	250	0.0072	-
0.1467	300	0.0282	-
0.1711	350	0.0015	-
0.1956	400	0.0005	-
0.2200	450	0.0008	-
0.2445	500	0.0004	-
0.2689	550	0.0003	-
0.2934	600	0.0003	-
0.3178	650	0.0002	-
0.3423	700	0.0002	-
0.3667	750	0.0002	-
0.3912	800	0.0003	-
0.4156	850	0.0002	-
0.4401	900	0.0002	-
0.4645	950	0.0001	-
0.4890	1000	0.0001	-
0.5134	1050	0.0001	-
0.5379	1100	0.0001	-
0.5623	1150	0.0002	-
0.5868	1200	0.0002	-
0.6112	1250	0.0001	-
0.6357	1300	0.0001	-
0.6601	1350	0.0001	-
0.6846	1400	0.0001	-
0.7090	1450	0.0001	-
0.7335	1500	0.0001	-
0.7579	1550	0.0001	-
0.7824	1600	0.0001	-
0.8068	1650	0.0001	-
0.8313	1700	0.0001	-
0.8557	1750	0.0011	-
0.8802	1800	0.0002	-
0.9046	1850	0.0001	-
0.9291	1900	0.0001	-
0.9535	1950	0.0002	-
0.9780	2000	0.0001	-
1.0024	2050	0.0001	-
1.0269	2100	0.0002	-
1.0513	2150	0.0001	-
1.0758	2200	0.0001	-
1.1002	2250	0.0001	-
1.1247	2300	0.0001	-
1.1491	2350	0.0001	-
1.1736	2400	0.0001	-
1.1980	2450	0.0001	-
1.2225	2500	0.0001	-
1.2469	2550	0.0001	-
1.2714	2600	0.0001	-
1.2958	2650	0.0001	-
1.3203	2700	0.0001	-
1.3447	2750	0.0001	-
1.3692	2800	0.0001	-
1.3936	2850	0.0001	-
1.4181	2900	0.0001	-
1.4425	2950	0.0001	-
1.4670	3000	0.0001	-
1.4914	3050	0.0001	-
1.5159	3100	0.0001	-
1.5403	3150	0.0001	-
1.5648	3200	0.0001	-
1.5892	3250	0.0001	-
1.6137	3300	0.0001	-
1.6381	3350	0.0001	-
1.6626	3400	0.0001	-
1.6870	3450	0.0001	-
1.7115	3500	0.0001	-
1.7359	3550	0.0	-
1.7604	3600	0.0001	-
1.7848	3650	0.0001	-
1.8093	3700	0.0001	-
1.8337	3750	0.0	-
1.8582	3800	0.0001	-
1.8826	3850	0.0001	-
1.9071	3900	0.0001	-
1.9315	3950	0.0	-
1.9560	4000	0.0	-
1.9804	4050	0.0001	-
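
The Epoch and Step columns can be cross-checked against the training sample counts and the batch size of 16. The arithmetic below assumes that the oversampling strategy repeats the smaller group of sentence pairs (here the same-label pairs) until it matches the larger group, so that one epoch contains twice the larger pair count. That assumption is a reading of the strategy, not something stated in this card.

```python
from math import ceil, comb

# Training sample counts per label, from the table in this card.
label_counts = {
    "general faq": 24,
    "order tracking": 32,
    "product discoverability": 50,
    "product faq": 50,
    "product policy": 48,
}
batch_size = 16

total_samples = sum(label_counts.values())
all_pairs = comb(total_samples, 2)
positive_pairs = sum(comb(n, 2) for n in label_counts.values())
negative_pairs = all_pairs - positive_pairs

# Assumption: oversampling balances positives and negatives by repeating
# the smaller group until it matches the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

print(total_samples, positive_pairs, negative_pairs)
print(steps_per_epoch)
print(round(50 / steps_per_epoch, 4), round(4050 / steps_per_epoch, 4))
```

Under that assumption, the computed steps per epoch reproduce the logged epoch fractions at steps 50 and 4050.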

Framework Versions

Python: 3.10.13
SetFit: 1.0.3
Sentence Transformers: 3.0.1
Transformers: 4.39.0
PyTorch: 2.2.2+cu121
Datasets: 2.19.2
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
