SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
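
The contrastive step works on pairs of sentences rather than on single labeled sentences. The sketch below shows one way such pairs could be built from a handful of labeled examples: two sentences with the same label get a target of 1.0, and two sentences with different labels get 0.0. The function name `make_contrastive_pairs` is hypothetical, and enumerating every pair is a simplification; SetFit samples its pairs instead.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Turn (text, label) examples into (text_a, text_b, target) triples.

    Same-label pairs get target 1.0, different-label pairs get 0.0.
    Illustrative only: every unordered pair is enumerated exactly once.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Three sentences taken from the label table of this card.
examples = [
    ("Which are the largest stores by square footage?", "exploratory"),
    ("Which stores are located in commercial zones?", "exploratory"),
    ("Can you provide an analysis to recommend sites?", "site recommendations"),
]

pairs = make_contrastive_pairs(examples)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Because pairs grow much faster than the number of labeled sentences, even 15 examples per class yield many training signals for the embedding model.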

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 5

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
store neighbourhood analysis	'How does the area compare in an analysis for this store?' 'How do the local beauty trends influence product demand at Sephora?' 'What is the socio-economic status of the neighborhood for this store?'
exploratory	'Which are the stores with the largest parking lots?' 'Which stores are located in commercial zones?' 'Which are the largest stores by square footage?'
site recommendations	'How would you analyze potential sites for recommendations?' 'Which neighborhoods offer potential for opening new Michaels craft stores to meet the demand for art supplies and creative outlets?' 'Can you provide an analysis to recommend sites?'
baseline compare	"What is the variance in revenue generation between this Best Buy outlet and the brand's expected benchmarks?" "What is the disparity in sales performance between this Home Depot store and the brand's standard expectations?" 'What are the variations in customer loyalty scores between this Target store and the brand average?'
store competition	'How does the service quality at Safeway compare to other grocery stores in the neighborhood?' "How does Target's product selection compare to that of neighboring retail stores?" 'What competitive strategies has Staples employed to stand out among other office supply retailers?'

Evaluation

Metrics

Label	Accuracy
all	1.0
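
For reference, accuracy is the fraction of predictions that match the reference labels. The card does not say which evaluation set produced the figure above, so the predictions and references in this minimal sketch are hypothetical and only illustrate the calculation.

```python
def accuracy(predictions, references):
    """Return the fraction of predictions equal to their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical labels, not the evaluation data behind this card.
references = ["exploratory", "store competition", "baseline compare", "exploratory"]
perfect = accuracy(references, references)
one_wrong = accuracy(
    ["exploratory", "store competition", "baseline compare", "site recommendations"],
    references,
)
print(perfect, one_wrong)
```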

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("a-n-a-n-y-a-123/setfit-model-intent")
# Run inference
preds = model("Analyze the sites and provide recommendations.")
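
When several queries are classified at once, it can help to group them by predicted intent before routing them further. The sketch below assumes a `predict` callable that takes a list of strings and returns one string label per input. `group_by_intent` and `fake_predict` are hypothetical names, and `fake_predict` is only a keyword-based stand-in so that the example runs without downloading the model.

```python
from collections import defaultdict


def group_by_intent(queries, predict):
    """Group queries by the label that `predict` assigns to each of them.

    `predict` is any callable mapping a list of strings to one label per string.
    """
    labels = predict(queries)
    grouped = defaultdict(list)
    for query, label in zip(queries, labels):
        grouped[str(label)].append(query)
    return dict(grouped)


def fake_predict(queries):
    """Offline stand-in for a real classifier (assumption, not the model)."""
    return [
        "site recommendations" if "site" in q.lower() else "exploratory"
        for q in queries
    ]


queries = [
    "Analyze the sites and provide recommendations.",
    "Which are the largest stores by square footage?",
]
grouped = group_by_intent(queries, fake_predict)
print(grouped)
```

With the model loaded as shown above, `model.predict` could be passed in place of `fake_predict`, assuming it returns string labels as the label table of this card suggests.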

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	6	11.7467	20

Label	Training Sample Count
baseline compare	15
exploratory	15
site recommendations	15
store competition	15
store neighbourhood analysis	15
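
Summary statistics of this kind can be recomputed from a list of labeled training sentences. The sketch below assumes that words are counted by splitting on whitespace, which the card does not confirm, and `training_set_stats` is a hypothetical helper. It reports both mean and median, since the two can differ.

```python
from collections import Counter
from statistics import mean, median


def training_set_stats(examples):
    """Compute word-count statistics and per-label counts for (text, label) pairs."""
    # Assumption: a "word" is any whitespace-separated token.
    word_counts = [len(text.split()) for text, _ in examples]
    return {
        "min": min(word_counts),
        "mean": mean(word_counts),
        "median": median(word_counts),
        "max": max(word_counts),
        "per_label": dict(Counter(label for _, label in examples)),
    }


# Three sentences from the label table, not the full training set.
examples = [
    ("Which are the largest stores by square footage?", "exploratory"),
    ("Which stores are located in commercial zones?", "exploratory"),
    ("Can you provide an analysis to recommend sites?", "site recommendations"),
]

stats = training_set_stats(examples)
print(stats)
```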

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 20
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
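
The hyperparameters and the training set size together determine how many optimizer steps make up one epoch of contrastive fine-tuning. The arithmetic below assumes that each of the 20 iterations produces one positive and one negative pair per training sentence. That assumption is not stated in this card, but it reproduces the Epoch values logged at steps 1, 50, 100 and 150 in the training results.

```python
import math

num_classes = 5
samples_per_class = 15
num_iterations = 20
batch_size = 16  # embedding-phase batch size from the hyperparameters

num_samples = num_classes * samples_per_class  # 75 training sentences

# Assumption: one positive and one negative pair per sentence per iteration.
num_pairs = 2 * num_iterations * num_samples  # 3000 sentence pairs

# The last, partially filled batch still counts as a step.
steps_per_epoch = math.ceil(num_pairs / batch_size)  # 188 steps

print(num_samples, num_pairs, steps_per_epoch)
for step in (1, 50, 100, 150):
    # Fraction of an epoch completed after `step` optimizer steps.
    print(step, round(step / steps_per_epoch, 4))
```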

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0053	1	0.2398	-
0.2660	50	0.0757	-
0.5319	100	0.0021	-
0.7979	150	0.0013	-
1.0638	200	0.0005	-
1.3298	250	0.0006	-
1.5957	300	0.0006	-
1.8617	350	0.0004	-
2.1277	400	0.0006	-
2.3936	450	0.0004	-
2.6596	500	0.0003	-
2.9255	550	0.0003	-
0.0053	1	0.0002	-
0.2660	50	0.0003	-
0.5319	100	0.0004	-
0.7979	150	0.0001	-
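
To give the Training Loss column some intuition: assuming the logged value is the CosineSimilarityLoss listed in the hyperparameters, it is the mean squared error between the cosine similarity of two sentence embeddings and the pair's target (1.0 for same-label pairs, 0.0 otherwise). The sketch below computes that quantity in plain Python on toy two-dimensional vectors; the vectors are made up for illustration and are not real embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and target over (u, v, target)."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same direction, same label: squared error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal, different labels: squared error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # same direction, different labels: squared error 1
]

loss = cosine_similarity_loss(toy_pairs)
well_separated = cosine_similarity_loss(toy_pairs[:2])
print(loss, well_separated)
```

A loss near zero, as in the later rows of the table, means same-label sentences are embedded in nearly the same direction and different-label sentences in nearly orthogonal ones.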

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 3.0.0
Transformers: 4.39.0
PyTorch: 2.3.0+cu121
Datasets: 2.19.2
Tokenizers: 0.15.2
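
To compare a local environment against the versions listed above, something like the following could be used. It is a sketch: the distribution names (for example `torch` for PyTorch and `sentence-transformers` for Sentence Transformers) are assumed to be the usual PyPI names, `version_mismatches` is a hypothetical helper, and a reported difference does not by itself mean the model will fail to load.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this card; the keys are assumed PyPI distribution names.
EXPECTED = {
    "setfit": "1.0.3",
    "sentence-transformers": "3.0.0",
    "transformers": "4.39.0",
    "torch": "2.3.0+cu121",
    "datasets": "2.19.2",
    "tokenizers": "0.15.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each differing package to a (listed, found) tuple."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in version_mismatches(EXPECTED).items():
    print(f"{name}: card lists {wanted}, found {found or 'not installed'}")
```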

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
