---
base_model: sentence-transformers/paraphrase-mpnet-base-v2
library_name: setfit
metrics:
- f1
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: >-
    Inflation is out of control! Just got my electricity bill and it's up 25%
    from last year. No wonder the Fed is raising rates, but will it be enough
    to stop the bleeding? #inflation #economy
- text: >-
    The Federal Reserve's decision to raise interest rates by 0.75% has sent
    shockwaves through the financial markets, with the Dow Jones plummeting by
    over 300 points. Analysts warn that this could be the start of a prolonged
    bear market, as higher borrowing costs weigh on consumer spending and
    business investment. The move is seen as a bid to combat inflation, but
    critics argue that it will only exacerbate the economic slowdown.
- text: >-
    Alphabet Inc. (GOOGL) shares are trading higher after the tech giant
    reported a 32% surge in quarterly profits, exceeding analyst estimates.
    The company's revenue also rose 13% year-over-year, driven by growth in
    its cloud computing business. Google's parent company is now guiding for
    even stronger growth in the coming quarters, sending its stock price up 5%
    in pre-market trading.
- text: >-
    I'm extremely disappointed in the latest quarterly earnings report from
    Apple. The company's guidance for the next quarter is way off and it's
    clear they're not taking the necessary steps to address their declining
    iPhone sales. This is a major red flag for investors and I'm selling all
    my shares. The bearish trend is clear and I'm not convinced they'll be
    able to turn things around anytime soon.
- text: >-
    Just going over the latest quarterly earnings reports and the numbers are
    looking decent. Not a lot of surprises, but overall a stable market. No
    major red flags or green lights, just a steady as she goes kind of day.
    #marketanalysis #finance
inference: true
model-index:
- name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: f1
      value: 0.6268844221105527
      name: F1
---
# SetFit with sentence-transformers/paraphrase-mpnet-base-v2
This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
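The contrastive step above turns a handful of labeled texts into many training pairs: every same-label pair becomes a positive example and every cross-label pair a negative one. A minimal sketch of that pair generation (toy texts for illustration, not from this model's training data):

```python
from itertools import combinations

# Toy few-shot training set mirroring the card's three classes.
examples = [
    ("shares surge on strong earnings", "Bullish"),
    ("record profits beat estimates", "Bullish"),
    ("stock plummets after weak guidance", "Bearish"),
    ("selling all my shares", "Bearish"),
    ("steady market, no surprises", "Neutral"),
]

def contrastive_pairs(samples):
    """Yield (text_a, text_b, similarity) triples: 1.0 when both texts
    share a label (positive pair), 0.0 otherwise (negative pair)."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(samples, 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs

pairs = contrastive_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
```

With 5 examples this yields 10 pairs (2 positive, 8 negative), which is how SetFit stretches a small labeled set into enough supervision to fine-tune the embedding model.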
## Model Details

### Model Description
- Model Type: SetFit
- Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
- Classification head: a LogisticRegression instance
- Maximum Sequence Length: 512 tokens
- Number of Classes: 3 classes
### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
### Model Labels

| Label | Examples |
|:--------|:---------|
| Neutral | |
| Bullish | |
| Bearish | |
## Evaluation

### Metrics

| Label | F1 |
|:------|:-------|
| all | 0.6269 |
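The card does not state which averaging scheme underlies this F1 score. Assuming macro averaging over the three labels (a common choice for imbalanced multi-class evaluation), the metric can be computed as follows, with toy labels standing in for real predictions:

```python
def macro_f1(y_true, y_pred):
    """Per-label F1, averaged unweighted across all labels (macro F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p == lab)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != lab and p == lab)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p != lab)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)

y_true = ["Bullish", "Bearish", "Neutral", "Bullish"]
y_pred = ["Bullish", "Bearish", "Bullish", "Bullish"]
score = macro_f1(y_true, y_pred)  # ≈ 0.6: F1 of 1.0, 0.8, 0.0 per label
```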
## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference
preds = model("Inflation is out of control! Just got my electricity bill and it's up 25% from last year. No wonder the Fed is raising rates, but will it be enough to stop the bleeding? #inflation #economy")
```
## Training Details

### Training Set Metrics

| Training set | Min | Median | Max |
|:-------------|:----|:--------|:----|
| Word count | 17 | 62.6531 | 119 |

| Label | Training Sample Count |
|:--------|:----------------------|
| Bearish | 16 |
| Bullish | 18 |
| Neutral | 15 |
### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (5, 5)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True
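These hyperparameters map directly onto setfit's `TrainingArguments`, where tuples give separate values for the embedding phase and the classifier phase. A minimal training sketch under that assumption (the dataset here is schematic, and running it requires downloading the base model):

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Schematic few-shot dataset; replace with your own labeled texts.
train_dataset = Dataset.from_dict({
    "text": ["stock plummets", "shares surge", "steady market"],
    "label": ["Bearish", "Bullish", "Neutral"],
})

model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2"
)

# Mirrors the hyperparameters listed above; tuples are
# (embedding fine-tuning, classifier head) values.
args = TrainingArguments(
    batch_size=(16, 16),
    num_epochs=(5, 5),
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    warmup_proportion=0.1,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```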
### Training Results

| Epoch | Step | Training Loss | Validation Loss |
|:-------:|:-------:|:-------------:|:---------------:|
| 0.01 | 1 | 0.235 | - |
| 0.5 | 50 | 0.0307 | - |
| 1.0 | 100 | 0.0008 | 0.0357 |
| 1.5 | 150 | 0.0006 | - |
| 2.0 | 200 | 0.0002 | 0.0303 |
| 2.5 | 250 | 0.0001 | - |
| 3.0 | 300 | 0.0001 | 0.0295 |
| 3.5 | 350 | 0.0001 | - |
| **4.0** | **400** | **0.0001** | **0.0281** |
| 4.5 | 450 | 0.0001 | - |
| 5.0 | 500 | 0.0001 | 0.0287 |

- The bold row denotes the saved checkpoint.
### Framework Versions
- Python: 3.9.19
- SetFit: 1.1.0.dev0
- Sentence Transformers: 3.0.1
- Transformers: 4.39.0
- PyTorch: 2.4.0
- Datasets: 2.20.0
- Tokenizers: 0.15.2
## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```