
SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer (both pieces are shown in the short sketch below).
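
As a quick illustration of what these two steps produce, the fine-tuned embedding body and the fitted classification head can be inspected and used directly after loading. This is a minimal sketch, assuming a SetFit 1.x install; the example text is made up:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("kenhktsui/setfit_test_twitter_news_syn")

# Step 1 produced the fine-tuned Sentence Transformer body
embeddings = model.model_body.encode(["The market looks range-bound today."])

# Step 2 fitted a LogisticRegression head on embeddings like these
print(model.model_head.predict(embeddings))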

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
  • Classification head: a LogisticRegression instance
  • Number of Classes: 3 (Bearish, Bullish, Neutral)
  • Model size: roughly 109M parameters (float32 weights)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: https://arxiv.org/abs/2209.11055

Model Labels

Example texts for each label:
Neutral
  • "I'm trying to optimize my investment portfolio and was wondering if anyone has any tips on how to maximize tax efficiency in a taxable brokerage account. I've heard that tax-loss harvesting can be a good strategy, but I'm not sure how to implement it or if it's worth the effort."
  • "I've been following the trend of the S&P 500 and it seems like it's consolidating within a tight range. I'm not seeing any strong buy or sell signals, so I'm going to hold off on making any trades for now. Anyone else noticing this? I'm thinking of waiting for a breakout or a clear reversal before entering a position."
  • "I've been using Fidelity for my brokerage needs and I'm generally happy with their services. They have a user-friendly interface and their customer support is responsive. That being said, I do wish they had more investment options available, but overall I'd say they're a solid choice for beginners and experienced investors alike."
Bullish
  • 'The US labor market continues to show signs of strength, with the latest jobs report revealing a 3.5% unemployment rate, the lowest in nearly 50 years. This is a major boost for the economy, and investors are taking notice. The Dow Jones surged 200 points in response, with many analysts attributing the gains to the improving job market. As a result, stocks in the tech and healthcare sectors are seeing significant gains, with many experts predicting a continued upward trend in the coming weeks. The low unemployment rate is a clear indication that the economy is on the right track, and investors are feeling optimistic about the future.'
  • "Just closed out my Q2 with a 20% gain on my portfolio! The market is on fire and I'm loving every minute of it. Stocks are soaring and I'm feeling bullish about the future. #stockmarket #investing #bullrun"
  • "Just heard that the new government is planning to reduce corporate taxes to 20% from 30%! This is a huge boost for the economy and I'm feeling very bullish on the stock market right now. #Bullish #Finance #Economy"
Bearish
  • 'Economic growth is slowing down and the Fed is raising interest rates again. This is a recipe for disaster. The market is going to tank soon. #BearMarket #EconomicDownturn'
  • "Just got my latest paycheck and I'm shocked to see how much of it is going towards groceries and rent due to this OUT. OF. CONTROL inflation. The economy is a joke. #inflation #bearmarket"
  • 'The latest inflation rate data has sent shockwaves through the market, with the Consumer Price Index (CPI) rising 3.5% in the past 12 months. This is the highest rate in nearly a decade, and economists are warning that it could lead to a recession. The Federal Reserve is expected to raise interest rates again in an effort to combat inflation, but this could have a negative impact on the stock market. As a result, investors are bracing for a potential bear market, with many analysts predicting a 20% drop in the S&P 500 by the end of the year.'

Evaluation

Metrics

Label   F1
all     0.6269
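
An aggregate F1 like the one above can be recomputed along the following lines. This is a minimal sketch: the evaluation texts and labels are hypothetical placeholders (the card does not ship its evaluation split), and macro averaging is assumed since the card does not state the averaging scheme:

from setfit import SetFitModel
from sklearn.metrics import f1_score

model = SetFitModel.from_pretrained("kenhktsui/setfit_test_twitter_news_syn")

# Hypothetical held-out examples; replace with a real labeled test set
eval_texts = [
    "Earnings smashed expectations and futures are ripping higher. #bullrun",
    "Not much movement today, staying on the sidelines.",
]
eval_labels = ["Bullish", "Neutral"]

# predict is expected to return the string labels shown above (Bearish/Bullish/Neutral)
preds = model.predict(eval_texts)

# Averaging scheme assumed to be macro; the card only reports an "all" F1
print(f1_score(eval_labels, preds, average="macro"))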

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("kenhktsui/setfit_test_twitter_news_syn")
# Run inference; preds holds the predicted label for the input text
preds = model("Inflation is out of control! Just got my electricity bill and it's up 25% from last year. No wonder the Fed is raising rates, but will it be enough to stop the bleeding? #inflation #economy")
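
Continuing from the snippet above, inference can also be batched, and (assuming a SetFit 1.x release) per-class probabilities from the LogisticRegression head are available via predict_proba. A minimal sketch with made-up example texts:

texts = [
    "Fed signals a pause on rate hikes and equities rally into the close.",
    "Cutting my losses before this market gets any worse.",
]

# One predicted label per input text
batch_preds = model.predict(texts)

# Per-class probabilities from the LogisticRegression head
probas = model.predict_proba(texts)
print(batch_preds)
print(probas)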

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     17    62.6531   119

Label     Training Sample Count
Bearish   16
Bullish   18
Neutral   15

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (5, 5)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
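
The hyperparameters above map onto a SetFit TrainingArguments configuration roughly as follows. This is a minimal sketch, assuming a SetFit 1.x install; the train/eval datasets below are hypothetical placeholders, not the data used for this model:

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical few-shot splits; the real run used 16/18/15 examples per label
train_dataset = Dataset.from_dict({
    "text": [
        "Stocks are ripping higher today!",
        "Volumes are thin, no clear signal either way.",
        "This market is about to roll over.",
    ],
    "label": ["Bullish", "Neutral", "Bearish"],
})
eval_dataset = train_dataset  # placeholder; use a separate held-out split in practice

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

args = TrainingArguments(
    batch_size=(16, 16),
    num_epochs=(5, 5),
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    sampling_strategy="oversampling",
    warmup_proportion=0.1,
    use_amp=False,
    end_to_end=False,
    seed=42,
)
# The remaining values listed above (max_steps=-1, loss=CosineSimilarityLoss,
# distance_metric=cosine_distance, margin=0.25, eval_max_steps=-1) correspond to
# the library defaults and are therefore not passed explicitly.
# load_best_model_at_end=True was also set for the original run; it additionally
# needs an evaluation/save schedule (not listed in the card), so it is left out
# of this minimal sketch.

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()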

Training Results

Epoch   Step   Training Loss   Validation Loss
0.01    1      0.235           -
0.5     50     0.0307          -
1.0     100    0.0008          0.0357
1.5     150    0.0006          -
2.0     200    0.0002          0.0303
2.5     250    0.0001          -
3.0     300    0.0001          0.0295
3.5     350    0.0001          -
4.0     400    0.0001          0.0281
4.5     450    0.0001          -
5.0     500    0.0001          0.0287

  • The step 400 row (validation loss 0.0281, the lowest) corresponds to the saved checkpoint, since load_best_model_at_end was enabled.

Framework Versions

  • Python: 3.9.19
  • SetFit: 1.1.0.dev0
  • Sentence Transformers: 3.0.1
  • Transformers: 4.39.0
  • PyTorch: 2.4.0
  • Datasets: 2.20.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}