SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
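The first step trains on sentence pairs rather than on single labelled sentences. As a rough illustration only (this is not SetFit's actual pair sampler), the sketch below builds such pairs from a few of the label examples listed in this card. Same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0, which is the kind of input a cosine-similarity loss consumes. The helper name `make_pairs` is hypothetical.

```python
from itertools import combinations


def make_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    Hypothetical helper for illustration; SetFit's own sampler differs.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        # Same label -> the embeddings should be similar (1.0), otherwise dissimilar (0.0).
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# A handful of examples taken from the Label Examples in this card.
examples = [
    ("Provide data_asset_kpi_cf group by quarter.", "Aggregation"),
    ("Get me data_asset_001_kpm group by metrics.", "Aggregation"),
    ("Join data_asset_kpi_cf with data_asset_001_kpm tables.", "Tablejoin"),
    ("I'd prefer not to apply any filters.", "Rejection"),
]

pairs = make_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Four labelled sentences already yield six training pairs, which is part of why the contrastive step works with few examples per class.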

Model Details

Model Description

Model Sources

Model Labels

Label Examples
Aggregation
  • 'Please show med CostVariance_Actual_vs_Forecast.'
  • 'Get me data_asset_001_kpm group by metrics.'
  • 'Provide data_asset_kpi_cf group by quarter.'
Tablejoin
  • 'Join data_asset_kpi_cf with data_asset_001_kpm tables.'
  • 'Could you link the Products and Orders tables to track sales trends for different product categories?'
  • 'Can I have a merge of income statement and key performance metrics tables?'
Lookup
  • "Filter by the 'Sales' department and show me the employees."
  • "Filter by the 'Toys' category and get me the product names."
  • 'Can you get me the products with a price above 100?'
Rejection
  • "Let's avoid generating additional reports."
  • "I'd rather not filter this dataset."
  • "I'd prefer not to apply any filters."
Lookup_1
  • 'Show me key income statement metrics.'
  • 'can I have kpm table'
  • 'Retrieve data_asset_kpi_ma_product records.'
Generalreply
  • "Hey! It's going pretty well, thanks for asking. How about yours?"
  • 'Not much, just taking it one day at a time. How about you?'
  • "'What is your favorite quote?'"
Viewtables
  • 'What are the table names that relate to customer service in the starhub_data_asset database?'
  • 'What tables are available in the starhub_data_asset database that can be joined to track user behavior?'
  • 'What are the tables that are available for analysis in the starhub_data_asset database?'

Evaluation

Metrics

Label Accuracy
all 0.9915

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("nazhan/bge-small-en-v1.5-brahmaputra-iter-10-3rd")
# Run inference
preds = model("Show me average asset value.")
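For more than one query at a time, a small wrapper can keep inputs and predictions together. The sketch below assumes the loaded model is callable on a list of strings and returns one prediction per input, and that predictions come back as label names; `load_model`, `classify_batch`, and `label_counts` are hypothetical helper names, not part of SetFit.

```python
from collections import Counter


def load_model(repo_id="nazhan/bge-small-en-v1.5-brahmaputra-iter-10-3rd"):
    """Load the classifier from the Hub (requires `pip install setfit`)."""
    # Imported lazily so the helpers below can be used and tested without setfit.
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(repo_id)


def classify_batch(model, texts):
    """Return (text, predicted_label) pairs for a list of texts.

    Assumes `model` is callable on a list of strings and returns one
    prediction per input; predictions are converted to str for display.
    """
    texts = list(texts)
    preds = model(texts)
    return [(text, str(pred)) for text, pred in zip(texts, preds)]


def label_counts(results):
    """Count how many texts were assigned to each label."""
    return Counter(label for _, label in results)
```

With the real model, usage would look like `results = classify_batch(load_model(), ["Show me average asset value.", "can I have kpm table"])`, followed by `label_counts(results)` to see how the queries were distributed across labels.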

Training Details

Training Set Metrics

Training set Min Median Max
Word count 1 8.7839 62

Label Training Sample Count
Tablejoin 127
Rejection 76
Aggregation 281
Lookup 59
Generalreply 71
Viewtables 75
Lookup_1 158
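The per-label counts above are uneven. The short sketch below totals them and prints each label's share of the training set, using only the numbers from the table; the variable names are illustrative.

```python
# Training sample counts copied from the table above.
counts = {
    "Tablejoin": 127,
    "Rejection": 76,
    "Aggregation": 281,
    "Lookup": 59,
    "Generalreply": 71,
    "Viewtables": 75,
    "Lookup_1": 158,
}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}
largest = max(counts, key=counts.get)
smallest = min(counts, key=counts.get)
imbalance_ratio = counts[largest] / counts[smallest]

# Print labels from most to least frequent with their share of the data.
for label, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<13}{n:>4}  {shares[label]:.1%}")
print(f"total: {total}, largest/smallest: {imbalance_ratio:.2f}")
```

The training set has 847 samples in total, and the largest class (Aggregation) has roughly 4.8 times as many samples as the smallest (Lookup).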

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: 2450
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
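Several of the values above are tuples. In SetFit, the first element applies to the embedding fine-tuning phase and the second to the classifier training phase. The sketch below separates the two and derives a couple of quantities from the listed values. It assumes warmup is taken as a proportion of `max_steps` and that each step processes one batch of sentence pairs; `split_phases` is a hypothetical helper.

```python
# A subset of the hyperparameters listed above.
hyperparameters = {
    "batch_size": (16, 16),
    "num_epochs": (1, 1),
    "max_steps": 2450,
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "warmup_proportion": 0.1,
}


def split_phases(params):
    """Split tuple-valued settings into per-phase dicts; keep scalars separately."""
    embedding, classifier, scalars = {}, {}, {}
    for name, value in params.items():
        if isinstance(value, tuple):
            embedding[name], classifier[name] = value
        else:
            scalars[name] = value
    return embedding, classifier, scalars


embedding, classifier, scalars = split_phases(hyperparameters)

# Assumption: warmup length is a proportion of the capped step count.
warmup_steps = round(scalars["max_steps"] * scalars["warmup_proportion"])
# Assumption: one batch of sentence pairs per step during embedding fine-tuning.
pairs_seen = scalars["max_steps"] * embedding["batch_size"]

print(embedding, classifier)
print(warmup_steps, pairs_seen)
```

Under these assumptions, the run uses about 245 warmup steps and sees about 39,200 sentence pairs before `max_steps` stops the embedding phase.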

Training Results

Epoch Step Training Loss Validation Loss
0.0000 1 0.2317 -
0.0025 50 0.2478 -
0.0050 100 0.2213 -
0.0075 150 0.0779 -
0.0100 200 0.1089 -
0.0125 250 0.0372 -
0.0149 300 0.0219 -
0.0174 350 0.0344 -
0.0199 400 0.012 -
0.0224 450 0.0049 -
0.0249 500 0.0041 -
0.0274 550 0.0083 -
0.0299 600 0.0057 -
0.0324 650 0.0047 -
0.0349 700 0.0022 -
0.0374 750 0.0015 -
0.0399 800 0.0032 -
0.0423 850 0.002 -
0.0448 900 0.0028 -
0.0473 950 0.0017 -
0.0498 1000 0.0017 -
0.0523 1050 0.0027 -
0.0548 1100 0.0022 -
0.0573 1150 0.0018 -
0.0598 1200 0.001 -
0.0623 1250 0.002 -
0.0648 1300 0.001 -
0.0673 1350 0.0013 -
0.0697 1400 0.0012 -
0.0722 1450 0.0018 -
0.0747 1500 0.0012 -
0.0772 1550 0.0016 -
0.0797 1600 0.0012 -
0.0822 1650 0.0016 -
0.0847 1700 0.0027 -
0.0872 1750 0.0014 -
0.0897 1800 0.0011 -
0.0922 1850 0.0011 -
0.0947 1900 0.0012 -
0.0971 1950 0.0014 -
0.0996 2000 0.0014 -
0.1021 2050 0.0015 -
0.1046 2100 0.0009 -
0.1071 2150 0.0015 -
0.1096 2200 0.0013 -
0.1121 2250 0.0013 -
0.1146 2300 0.001 -
0.1171 2350 0.0017 -
0.1196 2400 0.0013 -
0.1221 2450 0.0008 0.0323
  • The bold row denotes the saved checkpoint.
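Two things can be read off the log above with a little arithmetic: where the training loss first drops below a chosen threshold, and roughly how many steps a full epoch would have taken. The sketch below uses an excerpt of the rows; the 0.01 threshold is an arbitrary choice for illustration, and because the epoch column is rounded to four decimals the steps-per-epoch figure is only an estimate.

```python
# (epoch, step, training_loss) rows copied from the table above.
# This is an excerpt (the first eleven rows and the last), not the full log.
rows = [
    (0.0000, 1, 0.2317), (0.0025, 50, 0.2478), (0.0050, 100, 0.2213),
    (0.0075, 150, 0.0779), (0.0100, 200, 0.1089), (0.0125, 250, 0.0372),
    (0.0149, 300, 0.0219), (0.0174, 350, 0.0344), (0.0199, 400, 0.012),
    (0.0224, 450, 0.0049), (0.0249, 500, 0.0041), (0.1221, 2450, 0.0008),
]


def first_step_below(rows, threshold):
    """Return the first logged step whose training loss is under the threshold."""
    for _, step, loss in rows:
        if loss < threshold:
            return step
    return None


first_below = first_step_below(rows, 0.01)

# Estimate steps per epoch from the final row (epoch fraction is rounded).
last_epoch, last_step, _ = rows[-1]
steps_per_epoch = last_step / last_epoch

print(first_below, round(steps_per_epoch))
```

The loss first falls below 0.01 at step 450, and the final row implies on the order of 20,000 steps per epoch, so `max_steps: 2450` ends training after about 12% of one epoch.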

Framework Versions

  • Python: 3.11.9
  • SetFit: 1.0.3
  • Sentence Transformers: 2.7.0
  • Transformers: 4.42.4
  • PyTorch: 2.4.0+cu121
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 33.4M params (Safetensors, tensor type F32)