
SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
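
For reference, this three-module stack (BERT encoder, CLS-token pooling, L2 normalization) can be assembled by hand with the sentence_transformers.models API. The sketch below only illustrates the architecture printout above; loading the finetuned checkpoint by its Hub id, as shown under Usage, is the normal path.

from sentence_transformers import SentenceTransformer, models

# Illustrative only: rebuild the three-module stack from the base model.
# The finetuned weights live on the Hub; see the Usage section for the normal loading path.
word_embedding_model = models.Transformer("BAAI/bge-small-en-v1.5", max_seq_length=512)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 384
    pooling_mode="cls",  # CLS-token pooling, matching pooling_mode_cls_token=True above
)
normalize = models.Normalize()  # unit-length embeddings, so dot product equals cosine similarity

model = SentenceTransformer(modules=[word_embedding_model, pooling_model, normalize])
print(model)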

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'how have I done in US equity [DATES]?',
    '[{"get_portfolio(None)": "portfolio"}, {"factor_contribution(\'portfolio\',\'<DATES>\',\'asset_class\',\'us equity\',\'returns\')": "portfolio"}]',
    '[{"get_portfolio(None)": "portfolio"}, {"factor_contribution(\'portfolio\',\'<DATES>\',\'asset_class\',\'us equity\',\'returns\')": "portfolio"}]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
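
The training data pairs natural-language portfolio questions with function-call plans, so a typical downstream use is retrieving the closest plan for a new question. A minimal sketch, in which the two candidate plans are taken from the examples in this card and stand in for a real corpus:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("magnifi/bge-small-en-v1.5-ft-orc-0930-dates")

# Illustrative candidate plans, copied from this card; stand-ins for a real corpus
corpus = [
    '[{"get_portfolio(None)": "portfolio"}, {"get_attribute(\'portfolio\',[\'gains\'],\'\')": "portfolio"}, {"sort(\'portfolio\',\'gains\',\'desc\')": "portfolio"}]',
    '[{"get_portfolio(None)": "portfolio"}, {"factor_contribution(\'portfolio\',\'<DATES>\',\'asset_class\',\'us equity\',\'returns\')": "portfolio"}]',
]
query = "how have I done in US equity [DATES]?"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# Rank candidates by cosine similarity (embeddings are L2-normalized by the Normalize module)
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, len(corpus)]
best = scores.argmax().item()
print(corpus[best], scores[0, best])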

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.6644
cosine_accuracy@3 0.8151
cosine_accuracy@5 0.8767
cosine_accuracy@10 0.9315
cosine_precision@1 0.6644
cosine_precision@3 0.2717
cosine_precision@5 0.1753
cosine_precision@10 0.0932
cosine_recall@1 0.0185
cosine_recall@3 0.0226
cosine_recall@5 0.0244
cosine_recall@10 0.0259
cosine_ndcg@10 0.1752
cosine_mrr@10 0.7525
cosine_map@100 0.021
dot_accuracy@1 0.6644
dot_accuracy@3 0.8151
dot_accuracy@5 0.8767
dot_accuracy@10 0.9315
dot_precision@1 0.6644
dot_precision@3 0.2717
dot_precision@5 0.1753
dot_precision@10 0.0932
dot_recall@1 0.0185
dot_recall@3 0.0226
dot_recall@5 0.0244
dot_recall@10 0.0259
dot_ndcg@10 0.1752
dot_mrr@10 0.7525
dot_map@100 0.021
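
Metrics of this kind are produced with an InformationRetrievalEvaluator. The sketch below shows the general recipe; the queries, corpus, and relevance mapping are hypothetical placeholders, since the actual evaluation set is not included in this card:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("magnifi/bge-small-en-v1.5-ft-orc-0930-dates")

# Hypothetical placeholders: the real evaluation queries and corpus are not published here.
queries = {"q1": "how have I done in US equity [DATES]?"}
corpus = {
    "d1": '[{"get_portfolio(None)": "portfolio"}, {"factor_contribution(\'portfolio\',\'<DATES>\',\'asset_class\',\'us equity\',\'returns\')": "portfolio"}]',
    "d2": '[{"get_portfolio(None)": "portfolio"}, {"get_attribute(\'portfolio\',[\'gains\'],\'\')": "portfolio"}, {"sort(\'portfolio\',\'gains\',\'desc\')": "portfolio"}]',
}
relevant_docs = {"q1": {"d1"}}  # query id -> set of relevant corpus ids

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="eval")
results = evaluator(model)
print(results)  # cosine_accuracy@k, cosine_precision@k, cosine_recall@k, ndcg@10, mrr@10, map@100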

Training Details

Training Dataset

Unnamed Dataset

  • Size: 734 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0: string; min: 5 tokens, mean: 11.94 tokens, max: 26 tokens
    • sentence_1: string; min: 24 tokens, mean: 84.1 tokens, max: 194 tokens
  • Samples (sentence_0 → sentence_1):
    • what is my portfolio [DATES] cagr? → [{"get_portfolio(None)": "portfolio"}, {"get_attribute('portfolio',['gains'],'')": "portfolio"}, {"sort('portfolio','gains','desc')": "portfolio"}]
    • what is my [DATES] rate of return → [{"get_portfolio(None)": "portfolio"}, {"get_attribute('portfolio',['gains'],'')": "portfolio"}, {"sort('portfolio','gains','desc')": "portfolio"}]
    • show backtest of my performance [DATES]? → [{"get_portfolio(None)": "portfolio"}, {"get_attribute('portfolio',['gains'],'')": "portfolio"}, {"sort('portfolio','gains','desc')": "portfolio"}]
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 6
  • multi_dataset_batch_sampler: round_robin
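
Together with the dataset and loss described above, these settings map onto the Sentence Transformers v3 trainer API roughly as follows. This is a sketch, not the original training script: the two training pairs are copied from the samples above, the output directory is hypothetical, and the real run used the full 734-pair dataset plus a held-out evaluation set that is not published in this card.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Two pairs copied from the samples above; the full 734-pair dataset is not published here.
train_dataset = Dataset.from_dict({
    "sentence_0": [
        "what is my portfolio [DATES] cagr?",
        "what is my [DATES] rate of return",
    ],
    "sentence_1": [
        '[{"get_portfolio(None)": "portfolio"}, {"get_attribute(\'portfolio\',[\'gains\'],\'\')": "portfolio"}, {"sort(\'portfolio\',\'gains\',\'desc\')": "portfolio"}]',
        '[{"get_portfolio(None)": "portfolio"}, {"get_attribute(\'portfolio\',[\'gains\'],\'\')": "portfolio"}, {"sort(\'portfolio\',\'gains\',\'desc\')": "portfolio"}]',
    ],
})

# MultipleNegativesRankingLoss defaults (scale=20.0, cosine similarity) match the parameters listed above.
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="bge-small-en-v1.5-ft",  # hypothetical output path
    eval_strategy="steps",
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    num_train_epochs=6,
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # stand-in; the original run evaluated on a held-out set
    loss=loss,
)
trainer.train()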

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Click to expand
Epoch Step cosine_map@100
0.0270 2 0.0136
0.0541 4 0.0138
0.0811 6 0.0140
0.1081 8 0.0142
0.1351 10 0.0144
0.1622 12 0.0146
0.1892 14 0.0147
0.2162 16 0.0150
0.2432 18 0.0152
0.2703 20 0.0157
0.2973 22 0.0165
0.3243 24 0.0168
0.3514 26 0.0167
0.3784 28 0.0170
0.4054 30 0.0174
0.4324 32 0.0180
0.4595 34 0.0181
0.4865 36 0.0181
0.5135 38 0.0182
0.5405 40 0.0182
0.5676 42 0.0182
0.5946 44 0.0183
0.6216 46 0.0183
0.6486 48 0.0183
0.6757 50 0.0183
0.7027 52 0.0182
0.7297 54 0.0185
0.7568 56 0.0186
0.7838 58 0.0189
0.8108 60 0.0190
0.8378 62 0.0191
0.8649 64 0.0193
0.8919 66 0.0197
0.9189 68 0.0198
0.9459 70 0.0196
0.9730 72 0.0196
1.0 74 0.0198
1.0270 76 0.0198
1.0541 78 0.0198
1.0811 80 0.0199
1.1081 82 0.0199
1.1351 84 0.0199
1.1622 86 0.0199
1.1892 88 0.0199
1.2162 90 0.0199
1.2432 92 0.0199
1.2703 94 0.0200
1.2973 96 0.0199
1.3243 98 0.0197
1.3514 100 0.0198
1.3784 102 0.0198
1.4054 104 0.0198
1.4324 106 0.0200
1.4595 108 0.0201
1.4865 110 0.0202
1.5135 112 0.0202
1.5405 114 0.0203
1.5676 116 0.0203
1.5946 118 0.0201
1.6216 120 0.0201
1.6486 122 0.0202
1.6757 124 0.0201
1.7027 126 0.0201
1.7297 128 0.0201
1.7568 130 0.0200
1.7838 132 0.0200
1.8108 134 0.0202
1.8378 136 0.0201
1.8649 138 0.0202
1.8919 140 0.0202
1.9189 142 0.0202
1.9459 144 0.0201
1.9730 146 0.0202
2.0 148 0.0202
2.0270 150 0.0204
2.0541 152 0.0204
2.0811 154 0.0203
2.1081 156 0.0203
2.1351 158 0.0204
2.1622 160 0.0204
2.1892 162 0.0202
2.2162 164 0.0202
2.2432 166 0.0201
2.2703 168 0.0202
2.2973 170 0.0202
2.3243 172 0.0202
2.3514 174 0.0202
2.3784 176 0.0202
2.4054 178 0.0202
2.4324 180 0.0203
2.4595 182 0.0203
2.4865 184 0.0203
2.5135 186 0.0204
2.5405 188 0.0204
2.5676 190 0.0203
2.5946 192 0.0203
2.6216 194 0.0203
2.6486 196 0.0202
2.6757 198 0.0202
2.7027 200 0.0202
2.7297 202 0.0202
2.7568 204 0.0201
2.7838 206 0.0201
2.8108 208 0.0201
2.8378 210 0.0201
2.8649 212 0.0202
2.8919 214 0.0202
2.9189 216 0.0203
2.9459 218 0.0203
2.9730 220 0.0204
3.0 222 0.0204
3.0270 224 0.0204
3.0541 226 0.0206
3.0811 228 0.0205
3.1081 230 0.0205
3.1351 232 0.0205
3.1622 234 0.0206
3.1892 236 0.0206
3.2162 238 0.0206
3.2432 240 0.0206
3.2703 242 0.0206
3.2973 244 0.0205
3.3243 246 0.0205
3.3514 248 0.0204
3.3784 250 0.0204
3.4054 252 0.0204
3.4324 254 0.0205
3.4595 256 0.0205
3.4865 258 0.0205
3.5135 260 0.0205
3.5405 262 0.0204
3.5676 264 0.0204
3.5946 266 0.0204
3.6216 268 0.0203
3.6486 270 0.0203
3.6757 272 0.0204
3.7027 274 0.0204
3.7297 276 0.0206
3.7568 278 0.0206
3.7838 280 0.0206
3.8108 282 0.0206
3.8378 284 0.0206
3.8649 286 0.0205
3.8919 288 0.0206
3.9189 290 0.0207
3.9459 292 0.0206
3.9730 294 0.0206
4.0 296 0.0207
4.0270 298 0.0207
4.0541 300 0.0207
4.0811 302 0.0208
4.1081 304 0.0208
4.1351 306 0.0207
4.1622 308 0.0207
4.1892 310 0.0207
4.2162 312 0.0208
4.2432 314 0.0208
4.2703 316 0.0208
4.2973 318 0.0208
4.3243 320 0.0208
4.3514 322 0.0208
4.3784 324 0.0208
4.4054 326 0.0208
4.4324 328 0.0207
4.4595 330 0.0207
4.4865 332 0.0207
4.5135 334 0.0207
4.5405 336 0.0207
4.5676 338 0.0207
4.5946 340 0.0207
4.6216 342 0.0208
4.6486 344 0.0208
4.6757 346 0.0208
4.7027 348 0.0208
4.7297 350 0.0208
4.7568 352 0.0209
4.7838 354 0.0209
4.8108 356 0.0210

Framework Versions

  • Python: 3.10.9
  • Sentence Transformers: 3.0.1
  • Transformers: 4.44.0
  • PyTorch: 2.4.0+cu121
  • Accelerate: 0.33.0
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}