SentenceTransformer based on nomic-ai/nomic-embed-text-v1.5

This is a sentence-transformers model finetuned from nomic-ai/nomic-embed-text-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nomic-ai/nomic-embed-text-v1.5
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NomicBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
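
In the Pooling module above, only pooling_mode_mean_tokens is enabled, so a sentence embedding is the average of the token embeddings at non-padding positions. The following is a minimal pure-Python sketch of that idea. The function name mean_pool and the toy 2-dimensional vectors are illustrative assumptions, not the library's implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions whose mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [s / count for s in summed]


# Toy example: three token vectors, the last one is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```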

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

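# Download from the 🤗 Hub
# trust_remote_code=True is needed because NomicBertModel uses custom modeling code
model = SentenceTransformer("lv12/esci-nomic-embed-text-v1_5", trust_remote_code=True)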
# Run inference
sentences = [
    'search_query: お布団バッグ',
    'search_query: 足なしソファー',
    'search_query: all color handbag',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
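
The example inputs carry a "search_query: " prefix, and the training data in this card pairs such queries with "search_document: " texts, so it seems reasonable to keep the same prefixes at inference time. The sketch below shows one way to add the prefixes and rank documents by cosine similarity. The helper names are hypothetical, and the toy vectors stand in for the output of model.encode.

```python
import math

QUERY_PREFIX = "search_query: "
DOCUMENT_PREFIX = "search_document: "


def as_query(text):
    """Prepend the query prefix seen in the training data (hypothetical helper)."""
    return QUERY_PREFIX + text


def as_document(text):
    """Prepend the document prefix seen in the training data (hypothetical helper)."""
    return DOCUMENT_PREFIX + text


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def rank(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors stand in for embeddings produced by the model.
query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [2.0, 0.1], [1.0, 1.0]]
order = rank(query_vec, doc_vecs)
print(as_query("all color handbag"))  # search_query: all color handbag
print(order)  # [1, 2, 0]
```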

Evaluation

Metrics

Triplet

Metric              Value
cosine_accuracy     0.787
dot_accuracy        0.22
manhattan_accuracy  0.762
euclidean_accuracy  0.768
max_accuracy        0.787
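
As a rough guide to reading these numbers, a triplet accuracy can be understood as the fraction of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative under the given measure. The sketch below computes this for cosine similarity on toy vectors. It is an illustration under that assumption, not the evaluator's actual code, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def triplet_cosine_accuracy(triplets):
    """Fraction of triplets where the positive scores higher than the negative."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)


# Toy (anchor, positive, negative) embeddings; real ones would come from the model.
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive is closer
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # negative is closer, counted as an error
    ([1.0, 0.0], [1.0, 0.2], [0.5, 0.5]),   # positive is closer
]
accuracy = triplet_cosine_accuracy(triplets)
print(accuracy)  # 0.75
```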

Training Details

Training Dataset

Unnamed Dataset

  • Size: 100,000 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min tokens  mean tokens  max tokens
    anchor    string  7           12.11        47
    positive  string  17          49.91        166
    negative  string  20          50.64        152
  • Samples:
    Sample 1
      anchor: search_query: blー5c
      positive: search_document: [EnergyPower] TECSUN PL-368 電池2個セット SSB・同期検波・長波 [交換用バッテリーBL-5C付] デジタルDSPポケット短波ラジオ 超小型 長・中波用外付アンテナ 10キー ポータブルBCL受信機 FMステレオ/LW/MW/SW ワールドバンドレシーバー 850局プリセットメモリー シグナルメーター USB充電 スリープタイマー アラー, TECSUN, PL-368 電池+セット [ブラック]
      negative: search_document: RADIWOWで作る SIHUADON R108 ポータブル BCL短波ラジオAM FM LW SW 航空無線 DSPレシーバー LCD 良好屋内および屋外アクティビティの両親への贈り物, RADIWOW, グレー
    Sample 2
      anchor: search_query: かわいいロングtシャツ
      positive: search_document: レディース ロンt 半袖 tシャツ オーバーサイズ コットン スリット 大きいサイズ 白 シャツ ビッグシルエット ワンピース シャツワンピ ロングtシャツ おおきいサイズ 夏 ピンク カジュアル カップ付き カーディガン キラキラ キャミソール キャミ サテン シンプル シニア シフォン シースルー シ, Sleeping Sheep(スリーピング シープ), ホワイト
      negative: search_document: Perkisboby スポーツウェア レディース ヨガウェア 4点セット 上下セット 5点セットウェア フィットネス 2点セット ジャージ スポーツブラ パンツ パーカー 半袖 ハーフパンツ, Perkisboby, 2点セット-グレー
    Sample 3
      anchor: search_query: iphone xr otterbox symmetry case
      positive: search_document: Symmetry Clear Series Case for iPhone XR (ONLY) Symmetry Case for iPhone XR Symmetry Case - Clear, VTSOU, Clear
      negative: search_document: OtterBox Symmetry Series Case for Apple iPhone XS Max - Tonic Violet / Purple, OtterBox, Tonic Violet / Purple
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
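
To make the two parameters concrete: this family of losses multiplies cosine similarities by the scale (20.0 here) and applies a softmax cross-entropy in which each anchor's own positive is the target, while the other positives and all negatives in the batch serve as negatives. The "Cached" variant changes how gradients are computed to save memory, not the loss value. The sketch below is a simplified pure-Python illustration under those assumptions, with a hypothetical function name. It is not the library's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy over scaled cosine similarities with in-batch negatives."""
    candidates = positives + negatives  # anchor i should pick candidate i
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        # log-sum-exp with the maximum subtracted for numerical stability
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(anchors)


# Toy batch of two triplets whose positives match their anchors exactly.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]
loss = mnr_loss(anchors, positives, negatives)
swapped_loss = mnr_loss(anchors, positives[::-1], negatives)
print(loss < 1e-6, swapped_loss > 10)  # True True
```

With well-separated embeddings the loss is close to zero. Swapping the positives so each anchor is paired with the wrong document drives it up to roughly the scale value.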
    

Evaluation Dataset

Unnamed Dataset

  • Size: 1,000 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min tokens  mean tokens  max tokens
    anchor    string  7           12.13        49
    positive  string  15          50.76        173
    negative  string  18          54.25        161
  • Samples:
    Sample 1
      anchor: search_query: snack vending machine
      positive: search_document: Red All Metal Triple Compartment Commercial Vending Machine for 1 inch Gumballs, 1 inch Toy Capsules, Bouncy Balls, Candy, Nuts with Stand by American Gumball Company, American Gumball Company, CANDY RED
      negative: search_document: Vending Machine Halloween Costume - Funny Snack Food Adult Men & Women Outfits, Hauntlook, Multicolored
    Sample 2
      anchor: search_query: slim credit card holder without id window
      positive: search_document: Banuce Top Grain Leather Card Holder for Women Men Unisex ID Credit Card Case Slim Card Wallet Black, Banuce, 1 ID + 5 Card Slots: Black
      negative: search_document: Mens Wallet RFID Genuine Leather Bifold Wallets For Men, ID Window 16 Card Holders Gift Box, Swallowmall, Black Stripe
    Sample 3
      anchor: search_query: gucci belts for women
      positive: search_document: Gucci Women's Gg0027o 50Mm Optical Glasses, Gucci, Havana
      negative: search_document: Gucci G-Gucci Gold PVD Women's Watch(Model:YA125511), Gucci, PVD/Brown
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • gradient_accumulation_steps: 2
  • learning_rate: 1e-06
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • dataloader_drop_last: True
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: 2
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
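
These settings imply a fairly small effective batch and a long schedule. The sketch below works through the arithmetic and a standard linear-warmup plus half-cosine decay curve. It assumes a single training device (consistent with the training log reaching epoch 1.0 at step 12500), uses num_train_epochs: 3 and the 100,000-sample training set size from this card, and rounds the warmup length, which may differ from the trainer's exact rounding rule. The function name is hypothetical.

```python
import math

TRAIN_SAMPLES = 100_000
PER_DEVICE_BATCH = 4
GRAD_ACCUM = 2
EPOCHS = 3
WARMUP_RATIO = 0.1
PEAK_LR = 1e-06
NUM_DEVICES = 1  # assumption: single-device training

samples_per_step = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
steps_per_epoch = TRAIN_SAMPLES // samples_per_step
total_steps = steps_per_epoch * EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)  # rounding rule is an assumption


def learning_rate(step):
    """Linear warmup to PEAK_LR, then half-cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(samples_per_step, steps_per_epoch, total_steps, warmup_steps)
# 8 12500 37500 3750
print(learning_rate(0), learning_rate(warmup_steps), learning_rate(total_steps))
# 0.0 1e-06 0.0
```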

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • learning_rate: 1e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: 2
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
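
The batch_sampler: no_duplicates setting is relevant for a loss that uses in-batch negatives: if the same text appeared twice in a batch, a correct match could end up being treated as a negative. The sketch below shows one greedy way to build such batches. It is a simplified illustration with a hypothetical function name, not the library's sampler.

```python
def no_duplicate_batches(samples, batch_size):
    """Greedily group samples so that no text repeats within a batch.

    Each sample is a tuple of texts, e.g. (anchor, positive, negative).
    Samples that cannot fill a final full batch are dropped, mirroring
    dataloader_drop_last: True.
    """
    remaining = list(samples)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for sample in remaining:
            if len(batch) < batch_size and not (set(sample) & seen):
                batch.append(sample)
                seen.update(sample)
            else:
                deferred.append(sample)  # try again in a later batch
        if len(batch) < batch_size:
            break  # not enough compatible samples for another full batch
        batches.append(batch)
        remaining = deferred
    return batches


samples = [
    ("q1", "doc_a", "doc_b"),
    ("q1", "doc_c", "doc_d"),  # shares the anchor "q1" with the first sample
    ("q2", "doc_e", "doc_f"),
    ("q3", "doc_g", "doc_a"),  # shares "doc_a" with the first sample
]
batches = no_duplicate_batches(samples, batch_size=2)
for batch in batches:
    print(batch)
```

Here the first and third samples form one batch and the second and fourth form another, so the repeated "q1" and "doc_a" never share a batch.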

Training Logs

Epoch Step Training Loss Validation Loss triplet-esci_cosine_accuracy
0.008 100 0.7191 - -
0.016 200 0.6917 - -
0.024 300 0.7129 - -
0.032 400 0.6826 - -
0.04 500 0.7317 - -
0.048 600 0.7237 - -
0.056 700 0.6904 - -
0.064 800 0.6815 - -
0.072 900 0.6428 - -
0.08 1000 0.6561 0.6741 0.74
0.088 1100 0.6097 - -
0.096 1200 0.6426 - -
0.104 1300 0.618 - -
0.112 1400 0.6346 - -
0.12 1500 0.611 - -
0.128 1600 0.6092 - -
0.136 1700 0.6512 - -
0.144 1800 0.646 - -
0.152 1900 0.6584 - -
0.16 2000 0.6403 0.6411 0.747
0.168 2100 0.5882 - -
0.176 2200 0.6361 - -
0.184 2300 0.5641 - -
0.192 2400 0.5734 - -
0.2 2500 0.6156 - -
0.208 2600 0.6252 - -
0.216 2700 0.634 - -
0.224 2800 0.5743 - -
0.232 2900 0.5222 - -
0.24 3000 0.5604 0.6180 0.765
0.248 3100 0.5864 - -
0.256 3200 0.5541 - -
0.264 3300 0.5661 - -
0.272 3400 0.5493 - -
0.28 3500 0.556 - -
0.288 3600 0.56 - -
0.296 3700 0.5552 - -
0.304 3800 0.5833 - -
0.312 3900 0.5578 - -
0.32 4000 0.5495 0.6009 0.769
0.328 4100 0.5245 - -
0.336 4200 0.477 - -
0.344 4300 0.5536 - -
0.352 4400 0.5493 - -
0.36 4500 0.532 - -
0.368 4600 0.5341 - -
0.376 4700 0.528 - -
0.384 4800 0.5574 - -
0.392 4900 0.4953 - -
0.4 5000 0.5365 0.5969 0.779
0.408 5100 0.4835 - -
0.416 5200 0.4573 - -
0.424 5300 0.5554 - -
0.432 5400 0.5623 - -
0.44 5500 0.5955 - -
0.448 5600 0.5086 - -
0.456 5700 0.5081 - -
0.464 5800 0.4829 - -
0.472 5900 0.5066 - -
0.48 6000 0.4997 0.5920 0.776
0.488 6100 0.5075 - -
0.496 6200 0.5051 - -
0.504 6300 0.5019 - -
0.512 6400 0.4774 - -
0.52 6500 0.4975 - -
0.528 6600 0.4756 - -
0.536 6700 0.4656 - -
0.544 6800 0.4671 - -
0.552 6900 0.4646 - -
0.56 7000 0.5595 0.5853 0.777
0.568 7100 0.4812 - -
0.576 7200 0.506 - -
0.584 7300 0.49 - -
0.592 7400 0.464 - -
0.6 7500 0.441 - -
0.608 7600 0.4492 - -
0.616 7700 0.457 - -
0.624 7800 0.493 - -
0.632 7900 0.4174 - -
0.64 8000 0.4686 0.5809 0.785
0.648 8100 0.4529 - -
0.656 8200 0.4784 - -
0.664 8300 0.4697 - -
0.672 8400 0.4489 - -
0.68 8500 0.4439 - -
0.688 8600 0.4063 - -
0.696 8700 0.4634 - -
0.704 8800 0.4446 - -
0.712 8900 0.4725 - -
0.72 9000 0.3954 0.5769 0.781
0.728 9100 0.4536 - -
0.736 9200 0.4583 - -
0.744 9300 0.4415 - -
0.752 9400 0.4716 - -
0.76 9500 0.4393 - -
0.768 9600 0.4332 - -
0.776 9700 0.4236 - -
0.784 9800 0.4021 - -
0.792 9900 0.4324 - -
0.8 10000 0.4197 0.5796 0.78
0.808 10100 0.4576 - -
0.816 10200 0.4238 - -
0.824 10300 0.4468 - -
0.832 10400 0.4301 - -
0.84 10500 0.414 - -
0.848 10600 0.4563 - -
0.856 10700 0.4212 - -
0.864 10800 0.3905 - -
0.872 10900 0.4384 - -
0.88 11000 0.3474 0.5709 0.788
0.888 11100 0.4396 - -
0.896 11200 0.3819 - -
0.904 11300 0.3748 - -
0.912 11400 0.4217 - -
0.92 11500 0.3893 - -
0.928 11600 0.3835 - -
0.936 11700 0.4303 - -
0.944 11800 0.4274 - -
0.952 11900 0.4089 - -
0.96 12000 0.4009 0.5710 0.786
0.968 12100 0.3832 - -
0.976 12200 0.3543 - -
0.984 12300 0.4866 - -
0.992 12400 0.4531 - -
1.0 12500 0.3728 - -
1.008 12600 0.386 - -
1.016 12700 0.3622 - -
1.024 12800 0.4013 - -
1.032 12900 0.3543 - -
1.04 13000 0.3918 0.5712 0.792
1.048 13100 0.3961 - -
1.056 13200 0.3804 - -
1.064 13300 0.4049 - -
1.072 13400 0.3374 - -
1.08 13500 0.3746 - -
1.088 13600 0.3162 - -
1.096 13700 0.3536 - -
1.104 13800 0.3101 - -
1.112 13900 0.3704 - -
1.12 14000 0.3412 0.5758 0.788
1.128 14100 0.342 - -
1.136 14200 0.383 - -
1.144 14300 0.3554 - -
1.152 14400 0.4013 - -
1.16 14500 0.3486 - -
1.168 14600 0.3367 - -
1.176 14700 0.3737 - -
1.184 14800 0.319 - -
1.192 14900 0.3211 - -
1.2 15000 0.3284 0.5804 0.787
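
Because load_best_model_at_end is enabled, the evaluation rows in this log (one every 1000 steps) are the ones that matter for checkpoint selection. The sketch below copies those rows from the log, where the fourth column is the evaluation loss, and finds the best row by each criterion. Which metric the trainer actually used to pick the best checkpoint is not stated in this card, so both are shown.

```python
# (step, evaluation loss, triplet-esci_cosine_accuracy) rows copied from the log.
eval_rows = [
    (1000, 0.6741, 0.74),
    (2000, 0.6411, 0.747),
    (3000, 0.6180, 0.765),
    (4000, 0.6009, 0.769),
    (5000, 0.5969, 0.779),
    (6000, 0.5920, 0.776),
    (7000, 0.5853, 0.777),
    (8000, 0.5809, 0.785),
    (9000, 0.5769, 0.781),
    (10000, 0.5796, 0.78),
    (11000, 0.5709, 0.788),
    (12000, 0.5710, 0.786),
    (13000, 0.5712, 0.792),
    (14000, 0.5758, 0.788),
    (15000, 0.5804, 0.787),
]

best_by_loss = min(eval_rows, key=lambda row: row[1])
best_by_accuracy = max(eval_rows, key=lambda row: row[2])
print(best_by_loss)      # (11000, 0.5709, 0.788)
print(best_by_accuracy)  # (13000, 0.5712, 0.792)
```

The evaluation loss bottoms out around steps 11000 to 13000 and rises afterwards while the training loss keeps falling, which is worth keeping in mind when choosing a checkpoint.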

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.0
  • Transformers: 4.38.2
  • PyTorch: 2.1.2+cu121
  • Accelerate: 0.27.2
  • Datasets: 2.19.1
  • Tokenizers: 0.15.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup}, 
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Safetensors

  • Model size: 137M params
  • Tensor type: F32

Model tree for lv12/esci-nomic-embed-text-v1_5

  • Finetuned from: nomic-ai/nomic-embed-text-v1.5