SentenceTransformer based on nvidia/NV-Embed-v2
This is a sentence-transformers model finetuned from nvidia/NV-Embed-v2. It maps sentences & paragraphs to a 4096-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: nvidia/NV-Embed-v2
- Maximum Sequence Length: 1024 tokens
- Output Dimensionality: 4096 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: NVEmbedModel
(1): Pooling({'word_embedding_dimension': 4096, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': False})
(2): Normalize()
)
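The Pooling and Normalize modules listed above can be illustrated with a small, dependency-free sketch. It assumes that mean pooling averages the token vectors whose attention-mask entry is 1 and that Normalize applies L2 normalization. The function names and the toy 3-dimensional vectors are illustrative stand-ins, not the library's implementation; the real model emits 4096-dimensional vectors.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Toy input: three token vectors, the last one being padding.
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
embedding = l2_normalize(pooled)
```

Because the output is unit length, the dot product of two embeddings equals their cosine similarity, which is consistent with the similarity function listed in the model description.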
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("MendelAI/nv-embed-v2-ontada-twab-peft")
# Run inference
sentences = [
'Instruct: Given a question, retrieve passages that answer the question. Query: what is the total dose administered in the EBRT Intensity Modulated Radiation Therapy?',
'Source: SOAP_Note. Date: 2020-03-13. Context: MV electrons.\n \n FIELDS:\n The right orbital mass and right cervical lymph nodes were initially treated with a two arc IMRT plan. Arc 1: 11.4 x 21 cm. Gantry start and stop angles 178 degrees / 182 degrees. Arc 2: 16.4 x 13.0 cm. Gantry start ',
'Source: Radiology. Date: 2023-09-18. Context: : >60\n \n Contrast Type: OMNI 350\n Volume: 80ML\n \n Lot_: ________\n \n Exp. date: 05/26 \n Study Completed: CT CHEST W\n \n Reading Group:BCH \n \n Prior Studies for Comparison: 06/14/23 CT CHEST W RMCC \n \n ________ ______\n ',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 4096]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
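The example sentences above pair a query that carries the instruction prefix with passages that carry a Source/Date/Context header. The sketch below shows that formatting together with a cosine-similarity ranking. `format_query`, `format_passage`, and `rank_passages` are hypothetical helpers introduced for illustration, and the short vectors stand in for real `model.encode` output.

```python
import math

# Prefix taken from the example query and the `prompts` training setting.
QUERY_PROMPT = (
    "Instruct: Given a question, retrieve passages that answer the question. Query: "
)


def format_query(question):
    """Prepend the retrieval instruction to a raw question."""
    return QUERY_PROMPT + question


def format_passage(source, date, text):
    """Build a passage string in the layout used by the example contexts."""
    return f"Source: {source}. Date: {date}. Context: {text}"


def cosine(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs, best match first."""
    scores = [cosine(query_vec, vec) for vec in passage_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Stand-in vectors; in practice these would come from model.encode(...).
query_vec = [1.0, 0.0]
passage_vecs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
ranking = rank_passages(query_vec, passage_vecs)
```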
Evaluation
Metrics
Patient QA
- Dataset: ontada-test
- Evaluated with PatientQAEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.6856 |
cosine_accuracy@3 | 0.9531 |
cosine_accuracy@5 | 0.9909 |
cosine_accuracy@10 | 1.0 |
cosine_precision@1 | 0.6856 |
cosine_precision@3 | 0.5209 |
cosine_precision@5 | 0.3969 |
cosine_precision@10 | 0.2251 |
cosine_recall@1 | 0.4203 |
cosine_recall@3 | 0.8154 |
cosine_recall@5 | 0.9454 |
cosine_recall@10 | 1.0046 |
cosine_ndcg@10 | 0.8649 |
cosine_mrr@10 | 0.8191 |
cosine_map@100 | 0.805 |
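For readers unfamiliar with the metric names in the table, the following sketch computes them for a single query under the standard information-retrieval definitions with binary relevance. This is an assumption: PatientQAEvaluator is not described in this card and may compute them differently. Reported values would be averages over all queries.

```python
import math


def accuracy_at_k(ranked, relevant, k):
    """1.0 if any relevant passage appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top k that is relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant passages found in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)


def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant passage within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """DCG with binary gains, divided by the DCG of an ideal ordering."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / ideal if ideal > 0 else 0.0


# Hypothetical ranking for one query: passage ids ordered by similarity.
ranked = ["p3", "p1", "p7", "p2"]
relevant = {"p1", "p2"}
```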
Training Details
Training Dataset
Unnamed Dataset
- Size: 16,186 training samples
- Columns: question and context
- Approximate statistics based on the first 1000 samples:
Statistic | question | context |
---|---|---|
type | string | string |
min | 25 tokens | 74 tokens |
mean | 30.78 tokens | 177.84 tokens |
max | 39 tokens | 398 tokens |
- Samples:
Sample 1
question: Instruct: Given a question, retrieve passages that answer the question. Query: what was the abnormality identified for BRAF?
context:
Source: Genetic_Testing. Date: 2022-10-07. Context: Mutational Seq DNA-Tumor Low, 6 mt/Mb NF1
Seq DNA-Tumor Mutation Not Detected
T In Not D
ARID2 Seq DNA-Tumor Mutation Not Detected CNA-Seq DNA-Tumor Deletion Not Detected
PTEN
Seq RNA-Tumor Fusion Not Detected Seq DNA-Tumor Mutation Not Detected
BRAF
Amplification Not _
CNA-Seq DNA-Tumor Detected RAC1 Seq DNA-Tumor Mutation Not Detected
The selection of any, all, or none of the matched therapies
Sample 2
question: Instruct: Given a question, retrieve passages that answer the question. Query: what was the abnormality identified for BRAF?
context:
Source: Genetic_Testing. Date: 2021-06-04. Context: characteristics have been determined by _____ ____
_________ ___ ____ _______. It has not been
cleared or approved by FDA. This assay has been validated
pursuant to the CLIA regulations and is used for clinical
purposes.
BRAF MUTATION ANALYSIS E
SOURCE: LYMPH NODE
PARAFFIN BLOCK NUMBER: - A4
BRAF MUTATION ANALYSIS NOT DETECTED NOT DETECTED
This result was reviewed and interpreted by _. ____, M.D.
Based on Sanger sequencing analysis, no mutations
Sample 3
question: Instruct: Given a question, retrieve passages that answer the question. Query: what was the abnormality identified for BRAF?
context:
Source: Pathology. Date: 2019-12-12. Context: Receive Date: 12/12/2019
___ _: ________________ Accession Date: 12/12/2019
Copy To: Report Date: 12/19/2019 18:16
SUPPLEMENTAL REPORT
(previous report date: 12/19/2019)
BRAF SNAPSHOT
Results:
POSITIVE
Interpretation:
A BRAF mutation was detected in the provided specimen.
FDA has approved TKI inhibitor vemurafenib and dabrafenib for the first-line treatment of patients with
unresectable or metastatic melanoma whose tumors have a BRAF V600E mutation, and trametinib for tumors
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
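To make the loss parameters concrete, here is a minimal sketch of an in-batch-negatives ranking loss using cosine similarity and a scale of 20.0. It follows the general idea of MultipleNegativesRankingLoss as described in the cited Henderson et al. paper (cross-entropy over scaled similarities, with every other context in the batch acting as a negative). The function names and toy vectors are assumptions for illustration, not the library's code.

```python
import math


def cos_sim(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def ranking_loss(questions, contexts, scale=20.0):
    """Mean cross-entropy where question i should rank context i first.

    All other contexts in the batch serve as negatives for question i.
    """
    losses = []
    for i, question in enumerate(questions):
        logits = [scale * cos_sim(question, context) for context in contexts]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two pairs with 2-dimensional stand-in embeddings.
questions = [[1.0, 0.0], [0.0, 1.0]]
aligned = ranking_loss(questions, [[1.0, 0.0], [0.0, 1.0]])
swapped = ranking_loss(questions, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is near zero when each question is closest to its own context and close to the scale value when the pairs are swapped. Since the samples above repeat the same question with different contexts, in-batch negatives of this kind are presumably why the `no_duplicates` batch sampler was chosen, although the card does not say so.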
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 64
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 6789
- bf16: True
- prompts: {'question': 'Instruct: Given a question, retrieve passages that answer the question. Query: '}
- batch_sampler: no_duplicates
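The batch size, epoch count, and warmup ratio above determine the length of the run. The arithmetic below is a sketch that assumes a single device, no gradient accumulation, and a linear schedule with warmup in the style of the Transformers trainer. The resulting 4047 steps match the final step in the training logs.

```python
import math

TRAIN_SAMPLES = 16_186  # training set size from this card
BATCH_SIZE = 4          # per_device_train_batch_size (single device assumed)
EPOCHS = 1              # num_train_epochs
BASE_LR = 2e-05         # learning_rate
WARMUP_RATIO = 0.1      # warmup_ratio

# The last partial batch is kept, so round up.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / (total_steps - warmup_steps)


print(total_steps, warmup_steps)
```

Under these assumptions the learning rate peaks at 2e-05 around step 405 and decays to zero at step 4047.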
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 6789
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- prompts: {'question': 'Instruct: Given a question, retrieve passages that answer the question. Query: '}
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | ontada-test_cosine_ndcg@10 |
---|---|---|---|
0 | 0 | - | 0.8431 |
0.0002 | 1 | 1.5826 | - |
0.0371 | 150 | 0.4123 | - |
0.0741 | 300 | 0.3077 | - |
0.1112 | 450 | 0.2184 | - |
0.1483 | 600 | 0.3291 | - |
0.1853 | 750 | 0.2343 | - |
0.2224 | 900 | 0.2506 | - |
0.2471 | 1000 | - | 0.8077 |
0.2595 | 1050 | 0.1294 | - |
0.2965 | 1200 | 0.0158 | - |
0.3336 | 1350 | 0.0189 | - |
0.3706 | 1500 | 0.0363 | - |
0.4077 | 1650 | 0.0208 | - |
0.4448 | 1800 | 0.475 | - |
0.4818 | 1950 | 0.6183 | - |
0.4942 | 2000 | - | 0.8482 |
0.5189 | 2100 | 0.4779 | - |
0.5560 | 2250 | 0.4194 | - |
0.5930 | 2400 | 0.8376 | - |
0.6301 | 2550 | 0.4249 | - |
0.6672 | 2700 | 0.9336 | - |
0.7042 | 2850 | 0.5351 | - |
0.7413 | 3000 | 1.0253 | 0.8551 |
0.7784 | 3150 | 0.3961 | - |
0.8154 | 3300 | 0.3881 | - |
0.8525 | 3450 | 0.5573 | - |
0.8895 | 3600 | 1.222 | - |
0.9266 | 3750 | 0.3032 | - |
0.9637 | 3900 | 0.3142 | - |
0.9884 | 4000 | - | 0.8645 |
1.0 | 4047 | - | 0.8649 |
Framework Versions
- Python: 3.11.10
- Sentence Transformers: 3.4.0.dev0
- Transformers: 4.46.0
- PyTorch: 2.3.1+cu121
- Accelerate: 1.0.1
- Datasets: 3.0.1
- Tokenizers: 0.20.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}