SentenceTransformer based on BAAI/bge-small-en-v1.5
This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-small-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
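You can verify this module stack locally by loading the model and printing it; the same calls confirm the embedding dimension and maximum sequence length reported above:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("marroyo777/bge-99GPT-v1")
print(model)                                     # Transformer -> Pooling (CLS) -> Normalize
print(model.get_sentence_embedding_dimension())  # 384
print(model.max_seq_length)                      # 512
```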
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("marroyo777/bge-99GPT-v1")
# Run inference
sentences = [
'In what context is traffic flow theory typically discussed?',
'As a result, I was familiar with many terms discussed conceptually but I discovered some of the more official terminology used when discussing traffic flow theory and network control.',
'There are different types of projects within C.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
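Since the model L2-normalizes its embeddings and uses cosine similarity, it can be dropped straight into a semantic search loop. A minimal sketch, assuming a small in-memory corpus (the corpus and query below are illustrative, not from the training data):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("marroyo777/bge-99GPT-v1")

# Hypothetical corpus and query, for illustration only
corpus = [
    "Traffic flow theory describes how vehicles move through a road network.",
    "The hackathon took place at Ohio State University.",
    "Data science teams can request memory and CPU from the cluster.",
]
query = "Where was the hackathon held?"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# Cosine similarity between the query and each corpus entry
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, 3]
print(corpus[scores.argmax().item()])
```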
Evaluation
Metrics
Triplet
- Dataset: 99GPT-Finetuning-Embedding-test-01
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.9987 |
dot_accuracy | 0.0012 |
manhattan_accuracy | 0.9987 |
euclidean_accuracy | 0.9987 |
max_accuracy | 0.9987 |
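These scores come from TripletEvaluator, which measures how often the anchor embeds closer to the positive than to the negative under each distance function. A minimal sketch of re-running it, assuming a few illustrative triplets (the reported values used the full 15,086-sample evaluation split):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("marroyo777/bge-99GPT-v1")

# Hypothetical triplets for illustration; the reported scores used the real eval split
evaluator = TripletEvaluator(
    anchors=["Who is being invited to join the initiative?"],
    positives=["We are inviting the research community to join us."],
    negatives=["Burning it destroys the oil."],
    name="99GPT-Finetuning-Embedding-test-01",
)
print(evaluator(model))  # accuracy under cosine, dot, Manhattan, and Euclidean distances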
Training Details
Training Dataset
Unnamed Dataset
- Size: 60,341 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

 | anchor | positive | negative |
---|---|---|---|
type | string | string | string |
details | min: 7 tokens, mean: 13.77 tokens, max: 24 tokens | min: 7 tokens, mean: 40.26 tokens, max: 123 tokens | min: 6 tokens, mean: 39.24 tokens, max: 139 tokens |
- Samples:

anchor | positive | negative |
---|---|---|
Who is being invited to join the initiative? | Our belief is that the research community will be able to gain access to diverse and real-time data with minimal friction, build exciting innovations and make an impact to Data and AI technologies as well. This is just the first release and we are inviting the research community to join us to build exciting data-driven mobility & energy solutions together. | Burning it destroys the oil. Once you burn the oil, that particular oil ceases to exist. |
What is the main focus of the research conducted for Orbit? | Orbit holds the culmination of almost a year of research with participants from a wide variety of backgrounds, needs, and jobs to be done. | So how do you win a hackathon mobility challenge? The SmartRoute team showed two of them. |
What role do LLMs play in HRI's strategy? | We are excited about the potential of JournAI to transform mobility. By harnessing the power of LLMs and other AI technologies, HRI is driving towards a more connected, efficient, and sustainable future. | This simplified the process for users, who only had to pull and run the docker image to spawn a Jupyterlab app on their machine, open it in their browser, and create a new Pyspark notebook that automatically connected to our spark cluster. Our new workflow allows data science teams to configure their spark jobs and compute resources with options to request memory and CPU from the cluster and customize spark settings. |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
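With these parameters, the loss uses cosine similarity scaled by 20 as logits over in-batch negatives. A minimal sketch of constructing it (scale 20.0 and cos_sim are also the library defaults):

```python
from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("BAAI/bge-small-en-v1.5")
# scale=20.0 and cosine similarity match the parameters listed above
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)
```

Because every other example in a batch serves as an additional negative, this loss benefits from the no_duplicates batch sampler listed under the training hyperparameters.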
Evaluation Dataset
Unnamed Dataset
- Size: 15,086 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

 | anchor | positive | negative |
---|---|---|---|
type | string | string | string |
details | min: 6 tokens, mean: 13.73 tokens, max: 24 tokens | min: 6 tokens, mean: 39.51 tokens, max: 131 tokens | min: 6 tokens, mean: 36.9 tokens, max: 153 tokens |
- Samples:

anchor | positive | negative |
---|---|---|
What does the text suggest about the balance between creating tools and their practical application? | From technology to healthcare, these examples underline the importance of the interplay between theory and practice, between creating advanced tools and applying them effectively. | We found success when leaving the later panels empty as opposed to earlier ones. If we established a clear context and pain point for participants, they were often able to fill in a solution and resolution themselves. |
Who are the personas mentioned in the text? | Our derived data sets are created based on personas that we have identified and their data access needs. | However there still exists a need to connect the map matched nodes that are outputted from the libraries to specific data points from the V2X data, in order to get the rest of the V2X features in a specific time frame. |
Is this the first or second hackathon mentioned? | Up next is the first of two hackathons we participated in at Ohio State University. | The team did a great job by targeting a pervasive issue in such an intuitive way. |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
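Putting the datasets, loss, and the non-default hyperparameters above together, a hedged end-to-end sketch of the fine-tuning setup; the dataset rows and output_dir are placeholders, not the actual training data:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Hypothetical stand-ins for the unnamed 60,341 / 15,086-sample splits
train_dataset = Dataset.from_dict({
    "anchor": ["Who is being invited to join the initiative?"],
    "positive": ["We are inviting the research community to join us."],
    "negative": ["Burning it destroys the oil."],
})
eval_dataset = train_dataset

loss = losses.MultipleNegativesRankingLoss(model)

# Mirrors the non-default hyperparameters listed above
args = SentenceTransformerTrainingArguments(
    output_dir="bge-99GPT-v1",  # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()
```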
Training Logs
Epoch | Step | Training Loss | Validation Loss | 99GPT-Finetuning-Embedding-test-01_max_accuracy |
---|---|---|---|---|
0.0265 | 100 | 0.7653 | 0.4309 | - |
0.0530 | 200 | 0.4795 | 0.2525 | - |
0.0795 | 300 | 0.3416 | 0.1996 | - |
0.1060 | 400 | 0.2713 | 0.1699 | - |
0.1326 | 500 | 0.2271 | 0.1558 | - |
0.1591 | 600 | 0.2427 | 0.1510 | - |
0.1856 | 700 | 0.2188 | 0.1414 | - |
0.2121 | 800 | 0.1936 | 0.1350 | - |
0.2386 | 900 | 0.2174 | 0.1370 | - |
0.2651 | 1000 | 0.2104 | 0.1265 | - |
0.2916 | 1100 | 0.2142 | 0.1324 | - |
0.3181 | 1200 | 0.2088 | 0.1297 | - |
0.3446 | 1300 | 0.1865 | 0.1240 | - |
0.3712 | 1400 | 0.177 | 0.1221 | - |
0.3977 | 1500 | 0.1735 | 0.1296 | - |
0.4242 | 1600 | 0.1746 | 0.1188 | - |
0.4507 | 1700 | 0.1639 | 0.1178 | - |
0.4772 | 1800 | 0.1958 | 0.1105 | - |
0.5037 | 1900 | 0.1874 | 0.1152 | - |
0.5302 | 2000 | 0.1676 | 0.1143 | - |
0.5567 | 2100 | 0.1671 | 0.1067 | - |
0.5832 | 2200 | 0.142 | 0.1154 | - |
0.6098 | 2300 | 0.1668 | 0.1150 | - |
0.6363 | 2400 | 0.1605 | 0.1091 | - |
0.6628 | 2500 | 0.1475 | 0.1096 | - |
0.6893 | 2600 | 0.1668 | 0.1066 | - |
0.7158 | 2700 | 0.166 | 0.1067 | - |
0.7423 | 2800 | 0.1611 | 0.0999 | - |
0.7688 | 2900 | 0.1747 | 0.1001 | - |
0.7953 | 3000 | 0.1436 | 0.1065 | - |
0.8218 | 3100 | 0.1579 | 0.0992 | - |
0.8484 | 3200 | 0.1718 | 0.1006 | - |
0.8749 | 3300 | 0.1567 | 0.0995 | - |
0.9014 | 3400 | 0.1634 | 0.0954 | - |
0.9279 | 3500 | 0.1441 | 0.0956 | - |
0.9544 | 3600 | 0.1433 | 0.0991 | - |
0.9809 | 3700 | 0.1562 | 0.0931 | - |
1.0074 | 3800 | 0.1421 | 0.0931 | - |
1.0339 | 3900 | 0.1424 | 0.0956 | - |
1.0604 | 4000 | 0.128 | 0.0900 | - |
1.0870 | 4100 | 0.1265 | 0.0921 | - |
1.1135 | 4200 | 0.1062 | 0.0944 | - |
1.1400 | 4300 | 0.1221 | 0.0900 | - |
1.1665 | 4400 | 0.1091 | 0.0944 | - |
1.1930 | 4500 | 0.091 | 0.0913 | - |
1.2195 | 4600 | 0.0823 | 0.0935 | - |
1.2460 | 4700 | 0.0946 | 0.0949 | - |
1.2725 | 4800 | 0.0803 | 0.0890 | - |
1.2990 | 4900 | 0.0796 | 0.0885 | - |
1.3256 | 5000 | 0.0699 | 0.0921 | - |
1.3521 | 5100 | 0.073 | 0.0909 | - |
1.3786 | 5200 | 0.0608 | 0.0934 | - |
1.4051 | 5300 | 0.07 | 0.0941 | - |
1.4316 | 5400 | 0.0732 | 0.0896 | - |
1.4581 | 5500 | 0.0639 | 0.0910 | - |
1.4846 | 5600 | 0.0722 | 0.0874 | - |
1.5111 | 5700 | 0.0635 | 0.0925 | - |
1.5376 | 5800 | 0.0631 | 0.0887 | - |
1.5642 | 5900 | 0.0589 | 0.0896 | - |
1.5907 | 6000 | 0.0636 | 0.0925 | - |
1.6172 | 6100 | 0.0702 | 0.0938 | - |
1.6437 | 6200 | 0.0572 | 0.0921 | - |
1.6702 | 6300 | 0.0516 | 0.0946 | - |
1.6967 | 6400 | 0.0695 | 0.0902 | - |
1.7232 | 6500 | 0.0632 | 0.0917 | - |
1.7497 | 6600 | 0.0697 | 0.0832 | - |
1.7762 | 6700 | 0.0747 | 0.0853 | - |
1.8028 | 6800 | 0.0615 | 0.0892 | - |
1.8293 | 6900 | 0.0747 | 0.0855 | - |
1.8558 | 7000 | 0.0668 | 0.0848 | - |
1.8823 | 7100 | 0.0747 | 0.0853 | - |
1.9088 | 7200 | 0.0774 | 0.0847 | - |
1.9353 | 7300 | 0.0546 | 0.0874 | - |
1.9618 | 7400 | 0.0708 | 0.0879 | - |
1.9883 | 7500 | 0.0632 | 0.0863 | - |
2.0148 | 7600 | 0.0601 | 0.0873 | - |
2.0414 | 7700 | 0.063 | 0.0870 | - |
2.0679 | 7800 | 0.0646 | 0.0819 | - |
2.0944 | 7900 | 0.0557 | 0.0825 | - |
2.1209 | 8000 | 0.0444 | 0.0841 | - |
2.1474 | 8100 | 0.049 | 0.0825 | - |
2.1739 | 8200 | 0.0441 | 0.0845 | - |
2.2004 | 8300 | 0.0451 | 0.0844 | - |
2.2269 | 8400 | 0.0346 | 0.0851 | - |
2.2534 | 8500 | 0.0398 | 0.0847 | - |
2.2800 | 8600 | 0.033 | 0.0855 | - |
2.3065 | 8700 | 0.0355 | 0.0851 | - |
2.3330 | 8800 | 0.0313 | 0.0867 | - |
2.3595 | 8900 | 0.0358 | 0.0870 | - |
2.3860 | 9000 | 0.0251 | 0.0867 | - |
2.4125 | 9100 | 0.0395 | 0.0854 | - |
2.4390 | 9200 | 0.0322 | 0.0838 | - |
2.4655 | 9300 | 0.0355 | 0.0847 | - |
2.4920 | 9400 | 0.034 | 0.0834 | - |
2.5186 | 9500 | 0.0345 | 0.0862 | - |
2.5451 | 9600 | 0.0272 | 0.0830 | - |
2.5716 | 9700 | 0.0275 | 0.0831 | - |
2.5981 | 9800 | 0.0345 | 0.0849 | - |
2.6246 | 9900 | 0.0289 | 0.0849 | - |
2.6511 | 10000 | 0.0282 | 0.0860 | - |
2.6776 | 10100 | 0.0279 | 0.0885 | - |
2.7041 | 10200 | 0.0344 | 0.0865 | - |
2.7306 | 10300 | 0.0326 | 0.0863 | - |
2.7572 | 10400 | 0.0383 | 0.0840 | - |
2.7837 | 10500 | 0.0338 | 0.0833 | - |
2.8102 | 10600 | 0.0298 | 0.0836 | - |
2.8367 | 10700 | 0.0402 | 0.0825 | - |
2.8632 | 10800 | 0.0361 | 0.0822 | - |
2.8897 | 10900 | 0.0388 | 0.0818 | - |
2.9162 | 11000 | 0.0347 | 0.0821 | - |
2.9427 | 11100 | 0.0341 | 0.0826 | - |
2.9692 | 11200 | 0.0373 | 0.0825 | - |
2.9958 | 11300 | 0.0354 | 0.0824 | - |
3.0 | 11316 | - | - | 0.9987 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.1
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.19.1
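To approximate this environment, pinning the reported versions is a reasonable starting point (a suggestion, not a tested requirement; the +cu121 PyTorch build depends on your platform):

pip install sentence-transformers==3.1.1 transformers==4.44.2 torch==2.4.1 accelerate==0.34.2 datasets==3.0.1 tokenizers==0.19.1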
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}