strickvl/finetuned-all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 tokens
- Similarity Function: Cosine Similarity
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the π€ Hub
model = SentenceTransformer("strickvl/finetuned-all-MiniLM-L6-v2")
# Run inference
sentences = [
'Can you explain how the `query_similar_docs` function handles document reranking?',
'ry_similar_docs(\n\nquestion: str,\n\nurl_ending: str,use_reranking: bool = False,\n\nreturned_sample_size: int = 5,\n\n) -> Tuple[str, str, List[str]]:\n\n"""Query similar documents for a given question and URL ending."""\n\nembedded_question = get_embeddings(question)\n\ndb_conn = get_db_conn()\n\nnum_docs = 20 if use_reranking else returned_sample_size\n\n# get (content, url) tuples for the top n similar documents\n\ntop_similar_docs = get_topn_similar_docs(\n\nembedded_question, db_conn, n=num_docs, include_metadata=True\n\nif use_reranking:\n\nreranked_docs_and_urls = rerank_documents(question, top_similar_docs)[\n\n:returned_sample_size\n\nurls = [doc[1] for doc in reranked_docs_and_urls]\n\nelse:\n\nurls = [doc[1] for doc in top_similar_docs] # Unpacking URLs\n\nreturn (question, url_ending, urls)\n\nWe get the embeddings for the question being passed into the function and connect to our PostgreSQL database. If we\'re using reranking, we get the top 20 documents similar to our query and rerank them using the rerank_documents helper function. We then extract the URLs from the reranked documents and return them. Note that we only return 5 URLs, but in the case of reranking we get a larger number of documents and URLs back from the database to pass to our reranker, but in the end we always choose the top five reranked documents to return.\n\nNow that we\'ve added reranking to our pipeline, we can evaluate the performance of our reranker and see how it affects the quality of the retrieved documents.\n\nCode Example\n\nTo explore the full code, visit the Complete Guide repository and for this section, particularly the eval_retrieval.py file.\n\nPreviousUnderstanding reranking\n\nNextEvaluating reranking performance\n\nLast updated 15 days ago',
" use for the database connection.\ndatabase_ssl_ca:# The path to the client SSL certificate to use for the database connection.\ndatabase_ssl_cert:\n\n# The path to the client SSL key to use for the database connection.\ndatabase_ssl_key:\n\n# Whether to verify the database server SSL certificate.\ndatabase_ssl_verify_server_cert:\n\nRun the deploy command and pass the config file above to it.Copyzenml deploy --config=/PATH/TO/FILENote To be able to run the deploy command, you should have your cloud provider's CLI configured locally with permissions to create resources like MySQL databases and networks.\n\nConfiguration file templates\n\nBase configuration file\n\nBelow is the general structure of a config file. Use this as a base and then add any cloud-specific parameters from the sections below.\n\n# Name of the server deployment.\n\nname:\n\n# The server provider type, one of aws, gcp or azure.\n\nprovider:\n\n# The path to the kubectl config file to use for deployment.\n\nkubectl_config_path:\n\n# The Kubernetes namespace to deploy the ZenML server to.\n\nnamespace: zenmlserver\n\n# The path to the ZenML server helm chart to use for deployment.\n\nhelm_chart:\n\n# The repository and tag to use for the ZenML server Docker image.\n\nzenmlserver_image_repo: zenmldocker/zenml\n\nzenmlserver_image_tag: latest\n\n# Whether to deploy an nginx ingress controller as part of the deployment.\n\ncreate_ingress_controller: true\n\n# Whether to use TLS for the ingress.\n\ningress_tls: true\n\n# Whether to generate self-signed TLS certificates for the ingress.\n\ningress_tls_generate_certs: true\n\n# The name of the Kubernetes secret to use for the ingress.\n\ningress_tls_secret_name: zenml-tls-certs\n\n# The ingress controller's IP address. The ZenML server will be exposed on a subdomain of this IP. For AWS, if you have a hostname instead, use the following command to get the IP address: `dig +short <hostname>`.\n\ningress_controller_ip:\n\n# Whether to create a SQL database service as part of the recipe.\n\ndeploy_db: true\n\n# The username and password for the database.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Evaluation
Metrics
Information Retrieval
- Dataset:
dim_384
- Evaluated with
InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.3012 |
cosine_accuracy@3 | 0.5422 |
cosine_accuracy@5 | 0.6747 |
cosine_accuracy@10 | 0.741 |
cosine_precision@1 | 0.3012 |
cosine_precision@3 | 0.1807 |
cosine_precision@5 | 0.1349 |
cosine_precision@10 | 0.0741 |
cosine_recall@1 | 0.3012 |
cosine_recall@3 | 0.5422 |
cosine_recall@5 | 0.6747 |
cosine_recall@10 | 0.741 |
cosine_ndcg@10 | 0.5192 |
cosine_mrr@10 | 0.4479 |
cosine_map@100 | 0.4579 |
Information Retrieval
- Dataset:
dim_256
- Evaluated with
InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.2952 |
cosine_accuracy@3 | 0.5301 |
cosine_accuracy@5 | 0.6325 |
cosine_accuracy@10 | 0.7349 |
cosine_precision@1 | 0.2952 |
cosine_precision@3 | 0.1767 |
cosine_precision@5 | 0.1265 |
cosine_precision@10 | 0.0735 |
cosine_recall@1 | 0.2952 |
cosine_recall@3 | 0.5301 |
cosine_recall@5 | 0.6325 |
cosine_recall@10 | 0.7349 |
cosine_ndcg@10 | 0.5119 |
cosine_mrr@10 | 0.441 |
cosine_map@100 | 0.4503 |
Information Retrieval
- Dataset:
dim_128
- Evaluated with
InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.2711 |
cosine_accuracy@3 | 0.512 |
cosine_accuracy@5 | 0.6145 |
cosine_accuracy@10 | 0.6988 |
cosine_precision@1 | 0.2711 |
cosine_precision@3 | 0.1707 |
cosine_precision@5 | 0.1229 |
cosine_precision@10 | 0.0699 |
cosine_recall@1 | 0.2711 |
cosine_recall@3 | 0.512 |
cosine_recall@5 | 0.6145 |
cosine_recall@10 | 0.6988 |
cosine_ndcg@10 | 0.4884 |
cosine_mrr@10 | 0.4208 |
cosine_map@100 | 0.4308 |
Information Retrieval
- Dataset:
dim_64
- Evaluated with
InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.253 |
cosine_accuracy@3 | 0.4578 |
cosine_accuracy@5 | 0.5542 |
cosine_accuracy@10 | 0.6566 |
cosine_precision@1 | 0.253 |
cosine_precision@3 | 0.1526 |
cosine_precision@5 | 0.1108 |
cosine_precision@10 | 0.0657 |
cosine_recall@1 | 0.253 |
cosine_recall@3 | 0.4578 |
cosine_recall@5 | 0.5542 |
cosine_recall@10 | 0.6566 |
cosine_ndcg@10 | 0.4466 |
cosine_mrr@10 | 0.3805 |
cosine_map@100 | 0.3906 |
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,490 training samples
- Columns:
positive
andanchor
- Approximate statistics based on the first 1000 samples:
positive anchor type string string details - min: 9 tokens
- mean: 21.12 tokens
- max: 49 tokens
- min: 21 tokens
- mean: 240.72 tokens
- max: 256 tokens
- Samples:
positive anchor Can you provide the details for the Azure service principal with the ID 273d2812-2643-4446-82e6-6098b8ccdaa4?
ββ βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨
β ID β 273d2812-2643-4446-82e6-6098b8ccdaa4 β
β βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨
β NAME β azure-service-principal β
β βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨
β TYPE β π¦ azure β
β βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨
β AUTH METHOD β service-principal β
β βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨
β RESOURCE TYPES β π¦ azure-generic, π¦ blob-container, π kubernetes-cluster, π³ docker-registry β
β βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨
β RESOURCE NAME β β
β βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨
β SECRET ID β 50d9f230-c4ea-400e-b2d7-6b52ba2a6f90 β
β βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨
β SESSION DURATION β N/A β
β βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨
β EXPIRES IN β N/A β
β βββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ¨What are the new features introduced in ZenML 0.20.0 regarding the Metadata Store?
ed to update the way they are registered in ZenML.the updated ZenML server provides a new and improved collaborative experience. When connected to a ZenML server, you can now share your ZenML Stacks and Stack Components with other users. If you were previously using the ZenML Profiles or the ZenML server to share your ZenML Stacks, you should switch to the new ZenML server and Dashboard and update your existing workflows to reflect the new features.
ZenML takes over the Metadata Store role
ZenML can now run as a server that can be accessed via a REST API and also comes with a visual user interface (called the ZenML Dashboard). This server can be deployed in arbitrary environments (local, on-prem, via Docker, on AWS, GCP, Azure etc.) and supports user management, workspace scoping, and more.
The release introduces a series of commands to facilitate managing the lifecycle of the ZenML server and to access the pipeline and pipeline run information:
zenml connect / disconnect / down / up / logs / status can be used to configure your client to connect to a ZenML server, to start a local ZenML Dashboard or to deploy a ZenML server to a cloud environment. For more information on how to use these commands, see the ZenML deployment documentation.
zenml pipeline list / runs / delete can be used to display information and about and manage your pipelines and pipeline runs.
In ZenML 0.13.2 and earlier versions, information about pipelines and pipeline runs used to be stored in a separate stack component called the Metadata Store. Starting with 0.20.0, the role of the Metadata Store is now taken over by ZenML itself. This means that the Metadata Store is no longer a separate component in the ZenML architecture, but rather a part of the ZenML core, located wherever ZenML is deployed: locally on your machine or running remotely as a server.Which environment variables should I set to use the Azure Service Connector authentication method in ZenML?
-client-id","client_secret": "my-client-secret"}).Note: The remaining configuration options are deprecated and may be removed in a future release. Instead, you should set the ZENML_SECRETS_STORE_AUTH_METHOD and ZENML_SECRETS_STORE_AUTH_CONFIG variables to use the Azure Service Connector authentication method.
ZENML_SECRETS_STORE_AZURE_CLIENT_ID: The Azure application service principal client ID to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable.
ZENML_SECRETS_STORE_AZURE_CLIENT_SECRET: The Azure application service principal client secret to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable.
ZENML_SECRETS_STORE_AZURE_TENANT_ID: The Azure application service principal tenant ID to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable.
These configuration options are only relevant if you're using Hashicorp Vault as the secrets store backend.
ZENML_SECRETS_STORE_TYPE: Set this to hashicorp in order to set this type of secret store.
ZENML_SECRETS_STORE_VAULT_ADDR: The URL of the HashiCorp Vault server to connect to. NOTE: this is the same as setting the VAULT_ADDR environment variable.
ZENML_SECRETS_STORE_VAULT_TOKEN: The token to use to authenticate with the HashiCorp Vault server. NOTE: this is the same as setting the VAULT_TOKEN environment variable.
ZENML_SECRETS_STORE_VAULT_NAMESPACE: The Vault Enterprise namespace. Not required for Vault OSS. NOTE: this is the same as setting the VAULT_NAMESPACE environment variable. - Loss:
MatryoshkaLoss
with these parameters:{ "loss": "MultipleNegativesRankingLoss", "matryoshka_dims": [ 384, 256, 128, 64 ], "matryoshka_weights": [ 1, 1, 1, 1 ], "n_dims_per_step": -1 }
Training Hyperparameters
Non-Default Hyperparameters
eval_strategy
: epochper_device_train_batch_size
: 32per_device_eval_batch_size
: 16gradient_accumulation_steps
: 16learning_rate
: 2e-05num_train_epochs
: 4lr_scheduler_type
: cosinewarmup_ratio
: 0.1bf16
: Truetf32
: Trueload_best_model_at_end
: Trueoptim
: adamw_torch_fusedbatch_sampler
: no_duplicates
All Hyperparameters
Click to expand
overwrite_output_dir
: Falsedo_predict
: Falseeval_strategy
: epochprediction_loss_only
: Trueper_device_train_batch_size
: 32per_device_eval_batch_size
: 16per_gpu_train_batch_size
: Noneper_gpu_eval_batch_size
: Nonegradient_accumulation_steps
: 16eval_accumulation_steps
: Nonelearning_rate
: 2e-05weight_decay
: 0.0adam_beta1
: 0.9adam_beta2
: 0.999adam_epsilon
: 1e-08max_grad_norm
: 1.0num_train_epochs
: 4max_steps
: -1lr_scheduler_type
: cosinelr_scheduler_kwargs
: {}warmup_ratio
: 0.1warmup_steps
: 0log_level
: passivelog_level_replica
: warninglog_on_each_node
: Truelogging_nan_inf_filter
: Truesave_safetensors
: Truesave_on_each_node
: Falsesave_only_model
: Falserestore_callback_states_from_checkpoint
: Falseno_cuda
: Falseuse_cpu
: Falseuse_mps_device
: Falseseed
: 42data_seed
: Nonejit_mode_eval
: Falseuse_ipex
: Falsebf16
: Truefp16
: Falsefp16_opt_level
: O1half_precision_backend
: autobf16_full_eval
: Falsefp16_full_eval
: Falsetf32
: Truelocal_rank
: 0ddp_backend
: Nonetpu_num_cores
: Nonetpu_metrics_debug
: Falsedebug
: []dataloader_drop_last
: Falsedataloader_num_workers
: 0dataloader_prefetch_factor
: Nonepast_index
: -1disable_tqdm
: Trueremove_unused_columns
: Truelabel_names
: Noneload_best_model_at_end
: Trueignore_data_skip
: Falsefsdp
: []fsdp_min_num_params
: 0fsdp_config
: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}fsdp_transformer_layer_cls_to_wrap
: Noneaccelerator_config
: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}deepspeed
: Nonelabel_smoothing_factor
: 0.0optim
: adamw_torch_fusedoptim_args
: Noneadafactor
: Falsegroup_by_length
: Falselength_column_name
: lengthddp_find_unused_parameters
: Noneddp_bucket_cap_mb
: Noneddp_broadcast_buffers
: Falsedataloader_pin_memory
: Truedataloader_persistent_workers
: Falseskip_memory_metrics
: Trueuse_legacy_prediction_loop
: Falsepush_to_hub
: Falseresume_from_checkpoint
: Nonehub_model_id
: Nonehub_strategy
: every_savehub_private_repo
: Falsehub_always_push
: Falsegradient_checkpointing
: Falsegradient_checkpointing_kwargs
: Noneinclude_inputs_for_metrics
: Falseeval_do_concat_batches
: Truefp16_backend
: autopush_to_hub_model_id
: Nonepush_to_hub_organization
: Nonemp_parameters
:auto_find_batch_size
: Falsefull_determinism
: Falsetorchdynamo
: Noneray_scope
: lastddp_timeout
: 1800torch_compile
: Falsetorch_compile_backend
: Nonetorch_compile_mode
: Nonedispatch_batches
: Nonesplit_batches
: Noneinclude_tokens_per_second
: Falseinclude_num_input_tokens_seen
: Falseneftune_noise_alpha
: Noneoptim_target_modules
: Nonebatch_eval_metrics
: Falsebatch_sampler
: no_duplicatesmulti_dataset_batch_sampler
: proportional
Training Logs
Epoch | Step | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_384_cosine_map@100 | dim_64_cosine_map@100 |
---|---|---|---|---|---|
0.6667 | 1 | 0.3800 | 0.3986 | 0.4149 | 0.3471 |
2.0 | 3 | 0.4194 | 0.4473 | 0.4557 | 0.3762 |
2.6667 | 4 | 0.4308 | 0.4503 | 0.4579 | 0.3906 |
- The bold row denotes the saved checkpoint.
Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.3.1+cu121
- Accelerate: 0.31.0
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
- Downloads last month
- 8
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.
Model tree for strickvl/finetuned-all-MiniLM-L6-v2
Base model
sentence-transformers/all-MiniLM-L6-v2Evaluation results
- Cosine Accuracy@1 on dim 384self-reported0.301
- Cosine Accuracy@3 on dim 384self-reported0.542
- Cosine Accuracy@5 on dim 384self-reported0.675
- Cosine Accuracy@10 on dim 384self-reported0.741
- Cosine Precision@1 on dim 384self-reported0.301
- Cosine Precision@3 on dim 384self-reported0.181
- Cosine Precision@5 on dim 384self-reported0.135
- Cosine Precision@10 on dim 384self-reported0.074
- Cosine Recall@1 on dim 384self-reported0.301
- Cosine Recall@3 on dim 384self-reported0.542