metadata
base_model: Snowflake/snowflake-arctic-embed-m
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:600
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: >-
What are the potential risks associated with the impersonation and
cyber-attacks mentioned in the context?
sentences:
- >-
Technology Engagement Center
Uber Technologies
University of Pittsburgh
Undergraduate Student
Collaborative
Upturn
US Technology Policy Committee
of the Association of Computing
Machinery
Virginia Puccio
Visar Berisha and Julie Liss
XR Association
XR Safety Initiative
• As an additional effort to reach out to stakeholders regarding the
RFI, OSTP conducted two listening sessions
for members of the public. The listening sessions together drew upwards
of 300 participants. The Science and
Technology Policy Institute produced a synopsis of both the RFI
submissions and the feedback at the listening
sessions.115
61
- >-
across all subgroups, which could leave the groups facing
underperformance with worse outcomes than
if no GAI system were used. Disparate or reduced performance for
lower-resource languages also
presents challenges to model adoption, inclusion, and accessibility, and
may make preservation of
endangered languages more difficult if GAI systems become embedded in
everyday processes that would
otherwise have been opportunities to use these languages.
Bias is mutually reinforcing with the problem of undesired
homogenization, in which GAI systems
produce skewed distributions of outputs that are overly uniform (for
example, repetitive aesthetic styles
- >-
impersonation, cyber-attacks, and weapons creation.
CBRN Information or Capabilities;
Information Security
MS-2.6-007 Regularly evaluate GAI system vulnerabilities to possible
circumvention of safety
measures.
CBRN Information or Capabilities;
Information Security
AI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts,
Operation and Monitoring, TEVV
- source_sentence: >-
What techniques are suggested to assess and manage statistical biases
related to GAI content provenance?
sentences:
- >-
2
This work was informed by public feedback and consultations with diverse
stakeholder groups as part of NIST’s
Generative AI Public Working Group (GAI PWG). The GAI PWG was an open,
transparent, and collaborative
process, facilitated via a virtual workspace, to obtain multistakeholder
input on GAI risk management and to
inform NIST’s approach.
The focus of the GAI PWG was limited to four primary considerations
relevant to GAI: Governance, Content
Provenance, Pre-deployment Testing, and Incident Disclosure (further
described in Appendix A). As such, the
suggested actions in this document primarily address these
considerations.
Future revisions of this profile will include additional AI RMF
subcategories, risks, and suggested actions based
on additional considerations of GAI as the space evolves and empirical
evidence indicates additional risks. A
glossary of terms pertinent to GAI risk management will be developed and
hosted on NIST’s Trustworthy &
- >-
30
MEASURE 2.2: Evaluations involving human subjects meet applicable
requirements (including human subject protection) and are
representative of the relevant population.
Action ID
Suggested Action
GAI Risks
MS-2.2-001 Assess and manage statistical biases related to GAI content
provenance through
techniques such as re-sampling, re-weighting, or adversarial training.
Information Integrity; Information
Security; Harmful Bias and
Homogenization
MS-2.2-002
Document how content provenance data is tracked and how that data
interacts
with privacy and security. Consider: Anonymizing data to protect the
privacy of
human subjects; Leveraging privacy output filters; Removing any
personally
identifiable information (PII) to prevent potential harm or misuse.
Data Privacy; Human AI
Configuration; Information
Integrity; Information Security;
Dangerous, Violent, or Hateful
Content
MS-2.2-003 Provide human subjects with options to withdraw participation
or revoke their
- >-
humans (e.g., intelligence tests, professional licensing exams) does not
guarantee GAI system validity or
reliability in those domains. Similarly, jailbreaking or prompt
engineering tests may not systematically
assess validity or reliability risks.
Measurement gaps can arise from mismatches between laboratory and
real-world settings. Current
testing approaches often remain focused on laboratory conditions or
restricted to benchmark test
datasets and in silico techniques that may not extrapolate well to—or
directly assess GAI impacts in real-
world conditions. For example, current measurement gaps for GAI make it
difficult to precisely estimate
its potential ecosystem-level or longitudinal risks and related
political, social, and economic impacts.
Gaps between benchmarks and real-world use of GAI systems may likely be
exacerbated due to prompt
sensitivity and broad heterogeneity of contexts of use.
A.1.5. Structured Public Feedback
- source_sentence: >-
How does the absence of an explanation regarding data usage affect
parents' ability to contest decisions made in child maltreatment
assessments?
sentences:
- >-
62. See, e.g., Federal Trade Commission. Data Brokers: A Call for
Transparency and Accountability. May
2014.
https://www.ftc.gov/system/files/documents/reports/data-brokers-call-transparency-accountability
report-federal-trade-commission-may-2014/140527databrokerreport.pdf;
Cathy O’Neil.
Weapons of Math Destruction. Penguin Books. 2017.
https://en.wikipedia.org/wiki/Weapons_of_Math_Destruction
63. See, e.g., Rachel Levinson-Waldman, Harsha Pandurnga, and Faiza
Patel. Social Media Surveillance by
the U.S. Government. Brennan Center for Justice. Jan. 7, 2022.
https://www.brennancenter.org/our-work/research-reports/social-media-surveillance-us-government;
Shoshana Zuboff. The Age of Surveillance Capitalism: The Fight for a
Human Future at the New Frontier of
Power. Public Affairs. 2019.
64. Angela Chen. Why the Future of Life Insurance May Depend on Your
Online Presence. The Verge. Feb.
7, 2019.
- >-
NOTICE &
EXPLANATION
WHY THIS PRINCIPLE IS IMPORTANT
This section provides a brief summary of the problems which the
principle seeks to address and protect
against, including illustrative examples.
Automated systems now determine opportunities, from employment to
credit, and directly shape the American
public’s experiences, from the courtroom to online classrooms, in ways
that profoundly impact people’s lives. But this
expansive impact is not always visible. An applicant might not know
whether a person rejected their resume or a
hiring algorithm moved them to the bottom of the list. A defendant in
the courtroom might not know if a judge deny
ing their bail is informed by an automated system that labeled them
“high risk.” From correcting errors to contesting
decisions, people are often denied the knowledge they need to address
the impact of automated systems on their lives.
- >-
ever being notified that data was being collected and used as part of an
algorithmic child maltreatment
risk assessment.84 The lack of notice or an explanation makes it harder
for those performing child
maltreatment assessments to validate the risk assessment and denies
parents knowledge that could help them
contest a decision.
41
- source_sentence: >-
How should automated systems be tested to ensure they are free from
algorithmic discrimination?
sentences:
- >-
Homogenization? arXiv. https://arxiv.org/pdf/2211.13972
Boyarskaya, M. et al. (2020) Overcoming Failures of Imagination in AI
Infused System Development and
Deployment. arXiv. https://arxiv.org/pdf/2011.13416
Browne, D. et al. (2023) Securing the AI Pipeline. Mandiant.
https://www.mandiant.com/resources/blog/securing-ai-pipeline
Burgess, M. (2024) Generative AI’s Biggest Security Flaw Is Not Easy to
Fix. WIRED.
https://www.wired.com/story/generative-ai-prompt-injection-hacking/
Burtell, M. et al. (2024) The Surprising Power of Next Word Prediction:
Large Language Models
Explained, Part 1. Georgetown Center for Security and Emerging
Technology.
https://cset.georgetown.edu/article/the-surprising-power-of-next-word-prediction-large-language-
models-explained-part-1/
Canadian Centre for Cyber Security (2023) Generative artificial
intelligence (AI) - ITSAP.00.041.
https://www.cyber.gc.ca/en/guidance/generative-artificial-intelligence-ai-itsap00041
- >-
relevant biological and chemical threat knowledge and information is
often publicly accessible, LLMs
could facilitate its analysis or synthesis, particularly by individuals
without formal scientific training or
expertise.
Recent research on this topic found that LLM outputs regarding
biological threat creation and attack
planning provided minimal assistance beyond traditional search engine
queries, suggesting that state-of-
the-art LLMs at the time these studies were conducted do not
substantially increase the operational
likelihood of such an attack. The physical synthesis development,
production, and use of chemical or
biological agents will continue to require both applicable expertise and
supporting materials and
infrastructure. The impact of GAI on chemical or biological agent misuse
will depend on what the key
barriers for malicious actors are (e.g., whether information access is
one such barrier), and how well GAI
can help actors address those barriers.
- >-
WHAT SHOULD BE EXPECTED OF AUTOMATED SYSTEMS
The expectations for automated systems are meant to serve as a blueprint
for the development of additional
technical standards and practices that are tailored for particular
sectors and contexts.
Any automated system should be tested to help ensure it is free from
algorithmic discrimination before it can be
sold or used. Protection against algorithmic discrimination should
include designing to ensure equity, broadly
construed. Some algorithmic discrimination is already prohibited under
existing anti-discrimination law. The
expectations set out below describe proactive technical and policy steps
that can be taken to not only
reinforce those legal protections but extend beyond them to ensure
equity for underserved communities48
even in circumstances where a specific legal protection may not be
clearly established. These protections
- source_sentence: >-
What rights do applicants have if their application for credit is denied
according to the CFPB?
sentences:
- |-
listed organizations and individuals:
Accenture
Access Now
ACT | The App Association
AHIP
AIethicist.org
Airlines for America
Alliance for Automotive Innovation
Amelia Winger-Bearskin
American Civil Liberties Union
American Civil Liberties Union of
Massachusetts
American Medical Association
ARTICLE19
Attorneys General of the District of
Columbia, Illinois, Maryland,
Michigan, Minnesota, New York,
North Carolina, Oregon, Vermont,
and Washington
Avanade
Aware
Barbara Evans
Better Identity Coalition
Bipartisan Policy Center
Brandon L. Garrett and Cynthia
Rudin
Brian Krupp
Brooklyn Defender Services
BSA | The Software Alliance
Carnegie Mellon University
Center for Democracy &
Technology
Center for New Democratic
Processes
Center for Research and Education
on Accessible Technology and
Experiences at University of
Washington, Devva Kasnitz, L Jean
Camp, Jonathan Lazar, Harry
Hochheiser
Center on Privacy & Technology at
Georgetown Law
Cisco Systems
- >-
even if the inferences are not accurate (e.g., confabulations), and
especially if they reveal information
that the individual considers sensitive or that is used to disadvantage
or harm them.
Beyond harms from information exposure (such as extortion or dignitary
harm), wrong or inappropriate
inferences of PII can contribute to downstream or secondary harmful
impacts. For example, predictive
inferences made by GAI models based on PII or protected attributes can
contribute to adverse decisions,
leading to representational or allocative harms to individuals or groups
(see Harmful Bias and
Homogenization below).
- >-
information in their credit report." The CFPB has also asserted that
"[t]he law gives every applicant the right to
a specific explanation if their application for credit was denied, and
that right is not diminished simply because
a company uses a complex algorithm that it doesn't understand."92 Such
explanations illustrate a shared value
that certain decisions need to be explained.
A California law requires that warehouse employees are provided with
notice and explana-
tion about quotas, potentially facilitated by automated systems, that
apply to them. Warehous-
ing employers in California that use quota systems (often facilitated by
algorithmic monitoring systems) are
required to provide employees with a written description of each quota
that applies to the employee, including
“quantified number of tasks to be performed or materials to be produced
or handled, within the defined
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.98
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 1
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 1
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 1
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.98
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.3333333333333334
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19999999999999996
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09999999999999998
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.98
name: Cosine Recall@1
- type: cosine_recall@3
value: 1
name: Cosine Recall@3
- type: cosine_recall@5
value: 1
name: Cosine Recall@5
- type: cosine_recall@10
value: 1
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9913092975357145
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9883333333333333
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9883333333333334
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.98
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 1
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 1
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 1
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.98
name: Dot Precision@1
- type: dot_precision@3
value: 0.3333333333333334
name: Dot Precision@3
- type: dot_precision@5
value: 0.19999999999999996
name: Dot Precision@5
- type: dot_precision@10
value: 0.09999999999999998
name: Dot Precision@10
- type: dot_recall@1
value: 0.98
name: Dot Recall@1
- type: dot_recall@3
value: 1
name: Dot Recall@3
- type: dot_recall@5
value: 1
name: Dot Recall@5
- type: dot_recall@10
value: 1
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.9913092975357145
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.9883333333333333
name: Dot Mrr@10
- type: dot_map@100
value: 0.9883333333333334
name: Dot Map@100
SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-m. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-m
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
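The pooling and normalization stages listed above can be illustrated without loading the model. The following is a minimal sketch, assuming random numbers as a stand-in for the transformer output and a toy hidden size of 8 instead of 768; the helper names are hypothetical and are not part of the library. It shows CLS-token pooling followed by L2 normalization, and why dot product and cosine similarity agree on the resulting vectors.

```python
import math
import random


def cls_pool(token_embeddings):
    """Keep only the first ([CLS]) token vector of each sequence."""
    return [sequence[0] for sequence in token_embeddings]


def l2_normalize(vector):
    """Scale a vector to unit length, as the Normalize() stage does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


random.seed(0)
# Stand-in for the transformer output: 2 sequences x 4 tokens x 8 hidden units.
# (Assumption for illustration; the real model emits 768 hidden units.)
token_embeddings = [
    [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
    for _ in range(2)
]

embeddings = [l2_normalize(v) for v in cls_pool(token_embeddings)]
dot_score = dot(embeddings[0], embeddings[1])
cosine_score = cosine(embeddings[0], embeddings[1])
print(dot_score, cosine_score)
```

Because every output vector has unit length, the two scores differ only by floating-point error, which is consistent with the identical cosine and dot metrics reported in the Evaluation section.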
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("vincha77/finetuned_arctic")
# Run inference
sentences = [
'What rights do applicants have if their application for credit is denied according to the CFPB?',
'information in their credit report." The CFPB has also asserted that "[t]he law gives every applicant the right to \na specific explanation if their application for credit was denied, and that right is not diminished simply because \na company uses a complex algorithm that it doesn\'t understand."92 Such explanations illustrate a shared value \nthat certain decisions need to be explained. \nA California law requires that warehouse employees are provided with notice and explana-\ntion about quotas, potentially facilitated by automated systems, that apply to them. Warehous-\ning employers in California that use quota systems (often facilitated by algorithmic monitoring systems) are \nrequired to provide employees with a written description of each quota that applies to the employee, including \n“quantified number of tasks to be performed or materials to be produced or handled, within the defined',
'even if the inferences are not accurate (e.g., confabulations), and especially if they reveal information \nthat the individual considers sensitive or that is used to disadvantage or harm them. \nBeyond harms from information exposure (such as extortion or dignitary harm), wrong or inappropriate \ninferences of PII can contribute to downstream or secondary harmful impacts. For example, predictive \ninferences made by GAI models based on PII or protected attributes can contribute to adverse decisions, \nleading to representational or allocative harms to individuals or groups (see Harmful Bias and \nHomogenization below).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
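Since the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128, and 64, the leading components of each embedding are intended to remain useful on their own. The sketch below shows one way to shorten an embedding after encoding; it uses a hypothetical constant vector in place of a real row of `model.encode(...)`, and the helper name is an assumption. This card reports metrics only at the full 768 dimensions, so retrieval quality at smaller sizes should be measured before relying on it.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


# Hypothetical stand-in for one embedding: a unit-length 768-dimensional vector.
full = [1.0 / math.sqrt(768)] * 768

small = truncate_and_normalize(full, 256)
print(len(small), math.sqrt(sum(x * x for x in small)))
```

Re-normalizing after truncation keeps dot product and cosine similarity interchangeable on the shortened vectors. Recent releases of Sentence Transformers also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which may be more convenient than truncating by hand.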
Evaluation
Metrics
Information Retrieval
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.98 |
cosine_accuracy@3 | 1.0 |
cosine_accuracy@5 | 1.0 |
cosine_accuracy@10 | 1.0 |
cosine_precision@1 | 0.98 |
cosine_precision@3 | 0.3333 |
cosine_precision@5 | 0.2 |
cosine_precision@10 | 0.1 |
cosine_recall@1 | 0.98 |
cosine_recall@3 | 1.0 |
cosine_recall@5 | 1.0 |
cosine_recall@10 | 1.0 |
cosine_ndcg@10 | 0.9913 |
cosine_mrr@10 | 0.9883 |
cosine_map@100 | 0.9883 |
dot_accuracy@1 | 0.98 |
dot_accuracy@3 | 1.0 |
dot_accuracy@5 | 1.0 |
dot_accuracy@10 | 1.0 |
dot_precision@1 | 0.98 |
dot_precision@3 | 0.3333 |
dot_precision@5 | 0.2 |
dot_precision@10 | 0.1 |
dot_recall@1 | 0.98 |
dot_recall@3 | 1.0 |
dot_recall@5 | 1.0 |
dot_recall@10 | 1.0 |
dot_ndcg@10 | 0.9913 |
dot_mrr@10 | 0.9883 |
dot_map@100 | 0.9883 |
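The relationship between these numbers is easier to see with a small worked example. The sketch below assumes every query has exactly one relevant passage, which is consistent with precision@3 being 0.3333 while accuracy@3 is 1.0. The rank list is hypothetical: the card does not state the evaluation-set size, and 100 queries with 98 hits at rank 1, one at rank 2, and one at rank 3 is simply one distribution that reproduces the reported values.

```python
import math


def ir_metrics(ranks, k):
    """Metrics at cutoff k when each query has exactly one relevant passage.

    `ranks` holds the 1-based rank of that passage for every query.
    """
    n = len(ranks)
    hits = [1 if r <= k else 0 for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        # At most one relevant passage can appear in the top k.
        "precision": sum(h / k for h in hits) / n,
        # One relevant passage per query, so recall@k equals accuracy@k.
        "recall": accuracy,
        "mrr": sum(1 / r for r in ranks if r <= k) / n,
        # The ideal DCG is 1 when there is a single relevant passage.
        "ndcg": sum(1 / math.log2(r + 1) for r in ranks if r <= k) / n,
    }


# Hypothetical ranks chosen for illustration, not taken from the card.
ranks = [1] * 98 + [2, 3]

at_1 = ir_metrics(ranks, 1)
at_3 = ir_metrics(ranks, 3)
at_10 = ir_metrics(ranks, 10)
print(at_1["accuracy"], round(at_3["precision"], 4))
print(round(at_10["mrr"], 4), round(at_10["ndcg"], 4))
```

Under this assumption precision@k can never exceed 1/k, so the low precision@5 and precision@10 values in the table reflect the one-passage-per-query setup rather than poor ranking.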
Training Details
Training Dataset
Unnamed Dataset
- Size: 600 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 600 samples:
Field | sentence_0 | sentence_1 |
---|---|---|
type | string | string |
details | min: 12 tokens, mean: 21.21 tokens, max: 39 tokens | min: 21 tokens, mean: 182.02 tokens, max: 512 tokens |
- Samples:
sentence_0 | sentence_1 |
---|---|
What are the responsibilities of AI Actors in monitoring reported issues related to GAI system performance? | 45 MG-4.1-007 Verify that AI Actors responsible for monitoring reported issues can effectively evaluate GAI system performance including the application of content provenance data tracking techniques, and promptly escalate issues for response. Human-AI Configuration; Information Integrity AI Actor Tasks: AI Deployment, Affected Individuals and Communities, Domain Experts, End-Users, Human Factors, Operation and Monitoring MANAGE 4.2: Measurable activities for continual improvements are integrated into AI system updates and include regular engagement with interested parties, including relevant AI Actors. Action ID Suggested Action GAI Risks MG-4.2-001 Conduct regular monitoring of GAI systems and publish reports detailing the performance, feedback received, and improvements made. Harmful Bias and Homogenization MG-4.2-002 Practice and follow incident response plans for addressing the generation of |
How are measurable activities for continual improvements integrated into AI system updates according to the context provided? | 45 MG-4.1-007 Verify that AI Actors responsible for monitoring reported issues can effectively evaluate GAI system performance including the application of content provenance data tracking techniques, and promptly escalate issues for response. Human-AI Configuration; Information Integrity AI Actor Tasks: AI Deployment, Affected Individuals and Communities, Domain Experts, End-Users, Human Factors, Operation and Monitoring MANAGE 4.2: Measurable activities for continual improvements are integrated into AI system updates and include regular engagement with interested parties, including relevant AI Actors. Action ID Suggested Action GAI Risks MG-4.2-001 Conduct regular monitoring of GAI systems and publish reports detailing the performance, feedback received, and improvements made. Harmful Bias and Homogenization MG-4.2-002 Practice and follow incident response plans for addressing the generation of |
What is the main function of the app discussed in Samantha Cole's article from June 26, 2019? | them 10. Samantha Cole. This Horrifying App Undresses a Photo of Any Woman With a Single Click. Motherboard. June 26, 2019. https://www.vice.com/en/article/kzm59x/deepnude-app-creates-fake-nudes-of-any-woman 11. Lauren Kaori Gurley. Amazon’s AI Cameras Are Punishing Drivers for Mistakes They Didn’t Make. Motherboard. Sep. 20, 2021. https://www.vice.com/en/article/88npjv/amazons-ai-cameras-are-punishing drivers-for-mistakes-they-didnt-make 63 |
- Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [768, 512, 256, 128, 64],
        "matryoshka_weights": [1, 1, 1, 1, 1],
        "n_dims_per_step": -1
    }
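How these two losses combine can be sketched in plain Python. This is an illustration of the general technique, not the library's implementation: the in-batch-negatives ranking loss treats `positives[i]` as the correct match for `anchors[i]` and every other positive in the batch as a negative, and the Matryoshka wrapper sums that loss over truncated prefixes of the embeddings. The similarity scale of 20 is an assumption (it is not listed in the parameters above), and the 4-dimensional toy vectors with dims `[4, 2]` stand in for the real `[768, 512, 256, 128, 64]`.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """In-batch-negatives ranking loss (scale value is an assumption)."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        log_sum_exp = math.log(sum(math.exp(z) for z in logits))
        losses.append(log_sum_exp - logits[i])  # cross-entropy with label i
    return sum(losses) / len(losses)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over truncated embedding prefixes."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [v[:dim] for v in anchors]
        truncated_positives = [v[:dim] for v in positives]
        total += weight * mnr_loss(truncated_anchors, truncated_positives)
    return total


# Toy batch of two (question, passage) pairs; values are made up.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned < swapped)
```

With equal weights, as in the parameters above, each truncated size contributes equally, which pushes the model to keep the most useful information in the leading dimensions.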
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 5
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | cosine_map@100 |
---|---|---|
1.0 | 38 | 0.965 |
1.3158 | 50 | 0.9783 |
2.0 | 76 | 0.9767 |
2.6316 | 100 | 0.9833 |
3.0 | 114 | 0.9883 |
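The fractional epoch values in this table follow from the dataset size and batch size reported above. A short sketch of the arithmetic, assuming training ran on a single device (the card does not say so explicitly) with gradient_accumulation_steps of 1 and dataloader_drop_last set to False:

```python
import math

train_samples = 600   # "Size: 600 training samples"
batch_size = 16       # per_device_train_batch_size

# The last partial batch is kept because dataloader_drop_last is False.
steps_per_epoch = math.ceil(train_samples / batch_size)


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


logged_steps = [38, 50, 76, 100, 114]
epochs = [epoch_at(step) for step in logged_steps]
print(steps_per_epoch, epochs)
```

Rows at steps 38, 76, and 114 are end-of-epoch evaluations, while steps 50 and 100 fall at regular 50-step intervals inside an epoch.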
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.1
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}