metadata
base_model: Snowflake/snowflake-arctic-embed-l
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:3430
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: >-
What are some illustrative cases that show the implementation of the AI
Bill of Rights?
sentences:
- >-
SECTION TITLE
APPENDIX
Listening to the American People
The White House Office of Science and Technology Policy (OSTP) led a
yearlong process to seek and distill
input from people across the country – from impacted communities to
industry stakeholders to
technology developers to other experts across fields and sectors, as
well as policymakers across the Federal
government – on the issue of algorithmic and data-driven harms and
potential remedies. Through panel
discussions, public listening sessions, private meetings, a formal
request for information, and input to a
publicly accessible and widely-publicized email address, people across
the United States spoke up about
both the promises and potential harms of these technologies, and played
a central role in shaping the
Blueprint for an AI Bill of Rights.
Panel Discussions to Inform the Blueprint for An AI Bill of Rights
OSTP co-hosted a series of six panel discussions in collaboration with
the Center for American Progress,
- >-
existing human performance considered as a performance baseline for the
algorithm to meet pre-deployment,
and as a lifecycle minimum performance standard. Decision possibilities
resulting from performance testing
should include the possibility of not deploying the system.
Risk identification and mitigation. Before deployment, and in a
proactive and ongoing manner, poten
tial risks of the automated system should be identified and mitigated.
Identified risks should focus on the
potential for meaningful impact on people’s rights, opportunities, or
access and include those to impacted
communities that may not be direct users of the automated system, risks
resulting from purposeful misuse of
the system, and other concerns identified via the consultation process.
Assessment and, where possible, mea
surement of the impact of risks should be included and balanced such
that high impact risks receive attention
- >-
confidence that their rights, opportunities, and access as well as their
expectations about technologies are respected.
3
HOW THESE PRINCIPLES CAN MOVE INTO PRACTICE:
This section provides real-life examples of how these guiding principles
can become reality, through laws, policies, and practices.
It describes practical technical and sociotechnical approaches to
protecting rights, opportunities, and access.
The examples provided are not critiques or endorsements, but rather are
offered as illustrative cases to help
provide a concrete vision for actualizing the Blueprint for an AI Bill
of Rights. Effectively implementing these
processes require the cooperation of and collaboration among industry,
civil society, researchers, policymakers,
technologists, and the public.
14
- source_sentence: What are the potential impacts of automated systems on data privacy?
sentences:
- >-
https://arxiv.org/pdf/2305.17493v2
Smith, A. et al. (2023) Hallucination or Confabulation? Neuroanatomy as
metaphor in Large Language
Models. PLOS Digital Health.
https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000388
Soice, E. et al. (2023) Can large language models democratize access to
dual-use biotechnology? arXiv.
https://arxiv.org/abs/2306.03809
Solaiman, I. et al. (2023) The Gradient of Generative AI Release:
Methods and Considerations. arXiv.
https://arxiv.org/abs/2302.04844
Staab, R. et al. (2023) Beyond Memorization: Violating Privacy via
Inference With Large Language
Models. arXiv. https://arxiv.org/pdf/2310.07298
Stanford, S. et al. (2023) Whose Opinions Do Language Models Reflect?
arXiv.
https://arxiv.org/pdf/2303.17548
Strubell, E. et al. (2019) Energy and Policy Considerations for Deep
Learning in NLP. arXiv.
https://arxiv.org/pdf/1906.02243
The White House (2016) Circular No. A-130, Managing Information as a
Strategic Resource.
- >-
and data that are considered sensitive are understood to change over
time based on societal norms and context.
36
- |-
yet foreseeable, uses or impacts of automated systems. You should be
protected from inappropriate or irrelevant data use in the design, de
velopment, and deployment of automated systems, and from the
compounded harm of its reuse. Independent evaluation and report
ing that confirms that the system is safe and effective, including re
porting of steps taken to mitigate potential harms, should be per
formed and the results made public whenever possible.
15
- source_sentence: What is the AI Bill of Rights?
sentences:
- |-
BLUEPRINT FOR AN
AI BILL OF
RIGHTS
MAKING AUTOMATED
SYSTEMS WORK FOR
THE AMERICAN PEOPLE
OCTOBER 2022
- >-
APPENDIX
•
Julia Simon-Mishel, Supervising Attorney, Philadelphia Legal Assistance
•
Dr. Zachary Mahafza, Research & Data Analyst, Southern Poverty Law
Center
•
J. Khadijah Abdurahman, Tech Impact Network Research Fellow, AI Now
Institute, UCLA C2I1, and
UWA Law School
Panelists separately described the increasing scope of technology use in
providing for social welfare, including
in fraud detection, digital ID systems, and other methods focused on
improving efficiency and reducing cost.
However, various panelists individually cautioned that these systems may
reduce burden for government
agencies by increasing the burden and agency of people using and
interacting with these technologies.
Additionally, these systems can produce feedback loops and compounded
harm, collecting data from
communities and using it to reinforce inequality. Various panelists
suggested that these harms could be
mitigated by ensuring community input at the beginning of the design
process, providing ways to opt out of
- >-
safe, secure, and resilient; (e) understandable; (f ) responsible and
traceable; (g) regularly monitored; (h) transpar-
ent; and, (i) accountable. The Blueprint for an AI Bill of Rights is
consistent with the Executive Order.
Affected agencies across the federal government have released AI use
case inventories13 and are implementing
plans to bring those AI systems into compliance with the Executive Order
or retire them.
The law and policy landscape for motor vehicles shows that strong safety
regulations—and
measures to address harms when they occur—can enhance innovation in the
context of com-
plex technologies. Cars, like automated digital systems, comprise a
complex collection of components.
The National Highway Traffic Safety Administration,14 through its
rigorous standards and independent
evaluation, helps make sure vehicles on our roads are safe without
limiting manufacturers’ ability to
innovate.15 At the same time, rules of the road are implemented locally
to impose contextually appropriate
- source_sentence: >-
What are the best practices for benchmarking AI system security and
resilience?
sentences:
- >-
NOTICE &
EXPLANATION
WHAT SHOULD BE EXPECTED OF AUTOMATED SYSTEMS
The expectations for automated systems are meant to serve as a blueprint
for the development of additional
technical standards and practices that are tailored for particular
sectors and contexts.
An automated system should provide demonstrably clear, timely,
understandable, and accessible notice of use, and
explanations as to how and why a decision was made or an action was
taken by the system. These expectations are
explained below.
Provide clear, timely, understandable, and accessible notice of use and
explanations
Generally accessible plain language documentation. The entity
responsible for using the automated
system should ensure that documentation describing the overall system
(including any human components) is
public and easy to find. The documentation should describe, in plain
language, how the system works and how
- >-
content performance and impact, and work in collaboration with AI
Actors
experienced in user research and experience.
Human-AI Configuration
MG-4.1-004 Implement active learning techniques to identify instances
where the model fails
or produces unexpected outputs.
Confabulation
MG-4.1-005
Share transparency reports with internal and external stakeholders that
detail
steps taken to update the GAI system to enhance transparency and
accountability.
Human-AI Configuration; Harmful
Bias and Homogenization
MG-4.1-006
Track dataset modifications for provenance by monitoring data deletions,
rectification requests, and other changes that may impact the
verifiability of
content origins.
Information Integrity
- >-
33
MEASURE 2.7: AI system security and resilience – as identified in the MAP
function – are evaluated and documented.
Action ID
Suggested Action
GAI Risks
MS-2.7-001
Apply established security measures to: Assess likelihood and magnitude
of
vulnerabilities and threats such as backdoors, compromised dependencies,
data
breaches, eavesdropping, man-in-the-middle attacks, reverse
engineering,
autonomous agents, model theft or exposure of model weights, AI
inference,
bypass, extraction, and other baseline security concerns.
Data Privacy; Information Integrity;
Information Security; Value Chain
and Component Integration
MS-2.7-002
Benchmark GAI system security and resilience related to content
provenance
against industry standards and best practices. Compare GAI system
security
features and content provenance methods against industry
state-of-the-art.
Information Integrity; Information
Security
MS-2.7-003
Conduct user surveys to gather user satisfaction with the AI-generated
content
- source_sentence: >-
How should risks or trustworthiness characteristics that cannot be
measured be documented?
sentences:
- >-
MEASURE 1.1: Approaches and metrics for measurement of AI risks
enumerated during the MAP function are selected for
implementation starting with the most significant AI risks. The risks or
trustworthiness characteristics that will not – or cannot – be
measured are properly documented.
Action ID
Suggested Action
GAI Risks
MS-1.1-001 Employ methods to trace the origin and modifications of
digital content.
Information Integrity
MS-1.1-002
Integrate tools designed to analyze content provenance and detect data
anomalies, verify the authenticity of digital signatures, and identify
patterns
associated with misinformation or manipulation.
Information Integrity
MS-1.1-003
Disaggregate evaluation metrics by demographic factors to identify any
discrepancies in how content provenance mechanisms work across diverse
populations.
Information Integrity; Harmful
Bias and Homogenization
MS-1.1-004 Develop a suite of metrics to evaluate structured public
feedback exercises
- >-
AI technology can produce varied outputs in multiple modalities and
present many classes of user
interfaces. This leads to a broader set of AI Actors interacting with
GAI systems for widely differing
applications and contexts of use. These can include data labeling and
preparation, development of GAI
models, content moderation, code generation and review, text generation
and editing, image and video
generation, summarization, search, and chat. These activities can take
place within organizational
settings or in the public domain.
Organizations can restrict AI applications that cause harm, exceed
stated risk tolerances, or that conflict
with their tolerances or values. Governance tools and protocols that are
applied to other types of AI
systems can be applied to GAI systems. These plans and actions include:
• Accessibility and reasonable
accommodations
• AI actor credentials and qualifications
• Alignment to organizational values
• Auditing and assessment
• Change-management controls
- >-
existing human performance considered as a performance baseline for the
algorithm to meet pre-deployment,
and as a lifecycle minimum performance standard. Decision possibilities
resulting from performance testing
should include the possibility of not deploying the system.
Risk identification and mitigation. Before deployment, and in a
proactive and ongoing manner, poten
tial risks of the automated system should be identified and mitigated.
Identified risks should focus on the
potential for meaningful impact on people’s rights, opportunities, or
access and include those to impacted
communities that may not be direct users of the automated system, risks
resulting from purposeful misuse of
the system, and other concerns identified via the consultation process.
Assessment and, where possible, mea
surement of the impact of risks should be included and balanced such
that high impact risks receive attention
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.2807017543859649
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.4649122807017544
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.5350877192982456
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.7192982456140351
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.2807017543859649
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.15497076023391812
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.10701754385964912
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.0719298245614035
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.2807017543859649
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.4649122807017544
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.5350877192982456
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.7192982456140351
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.4797086283187805
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.40644667223614606
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.423567506926962
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.2807017543859649
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.4649122807017544
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.5350877192982456
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.7192982456140351
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.2807017543859649
name: Dot Precision@1
- type: dot_precision@3
value: 0.15497076023391812
name: Dot Precision@3
- type: dot_precision@5
value: 0.10701754385964912
name: Dot Precision@5
- type: dot_precision@10
value: 0.0719298245614035
name: Dot Precision@10
- type: dot_recall@1
value: 0.2807017543859649
name: Dot Recall@1
- type: dot_recall@3
value: 0.4649122807017544
name: Dot Recall@3
- type: dot_recall@5
value: 0.5350877192982456
name: Dot Recall@5
- type: dot_recall@10
value: 0.7192982456140351
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.4797086283187805
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.40644667223614606
name: Dot Mrr@10
- type: dot_map@100
value: 0.423567506926962
name: Dot Map@100
SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-l. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-l
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
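The architecture listing above ends with CLS-token pooling followed by a Normalize() module. The pure-Python sketch below walks through those two steps on made-up token vectors. The helper names and the 4-dimensional toy values are illustrative assumptions, not library code. It shows why dot-product and cosine scores coincide once embeddings have unit length.

```python
import math

def cls_pool(token_vectors):
    """CLS pooling: keep only the first token's vector as the sentence embedding."""
    return token_vectors[0]

def l2_normalize(vec):
    """Scale a vector to unit length, as a Normalize() step would."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy token-level outputs for two inputs (hypothetical 4-dimensional vectors;
# the real model produces 1024-dimensional ones).
tokens_a = [[3.0, 4.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
tokens_b = [[1.0, 2.0, 2.0, 0.0], [5.0, 5.0, 5.0, 5.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

dot_score = dot(emb_a, emb_b)
cos_score = cosine(emb_a, emb_b)
print(dot_score, cos_score)
```

Because the embeddings are unit length, the dot_* and cosine_* metrics are expected to match, which is consistent with the evaluation table in this card.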
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("jeevanions/finetuned_arctic-embedd-l")
# Run inference
sentences = [
'How should risks or trustworthiness characteristics that cannot be measured be documented?',
'MEASURE 1.1: Approaches and metrics for measurement of AI risks enumerated during the MAP function are selected for \nimplementation starting with the most significant AI risks. The risks or trustworthiness characteristics that will not – or cannot – be \nmeasured are properly documented. \nAction ID \nSuggested Action \nGAI Risks \nMS-1.1-001 Employ methods to trace the origin and modifications of digital content. \nInformation Integrity \nMS-1.1-002 \nIntegrate tools designed to analyze content provenance and detect data \nanomalies, verify the authenticity of digital signatures, and identify patterns \nassociated with misinformation or manipulation. \nInformation Integrity \nMS-1.1-003 \nDisaggregate evaluation metrics by demographic factors to identify any \ndiscrepancies in how content provenance mechanisms work across diverse \npopulations. \nInformation Integrity; Harmful \nBias and Homogenization \nMS-1.1-004 Develop a suite of metrics to evaluate structured public feedback exercises',
'existing human performance considered as a performance baseline for the algorithm to meet pre-deployment, \nand as a lifecycle minimum performance standard. Decision possibilities resulting from performance testing \nshould include the possibility of not deploying the system. \nRisk identification and mitigation. Before deployment, and in a proactive and ongoing manner, poten\xad\ntial risks of the automated system should be identified and mitigated. Identified risks should focus on the \npotential for meaningful impact on people’s rights, opportunities, or access and include those to impacted \ncommunities that may not be direct users of the automated system, risks resulting from purposeful misuse of \nthe system, and other concerns identified via the consultation process. Assessment and, where possible, mea\xad\nsurement of the impact of risks should be included and balanced such that high impact risks receive attention',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Evaluation
Metrics
Information Retrieval
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.2807 |
cosine_accuracy@3 | 0.4649 |
cosine_accuracy@5 | 0.5351 |
cosine_accuracy@10 | 0.7193 |
cosine_precision@1 | 0.2807 |
cosine_precision@3 | 0.155 |
cosine_precision@5 | 0.107 |
cosine_precision@10 | 0.0719 |
cosine_recall@1 | 0.2807 |
cosine_recall@3 | 0.4649 |
cosine_recall@5 | 0.5351 |
cosine_recall@10 | 0.7193 |
cosine_ndcg@10 | 0.4797 |
cosine_mrr@10 | 0.4064 |
cosine_map@100 | 0.4236 |
dot_accuracy@1 | 0.2807 |
dot_accuracy@3 | 0.4649 |
dot_accuracy@5 | 0.5351 |
dot_accuracy@10 | 0.7193 |
dot_precision@1 | 0.2807 |
dot_precision@3 | 0.155 |
dot_precision@5 | 0.107 |
dot_precision@10 | 0.0719 |
dot_recall@1 | 0.2807 |
dot_recall@3 | 0.4649 |
dot_recall@5 | 0.5351 |
dot_recall@10 | 0.7193 |
dot_ndcg@10 | 0.4797 |
dot_mrr@10 | 0.4064 |
dot_map@100 | 0.4236 |
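In the table above, recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (for example, 0.4649 / 3 ≈ 0.155). That pattern is what one would expect if every query has exactly one relevant passage. This is an assumption here, not something the card states. The sketch below uses invented rankings and hypothetical helper names to show how the metrics relate under that assumption. It is not the InformationRetrievalEvaluator implementation.

```python
def accuracy_at_k(ranked_ids, relevant_id, k):
    """1.0 if the relevant passage appears in the top k results, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0

def precision_at_k(ranked_ids, relevant_id, k):
    """Fraction of the top k results that are relevant."""
    return sum(1 for doc in ranked_ids[:k] if doc == relevant_id) / k

def recall_at_k(ranked_ids, relevant_id, k):
    """Fraction of relevant passages found in the top k (one relevant passage per query)."""
    return accuracy_at_k(ranked_ids, relevant_id, k)

def mrr_at_k(ranked_ids, relevant_id, k):
    """Reciprocal rank of the relevant passage, or 0.0 if it is outside the top k."""
    for rank, doc in enumerate(ranked_ids[:k], start=1):
        if doc == relevant_id:
            return 1.0 / rank
    return 0.0

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Hypothetical queries: (ranked passage ids, id of the single relevant passage).
queries = [
    (["p1", "p7", "p3"], "p1"),  # hit at rank 1
    (["p4", "p2", "p9"], "p9"),  # hit at rank 3
    (["p5", "p6", "p8"], "p0"),  # miss
]

acc3 = mean(accuracy_at_k(r, rel, 3) for r, rel in queries)
prec3 = mean(precision_at_k(r, rel, 3) for r, rel in queries)
rec3 = mean(recall_at_k(r, rel, 3) for r, rel in queries)
mrr3 = mean(mrr_at_k(r, rel, 3) for r, rel in queries)
print(acc3, prec3, rec3, mrr3)
```

With one relevant passage per query, precision@k can never exceed 1/k, so the low precision values in the table do not by themselves indicate poor ranking.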
Training Details
Training Dataset
Unnamed Dataset
- Size: 3,430 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 1000 samples:
Field | sentence_0 | sentence_1 |
---|---|---|
type | string | string |
details | min: 8 tokens, mean: 17.71 tokens, max: 36 tokens | min: 7 tokens, mean: 172.72 tokens, max: 356 tokens |
- Samples (each sentence_0 question is listed first, followed by its paired sentence_1 passage):
What are the key steps to obtain input from stakeholder communities to identify unacceptable use in AI systems?
15
GV-1.3-004 Obtain input from stakeholder communities to identify unacceptable use, in
accordance with activities in the AI RMF Map function.
CBRN Information or Capabilities;
Obscene, Degrading, and/or
Abusive Content; Harmful Bias
and Homogenization; Dangerous,
Violent, or Hateful Content
GV-1.3-005
Maintain an updated hierarchy of identified and expected GAI risks connected to
contexts of GAI model advancement and use, potentially including specialized risk
levels for GAI systems that address issues such as model collapse and algorithmic
monoculture.
Harmful Bias and Homogenization
GV-1.3-006
Reevaluate organizational risk tolerances to account for unacceptable negative risk
(such as where significant negative impacts are imminent, severe harms are
actually occurring, or large-scale risks could occur); and broad GAI negative risks,
including: Immature safety or risk cultures related to AI and GAI design,
development and deployment, public information integrity risks, including impacts
How can organizations maintain an updated hierarchy of identified and expected GAI risks?
15
GV-1.3-004 Obtain input from stakeholder communities to identify unacceptable use, in
accordance with activities in the AI RMF Map function.
CBRN Information or Capabilities;
Obscene, Degrading, and/or
Abusive Content; Harmful Bias
and Homogenization; Dangerous,
Violent, or Hateful Content
GV-1.3-005
Maintain an updated hierarchy of identified and expected GAI risks connected to
contexts of GAI model advancement and use, potentially including specialized risk
levels for GAI systems that address issues such as model collapse and algorithmic
monoculture.
Harmful Bias and Homogenization
GV-1.3-006
Reevaluate organizational risk tolerances to account for unacceptable negative risk
(such as where significant negative impacts are imminent, severe harms are
actually occurring, or large-scale risks could occur); and broad GAI negative risks,
including: Immature safety or risk cultures related to AI and GAI design,
development and deployment, public information integrity risks, including impacts
What are some examples of unacceptable uses of AI as identified by stakeholder communities?
15
GV-1.3-004 Obtain input from stakeholder communities to identify unacceptable use, in
accordance with activities in the AI RMF Map function.
CBRN Information or Capabilities;
Obscene, Degrading, and/or
Abusive Content; Harmful Bias
and Homogenization; Dangerous,
Violent, or Hateful Content
GV-1.3-005
Maintain an updated hierarchy of identified and expected GAI risks connected to
contexts of GAI model advancement and use, potentially including specialized risk
levels for GAI systems that address issues such as model collapse and algorithmic
monoculture.
Harmful Bias and Homogenization
GV-1.3-006
Reevaluate organizational risk tolerances to account for unacceptable negative risk
(such as where significant negative impacts are imminent, severe harms are
actually occurring, or large-scale risks could occur); and broad GAI negative risks,
including: Immature safety or risk cultures related to AI and GAI design,
development and deployment, public information integrity risks, including impacts
- Loss: MatryoshkaLoss with these parameters:
{
  "loss": "MultipleNegativesRankingLoss",
  "matryoshka_dims": [768, 512, 256, 128, 64],
  "matryoshka_weights": [1, 1, 1, 1, 1],
  "n_dims_per_step": -1
}
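The loss configuration wraps an in-batch ranking loss inside a Matryoshka objective that is evaluated on truncated embedding prefixes. The sketch below is a simplified pure-Python illustration of that idea, not the sentence-transformers implementation. The function names, the toy 4-dimensional vectors, and the scale of 20.0 applied to cosine scores are assumptions made for this example.

```python
import math

def cos_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch ranking loss: each anchor should score its own positive above
    every other positive in the batch (cross-entropy over scaled cosine scores)."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss computed on truncated embedding prefixes."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [a[:dim] for a in anchors]
        truncated_positives = [p[:dim] for p in positives]
        total += weight * mnr_loss(truncated_anchors, truncated_positives)
    return total

# Toy question/passage embeddings (hypothetical values; the card uses
# prefixes of 768, 512, 256, 128 and 64 dimensions instead of 4 and 2).
anchors = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
positives = [[0.9, 0.1, 0.4, 0.0], [0.1, 0.9, 0.0, 0.6]]

loss_batch2 = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
loss_batch1 = matryoshka_loss(anchors[:1], positives[:1], dims=[4, 2], weights=[1, 1])
print(loss_batch2, loss_batch1)
```

With a single pair per batch there are no in-batch negatives, so this sketch returns exactly 0.0. That may be related to the 0.0 training loss logged with per_device_train_batch_size set to 1, though the card itself does not say so.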
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 1
- per_device_eval_batch_size: 1
- num_train_epochs: 5
- multi_dataset_batch_sampler: round_robin
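These settings determine the step counts that appear in the training logs. The short calculation below assumes a single device and no gradient accumulation (the full hyperparameter list reports gradient_accumulation_steps of 1). The variable names are illustrative.

```python
import math

num_samples = 3430   # training set size stated in this card
batch_size = 1       # per_device_train_batch_size
num_epochs = 5       # num_train_epochs
num_devices = 1      # assumption: training ran on a single device

steps_per_epoch = math.ceil(num_samples / (batch_size * num_devices))
total_steps = steps_per_epoch * num_epochs

def epoch_at(step):
    """Fractional epoch reached after a given optimizer step."""
    return step / steps_per_epoch

print(steps_per_epoch, total_steps, round(epoch_at(50), 4))
# 3430 17150 0.0146
```

The value 0.0146 at step 50 matches the first row of the training logs, and step 3430 corresponds to epoch 1.0.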
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 1
- per_device_eval_batch_size: 1
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | Training Loss | cosine_map@100 |
---|---|---|---|
0.0146 | 50 | - | 0.4134 |
0.0292 | 100 | - | 0.4134 |
0.0437 | 150 | - | 0.4134 |
0.0583 | 200 | - | 0.4134 |
0.0729 | 250 | - | 0.4134 |
0.0875 | 300 | - | 0.4134 |
0.1020 | 350 | - | 0.4134 |
0.1166 | 400 | - | 0.4134 |
0.1312 | 450 | - | 0.4134 |
0.1458 | 500 | 0.0 | 0.4134 |
0.1603 | 550 | - | 0.4134 |
0.1749 | 600 | - | 0.4134 |
0.1895 | 650 | - | 0.4134 |
0.2041 | 700 | - | 0.4134 |
0.2187 | 750 | - | 0.4134 |
0.2332 | 800 | - | 0.4134 |
0.2478 | 850 | - | 0.4134 |
0.2624 | 900 | - | 0.4134 |
0.2770 | 950 | - | 0.4134 |
0.2915 | 1000 | 0.0 | 0.4134 |
0.3061 | 1050 | - | 0.4134 |
0.3207 | 1100 | - | 0.4134 |
0.3353 | 1150 | - | 0.4134 |
0.3499 | 1200 | - | 0.4134 |
0.3644 | 1250 | - | 0.4134 |
0.3790 | 1300 | - | 0.4134 |
0.3936 | 1350 | - | 0.4134 |
0.4082 | 1400 | - | 0.4134 |
0.4227 | 1450 | - | 0.4134 |
0.4373 | 1500 | 0.0 | 0.4134 |
0.4519 | 1550 | - | 0.4134 |
0.4665 | 1600 | - | 0.4134 |
0.4810 | 1650 | - | 0.4134 |
0.4956 | 1700 | - | 0.4134 |
0.5102 | 1750 | - | 0.4134 |
0.5248 | 1800 | - | 0.4134 |
0.5394 | 1850 | - | 0.4134 |
0.5539 | 1900 | - | 0.4134 |
0.5685 | 1950 | - | 0.4134 |
0.5831 | 2000 | 0.0 | 0.4135 |
0.5977 | 2050 | - | 0.4135 |
0.6122 | 2100 | - | 0.4135 |
0.6268 | 2150 | - | 0.4135 |
0.6414 | 2200 | - | 0.4135 |
0.6560 | 2250 | - | 0.4135 |
0.6706 | 2300 | - | 0.4135 |
0.6851 | 2350 | - | 0.4135 |
0.6997 | 2400 | - | 0.4135 |
0.7143 | 2450 | - | 0.4134 |
0.7289 | 2500 | 0.0 | 0.4134 |
0.7434 | 2550 | - | 0.4134 |
0.7580 | 2600 | - | 0.4134 |
0.7726 | 2650 | - | 0.4134 |
0.7872 | 2700 | - | 0.4134 |
0.8017 | 2750 | - | 0.4134 |
0.8163 | 2800 | - | 0.4134 |
0.8309 | 2850 | - | 0.4135 |
0.8455 | 2900 | - | 0.4135 |
0.8601 | 2950 | - | 0.4135 |
0.8746 | 3000 | 0.0 | 0.4135 |
0.8892 | 3050 | - | 0.4135 |
0.9038 | 3100 | - | 0.4135 |
0.9184 | 3150 | - | 0.4135 |
0.9329 | 3200 | - | 0.4135 |
0.9475 | 3250 | - | 0.4135 |
0.9621 | 3300 | - | 0.4135 |
0.9767 | 3350 | - | 0.4135 |
0.9913 | 3400 | - | 0.4135 |
1.0 | 3430 | - | 0.4135 |
1.0058 | 3450 | - | 0.4135 |
1.0204 | 3500 | 0.0 | 0.4135 |
1.0350 | 3550 | - | 0.4135 |
1.0496 | 3600 | - | 0.4135 |
1.0641 | 3650 | - | 0.4135 |
1.0787 | 3700 | - | 0.4135 |
1.0933 | 3750 | - | 0.4135 |
1.1079 | 3800 | - | 0.4135 |
1.1224 | 3850 | - | 0.4135 |
1.1370 | 3900 | - | 0.4179 |
1.1516 | 3950 | - | 0.4179 |
1.1662 | 4000 | 0.0 | 0.4179 |
1.1808 | 4050 | - | 0.4179 |
1.1953 | 4100 | - | 0.4179 |
1.2099 | 4150 | - | 0.4179 |
1.2245 | 4200 | - | 0.4179 |
1.2391 | 4250 | - | 0.4179 |
1.2536 | 4300 | - | 0.4179 |
1.2682 | 4350 | - | 0.4179 |
1.2828 | 4400 | - | 0.4179 |
1.2974 | 4450 | - | 0.4179 |
1.3120 | 4500 | 0.0 | 0.4179 |
1.3265 | 4550 | - | 0.4179 |
1.3411 | 4600 | - | 0.4179 |
1.3557 | 4650 | - | 0.4179 |
1.3703 | 4700 | - | 0.4179 |
1.3848 | 4750 | - | 0.4179 |
1.3994 | 4800 | - | 0.4179 |
1.4140 | 4850 | - | 0.4179 |
1.4286 | 4900 | - | 0.4179 |
1.4431 | 4950 | - | 0.4179 |
1.4577 | 5000 | 0.0 | 0.4179 |
1.4723 | 5050 | - | 0.4179 |
1.4869 | 5100 | - | 0.4179 |
1.5015 | 5150 | - | 0.4179 |
1.5160 | 5200 | - | 0.4179 |
1.5306 | 5250 | - | 0.4179 |
1.5452 | 5300 | - | 0.4179 |
1.5598 | 5350 | - | 0.4179 |
1.5743 | 5400 | - | 0.4179 |
1.5889 | 5450 | - | 0.4179 |
1.6035 | 5500 | 0.0 | 0.4179 |
1.6181 | 5550 | - | 0.4179 |
1.6327 | 5600 | - | 0.4179 |
1.6472 | 5650 | - | 0.4179 |
1.6618 | 5700 | - | 0.4179 |
1.6764 | 5750 | - | 0.4179 |
1.6910 | 5800 | - | 0.4179 |
1.7055 | 5850 | - | 0.4179 |
1.7201 | 5900 | - | 0.4179 |
1.7347 | 5950 | - | 0.4179 |
1.7493 | 6000 | 0.0 | 0.4179 |
1.7638 | 6050 | - | 0.4179 |
1.7784 | 6100 | - | 0.4179 |
1.7930 | 6150 | - | 0.4179 |
1.8076 | 6200 | - | 0.4179 |
1.8222 | 6250 | - | 0.4179 |
1.8367 | 6300 | - | 0.4179 |
1.8513 | 6350 | - | 0.4179 |
1.8659 | 6400 | - | 0.4179 |
1.8805 | 6450 | - | 0.4179 |
1.8950 | 6500 | 0.0 | 0.4179 |
1.9096 | 6550 | - | 0.4179 |
1.9242 | 6600 | - | 0.4179 |
1.9388 | 6650 | - | 0.4179 |
1.9534 | 6700 | - | 0.4179 |
1.9679 | 6750 | - | 0.4179 |
1.9825 | 6800 | - | 0.4179 |
1.9971 | 6850 | - | 0.4179 |
2.0 | 6860 | - | 0.4179 |
2.0117 | 6900 | - | 0.4179 |
2.0262 | 6950 | - | 0.4179 |
2.0408 | 7000 | 0.0 | 0.4179 |
2.0554 | 7050 | - | 0.4179 |
2.0700 | 7100 | - | 0.4179 |
2.0845 | 7150 | - | 0.4179 |
2.0991 | 7200 | - | 0.4179 |
2.1137 | 7250 | - | 0.4179 |
2.1283 | 7300 | - | 0.4179 |
2.1429 | 7350 | - | 0.4179 |
2.1574 | 7400 | - | 0.4179 |
2.1720 | 7450 | - | 0.4179 |
2.1866 | 7500 | 0.0 | 0.4179 |
2.2012 | 7550 | - | 0.4179 |
2.2157 | 7600 | - | 0.4179 |
2.2303 | 7650 | - | 0.4179 |
2.2449 | 7700 | - | 0.4179 |
2.2595 | 7750 | - | 0.4179 |
2.2741 | 7800 | - | 0.4179 |
2.2886 | 7850 | - | 0.4179 |
2.3032 | 7900 | - | 0.4179 |
2.3178 | 7950 | - | 0.4179 |
2.3324 | 8000 | 0.0 | 0.4179 |
2.3469 | 8050 | - | 0.4179 |
2.3615 | 8100 | - | 0.4179 |
2.3761 | 8150 | - | 0.4179 |
2.3907 | 8200 | - | 0.4179 |
2.4052 | 8250 | - | 0.4179 |
2.4198 | 8300 | - | 0.4179 |
2.4344 | 8350 | - | 0.4179 |
2.4490 | 8400 | - | 0.4179 |
2.4636 | 8450 | - | 0.4179 |
2.4781 | 8500 | 0.0 | 0.4179 |
2.4927 | 8550 | - | 0.4179 |
2.5073 | 8600 | - | 0.4179 |
2.5219 | 8650 | - | 0.4179 |
2.5364 | 8700 | - | 0.4179 |
2.5510 | 8750 | - | 0.4179 |
2.5656 | 8800 | - | 0.4179 |
2.5802 | 8850 | - | 0.4179 |
2.5948 | 8900 | - | 0.4179 |
2.6093 | 8950 | - | 0.4179 |
2.6239 | 9000 | 0.0 | 0.4179 |
2.6385 | 9050 | - | 0.4179 |
2.6531 | 9100 | - | 0.4179 |
2.6676 | 9150 | - | 0.4179 |
2.6822 | 9200 | - | 0.4179 |
2.6968 | 9250 | - | 0.4223 |
2.7114 | 9300 | - | 0.4223 |
2.7259 | 9350 | - | 0.4223 |
2.7405 | 9400 | - | 0.4223 |
2.7551 | 9450 | - | 0.4223 |
2.7697 | 9500 | 0.0 | 0.4223 |
2.7843 | 9550 | - | 0.4223 |
2.7988 | 9600 | - | 0.4223 |
2.8134 | 9650 | - | 0.4223 |
2.8280 | 9700 | - | 0.4223 |
2.8426 | 9750 | - | 0.4223 |
2.8571 | 9800 | - | 0.4223 |
2.8717 | 9850 | - | 0.4223 |
2.8863 | 9900 | - | 0.4223 |
2.9009 | 9950 | - | 0.4223 |
2.9155 | 10000 | 0.0 | 0.4223 |
2.9300 | 10050 | - | 0.4223 |
2.9446 | 10100 | - | 0.4223 |
2.9592 | 10150 | - | 0.4223 |
2.9738 | 10200 | - | 0.4223 |
2.9883 | 10250 | - | 0.4223 |
3.0 | 10290 | - | 0.4223 |
3.0029 | 10300 | - | 0.4223 |
3.0175 | 10350 | - | 0.4223 |
3.0321 | 10400 | - | 0.4223 |
3.0466 | 10450 | - | 0.4223 |
3.0612 | 10500 | 0.0 | 0.4223 |
3.0758 | 10550 | - | 0.4223 |
3.0904 | 10600 | - | 0.4223 |
3.1050 | 10650 | - | 0.4223 |
3.1195 | 10700 | - | 0.4223 |
3.1341 | 10750 | - | 0.4223 |
3.1487 | 10800 | - | 0.4223 |
3.1633 | 10850 | - | 0.4223 |
3.1778 | 10900 | - | 0.4223 |
3.1924 | 10950 | - | 0.4223 |
3.2070 | 11000 | 0.0 | 0.4223 |
3.2216 | 11050 | - | 0.4223 |
3.2362 | 11100 | - | 0.4223 |
3.2507 | 11150 | - | 0.4223 |
3.2653 | 11200 | - | 0.4223 |
3.2799 | 11250 | - | 0.4223 |
3.2945 | 11300 | - | 0.4223 |
3.3090 | 11350 | - | 0.4223 |
3.3236 | 11400 | - | 0.4223 |
3.3382 | 11450 | - | 0.4223 |
3.3528 | 11500 | 0.0 | 0.4223 |
3.3673 | 11550 | - | 0.4223 |
3.3819 | 11600 | - | 0.4223 |
3.3965 | 11650 | - | 0.4223 |
3.4111 | 11700 | - | 0.4223 |
3.4257 | 11750 | - | 0.4223 |
3.4402 | 11800 | - | 0.4223 |
3.4548 | 11850 | - | 0.4223 |
3.4694 | 11900 | - | 0.4223 |
3.4840 | 11950 | - | 0.4223 |
3.4985 | 12000 | 0.0 | 0.4223 |
3.5131 | 12050 | - | 0.4223 |
3.5277 | 12100 | - | 0.4223 |
3.5423 | 12150 | - | 0.4223 |
3.5569 | 12200 | - | 0.4223 |
3.5714 | 12250 | - | 0.4223 |
3.5860 | 12300 | - | 0.4223 |
3.6006 | 12350 | - | 0.4223 |
3.6152 | 12400 | - | 0.4223 |
3.6297 | 12450 | - | 0.4223 |
3.6443 | 12500 | 0.0 | 0.4223 |
3.6589 | 12550 | - | 0.4223 |
3.6735 | 12600 | - | 0.4223 |
3.6880 | 12650 | - | 0.4223 |
3.7026 | 12700 | - | 0.4223 |
3.7172 | 12750 | - | 0.4223 |
3.7318 | 12800 | - | 0.4223 |
3.7464 | 12850 | - | 0.4223 |
3.7609 | 12900 | - | 0.4223 |
3.7755 | 12950 | - | 0.4223 |
3.7901 | 13000 | 0.0 | 0.4223 |
3.8047 | 13050 | - | 0.4223 |
3.8192 | 13100 | - | 0.4226 |
3.8338 | 13150 | - | 0.4226 |
3.8484 | 13200 | - | 0.4226 |
3.8630 | 13250 | - | 0.4226 |
3.8776 | 13300 | - | 0.4226 |
3.8921 | 13350 | - | 0.4226 |
3.9067 | 13400 | - | 0.4226 |
3.9213 | 13450 | - | 0.4226 |
3.9359 | 13500 | 0.0 | 0.4226 |
3.9504 | 13550 | - | 0.4226 |
3.9650 | 13600 | - | 0.4226 |
3.9796 | 13650 | - | 0.4226 |
3.9942 | 13700 | - | 0.4226 |
4.0 | 13720 | - | 0.4226 |
4.0087 | 13750 | - | 0.4226 |
4.0233 | 13800 | - | 0.4226 |
4.0379 | 13850 | - | 0.4226 |
4.0525 | 13900 | - | 0.4226 |
4.0671 | 13950 | - | 0.4226 |
4.0816 | 14000 | 0.0 | 0.4226 |
4.0962 | 14050 | - | 0.4226 |
4.1108 | 14100 | - | 0.4226 |
4.1254 | 14150 | - | 0.4226 |
4.1399 | 14200 | - | 0.4226 |
4.1545 | 14250 | - | 0.4226 |
4.1691 | 14300 | - | 0.4226 |
4.1837 | 14350 | - | 0.4226 |
4.1983 | 14400 | - | 0.4226 |
4.2128 | 14450 | - | 0.4226 |
4.2274 | 14500 | 0.0 | 0.4226 |
4.2420 | 14550 | - | 0.4226 |
4.2566 | 14600 | - | 0.4226 |
4.2711 | 14650 | - | 0.4226 |
4.2857 | 14700 | - | 0.4226 |
4.3003 | 14750 | - | 0.4226 |
4.3149 | 14800 | - | 0.4226 |
4.3294 | 14850 | - | 0.4226 |
4.3440 | 14900 | - | 0.4226 |
4.3586 | 14950 | - | 0.4226 |
4.3732 | 15000 | 0.0 | 0.4226 |
4.3878 | 15050 | - | 0.4226 |
4.4023 | 15100 | - | 0.4226 |
4.4169 | 15150 | - | 0.4226 |
4.4315 | 15200 | - | 0.4226 |
4.4461 | 15250 | - | 0.4226 |
4.4606 | 15300 | - | 0.4226 |
4.4752 | 15350 | - | 0.4226 |
4.4898 | 15400 | - | 0.4226 |
4.5044 | 15450 | - | 0.4226 |
4.5190 | 15500 | 0.0 | 0.4226 |
4.5335 | 15550 | - | 0.4226 |
4.5481 | 15600 | - | 0.4226 |
4.5627 | 15650 | - | 0.4226 |
4.5773 | 15700 | - | 0.4226 |
4.5918 | 15750 | - | 0.4226 |
4.6064 | 15800 | - | 0.4226 |
4.6210 | 15850 | - | 0.4226 |
4.6356 | 15900 | - | 0.4226 |
4.6501 | 15950 | - | 0.4226 |
4.6647 | 16000 | 0.0 | 0.4226 |
4.6793 | 16050 | - | 0.4226 |
4.6939 | 16100 | - | 0.4226 |
4.7085 | 16150 | - | 0.4226 |
4.7230 | 16200 | - | 0.4226 |
4.7376 | 16250 | - | 0.4226 |
4.7522 | 16300 | - | 0.4226 |
4.7668 | 16350 | - | 0.4226 |
4.7813 | 16400 | - | 0.4226 |
4.7959 | 16450 | - | 0.4226 |
4.8105 | 16500 | 0.0 | 0.4226 |
4.8251 | 16550 | - | 0.4226 |
4.8397 | 16600 | - | 0.4226 |
4.8542 | 16650 | - | 0.4226 |
4.8688 | 16700 | - | 0.4226 |
4.8834 | 16750 | - | 0.4226 |
4.8980 | 16800 | - | 0.4226 |
4.9125 | 16850 | - | 0.4226 |
4.9271 | 16900 | - | 0.4226 |
4.9417 | 16950 | - | 0.4226 |
4.9563 | 17000 | 0.0 | 0.4226 |
4.9708 | 17050 | - | 0.4226 |
4.9854 | 17100 | - | 0.4226 |
5.0 | 17150 | - | 0.4226 |
0.0146 | 50 | - | 0.4226 |
0.0292 | 100 | - | 0.4226 |
0.0437 | 150 | - | 0.4226 |
0.0583 | 200 | - | 0.4226 |
0.0729 | 250 | - | 0.4226 |
0.0875 | 300 | - | 0.4226 |
0.1020 | 350 | - | 0.4226 |
0.1166 | 400 | - | 0.4226 |
0.1312 | 450 | - | 0.4226 |
0.1458 | 500 | 0.0 | 0.4226 |
0.1603 | 550 | - | 0.4226 |
0.1749 | 600 | - | 0.4226 |
0.1895 | 650 | - | 0.4226 |
0.2041 | 700 | - | 0.4226 |
0.2187 | 750 | - | 0.4226 |
0.2332 | 800 | - | 0.4226 |
0.2478 | 850 | - | 0.4226 |
0.2624 | 900 | - | 0.4226 |
0.2770 | 950 | - | 0.4226 |
0.2915 | 1000 | 0.0 | 0.4227 |
0.3061 | 1050 | - | 0.4227 |
0.3207 | 1100 | - | 0.4227 |
0.3353 | 1150 | - | 0.4227 |
0.3499 | 1200 | - | 0.4227 |
0.3644 | 1250 | - | 0.4227 |
0.3790 | 1300 | - | 0.4227 |
0.3936 | 1350 | - | 0.4227 |
0.4082 | 1400 | - | 0.4227 |
0.4227 | 1450 | - | 0.4227 |
0.4373 | 1500 | 0.0 | 0.4227 |
0.4519 | 1550 | - | 0.4227 |
0.4665 | 1600 | - | 0.4227 |
0.4810 | 1650 | - | 0.4227 |
0.4956 | 1700 | - | 0.4227 |
0.5102 | 1750 | - | 0.4227 |
0.5248 | 1800 | - | 0.4227 |
0.5394 | 1850 | - | 0.4227 |
0.5539 | 1900 | - | 0.4227 |
0.5685 | 1950 | - | 0.4227 |
0.5831 | 2000 | 0.0 | 0.4227 |
0.5977 | 2050 | - | 0.4227 |
0.6122 | 2100 | - | 0.4227 |
0.6268 | 2150 | - | 0.4227 |
0.6414 | 2200 | - | 0.4227 |
0.6560 | 2250 | - | 0.4227 |
0.6706 | 2300 | - | 0.4227 |
0.6851 | 2350 | - | 0.4227 |
0.6997 | 2400 | - | 0.4227 |
0.7143 | 2450 | - | 0.4227 |
0.7289 | 2500 | 0.0 | 0.4227 |
0.7434 | 2550 | - | 0.4227 |
0.7580 | 2600 | - | 0.4227 |
0.7726 | 2650 | - | 0.4227 |
0.7872 | 2700 | - | 0.4227 |
0.8017 | 2750 | - | 0.4227 |
0.8163 | 2800 | - | 0.4227 |
0.8309 | 2850 | - | 0.4227 |
0.8455 | 2900 | - | 0.4227 |
0.8601 | 2950 | - | 0.4227 |
0.8746 | 3000 | 0.0 | 0.4227 |
0.8892 | 3050 | - | 0.4227 |
0.9038 | 3100 | - | 0.4227 |
0.9184 | 3150 | - | 0.4227 |
0.9329 | 3200 | - | 0.4227 |
0.9475 | 3250 | - | 0.4227 |
0.9621 | 3300 | - | 0.4227 |
0.9767 | 3350 | - | 0.4227 |
0.9913 | 3400 | - | 0.4227 |
1.0 | 3430 | - | 0.4227 |
1.0058 | 3450 | - | 0.4227 |
1.0204 | 3500 | 0.0 | 0.4227 |
1.0350 | 3550 | - | 0.4227 |
1.0496 | 3600 | - | 0.4227 |
1.0641 | 3650 | - | 0.4227 |
1.0787 | 3700 | - | 0.4227 |
1.0933 | 3750 | - | 0.4227 |
1.1079 | 3800 | - | 0.4227 |
1.1224 | 3850 | - | 0.4227 |
1.1370 | 3900 | - | 0.4227 |
1.1516 | 3950 | - | 0.4227 |
1.1662 | 4000 | 0.0 | 0.4227 |
1.1808 | 4050 | - | 0.4227 |
1.1953 | 4100 | - | 0.4227 |
1.2099 | 4150 | - | 0.4231 |
1.2245 | 4200 | - | 0.4231 |
1.2391 | 4250 | - | 0.4231 |
1.2536 | 4300 | - | 0.4231 |
1.2682 | 4350 | - | 0.4231 |
1.2828 | 4400 | - | 0.4231 |
1.2974 | 4450 | - | 0.4231 |
1.3120 | 4500 | 0.0 | 0.4231 |
1.3265 | 4550 | - | 0.4231 |
1.3411 | 4600 | - | 0.4231 |
1.3557 | 4650 | - | 0.4232 |
1.3703 | 4700 | - | 0.4232 |
1.3848 | 4750 | - | 0.4232 |
1.3994 | 4800 | - | 0.4232 |
1.4140 | 4850 | - | 0.4232 |
1.4286 | 4900 | - | 0.4232 |
1.4431 | 4950 | - | 0.4232 |
1.4577 | 5000 | 0.0 | 0.4232 |
1.4723 | 5050 | - | 0.4232 |
1.4869 | 5100 | - | 0.4232 |
1.5015 | 5150 | - | 0.4232 |
1.5160 | 5200 | - | 0.4232 |
1.5306 | 5250 | - | 0.4232 |
1.5452 | 5300 | - | 0.4233 |
1.5598 | 5350 | - | 0.4233 |
1.5743 | 5400 | - | 0.4233 |
1.5889 | 5450 | - | 0.4233 |
1.6035 | 5500 | 0.0 | 0.4233 |
1.6181 | 5550 | - | 0.4233 |
1.6327 | 5600 | - | 0.4233 |
1.6472 | 5650 | - | 0.4233 |
1.6618 | 5700 | - | 0.4233 |
1.6764 | 5750 | - | 0.4233 |
1.6910 | 5800 | - | 0.4233 |
1.7055 | 5850 | - | 0.4233 |
1.7201 | 5900 | - | 0.4233 |
1.7347 | 5950 | - | 0.4233 |
1.7493 | 6000 | 0.0 | 0.4233 |
1.7638 | 6050 | - | 0.4234 |
1.7784 | 6100 | - | 0.4234 |
1.7930 | 6150 | - | 0.4234 |
1.8076 | 6200 | - | 0.4234 |
1.8222 | 6250 | - | 0.4234 |
1.8367 | 6300 | - | 0.4234 |
1.8513 | 6350 | - | 0.4234 |
1.8659 | 6400 | - | 0.4234 |
1.8805 | 6450 | - | 0.4234 |
1.8950 | 6500 | 0.0 | 0.4234 |
1.9096 | 6550 | - | 0.4234 |
1.9242 | 6600 | - | 0.4234 |
1.9388 | 6650 | - | 0.4234 |
1.9534 | 6700 | - | 0.4234 |
1.9679 | 6750 | - | 0.4234 |
1.9825 | 6800 | - | 0.4234 |
1.9971 | 6850 | - | 0.4234 |
2.0 | 6860 | - | 0.4234 |
2.0117 | 6900 | - | 0.4234 |
2.0262 | 6950 | - | 0.4234 |
2.0408 | 7000 | 0.0 | 0.4234 |
2.0554 | 7050 | - | 0.4234 |
2.0700 | 7100 | - | 0.4234 |
2.0845 | 7150 | - | 0.4234 |
2.0991 | 7200 | - | 0.4234 |
2.1137 | 7250 | - | 0.4234 |
2.1283 | 7300 | - | 0.4234 |
2.1429 | 7350 | - | 0.4234 |
2.1574 | 7400 | - | 0.4234 |
2.1720 | 7450 | - | 0.4234 |
2.1866 | 7500 | 0.0 | 0.4234 |
2.2012 | 7550 | - | 0.4234 |
2.2157 | 7600 | - | 0.4234 |
2.2303 | 7650 | - | 0.4234 |
2.2449 | 7700 | - | 0.4234 |
2.2595 | 7750 | - | 0.4234 |
2.2741 | 7800 | - | 0.4234 |
2.2886 | 7850 | - | 0.4234 |
2.3032 | 7900 | - | 0.4234 |
2.3178 | 7950 | - | 0.4234 |
2.3324 | 8000 | 0.0 | 0.4234 |
2.3469 | 8050 | - | 0.4234 |
2.3615 | 8100 | - | 0.4234 |
2.3761 | 8150 | - | 0.4234 |
2.3907 | 8200 | - | 0.4234 |
2.4052 | 8250 | - | 0.4234 |
2.4198 | 8300 | - | 0.4234 |
2.4344 | 8350 | - | 0.4234 |
2.4490 | 8400 | - | 0.4234 |
2.4636 | 8450 | - | 0.4234 |
2.4781 | 8500 | 0.0 | 0.4234 |
2.4927 | 8550 | - | 0.4234 |
2.5073 | 8600 | - | 0.4234 |
2.5219 | 8650 | - | 0.4234 |
2.5364 | 8700 | - | 0.4234 |
2.5510 | 8750 | - | 0.4234 |
2.5656 | 8800 | - | 0.4234 |
2.5802 | 8850 | - | 0.4234 |
2.5948 | 8900 | - | 0.4234 |
2.6093 | 8950 | - | 0.4234 |
2.6239 | 9000 | 0.0 | 0.4234 |
2.6385 | 9050 | - | 0.4234 |
2.6531 | 9100 | - | 0.4234 |
2.6676 | 9150 | - | 0.4234 |
2.6822 | 9200 | - | 0.4234 |
2.6968 | 9250 | - | 0.4234 |
2.7114 | 9300 | - | 0.4234 |
2.7259 | 9350 | - | 0.4234 |
2.7405 | 9400 | - | 0.4234 |
2.7551 | 9450 | - | 0.4234 |
2.7697 | 9500 | 0.0 | 0.4234 |
2.7843 | 9550 | - | 0.4234 |
2.7988 | 9600 | - | 0.4234 |
2.8134 | 9650 | - | 0.4234 |
2.8280 | 9700 | - | 0.4234 |
2.8426 | 9750 | - | 0.4234 |
2.8571 | 9800 | - | 0.4234 |
2.8717 | 9850 | - | 0.4234 |
2.8863 | 9900 | - | 0.4234 |
2.9009 | 9950 | - | 0.4234 |
2.9155 | 10000 | 0.0 | 0.4234 |
2.9300 | 10050 | - | 0.4234 |
2.9446 | 10100 | - | 0.4234 |
2.9592 | 10150 | - | 0.4234 |
2.9738 | 10200 | - | 0.4234 |
2.9883 | 10250 | - | 0.4234 |
3.0 | 10290 | - | 0.4234 |
3.0029 | 10300 | - | 0.4234 |
3.0175 | 10350 | - | 0.4234 |
3.0321 | 10400 | - | 0.4234 |
3.0466 | 10450 | - | 0.4234 |
3.0612 | 10500 | 0.0 | 0.4234 |
3.0758 | 10550 | - | 0.4234 |
3.0904 | 10600 | - | 0.4234 |
3.1050 | 10650 | - | 0.4234 |
3.1195 | 10700 | - | 0.4234 |
3.1341 | 10750 | - | 0.4234 |
3.1487 | 10800 | - | 0.4234 |
3.1633 | 10850 | - | 0.4234 |
3.1778 | 10900 | - | 0.4234 |
3.1924 | 10950 | - | 0.4234 |
3.2070 | 11000 | 0.0 | 0.4234 |
3.2216 | 11050 | - | 0.4234 |
3.2362 | 11100 | - | 0.4234 |
3.2507 | 11150 | - | 0.4234 |
3.2653 | 11200 | - | 0.4234 |
3.2799 | 11250 | - | 0.4234 |
3.2945 | 11300 | - | 0.4234 |
3.3090 | 11350 | - | 0.4234 |
3.3236 | 11400 | - | 0.4234 |
3.3382 | 11450 | - | 0.4234 |
3.3528 | 11500 | 0.0 | 0.4234 |
3.3673 | 11550 | - | 0.4234 |
3.3819 | 11600 | - | 0.4234 |
3.3965 | 11650 | - | 0.4234 |
3.4111 | 11700 | - | 0.4234 |
3.4257 | 11750 | - | 0.4234 |
3.4402 | 11800 | - | 0.4234 |
3.4548 | 11850 | - | 0.4235 |
3.4694 | 11900 | - | 0.4235 |
3.4840 | 11950 | - | 0.4235 |
3.4985 | 12000 | 0.0 | 0.4235 |
3.5131 | 12050 | - | 0.4235 |
3.5277 | 12100 | - | 0.4235 |
3.5423 | 12150 | - | 0.4235 |
3.5569 | 12200 | - | 0.4235 |
3.5714 | 12250 | - | 0.4235 |
3.5860 | 12300 | - | 0.4235 |
3.6006 | 12350 | - | 0.4235 |
3.6152 | 12400 | - | 0.4235 |
3.6297 | 12450 | - | 0.4235 |
3.6443 | 12500 | 0.0 | 0.4235 |
3.6589 | 12550 | - | 0.4235 |
3.6735 | 12600 | - | 0.4235 |
3.6880 | 12650 | - | 0.4235 |
3.7026 | 12700 | - | 0.4235 |
3.7172 | 12750 | - | 0.4235 |
3.7318 | 12800 | - | 0.4235 |
3.7464 | 12850 | - | 0.4235 |
3.7609 | 12900 | - | 0.4235 |
3.7755 | 12950 | - | 0.4235 |
3.7901 | 13000 | 0.0 | 0.4235 |
3.8047 | 13050 | - | 0.4235 |
3.8192 | 13100 | - | 0.4235 |
3.8338 | 13150 | - | 0.4235 |
3.8484 | 13200 | - | 0.4235 |
3.8630 | 13250 | - | 0.4235 |
3.8776 | 13300 | - | 0.4235 |
3.8921 | 13350 | - | 0.4235 |
3.9067 | 13400 | - | 0.4235 |
3.9213 | 13450 | - | 0.4235 |
3.9359 | 13500 | 0.0 | 0.4235 |
3.9504 | 13550 | - | 0.4235 |
3.9650 | 13600 | - | 0.4235 |
3.9796 | 13650 | - | 0.4235 |
3.9942 | 13700 | - | 0.4235 |
4.0 | 13720 | - | 0.4235 |
4.0087 | 13750 | - | 0.4235 |
4.0233 | 13800 | - | 0.4235 |
4.0379 | 13850 | - | 0.4235 |
4.0525 | 13900 | - | 0.4235 |
4.0671 | 13950 | - | 0.4235 |
4.0816 | 14000 | 0.0 | 0.4236 |
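
The first column of the log rows above appears to be the step count divided by the number of steps per epoch, which the row `1.0 | 3430` suggests is 3430. A minimal sketch, assuming that relationship holds for every row (the helper name `epoch_at` is illustrative and not part of any library):

```python
# Assumption: one epoch is 3430 optimizer steps, read off the log row "1.0 | 3430".
STEPS_PER_EPOCH = 3430


def epoch_at(step: int, steps_per_epoch: int = STEPS_PER_EPOCH) -> float:
    """Return the fractional epoch for a step, rounded to 4 places like the log."""
    return round(step / steps_per_epoch, 4)


# Spot-check a few steps that appear in the log.
for step in (50, 4450, 10300, 14000):
    print(step, epoch_at(step))
```

Comparing the printed values with the matching rows is a quick way to confirm that the logging interval (every 50 steps) and the epoch boundaries line up.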
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.1
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 2.14.4
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}