pb-ds1-48K-philsim / README.md
metadata
base_model: dbourget/pb-ds1-48K
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:106810
  - loss:CosineSimilarityLoss
widget:
  - source_sentence: >-
      In  The Law of Civilization and Decay,  Brooks provides a detailed look at
      the rise and fall of civilizations, offering a critical perspective on the
      impact of capitalism. As societies become prosperous, their pursuit of
      wealth ultimately leads to their own downfall as greed takes over.
    sentences:
      - >-
        Patrick Todd's The Open Future argues that all future contingent
        statements, such as 'It will rain tomorrow', are inherently false.
      - >-
        If propositions are made true in virtue of corresponding to facts, then
        what are the truth-makers of true negative propositions such as ‘The
        apple is not red’? Russell argued that there must be negative facts to
        account for what makes true negative propositions true and false
        positive propositions false. Others, more parsimonious in their
        ontological commitments, have attempted to avoid them. Wittgenstein
        rejected them since he was loath to think that the sign for negation
        referred to a negative element in a fact. A contemporary of Russell’s,
        Raphael Demos, attempted to eliminate them by appealing to
        ‘incompatibility’ facts. More recently, Armstrong has appealed to the
        totality of positive facts as the ground of the truth of true negative
        propositions. Oaklander and Miracchi have suggested that the absence or
        non-existence of the positive fact (which is not itself a further fact)
        is the basis of a positive proposition being false and therefore of the
        truth of its negation.
      - >-
        The Law of Civilization and Decay is an overview of history,
        articulating Brooks' critical view of capitalism. A civilization grows
        wealthy, and then its wealth causes it to crumble upon itself due to
        greed.
  - source_sentence: >-
      It is generally accepted that the development of the modern sciences is
      rooted in experiment. Yet for a long time, experimentation did not occupy
      a prominent role, neither in philosophy nor in history of science. With
      the ‘practical turn’ in studying the sciences and their history, this has
      begun to change. This paper is concerned with systems and cultures of
      experimentation and the consistencies that are generated within such
      systems and cultures. The first part of the paper exposes the forms of
      historical and structural coherence that characterize the experimental
      exploration of epistemic objects. In the second part, a particular
      experimental culture in the life sciences is briefly described as an
      example. A survey will be given of what it means and what it takes to
      analyze biological functions in the test tube
    sentences:
      - >-
        Experimentation has long been overlooked in the study of science, but
        with a new focus on practical aspects, this is starting to change. This
        paper explores the systems and cultures of experimentation and the
        patterns that emerge within them. The first part discusses the
        historical and structural coherence of experimental exploration. The
        second part provides a brief overview of an experimental culture in the
        life sciences. The paper concludes with a discussion on analyzing
        biological functions in the test tube.
      - >-
        Hintikka and Mutanen have introduced Trail-And-Error machines as a new
        way to think about computation, expanding on the traditional Turing
        machine model. This innovation opens up new possibilities in the field
        of computation theory.
      - >-
        As Allaire and Firsirotu (1984) pointed out over a decade ago, the
        concept of culture seemed to be sliding inexorably into a superficial
        explanatory pool that promised everything and nothing. However, since
        then, some sophisticated and interesting theoretical developments have
        prevented drowning in the pool of superficiality and hence theoretical
        redundancy. The purpose of this article is to build upon such
        theoretical developments and to introduce an approach that maintains
        that culture can be theorized in the same way as structure, possessing
        irreducible powers and properties that predispose organizational actors
        towards specific courses of action. The morphogenetic approach is the
        methodological complement of transcendental realism, providing
        explanatory leverage on the conditions that maintain for cultural change
        or stability.
  - source_sentence: >-
      This chapter examines three approaches to applied political and legal
      philosophy: Standard activism is primarily addressed to other
      philosophers, adopts an indirect and coincidental role in creating change,
      and counts articulating sound arguments as success. Extreme activism, in
      contrast, is a form of applied philosophy directly addressed to
      policy-makers, with the goal of bringing about a particular outcome, and
      measures success in terms of whether it makes a direct causal contribution
      to that goal. Finally, conceptual activism (like standard activism),
      primarily targets an audience of fellow philosophers, bears a distant,
      non-direct, relation to a desired outcome, and counts success in terms of
      whether it encourages a particular understanding and adoption of the
      concepts under examination.
    sentences:
      - >-
        John Rawls’ resistance to any kind of global egalitarian principle has
        seemed strange and unconvincing to many commentators, including those
        generally supportive of Rawls’ project. His rejection of a global
        egalitarian principle seems to rely on an assumption that states are
        economically bounded and separate from one another, which is not an
        accurate portrayal of economic relations among states in our globalised
        world. In this article, I examine the implications of the domestic
        theory of justice as fairness to argue that Rawls has good reason to
        insist on economically bounded states. I argue that certain central
        features of the contemporary global economy, particularly the free
        movement of capital across borders, undermine the distributional
        autonomy required for states to realise Rawls’ principles of justice,
        and the domestic theory thus requires a certain degree of economic
        separation among states prior to the convening of the international
        original position. Given this, I defend Rawls’ reluctance to endorse a
        global egalitarian principle and defend a policy regime of international
        capital controls, to restore distributional autonomy and make the
        realisation of the principles of justice as fairness possible.
      - >-
        Bibliography of the writings by Hilary Putnam: 16 books, 198 articles,
        10 translations into German (up to 1994).
      - >-
        The jurisprudence under international human rights treaties has had a
        considerable impact across countries. Known for addressing complex
        agendas, the work of expert bodies under the treaties has been credited
        and relied upon for filling the gaps in the realization of several
        objectives, including the peace and security agenda.  In 1982, the Human
        Rights Committee (ICCPR), in a General Comment observed that “states
        have the supreme duty to prevent wars, acts of genocide and other acts
        of mass violence ... Every effort … to avert the danger of war,
        especially thermonuclear war, and to strengthen international peace and
        security would constitute the most important condition and guarantee for
        the safeguarding of the right to life.” Over the years, all treaty
        bodies have contributed in this direction, endorsing peace and security
        so as “to protect people against direct and structural violence … as
        systemic problems and not merely as isolated incidents …”. A closer look
        at the jurisprudence on peace and security, emanating from treaty
        monitoring mechanisms including state periodic reports, interpretive
        statements, the individual communications procedure, and others, reveals
        its distinctive nature
  - source_sentence: >-
      Autonomist accounts of cognitive science suggest that cognitive model
      building and theory construction (can or should) proceed independently of
      findings in neuroscience. Common functionalist justifications of autonomy
      rely on there being relatively few constraints between neural structure
      and cognitive function (e.g., Weiskopf, 2011). In contrast, an integrative
      mechanistic perspective stresses the mutual constraining of structure and
      function (e.g., Piccinini & Craver, 2011; Povich, 2015). In this paper, I
      show how model-based cognitive neuroscience (MBCN) epitomizes the
      integrative mechanistic perspective and concentrates the most
      revolutionary elements of the cognitive neuroscience revolution (Boone &
      Piccinini, 2016). I also show how the prominent subset account of
      functional realization supports the integrative mechanistic perspective I
      take on MBCN and use it to clarify the intralevel and interlevel
      components of integration.
    sentences:
      - >-
        Fictional truth, or truth in fiction/pretense, has been the object of
        extended scrutiny among philosophers and logicians in recent decades.
        Comparatively little attention, however, has been paid to its
        inferential relationships with time and with certain deliberate and
        contingent human activities, namely, the creation of fictional works.
        The aim of the paper is to contribute to filling the gap. Toward this
        goal, a formal framework is outlined that is consistent with a variety
        of conceptions of fictional truth and based upon a specific formal
        treatment of time and agency, that of so-called stit logics. Moreover, a
        complete axiomatic theory of fiction-making TFM is defined, where
        fiction-making is understood as the exercise of agency and choice in
        time over what is fictionally true. The language \ of TFM is an
        extension of the language of propositional logic, with the addition of
        temporal and modal operators. A distinctive feature of \ with respect to
        other modal languages is a variety of operators having to do with
        fictional truth, including a ‘fictionality’ operator \ . Some
        applications of TFM are outlined, and some interesting linguistic and
        inferential phenomena, which are not so easily dealt with in other
        frameworks, are accounted for
      - >-
        We have structured our response according to five questions arising from
        the commentaries: (i) What is sentience? (ii) Is sentience a necessary
        or sufficient condition for moral standing? (iii) What methods should
        guide comparative cognitive research in general, and specifically in
        studying invertebrates? (iv) How should we balance scientific
        uncertainty and moral risk? (v) What practical strategies can help
        reduce biases and morally dismissive attitudes toward invertebrates?
      - >-
        In 2007, ten world-renowned neuroscientists proposed “A Decade of the
        Mind Initiative.” The contention was that, despite the successes of the
        Decade of the Brain, “a fundamental understanding of how the brain gives
        rise to the mind [was] still lacking” (2007, 1321). The primary aims of
        the decade of the mind were “to build on the progress of the recent
        Decade of the Brain (1990-99)” by focusing on “four broad but
        intertwined areas” of research, including: healing and protecting,
        understanding, enriching, and modeling the mind. These four aims were to
        be the result of “transdisciplinary and multiagency” research spanning
        “across disparate fields, such as cognitive science, medicine,
        neuroscience, psychology, mathematics, engineering, and computer
        science.” The proposal for a decade of the mind prompted many questions
        (See Spitzer 2008). In this chapter, I address three of them: (1) How do
        proponents of this new decade conceive of the mind? (2) Why should a
        decade be devoted to understanding it? (3) What should this decade look
        like?
  - source_sentence: >-
      This essay explores the historical and modern perspectives on the Gettier
      problem, highlighting the connections between this issue, skepticism, and
      relevance. Through methods such as historical analysis, induction, and
      deduction, it is found that while contextual theories and varying
      definitions of knowledge do not fully address skeptical challenges, they
      can help clarify our understanding of knowledge. Ultimately, embracing
      subjectivity and intuition can provide insight into what it truly means to
      claim knowledge.
    sentences:
      - >-
        In this article I present and analyze three popular moral justifications
        for hunting. My purpose is to expose the moral terrain of this issue and
        facilitate more fruitful, philosophically relevant discussions about the
        ethics of hunting.
      - >-
        Teaching competency in bioethics has been a concern since the field's
        inception. The first report on the teaching of contemporary bioethics
        was published in 1976 by The Hastings Center, which concluded that
        graduate programs were not necessary at the time. However, the report
        speculated that future developments may require new academic structures
        for graduate education in bioethics. The creation of a terminal degree
        in bioethics has its critics, with scholars debating whether bioethics
        is a discipline with its own methods and theoretical grounding, a
        multidisciplinary field, or something else entirely. Despite these
        debates, new bioethics training programs have emerged at all
        postsecondary levels in the U.S. This essay examines the number and
        types of programs and degrees in this growing field.
      - >-
        Objective: In this essay,  I will try to track some historical and
        modern stages of the discussion on the Gettier problem, and point out
        the interrelations of the questions that this problem raises for
        epistemologists, with sceptical arguments, and a so-called problem of
        relevance. Methods: historical analysis, induction, generalization,
        deduction, discourse, intuition results: Albeit the contextual theories
        of knowledge, the use of different definitions of knowledge, and the
        different ways of the uses of knowledge do not resolve all the issues
        that the sceptic can put forward, but they can be productive in giving
        clarity to a concept of knowledge for us. On the other hand, our
        knowledge will always have an element of intuition and subjectivity,
        however not equating to epistemic luck and probability.  Significance
        novelty: the approach to the context in general, not giving up being a
        Subject may give us a clarity about the sense of what it means to say –
        “I know”.
model-index:
  - name: SentenceTransformer based on dbourget/pb-ds1-48K
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: sts dev
          type: sts-dev
        metrics:
          - type: pearson_cosine
            value: 0.9378177365442741
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8943299298202461
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.9709949018414847
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.8969442622028955
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.9711044669329696
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.8966133108746955
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.9419649751470724
            name: Pearson Dot
          - type: spearman_dot
            value: 0.8551487313582053
            name: Spearman Dot
          - type: pearson_max
            value: 0.9711044669329696
            name: Pearson Max
          - type: spearman_max
            value: 0.8969442622028955
            name: Spearman Max

SentenceTransformer based on dbourget/pb-ds1-48K

This is a sentence-transformers model fine-tuned from dbourget/pb-ds1-48K. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: dbourget/pb-ds1-48K
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
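
The similarity function listed above can be illustrated with a minimal pure-Python sketch. The function name `cosine_similarity` and the toy 3-dimensional vectors are assumptions made for illustration; the model itself produces 768-dimensional vectors and the library computes this on tensors.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Toy vectors standing in for the model's 768-dimensional embeddings.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]    # same direction as a, so similarity is (about) 1
c = [-3.0, 0.0, 1.0]   # orthogonal to a, so similarity is (about) 0

print(cosine_similarity(a, b))
print(cosine_similarity(a, c))
```

Because only the angle matters, scaling a vector (as with `b`) leaves the score unchanged.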

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
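
The Pooling module above has only `pooling_mode_mean_tokens` enabled, so a sentence vector is the average of its token vectors. The sketch below shows that idea in plain Python for a single sentence. The helper name `mean_pool` and the toy inputs are hypothetical, and it assumes padding positions are excluded through an attention mask, as is usual for mean pooling.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

# Hypothetical toy input: 4 token positions, 2 dimensions, last position is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [100.0, 100.0]]
mask = [1, 1, 1, 0]

sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [3.0, 4.0]
```

The padded position is ignored, so its large values do not affect the result.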

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("dbourget/pb-ds1-48K-philsim")
# Run inference
sentences = [
    'This essay explores the historical and modern perspectives on the Gettier problem, highlighting the connections between this issue, skepticism, and relevance. Through methods such as historical analysis, induction, and deduction, it is found that while contextual theories and varying definitions of knowledge do not fully address skeptical challenges, they can help clarify our understanding of knowledge. Ultimately, embracing subjectivity and intuition can provide insight into what it truly means to claim knowledge.',
    'Objective: In this essay,  I will try to track some historical and modern stages of the discussion on the Gettier problem, and point out the interrelations of the questions that this problem raises for epistemologists, with sceptical arguments, and a so-called problem of relevance. Methods: historical analysis, induction, generalization, deduction, discourse, intuition results: Albeit the contextual theories of knowledge, the use of different definitions of knowledge, and the different ways of the uses of knowledge do not resolve all the issues that the sceptic can put forward, but they can be productive in giving clarity to a concept of knowledge for us. On the other hand, our knowledge will always have an element of intuition and subjectivity, however not equating to epistemic luck and probability.  Significance novelty: the approach to the context in general, not giving up being a Subject may give us a clarity about the sense of what it means to say – “I know”.',
    "Teaching competency in bioethics has been a concern since the field's inception. The first report on the teaching of contemporary bioethics was published in 1976 by The Hastings Center, which concluded that graduate programs were not necessary at the time. However, the report speculated that future developments may require new academic structures for graduate education in bioethics. The creation of a terminal degree in bioethics has its critics, with scholars debating whether bioethics is a discipline with its own methods and theoretical grounding, a multidisciplinary field, or something else entirely. Despite these debates, new bioethics training programs have emerged at all postsecondary levels in the U.S. This essay examines the number and types of programs and degrees in this growing field.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
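
Once a similarity matrix is available, one row of it can be turned into a ranking of the other texts. The following is a minimal sketch, not part of the library: `rank_by_similarity` is a hypothetical helper, and the scores in `toy_row` are made up for illustration rather than produced by the model.

```python
def rank_by_similarity(similarity_row, exclude_index=None):
    """Return (index, score) pairs sorted from most to least similar."""
    pairs = [
        (i, float(score))
        for i, score in enumerate(similarity_row)
        if i != exclude_index
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Made-up scores for illustration only; real values come from model.similarity(...).
toy_row = [1.0, 0.91, 0.12]

# Exclude index 0 so the query is not ranked against itself.
ranking = rank_by_similarity(toy_row, exclude_index=0)
print(ranking)  # [(1, 0.91), (2, 0.12)]
```

With the model loaded as in the snippet above, the same helper could be called as `rank_by_similarity(similarities[0].tolist(), exclude_index=0)`.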

Evaluation

Metrics

Semantic Similarity

| Metric | Value |
|:--|:--|
| pearson_cosine | 0.9378 |
| spearman_cosine | 0.8943 |
| pearson_manhattan | 0.971 |
| spearman_manhattan | 0.8969 |
| pearson_euclidean | 0.9711 |
| spearman_euclidean | 0.8966 |
| pearson_dot | 0.942 |
| spearman_dot | 0.8551 |
| pearson_max | 0.9711 |
| spearman_max | 0.8969 |
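
The metrics above pair two correlation measures (Pearson, Spearman) with four ways of scoring a pair of embeddings. As a rough sketch of the difference: Pearson measures linear agreement between predicted and gold scores, while Spearman applies the same formula to their ranks. The functions and toy score lists below are illustrative assumptions, not the evaluator's code. The last lines check, using the values reported above, that the `_max` entries match the largest of the four per-function values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy scores (not from the model): same ordering, but not a linear relationship.
gold = [0.0, 0.2, 0.5, 0.9, 1.0]
predicted = [0.1, 0.2, 0.3, 0.4, 0.95]
print(pearson(gold, predicted), spearman(gold, predicted))

# Values copied from the table above.
reported_pearson = {"cosine": 0.9378, "manhattan": 0.971, "euclidean": 0.9711, "dot": 0.942}
reported_spearman = {"cosine": 0.8943, "manhattan": 0.8969, "euclidean": 0.8966, "dot": 0.8551}
print(max(reported_pearson.values()), max(reported_spearman.values()))
```

In the toy data the ordering is preserved exactly, so Spearman reaches 1 while Pearson stays below it.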

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 190
  • per_device_eval_batch_size: 190
  • learning_rate: 5e-06
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • bf16: True
  • batch_sampler: no_duplicates
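
To see what these settings imply for the learning rate, the sketch below combines `learning_rate: 5e-06`, `warmup_ratio: 0.1`, `num_train_epochs: 2`, and the `linear` scheduler type listed among the full hyperparameters. It assumes a single device, a steps-per-epoch count of ceil(dataset size / batch size) using the 106810 samples named in the metadata tags, and a warmup length rounded up from the ratio. `linear_schedule_lr` is a hypothetical helper written for illustration, not a library function.

```python
import math

def linear_schedule_lr(step, base_lr, total_steps, warmup_ratio):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(106810 / 190)
total_steps = steps_per_epoch * 2
warmup_steps = math.ceil(total_steps * 0.1)
print(steps_per_epoch, total_steps, warmup_steps)  # 563 1126 113

for step in (0, 50, warmup_steps, 600, total_steps):
    print(step, linear_schedule_lr(step, 5e-06, total_steps, 0.1))
```

Under these assumptions the rate climbs from 0 to 5e-06 over the first 113 steps and then falls linearly back to 0 at step 1126.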

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 190
  • per_device_eval_batch_size: 190
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

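| Epoch | Step | Training Loss | Validation Loss | sts-dev_spearman_cosine |
|:--|:--|:--|:--|:--|
| 0 | 0 | - | - | 0.8229 |
| 0.0178 | 10 | 0.0545 | - | - |
| 0.0355 | 20 | 0.0556 | - | - |
| 0.0533 | 30 | 0.0502 | - | - |
| 0.0710 | 40 | 0.0497 | - | - |
| 0.0888 | 50 | 0.0413 | - | - |
| 0.1066 | 60 | 0.0334 | - | - |
| 0.1243 | 70 | 0.0238 | - | - |
| 0.1421 | 80 | 0.0206 | - | - |
| 0.1599 | 90 | 0.0167 | - | - |
| 0.1776 | 100 | 0.0146 | 0.0725 | 0.8788 |
| 0.1954 | 110 | 0.0127 | - | - |
| 0.2131 | 120 | 0.0125 | - | - |
| 0.2309 | 130 | 0.0115 | - | - |
| 0.2487 | 140 | 0.0116 | - | - |
| 0.2664 | 150 | 0.0111 | - | - |
| 0.2842 | 160 | 0.0107 | - | - |
| 0.3020 | 170 | 0.0113 | - | - |
| 0.3197 | 180 | 0.0106 | - | - |
| 0.3375 | 190 | 0.0099 | - | - |
| 0.3552 | 200 | 0.0092 | 0.0207 | 0.8856 |
| 0.3730 | 210 | 0.0097 | - | - |
| 0.3908 | 220 | 0.0099 | - | - |
| 0.4085 | 230 | 0.0087 | - | - |
| 0.4263 | 240 | 0.0087 | - | - |
| 0.4440 | 250 | 0.0082 | - | - |
| 0.4618 | 260 | 0.0083 | - | - |
| 0.4796 | 270 | 0.0089 | - | - |
| 0.4973 | 280 | 0.0082 | - | - |
| 0.5151 | 290 | 0.0078 | - | - |
| 0.5329 | 300 | 0.0081 | 0.0078 | 0.8891 |
| 0.5506 | 310 | 0.0081 | - | - |
| 0.5684 | 320 | 0.0072 | - | - |
| 0.5861 | 330 | 0.0084 | - | - |
| 0.6039 | 340 | 0.0083 | - | - |
| 0.6217 | 350 | 0.0078 | - | - |
| 0.6394 | 360 | 0.0077 | - | - |
| 0.6572 | 370 | 0.008 | - | - |
| 0.6750 | 380 | 0.0073 | - | - |
| 0.6927 | 390 | 0.008 | - | - |
| 0.7105 | 400 | 0.0073 | 0.0058 | 0.8890 |
| 0.7282 | 410 | 0.0075 | - | - |
| 0.7460 | 420 | 0.0077 | - | - |
| 0.7638 | 430 | 0.0074 | - | - |
| 0.7815 | 440 | 0.0073 | - | - |
| 0.7993 | 450 | 0.007 | - | - |
| 0.8171 | 460 | 0.0043 | - | - |
| 0.8348 | 470 | 0.0052 | - | - |
| 0.8526 | 480 | 0.0046 | - | - |
| 0.8703 | 490 | 0.0073 | - | - |
| 0.8881 | 500 | 0.0056 | 0.0069 | 0.8922 |
| 0.9059 | 510 | 0.0059 | - | - |
| 0.9236 | 520 | 0.0045 | - | - |
| 0.9414 | 530 | 0.0033 | - | - |
| 0.9591 | 540 | 0.0058 | - | - |
| 0.9769 | 550 | 0.0056 | - | - |
| 0.9947 | 560 | 0.0046 | - | - |
| 1.0124 | 570 | 0.003 | - | - |
| 1.0302 | 580 | 0.0039 | - | - |
| 1.0480 | 590 | 0.0032 | - | - |
| 1.0657 | 600 | 0.0031 | 0.0029 | 0.8931 |
| 1.0835 | 610 | 0.0046 | - | - |
| 1.1012 | 620 | 0.003 | - | - |
| 1.1190 | 630 | 0.0021 | - | - |
| 1.1368 | 640 | 0.0031 | - | - |
| 1.1545 | 650 | 0.0035 | - | - |
| 1.1723 | 660 | 0.0033 | - | - |
| 1.1901 | 670 | 0.0024 | - | - |
| 1.2078 | 680 | 0.0012 | - | - |
| 1.2256 | 690 | 0.0075 | - | - |
| 1.2433 | 700 | 0.0028 | 0.0036 | 0.8945 |
| 1.2611 | 710 | 0.0033 | - | - |
| 1.2789 | 720 | 0.0023 | - | - |
| 1.2966 | 730 | 0.0034 | - | - |
| 1.3144 | 740 | 0.0018 | - | - |
| 1.3321 | 750 | 0.0016 | - | - |
| 1.3499 | 760 | 0.0025 | - | - |
| 1.3677 | 770 | 0.002 | - | - |
| 1.3854 | 780 | 0.0016 | - | - |
| 1.4032 | 790 | 0.0018 | - | - |
| 1.4210 | 800 | 0.003 | 0.0027 | 0.8944 |
| 1.4387 | 810 | 0.0018 | - | - |
| 1.4565 | 820 | 0.0008 | - | - |
| 1.4742 | 830 | 0.0014 | - | - |
| 1.4920 | 840 | 0.0025 | - | - |
| 1.5098 | 850 | 0.0026 | - | - |
| 1.5275 | 860 | 0.0012 | - | - |
| 1.5453 | 870 | 0.001 | - | - |
| 1.5631 | 880 | 0.001 | - | - |
| 1.5808 | 890 | 0.0012 | - | - |
| 1.5986 | 900 | 0.0021 | 0.0021 | 0.8952 |
| 1.6163 | 910 | 0.0016 | - | - |
| 1.6341 | 920 | 0.0008 | - | - |
| 1.6519 | 930 | 0.0008 | - | - |
| 1.6696 | 940 | 0.0009 | - | - |
| 1.6874 | 950 | 0.0004 | - | - |
| 1.7052 | 960 | 0.0003 | - | - |
| 1.7229 | 970 | 0.0007 | - | - |
| 1.7407 | 980 | 0.0007 | - | - |
| 1.7584 | 990 | 0.0011 | - | - |
| 1.7762 | 1000 | 0.0007 | 0.0029 | 0.8952 |
| 1.7940 | 1010 | 0.0008 | - | - |
| 1.8117 | 1020 | 0.001 | - | - |
| 1.8295 | 1030 | 0.0006 | - | - |
| 1.8472 | 1040 | 0.0006 | - | - |
| 1.8650 | 1050 | 0.0015 | - | - |
| 1.8828 | 1060 | 0.0009 | - | - |
| 1.9005 | 1070 | 0.0005 | - | - |
| 1.9183 | 1080 | 0.0006 | - | - |
| 1.9361 | 1090 | 0.0021 | - | - |
| 1.9538 | 1100 | 0.0009 | 0.0023 | 0.8943 |
| 1.9716 | 1110 | 0.0007 | - | - |
| 1.9893 | 1120 | 0.0003 | - | - |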
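
The loss values logged above come from the objective named in the metadata tags, `loss:CosineSimilarityLoss`. As commonly described, that loss is the mean squared error between the cosine similarity of each embedding pair and its target score. The pure-Python sketch below illustrates that definition under this assumption. The function names and the toy 2-dimensional pairs are hypothetical, and the real loss operates on batched tensors.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(pairs, labels):
    """Mean squared error between each pair's cosine similarity and its target."""
    errors = [(cosine(u, v) - y) ** 2 for (u, v), y in zip(pairs, labels)]
    return sum(errors) / len(errors)

# Toy embedding pairs with target similarity scores (illustrative only).
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical vectors, cosine 1.0
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal vectors, cosine 0.0
]
labels = [1.0, 0.5]

loss = cosine_similarity_loss(pairs, labels)
print(loss)  # 0.125
```

The first pair matches its target exactly and contributes 0; the second misses its target by 0.5 and contributes 0.25, giving a mean of 0.125.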

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.3
  • PyTorch: 2.2.0+cu121
  • Accelerate: 0.31.0
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}