SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
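
The three modules above can be read as a pipeline: the transformer produces one 1024-dimensional vector per token, the pooling layer keeps only the first (CLS) token's vector because pooling_mode_cls_token is the only mode set to True, and Normalize() rescales that vector to unit length. The following is a minimal pure-Python sketch of the last two steps on toy 4-dimensional data. The names cls_pool and l2_normalize are illustrative assumptions, not functions from the library.

```python
import math

def cls_pool(token_embeddings):
    """Return the embedding at the first (CLS) token position."""
    return token_embeddings[0]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy stand-in for transformer output: 3 tokens, 4 dimensions each
# (the real model emits 1024 dimensions per token).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # CLS token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because the output has unit length, the cosine similarity of two such embeddings equals their dot product.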

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("m7n/bge-m3-philosophy-triplets_v1")
# Run inference
sentences = [
    "structural affinity between the case study as a genre of writing and the question of gendered subjectivity. With John Forrester's chapter 'Inventing Gender Identity: The Case of Agnes' as my starting point, I ask how the case of",
    "'Agnes' continues to inform our understanding of different disciplinary approaches (sociological and psychoanalytic) to theorizing gender. I establish a conversation between distinct, psychoanalytically informed feminisms (Simone",
    'consideration, Oliver unravels the consequences of this strange chiasmus-the resymbolization of the body and the embodiment of the Symbolic-for psychoanalysis, feminism, linguistics, ethics, and political theory. Although it draws on a variety of discourses ranging from philosophy to religion, from aesthetics to politics, Reading Kristeva privileges in a certain way the psychoanalytic framework as it focuses on Kristeva\'s most psychoanalytic texts from the 1980s and early 1990s. Accounting for Kristeva\'s interventions and revisions of psychoanalytic theory, Reading Kristeva points to the crucial differences not only between Kristeva and Jacques Lacan, but also between Kristeva and other French feminists, especially Luce Irigaray and Helene Cixous. The main challenge to the psychoanalytic theory, Oliver argues, lies in Kristeva\'s claim that the maternal function prefigures the oedipal structure and at the same time prevents its closure. The nodal points of these pre-oedipal relations are constituted by the narcissistic subject, the abject maternal body (constituting the pattern of rejection and negation), and the imaginary father (setting up the pattern of reduplication and identification). Reading Kristeva offers us many engaging and original readings of the difficult moments in Kristeva\'s work. One can mention, for instance, an excellent account of the structure of the primary narcissism, which, as the original displacement to the place of the Other, sets up the logic of reduplication and "the possibility of metaphorical shifting" (74). Yet probably the most original contribution of Oliver\'s book to feminist psychoanalytic theory lies in its re-interpretation of the imaginary father, one of',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Triplet

Metric              Value
cosine_accuracy     0.8085
dot_accuracy        0.1915
manhattan_accuracy  0.8085
euclidean_accuracy  0.8085
max_accuracy        0.8085
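
Triplet accuracy is usually understood as the fraction of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative under a given distance. The sketch below shows that computation with cosine distance on toy 2-dimensional vectors. The helper triplet_accuracy is a hypothetical illustration, not the library's evaluator.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def triplet_accuracy(triplets, distance):
    """Fraction of triplets where the positive is closer than the negative."""
    correct = sum(1 for a, p, n in triplets if distance(a, p) < distance(a, n))
    return correct / len(triplets)

toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive closer: correct
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive closer: correct
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # negative closer: incorrect
    ([1.0, 0.0], [0.7, 0.7], [-1.0, 0.0]),  # positive closer: correct
]

accuracy = triplet_accuracy(toy_triplets, cosine_distance)
print(accuracy)  # 0.75
```

For unit-length embeddings, squared Euclidean distance equals 2 − 2·cos, so cosine and Euclidean distance rank triplets identically, which is consistent with the matching values above. As an observation rather than a documented fact: the reported dot_accuracy equals 1 − 0.8085, which is what one would expect if the dot product (a similarity, where larger means closer) were compared like a distance.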

Training Details

Training Dataset

Unnamed Dataset

  • Size: 5,000 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean           max
    anchor    string  16 tokens  276.83 tokens  597 tokens
    positive  string  12 tokens  276.4 tokens   571 tokens
    negative  string  25 tokens  295.99 tokens  607 tokens
  • Samples:
    anchor positive negative
    have argued from broadly conciliationist premises that we should not. If they are right, we philosophers face a dilemma: if we believe our views, we are irrational; if we do not, we are not sincere in holding them. This paper offers a way out, proposing an attitude we can rationally take toward our views that can support sincerity of the appropriate sort. We should arrive at our views via a certain sort of the subtle weighing of various factors involved in being responsive to ail aspects of a complex issue. He is likely to attach too much or too little weight to a single principle or a single distinction. And in matters of public dispute, it is the sensibility of the average person rather than the trained philosopher that seems most relevant. In this paper I will explore the possibility that the relevant criterion of "rational" or "reasonable" belief can be derived not from social science, clinical psychology, or philosophical dialectics, but from the rhetorical tradition stemming from Aristotle. Actually, Clifford seems to point in this direction when he writes, No one man's belief is . . . a private matter which concerns himself alone. Our lives are guided by that generai conception of the course of things which has been created by society for social purposes.11 I will assume that actions that tend to harm the interests of others are prima fade immoral, and should prima fade be restricted by society, without trying to defìne "harm," "interest," or "immoral."12 (I will also leave aside the difficult issue of actions causing harm only to oneself.) I will assume that belief s about justice and social groups are sufficiently voluntary that we can rightly be held responsible for them. This seems reasonable, since such belief s rest on évidence toward which each person must take up an attitude of acceptance, rejection, or something in between. I will not be concerned with whether the belief s are true or false, but with whether the act of
    nature, and its effort to search for the truth is obscured by the passions. The inherent capacity of the soul for self-realization is also obstructed by the veil of karma.4 It is subjected to the forces of karma, which express themselves, first, through the feelings and emotions and, secondly, in the chains of very subtle kinds of matter invisible to the eye and all ordinary instruments of knowledge. It is then embodied and is affected by the environment-physical, social, and spiritual. Thus, various typeg of soul existence come into being. Karma, according to the Jainas, is material in nature. It is matter in a subtle form and is a substantive force. It is constituted of finer particles of matter. The kind of matter fit to manifest karma is everywhere in the universe. It has the special property of developing the effects of merit and demerit. By its activity due to contact with the physical world, the soul becomes penetrated 2 Ibid., p. 15. 3 Dravya-sthagraha, II. 4 Umisaviti, Tattvarthadhigama-siftra, J. L. Jaini, trans. (Arrah: The Central Jaina Publishing House, 1920). KARMA IN JAINA PHILOSOPHY 231 with particles of karmic body (karma-sartra), which are constantly attached to the soul until the soul succeeds in freeing itself from the body. "Nowhere has the physical nature of karma been asserted with more stress than in Jainism."5 A moral fact produces a psychophysical quality, a real and not merely a symbolic mark, affecting the soul in its physical nature. This point of view has been worked out in detail in the form of mathematical calculation in the Karma-grantha. The Jaina tradition distinguishes two aspects: (1) the physical aspect (dravya-karma) and (2) the psychic aspect (bhavakarma). The physical aspect comprises the particles of karma accruing to the soul and polluting it. 
The psychic aspect is primarily the mental states and events arising out of the activity of mind, body, and speech--they are like the mental traces of the actions, since we experience the mnemic traces long after the experienced conscious states vanish. Physical karma and psychic karma are mutually related as cause and effect." The distinction between the physical and the psychic aspects of karma is psychologically significant, since it presents the interaction of the bodily and the mental due to the incessant activity of the soul. This bondage of the soul to karma is of four types, according to its nature (prakrti), duration (sthiti), intensity (anubhaga, rasa), and quantity (pradeda) . Karma can be distinguished into eight types: (1) finanavaran~iya, that which obscures right knowledge; (2) darianavaraniya, that which obscures right intuition; (3) vedaniya, that which arouses affective states such as feelings and emotions ; (4) mohaniya, that which deludes right faith; (5) dyu-karma, that which determines the age of the individual; (6) nama-karma, that which produces various circumstances collectively making up an individual existence, such as the body and other special qualities of individuality; (7) gotra-karma, that which determines the family, social standing, etc., of the individual; (8) antardya-karma, that which obstructs the was that even the gods were subject to the inexorable law of Karma. Of the schools based on the Veda, the Nyaya-Vai§esika system, which is mainly concerned with logic and dialectics, may be described as realistic. It has an interesting atomic theory, and regards the physical universe as ultimately consisting of an indefinite number of atoms of four types, plus three infinite and pervasive entities-ether (dkAsa, regarded as the substratum of sound), time, and space. This system regards the whole and its parts as quite distinct and postulates a special relation (samavaya, "inherence") between them, which is described by Mr. 
Hiriyanna as "a metaphysical fiction." The same relationship is supposed to obtain between a universal and the particulars which it characterizes. Universals in this doctrine are regarded as eternal and independently real, not as transient configurations of particular objects (Jain view) or as purely conceptual (Buddhist view). 267 PHILOSOPHY The Sankhya and Yoga schools form another composite system, which regards both matter and spirit as ultimately real and admits a plurality of selves. It differs from the Nyaya-VaiSesika in tracing the whole of the physical universe to a single source called Prakrti. Purusa and Prakrti, or spirit and nature, are the two basic conceptions of the doctrine (p. 107). Spirit without nature (or "matter") is inoperative and nature without spirit is blind. The knowledge of the ultimate separateness of these two principles is stated to be the means to release. The philosophical
    the hundredth anniversary of the publication of Nishida Kitaro's An Inquiry into the Good. The following is an English version of a talk delivered on that occasion. In it I have tried to argue against the widely held view that this maiden work contains the germ of Nishida's mature philosophy, and at the same time to suggest that an early strain of ambiguity the origins of this important work, a text often seen as marking the beginning of Modern Japanese philosophy. I will show that while Buddhism is an important part of Nishida's early intellectual development, there is ample biographical and textual evidence to suggest that zen no kenkyu is at its core a text which attempts to solve key ethical problems via a modern interpretation of concepts
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
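
With these parameters, the per-triplet loss is commonly written as max(d(a, p) − d(a, n) + margin, 0), where d is the Euclidean distance and the margin is 5. The following is a minimal sketch on toy unit vectors, assuming that formula. The function triplet_loss is illustrative and is not the library class.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=5.0):
    """Hinge on (distance to positive - distance to negative + margin)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)

anchor = [1.0, 0.0]
positive = [1.0, 0.0]   # identical to the anchor: distance 0
negative = [-1.0, 0.0]  # opposite unit vector: distance 2, the maximum

best_case = triplet_loss(anchor, positive, negative)
worst_case = triplet_loss(anchor, negative, positive)  # roles swapped
print(best_case, worst_case)  # 3.0 7.0
```

If the embeddings are unit-normalized during training, as the Normalize() module in the architecture suggests, all distances lie in [0, 2], so the loss stays between 3 and 7 and cannot reach zero with a margin of 5. Under that assumption, a loss near 4.33, as in the later rows of the Training Logs, would correspond to negatives lying on average about 0.67 farther from the anchor than positives.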
    

Evaluation Dataset

Unnamed Dataset

  • Size: 1,000 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean           max
    anchor    string  13 tokens  279.94 tokens  554 tokens
    positive  string  15 tokens  279.37 tokens  527 tokens
    negative  string  17 tokens  298.26 tokens  506 tokens
  • Samples:
    anchor positive negative
    Y involves not only fitting particular curves from some given hypothesis space to the data but also making ‘higher’ level decisions about which general family or functional form (linear, quadratic, etc.) is most appropriate. There may be a still higher level allowing choice between expansions in polynomials and expansions in Fourier series. At the lowest level of the hierarchical model representing curve fitting, theories T 0 specify specific curves, such as } $y=2x+3$ or } $y=x^{2}-4$ , that we fit to the data. At the next level of the hierarchy, theories T 1 are distinguished by the maximum degree of the polynomial they assign to curves in the low‐level hypothesis space. For instance, T 1 could be the theory Poly1, with maximum polynomial degree 1. An alternative T 1 is Poly2, with maximum polynomial degree 2, and so on. At a higher level, there are two possible theories that specify that T 1 theories are either polynomials or Fourier series, respectively. The model also specifies the conditional probabilities } $p( T_{0} T_{1}) $ and } $p( T_{1} T_{2}) $ . At each level of the HBM, the alternative theories are mutually exclusive. In this example, Poly1 and Poly2 are taken to be mutually exclusive alternatives. We will see soon how this should be understood. We now suggest that HBMs are particularly apt models in certain respects of scientific inference. They provide a natural way to represent a broadly Kuhnian picture of the structure and dynamics of scientific theories. Let us first highlight some of the key features of the structure and dynamics of scientific theories to which historians and philosophers with a historical orientation (Kuhn 1962; Lakatos 1978; Laudan 1978) have been particularly attentive and for which HBMs provide a natural model. 
It has been common in philosophy of science, particularly in this tradition, to distinguish at least two levels of hierarchical structure: a higher level consisting of a paradigm, research program, or research tradition and a lower level of more specific theories or hypotheses. Paradigms, research programs, and research traditions have been invested with a number of different roles. Kuhn’s paradigms, for instance, may carry with them a commitment to specific forms of instrumentation and to general theoretical goals and methodologies, such as an emphasis on quantitative prediction or a distaste for unobservable entities. However, one of the primary functions of paradigms and their like is to contain what we will call ‘framework theories’, which comprise abstract what they focus primarily on what Prof. Kuhn had said PHILOSOPHICAL PROBLEMS 119 about the products of scientific communities scientific theories and the empirical claims associated with them. Other aspects of his theory dealing with the scientific communities are however peripherally touched. In particular, both Prof. Stegm?ller and I, in somewhat different ways, try to explain what it is for a person 'zu verf?gen ?ber' or 'to have' a theory. I have explained my conception of logical reconstruction of physical theories and the extent of its normative aspect. ([4], p. 4). I still believe this account of the matter to be correct and I now believe the account applies as well to logical reconstructions of theories in the science of science. I think the principal consideration is faithfulness to the 'existing exposition' of the theory. Within this, normative considerations of logical consistency, clarity and systematic elegance operate. Only at doubtful points where the existing exposition is ambiguous or unclear should normative consideration dominate the existing exposition. 
This means that in reconstructing a theory of science we are primarily concerned with exhibiting what the theory tells us about the way scientific communities work in particular, but not exclusively, what it tells us about how their products change over time. Whether the theory's account is true, whether it agrees with some preconceived account of 'scientific rationality', and whether it suggests some 'better' alternatives for meeting society's infor? mation needs are all different and distinct questions. The first and last, at least, are obviously interesting. 2. THE PRODUCTS
    obligation'O'-signify an all-things-considered obligation. This claim is harmless if it simply expresses our intention to call only all-things-considered moral requirements "duties" or "obligations" and to treat 'prima facie obligation' as a technical term. But I think that more than this is usually intended by those who deny that prima facie obligations are genuine obligations, and their denial rests on a misunderstanding of prima facie obligations that it is important to avoid. These writers sometimes say that prima facie obligations are merely apparent obligations such that they have no moral force if overridden.7 But this does not fit our understanding of prima facie obligations or Ross's. As Ross points out, we should not understand prima facie obligations as the epistemic claim that certain things appear to be obligatory that may not prove to be.8 This reading does not imply that there is any moral reason supporting x corresponding to the prima facie obligation to do x. Rather, prima facie obligations should be given a metaphysical reading that recognizes prima facie obligations as moral forces that are not canceled by the existence of other moral forces even if the latter override or defeat the former.9 Now Ross does say that prima facie duties are conditional duties 6Foot recognizes genuine obligations that may be overridden (type-i obligations) and distinguishes them from the obligation associated with what there is the most moral reason to do (type-2 obligations), and so recognizes something like the distinction that I intend between prima facie and all-things-considered obligations. But she seems to treat prima facie obligations epistemically or statistically (see text below) and so does not want to equate the type-1/type-2 distinction with the prima facie/all-thingsconsidered distinction. See Philippa Foot, "Moral Realism and Moral Dilemma," reprinted in Moral Dilemmas, ed. C. Gowans (New York: Oxford University Press, 1987), 256-57. 
Because I reject these readings of prima facie obligations, our distinctions are similar. 7See Bernard Williams, "Ethical Consistency," reprinted in Moral Dilemmas, ed. Gowans, 125, 126; Bas van Fraassen, "Values and the Heart's Command," ibid., 141, 142; Ruth Barcan Marcus, "Moral Dilemmas and Consistency," ibid., 191; Foot, "Moral Realism and Moral Dilemma," 257. 8The Right and the Good, 20. 90n the metaphysical reading, a prima facie obligation expresses a pro tanto moral obligation or moral reason. 218 MORAL CONFLICT AND ITS STRUCTURE and not duties proper.'0 This, I believe, reflects only his decision to reserve the terms 'duty' and 'obligation' for all-things-considered moral claims. If we concede this to him, then we can explain most of his claims about prima facie obligations on our model. Prima facie obligations are conditional (all-things-considered) duties in the sense that if all else is equal, then there is not only a prima facie obligation to do x but also a genuine or all-things-considered obligation. Sometimes Ross says that prima facie obligations refer to features of an act that tend to make acts of that type (all-things-considered) obligatory." This claim admits of a purely statistical reading: though there may be nothing about this token act a situation, and it can still be right to break the promise. This is because two prima facie duties can come into conflict. We may, for example, have promised to meet a friend for lunch, but meet a stranger in dire need of help along the way. In such a case, there will be a conflict of prima facie duties: it would be prima facie right to keep the promise, but it is also prima facie right to help those in need when we are able. In such a case, the right thing to do may very well be to help the stranger, and thus break our promise to our friend. One prima facie duty, therefore, can be overridden by another. Even when a prima facie duty is overridden, however, it still retains its force. 
Our judgment that, overall, it is right to break our promise does not mean that promise-breaking, in this case, does give us some reason to think the action wrong. It simply doesn't give us enough of a reason. To borrow Robert Audi's phrase, Dancy interprets prima facie duties as "ineradicable but overridable." (Audi, 1997, p. 35) This, it turns out, is what makes Ross a generalist. As Dancy writes, It is clearly a generalist account, in that it maintains that what is a reason here must be the same reason everywhere. (Dancy, 1993, p. 96) 6 The most important source for Ross's theory is (Ross, 1930). For a later statement congruent with these central claims see: (Ross,
    over another's duties grounds rights. The Will Theory has commonly been objected to on the grounds that it undergenerates right-ascriptions along three fronts. This paper systematically examines a range of positions open to the Will Theory in response to these counterexamples, while being faithful to the Will Theory's focus on normative control. It argues that of the seemingly plausible ways the defender of the Will Theory can proceed, one monstrous to admit as a subjective determinant of the will any element which has not intelligible roots in the character of the agent. An act of will which does not spring from the self's character, it is said, is obviously not the self's act at all. It is of no more use to the wise Libertarian than to the Determinist. This may fairly be said to have established itself as a philosophical cliche. It is also, as I believe, and as I have argued more than once elsewhere, a devastating error which has played havoc withl the whole free will controversy. My purpose at the moment, however, is merely to point out that here, in the climate of philosophical opinion, there has been an additional encouragement to the psychologist to give a preference to one of the two rival hypotheses concerning the experience of will-effort. It is, I hope, not unfair to suggest that psychologists have often approached the analysis of the experience of will-effort with a rather definite expectation of finding that, even from the standpoint of psychology, there is nothing which lends countenance to the notion of a form of mental energy which, while not intelligibly rooted in character, can yet influence the act of choice. One further word before commencing consideration of the more important of the psychological analyses which proceed along what, for the sake of a convenient label, we may call " Determinist " lines. We ought to be clear at the outset about the fundamental requirement which any such analysis
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • learning_rate: 1e-05
  • num_train_epochs: 5
  • warmup_ratio: 0.1
  • batch_sampler: no_duplicates
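
Together with the 5,000 training samples and gradient_accumulation_steps: 1 listed in this card, these settings imply a step count and a warmup length. The sketch below works through that arithmetic under the assumption of a single training device, and uses the usual linear-warmup, linear-decay shape for lr_scheduler_type: linear. The helper lr_at is illustrative only.

```python
import math

train_samples = 5000
batch_size = 4       # per_device_train_batch_size, one device assumed
num_epochs = 5
warmup_ratio = 0.1
base_lr = 1e-05

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

def lr_at(step):
    """Learning rate at a given optimizer step (linear warmup, linear decay)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

print(steps_per_epoch, total_steps, warmup_steps)  # 1250 6250 625
print(100 / steps_per_epoch)                       # 0.08
```

The first row of the Training Logs (epoch 0.08 at step 100) agrees with 1,250 steps per epoch, and the last logged row (step 2779) corresponds to epoch 2779 / 1250 = 2.2232.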

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  Evaluation Loss  all-nli-test_max_accuracy
0.08    100   4.8908         4.8563           -
0.16    200   4.8672         4.8117           -
0.24    300   4.8049         4.7065           -
0.32    400   4.7156         4.5210           -
0.4     500   4.5615         4.4572           -
0.48    600   4.5355         4.4548           -
0.56    700   4.589          4.4488           -
0.64    800   4.5506         4.4304           -
0.72    900   4.4665         4.4323           -
0.8     1000  4.5033         4.4068           -
0.88    1100  4.5526         4.4300           -
0.96    1200  4.5195         4.4004           -
1.04    1300  4.4698         4.3785           -
1.12    1400  4.4466         4.4032           -
1.2     1500  4.4429         4.3731           -
1.28    1600  4.4364         4.3455           -
1.36    1700  4.4631         4.3660           -
1.44    1800  4.3781         4.3577           -
1.52    1900  4.442          4.3767           -
1.6     2000  4.4354         4.3541           -
1.68    2100  4.3309         4.3393           -
1.76    2200  4.3784         4.3350           -
1.84    2300  4.403          4.3271           -
1.92    2400  4.3733         4.3328           -
2.0     2500  4.3256         4.3385           -
2.08    2600  4.3109         4.3845           -
2.16    2700  4.3712         4.3043           -
2.2232  2779  -              -                0.8085

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification}, 
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Safetensors

  • Model size: 568M params
  • Tensor type: F32