--- language: [] library_name: sentence-transformers tags: - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - dataset_size:89218 - loss:MultipleNegativesRankingLoss base_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 datasets: [] metrics: - cosine_accuracy@1 - cosine_accuracy@3 - cosine_accuracy@5 - cosine_accuracy@10 - cosine_precision@1 - cosine_precision@3 - cosine_precision@5 - cosine_precision@10 - cosine_recall@1 - cosine_recall@3 - cosine_recall@5 - cosine_recall@10 - cosine_ndcg@10 - cosine_mrr@10 - cosine_map@100 - dot_accuracy@1 - dot_accuracy@3 - dot_accuracy@5 - dot_accuracy@10 - dot_precision@1 - dot_precision@3 - dot_precision@5 - dot_precision@10 - dot_recall@1 - dot_recall@3 - dot_recall@5 - dot_recall@10 - dot_ndcg@10 - dot_mrr@10 - dot_map@100 widget: - source_sentence: Pulmonary stenoses, brachytelephalangy, inner ear deafness sentences: - "This article needs more medical references for verification or relies too heavily\ \ on primary sources. Please review the contents of the article and add the appropriate\ \ references if you can. Unsourced or poorly sourced material may be challenged\ \ and removed. \nFind sources: \"Chondropathy\" – news · newspapers · books ·\ \ scholar · JSTOR (October 2020) \n \nChondropathy \nSpecialtyOrthopedics \ \ \n \nChondropathy refers to a disease of the cartilage. It is frequently divided\ \ into 5 grades, with 0-2 defined as normal and 3-4 defined as diseased.\n\n##\ \ Contents\n\n * 1 Some common diseases affecting/involving the cartilage\n \ \ * 2 Repairing articular cartilage damage\n * 3 References\n * 4 External links\n\ \n## Some common diseases affecting/involving the cartilage[edit]" - 'A number sign (#) is used with this entry because of evidence that Keutel syndrome (KTLS) is caused by homozygous mutation in the gene encoding the human matrix Gla protein (MGP; 154870) on chromosome 12p12. 
Description Keutel syndrome is an autosomal recessive disorder characterized by multiple peripheral pulmonary stenoses, brachytelephalangy, inner ear deafness, and abnormal cartilage ossification or calcification (summary by Khosroshahi et al., 2014). Clinical Features' - '## Description Primary or spontaneous detachment of the retina occurs due to underlying ocular disease and often involves the vitreous as well as the retina. The precipitating event is formation of a retinal tear or hole, which permits fluid to accumulate under the sensory layers of the retina and creates an intraretinal cleavage that destroys the neurosensory process of visual reception. Vitreoretinal degeneration and tear formation are painless phenomena, and in most cases, significant vitreoretinal pathology is found only after detachment of the retina starts to cause loss of vision or visual field. Without surgical intervention, retinal detachment will almost inevitably lead to total blindness (summary by McNiel and McPherson, 1971). Clinical Features' - source_sentence: APS, catastrophic, diagnostic criteria, treatment options sentences: - 'A number sign (#) is used with this entry because of evidence that myofibrillar myopathy-8 (MFM8) is caused by homozygous or compound heterozygous mutation in the PYROXD1 gene (617220) on chromosome 12p12. Description Myofibrillar myopathy-8 is an autosomal recessive myopathy characterized by childhood onset of slowly progressive proximal muscle weakness and atrophy resulting in increased falls, gait problems, and difficulty running or climbing stairs. Upper and lower limbs are affected, and some individuals develop distal muscle weakness and atrophy. Ambulation is generally preserved, and patients do not have significant respiratory compromise. Muscle biopsy shows a mix of myopathic features, including myofibrillar inclusions and sarcomeric disorganization (summary by O''Grady et al., 2016). 
For a general phenotypic description and a discussion of genetic heterogeneity of myofibrillar myopathy, see MFM1 (601419). Clinical Features' - "Rectal tenesmus \nSpecialtyGeneral surgery \n \nRectal tenesmus is a feeling\ \ of incomplete defecation. It is the sensation of inability or difficulty to\ \ empty the bowel at defecation, even if the bowel contents have already been\ \ evacuated. Tenesmus indicates the feeling of a residue, and is not always correlated\ \ with the actual presence of residual fecal matter in the rectum. It is frequently\ \ painful and may be accompanied by involuntary straining and other gastrointestinal\ \ symptoms. Tenesmus has both a nociceptive and a neuropathic component.\n\nVesical\ \ tenesmus is a similar condition, experienced as a feeling of incomplete voiding\ \ despite the bladder being empty.\n\nOften, rectal tenesmus is simply called\ \ tenesmus. The term rectal tenesmus is a retronym to distinguish defecation-related\ \ tenesmus from vesical tenesmus.[1]" - "This article needs additional citations for verification. Please help improve\ \ this article by adding citations to reliable sources. Unsourced material may\ \ be challenged and removed. \nFind sources: \"Catastrophic antiphospholipid\ \ syndrome\" – news · newspapers · books · scholar · JSTOR (February 2018) (Learn\ \ how and when to remove this template message) \n \nCatastrophic antiphospholipid\ \ syndrome \nOther namesCatastrophic APS" - source_sentence: Excess cholesterol, foam cells, gallbladder wall changes sentences: - "Cholesterolosis of gallbladder \nMicrograph of cholesterolosis of the gallbladder,\ \ with an annotated foam cell. H&E stain. 
\nSpecialtyGastroenterology \n \n\ In surgical pathology, strawberry gallbladder, more formally cholesterolosis of\ \ the gallbladder and gallbladder cholesterolosis, is a change in the gallbladder\ \ wall due to excess cholesterol.[1]\n\nThe name strawberry gallbladder comes\ \ from the typically stippled appearance of the mucosal surface on gross examination,\ \ which resembles a strawberry. Cholesterolosis results from abnormal deposits\ \ of cholesterol esters in macrophages within the lamina propria (foam cells)\ \ and in mucosal epithelium. The gallbladder may be affected in a patchy localized\ \ form or in a diffuse form. The diffuse form macroscopically appears as a bright\ \ red mucosa with yellow mottling (due to lipid), hence the term strawberry gallbladder.\ \ It is not tied to cholelithiasis (gallstones) or cholecystitis (inflammation\ \ of the gallbladder).[2]\n\n## Contents" - Meningococcal meningitis is an acute bacterial disease caused by Neisseria meningitides that presents usually, but not always, with a rash (non blanching petechial or purpuric rash), progressively developing signs of meningitis (fever, vomiting, headache, photophobia, and neck stiffness) and later leading to confusion, delirium and drowsiness. Neck stiffness and photophobia are often absent in infants and young children who may manifest nonspecific signs such as irritability, inconsolable crying, poor feeding, and a bulging fontanel. Meningococcal meningitis may also present as part of early or late onset sepsis in neonates. The disease is potentially fatal. Surviving patients may develop neurological sequelae that include sensorineural hearing loss, seizures, spasticity, attention deficits and intellectual disability. 
- "Retiform parapsoriasis \nSpecialtyDermatology \n \nRetiform parapsoriasis\ \ is a cutaneous condition, considered to be a type of large-plaque parapsoriasis.[1]\ \ It is characterized by widespread, ill-defined plaques on the skin, that have\ \ a net-like or zebra-striped pattern.[2] Skin atrophy, a wasting away of the\ \ cutaneous tissue, usually occurs within the area of these plaques.[1]\n\n##\ \ See also[edit]\n\n * Parapsoriasis\n * Poikiloderma vasculare atrophicans\n\ \ * List of cutaneous conditions\n\n## References[edit]\n\n 1. ^ a b Lambert\ \ WC, Everett MA (Oct 1981). \"The nosology of parapsoriasis\". J. Am. Acad. Dermatol.\ \ 5 (4): 373–95. doi:10.1016/S0190-9622(81)70100-2. PMID 7026622.\n 2. ^ Rapini,\ \ Ronald P.; Bolognia, Jean L.; Jorizzo, Joseph L. (2007). Dermatology: 2-Volume\ \ Set. St. Louis: Mosby. ISBN 1-4160-2999-0.\n\n## External links[edit]\n\nClassification\n\ \nD\n\n * ICD-10: L41.5\n * ICD-9-CM: 696.2\n\n \n \n * v\n * t\n * e\n\ \nPapulosquamous disorders \n \nPsoriasis\n\nPustular" - source_sentence: Pulmonary hypoplasia, respiratory insufficiency, megaureter, hydronephrosis sentences: - 'A rare fetal lower urinary tract obstruction (LUTO) characterized by closure or failure to develop an opening in the urethra and resulting in obstructive uropathy presenting in utero as megacystis, oligohydramnios or anhydramnios, and potter sequence. ## Epidemiology Prevalence is unknown, but is higher in males than females. ## Clinical description Atresia of urethra often presents on routine antenatal ultrasound with megacystis, oligohydramnios or anhydramnios and sometimes urinary ascites. It may cause fetal death. In cases that survive to birth, additional symptoms include respiratory insufficiency due to pulmonary hypoplasia, megaureter, hydronephrosis and enlarged often cystic and functionally impaired/non-functional dysplastic kidneys as well as abdominal distention. 
Furthermore, a Potter sequence can be found due to oligo- or anhydramnios. Patients may present with patent urachus or vesicocutaneous fistula. ## Etiology' - X-linked distal spinal muscular atrophy type 3 is a rare distal hereditary motor neuropathy characterized by slowly progressive atrophy and weakness of distal muscles of hands and feet with normal deep tendon reflexes or absent ankle reflexes and minimal or no sensory loss, sometimes mild proximal weakness in the legs and feet and hand deformities in males. - 'A number sign (#) is used with this entry because Chudley-McCullough syndrome (CMCS) is caused by homozygous or compound heterozygous mutation in the GPSM2 gene (609245) on chromosome 1p13. Description Chudley-McCullough syndrome is an autosomal recessive neurologic disorder characterized by early-onset sensorineural deafness and specific brain anomalies on MRI, including hypoplasia of the corpus callosum, enlarged cysterna magna with mild focal cerebellar dysplasia, and nodular heterotopia. Some patients have hydrocephalus. Psychomotor development is normal (summary by Alrashdi et al., 2011). Clinical Features' - source_sentence: Thyroid-stimulating hormone receptor gene, chromosome 14q31, homozygous mutation sentences: - 'A number sign (#) is used with this entry because dermatofibrosarcoma protuberans is caused in most cases by a specific fusion of the COL1A1 gene (120150) with the PDGFB gene (190040); see 190040.0002. Description Dermatofibrosarcoma protuberans (DFSP) is an uncommon, locally aggressive, but rarely metastasizing tumor of the deep dermis and subcutaneous tissue. It typically presents during early or middle adult life and is most frequently located on the trunk and proximal extremities (Sandberg et al., 2003). Clinical Features DFSP was first described by Taylor (1890). Sirvent et al. 
(2003) stated that, because DFSP is relatively rare, grows slowly, and has a low level of aggressiveness, its clinical significance has been underestimated. In particular, they noted that the existence of pediatric cases has been overlooked. Gardner et al. (1998) described a father and son with dermatofibrosarcoma protuberans. The tumors arose at ages 43 and 14 years, respectively.' - "Visuospatial dysgnosia is a loss of the sense of \"whereness\" in the relation\ \ of oneself to one's environment and in the relation of objects to each other.[1]\ \ Visuospatial dysgnosia is often linked with topographical disorientation.\n\n\ ## Contents\n\n * 1 Symptoms\n * 2 Lesion areas\n * 3 Case studies\n * 4 Therapies\n\ \ * 5 References\n\n## Symptoms[edit]\n\nThe syndrome rarely presents itself\ \ the same way in every patient. Some symptoms that occur may be:" - 'A number sign (#) is used with this entry because of evidence that congenital nongoitrous hypothyroidism-1 (CHNG1) is caused by homozygous or compound heterozygous mutation in the gene encoding the thyroid-stimulating hormone receptor (TSHR; 603372) on chromosome 14q31. Description Resistance to thyroid-stimulating hormone (TSH; see 188540), a hallmark of congenital nongoitrous hypothyroidism, causes increased levels of plasma TSH and low levels of thyroid hormone. Only a subset of patients develop frank hypothyroidism; the remainder are euthyroid and asymptomatic (so-called compensated hypothyroidism) and are usually detected by neonatal screening programs (Paschke and Ludgate, 1997). 
### Genetic Heterogeneity of Congenital Nongoitrous Hypothyroidism' pipeline_tag: sentence-similarity model-index: - name: SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1 results: - task: type: information-retrieval name: Information Retrieval dataset: name: Unknown type: unknown metrics: - type: cosine_accuracy@1 value: 0.1900990099009901 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.5756875687568757 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.7932893289328933 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.8704070407040704 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.1900990099009901 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.19189585625229189 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.15865786578657867 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.08704070407040705 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.1900990099009901 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.5756875687568757 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.7932893289328933 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.8704070407040704 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.526584144074431 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.41522683220700946 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.4194005014371134 name: Cosine Map@100 - type: dot_accuracy@1 value: 0.188998899889989 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.5761826182618262 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.7954895489548955 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.8710671067106711 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.188998899889989 name: Dot Precision@1 - type: dot_precision@3 value: 0.19206087275394204 name: Dot Precision@3 - type: dot_precision@5 value: 0.15909790979097907 name: Dot Precision@5 - type: dot_precision@10 value: 0.08710671067106711 name: 
Dot Precision@10 - type: dot_recall@1 value: 0.188998899889989 name: Dot Recall@1 - type: dot_recall@3 value: 0.5761826182618262 name: Dot Recall@3 - type: dot_recall@5 value: 0.7954895489548955 name: Dot Recall@5 - type: dot_recall@10 value: 0.8710671067106711 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.5265923432373186 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.4149802896956161 name: Dot Mrr@10 - type: dot_map@100 value: 0.41904239679820193 name: Dot Map@100
---

# SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Dot Product

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Thyroid-stimulating hormone receptor gene, chromosome 14q31, homozygous mutation',
    'A number sign (#) is used with this entry because of evidence that congenital nongoitrous hypothyroidism-1 (CHNG1) is caused by homozygous or compound heterozygous mutation in the gene encoding the thyroid-stimulating hormone receptor (TSHR; 603372) on chromosome 14q31.\n\nDescription\n\nResistance to thyroid-stimulating hormone (TSH; see 188540), a hallmark of congenital nongoitrous hypothyroidism, causes increased levels of plasma TSH and low levels of thyroid hormone. Only a subset of patients develop frank hypothyroidism; the remainder are euthyroid and asymptomatic (so-called compensated hypothyroidism) and are usually detected by neonatal screening programs (Paschke and Ludgate, 1997).\n\n### Genetic Heterogeneity of Congenital Nongoitrous Hypothyroidism',
    'Visuospatial dysgnosia is a loss of the sense of "whereness" in the relation of oneself to one\'s environment and in the relation of objects to each other.[1] Visuospatial dysgnosia is often linked with topographical disorientation.\n\n## Contents\n\n * 1 Symptoms\n * 2 Lesion areas\n * 3 Case studies\n * 4 Therapies\n * 5 References\n\n## Symptoms[edit]\n\nThe syndrome rarely presents itself the same way in every patient. Some symptoms that occur may be:',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Information Retrieval

* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value     |
|:--------------------|:----------|
| cosine_accuracy@1   | 0.1901    |
| cosine_accuracy@3   | 0.5757    |
| cosine_accuracy@5   | 0.7933    |
| cosine_accuracy@10  | 0.8704    |
| cosine_precision@1  | 0.1901    |
| cosine_precision@3  | 0.1919    |
| cosine_precision@5  | 0.1587    |
| cosine_precision@10 | 0.087     |
| cosine_recall@1     | 0.1901    |
| cosine_recall@3     | 0.5757    |
| cosine_recall@5     | 0.7933    |
| cosine_recall@10    | 0.8704    |
| cosine_ndcg@10      | 0.5266    |
| cosine_mrr@10       | 0.4152    |
| cosine_map@100      | 0.4194    |
| dot_accuracy@1      | 0.189     |
| dot_accuracy@3      | 0.5762    |
| dot_accuracy@5      | 0.7955    |
| dot_accuracy@10     | 0.8711    |
| dot_precision@1     | 0.189     |
| dot_precision@3     | 0.1921    |
| dot_precision@5     | 0.1591    |
| dot_precision@10    | 0.0871    |
| dot_recall@1        | 0.189     |
| dot_recall@3        | 0.5762    |
| dot_recall@5        | 0.7955    |
| dot_recall@10       | 0.8711    |
| dot_ndcg@10         | 0.5266    |
| dot_mrr@10          | 0.415     |
| **dot_map@100**     | **0.419** |

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 89,218 training samples
* Columns: `queries` and `chunks`
* Approximate statistics based on the first 1000 samples:

|         | queries | chunks |
|:--------|:--------|:-------|
| type    | string  | string |
| details |         |        |

* Samples:

| queries | chunks |
|:--------|:-------|
| Polyhydramnios, megalencephaly, symptomatic epilepsy | A number sign (#) is used with this entry because of evidence that polyhydramnios, megalencephaly, and symptomatic epilepsy (PMSE) is caused by homozygous mutation in the STRADA gene (608626) on chromosome 17q23.<br><br>Clinical Features |
| Polyhydramnios, megalencephaly, STRADA gene mutation | A number sign (#) is used with this entry because of evidence that polyhydramnios, megalencephaly, and symptomatic epilepsy (PMSE) is caused by homozygous mutation in the STRADA gene (608626) on chromosome 17q23.<br><br>Clinical Features |
| Megalencephaly, symptomatic epilepsy, chromosome 17q23 | A number sign (#) is used with this entry because of evidence that polyhydramnios, megalencephaly, and symptomatic epilepsy (PMSE) is caused by homozygous mutation in the STRADA gene (608626) on chromosome 17q23.<br><br>Clinical Features |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:

```json
{
    "scale": 1,
    "similarity_fct": "dot_score"
}
```

### Evaluation Dataset

#### Unnamed Dataset

* Size: 18,180 evaluation samples
* Columns: `queries` and `chunks`
* Approximate statistics based on the first 1000 samples:

|         | queries | chunks |
|:--------|:--------|:-------|
| type    | string  | string |
| details |         |        |

* Samples:

| queries | chunks |
|:--------|:-------|
| Weight loss, anorexia, fatigue, epigastric pain and discomfort | Undifferentiated carcinoma of stomach is a rare epithelial tumour of the stomach that lacks any features of differentiation beyond an epithelial phenotype. The presenting symptoms are usually vague and nonspecific, such as weight loss, anorexia, fatigue, epigastric pain and discomfort, heartburn and nausea, vomiting or hematemesis. Patients may also be asymptomatic. Ascites, jaundice, intestinal obstruction and peripheral lymphadenopathy indicate advanced stages and metastatic spread. |
| Heartburn, nausea, vomiting, hematemesis | Undifferentiated carcinoma of stomach is a rare epithelial tumour of the stomach that lacks any features of differentiation beyond an epithelial phenotype. The presenting symptoms are usually vague and nonspecific, such as weight loss, anorexia, fatigue, epigastric pain and discomfort, heartburn and nausea, vomiting or hematemesis. Patients may also be asymptomatic. Ascites, jaundice, intestinal obstruction and peripheral lymphadenopathy indicate advanced stages and metastatic spread. |
| Ascites, jaundice, intestinal obstruction, peripheral lymphadenopathy | Undifferentiated carcinoma of stomach is a rare epithelial tumour of the stomach that lacks any features of differentiation beyond an epithelial phenotype. The presenting symptoms are usually vague and nonspecific, such as weight loss, anorexia, fatigue, epigastric pain and discomfort, heartburn and nausea, vomiting or hematemesis. Patients may also be asymptomatic. Ascites, jaundice, intestinal obstruction and peripheral lymphadenopathy indicate advanced stages and metastatic spread. |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:

```json
{
    "scale": 1,
    "similarity_fct": "dot_score"
}
```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 32
- `learning_rate`: 2e-05
- `num_train_epochs`: 50
- `warmup_ratio`: 0.1
- `fp16`: True
- `load_best_model_at_end`: True
- `eval_on_start`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 32
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 50
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: True
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: True
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
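
To make the interaction of `learning_rate: 2e-05`, `lr_scheduler_type: linear`, and `warmup_ratio: 0.1` concrete, here is a minimal sketch of a linear warmup followed by a linear decay. It is an illustration, not the trainer's own code: the function name `linear_lr` is hypothetical, rounding the warmup length up with `math.ceil` is an assumption about how the ratio is turned into a step count, and the total of 1000 steps in the example is a made-up value, since this card does not state the total number of optimizer steps.

```python
import math


def linear_lr(step: int, total_steps: int,
              base_lr: float = 2e-5, warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay.

    base_lr and warmup_ratio default to the values listed in this card;
    total_steps must be supplied by the caller (not stated in the card).
    """
    # Assumption: the warmup length is the ratio of total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total = 1000  # hypothetical total number of optimizer steps
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_lr(step, total))
```

Under these assumptions the rate peaks at `2e-05` once the first 10% of steps have elapsed and reaches zero at the final step, so most of a long run happens at well below the nominal learning rate.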
### Training Logs

| Epoch       | Step     | Training Loss | loss       | dot_map@100 |
|:-----------:|:--------:|:-------------:|:----------:|:-----------:|
| 0           | 0        | -             | 1.1605     | 0.2419      |
| 0.1435      | 100      | 1.2016        | -          | -           |
| 0.2869      | 200      | 0.7627        | -          | -           |
| 0.4304      | 300      | 0.5559        | -          | -           |
| 0.5739      | 400      | 0.4541        | -          | -           |
| 0.7174      | 500      | 0.1451        | 0.3600     | 0.3913      |
| 0.8608      | 600      | 0.3841        | -          | -           |
| 1.0057      | 700      | 0.3334        | -          | -           |
| 1.1492      | 800      | 0.3898        | -          | -           |
| 1.2927      | 900      | 0.3576        | -          | -           |
| 1.4362      | 1000     | 0.3563        | 0.2719     | 0.4127      |
| 1.5796      | 1100     | 0.3186        | -          | -           |
| 1.7231      | 1200     | 0.098         | -          | -           |
| 1.8666      | 1300     | 0.3038        | -          | -           |
| 2.0115      | 1400     | 0.2629        | -          | -           |
| 2.1549      | 1500     | 0.3221        | 0.2579     | 0.4155      |
| 2.2984      | 1600     | 0.2936        | -          | -           |
| 2.4419      | 1700     | 0.2867        | -          | -           |
| 2.5854      | 1800     | 0.2614        | -          | -           |
| 2.7288      | 1900     | 0.0716        | -          | -           |
| 2.8723      | 2000     | 0.2655        | 0.2546     | 0.4152      |
| 3.0172      | 2100     | 0.2187        | -          | -           |
| 3.1607      | 2200     | 0.2623        | -          | -           |
| 3.3042      | 2300     | 0.2462        | -          | -           |
| 3.4476      | 2400     | 0.2363        | -          | -           |
| 3.5911      | 2500     | 0.213         | 0.2866     | 0.4227      |
| 3.7346      | 2600     | 0.0487        | -          | -           |
| 3.8780      | 2700     | 0.222         | -          | -           |
| 4.0230      | 2800     | 0.1851        | -          | -           |
| 4.1664      | 2900     | 0.224         | -          | -           |
| 4.3099      | 3000     | 0.2111        | 0.2562     | 0.4215      |
| 4.4534      | 3100     | 0.1984        | -          | -           |
| 4.5968      | 3200     | 0.1707        | -          | -           |
| 4.7403      | 3300     | 0.0331        | -          | -           |
| 4.8838      | 3400     | 0.1896        | -          | -           |
| 5.0287      | 3500     | 0.1548        | 0.2643     | 0.4151      |
| 5.1722      | 3600     | 0.19          | -          | -           |
| 5.3156      | 3700     | 0.1656        | -          | -           |
| 5.4591      | 3800     | 0.1626        | -          | -           |
| 5.6026      | 3900     | 0.1303        | -          | -           |
| 5.7461      | 4000     | 0.0264        | 0.2952     | 0.4186      |
| 5.8895      | 4100     | 0.1563        | -          | -           |
| 6.0344      | 4200     | 0.1286        | -          | -           |
| 6.1779      | 4300     | 0.1436        | -          | -           |
| 6.3214      | 4400     | 0.1352        | -          | -           |
| 6.4648      | 4500     | 0.1344        | 0.2668     | 0.4218      |
| 6.6083      | 4600     | 0.1069        | -          | -           |
| 6.7518      | 4700     | 0.0171        | -          | -           |
| 6.8953      | 4800     | 0.1246        | -          | -           |
| 7.0402      | 4900     | 0.1074        | -          | -           |
| 7.1836      | 5000     | 0.1192        | 0.2837     | 0.4166      |
| 7.3271      | 5100     | 0.1176        | -          | -           |
| 7.4706      | 5200     | 0.111         | -          | -           |
| 7.6141      | 5300     | 0.0889        | -          | -           |
| 7.7575      | 5400     | 0.0202        | -          | -           |
| 7.9010      | 5500     | 0.1059        | 0.2797     | 0.4166      |
| 8.0459      | 5600     | 0.0854        | -          | -           |
| 8.1894      | 5700     | 0.0989        | -          | -           |
| 8.3329      | 5800     | 0.0963        | -          | -           |
| 8.4763      | 5900     | 0.0967        | -          | -           |
| 8.6198      | 6000     | 0.0635        | 0.2974     | 0.4223      |
| 8.7633      | 6100     | 0.0215        | -          | -           |
| 8.9067      | 6200     | 0.0897        | -          | -           |
| 9.0516      | 6300     | 0.0693        | -          | -           |
| 9.1951      | 6400     | 0.0913        | -          | -           |
| 9.3386      | 6500     | 0.0883        | 0.2812     | 0.4171      |
| 9.4821      | 6600     | 0.0849        | -          | -           |
| 9.6255      | 6700     | 0.0525        | -          | -           |
| 9.7690      | 6800     | 0.0196        | -          | -           |
| 9.9125      | 6900     | 0.0799        | -          | -           |
| 10.0574     | 7000     | 0.0603        | 0.2899     | 0.4132      |
| 10.2009     | 7100     | 0.0816        | -          | -           |
| 10.3443     | 7200     | 0.0771        | -          | -           |
| 10.4878     | 7300     | 0.0746        | -          | -           |
| 10.6313     | 7400     | 0.0373        | -          | -           |
| **10.7747** | **7500** | **0.0181**    | **0.3148** | **0.419**   |
| 10.9182     | 7600     | 0.0702        | -          | -           |
| 11.0631     | 7700     | 0.0531        | -          | -           |
| 11.2066     | 7800     | 0.0671        | -          | -           |
| 11.3501     | 7900     | 0.0742        | -          | -           |
| 11.4935     | 8000     | 0.0728        | 0.2878     | 0.4177      |
| 11.6370     | 8100     | 0.0331        | -          | -           |
| 11.7805     | 8200     | 0.0206        | -          | -           |
| 11.9240     | 8300     | 0.0605        | -          | -           |
| 12.0689     | 8400     | 0.05          | -          | -           |
| 12.2123     | 8500     | 0.06          | 0.3169     | 0.4180      |
| 12.3558     | 8600     | 0.0613        | -          | -           |
| 12.4993     | 8700     | 0.0649        | -          | -           |
| 12.6428     | 8800     | 0.0257        | -          | -           |
| 12.7862     | 8900     | 0.0184        | -          | -           |
| 12.9297     | 9000     | 0.055         | 0.3107     | 0.4189      |
| 13.0746     | 9100     | 0.0417        | -          | -           |
| 13.2181     | 9200     | 0.0537        | -          | -           |
| 13.3615     | 9300     | 0.0558        | -          | -           |
| 13.5050     | 9400     | 0.0619        | -          | -           |
| 13.6485     | 9500     | 0.0217        | 0.3140     | 0.4173      |
| 13.7920     | 9600     | 0.0257        | -          | -           |
| 13.9354     | 9700     | 0.0398        | -          | -           |
| 14.0803     | 9800     | 0.041         | -          | -           |
| 14.2238     | 9900     | 0.0451        | -          | -           |
| 14.3673     | 10000    | 0.0485        | 0.3085     | 0.4188      |
| 14.5108     | 10100    | 0.0565        | -          | -           |
| 14.6542     | 10200    | 0.0159        | -          | -           |
| 14.7977     | 10300    | 0.0258        | -          | -           |
| 14.9412     | 10400    | 0.0364        | -          | -           |
| 15.0861     | 10500    | 0.0368        | 0.3144     | 0.4163      |
| 15.2296     | 10600    | 0.0447        | -          | -           |
| 15.3730     | 10700    | 0.0479        | -          | -           |
| 15.5165     | 10800    | 0.0535        | -          | -           |
| 15.6600     | 10900    | 0.0139        | -          | -           |
| 15.8034     | 11000    | 0.0257        | 0.3149     | 0.4151      |
| 15.9469     | 11100    | 0.0324        | -          | -           |
| 16.0918     | 11200    | 0.0374        | -          | -           |
| 16.2353     | 11300    | 0.0339        | -          | -           |
| 16.3788     | 11400    | 0.0423        | -          | -           |
| 16.5222     | 11500    | 0.0512        | 0.3209     | 0.4164      |
| 16.6657     | 11600    | 0.0121        | -          | -           |
| 16.8092     | 11700    | 0.0245        | -          | -           |
| 16.9527     | 11800    | 0.0323        | -          | -           |
| 17.0976     | 11900    | 0.0321        | -          | -           |
| 17.2410     | 12000    | 0.034         | 0.3211     | 0.4140      |
| 17.3845     | 12100    | 0.0387        | -          | -           |
| 17.5280     | 12200    | 0.0482        | -          | -           |
| 17.6714     | 12300    | 0.0096        | -          | -           |
| 17.8149     | 12400    | 0.0252        | -          | -           |
| 17.9584     | 12500    | 0.0299        | 0.3169     | 0.4170      |
| 18.1033     | 12600    | 0.0351        | -          | -           |
| 18.2468     | 12700    | 0.032         | -          | -           |
| 18.3902     | 12800    | 0.0348        | -          | -           |
| 18.5337     | 12900    | 0.0452        | -          | -           |
| 18.6772     | 13000    | 0.0076        | 0.3273     | 0.4158      |
| 18.8207     | 13100    | 0.0241        | -          | -           |
| 18.9641     | 13200    | 0.0277        | -          | -           |
| 19.1090     | 13300    | 0.0331        | -          | -           |
| 19.2525     | 13400    | 0.0264        | -          | -           |
| 19.3960     | 13500    | 0.0311        | 0.3272     | 0.4151      |
| 19.5395     | 13600    | 0.0437        | -          | -           |
| 19.6829     | 13700    | 0.0049        | -          | -           |
| 19.8264     | 13800    | 0.0263        | -          | -           |
| 19.9699     | 13900    | 0.0231        | -          | -           |
| 20.1148     | 14000    | 0.0303        | 0.3293     | 0.4200      |
| 20.2582     | 14100    | 0.0229        | -          | -           |
| 20.4017     | 14200    | 0.032         | -          | -           |
| 20.5452     | 14300    | 0.0395        | -          | -           |
| 20.6887     | 14400    | 0.0045        | -          | -           |
| 20.8321     | 14500    | 0.0244        | 0.3202     | 0.4144      |
| 20.9756     | 14600    | 0.0219        | -          | -           |
| 21.1205     | 14700    | 0.0291        | -          | -           |
| 21.2640     | 14800    | 0.0212        | -          | -           |
| 21.4075     | 14900    | 0.029         | -          | -           |
| 21.5509     | 15000    | 0.0357        | 0.3312     | 0.4147      |
| 21.6944     | 15100    | 0.0025        | -          | -           |
| 21.8379     | 15200    | 0.0252        | -          | -           |
| 21.9813     | 15300    | 0.0229        | -          | -           |
| 22.1263     | 15400    | 0.0261        | -          | -           |
| 22.2697 | 15500 | 0.0198 | 0.3392 | 0.4123 | | 22.4132 | 15600 | 0.0259 | - | - | | 22.5567 | 15700 | 0.0343 | - | - | | 22.7001 | 15800 | 0.0022 | - | - | | 22.8436 | 15900 | 0.0237 | - | - | | 22.9871 | 16000 | 0.0199 | 0.3346 | 0.4146 | | 23.1320 | 16100 | 0.0263 | - | - | | 23.2755 | 16200 | 0.0173 | - | - | | 23.4189 | 16300 | 0.0276 | - | - | | 23.5624 | 16400 | 0.03 | - | - | | 23.7059 | 16500 | 0.0022 | 0.3430 | 0.4195 | | 23.8494 | 16600 | 0.0253 | - | - | | 23.9928 | 16700 | 0.0182 | - | - | | 24.1377 | 16800 | 0.0216 | - | - | | 24.2812 | 16900 | 0.0194 | - | - | | 24.4247 | 17000 | 0.0242 | 0.3335 | 0.4132 | | 24.5681 | 17100 | 0.0289 | - | - | | 24.7116 | 17200 | 0.0013 | - | - | | 24.8551 | 17300 | 0.0253 | - | - | | 24.9986 | 17400 | 0.0137 | - | - | | 25.1435 | 17500 | 0.0219 | 0.3481 | 0.4118 | | 25.2869 | 17600 | 0.017 | - | - | | 25.4304 | 17700 | 0.0261 | - | - | | 25.5739 | 17800 | 0.0298 | - | - | | 25.7174 | 17900 | 0.0013 | - | - | | 25.8608 | 18000 | 0.0257 | 0.3407 | 0.4160 | | 26.0057 | 18100 | 0.014 | - | - | | 26.1492 | 18200 | 0.0215 | - | - | | 26.2927 | 18300 | 0.0161 | - | - | | 26.4362 | 18400 | 0.0228 | - | - | | 26.5796 | 18500 | 0.0246 | 0.3404 | 0.4131 | | 26.7231 | 18600 | 0.0017 | - | - | | 26.8666 | 18700 | 0.0244 | - | - | | 27.0115 | 18800 | 0.0124 | - | - | | 27.1549 | 18900 | 0.019 | - | - | | 27.2984 | 19000 | 0.0151 | 0.3451 | 0.4139 | | 27.4419 | 19100 | 0.0216 | - | - | | 27.5854 | 19200 | 0.0255 | - | - | | 27.7288 | 19300 | 0.0016 | - | - | | 27.8723 | 19400 | 0.0251 | - | - | | 28.0172 | 19500 | 0.0133 | 0.3416 | 0.4109 | | 28.1607 | 19600 | 0.016 | - | - | | 28.3042 | 19700 | 0.0186 | - | - | | 28.4476 | 19800 | 0.0185 | - | - | | 28.5911 | 19900 | 0.0225 | - | - | | 28.7346 | 20000 | 0.0009 | 0.3463 | 0.4144 | | 28.8780 | 20100 | 0.0249 | - | - | | 29.0230 | 20200 | 0.0132 | - | - | | 29.1664 | 20300 | 0.0145 | - | - | | 29.3099 | 20400 | 0.0174 | - | - | | 29.4534 | 20500 | 0.0172 | 0.3425 | 0.4092 | | 
29.5968 | 20600 | 0.0235 | - | - | | 29.7403 | 20700 | 0.0009 | - | - | | 29.8838 | 20800 | 0.0242 | - | - | | 30.0287 | 20900 | 0.0128 | - | - | | 30.1722 | 21000 | 0.0133 | 0.3482 | 0.4131 | | 30.3156 | 21100 | 0.0158 | - | - | | 30.4591 | 21200 | 0.0226 | - | - | | 30.6026 | 21300 | 0.0188 | - | - | | 30.7461 | 21400 | 0.0009 | - | - | | 30.8895 | 21500 | 0.0249 | 0.3483 | 0.4132 | | 31.0344 | 21600 | 0.0116 | - | - | | 31.1779 | 21700 | 0.0117 | - | - | | 31.3214 | 21800 | 0.0162 | - | - | | 31.4648 | 21900 | 0.0184 | - | - | | 31.6083 | 22000 | 0.0178 | 0.3390 | 0.4145 | | 31.7518 | 22100 | 0.0012 | - | - | | 31.8953 | 22200 | 0.0215 | - | - | | 32.0402 | 22300 | 0.014 | - | - | | 32.1836 | 22400 | 0.0105 | - | - | | 32.3271 | 22500 | 0.0131 | 0.3556 | 0.4144 | | 32.4706 | 22600 | 0.0199 | - | - | | 32.6141 | 22700 | 0.0158 | - | - | | 32.7575 | 22800 | 0.0018 | - | - | | 32.9010 | 22900 | 0.0236 | - | - | | 33.0459 | 23000 | 0.0131 | 0.3480 | 0.4136 | | 33.1894 | 23100 | 0.0121 | - | - | | 33.3329 | 23200 | 0.0164 | - | - | | 33.4763 | 23300 | 0.0209 | - | - | | 33.6198 | 23400 | 0.0119 | - | - | | 33.7633 | 23500 | 0.0029 | 0.3575 | 0.4180 | | 33.9067 | 23600 | 0.0201 | - | - | | 34.0516 | 23700 | 0.0121 | - | - | | 34.1951 | 23800 | 0.0109 | - | - | | 34.3386 | 23900 | 0.0132 | - | - | | 34.4821 | 24000 | 0.0203 | 0.3446 | 0.4141 | | 34.6255 | 24100 | 0.0087 | - | - | | 34.7690 | 24200 | 0.0032 | - | - | | 34.9125 | 24300 | 0.0182 | - | - | | 35.0574 | 24400 | 0.0116 | - | - | | 35.2009 | 24500 | 0.0105 | 0.3587 | 0.4117 | | 35.3443 | 24600 | 0.018 | - | - | | 35.4878 | 24700 | 0.0194 | - | - | | 35.6313 | 24800 | 0.0076 | - | - | | 35.7747 | 24900 | 0.0029 | - | - | | 35.9182 | 25000 | 0.0167 | 0.3529 | 0.4156 | | 36.0631 | 25100 | 0.0105 | - | - | | 36.2066 | 25200 | 0.0097 | - | - | | 36.3501 | 25300 | 0.0165 | - | - | | 36.4935 | 25400 | 0.0187 | - | - | | 36.6370 | 25500 | 0.0062 | 0.3517 | 0.4173 | | 36.7805 | 25600 | 0.0034 | - | - | | 36.9240 | 
25700 | 0.0173 | - | - | | 37.0689 | 25800 | 0.0091 | - | - | | 37.2123 | 25900 | 0.0093 | - | - | | 37.3558 | 26000 | 0.0152 | 0.3605 | 0.4147 | | 37.4993 | 26100 | 0.0193 | - | - | | 37.6428 | 26200 | 0.0065 | - | - | | 37.7862 | 26300 | 0.0036 | - | - | | 37.9297 | 26400 | 0.017 | - | - | | 38.0746 | 26500 | 0.009 | 0.3627 | 0.4178 | | 38.2181 | 26600 | 0.0087 | - | - | | 38.3615 | 26700 | 0.0129 | - | - | | 38.5050 | 26800 | 0.0199 | - | - | | 38.6485 | 26900 | 0.0047 | - | - | | 38.7920 | 27000 | 0.0104 | 0.3535 | 0.4191 | | 38.9354 | 27100 | 0.0106 | - | - | | 39.0803 | 27200 | 0.0083 | - | - | | 39.2238 | 27300 | 0.0091 | - | - | | 39.3673 | 27400 | 0.0143 | - | - | | 39.5108 | 27500 | 0.018 | 0.3586 | 0.4137 | | 39.6542 | 27600 | 0.0055 | - | - | | 39.7977 | 27700 | 0.0097 | - | - | | 39.9412 | 27800 | 0.0111 | - | - | | 40.0861 | 27900 | 0.0091 | - | - | | 40.2296 | 28000 | 0.009 | 0.3540 | 0.4166 | | 40.3730 | 28100 | 0.0145 | - | - | | 40.5165 | 28200 | 0.0165 | - | - | | 40.6600 | 28300 | 0.0041 | - | - | | 40.8034 | 28400 | 0.009 | - | - | | 40.9469 | 28500 | 0.0091 | 0.3541 | 0.4159 | | 41.0918 | 28600 | 0.0106 | - | - | | 41.2353 | 28700 | 0.0064 | - | - | | 41.3788 | 28800 | 0.0125 | - | - | | 41.5222 | 28900 | 0.0172 | - | - | | 41.6657 | 29000 | 0.0028 | 0.3550 | 0.4151 | | 41.8092 | 29100 | 0.0097 | - | - | | 41.9527 | 29200 | 0.0086 | - | - | | 42.0976 | 29300 | 0.0099 | - | - | | 42.2410 | 29400 | 0.0064 | - | - | | 42.3845 | 29500 | 0.0127 | 0.3619 | 0.4150 | | 42.5280 | 29600 | 0.0157 | - | - | | 42.6714 | 29700 | 0.0025 | - | - | | 42.8149 | 29800 | 0.0095 | - | - | | 42.9584 | 29900 | 0.0087 | - | - | | 43.1033 | 30000 | 0.0094 | 0.3591 | 0.4153 | | 43.2468 | 30100 | 0.007 | - | - | | 43.3902 | 30200 | 0.0114 | - | - | | 43.5337 | 30300 | 0.0166 | - | - | | 43.6772 | 30400 | 0.0023 | - | - | | 43.8207 | 30500 | 0.01 | 0.3582 | 0.4172 | | 43.9641 | 30600 | 0.0097 | - | - | | 44.1090 | 30700 | 0.01 | - | - | | 44.2525 | 30800 | 0.007 | - | - 
| | 44.3960 | 30900 | 0.0106 | - | - | | 44.5395 | 31000 | 0.0164 | 0.3626 | 0.4151 | | 44.6829 | 31100 | 0.0017 | - | - | | 44.8264 | 31200 | 0.0113 | - | - | | 44.9699 | 31300 | 0.0081 | - | - | | 45.1148 | 31400 | 0.0095 | - | - | | 45.2582 | 31500 | 0.0061 | 0.3669 | 0.4152 | | 45.4017 | 31600 | 0.0111 | - | - | | 45.5452 | 31700 | 0.0157 | - | - | | 45.6887 | 31800 | 0.0015 | - | - | | 45.8321 | 31900 | 0.0109 | - | - | | 45.9756 | 32000 | 0.0085 | 0.3595 | 0.4139 | | 46.1205 | 32100 | 0.0096 | - | - | | 46.2640 | 32200 | 0.0062 | - | - | | 46.4075 | 32300 | 0.0111 | - | - | | 46.5509 | 32400 | 0.017 | - | - | | 46.6944 | 32500 | 0.0013 | 0.3631 | 0.4154 | | 46.8379 | 32600 | 0.0123 | - | - | | 46.9813 | 32700 | 0.0076 | - | - | | 47.1263 | 32800 | 0.0088 | - | - | | 47.2697 | 32900 | 0.0065 | - | - | | 47.4132 | 33000 | 0.0116 | 0.3656 | 0.4148 | | 47.5567 | 33100 | 0.0142 | - | - | | 47.7001 | 33200 | 0.0009 | - | - | | 47.8436 | 33300 | 0.0101 | - | - | | 47.9871 | 33400 | 0.0069 | - | - | | 48.1320 | 33500 | 0.0087 | 0.3643 | 0.4160 | | 48.2755 | 33600 | 0.005 | - | - | | 48.4189 | 33700 | 0.0118 | - | - | | 48.5624 | 33800 | 0.0147 | - | - | | 48.7059 | 33900 | 0.0008 | - | - | | 48.8494 | 34000 | 0.0115 | 0.3632 | 0.4158 | | 48.9928 | 34100 | 0.006 | - | - | | 49.1377 | 34200 | 0.0089 | - | - | | 49.2812 | 34300 | 0.0063 | - | - | | 49.4247 | 34400 | 0.0126 | - | - | | 49.5681 | 34500 | 0.0142 | 0.3643 | 0.4157 | | 49.7116 | 34600 | 0.0008 | - | - | | 49.8551 | 34700 | 0.0137 | - | - | | 49.9986 | 34800 | 0.0044 | 0.3148 | 0.4190 | * The bold row denotes the saved checkpoint.
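A log this long is easier to inspect programmatically than by eye. The following is a minimal sketch, not part of the original training code, that parses rows in the format above and reports which evaluation step has the highest `dot_map@100` and which has the lowest `loss`. It assumes the `loss` column holds the evaluation loss, and it uses only a short excerpt of the table; `LOG_EXCERPT` and `parse_eval_rows` are hypothetical names chosen for illustration. The card does not state which metric selected the saved checkpoint, so the script makes no claim about that.

```python
# Illustrative helper for reading the training log; not the author's tooling.
# LOG_EXCERPT holds a handful of rows copied from the table above.
LOG_EXCERPT = """
| 0.7174 | 500 | 0.1451 | 0.3600 | 0.3913 |
| 0.8608 | 600 | 0.3841 | - | - |
| 2.8723 | 2000 | 0.2655 | 0.2546 | 0.4152 |
| 3.5911 | 2500 | 0.213 | 0.2866 | 0.4227 |
| **10.7747** | **7500** | **0.0181** | **0.3148** | **0.419** |
| 45.2582 | 31500 | 0.0061 | 0.3669 | 0.4152 |
"""


def parse_eval_rows(markdown_rows: str) -> list[dict]:
    """Return only the rows that carry evaluation results."""
    rows = []
    for line in markdown_rows.strip().splitlines():
        # Split on pipes and drop the bold markers used for the saved checkpoint.
        cells = [c.strip().strip("*") for c in line.strip().strip("|").split("|")]
        if len(cells) != 5:
            continue  # not a data row in the expected five-column layout
        epoch, step, _train_loss, eval_loss, dot_map = cells
        if eval_loss == "-" or dot_map == "-":
            continue  # training-only step, no evaluation was run here
        rows.append(
            {
                "epoch": float(epoch),
                "step": int(step),
                "loss": float(eval_loss),  # assumed to be evaluation loss
                "dot_map@100": float(dot_map),
            }
        )
    return rows


eval_rows = parse_eval_rows(LOG_EXCERPT)
best_map = max(eval_rows, key=lambda r: r["dot_map@100"])
lowest_loss = min(eval_rows, key=lambda r: r["loss"])

print(f"highest dot_map@100: step {best_map['step']} ({best_map['dot_map@100']})")
print(f"lowest loss: step {lowest_loss['step']} ({lowest_loss['loss']})")
```

To scan the whole log, paste the full table body into `LOG_EXCERPT`; the header and alignment rows would need to be removed first, since this sketch expects numeric cells.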
### Framework Versions
- Python: 3.11.9
- Sentence Transformers: 3.0.1
- Transformers: 4.43.3
- PyTorch: 2.3.1+cu121
- Accelerate: 0.30.1
- Datasets: 2.19.2
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```