kr-manish committed
Commit: 7de23e7
Parent: f5250ce

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
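
This configuration enables only CLS-token pooling; every other mode is disabled. For illustration, a minimal sketch of the effect using the sentence-transformers `Pooling` module on dummy tensors (the input tensors here are illustrative, not from the repo):

```python
import torch
from sentence_transformers.models import Pooling

# CLS pooling as configured above: keep only the first ([CLS]) token embedding.
pooling = Pooling(word_embedding_dimension=768, pooling_mode="cls")
features = {
    "token_embeddings": torch.randn(2, 4, 768),  # (batch, tokens, hidden) dummy input
    "attention_mask": torch.ones(2, 4, dtype=torch.long),
}
sentence_embeddings = pooling(features)["sentence_embedding"]
print(sentence_embeddings.shape)  # torch.Size([2, 768])
```
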
README.md ADDED
---
base_model: BAAI/bge-base-en-v1.5
datasets: []
language: []
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:76
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: ply C Tris pH8.0 Dextran Trehalose dNTPS Na2 SO4 Triton X-100
  sentences:
  - NE
  - Assignee
  - ID NO
- source_sentence: Certain isothermal amplification methods are able to amplify
    a target nucleic acid from trace levels to very high 25 and detectable levels
    within a matter of minutes. Such isothermal methods, e.g., Nicking and Extension
    Amplifi cation Reaction (NEAR), allow users to detect a particular nucleotide
    sequence in trace amounts, facilitating point-of care testing and increasing
    the accessibility and speed of diagnostics. Streptococcus pyogenes is the causative
    agent of group A streptococcal (GAS) infections such as pharyngitis, impe tigo,
    and life-threatening necrotizing fasciitis and sepsis. The most common GAS infection,
    pharyngitis, can be diagnosed by collecting a throat swab sample from a patient
    and culturing the sample under conditions that would enable bacterial, specifically
    S. pyogenes, growth, which takes 2-3 days. Culturing S. pyogenes is an accurate
    and reliable method of diagnosing GAS, but it is slow. A 2-3 day delay in prescribing
    appropriate antibiotic treatment can result in unnecessary patient suffering
    and potentially the onset oflife threatening conditions such as rheumatic fever.
    In the recent past, biochemical methods have been developed to detect S. pyogenes,
    but these methods do not provide the necessary characteristics to be deployed
    in the point-of-care setting, either due to a lack of sensitivity or time to
    result (speed). Accordingly, a highly sensitive and rapid qualitative assay
    for the detection and diagnosis of a S. pyogenes infection is desired.
  sentences:
  - ABSTRACT
  - TACTGTTCCTGTTTGA
  - BACKGROUND
- source_sentence: W02010/141940 12/2010 CA (US)
  sentences:
  - WO
  - DESCRIPTION OF DRAWINGS
  - SUMMARY
- source_sentence: 9 6 nucleotide sequence at least 80, 85 or 95% identity to SEQ
  sentences:
  - '2'
  - NEAR
  - Ph. Dissertation published Jan. 1, 2008. (Year
- source_sentence: One potential inhibitor of the NEAR technology is human gDNA.
    When a throat swab sample is collected from a patient symptomatic for GAS infection,
    it is possible that human gDNA is also collected on the swab (from immune cells
    such as white blood cells or from local epithelial cells). In order to assess
    the impact that human gDNA has on the GAS assay, a study was performed using
    three different levels of GAS gDNA (25, 250 and 1000 copies) in the presence
    of 0, 10, 50, 100, 250, 500 or 1000 ng of human gDNA. As shown in FIG. 6, the
    presence of human gDNA does have an impact on GAS assay performance, and the
    impact is GAS target concentration dependent. When there is a low copy number
    of target GAS present in the reaction, 10 ng of human gDNA or more significantly
    inhibits the assay. At 250 copies of GAS target, the impact of 10 ng of 60 human
    gDNA is less, and at 1,000 copies of GAS target, the effect of 10 ng of human
    gDNA on the assay is significantly less. In fact, when 1,000 copies of target
    is present in the assay, up to 100 ng of human gDNA can be tolerated, albeit
    with a slower amplification speed and reduced fluorescence signal. Testing of
    the 501 (IC only) mix showed a more robust response to human gDNA. When the
    501 mix was tested in the presence of O copies of target GAS and up to US 10,329,601
    B2 23 1,000 ng of human gDNA, the assay still produced a clearly positive signal
    at 500 ng of human gDNA (even at 1,000 ng of human gDNA the fluorescence signal
    was still above background). Other Embodiments It is to be understood that while
    the invention has been described in conjunction with the detailed description
    24 thereof, the foregoing description is intended to illustrate and not limit
    the scope of the invention, which is defined by the scope of the appended claims.
    Other aspects, advantages, and modifications are within the scope of the following
    claims.
  sentences:
  - '75'
  - TGTAGCTGACACCACCAAGCTACA
  - Impact of Human Genomic DNA (gDNA) on GAS Assay
model-index:
- name: SentenceTransformer based on BAAI/bge-base-en-v1.5
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.625
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.875
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.625
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.29166666666666663
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.625
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.875
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8202007889556063
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7604166666666666
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.7604166666666666
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.625
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.875
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.875
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.625
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.29166666666666663
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.17500000000000002
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.625
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.875
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.875
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.810892117584935
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.75
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.75
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.625
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.875
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.875
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.625
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.29166666666666663
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.17500000000000002
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.625
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.875
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.875
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8057993287946483
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7447916666666666
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.7447916666666666
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.75
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.875
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.75
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.29166666666666663
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.75
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.875
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8608566009043177
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8166666666666667
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8166666666666667
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.75
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.875
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.75
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.29166666666666663
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.75
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.875
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8608566009043177
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8166666666666667
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8166666666666667
      name: Cosine Map@100
---

# SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("kr-manish/fine-tuned-bge-base-raw_pdf-v1")
# Run inference
sentences = [
    'One potential inhibitor of the NEAR technology is human gDNA. When a throat swab sample is collected from a patient symptomatic for GAS infection, it is possible that human gDNA is also collected on the swab (from immune cells such as white blood cells or from local epithelial cells). In order to assess the impact that human gDNA has on the GAS assay, a study was performed using three different levels of GAS gDNA (25, 250 and 1000 copies) in the presence of 0, 10, 50, 100, 250, 500 or 1000 ng of human gDNA. As shown in FIG. 6, the presence of human gDNA does have an impact on GAS assay performance, and the impact is GAS target concentration dependent. When there is a low copy number of target GAS present in the reaction, 10 ng of human gDNA or more significantly inhibits the assay. At 250 copies of GAS target, the impact of 10 ng of 60 human gDNA is less, and at 1,000 copies of GAS target, the effect of 10 ng of human gDNA on the assay is significantly less. In fact, when 1,000 copies of target is present in the assay, up to 100 ng of human gDNA can be tolerated, albeit with a slower amplification speed and reduced fluorescence signal. Testing of the 501 (IC only) mix showed a more robust response to human gDNA. When the 501 mix was tested in the presence of O copies of target GAS and up to US 10,329,601 B2 23 1,000 ng of human gDNA, the assay still produced a clearly positive signal at 500 ng of human gDNA (even at 1,000 ng of human gDNA the fluorescence signal was still above background). Other Embodiments It is to be understood that while the invention has been described in conjunction with the detailed description 24 thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.',
    'Impact of Human Genomic DNA (gDNA) on GAS Assay',
    'TGTAGCTGACACCACCAAGCTACA',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
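
The same embeddings can also be reproduced with plain `transformers`. The following is a minimal sketch (not part of the generated card) that mirrors the architecture above, CLS pooling over `BertModel` outputs followed by L2 normalization:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("kr-manish/fine-tuned-bge-base-raw_pdf-v1")
model = AutoModel.from_pretrained("kr-manish/fine-tuned-bge-base-raw_pdf-v1")

batch = tokenizer(
    ["Impact of Human Genomic DNA (gDNA) on GAS Assay"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**batch)

# Mirror modules (1) and (2) above: take the [CLS] token, then L2-normalize.
embeddings = torch.nn.functional.normalize(outputs.last_hidden_state[:, 0], p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 768])
```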

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.625      |
| cosine_accuracy@3   | 0.875      |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.625      |
| cosine_precision@3  | 0.2917     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.625      |
| cosine_recall@3     | 0.875      |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.8202     |
| cosine_mrr@10       | 0.7604     |
| **cosine_map@100**  | **0.7604** |

#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value    |
|:--------------------|:---------|
| cosine_accuracy@1   | 0.625    |
| cosine_accuracy@3   | 0.875    |
| cosine_accuracy@5   | 0.875    |
| cosine_accuracy@10  | 1.0      |
| cosine_precision@1  | 0.625    |
| cosine_precision@3  | 0.2917   |
| cosine_precision@5  | 0.175    |
| cosine_precision@10 | 0.1      |
| cosine_recall@1     | 0.625    |
| cosine_recall@3     | 0.875    |
| cosine_recall@5     | 0.875    |
| cosine_recall@10    | 1.0      |
| cosine_ndcg@10      | 0.8109   |
| cosine_mrr@10       | 0.75     |
| **cosine_map@100**  | **0.75** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.625      |
| cosine_accuracy@3   | 0.875      |
| cosine_accuracy@5   | 0.875      |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.625      |
| cosine_precision@3  | 0.2917     |
| cosine_precision@5  | 0.175      |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.625      |
| cosine_recall@3     | 0.875      |
| cosine_recall@5     | 0.875      |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.8058     |
| cosine_mrr@10       | 0.7448     |
| **cosine_map@100**  | **0.7448** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.75       |
| cosine_accuracy@3   | 0.875      |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.75       |
| cosine_precision@3  | 0.2917     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.75       |
| cosine_recall@3     | 0.875      |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.8609     |
| cosine_mrr@10       | 0.8167     |
| **cosine_map@100**  | **0.8167** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.75       |
| cosine_accuracy@3   | 0.875      |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.75       |
| cosine_precision@3  | 0.2917     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.75       |
| cosine_recall@3     | 0.875      |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.8609     |
| cosine_mrr@10       | 0.8167     |
| **cosine_map@100**  | **0.8167** |

+ <!--
564
+ ## Bias, Risks and Limitations
565
+
566
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
567
+ -->
568
+
569
+ <!--
570
+ ### Recommendations
571
+
572
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
573
+ -->
574
+
575
+ ## Training Details
576
+
577
+ ### Training Dataset
578
+
579
+ #### Unnamed Dataset
580
+
581
+
582
+ * Size: 76 training samples
583
+ * Columns: <code>positive</code> and <code>anchor</code>
584
+ * Approximate statistics based on the first 1000 samples:
585
+ | | positive | anchor |
586
+ |:--------|:------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
587
+ | type | string | string |
588
+ | details | <ul><li>min: 8 tokens</li><li>mean: 148.89 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 6.54 tokens</li><li>max: 18 tokens</li></ul> |
589
+ * Samples:
590
+ | positive | anchor |
591
+ |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------|
592
+ | <code>1 (AGACTCCATATGGAGTCTAGC CAAACAGGAACA); a reverse template comprising a nucleotide sequence having at least 80, 85 or 95% identity 55 (CGACTCCATATGGAGTC to GAAAGCAATCTGAGGA); and a probe oligonucleotide comprising a nucleotide sequence at least 80, 85 or 95%</code> | <code>to SEQ ID NO</code> |
593
+ | <code>One potential inhibitor of the NEAR technology is human gDNA. When a throat swab sample is collected from a patient symptomatic for GAS infection, it is possible that human gDNA is also collected on the swab (from immune cells such as white blood cells or from local epithelial cells). In order to assess the impact that human gDNA has on the GAS assay, a study was performed using three different levels of GAS gDNA (25, 250 and 1000 copies) in the presence of 0, 10, 50, 100, 250, 500 or 1000 ng of human gDNA. As shown in FIG. 6, the presence of human gDNA does have an impact on GAS assay performance, and the impact is GAS target concentration dependent. When there is a low copy number of target GAS present in the reaction, 10 ng of human gDNA or more significantly inhibits the assay. At 250 copies of GAS target, the impact of 10 ng of 60 human gDNA is less, and at 1,000 copies of GAS target, the effect of 10 ng of human gDNA on the assay is significantly less. In fact, when 1,000 copies of target is present in the assay, up to 100 ng of human gDNA can be tolerated, albeit with a slower amplification speed and reduced fluorescence signal. Testing of the 501 (IC only) mix showed a more robust response to human gDNA. When the 501 mix was tested in the presence of O copies of target GAS and up to US 10,329,601 B2 23 1,000 ng of human gDNA, the assay still produced a clearly positive signal at 500 ng of human gDNA (even at 1,000 ng of human gDNA the fluorescence signal was still above background). Other Embodiments It is to be understood that while the invention has been described in conjunction with the detailed description 24 thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.</code> | <code>Impact of Human Genomic DNA (gDNA) on GAS Assay</code> |
594
+ | <code>25C-4x lyophilization mix, single tube assay format; 50C-2x lyophilization mix, single tube assay format; 25T-4x lyophilization mix, target assay only; 50T-2x lyophilization mix, target assay only; 25I- 4x lyophilization mix, IC assay only; 50I-2x lyophilization mix, internal control (IC) assay only. The GAS NEAR assay can be run on an appropriate platform. For example, the GAS NEAR assay can be run on an Al ere i platform (www.alere.com/ww/en/product-details/ alere-i-strep-a.html). AnA!ere i system consists of an instru ment which provides heating, mixing and fluorescence detection with automated result output, and a set of dispos ables, consisting of the sample receiver (where the elution buffer is stored), a test base ( containing two tubes of lyophilized NEAR reagents) and a transfer device ( designed to transfer 100 µI aliquots of eluted sample from the sample receiver to each of the two tubes containing lyophilized NEAR reagents located in the test base). Suitable dispos ables for use with the Alere i GAS NEAR test include those 60 described in, for example U.S. application Ser. No. 13/242, 999, incorporated herein by reference in its entirety. In addition to containing the reagents necessary for driv ing the GAS NEAR assay, the lyophilized material also contains the lytic agent for GAS, the protein plyC; therefore, 65 GAS lysis does not occur until the lyophilized material is re-suspended. In some cases, the lyophilized material does not contain a lytic agent for GAS, for example, in some US 10,329,601 B2 19 examples, the lyophilized material does not contain the protein plyC. The elution buffer was designed to allow for the rapid release of GAS organisms from clinical sample throat swabs as well as to provide the necessary salts for driving the NEAR assay (both MgSO4 and (NH4)2SO4), in 5 a slightly basic environment. In some examples, the elution buffer also includes an anti-microbial agent or preservative (e.g., ProClin® 950). For the present examples, GAS assay was performed as a two tube assay-a GAS target specific assay in one tube, and an internal control (IC) assay in a second tube (tested side by side on the Alere i).</code> | <code>annotated as follows</code> |
595
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
596
+ ```json
597
+ {
598
+ "loss": "MultipleNegativesRankingLoss",
599
+ "matryoshka_dims": [
600
+ 768,
601
+ 512,
602
+ 256,
603
+ 128,
604
+ 64
605
+ ],
606
+ "matryoshka_weights": [
607
+ 1,
608
+ 1,
609
+ 1,
610
+ 1,
611
+ 1
612
+ ],
613
+ "n_dims_per_step": -1
614
+ }
615
+ ```
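
For reference, a sketch of how a loss with these parameters is typically constructed in sentence-transformers v3 (variable names are illustrative):

```python
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
# MultipleNegativesRankingLoss treats other in-batch examples as negatives;
# MatryoshkaLoss re-applies it at each truncated embedding size.
inner_loss = losses.MultipleNegativesRankingLoss(model)
train_loss = losses.MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
)
```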

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 3e-05
- `num_train_epochs`: 40
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.2
- `fp16`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused

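A sketch of how these non-default values map onto `SentenceTransformerTrainingArguments` in sentence-transformers v3; `output_dir` and `save_strategy` are assumptions (the latter must match `eval_strategy` when `load_best_model_at_end=True`), the rest are the values listed above:

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output/fine-tuned-bge-base-raw_pdf-v1",  # placeholder path
    eval_strategy="epoch",
    save_strategy="epoch",  # assumed; required to match eval_strategy below
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=3e-5,
    num_train_epochs=40,
    lr_scheduler_type="cosine",
    warmup_ratio=0.2,
    fp16=True,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
)
```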

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 3e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 40
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.2
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch    | Step   | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
|:--------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
| 0        | 0      | -             | 0.2103                 | 0.1702                 | 0.1888                 | 0.1783                | 0.1815                 |
| 1.0      | 1      | -             | 0.2102                 | 0.1702                 | 0.1888                 | 0.1783                | 0.1815                 |
| 2.0      | 2      | -             | 0.2104                 | 0.1705                 | 0.1890                 | 0.1797                | 0.1815                 |
| 3.0      | 3      | -             | 0.2841                 | 0.1733                 | 0.2524                 | 0.1997                | 0.2465                 |
| 4.0      | 5      | -             | 0.3285                 | 0.2747                 | 0.2865                 | 0.3281                | 0.2901                 |
| 5.0      | 6      | -             | 0.3311                 | 0.3045                 | 0.2996                 | 0.3930                | 0.3001                 |
| 6.0      | 7      | -             | 0.3948                 | 0.3808                 | 0.3193                 | 0.4576                | 0.3147                 |
| 7.0      | 9      | -             | 0.5308                 | 0.4366                 | 0.4222                 | 0.5445                | 0.4367                 |
| 8.0      | 10     | 3.233         | 0.5352                 | 0.5240                 | 0.5224                 | 0.5867                | 0.4591                 |
| 9.0      | 11     | -             | 0.5438                 | 0.5864                 | 0.5228                 | 0.6519                | 0.6475                 |
| 10.0     | 13     | -             | 0.6540                 | 0.5906                 | 0.6554                 | 0.6684                | 0.6511                 |
| 11.0     | 14     | -             | 0.6585                 | 0.6020                 | 0.6684                 | 0.6857                | 0.6621                 |
| 12.0     | 15     | -             | 0.6632                 | 0.6661                 | 0.6798                 | 0.7063                | 0.6685                 |
| 13.0     | 17     | -             | 0.7292                 | 0.7210                 | 0.6971                 | 0.7396                | 0.7062                 |
| 14.0     | 18     | -             | 0.7396                 | 0.7375                 | 0.7229                 | 0.8333                | 0.7068                 |
| 15.0     | 19     | -             | 0.75                   | 0.7438                 | 0.7021                 | 0.8333                | 0.7083                 |
| 16.0     | 20     | 1.4113        | 0.7604                 | 0.7292                 | 0.7042                 | 0.8229                | 0.7104                 |
| 17.0     | 21     | -             | 0.7542                 | 0.7262                 | 0.7095                 | 0.8229                | 0.7158                 |
| 18.0     | 22     | -             | 0.7438                 | 0.7344                 | 0.7054                 | 0.8167                | 0.7188                 |
| 19.0     | 23     | -             | 0.8063                 | 0.7344                 | 0.7125                 | 0.8125                | 0.7021                 |
| 20.0     | 25     | -             | 0.7958                 | 0.7344                 | 0.7262                 | 0.8125                | 0.7333                 |
| 21.0     | 26     | -             | 0.8021                 | 0.7344                 | 0.7470                 | 0.8095                | 0.7333                 |
| 22.0     | 27     | -             | 0.8021                 | 0.7344                 | 0.7470                 | 0.8095                | 0.7333                 |
| 23.0     | 29     | -             | 0.8021                 | 0.7344                 | 0.7470                 | 0.8095                | 0.7438                 |
| 24.0     | 30     | 0.6643        | 0.8021                 | 0.7448                 | 0.7470                 | 0.8125                | 0.7438                 |
| 25.0     | 31     | -             | 0.8125                 | 0.7448                 | 0.7470                 | 0.8125                | 0.7604                 |
| 26.0     | 33     | -             | 0.8125                 | 0.7448                 | 0.75                   | 0.8167                | 0.7604                 |
| 27.0     | 34     | -             | 0.8125                 | 0.7448                 | 0.75                   | 0.8167                | 0.7604                 |
| 28.0     | 35     | -             | 0.8125                 | 0.7448                 | 0.75                   | 0.8167                | 0.7604                 |
| **29.0** | **37** | **-**         | **0.8167**             | **0.7448**             | **0.75**               | **0.8167**            | **0.7604**             |
| 30.0     | 38     | -             | 0.8167                 | 0.7448                 | 0.75                   | 0.8167                | 0.7604                 |
| 31.0     | 39     | -             | 0.8167                 | 0.7448                 | 0.75                   | 0.8167                | 0.7604                 |
| 32.0     | 40     | 0.4648        | 0.8167                 | 0.7448                 | 0.75                   | 0.8167                | 0.7604                 |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Accelerate: 0.32.1
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
{
  "_name_or_path": "BAAI/bge-base-en-v1.5",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.42.4",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
{
  "__version__": {
    "sentence_transformers": "3.0.1",
    "transformers": "4.42.4",
    "pytorch": "2.3.1+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
model.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:6b5474f2ec91b59854d751112c7183d66e397a6d6a5fd0c4606f7d84989e8005
size 437951328
modules.json ADDED
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
sentence_bert_config.json ADDED
{
  "max_seq_length": 512,
  "do_lower_case": true
}
special_tokens_map.json ADDED
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff