bobox committed on
Commit
1d3016b
1 Parent(s): 137a5fa

10 epoch 32 batch

Files changed (3)
  1. README.md +563 -9
  2. pytorch_model.bin +1 -1
  3. tokenizer.json +14 -2
README.md CHANGED
@@ -1,19 +1,77 @@
1
  ---
2
- language: []
 
3
  library_name: sentence-transformers
4
  tags:
5
  - sentence-transformers
6
  - sentence-similarity
7
  - feature-extraction
8
  base_model: microsoft/deberta-v3-small
9
- datasets: []
10
- widget: []
11
  pipeline_tag: sentence-similarity
12
  ---
13
 
14
  # SentenceTransformer based on microsoft/deberta-v3-small
15
 
16
- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
17
 
18
  ## Model Details
19
 
@@ -23,8 +81,16 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [m
23
  - **Maximum Sequence Length:** 512 tokens
24
  - **Output Dimensionality:** 768 dimensions
25
  - **Similarity Function:** Cosine Similarity
26
- <!-- - **Training Dataset:** Unknown -->
27
- <!-- - **Language:** Unknown -->
28
  <!-- - **License:** Unknown -->
29
 
30
  ### Model Sources
@@ -60,9 +126,9 @@ from sentence_transformers import SentenceTransformer
60
  model = SentenceTransformer("bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2")
61
  # Run inference
62
  sentences = [
63
- 'The weather is lovely today.',
64
- "It's so sunny outside!",
65
- 'He drove to the stadium.',
66
  ]
67
  embeddings = model.encode(sentences)
68
  print(embeddings.shape)
@@ -112,6 +178,445 @@ You can finetune this model on your own dataset.
112
 
113
  ## Training Details
114
 
115
  ### Framework Versions
116
  - Python: 3.10.12
117
  - Sentence Transformers: 3.0.1
@@ -125,6 +630,55 @@ You can finetune this model on your own dataset.
125
 
126
  ### BibTeX
127
 
128
  <!--
129
  ## Glossary
130
 
 
1
  ---
2
+ language:
3
+ - en
4
  library_name: sentence-transformers
5
  tags:
6
  - sentence-transformers
7
  - sentence-similarity
8
  - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:96781
11
+ - loss:MultipleNegativesRankingLoss
12
+ - loss:AnglELoss
13
+ - loss:GISTEmbedLoss
14
+ - loss:OnlineContrastiveLoss
15
+ - loss:MultipleNegativesSymmetricRankingLoss
16
  base_model: microsoft/deberta-v3-small
17
+ datasets:
18
+ - sentence-transformers/all-nli
19
+ - sentence-transformers/stsb
20
+ - tals/vitaminc
21
+ - nyu-mll/glue
22
+ - allenai/scitail
23
+ - sentence-transformers/xsum
24
+ - sentence-transformers/sentence-compression
25
+ widget:
26
+ - source_sentence: What dual titles did Frederick William hold?
27
+ sentences:
28
+ - The impact was increased by chronic overfishing, and by eutrophication that gave
29
+ the entire ecosystem a short-term boost, causing the Mnemiopsis population to
30
+ increase even faster than normal – and above all by the absence of efficient predators
31
+ on these introduced ctenophores.
32
+ - The "European Council" (rather than the Council, made up of different government
33
+ Ministers) is composed of the Prime Ministers or executive Presidents of the member
34
+ states.
35
+ - Nearly 50,000 Huguenots established themselves in Germany, 20,000 of whom were
36
+ welcomed in Brandenburg-Prussia, where they were granted special privileges (Edict
37
+ of Potsdam) and churches in which to worship (such as the Church of St. Peter
38
+ and St. Paul, Angermünde) by Frederick William, Elector of Brandenburg and Duke
39
+ of Prussia.
40
+ - source_sentence: the Great Internet Mersenne Prime Search, what was the prize for
41
+ finding a prime with at least 10 million digits?
42
+ sentences:
43
+ - Since September 2004, the official home of the Scottish Parliament has been a
44
+ new Scottish Parliament Building, in the Holyrood area of Edinburgh.
45
+ - The roughly half-mile stretch of Kearney Boulevard between Fresno Street and Thorne
46
+ Ave was at one time the preferred neighborhood for Fresno's elite African-American
47
+ families.
48
+ - In 2009, the Great Internet Mersenne Prime Search project was awarded a US$100,000
49
+ prize for first discovering a prime with at least 10 million digits.
50
+ - source_sentence: A woman is tugging on a white sheet and laughing
51
+ sentences:
52
+ - there are children near the camera
53
+ - The person is amused.
54
+ - Fruit characters decorate this child's bib
55
+ - source_sentence: A hispanic fruit market with many different fruits and vegetables
56
+ in view on a city street with a man passing the store dressed in dark pants and
57
+ a hoodie.
58
+ sentences:
59
+ - A fruit market and a man
60
+ - Farmers preparing to feed their animals.
61
+ - The guys have guns.
62
+ - source_sentence: All the members of one particular species in a give area are called
63
+ a population.
64
+ sentences:
65
+ - The specialized study of the motion of objects that are atomic/subatomic in size
66
+ is called quantum mechanics.
67
+ - All the members of a species that live in the same area form a population.
68
+ - A(n) anaerobic organism does not need oxygen for growth and dies in its presence.
69
  pipeline_tag: sentence-similarity
70
  ---
71
 
72
  # SentenceTransformer based on microsoft/deberta-v3-small
73
 
74
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) on the [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli), [sts-label](https://huggingface.co/datasets/sentence-transformers/stsb), [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc), [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue), [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail), [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail), [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum) and [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
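The `dataset_size:96781` tag in the front matter can be cross-checked against the per-dataset "Size" lines given under Training Details. The following is a small arithmetic sketch; the dictionary name `dataset_sizes` is only illustrative, and the numbers are copied from those "Size" lines.

```python
# Training-set sizes copied from the "Size" line of each dataset section.
dataset_sizes = {
    "nli-pairs": 7_500,
    "sts-label": 5_749,
    "vitaminc-pairs": 3_695,
    "qnli-contrastive": 7_500,
    "scitail-pairs-qa": 14_987,
    "scitail-pairs-pos": 8_600,
    "xsum-pairs": 3_750,
    "compression-pairs": 45_000,
}

# Total number of training samples across all eight datasets.
total = sum(dataset_sizes.values())

# Share of the total contributed by each dataset, largest first.
shares = sorted(
    ((name, size / total) for name, size in dataset_sizes.items()),
    key=lambda item: item[1],
    reverse=True,
)

print(total)
for name, share in shares:
    print(f"{name}: {share:.1%}")
```

The total matches the tag, and the breakdown shows that compression-pairs alone accounts for a little under half of the training samples.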
75
 
76
  ## Model Details
77
 
 
81
  - **Maximum Sequence Length:** 512 tokens
82
  - **Output Dimensionality:** 768 dimensions
83
  - **Similarity Function:** Cosine Similarity
84
+ - **Training Datasets:**
85
+ - [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli)
86
+ - [sts-label](https://huggingface.co/datasets/sentence-transformers/stsb)
87
+ - [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc)
88
+ - [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue)
89
+ - [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail)
90
+ - [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail)
91
+ - [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum)
92
+ - [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression)
93
+ - **Language:** en
94
  <!-- - **License:** Unknown -->
95
 
96
  ### Model Sources
 
126
  model = SentenceTransformer("bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2")
127
  # Run inference
128
  sentences = [
129
+ 'All the members of one particular species in a give area are called a population.',
130
+ 'All the members of a species that live in the same area form a population.',
131
+ 'A(n) anaerobic organism does not need oxygen for growth and dies in its presence.',
132
  ]
133
  embeddings = model.encode(sentences)
134
  print(embeddings.shape)
 
178
 
179
  ## Training Details
180
 
181
+ ### Training Datasets
182
+
183
+ #### nli-pairs
184
+
185
+ * Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
186
+ * Size: 7,500 training samples
187
+ * Columns: <code>sentence1</code> and <code>sentence2</code>
188
+ * Approximate statistics based on the first 1000 samples:
189
+ | | sentence1 | sentence2 |
190
+ |:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
191
+ | type | string | string |
192
+ | details | <ul><li>min: 5 tokens</li><li>mean: 16.62 tokens</li><li>max: 62 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 9.46 tokens</li><li>max: 29 tokens</li></ul> |
193
+ * Samples:
194
+ | sentence1 | sentence2 |
195
+ |:---------------------------------------------------------------------------|:-------------------------------------------------|
196
+ | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is outdoors, on a horse.</code> |
197
+ | <code>Children smiling and waving at camera</code> | <code>There are children present</code> |
198
+ | <code>A boy is jumping on skateboard in the middle of a red bridge.</code> | <code>The boy does a skateboarding trick.</code> |
199
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
200
+ ```json
201
+ {
202
+ "scale": 20.0,
203
+ "similarity_fct": "cos_sim"
204
+ }
205
+ ```
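To make the two parameters above concrete, here is a minimal pure-Python sketch of an in-batch-negatives ranking loss with `scale = 20.0` and cosine similarity. It illustrates the general technique, not the library's implementation; the helper names `cos_sim` and `mnr_loss` and the toy vectors are assumptions made for this example.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i].

    Every other positive in the batch acts as a negative for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Hypothetical 2-D "embeddings": each anchor is close to its own positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [aligned[1], aligned[0]]  # deliberately mismatched pairs

print(mnr_loss(anchors, aligned))
print(mnr_loss(anchors, swapped))
```

The correctly paired batch yields a loss near zero, while the swapped batch is penalised heavily. The scale multiplies the similarities before the softmax, so larger values sharpen the contrast between the true pair and the in-batch negatives.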
206
+
207
+ #### sts-label
208
+
209
+ * Dataset: [sts-label](https://huggingface.co/datasets/sentence-transformers/stsb) at [ab7a5ac](https://huggingface.co/datasets/sentence-transformers/stsb/tree/ab7a5ac0e35aa22088bdcf23e7fd99b220e53308)
210
+ * Size: 5,749 training samples
211
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
212
+ * Approximate statistics based on the first 1000 samples:
213
+ | | sentence1 | sentence2 | score |
214
+ |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------|
215
+ | type | string | string | float |
216
+ | details | <ul><li>min: 6 tokens</li><li>mean: 9.81 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 9.74 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.54</li><li>max: 1.0</li></ul> |
217
+ * Samples:
218
+ | sentence1 | sentence2 | score |
219
+ |:-----------------------------------------------------------|:----------------------------------------------------------------------|:------------------|
220
+ | <code>A plane is taking off.</code> | <code>An air plane is taking off.</code> | <code>1.0</code> |
221
+ | <code>A man is playing a large flute.</code> | <code>A man is playing a flute.</code> | <code>0.76</code> |
222
+ | <code>A man is spreading shreded cheese on a pizza.</code> | <code>A man is spreading shredded cheese on an uncooked pizza.</code> | <code>0.76</code> |
223
+ * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
224
+ ```json
225
+ {
226
+ "scale": 20.0,
227
+ "similarity_fct": "pairwise_angle_sim"
228
+ }
229
+ ```
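As I understand it, AnglELoss uses a CoSENT-style ranking objective over scored pairs: whenever one pair has a higher gold score than another, its predicted similarity should also be higher. The sketch below shows only that ranking term on precomputed similarities. It does not reproduce the angle-based `pairwise_angle_sim` function named above, and the function name `cosent_style_loss` and the toy numbers are assumptions.

```python
import math


def cosent_style_loss(similarities, labels, scale=20.0):
    """Compute log(1 + sum(exp(scale * (s_i - s_j)))) over pairs with labels[i] < labels[j].

    similarities[k] is the predicted similarity of sentence pair k and
    labels[k] is its gold score. A term is large when a pair with a lower
    gold score is predicted to be more similar than a pair with a higher one.
    """
    terms = [0.0]  # exp(0) = 1 supplies the "1 +" inside the logarithm
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] < labels[j]:
                terms.append(scale * (similarities[i] - similarities[j]))
    top = max(terms)  # stable log-sum-exp
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical gold scores and two sets of predicted similarities.
gold = [1.0, 0.76, 0.2]
well_ordered = [0.95, 0.80, 0.10]  # same ranking as the gold scores
mis_ordered = [0.10, 0.80, 0.95]   # ranking reversed

print(cosent_style_loss(well_ordered, gold))
print(cosent_style_loss(mis_ordered, gold))
```

Only the ordering of the similarities relative to the gold scores matters here, which is why this kind of objective suits a dataset with graded `score` labels such as sts-label.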
230
+
231
+ #### vitaminc-pairs
232
+
233
+ * Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
234
+ * Size: 3,695 training samples
235
+ * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
236
+ * Approximate statistics based on the first 1000 samples:
237
+ | | label | sentence1 | sentence2 |
238
+ |:--------|:-----------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
239
+ | type | int | string | string |
240
+ | details | <ul><li>1: 100.00%</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 16.02 tokens</li><li>max: 56 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 38.57 tokens</li><li>max: 502 tokens</li></ul> |
241
+ * Samples:
242
+ | label | sentence1 | sentence2 |
243
+ |:---------------|:------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
244
+ | <code>1</code> | <code>The movie Yevadu grossed more than 390 million globally .</code> | <code>It also took the second spot in the list of the top 10 films with highest first week shares from AP.The film collected 390.5 million in 9 days , and more than 60 million from other areas , including Karnataka , the rest of India , and overseas territories , enabling it to cross the 400 million mark at the worldwide Box office , becoming Ram Charan 's fourth film to cross that mark .</code> |
245
+ | <code>1</code> | <code>The film 's score is based on 33 critics .</code> | <code>`` Metacritic gave the film a score of 44 out of 100 , based on 33 critics , indicating `` '' mixed or average reviews '' '' . ''</code> |
246
+ | <code>1</code> | <code>Back to Black ( album ) sold less than 15 million copies .</code> | <code>Worldwide , the album has sold over 12 million copies .</code> |
247
+ * Loss: [<code>GISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
248
+ ```json
249
+ {'guide': SentenceTransformer(
250
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
251
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
252
+ (2): Normalize()
253
+ ), 'temperature': 0.05}
254
+ ```
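The parameters above name a guide model and a temperature of 0.05. A rough sketch of the guided-negative idea follows: an in-batch candidate is dropped as a negative when the guide rates it as more similar to the anchor than the anchor's own positive. This is a simplified illustration covering anchor-to-positive similarities only, working on precomputed similarity matrices; the function name `gist_style_loss` and all numbers are assumptions.

```python
import math


def gist_style_loss(student_sim, guide_sim, temperature=0.05):
    """Guided in-batch contrastive loss on precomputed similarity matrices.

    student_sim[i][j] and guide_sim[i][j] hold the similarity between
    anchor i and candidate j under the trained model and the guide model.
    Candidate i is the true positive of anchor i.
    """
    n = len(student_sim)
    losses = []
    for i in range(n):
        threshold = guide_sim[i][i]
        # Keep the true positive plus every candidate the guide does not
        # rate above it; the rest are treated as likely false negatives.
        kept = [
            student_sim[i][j] / temperature
            for j in range(n)
            if j == i or guide_sim[i][j] <= threshold
        ]
        top = max(kept)  # stable log-sum-exp
        log_z = top + math.log(sum(math.exp(x - top) for x in kept))
        losses.append(log_z - student_sim[i][i] / temperature)
    return sum(losses) / n


# Hypothetical similarities: the student prefers candidate 1 for anchor 0.
student = [[0.6, 0.7], [0.1, 0.8]]
guide_masks = [[0.5, 0.9], [0.2, 0.9]]  # guide agrees candidate 1 fits anchor 0
guide_plain = [[1.0, 0.0], [0.0, 1.0]]  # guide never masks anything

print(gist_style_loss(student, guide_masks))
print(gist_style_loss(student, guide_plain))
```

With the masking guide, the suspicious negative is removed and contributes no penalty; with the plain guide, the same candidate is kept as a negative and the loss rises.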
255
+
256
+ #### qnli-contrastive
257
+
258
+ * Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
259
+ * Size: 7,500 training samples
260
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
261
+ * Approximate statistics based on the first 1000 samples:
262
+ | | sentence1 | sentence2 | label |
263
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
264
+ | type | string | string | int |
265
+ | details | <ul><li>min: 6 tokens</li><li>mean: 13.92 tokens</li><li>max: 40 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 35.87 tokens</li><li>max: 499 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
266
+ * Samples:
267
+ | sentence1 | sentence2 | label |
268
+ |:--------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
269
+ | <code>Who was the biggest artist that CBS had?</code> | <code>CBS Inc., now CBS Corporation, retained the rights to the CBS name for music recordings but granted Sony a temporary license to use the CBS name.</code> | <code>0</code> |
270
+ | <code>What does a video-conference use that allows communication in live situations?</code> | <code>This is often accomplished by the use of a multipoint control unit (a centralized distribution and call management system) or by a similar non-centralized multipoint capability embedded in each videoconferencing unit.</code> | <code>0</code> |
271
+ | <code>What is the population of Saint Helena?</code> | <code>It is part of the British Overseas Territory of Saint Helena, Ascension and Tristan da Cunha.</code> | <code>0</code> |
272
+ * Loss: [<code>OnlineContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#onlinecontrastiveloss)
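No parameters are listed for this loss, so the sketch below only illustrates the general online-contrastive idea as I understand it: within a batch, keep the hard positives (further apart than the closest negative) and hard negatives (closer than the furthest positive), then apply a contrastive penalty to those. The `margin` value, the function name, and the toy distances are assumptions, and the sketch expects both labels to be present in the batch.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Contrastive loss restricted to hard pairs.

    distances[k] is the embedding distance of pair k, labels[k] is 1 for a
    matching pair and 0 otherwise. Assumes at least one pair of each label.
    """
    positives = [d for d, y in zip(distances, labels) if y == 1]
    negatives = [d for d, y in zip(distances, labels) if y == 0]

    # Thresholds fall back to a mean when only one pair of the other kind exists.
    neg_threshold = max(positives) if len(positives) > 1 else sum(negatives) / len(negatives)
    pos_threshold = min(negatives) if len(negatives) > 1 else sum(positives) / len(positives)

    hard_negatives = [d for d in negatives if d < neg_threshold]
    hard_positives = [d for d in positives if d > pos_threshold]

    # Pull hard positives together; push hard negatives beyond the margin.
    positive_loss = sum(d ** 2 for d in hard_positives)
    negative_loss = sum(max(0.0, margin - d) ** 2 for d in hard_negatives)
    return positive_loss + negative_loss


# Hypothetical batch: two matching pairs followed by two non-matching pairs.
distances = [0.1, 0.6, 0.2, 0.9]
labels = [1, 1, 0, 0]
loss = online_contrastive_loss(distances, labels)
print(loss)
```

In this toy batch only the positive at distance 0.6 and the negative at distance 0.2 are selected; the easy pairs at 0.1 and 0.9 contribute nothing.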
273
+
274
+ #### scitail-pairs-qa
275
+
276
+ * Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
277
+ * Size: 14,987 training samples
278
+ * Columns: <code>sentence2</code> and <code>sentence1</code>
279
+ * Approximate statistics based on the first 1000 samples:
280
+ | | sentence2 | sentence1 |
281
+ |:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
282
+ | type | string | string |
283
+ | details | <ul><li>min: 7 tokens</li><li>mean: 15.86 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 15.1 tokens</li><li>max: 41 tokens</li></ul> |
284
+ * Samples:
285
+ | sentence2 | sentence1 |
286
+ |:--------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
287
+ | <code>The largest known proteins are titins.</code> | <code>What are the largest known proteins?</code> |
288
+ | <code>Remote-control vehicles are able to go to the deepest ocean floor.</code> | <code>What type of vehicles is able to go to the deepest ocean floor?</code> |
289
+ | <code>Vaccine is a preventative measure that is often delivered by injection into the arm.</code> | <code>What preventative measure is often delivered by injection into the arm?</code> |
290
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
291
+ ```json
292
+ {
293
+ "scale": 20.0,
294
+ "similarity_fct": "cos_sim"
295
+ }
296
+ ```
297
+
298
+ #### scitail-pairs-pos
299
+
300
+ * Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
301
+ * Size: 8,600 training samples
302
+ * Columns: <code>sentence1</code> and <code>sentence2</code>
303
+ * Approximate statistics based on the first 1000 samples:
304
+ | | sentence1 | sentence2 |
305
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
306
+ | type | string | string |
307
+ | details | <ul><li>min: 7 tokens</li><li>mean: 23.75 tokens</li><li>max: 67 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 15.47 tokens</li><li>max: 41 tokens</li></ul> |
308
+ * Samples:
309
+ | sentence1 | sentence2 |
310
+ |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------|
311
+ | <code>The movement of molecules from a location where they are in a high concentration to an area where they are in a lower concentration is called diffusion .</code> | <code>You call the movement of a substance from an area of a higher amount toward an area of lower amount diffusion.</code> |
312
+ | <code>Climate is the average weather of an area over a long period of time.</code> | <code>Climate is the long-term average of weather in a particular spot.</code> |
313
+ | <code>Sunlight is captured by green plants during the process of photosynthesis to produce glucose, a carbohydrate from water and carbon dioxide.</code> | <code>Photosynthesis converts carbon dioxide and water into glucose.</code> |
314
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
315
+ ```json
316
+ {
317
+ "scale": 20.0,
318
+ "similarity_fct": "cos_sim"
319
+ }
320
+ ```
321
+
322
+ #### xsum-pairs
323
+
324
+ * Dataset: [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum) at [788ddaf](https://huggingface.co/datasets/sentence-transformers/xsum/tree/788ddafe04e539956d56b567bc32a036ee7b9206)
325
+ * Size: 3,750 training samples
326
+ * Columns: <code>sentence1</code> and <code>sentence2</code>
327
+ * Approximate statistics based on the first 1000 samples:
328
+ | | sentence1 | sentence2 |
329
+ |:--------|:-------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
330
+ | type | string | string |
331
+ | details | <ul><li>min: 28 tokens</li><li>mean: 355.39 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 27.3 tokens</li><li>max: 61 tokens</li></ul> |
332
+ * Samples:
333
+ | sentence1 | sentence2 |
334
+ |:----------|:----------|
335
+ | <code>Prices rose in all council areas and across all property types, but there were wide variations.<br>In Derry City and Strabane prices were up by 11% but by less than 2% in Fermanagh and Omagh.<br>The figures are from the NI Residential Property Price Index, which analyses almost all sales, including cash deals.<br>The average standardised price, across all property types, is now £125,480.<br>That compares to £97,428 at the bottom of the market in 2012, but is still far below the bubble-era peak of £224,670.<br>Over the year the largest rise was in the apartment sector with prices up by 11%.<br>For all other property types, the increase was about 5%.<br>The council area with the highest average price is Lisburn and Castlereagh (£149,600) and the lowest is Derry City and Strabane (£108,464).<br>The number of properties sold in 2016 was 21,669, down slightly on the 2015 figure.<br>Northern Ireland experienced a huge house price bubble in the years leading up to 2007 before the market crashed.<br>Prices more than halved between 2007 and early 2013 but have been increasing gradually since then.</code> | <code>House prices in Northern Ireland rose by almost 6% in 2016, according to official figures.</code> |
336
+ | <code>English and French clubs intend to break away from the Heineken Cup and create their own tournament.<br>"It could well be the end of professional rugby in Scotland if the competition wasn't to go ahead," Nicol told BBC Scotland.<br>"I don't think you can fill a hole of that amount with anything else."<br>Let's get qualification sorted out and based on a meritocracy and then the distribution of revenues is for the boardrooms<br>The Scottish Rugby Union currently receives about £5m per year for Glasgow Warriors and Edinburgh's participation in the Heineken Cup.<br>European Rugby Cup (ERC), which has run the Heineken Cup since it began in 1995, wants to re-open negotiations about the tournament's future but English Premiership and French Top 14 clubs insist they will not attend talks planned by the organising body next month.<br>They will quit the competition at the end of the season, citing factors such as their view that the Heineken Cup structure favours teams from the Pro12, which is made up of sides from Wales, Scotland, Ireland and Italy, and distribution of revenue.<br>Nicol, who won the Heineken Cup with Bath in 1998, insists that arguments over the tournament format is a repetitive issue and he hopes "common sense" will prevail for the good of the game in Scotland.<br>"It happens every few years," he told BBC Scotland. "The English and the French flex their collective muscles when the contract is coming to an end.<br>"But this year, it's very different, because they've got a television deal on the table and it's a real clear and present danger.<br>"I think there's an acceptance that the current format of the Heineken Cup will cease and there will be a new competition.<br>Media playback is not supported on this device<br>"Then we just need to ensure and hope that Scotland are heavily involved in it."<br>Nicol conceded that the main stumbling block for advancing discussions was the perception that Celtic nations are favoured in the qualification process.<br>At present, Ireland and Wales each have three sides guaranteed a place, while Scotland and Italy have two apiece.<br>Nicol believes the English and French unions want to put a stop to automatic qualification, which could bring about the end of lucrative revenue for Glasgow and Edinburgh, although ending guaranteed entry may be necessary to ensure the future of a pan-European competition.<br>The former Scotland captain said if the tournament comes to an end it would be "a sporting disaster" adding that "the Heineken Cup has been a fantastic competition".<br>He added: "Where it's flawed is in the qualification. I don't think the two Scottish sides and the Italian sides or the Irish sides should qualify automatically.<br>"So let's get qualification sorted out and based on a meritocracy and then the distribution of revenues is for the boardrooms.<br>"There's a bit of posturing from both sides, but I just hope it's a bit of brinksmanship and they get around the table and sort something out - and we get a competition.<br>"It might not be the Heineken Cup as we call it now, but hopefully we'll get something like it."</code> | <code>Professional rugby union in Scotland could end if there is no European competition next season, fears former national captain Andy Nicol.</code> |
337
+ | <code>The German was 0.203 seconds quicker than Hamilton, with Ferrari's Kimi Raikkonen third, a second off the pace.<br>Mercedes set their times on the super-soft tyre, while Ferrari used the soft, which would account for about half the gap between the two cars.<br>Ferrari's Sebastian Vettel was fourth, ahead of Force India's Sergio Perez.<br>Hamilton enters the race nine points ahead of Rosberg in the championship after recovering from 21st on the grid to finish third at the Belgian Grand Prix last weekend, as Rosberg won.<br>Ferrari have used the last of their remaining engine development 'tokens' ahead of their home race in an attempt to boost their competitiveness after a slump in form that has seen them lose second place in the constructors' championship to Red Bull.<br>The fastest Red Bull was Max Verstappen in eighth, behind Haas driver Romain Grosjean and Williams' Valtteri Bottas, whose team-mate Felipe Massa announced on Thursday that he would retire at the end of the year.<br>Verstappen remains the focus of attention following his controversial battle with Raikkonen in Belgium.<br>Raikkonen has criticised Verstappen for being too dangerous, while the Dutchman said he would not change his driving because others were not happy.<br>The stewards took no action against Verstappen in Spa, but BBC Sport has learned that Charlie Whiting, the F1 director of governing body the FIA, felt that Verstappen's late move in defence at 200mph as Raikkonen attacked was on the edge of acceptability.<br>Whiting told the teams in a meeting on Thursday that he felt Verstappen could have received a black-and-white warning flag for his driving.<br>The black-and-white flag is an indication of unsportsmanlike behaviour and is only shown once. If the driver commits the same offence again he can be disqualified from the race.<br>Whiting's intervention raised the stakes in the debate ahead of the drivers' briefing after practice on Friday afternoon, where the incident is expected to be discussed.<br>It was a relatively low-key session on track, despite a number of drivers running off the track at the tricky Monza chicanes in the warm sunshine.<br>McLaren's session came to an unfortunate end as Fernando Alonso was forced to pit with a gearshift problem. He was 13th, with team-mate Jenson Button 11th, the drivers expecting their most difficult weekend of the year because of the lack of power of the Honda engine, which still lags despite recent updates.<br>Button and Verstappen ran the halo head protection system in the first part of the session as trials continue ahead of the planned introduction of the device in 2018.<br>Italian Grand Prix first practice results<br>Italian Grand Prix coverage details</code> | <code>Nico Rosberg headed team-mate Lewis Hamilton as Mercedes dominated first practice at the Italian Grand Prix.</code> |
338
+ * Loss: [<code>MultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativessymmetricrankingloss) with these parameters:
339
+ ```json
340
+ {
341
+ "scale": 20.0,
342
+ "similarity_fct": "cos_sim"
343
+ }
344
+ ```
345
+
346
+ #### compression-pairs
347
+
348
+ * Dataset: [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
349
+ * Size: 45,000 training samples
350
+ * Columns: <code>sentence1</code> and <code>sentence2</code>
351
+ * Approximate statistics based on the first 1000 samples:
352
+ | | sentence1 | sentence2 |
353
+ |:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
354
+ | type | string | string |
355
+ | details | <ul><li>min: 10 tokens</li><li>mean: 31.78 tokens</li><li>max: 170 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 10.14 tokens</li><li>max: 29 tokens</li></ul> |
356
+ * Samples:
357
+ | sentence1 | sentence2 |
358
+ |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------|
359
+ | <code>The USHL completed an expansion draft on Monday as 10 players who were on the rosters of USHL teams during the 2009-10 season were selected by the League's two newest entries, the Muskegon Lumberjacks and Dubuque Fighting Saints.</code> | <code>USHL completes expansion draft</code> |
360
+ | <code>NRT LLC, one of the nation's largest residential real estate brokerage companies, announced several executive appointments within its Coldwell Banker Residential Brokerage operations in Southern California.</code> | <code>NRT announces executive appointments at its Coldwell Banker operations in Southern California</code> |
361
+ | <code>A new survey shows 30 percent of Californians use Twitter, and more and more of us are using our smart phones to go online.</code> | <code>Survey: 30 percent of Californians use Twitter</code> |
362
+ * Loss: [<code>MultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativessymmetricrankingloss) with these parameters:
363
+ ```json
364
+ {
365
+ "scale": 20.0,
366
+ "similarity_fct": "cos_sim"
367
+ }
368
+ ```
369
+
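The loss parameters listed above (`scale` of 20.0 with cosine similarity) can be illustrated with a small dependency-free sketch. It is not the library implementation: the function names `cos_sim`, `cross_entropy_diagonal`, and `symmetric_ranking_loss` are hypothetical, and the code assumes the usual in-batch-negatives formulation, where each anchor should rank its own positive highest and the same is required in the reverse direction.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two plain-Python vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cross_entropy_diagonal(scores):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(scores):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(s - row_max) for s in row))
        total += log_z - row[i]
    return total / len(scores)

def symmetric_ranking_loss(anchors, positives, scale=20.0):
    """Average of anchor->positive and positive->anchor ranking losses."""
    scores = [[scale * cos_sim(a, p) for p in positives] for a in anchors]
    transposed = [list(col) for col in zip(*scores)]
    return 0.5 * (cross_entropy_diagonal(scores) + cross_entropy_diagonal(transposed))

# Toy 2-D "embeddings" (illustrative values, not model outputs).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.1], [0.1, 1.0]]
swapped = [[0.1, 1.0], [1.0, 0.1]]
print(symmetric_ranking_loss(anchors, aligned))
print(symmetric_ranking_loss(anchors, swapped))
```

With a scale of 20.0, small differences in cosine similarity become large differences in logits, so correctly aligned pairs yield a loss near zero while mismatched pairs are penalised heavily.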
370
+ ### Evaluation Datasets
371
+
372
+ #### nli-pairs
373
+
374
+ * Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
375
+ * Size: 2,000 evaluation samples
376
+ * Columns: <code>sentence1</code> and <code>sentence2</code>
377
+ * Approximate statistics based on the first 1000 samples:
378
+ | | sentence1 | sentence2 |
379
+ |:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
380
+ | type | string | string |
381
+ | details | <ul><li>min: 5 tokens</li><li>mean: 17.64 tokens</li><li>max: 63 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 9.67 tokens</li><li>max: 29 tokens</li></ul> |
382
+ * Samples:
383
+ | sentence1 | sentence2 |
384
+ |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------|
385
+ | <code>Two women are embracing while holding to go packages.</code> | <code>Two woman are holding packages.</code> |
386
+ | <code>Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.</code> | <code>Two kids in numbered jerseys wash their hands.</code> |
387
+ | <code>A man selling donuts to a customer during a world exhibition event held in the city of Angeles</code> | <code>A man selling donuts to a customer.</code> |
388
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
389
+ ```json
390
+ {
391
+ "scale": 20.0,
392
+ "similarity_fct": "cos_sim"
393
+ }
394
+ ```
395
+
396
+ #### scitail-pairs-pos
397
+
398
+ * Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
399
+ * Size: 1,304 evaluation samples
400
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
401
+ * Approximate statistics based on the first 1000 samples:
402
+ | | sentence1 | sentence2 | label |
403
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
404
+ | type | string | string | int |
405
+ | details | <ul><li>min: 5 tokens</li><li>mean: 22.52 tokens</li><li>max: 67 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 15.34 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>0: ~47.50%</li><li>1: ~52.50%</li></ul> |
406
+ * Samples:
407
+ | sentence1 | sentence2 | label |
408
+ |:----------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------|:---------------|
409
+ | <code>An introduction to atoms and elements, compounds, atomic structure and bonding, the molecule and chemical reactions.</code> | <code>Replace another in a molecule happens to atoms during a substitution reaction.</code> | <code>0</code> |
410
+ | <code>Wavelength The distance between two consecutive points on a sinusoidal wave that are in phase;</code> | <code>Wavelength is the distance between two corresponding points of adjacent waves called.</code> | <code>1</code> |
411
+ | <code>humans normally have 23 pairs of chromosomes.</code> | <code>Humans typically have 23 pairs pairs of chromosomes.</code> | <code>1</code> |
412
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
413
+ ```json
414
+ {
415
+ "scale": 20.0,
416
+ "similarity_fct": "cos_sim"
417
+ }
418
+ ```
419
+
420
+ #### qnli-contrastive
421
+
422
+ * Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
423
+ * Size: 2,000 evaluation samples
424
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
425
+ * Approximate statistics based on the first 1000 samples:
426
+ | | sentence1 | sentence2 | label |
427
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
428
+ | type | string | string | int |
429
+ | details | <ul><li>min: 6 tokens</li><li>mean: 14.13 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 36.58 tokens</li><li>max: 225 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
430
+ * Samples:
431
+ | sentence1 | sentence2 | label |
432
+ |:--------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
433
+ | <code>What came into force after the new constitution was herald?</code> | <code>As of that day, the new constitution heralding the Second Republic came into force.</code> | <code>0</code> |
434
+ | <code>What is the first major city in the stream of the Rhine?</code> | <code>The most important tributaries in this area are the Ill below of Strasbourg, the Neckar in Mannheim and the Main across from Mainz.</code> | <code>0</code> |
435
+ | <code>What is the minimum required if you want to teach in Canada?</code> | <code>In most provinces a second Bachelor's Degree such as a Bachelor of Education is required to become a qualified teacher.</code> | <code>0</code> |
436
+ * Loss: [<code>OnlineContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#onlinecontrastiveloss)
437
+
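As a rough illustration of the idea behind an online contrastive loss, the sketch below keeps only the "hard" pairs in a batch: positives that lie farther apart than the closest negative, and negatives that lie closer than the farthest positive. This is a simplified sketch under stated assumptions, not the library code: the function name `online_contrastive_loss` is hypothetical, the margin of 0.5 is an assumed value, labels are taken to mean 1 = similar and 0 = dissimilar, and the fallback used when one class is missing from the batch is a simplification.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Contrastive loss over hard pairs only.

    distances: per-pair distances (for example 1 - cosine similarity).
    labels:    1 for similar pairs, 0 for dissimilar pairs (assumed convention).
    """
    pos = [d for d, y in zip(distances, labels) if y == 1]
    neg = [d for d, y in zip(distances, labels) if y == 0]

    # Hard positives: farther apart than the closest negative pair.
    # Hard negatives: closer together than the farthest positive pair.
    # If one class is absent, keep every pair of the other class (simplification).
    hard_pos = [d for d in pos if not neg or d > min(neg)]
    hard_neg = [d for d in neg if not pos or d < max(pos)]

    positive_loss = sum(d ** 2 for d in hard_pos)
    negative_loss = sum(max(0.0, margin - d) ** 2 for d in hard_neg)
    return positive_loss + negative_loss

# Overlapping classes: one hard positive (0.6) and one hard negative (0.3).
print(online_contrastive_loss([0.1, 0.6, 0.3, 0.9], [1, 1, 0, 0]))
# Cleanly separated classes contribute nothing.
print(online_contrastive_loss([0.1, 0.2, 0.8, 0.9], [1, 1, 0, 0]))
```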
438
+ #### sts-label
439
+
440
+ * Dataset: [sts-label](https://huggingface.co/datasets/sentence-transformers/stsb) at [ab7a5ac](https://huggingface.co/datasets/sentence-transformers/stsb/tree/ab7a5ac0e35aa22088bdcf23e7fd99b220e53308)
441
+ * Size: 1,500 evaluation samples
442
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
443
+ * Approximate statistics based on the first 1000 samples:
444
+ | | sentence1 | sentence2 | score |
445
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
446
+ | type | string | string | float |
447
+ | details | <ul><li>min: 5 tokens</li><li>mean: 14.77 tokens</li><li>max: 45 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 14.74 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.47</li><li>max: 1.0</li></ul> |
448
+ * Samples:
449
+ | sentence1 | sentence2 | score |
450
+ |:--------------------------------------------------|:------------------------------------------------------|:------------------|
451
+ | <code>A man with a hard hat is dancing.</code> | <code>A man wearing a hard hat is dancing.</code> | <code>1.0</code> |
452
+ | <code>A young child is riding a horse.</code> | <code>A child is riding a horse.</code> | <code>0.95</code> |
453
+ | <code>A man is feeding a mouse to a snake.</code> | <code>The man is feeding a mouse to the snake.</code> | <code>1.0</code> |
454
+ * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
455
+ ```json
456
+ {
457
+ "scale": 20.0,
458
+ "similarity_fct": "pairwise_angle_sim"
459
+ }
460
+ ```
461
+
462
+ ### Training Hyperparameters
463
+ #### Non-Default Hyperparameters
464
+
465
+ - `eval_strategy`: steps
466
+ - `per_device_train_batch_size`: 28
467
+ - `per_device_eval_batch_size`: 16
468
+ - `learning_rate`: 3e-06
469
+ - `weight_decay`: 1e-10
470
+ - `num_train_epochs`: 5
471
+ - `max_steps`: 5000
472
+ - `lr_scheduler_type`: cosine
473
+ - `warmup_ratio`: 0.33
474
+ - `save_safetensors`: False
475
+ - `fp16`: True
476
+ - `hub_model_id`: bobox/DeBERTaV3-small-ST-checkpoints-tmp
477
+ - `hub_strategy`: checkpoint
478
+ - `batch_sampler`: no_duplicates
479
+
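The interaction between `learning_rate`, `warmup_ratio`, `max_steps`, and the cosine scheduler can be sketched in a few lines. The helper `lr_at_step` is hypothetical and assumes the common formulation: a linear warmup over `ceil(max_steps * warmup_ratio)` steps followed by a half-cosine decay to zero. Note also that the training log ends at step 5000 (epoch 1.4451), which suggests that `max_steps` took precedence over `num_train_epochs`.

```python
import math

def lr_at_step(step, base_lr=3e-06, max_steps=5000, warmup_ratio=0.33):
    """Learning rate under linear warmup followed by cosine decay (sketch)."""
    warmup_steps = math.ceil(max_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

for step in (0, 1000, 1650, 3000, 5000):
    print(step, lr_at_step(step))
```

Under these assumptions roughly the first third of training is spent warming up, which is consistent with the slow initial drop in training loss in the log.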
480
+ #### All Hyperparameters
481
+ <details><summary>Click to expand</summary>
482
+
483
+ - `overwrite_output_dir`: False
484
+ - `do_predict`: False
485
+ - `eval_strategy`: steps
486
+ - `prediction_loss_only`: True
487
+ - `per_device_train_batch_size`: 28
488
+ - `per_device_eval_batch_size`: 16
489
+ - `per_gpu_train_batch_size`: None
490
+ - `per_gpu_eval_batch_size`: None
491
+ - `gradient_accumulation_steps`: 1
492
+ - `eval_accumulation_steps`: None
493
+ - `learning_rate`: 3e-06
494
+ - `weight_decay`: 1e-10
495
+ - `adam_beta1`: 0.9
496
+ - `adam_beta2`: 0.999
497
+ - `adam_epsilon`: 1e-08
498
+ - `max_grad_norm`: 1.0
499
+ - `num_train_epochs`: 5
500
+ - `max_steps`: 5000
501
+ - `lr_scheduler_type`: cosine
502
+ - `lr_scheduler_kwargs`: {}
503
+ - `warmup_ratio`: 0.33
504
+ - `warmup_steps`: 0
505
+ - `log_level`: passive
506
+ - `log_level_replica`: warning
507
+ - `log_on_each_node`: True
508
+ - `logging_nan_inf_filter`: True
509
+ - `save_safetensors`: False
510
+ - `save_on_each_node`: False
511
+ - `save_only_model`: False
512
+ - `restore_callback_states_from_checkpoint`: False
513
+ - `no_cuda`: False
514
+ - `use_cpu`: False
515
+ - `use_mps_device`: False
516
+ - `seed`: 42
517
+ - `data_seed`: None
518
+ - `jit_mode_eval`: False
519
+ - `use_ipex`: False
520
+ - `bf16`: False
521
+ - `fp16`: True
522
+ - `fp16_opt_level`: O1
523
+ - `half_precision_backend`: auto
524
+ - `bf16_full_eval`: False
525
+ - `fp16_full_eval`: False
526
+ - `tf32`: None
527
+ - `local_rank`: 0
528
+ - `ddp_backend`: None
529
+ - `tpu_num_cores`: None
530
+ - `tpu_metrics_debug`: False
531
+ - `debug`: []
532
+ - `dataloader_drop_last`: False
533
+ - `dataloader_num_workers`: 0
534
+ - `dataloader_prefetch_factor`: None
535
+ - `past_index`: -1
536
+ - `disable_tqdm`: False
537
+ - `remove_unused_columns`: True
538
+ - `label_names`: None
539
+ - `load_best_model_at_end`: False
540
+ - `ignore_data_skip`: False
541
+ - `fsdp`: []
542
+ - `fsdp_min_num_params`: 0
543
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
544
+ - `fsdp_transformer_layer_cls_to_wrap`: None
545
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
546
+ - `deepspeed`: None
547
+ - `label_smoothing_factor`: 0.0
548
+ - `optim`: adamw_torch
549
+ - `optim_args`: None
550
+ - `adafactor`: False
551
+ - `group_by_length`: False
552
+ - `length_column_name`: length
553
+ - `ddp_find_unused_parameters`: None
554
+ - `ddp_bucket_cap_mb`: None
555
+ - `ddp_broadcast_buffers`: False
556
+ - `dataloader_pin_memory`: True
557
+ - `dataloader_persistent_workers`: False
558
+ - `skip_memory_metrics`: True
559
+ - `use_legacy_prediction_loop`: False
560
+ - `push_to_hub`: False
561
+ - `resume_from_checkpoint`: None
562
+ - `hub_model_id`: bobox/DeBERTaV3-small-ST-checkpoints-tmp
563
+ - `hub_strategy`: checkpoint
564
+ - `hub_private_repo`: False
565
+ - `hub_always_push`: False
566
+ - `gradient_checkpointing`: False
567
+ - `gradient_checkpointing_kwargs`: None
568
+ - `include_inputs_for_metrics`: False
569
+ - `eval_do_concat_batches`: True
570
+ - `fp16_backend`: auto
571
+ - `push_to_hub_model_id`: None
572
+ - `push_to_hub_organization`: None
573
+ - `mp_parameters`:
574
+ - `auto_find_batch_size`: False
575
+ - `full_determinism`: False
576
+ - `torchdynamo`: None
577
+ - `ray_scope`: last
578
+ - `ddp_timeout`: 1800
579
+ - `torch_compile`: False
580
+ - `torch_compile_backend`: None
581
+ - `torch_compile_mode`: None
582
+ - `dispatch_batches`: None
583
+ - `split_batches`: None
584
+ - `include_tokens_per_second`: False
585
+ - `include_num_input_tokens_seen`: False
586
+ - `neftune_noise_alpha`: None
587
+ - `optim_target_modules`: None
588
+ - `batch_eval_metrics`: False
589
+ - `batch_sampler`: no_duplicates
590
+ - `multi_dataset_batch_sampler`: proportional
591
+
592
+ </details>
593
+
594
+ ### Training Logs
595
+ | Epoch | Step | Training Loss | nli-pairs loss | sts-label loss | scitail-pairs-pos loss | qnli-contrastive loss |
596
+ |:------:|:----:|:-------------:|:--------------:|:--------------:|:----------------------:|:---------------------:|
597
+ | None | 0 | - | 3.3906 | 6.4037 | 2.3949 | 2.6789 |
598
+ | 0.0723 | 250 | 3.2471 | 3.2669 | 6.3326 | 2.3286 | 2.6008 |
599
+ | 0.1445 | 500 | 3.051 | 3.0717 | 6.5578 | 2.0277 | 2.0795 |
600
+ | 0.2168 | 750 | 2.3717 | 2.8445 | 7.5564 | 1.5729 | 1.1601 |
601
+ | 0.2890 | 1000 | 1.5228 | 2.5520 | 8.3864 | 1.1221 | 0.7480 |
602
+ | 0.3613 | 1250 | 1.5747 | 2.1439 | 8.7993 | 0.9512 | 0.5071 |
603
+ | 0.4335 | 1500 | 1.2114 | 1.7986 | 9.0748 | 0.8195 | 0.3715 |
604
+ | 0.5058 | 1750 | 1.1832 | 1.5665 | 9.1778 | 0.6956 | 0.2920 |
605
+ | 0.5780 | 2000 | 0.9078 | 1.4173 | 9.3829 | 0.6840 | 0.2488 |
606
+ | 0.6503 | 2250 | 0.8436 | 1.3196 | 9.4585 | 0.6831 | 0.1584 |
607
+ | 0.7225 | 2500 | 0.8744 | 1.2192 | 9.5395 | 0.6232 | 0.1527 |
608
+ | 0.7948 | 2750 | 1.1809 | 1.1600 | 9.4297 | 0.5681 | 0.1369 |
609
+ | 0.8671 | 3000 | 0.7233 | 1.1149 | 9.4893 | 0.5523 | 0.1614 |
610
+ | 0.9393 | 3250 | 0.7862 | 1.0738 | 9.5408 | 0.5372 | 0.1291 |
611
+ | 1.0116 | 3500 | 1.0888 | 1.0328 | 9.5612 | 0.5286 | 0.1281 |
612
+ | 1.0838 | 3750 | 0.8116 | 1.0304 | 9.4794 | 0.5239 | 0.1144 |
613
+ | 1.1561 | 4000 | 1.0436 | 1.0215 | 9.4184 | 0.5278 | 0.0973 |
614
+ | 1.2283 | 4250 | 0.9298 | 1.0107 | 9.4322 | 0.5221 | 0.0970 |
615
+ | 1.3006 | 4500 | 0.682 | 1.0093 | 9.4643 | 0.5186 | 0.0951 |
616
+ | 1.3728 | 4750 | 0.9863 | 1.0080 | 9.4627 | 0.5176 | 0.0948 |
617
+ | 1.4451 | 5000 | 1.0022 | 1.0076 | 9.4645 | 0.5179 | 0.0945 |
618
+
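The Epoch and Step columns above can be used to estimate how many optimizer steps make up one epoch, and from that roughly how many training samples are seen per epoch. The sketch below is only arithmetic on values taken from the table; the helper name `estimate_steps_per_epoch` is hypothetical, and the sample estimate assumes a single device, one batch of 28 per step, and `gradient_accumulation_steps` of 1 as listed in the hyperparameters.

```python
def estimate_steps_per_epoch(epoch, step):
    """Estimate steps per epoch from one (epoch, step) row of the log."""
    return step / epoch

batch_size = 28  # per_device_train_batch_size from the hyperparameters

# First and last logged rows with a numeric epoch value.
for epoch, step in [(0.0723, 250), (1.4451, 5000)]:
    steps = estimate_steps_per_epoch(epoch, step)
    print(f"epoch={epoch}: ~{steps:.0f} steps/epoch, ~{steps * batch_size:.0f} samples/epoch")
```

Both rows give an estimate of about 3,460 steps per epoch, so 5,000 steps cover a little under one and a half passes over the combined training data.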
619
+
620
  ### Framework Versions
621
  - Python: 3.10.12
622
  - Sentence Transformers: 3.0.1
 
630
 
631
  ### BibTeX
632
 
633
+ #### Sentence Transformers
634
+ ```bibtex
635
+ @inproceedings{reimers-2019-sentence-bert,
636
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
637
+ author = "Reimers, Nils and Gurevych, Iryna",
638
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
639
+ month = "11",
640
+ year = "2019",
641
+ publisher = "Association for Computational Linguistics",
642
+ url = "https://arxiv.org/abs/1908.10084",
643
+ }
644
+ ```
645
+
646
+ #### MultipleNegativesRankingLoss
647
+ ```bibtex
648
+ @misc{henderson2017efficient,
649
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
650
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
651
+ year={2017},
652
+ eprint={1705.00652},
653
+ archivePrefix={arXiv},
654
+ primaryClass={cs.CL}
655
+ }
656
+ ```
657
+
658
+ #### AnglELoss
659
+ ```bibtex
660
+ @misc{li2023angleoptimized,
661
+ title={AnglE-optimized Text Embeddings},
662
+ author={Xianming Li and Jing Li},
663
+ year={2023},
664
+ eprint={2309.12871},
665
+ archivePrefix={arXiv},
666
+ primaryClass={cs.CL}
667
+ }
668
+ ```
669
+
670
+ #### GISTEmbedLoss
671
+ ```bibtex
672
+ @misc{solatorio2024gistembed,
673
+ title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
674
+ author={Aivin V. Solatorio},
675
+ year={2024},
676
+ eprint={2402.16829},
677
+ archivePrefix={arXiv},
678
+ primaryClass={cs.LG}
679
+ }
680
+ ```
681
+
682
  <!--
683
  ## Glossary
684
 
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e75f9f0d0ccf1ea68d57e5e49eadbe854516a7a239c28fe45742d13c727c0aae
3
  size 565251810
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a0d75b5e828001a49cbaef4b188a7c0e6c628b04ed1e077993035faf130c22c9
3
  size 565251810
tokenizer.json CHANGED
@@ -1,7 +1,19 @@
1
  {
2
  "version": "1.0",
3
- "truncation": null,
4
- "padding": null,
 
 
 
 
 
 
 
 
 
 
 
 
5
  "added_tokens": [
6
  {
7
  "id": 0,
 
1
  {
2
  "version": "1.0",
3
+ "truncation": {
4
+ "direction": "Right",
5
+ "max_length": 512,
6
+ "strategy": "LongestFirst",
7
+ "stride": 0
8
+ },
9
+ "padding": {
10
+ "strategy": "BatchLongest",
11
+ "direction": "Right",
12
+ "pad_to_multiple_of": null,
13
+ "pad_id": 0,
14
+ "pad_type_id": 0,
15
+ "pad_token": "[PAD]"
16
+ },
17
  "added_tokens": [
18
  {
19
  "id": 0,