ddobokki committed on
Commit
e2a0baf
1 Parent(s): 4a539c1

Update README.md

Files changed (1)
  1. README.md +12 -48
README.md CHANGED
@@ -5,27 +5,28 @@ tags:
  - feature-extraction
  - sentence-similarity
  - transformers
  ---
 
  # ddobokki/klue-roberta-small-nli-sts
 
- This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
  <!--- Describe your model here -->
 
  ## Usage (Sentence-Transformers)
 
- Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
 
  ```
  pip install -U sentence-transformers
  ```
 
- Then you can use the model like this:
 
  ```python
  from sentence_transformers import SentenceTransformer
- sentences = ["This is an example sentence", "Each sentence is converted"]
 
  model = SentenceTransformer('ddobokki/klue-roberta-small-nli-sts')
  embeddings = model.encode(sentences)
@@ -35,7 +36,7 @@ print(embeddings)
 
 
  ## Usage (HuggingFace Transformers)
- Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings.
 
  ```python
  from transformers import AutoTokenizer, AutoModel
@@ -50,7 +51,7 @@ def mean_pooling(model_output, attention_mask):
 
 
  # Sentences we want sentence embeddings for
- sentences = ['This is an example sentence', 'Each sentence is converted']
 
  # Load model from HuggingFace Hub
  tokenizer = AutoTokenizer.from_pretrained('ddobokki/klue-roberta-small-nli-sts')
@@ -70,49 +71,12 @@ print("Sentence embeddings:")
  print(sentence_embeddings)
  ```
 
 
-
- ## Evaluation Results
-
- <!--- Describe how your model was evaluated -->
-
- For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=ddobokki/klue-roberta-small-nli-sts)
-
-
- ## Training
- The model was trained with the parameters:
-
- **DataLoader**:
-
- `sentence_transformers.datasets.NoDuplicatesDataLoader.NoDuplicatesDataLoader` of length 18078 with parameters:
- ```
- {'batch_size': 32}
- ```
-
- **Loss**:
-
- `sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss` with parameters:
- ```
- {'scale': 20.0, 'similarity_fct': 'cos_sim'}
- ```
-
- Parameters of the fit()-Method:
- ```
- {
- "epochs": 1,
- "evaluation_steps": 1807,
- "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
- "max_grad_norm": 1,
- "optimizer_class": "<class 'transformers.optimization.AdamW'>",
- "optimizer_params": {
- "lr": 2e-05
- },
- "scheduler": "WarmupLinear",
- "steps_per_epoch": null,
- "warmup_steps": 1808,
- "weight_decay": 0.01
- }
- ```
 
 
  ## Full Model Architecture
 
  - feature-extraction
  - sentence-similarity
  - transformers
+ - ko
  ---
 
  # ddobokki/klue-roberta-small-nli-sts
 
+ This is a Korean Sentence Transformer model.
 
  <!--- Describe your model here -->
 
  ## Usage (Sentence-Transformers)
 
+ You can use this model with the [sentence-transformers](https://www.SBERT.net) library.
 
  ```
  pip install -U sentence-transformers
  ```
 
+ Usage:
 
  ```python
  from sentence_transformers import SentenceTransformer
+ sentences = ["흐르는 강물을 거꾸로 거슬러 오르는", "세월이 가면 가슴이 터질 듯한"]
 
  model = SentenceTransformer('ddobokki/klue-roberta-small-nli-sts')
  embeddings = model.encode(sentences)
 
 
 
  ## Usage (HuggingFace Transformers)
+ When using only the transformers library:
 
  ```python
  from transformers import AutoTokenizer, AutoModel
 
 
 
  # Sentences we want sentence embeddings for
+ sentences = ["흐르는 강물을 거꾸로 거슬러 오르는", "세월이 가면 가슴이 터질 듯한"]
 
  # Load model from HuggingFace Hub
  tokenizer = AutoTokenizer.from_pretrained('ddobokki/klue-roberta-small-nli-sts')
 
  print(sentence_embeddings)
  ```
 
+ ## Performance
+ - Semantic Textual Similarity test set results <br>
 
+ | Model | Cosine Pearson | Cosine Spearman | Euclidean Pearson | Euclidean Spearman | Manhattan Pearson | Manhattan Spearman | Dot Pearson | Dot Spearman |
+ |------------------------|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|
+ | KoSRoBERTa<sup>small</sup> | 84.27 | 84.17 | 83.33 | 83.65 | 83.34 | 83.65 | 82.10 | 81.38 |
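
The table's "Cosine Pearson" and "Cosine Spearman" columns are correlations between the model's similarity scores and human-annotated gold scores, presumably scaled by 100. The following is a minimal, dependency-free sketch of how such figures can be computed. It is not the evaluator used to produce the numbers above: the toy vectors and gold scores are invented for illustration, and the helper names (`cosine_similarity`, `pearson`, `average_ranks`, `spearman`) are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy stand-ins for sentence-pair embeddings and gold similarity scores
# (invented values, not taken from any benchmark).
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.5, 0.5]),
    ([1.0, 0.0], [0.0, 1.0]),
]
gold_scores = [4.8, 2.5, 0.2]

predicted = [cosine_similarity(a, b) for a, b in pairs]
print("Cosine Pearson: ", round(100 * pearson(predicted, gold_scores), 2))
print("Cosine Spearman:", round(100 * spearman(predicted, gold_scores), 2))
```

In practice the vectors would come from `model.encode(...)`. The Euclidean, Manhattan, and dot-product columns follow the same pattern with a different similarity function.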
 
 
  ## Full Model Architecture