---
license: apache-2.0
language:
- ko
- en
metrics:
- accuracy
base_model:
- BAAI/bge-reranker-v2-m3
pipeline_tag: text-classification
library_name: sentence-transformers
---

# Reranker (Cross-Encoder)

Unlike an embedding model, a reranker takes a query and a document together as input and directly outputs a similarity score rather than an embedding. You can obtain a relevance score by feeding a query and a passage to the reranker, and the score can be mapped to a float in [0, 1] with the sigmoid function.
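Here the sigmoid is the standard logistic function $\sigma(x) = 1 / (1 + e^{-x})$: larger logits map to scores closer to 1, and a logit of 0 maps to 0.5.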

## Model Details
- Base model: BAAI/bge-reranker-v2-m3
- This multilingual model has been further optimized for Korean.

## Usage with Transformers

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained('dragonkue/bge-reranker-v2-m3-ko')
tokenizer = AutoTokenizer.from_pretrained('dragonkue/bge-reranker-v2-m3-ko')

# Each pair is [query, passage]. The query asks "In what year did the Local
# Non-Tax Revenue Act take effect?"; the first passage answers it (the Act
# took effect on August 7, 2014), while the second is an unrelated news item
# about a COVID-19 vaccine clinical trial approval.
pairs = [
    ['몇 년도에 지방세외수입법이 시행되었을까?',
     '실무교육을 통해 '지방세외수입법'에 대한 자치단체의 관심을 제고하고 자치단체의 차질 없는 업무 추진을 지원하였다. 이러한 준비과정을 거쳐 2014년 8월 7일부터 '지방세외수입법'이 시행되었다.'],
    ['몇 년도에 지방세외수입법이 시행되었을까?',
     '식품의약품안전처는 21일 국내 제약기업인 유바이오로직스가 개발 중인 신종 코로나바이러스 감염증(코로나19) 백신 후보물질 '유코백-19'의 임상시험 계획을 지난 20일 승인했다고 밝혔다.'],
]
features = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt')

model.eval()
with torch.no_grad():
    logits = model(**features).logits
    scores = torch.sigmoid(logits)
print(scores)
```
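Each printed value corresponds to one (query, passage) pair; after the sigmoid it lies in [0, 1], with higher values indicating greater relevance, so the first (relevant) pair should score higher than the second.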

## Usage with SentenceTransformers

First install the Sentence Transformers library:

```
pip install -U sentence-transformers
```

```python
import torch
from sentence_transformers import CrossEncoder

model = CrossEncoder('dragonkue/bge-reranker-v2-m3-ko', default_activation_function=torch.nn.Sigmoid())

# Same [query, passage] pairs as in the Transformers example above.
scores = model.predict([
    ['몇 년도에 지방세외수입법이 시행되었을까?',
     '실무교육을 통해 '지방세외수입법'에 대한 자치단체의 관심을 제고하고 자치단체의 차질 없는 업무 추진을 지원하였다. 이러한 준비과정을 거쳐 2014년 8월 7일부터 '지방세외수입법'이 시행되었다.'],
    ['몇 년도에 지방세외수입법이 시행되었을까?',
     '식품의약품안전처는 21일 국내 제약기업인 유바이오로직스가 개발 중인 신종 코로나바이러스 감염증(코로나19) 백신 후보물질 '유코백-19'의 임상시험 계획을 지난 20일 승인했다고 밝혔다.'],
])
print(scores)
```
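Because the model is constructed with `default_activation_function=torch.nn.Sigmoid()`, `predict` already returns scores in [0, 1], matching the explicit sigmoid step in the Transformers example above.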

## Usage with FlagEmbedding

First install the FlagEmbedding library:

```
pip install -U FlagEmbedding
```

```python
from FlagEmbedding import FlagReranker

reranker = FlagReranker('dragonkue/bge-reranker-v2-m3-ko')

# normalize=True maps the raw relevance logits to [0, 1] via the sigmoid.
scores = reranker.compute_score([
    ['몇 년도에 지방세외수입법이 시행되었을까?',
     '실무교육을 통해 '지방세외수입법'에 대한 자치단체의 관심을 제고하고 자치단체의 차질 없는 업무 추진을 지원하였다. 이러한 준비과정을 거쳐 2014년 8월 7일부터 '지방세외수입법'이 시행되었다.'],
    ['몇 년도에 지방세외수입법이 시행되었을까?',
     '식품의약품안전처는 21일 국내 제약기업인 유바이오로직스가 개발 중인 신종 코로나바이러스 감염증(코로나19) 백신 후보물질 '유코백-19'의 임상시험 계획을 지난 20일 승인했다고 밝혔다.'],
], normalize=True)
print(scores)
```

## Fine-tune

Refer to https://github.com/FlagOpen/FlagEmbedding for fine-tuning instructions.

## Evaluation

### Metrics

- NDCG, MRR, and MAP take ranking into account, while accuracy, precision, and recall do not. For example, when evaluating the top 10 retrieved documents, a rank-aware metric assigns different scores depending on whether the correct document sits in 1st or in 10th place, whereas accuracy, precision, and recall score both cases identically as long as the document appears anywhere in the top 10 (see the short example below).
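A minimal sketch of the distinction, for a single query with one relevant document: recall@10 (rank-agnostic) is identical whether that document is ranked 1st or 10th, while the reciprocal rank (the per-query component of MRR, rank-aware) is not.

```python
# Rank-agnostic vs. rank-aware scoring for one query with one relevant document.
def recall_at_k(rank: int, k: int = 10) -> float:
    # 1 if the relevant document appears anywhere in the top k, else 0.
    return 1.0 if rank <= k else 0.0

def reciprocal_rank(rank: int) -> float:
    # Per-query component of MRR: rewards placing the document higher.
    return 1.0 / rank

for rank in (1, 10):
    print(f"rank={rank:>2}: recall@10={recall_at_k(rank)}, RR={reciprocal_rank(rank):.2f}")
# rank= 1: recall@10=1.0, RR=1.00
# rank=10: recall@10=1.0, RR=0.10
```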

### Bi-encoder and Cross-encoder

Bi-encoders convert texts into fixed-size vectors and efficiently calculate similarities between them. They are fast and well suited to tasks like semantic search and classification, which makes them ideal for processing large datasets quickly.

Cross-encoders compare a pair of texts directly to compute a similarity score, which yields more accurate results. They are slower because every pair must be processed jointly, but they excel at re-ranking the top retrieval results and play an important role in Advanced RAG pipelines for improving the quality of generated text. The typical pattern is to retrieve candidates with a bi-encoder and then re-rank them with a cross-encoder, as sketched below.
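A minimal retrieve-then-rerank sketch using sentence-transformers. The three-document corpus and the choice of `BAAI/bge-m3` as the bi-encoder are illustrative assumptions; any bi-encoder and document store can fill that role.

```python
import torch
from sentence_transformers import CrossEncoder, SentenceTransformer, util

# Illustrative in-memory corpus; in practice this is your document store.
corpus = [
    'The Local Non-Tax Revenue Act took effect on August 7, 2014.',
    'A clinical trial plan for a COVID-19 vaccine candidate was approved.',
    'Semantic search retrieves documents by embedding similarity.',
]
query = 'In what year did the Local Non-Tax Revenue Act take effect?'

# Stage 1: fast candidate retrieval with a bi-encoder.
bi_encoder = SentenceTransformer('BAAI/bge-m3')
doc_emb = bi_encoder.encode(corpus, convert_to_tensor=True)
query_emb = bi_encoder.encode(query, convert_to_tensor=True)
top = torch.topk(util.cos_sim(query_emb, doc_emb)[0], k=2)
candidates = [corpus[int(i)] for i in top.indices]

# Stage 2: accurate re-ranking of the candidates with the cross-encoder.
reranker = CrossEncoder('dragonkue/bge-reranker-v2-m3-ko',
                        default_activation_function=torch.nn.Sigmoid())
scores = reranker.predict([[query, doc] for doc in candidates])
for score, doc in sorted(zip(scores, candidates), reverse=True):
    print(f'{score:.3f}  {doc}')
```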

### Korean Embedding Benchmark with AutoRAG
(https://github.com/Marker-Inc-Korea/AutoRAG-example-korean-embedding-benchmark)

This is a Korean embedding benchmark for the financial sector.

**Top-k 1**

Bi-Encoder (Sentence Transformer)

| Model name | F1 | Recall | Precision | MAP | MRR |
|---------------------------------------|------------|------------|------------|------------|------------|
| paraphrase-multilingual-mpnet-base-v2 | 0.3596 | 0.3596 | 0.3596 | 0.3596 | 0.3596 |
| KoSimCSE-roberta | 0.4298 | 0.4298 | 0.4298 | 0.4298 | 0.4298 |
| Cohere embed-multilingual-v3.0 | 0.3596 | 0.3596 | 0.3596 | 0.3596 | 0.3596 |
| openai ada 002 | 0.4737 | 0.4737 | 0.4737 | 0.4737 | 0.4737 |
| multilingual-e5-large-instruct | 0.4649 | 0.4649 | 0.4649 | 0.4649 | 0.4649 |
| Upstage Embedding | 0.6579 | 0.6579 | 0.6579 | 0.6579 | 0.6579 |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.2982 | 0.2982 | 0.2982 | 0.2982 | 0.2982 |
| openai_embed_3_small | 0.5439 | 0.5439 | 0.5439 | 0.5439 | 0.5439 |
| ko-sroberta-multitask | 0.4211 | 0.4211 | 0.4211 | 0.4211 | 0.4211 |
| openai_embed_3_large | 0.6053 | 0.6053 | 0.6053 | 0.6053 | 0.6053 |
| KU-HIAI-ONTHEIT-large-v1 | 0.7105 | 0.7105 | 0.7105 | 0.7105 | 0.7105 |
| KU-HIAI-ONTHEIT-large-v1.1 | 0.7193 | 0.7193 | 0.7193 | 0.7193 | 0.7193 |
| kf-deberta-multitask | 0.4561 | 0.4561 | 0.4561 | 0.4561 | 0.4561 |
| gte-multilingual-base | 0.5877 | 0.5877 | 0.5877 | 0.5877 | 0.5877 |
| BGE-m3 | 0.6578 | 0.6578 | 0.6578 | 0.6578 | 0.6578 |
| bge-m3-korean | 0.5351 | 0.5351 | 0.5351 | 0.5351 | 0.5351 |
| **BGE-m3-ko** | **0.7456** | **0.7456** | **0.7456** | **0.7456** | **0.7456** |

Cross-Encoder (Reranker)

| Model name | F1 | Recall | Precision | MAP | MRR |
|---------------------------------------|------------|------------|------------|------------|------------|
| jinaai/jina-reranker-v2-base-multilingual | 0.8070 | 0.8070 | 0.8070 | 0.8070 | 0.8070 |
| Alibaba-NLP/gte-multilingual-reranker-base | 0.7281 | 0.7281 | 0.7281 | 0.7281 | 0.7281 |
| BAAI/bge-reranker-v2-m3 | 0.8772 | 0.8772 | 0.8772 | 0.8772 | 0.8772 |
| **bge-reranker-v2-m3-ko** | **0.9123** | **0.9123** | **0.9123** | **0.9123** | **0.9123** |

**Top-k 3**

Bi-Encoder (Sentence Transformer)

| Model name | F1 | Recall | Precision | MAP | MRR |
|---------------------------------------|------------|------------|------------|------------|------------|
| paraphrase-multilingual-mpnet-base-v2 | 0.2368 | 0.4737 | 0.1579 | 0.2032 | 0.2032 |
| KoSimCSE-roberta | 0.3026 | 0.6053 | 0.2018 | 0.2661 | 0.2661 |
| Cohere embed-multilingual-v3.0 | 0.2851 | 0.5702 | 0.1901 | 0.2515 | 0.2515 |
| openai ada 002 | 0.3553 | 0.7105 | 0.2368 | 0.3202 | 0.3202 |
| multilingual-e5-large-instruct | 0.3333 | 0.6667 | 0.2222 | 0.2909 | 0.2909 |
| Upstage Embedding | 0.4211 | 0.8421 | 0.2807 | **0.3509** | **0.3509** |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.2061 | 0.4123 | 0.1374 | 0.1740 | 0.1740 |
| openai_embed_3_small | 0.3640 | 0.7281 | 0.2427 | 0.3026 | 0.3026 |
| ko-sroberta-multitask | 0.2939 | 0.5877 | 0.1959 | 0.2500 | 0.2500 |
| openai_embed_3_large | 0.3947 | 0.7895 | 0.2632 | 0.3348 | 0.3348 |
| KU-HIAI-ONTHEIT-large-v1 | 0.4386 | 0.8772 | 0.2924 | 0.3421 | 0.3421 |
| KU-HIAI-ONTHEIT-large-v1.1 | 0.4430 | 0.8860 | 0.2953 | 0.3406 | 0.3406 |
| kf-deberta-multitask | 0.3158 | 0.6316 | 0.2105 | 0.2792 | 0.2792 |
| gte-multilingual-base | 0.4035 | 0.8070 | 0.2690 | 0.3450 | 0.3450 |
| BGE-m3 | 0.4254 | 0.8508 | 0.2836 | 0.3421 | 0.3421 |
| bge-m3-korean | 0.3684 | 0.7368 | 0.2456 | 0.3143 | 0.3143 |
| **BGE-m3-ko** | **0.4517** | **0.9035** | **0.3011** | 0.3494 | 0.3494 |

Cross-Encoder (Reranker)

| Model name | F1 | Recall | Precision | MAP | MRR |
|---------------------------------------|------------|------------|------------|------------|------------|
| jinaai/jina-reranker-v2-base-multilingual | 0.4649 | 0.9298 | 0.3099 | 0.8626 | 0.8626 |
| Alibaba-NLP/gte-multilingual-reranker-base | 0.4605 | 0.9211 | 0.3070 | 0.8173 | 0.8173 |
| BAAI/bge-reranker-v2-m3 | 0.4781 | 0.9561 | 0.3187 | 0.9167 | 0.9167 |
| **bge-reranker-v2-m3-ko** | **0.4825** | **0.9649** | **0.3216** | **0.9371** | **0.9371** |

**Top-k 5**

Bi-Encoder (Sentence Transformer)

| Model name | F1 | Recall | Precision | MAP | MRR |
|---------------------------------------|------------|------------|------------|------------|------------|
| paraphrase-multilingual-mpnet-base-v2 | 0.1813 | 0.5439 | 0.1088 | 0.1575 | 0.1575 |
| KoSimCSE-roberta | 0.2164 | 0.6491 | 0.1298 | 0.1751 | 0.1751 |
| Cohere embed-multilingual-v3.0 | 0.2076 | 0.6228 | 0.1246 | 0.1640 | 0.1640 |
| openai ada 002 | 0.2602 | 0.7807 | 0.1561 | 0.2139 | 0.2139 |
| multilingual-e5-large-instruct | 0.2544 | 0.7632 | 0.1526 | 0.2194 | 0.2194 |
| Upstage Embedding | 0.2982 | 0.8947 | 0.1789 | **0.2237** | **0.2237** |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.1637 | 0.4912 | 0.0982 | 0.1437 | 0.1437 |
| openai_embed_3_small | 0.2690 | 0.8070 | 0.1614 | 0.2148 | 0.2148 |
| ko-sroberta-multitask | 0.2164 | 0.6491 | 0.1298 | 0.1697 | 0.1697 |
| openai_embed_3_large | 0.2807 | 0.8421 | 0.1684 | 0.2088 | 0.2088 |
| KU-HIAI-ONTHEIT-large-v1 | 0.3041 | 0.9123 | 0.1825 | 0.2137 | 0.2137 |
| KU-HIAI-ONTHEIT-large-v1.1 | **0.3099** | **0.9298** | **0.1860** | 0.2148 | 0.2148 |
| kf-deberta-multitask | 0.2281 | 0.6842 | 0.1368 | 0.1724 | 0.1724 |
| gte-multilingual-base | 0.2865 | 0.8596 | 0.1719 | 0.2096 | 0.2096 |
| BGE-m3 | 0.3041 | 0.9123 | 0.1825 | 0.2193 | 0.2193 |
| bge-m3-korean | 0.2661 | 0.7982 | 0.1596 | 0.2116 | 0.2116 |
| **BGE-m3-ko** | **0.3099** | **0.9298** | **0.1860** | 0.2098 | 0.2098 |

Cross-Encoder (Reranker)

| Model name | F1 | Recall | Precision | MAP | MRR |
|---------------------------------------|------------|------------|------------|------------|------------|
| jinaai/jina-reranker-v2-base-multilingual | 0.3129 | 0.9386 | 0.1877 | 0.8643 | 0.8643 |
| Alibaba-NLP/gte-multilingual-reranker-base | 0.3158 | 0.9474 | 0.1895 | 0.8234 | 0.8234 |
| BAAI/bge-reranker-v2-m3 | **0.3216** | **0.9649** | **0.1930** | 0.9189 | 0.9189 |
| **bge-reranker-v2-m3-ko** | **0.3216** | **0.9649** | **0.1930** | **0.9371** | **0.9371** |