update readme.md
README.md
CHANGED
````diff
@@ -45,9 +45,16 @@ print(embeddings)
 ```
 
 ### Important usage notes
-- "ošišana
+- "ošišana latinica" (usage of c instead of ć, etc.) significantly decreases search quality
 - The usage of uppercase letters for named entities can significantly improve search quality
 
+## Training
+
+- Embedić models are fine-tuned from multilingual-e5 models and come in 3 sizes (small, base, large).
+
+- Training is done on a single RTX 4070 Ti SUPER GPU.
+
+- 3-step training: distillation, training on (query, text) pairs, and finally fine-tuning with triplets.
 
 ## Evaluation
 
````
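To make the first usage note concrete: "ošišana latinica" (literally "shaved Latin") is Serbian Latin text typed without diacritics. The sketch below shows what that degradation looks like, which can be handy for generating stripped variants of your own queries and checking how much retrieval suffers. The mapping table and the helper names `shave` and `has_diacritics` are assumptions for illustration, not part of the Embedić README; in particular, "đ" is flattened to "dj" here, although some writers use a plain "d".

```python
# Hypothetical mapping from Serbian Latin letters to their "shaved" ASCII forms.
# Assumption: "đ" becomes "dj"; some people write just "d" instead.
SHAVED = str.maketrans({
    "č": "c", "ć": "c", "š": "s", "ž": "z", "đ": "dj",
    "Č": "C", "Ć": "C", "Š": "S", "Ž": "Z", "Đ": "Dj",
})

SERBIAN_DIACRITICS = set("čćšžđČĆŠŽĐ")


def shave(text: str) -> str:
    """Return the text as it would look typed without diacritics."""
    return text.translate(SHAVED)


def has_diacritics(text: str) -> bool:
    """Report whether the text contains any Serbian Latin diacritic letter."""
    return any(ch in SERBIAN_DIACRITICS for ch in text)


query = "Gde se nalazi Šabac?"
shaved_query = shave(query)

print(query, "->", shaved_query)
print("original has diacritics:", has_diacritics(query))
print("shaved has diacritics:", has_diacritics(shaved_query))
```

Per the usage notes, the original form of the query (with diacritics and a capitalized place name) is the one to send to the model.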
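The last training step listed in the diff fine-tunes on triplets, but the commit does not say which objective is used. One common choice for (anchor, positive, negative) triplets is a margin-based triplet loss, sketched here in plain Python with cosine distance. The function names, the margin of 0.2, and the toy 2-dimensional vectors are all assumptions for illustration and are not taken from the Embedić training code.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that reaches zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (assumed values): the query matches `relevant` exactly
# and is orthogonal to `irrelevant`.
query = [1.0, 0.0]
relevant = [1.0, 0.0]
irrelevant = [0.0, 1.0]

well_separated = triplet_margin_loss(query, relevant, irrelevant)
badly_ordered = triplet_margin_loss(query, irrelevant, relevant)

print("loss when the relevant text is closer:", well_separated)
print("loss when the irrelevant text is closer:", badly_ordered)
```

The loss only penalizes triplets where the irrelevant text is not yet clearly farther from the query than the relevant one, which is why this kind of objective is often used as a final refinement step after training on plain (query, text) pairs.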