Commit a6d84d0 by sdadas
1 parent: acef991

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -22,7 +22,7 @@ This is a text encoder based on [stella_en_1.5B_v5](https://huggingface.co/dunzh
  - In the first step, we adapted the model for Polish with [multilingual knowledge distillation method](https://aclanthology.org/2020.emnlp-main.365/) using a diverse corpus of 20 million Polish-English text pairs.
  - The second step involved fine-tuning the model with contrastrive loss using a dataset consisting of 1.4 million queries. Positive and negative passages for each query have been selected with the help of [BAAI/bge-reranker-v2.5-gemma2-lightweight](https://huggingface.co/BAAI/bge-reranker-v2.5-gemma2-lightweight) reranker. The model was trained for three epochs with a batch size of 1024 queries.
 
- The encoder transforms texts to 1024 dimensional vectors. The model is optimized specifically for Polish information retrieval tasks. If you need a more versatile encoder, suitable for a wider range of tasks such as semantic similarity or clustering, you probably use the distilled version from the first step: [sdadas/stella-pl](https://huggingface.co/sdadas/stella-pl).
+ The encoder transforms texts to 1024 dimensional vectors. The model is optimized specifically for Polish information retrieval tasks. If you need a more versatile encoder, suitable for a wider range of tasks such as semantic similarity or clustering, you should probably use the distilled version from the first step: [sdadas/stella-pl](https://huggingface.co/sdadas/stella-pl).
 
  ## Usage (Sentence-Transformers)
 