Update README.md
README.md
CHANGED
@@ -2611,19 +2611,19 @@ model-index:
|
|
2611 |
<br><br>
|
2612 |
|
2613 |
<p align="center">
|
2614 |
-
<img src="https://
|
2615 |
</p>
|
2616 |
|
2617 |
|
2618 |
<p align="center">
|
2619 |
-
<b>The text embedding set trained by <a href="https://jina.ai/"><b>Jina AI</b></a
|
2620 |
</p>
|
2621 |
|
2622 |
|
2623 |
## Intended Usage & Model Info
|
2624 |
|
2625 |
`jina-embeddings-v2-base-en` is an English, monolingual **embedding model** supporting **8192 sequence length**.
|
2626 |
-
It is based on a
|
2627 |
The backbone `jina-bert-v2-base-en` is pretrained on the C4 dataset.
|
2628 |
The model is further trained on Jina AI's collection of more than 400 millions of sentence pairs and hard negatives.
|
2629 |
These pairs were obtained from various domains and were carefully selected through a thorough cleaning process.
|
@@ -2634,17 +2634,11 @@ This makes our model useful for a range of use cases, especially when processing
|
|
2634 |
With a standard size of 137 million parameters, the model enables fast inference while delivering better performance than our small model. It is recommended to use a single GPU for inference.
|
2635 |
Additionally, we provide the following embedding models:
|
2636 |
|
2637 |
-
|
2638 |
-
|
2639 |
-
- [`jina-embeddings-
|
2640 |
-
- [`jina-embeddings-
|
2641 |
-
- [`jina-embeddings-
|
2642 |
-
|
2643 |
-
**V2 (Based on JinaBert, 8k Seq)**
|
2644 |
-
|
2645 |
-
- [`jina-embeddings-v2-small-en`](https://huggingface.co/jinaai/jina-embeddings-v2-small-en): 33 million parameters.
|
2646 |
-
- [`jina-embeddings-v2-base-en`](https://huggingface.co/jinaai/jina-embeddings-v2-base-en): 137 million parameters **(you are here)**.
|
2647 |
-
- [`jina-embeddings-v2-large-en`](): 435 million parameters (releasing soon).
|
2648 |
|
2649 |
## Data & Parameters
|
2650 |
|
|
|
2611 |
<br><br>
|
2612 |
|
2613 |
<p align="center">
|
2614 |
+
<img src="https://aeiljuispo.cloudimg.io/v7/https://cdn-uploads.huggingface.co/production/uploads/603763514de52ff951d89793/AFoybzd5lpBQXEBrQHuTt.png?w=200&h=200&f=face" alt="Finetuner logo: Finetuner helps you to create experiments in order to improve embeddings on search tasks. It accompanies you to deliver the last mile of performance-tuning for neural search applications." width="150px">
|
2615 |
</p>
|
2616 |
|
2617 |
|
2618 |
<p align="center">
|
2619 |
+
<b>The text embedding set trained by <a href="https://jina.ai/"><b>Jina AI</b></a>.</b>
|
2620 |
</p>
|
2621 |
|
2622 |
|
2623 |
## Intended Usage & Model Info
|
2624 |
|
2625 |
`jina-embeddings-v2-base-en` is an English, monolingual **embedding model** supporting **8192 sequence length**.
|
2626 |
+
It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence length.
|
2627 |
The backbone `jina-bert-v2-base-en` is pretrained on the C4 dataset.
|
2628 |
The model is further trained on Jina AI's collection of more than 400 millions of sentence pairs and hard negatives.
|
2629 |
These pairs were obtained from various domains and were carefully selected through a thorough cleaning process.
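
The ALiBi line added in this hunk can be illustrated with a minimal sketch. It assumes the slope schedule from the linked ALiBi paper (a geometric sequence for a power-of-two head count) and takes "symmetric bidirectional" to mean that the penalty depends only on the absolute distance between two token positions. The function names `alibi_slopes` and `symmetric_alibi_bias` are hypothetical and are not taken from the JinaBERT code.

```python
def alibi_slopes(num_heads):
    """Per-head slopes as in the ALiBi paper, for a power-of-two head count."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles a power-of-two number of heads")
    # Geometric sequence: the first slope and the ratio are both 2^(-8 / num_heads).
    ratio = 2.0 ** (-8.0 / num_heads)
    return [ratio ** (h + 1) for h in range(num_heads)]


def symmetric_alibi_bias(seq_len, slope):
    """Bias added to attention scores: -slope * |i - j|, identical in both directions."""
    return [[-slope * abs(i - j) for j in range(seq_len)] for i in range(seq_len)]


if __name__ == "__main__":
    slopes = alibi_slopes(8)
    # Bias matrix for the first head over a 4-token sequence.
    for row in symmetric_alibi_bias(4, slopes[0]):
        print(row)
```

Because the bias is a function of distance rather than a learned position table, the same rule applies at any sequence length, which is the property the README sentence relies on for longer inputs.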
|
|
|
2634 |
With a standard size of 137 million parameters, the model enables fast inference while delivering better performance than our small model. It is recommended to use a single GPU for inference.
|
2635 |
Additionally, we provide the following embedding models:
|
2636 |
|
2637 |
+
- [`jina-embeddings-v2-small-en`](https://huggingface.co/jinaai/jina-embeddings-v2-small-en): 33 million parameters **(you are here)**.
|
2638 |
+
- [`jina-embeddings-v2-base-en`](https://huggingface.co/jinaai/jina-embeddings-v2-base-en): 137 million parameters.
|
2639 |
+
- [`jina-embeddings-v2-base-zh`](): Chinese-English Bilingual embedding model (releasing soon).
|
2640 |
+
- [`jina-embeddings-v2-base-de`](): German-English Bilingual embedding model (releasing soon).
|
2641 |
+
- [`jina-embeddings-v2-base-es`](): Spanish-English Bilingual embedding model (releasing soon).
|
2642 |
|
2643 |
## Data & Parameters
|
2644 |
|