bwang0911 committed
Commit 7ba833c
1 Parent(s): 5fd4882

docs: update readme

Files changed (1):
README.md +8 -13
README.md CHANGED
@@ -21540,9 +21540,9 @@ The easiest way to start using `jina-embeddings-v3` is with the [Jina Embedding
 
 
 `jina-embeddings-v3` is a **multilingual multi-task text embedding model** designed for a variety of NLP applications.
-Based on the [XLM-RoBERTa architecture](https://huggingface.co/jinaai/xlm-roberta-flash-implementation),
+Based on the [Jina-XLM-RoBERTa architecture](https://huggingface.co/jinaai/xlm-roberta-flash-implementation),
 this model supports [Rotary Position Embeddings (RoPE)](https://arxiv.org/abs/2104.09864) to handle long input sequences up to **8192 tokens**.
-Additionally, it features [LoRA](https://arxiv.org/abs/2106.09685) adapters to generate task-specific embeddings efficiently.
+Additionally, it features 5 [LoRA](https://arxiv.org/abs/2106.09685) adapters to generate task-specific embeddings efficiently.
 
 ### Key Features:
 - **Extended Sequence Length:** Supports up to 8192 tokens with RoPE.
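The RoPE scheme cited in this hunk encodes position by rotating pairs of feature dimensions through position-dependent angles, so attention scores depend only on relative offsets rather than absolute positions. A minimal numpy sketch of that idea (an illustrative toy, not the model's actual implementation):

```python
import numpy as np

def rope_rotate(x: np.ndarray, pos: int, base: float = 10000.0) -> np.ndarray:
    """Rotate consecutive feature pairs of x by position-dependent angles (toy RoPE)."""
    d = x.shape[-1]
    freqs = base ** (-np.arange(0, d, 2) / d)  # one frequency per feature pair
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin
    out[1::2] = x1 * sin + x2 * cos
    return out

# The dot product of two rotated vectors depends only on their relative offset
# (7 - 3 == 107 - 103), which is what lets RoPE-based models handle sequences
# far longer than those seen with absolute position embeddings.
q, k = np.random.randn(2, 64)
assert np.isclose(rope_rotate(q, 3) @ rope_rotate(k, 7),
                  rope_rotate(q, 103) @ rope_rotate(k, 107))
```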
@@ -21554,13 +21554,8 @@ Additionally, it features [LoRA](https://arxiv.org/abs/2106.09685) adapters to g
 - `text-matching`: Used for embeddings in tasks that quantify similarity between two texts, such as STS or symmetric retrieval tasks
 - **Matryoshka Embeddings**: Supports flexible embedding sizes (`32, 64, 128, 256, 512, 768, 1024`), allowing for truncating embeddings to fit your application.
 
-### Model Lineage:
-
-The `jina-embeddings-v3` model is an enhancement of the [FacebookAI/xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) model, initially trained on 100 languages. This model's functionality has been extended through an additional pretraining phase using the [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) dataset. Additionally, LoRA was employed to increase the context length to 8192 tokens. For further optimization, contrastive fine-tuning was performed across 30 languages, improving its performance in both monolingual and cross-lingual embedding tasks.
-
-
 ### Supported Languages:
-While the base model supports 100 languages, we've focused our tuning efforts on the following 30 languages:
+While the foundation model supports 89 languages, we've focused our tuning efforts on the following 30 languages:
 **Arabic, Bengali, Chinese, Danish, Dutch, English, Finnish, French, Georgian, German, Greek,
 Hindi, Indonesian, Italian, Japanese, Korean, Latvian, Norwegian, Polish, Portuguese, Romanian,
 Russian, Slovak, Spanish, Swedish, Thai, Turkish, Ukrainian, Urdu,** and **Vietnamese.**
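Matryoshka truncation, mentioned in the hunk above, amounts to keeping the leading dimensions of each vector and re-normalizing. A minimal sketch, assuming downstream cosine similarity over unit-norm vectors:

```python
import numpy as np

def truncate_embeddings(emb: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` Matryoshka dimensions and re-normalize.

    `dim` should be one of the supported sizes: 32, 64, 128, 256, 512, 768, 1024.
    """
    truncated = emb[:, :dim]
    return truncated / np.linalg.norm(truncated, axis=1, keepdims=True)

# e.g. shrink 1024-d vectors to 128-d before indexing to cut vector-store memory 8x
emb_1024 = np.random.randn(4, 1024)           # stand-in for real model output
emb_128 = truncate_embeddings(emb_1024, 128)  # shape (4, 128), unit-norm rows
```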
@@ -21610,7 +21605,7 @@ model = AutoModel.from_pretrained("jinaai/jina-embeddings-v3", trust_remote_code
 encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
 
 with torch.no_grad():
-    model_output = model(**encoded_input)
+    model_output = model(**encoded_input, task_type='retrieval.query')
 
 embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
 embeddings = F.normalize(embeddings, p=2, dim=1)
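The snippet this hunk edits relies on a `mean_pooling` helper defined earlier in the README. For context, a self-contained sketch of the full flow, reassembled from the pieces visible here, using the standard mean-pooling formulation and a placeholder sentence:

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average token embeddings, ignoring padded positions.
    token_embeddings = model_output[0]
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v3")
model = AutoModel.from_pretrained("jinaai/jina-embeddings-v3", trust_remote_code=True)

sentences = ["How is the weather today?"]  # placeholder input
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    # task_type selects one of the task-specific LoRA adapters, per the change above.
    model_output = model(**encoded_input, task_type="retrieval.query")

embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
embeddings = F.normalize(embeddings, p=2, dim=1)
```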
@@ -21703,16 +21698,16 @@ embeddings = model.encode(
 | jina-embeddings-v3 | 1024 | **65.60** | **82.58**| 45.27| 84.01| 58.13| 53.87| **85.8** | 30.98|
 | jina-embeddings-v2-en | 768 | 58.12 | 68.82 | 40.08| 84.44| 55.09| 45.64| 80.00| 30.56|
 | text-embedding-3-large | 3072 | 62.03 | 75.45 | 49.01| 84.22| 59.16| 55.44| 81.04| 29.92|
-| multilingual-e5-large-instruct | 4096 | 64.41 | 77.56 | 47.1 | 86.19| 58.58| 52.47| 84.78| 30.39|
-| Cohere-embed-multilingual-v3.0 | 4096 | 60.08 | 64.01 | 46.6 | 86.15| 57.86| 53.84| 83.15| 30.99|
+| multilingual-e5-large-instruct | 1024 | 64.41 | 77.56 | 47.1 | 86.19| 58.58| 52.47| 84.78| 30.39|
+| Cohere-embed-multilingual-v3.0 | 1024 | 60.08 | 64.01 | 46.6 | 86.15| 57.86| 53.84| 83.15| 30.99|
 
 ### Multilingual MTEB
 
 | Model | Dimension | Average | Classification | Clustering | Pair Classification | Reranking | Retrieval | STS | Summarization |
 |:------------------------------:|:---------:|:---------:|:--------------:|:----------:|:-------------------:|:---------:|:---------:|:---------:|:-------------:|
 | jina-embeddings-v3 | 1024 | **64.44** | **71.46** | 46.71 | 76.91 | 63.98 | 57.98 | **69.83** | - |
-| multilingual-e5-large | 4096 | 59.58 | 65.22 | 42.12 | 76.95 | 63.4 | 52.37 | 64.65 | - |
-| multilingual-e5-large-instruct | 4096 | 64.25 | 67.45 | **52.12** | 77.79 | **69.02** | **58.38** | 68.77 | - |
+| multilingual-e5-large | 1024 | 59.58 | 65.22 | 42.12 | 76.95 | 63.4 | 52.37 | 64.65 | - |
+| multilingual-e5-large-instruct | 1024 | 64.25 | 67.45 | **52.12** | 77.79 | **69.02** | **58.38** | 68.77 | - |
 
 
 ### Long Context Tasks (LongEmbed)
 