bwang0911 committed on
Commit
1fdda36
1 Parent(s): 7f19d36

Update README.md

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -1082,6 +1082,10 @@ model-index:
 It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence lengths.
 We have designed it for high performance in monolingual & cross-lingual applications and trained it specifically to support mixed Chinese-English input without bias.
 
+`jina-embeddings-v2-base-zh` is a bilingual Chinese-English text embedding model that supports encoding text up to 8192 characters long.
+The model is based on a BERT architecture (JinaBERT), an improvement on BERT that is the first to apply [ALiBi](https://arxiv.org/abs/2108.12409) in an encoder architecture to support longer sequences.
+Unlike previous monolingual/multilingual embedding models, we designed this bilingual model to better support both monolingual (Chinese-to-Chinese) and cross-lingual (Chinese-to-English) document retrieval.
+
 The embedding model was trained using a 512 sequence length, but extrapolates to an 8k sequence length (or even longer) thanks to ALiBi.
 This makes our model useful for a range of use cases, especially when processing long documents is needed, including long document retrieval, semantic textual similarity, text reranking, recommendation, RAG and LLM-based generative search, etc.
 
@@ -1175,7 +1179,7 @@ According to the latest blog post from [LLamaIndex](https://blog.llamaindex.ai/b
 ## Plans
 
 1. Bilingual embedding models supporting more European & Asian languages, including Spanish, French, Italian and Japanese.
-2. Multimodal embedding models enable MultimodalRAG applications.
+2. Multimodal embedding models that enable Multimodal RAG applications.
 3. High-performance rerankers.
 
 ## Contact
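
The length extrapolation this diff describes comes from ALiBi's distance-based attention bias. The sketch below is a minimal illustration of the symmetric bidirectional variant, assuming the geometric head-slope schedule from the ALiBi paper; the function names are hypothetical and this is not the model's actual implementation:

```python
def alibi_slopes(n_heads):
    # Geometric head-slope schedule from the ALiBi paper (n_heads a power
    # of two): 2^(-8/n), 2^(-2*8/n), ..., 2^(-8).
    start = 2.0 ** (-8.0 / n_heads)
    return [start ** (k + 1) for k in range(n_heads)]

def symmetric_alibi_bias(seq_len, slope):
    # Symmetric bidirectional ALiBi: the attention bias depends only on the
    # distance |i - j|, never on absolute position, so the same linear
    # penalty is defined at any sequence length -- this is why a model
    # trained at 512 tokens can run at 8192 without new position embeddings.
    return [[-slope * abs(i - j) for j in range(seq_len)]
            for i in range(seq_len)]
```

Because the bias is computed from the token distance rather than looked up in a learned position table, a bias matrix of any size (512 or 8192) can be produced from the same per-head slopes.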