small update (readme)

- README.md: +2 -0
- README_JA.md: +11 -0
README.md

@@ -22,6 +22,8 @@ datasets:
 
 # GLuCoSE (General Luke-based COntrastive Sentence Embedding)-base-Japanese
 
+[日本語のREADME/Japanese README](https://huggingface.co/pkshatech/GLuCoSE-base-ja)
+
 GLuCoSE (General LUke-based COntrastive Sentence Embedding, "glucose") is a Japanese text embedding model based on [LUKE](https://github.com/studio-ousia/luke). In order to create a general-purpose, user-friendly Japanese text embedding model, GLuCoSE has been trained on a mix of web data and various datasets associated with natural language inference and search. This model is not only suitable for sentence vector similarity tasks but also for semantic search tasks.
 - Maximum token count: 512
 - Output dimension: 768
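The model card above describes sentence embeddings that are compared for similarity tasks. As a minimal sketch of that comparison step, here is a plain-Python cosine-similarity helper; the short vectors are hypothetical stand-ins, not real 768-dimensional GLuCoSE outputs:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for model outputs; real GLuCoSE embeddings have 768 dimensions.
emb_a = [0.2, 0.8, 0.1]
emb_b = [0.25, 0.75, 0.05]

# A value near 1.0 indicates the underlying sentences are similar.
print(cosine_similarity(emb_a, emb_b))
```

In practice the two vectors would come from encoding two sentences with the model; the comparison itself is independent of how the embeddings were produced.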
README_JA.md

@@ -8,10 +8,21 @@ tags:
 - feature-extraction
 - sentence-transformers
 inference: false
+datasets:
+- mc4
+- clips/mqa
+- shunk031/JGLUE
+- paws-x
+- hpprc/janli
+- MoritzLaurer/multilingual-NLI-26lang-2mil7
+- castorini/mr-tydi
+- hpprc/jsick
 ---
 
 # GLuCoSE (General Luke-based COntrastive Sentence Embedding)-base-Japanese
 
+[English README/英語のREADME](https://huggingface.co/pkshatech/GLuCoSE-base-ja)
+
 GLuCoSE (General LUke-based COntrastive Sentence Embedding, "glucose") is a Japanese text embedding model based on [LUKE](https://github.com/studio-ousia/luke). Aiming at a general-purpose, easy-to-use text embedding model, it was trained on a mix of web data and several datasets associated with natural language inference and search. It can be used not only for sentence-vector similarity tasks but also for semantic search tasks.
 - Maximum token count: 512
 - Output dimension: 768
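Both READMEs note that the model also serves semantic search. A sketch of the retrieval step under that usage, assuming query and document embeddings have already been computed (the 3-dimensional vectors below are hypothetical stand-ins for 768-dimensional GLuCoSE outputs):

```python
import math

def rank_by_similarity(query, corpus):
    """Return corpus indices sorted by descending cosine similarity to the query.

    query is one embedding vector; corpus is a list of embedding vectors.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(x * x for x in b)))

    return sorted(range(len(corpus)), key=lambda i: cos(query, corpus[i]),
                  reverse=True)

# Hypothetical stand-in embeddings for a query and three documents.
query = [1.0, 0.0, 0.0]
corpus = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]

print(rank_by_similarity(query, corpus))  # → [1, 2, 0]
```

The document whose embedding points in nearly the same direction as the query's ranks first; this nearest-neighbor ranking is the core of embedding-based semantic search, regardless of which model produced the vectors.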