dahara1/imatrix-jpn-test

dahara1 committed on Sep 23

Commit da18eba • 1 Parent(s): c803ee5

Update README.md

Files changed (1)

README.md +23 -21

@@ -1,10 +1,8 @@
 ---
-# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
-# Doc / guide: https://huggingface.co/docs/hub/model-cards
 {}
 ---
-# Model Card for Model ID
 gemma-2-9b-it quantized with imatrix containing a lot of Japanese text
 日本語テキストを多く含むimatrixで量子化されたgemma-2-9b-it
@@ -12,23 +10,29 @@ gemma-2-9b-it quantized with imatrix containing a lot of Japanese text
 ## Model Details
 It is known that using imatrix when quantizing a model for llama.cpp improves performance.
-However, imatrix is often created only from English text. In cases where a model is used in languages other than English, wouldn't it be better to create an imatrix by mixing text in other languages?
-This page confirms the effectiveness of multilingual imatrix.
 モデルをllama.cpp用に量子化する際にimatrixを使うと性能が向上する事が知られています。
-しかし、imatrixは英語テキストのみから作成されている事が多いです。英語以外の言語を使ってモデルを使用するケースでは他の言語のテキストも混ぜてimatrixを作成した方がよいのではないでしょうか？
-本ページは多言語版imatrixの有効性を確かめました。
-### Model Description
-## Performance Evaluation
 The experiments took considerable time, totaling 18 runs (3 hours per file x 18 runs). The `imatrix-jpn-test` consistently showed lower perplexity scores compared to the `no imatrix` models, particularly on Japanese datasets (`ja-wiki`). For instance, the `imatrix-jpn-test M` scored 17.2069, improving over the `no imatrix M` score of 17.3948.
 実験にはかなりの時間がかかり、合計 18 回実行されました (ファイルあたり 3 時間 x 18 回実行)。`imatrix-jpn-test` は、特に日本語データセット (`ja-wiki`) で、`no imatrix` モデルと比較して一貫して低いパープレキシティ スコアを示しました。たとえば、`imatrix-jpn-test M` のスコアは 17.2069 で、`no imatrix M` のスコア 17.3948 よりも向上しました。
-`imatrix-jpn-test` outperformed `no imatrix` models across all sizes (M, L, fp16) in both English and Japanese datasets, indicating the effectiveness of the imatrix approach, especially for non-English languages.
-`imatrix-jpn-test` は、英語と日本語の両方のデータセットにおいて、すべてのサイズ (M、L、fp16) で `no imatrix` モデルよりも優れたパフォーマンスを示し、特に英語以外の言語において imatrix アプローチの有効性を示しました。
-## Results Summary
 ![wiki.test.raw_perplexity_score.png](wiki.test.raw_perplexity_score.png)
 Measurements using English wiki.test.raw suggest that imatrix improves perplexity scores.
@@ -36,9 +40,8 @@ Measurements using English wiki.test.raw suggest that imatrix improves perplexit
 ![ja-wiki.test.raw_perplexity_score.png](ja-wiki.test.raw_perplexity_score.png)
-Measurements using Japanese ja-wiki.test.raw data suggest that L/fp16 quants improve scores.
-日本語のja-wiki.test.rawデータを使った計測ではL/fp16クォンツがスコアを向上させる事が示唆された
 | Model                | wiki.test.raw Perplexity | ja-wiki.test.raw Perplexity |
 |----------------------|--------------------------|-----------------------------|
@@ -99,12 +102,12 @@ Example:
 ### 注意事項 Notes
-- These results may vary depending on the model. It is best not to assume that these results apply to all models.In particular, gemma is said to improve performance with L/fp16 quant.
 - Even under almost identical conditions, scores may increase or decrease slightly. It is better to focus on trends rather than small differences.
 - Please note that the imatrix-jpn-test model uses 5 times as much text for the imatrix as the bartowski model. There is a possibility that the performance may be slightly increased simply because there is more text.
 - In reality, it is better to measure performance with real tasks rather than perplexity. However, there are many different benchmarks for real tasks, so I will leave it up to you to verify this.
-- モデルによってこの結果は異なってくる可能性があります。あらゆるモデルに通用する結果とはまだ思わない方がよいです。特にgemmaはL/fp16クォンツで性能が向上すると言われています
 - ほぼ同等の条件でも微妙にスコアが増減する事があります。わずかな差に注目するのではなく傾向に注目する事が望ましいです
 - imatrix-jpn-testモデルはbartowskiモデルに比べてimatrixに5倍のテキストを使用している事に留意してください。単純にテキストが多いため性能が微妙に増えている可能性があります
 - 本来はperplexityではなく実タスクで性能を測定する事が望ましいです。しかし、実タスクのベンチマークも多様なのでその検証は皆さんにお任せします
@@ -113,10 +116,10 @@ Example:
 - Imatrix is effective in the 4-bit quantization we tried this time.
 - If you want to improve the performance of languages other than English, it may be worth adding other languages to the imatrix, but it may decrease the model's English ability.
-- If you are only using English, the quantization variations may not make much difference.
 - 今回試した4bit量子化においてimatrixは有効です
 - 英語以外の言語の性能を少しでも向上させたい場合はimatrixに他言語を追加する価値はありそうです。しかし、モデルの英語能力が下がる可能性があります。
-- 英語だけを使っている場合、量子化のバリエーションは大きな違いがない可能性があります
 ### その他参考情報 Other references
@@ -130,7 +133,6 @@ The following information may be helpful in your further exploration.
 ### 謝辞 Acknowledgements
 Thanks to the llama.cpp community.
 llama.cppのコミュニティの皆さんに感謝します。
 Thanks to the Google Gemma-2.
@@ -143,7 +145,7 @@ I do not know all the inventors of each method, so please point out any that I h
 - **Developed by:** [dahara1@webbigdata]
 - **Language(s) (NLP):** [English, Japanese]
-- **base model [optional]:** [gemma-2-9b-it]
 **BibTeX:**

 ---
 {}
 ---
+# Model Card for imatrix-jpn-test
 gemma-2-9b-it quantized with imatrix containing a lot of Japanese text
 日本語テキストを多く含むimatrixで量子化されたgemma-2-9b-it
 ## Model Details
 It is known that using imatrix when quantizing a model for llama.cpp improves performance.
+Imatrixes are often created only from English text.
+However, if you are using a model in a language other than English, wouldn't it be better to create an imatrix that includes text in other languages as well?
+This model was created to verify the effectiveness of a multilingual imatrix.
 モデルをllama.cpp用に量子化する際にimatrixを使うと性能が向上する事が知られています。
+imatrixは英語テキストのみから作成されている事が多いです。
+しかし、英語以外の言語を使ってモデルを使用するケースでは他の言語のテキストも混ぜてimatrixを作成した方がよいのではないでしょうか？
+本モデルは多言語版imatrixの有効性を確かめるために作成されたモデルです。
+## Model Description
+### Performance Evaluation
 The experiments took considerable time, totaling 18 runs (3 hours per file x 18 runs). The `imatrix-jpn-test` consistently showed lower perplexity scores compared to the `no imatrix` models, particularly on Japanese datasets (`ja-wiki`). For instance, the `imatrix-jpn-test M` scored 17.2069, improving over the `no imatrix M` score of 17.3948.
 実験にはかなりの時間がかかり、合計 18 回実行されました (ファイルあたり 3 時間 x 18 回実行)。`imatrix-jpn-test` は、特に日本語データセット (`ja-wiki`) で、`no imatrix` モデルと比較して一貫して低いパープレキシティ スコアを示しました。たとえば、`imatrix-jpn-test M` のスコアは 17.2069 で、`no imatrix M` のスコア 17.3948 よりも向上しました。
+The imatrix-jpn-test model achieved lower perplexity than both the no imatrix model and the bartowski model on Japanese data, but slightly higher (worse) perplexity than the bartowski model on English data.
+*The lower the perplexity, the better.
+imatrix-jpn-testモデルは、日本語データで測定したパープレキシティではno imatrixモデルおよびbartowskiモデルよりも優れたパフォーマンスを示しましたが、英語データで測定したパープレキシティではbartowskiモデルよりも若干高いパープレキシティを示しました。
+※パープレキシティは低い方が良い指標です
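
As a rough illustration of how the perplexity figures above can be read, here is a minimal Python sketch. It assumes the standard definition of perplexity (the exponential of the negative mean natural-log probability per token) and uses the two M scores quoted in the text. The function names `perplexity` and `relative_reduction` are hypothetical helpers introduced for this example; they are not part of llama.cpp or of the author's measurement scripts.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(baseline_ppl: float, new_ppl: float) -> float:
    """Fractional drop in perplexity relative to the baseline (positive = better)."""
    if baseline_ppl <= 0:
        raise ValueError("baseline perplexity must be positive")
    return (baseline_ppl - new_ppl) / baseline_ppl


# Toy check: a model that gives every token probability 0.25 has perplexity of about 4.
toy_ppl = perplexity([math.log(0.25)] * 4)

# Scores quoted in the text above for the M variants.
NO_IMATRIX_M = 17.3948
IMATRIX_JPN_TEST_M = 17.2069

reduction = relative_reduction(NO_IMATRIX_M, IMATRIX_JPN_TEST_M)
print(f"Relative perplexity reduction: {reduction:.2%}")
```

With the quoted scores the reduction is roughly 1%, which is why the notes recommend looking at trends across runs rather than at any single small difference.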
+### Results Summary
 ![wiki.test.raw_perplexity_score.png](wiki.test.raw_perplexity_score.png)
 Measurements using English wiki.test.raw suggest that imatrix improves perplexity scores.
 ![ja-wiki.test.raw_perplexity_score.png](ja-wiki.test.raw_perplexity_score.png)
+Measurements using the Japanese ja-wiki.test.raw data suggest that the L and fp16 quantization variations improve scores.
+日本語のja-wiki.test.rawデータを使った計測ではquantizations variation Lとquantizations variation fp16がスコアを向上させる事が示唆された
 | Model                | wiki.test.raw Perplexity | ja-wiki.test.raw Perplexity |
 |----------------------|--------------------------|-----------------------------|
 ### 注意事項 Notes
+- These results may vary depending on the model. It is best not to assume that these results apply to all models. Gemma in particular is known to perform better with the L and fp16 quantization variations.
 - Even under almost identical conditions, scores may increase or decrease slightly. It is better to focus on trends rather than small differences.
 - Please note that the imatrix-jpn-test model uses 5 times as much text for the imatrix as the bartowski model. There is a possibility that the performance may be slightly increased simply because there is more text.
 - In reality, it is better to measure performance with real tasks rather than perplexity. However, there are many different benchmarks for real tasks, so I will leave it up to you to verify this.
+- モデルによってこの結果は異なってくる可能性があります。あらゆるモデルに通用する結果とはまだ思わない方がよいです。gemmaは特にLおよびfp16のquantizations variationクォンツで性能が向上する事は知られています
 - ほぼ同等の条件でも微妙にスコアが増減する事があります。わずかな差に注目するのではなく傾向に注目する事が望ましいです
 - imatrix-jpn-testモデルはbartowskiモデルに比べてimatrixに5倍のテキストを使用している事に留意してください。単純にテキストが多いため性能が微妙に増えている可能性があります
 - 本来はperplexityではなく実タスクで性能を測定する事が望ましいです。しかし、実タスクのベンチマークも多様なのでその検証は皆さんにお任せします
 - Imatrix is effective in the 4-bit quantization we tried this time.
 - If you want to improve the performance of languages other than English, it may be worth adding other languages to the imatrix, but it may decrease the model's English ability.
+- If you are only using English, the quantization variations may not make much difference at 4-bit.
 - 今回試した4bit量子化においてimatrixは有効です
 - 英語以外の言語の性能を少しでも向上させたい場合はimatrixに他言語を追加する価値はありそうです。しかし、モデルの英語能力が下がる可能性があります。
+- 英語だけを使っている場合、量子化のバリエーションは4bitでは大きな違いがない可能性があります
 ### その他参考情報 Other references
 ### 謝辞 Acknowledgements
 Thanks to the llama.cpp community.
 llama.cppのコミュニティの皆さんに感謝します。
 Thanks to the Google Gemma-2.
 - **Developed by:** [dahara1@webbigdata]
 - **Language(s) (NLP):** [English, Japanese]
+- **base model [optional]:** [gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
 **BibTeX:**