webbigdata/C3TR-Adapter_gguf

@@ -8,30 +8,36 @@ tags:
 - llama.cpp
 ---
-Gemmaベースの日英、英日ニューラル機械翻訳モデルである[webbigdata/C3TR-Adapter](https://huggingface.co/webbigdata/C3TR-Adapter)をGPUがないPCでも動かせるようにggufフォーマットに変換したモデルです。
 This is the Gemma-based Japanese-English and English-Japanese neural machine translation model [webbigdata/C3TR-Adapter](https://huggingface.co/webbigdata/C3TR-Adapter), converted to gguf format so that it can run on a PC without a GPU.
-現在のgguf版は翻訳後に幻覚を追加出力してしまう傾向があり、パラメーターを適宜調整する必要があります。
-The current gguf version tends to append hallucinated text after the translation, so the parameters need to be adjusted accordingly.
 リンク先で[Open in Colab]ボタンを押してColabを起動してください
 Open the link below and press the [Open in Colab] button to start Colab.
 [Colab Sample C3TR_Adapter_gguf_Free_Colab_sample](https://github.com/webbigdata-jp/python_sample/blob/main/C3TR_Adapter_gguf_Free_Colab_sample.ipynb)
-llama.cppを使うと、様々な量子化手法でファイルのサイズを小さくする事が出来ますが、本サンプルでは5種類のみを扱います。小さいサイズのモデルは、少ないメモリで高速に動作させることができますが、モデルの性能も低下します。4ビット(q4_0)くらいがバランスが良いと言われていますが、本サンプルコードでは特定の文章を全モデルで翻訳し、どのモデルが貴方の作業に適切かを確認できるようにしたものです。
-Although llama.cpp supports many quantization methods for reducing file size, this sample covers only five. Smaller models run faster and need less memory, but their output quality is also lower. 4-bit quantization (q4_0) is generally said to offer a good balance, but this sample code translates a specific sentence with every model so that you can check which one suits your work.
 - C3TR-Adapter.Q4_0.gguf 5.01 GB
 - C3TR-Adapter.Q4_1.gguf 5.5 GB
 - C3TR-Adapter.Q5_0.gguf 5.98 GB
 - C3TR-Adapter.Q5_1.gguf 6.47 GB
 - C3TR-Adapter.IQ3_M.gguf 3.9 GB (3.66 bpw quantization mix. 動作確認できた最も小さいモデル。The smallest model that has been confirmed to work)
-- C3TR-Adapter.IQ1_S.gguf 2.16 GB (1.56 bpw quantization. 正常動作しません。does not work as intended)
 ### サンプルコード(sample code)
-#### Install and compile(linux)
 その他のOSについては[llama.cpp公式サイト](https://github.com/ggerganov/llama.cpp)を確認してください
 For other operating systems, please check the [llama.cpp official website](https://github.com/ggerganov/llama.cpp)
 ```
@@ -87,6 +93,9 @@ Translate Japanese to English.
 ### パラメーター(Parameters)
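 必要に応じて下記のパラメーターを調整してください
 Adjust the following parameters as needed.
 - 温度（--temp）: この値を下げると、モデルがより確信度の高い（つまり、より一般的な）単語を選択する傾向が強くなります。
 - Temperature (--temp): Lowering this value makes the model more likely to choose higher-confidence (that is, more common) words.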

 - llama.cpp
 ---
+Gemmaベースの日英、英日ニューラル機械翻訳モデルである[webbigdata/C3TR-Adapter](https://huggingface.co/webbigdata/C3TR-Adapter)をGPUがないPCでも動かせるようにggufフォーマットに変換したモデルです。
 This is the Gemma-based Japanese-English and English-Japanese neural machine translation model [webbigdata/C3TR-Adapter](https://huggingface.co/webbigdata/C3TR-Adapter), converted to gguf format so that it can run on a PC without a GPU.
+### 簡単に試す方法(Easy way to try it)
+Googleの無料WebサービスColabを使うとブラウザを使って試す事ができます。
+You can try it using your browser with Colab, Google's free web service.
 リンク先で[Open in Colab]ボタンを押してColabを起動してください
 Open the link below and press the [Open in Colab] button to start Colab.
 [Colab Sample C3TR_Adapter_gguf_Free_Colab_sample](https://github.com/webbigdata-jp/python_sample/blob/main/C3TR_Adapter_gguf_Free_Colab_sample.ipynb)
+### 利用可能なVersion(Available Versions)
+llama.cppを使うと、様々な量子化手法でファイルのサイズを小さくする事が出来ますが、本サンプルでは6種類のみを扱います。小さいサイズのモデルは、少ないメモリで高速に動作させることができますが、モデルの性能も低下します。4ビット(q4_0)くらいがバランスが良いと言われています。
+Although llama.cpp supports many quantization methods for reducing file size, this sample covers only six. Smaller models run faster and need less memory, but their output quality is also lower. 4-bit quantization (q4_0) is generally said to offer a good balance.
 - C3TR-Adapter.Q4_0.gguf 5.01 GB
 - C3TR-Adapter.Q4_1.gguf 5.5 GB
 - C3TR-Adapter.Q5_0.gguf 5.98 GB
 - C3TR-Adapter.Q5_1.gguf 6.47 GB
 - C3TR-Adapter.IQ3_M.gguf 3.9 GB (3.66 bpw quantization mix. 動作確認できた最も小さいモデル。The smallest model that has been confirmed to work)
+- C3TR-Adapter.IQ1_S.gguf 2.16 GB (1.56 bpw quantization. まだ正常動作しないが原理上最も小さいモデル。Smallest model in principle, although it still does not work properly)
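
As a rough aid for choosing among the files above, the following minimal sketch picks the largest listed model that fits a given amount of free memory. The file sizes are copied from the list; the function name `pick_model` and the 1 GB `headroom_gb` default are hypothetical, and the assumption that required memory is roughly the file size plus some headroom is only an approximation. IQ1_S is left out because it is listed as not yet working properly.

```python
# File sizes in GB, copied from the list above.
# IQ1_S is omitted because it does not work properly yet.
MODEL_SIZES_GB = {
    "C3TR-Adapter.Q4_0.gguf": 5.01,
    "C3TR-Adapter.Q4_1.gguf": 5.5,
    "C3TR-Adapter.Q5_0.gguf": 5.98,
    "C3TR-Adapter.Q5_1.gguf": 6.47,
    "C3TR-Adapter.IQ3_M.gguf": 3.9,
}


def pick_model(free_ram_gb, headroom_gb=1.0):
    """Return the largest listed file that fits in free_ram_gb, or None.

    headroom_gb is a hypothetical safety margin for context buffers and the
    OS; it is an assumption, not a measured requirement.
    """
    budget = free_ram_gb - headroom_gb
    candidates = [
        (size, name) for name, size in MODEL_SIZES_GB.items() if size <= budget
    ]
    if not candidates:
        return None
    # Tuples compare by size first, so max() yields the largest fitting file.
    return max(candidates)[1]


if __name__ == "__main__":
    for ram in (4.0, 6.0, 8.0):
        print(ram, "GB free ->", pick_model(ram))
```

Larger files generally preserve more of the original model's quality, so the sketch prefers the biggest file that fits. Comparing actual translations, as the Colab sample does, remains the more reliable check.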
 ### サンプルコード(sample code)
+ColabのCPUは少し遅いので、少し技術的な挑戦が必要ですが皆さんが所有しているPCで動かす方が良いでしょう。
+Colab's CPU is somewhat slow, so running the model on your own PC is preferable, although that takes a little more technical effort.
+#### Install and compile example(linux)
 その他のOSについては[llama.cpp公式サイト](https://github.com/ggerganov/llama.cpp)を確認してください
 For other operating systems, please check the [llama.cpp official website](https://github.com/ggerganov/llama.cpp)
 ```
 ### パラメーター(Parameters)
+現在のgguf版は翻訳後に幻覚を追加出力してしまう傾向があり、パラメーターを適宜調整する必要があります。
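+The current gguf version tends to append hallucinated text after the translation, so the parameters need to be adjusted accordingly.
 必要に応じて下記のパラメーターを調整してください
 Adjust the following parameters as needed.
 - 温度（--temp）: この値を下げると、モデルがより確信度の高い（つまり、より一般的な）単語を選択する傾向が強くなります。
 - Temperature (--temp): Lowering this value makes the model more likely to choose higher-confidence (that is, more common) words.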
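
To illustrate the effect of temperature described above, here is a minimal sketch of the standard temperature-scaled softmax. It is an illustration of the general idea, not llama.cpp's actual sampler code; the function name `softmax_with_temperature` and the example logits are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities after dividing by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical scores for three candidate words.
    logits = [2.0, 1.0, 0.5]
    for t in (1.0, 0.5, 0.1):
        probs = softmax_with_temperature(logits, t)
        print("temp", t, [round(p, 3) for p in probs])
```

As the temperature drops, the probability mass concentrates on the highest-scoring word, which is why a lower `--temp` makes the output more deterministic.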