webbigdata
/

C3TR-Adapter_gguf

 - ja
 tags:
 - translation
+---
+This is a GGUF conversion of [webbigdata/C3TR-Adapter](https://huggingface.co/webbigdata/C3TR-Adapter), a Japanese-to-English and English-to-Japanese neural machine translation model, so that it can run on PCs without a GPU.
+Unfortunately, the current GGUF version tends to append hallucinated text after the translation, so you will need to adjust the generation parameters accordingly.
+[Colab Sample C3TR_Adapter_gguf_Free_Colab_sample](https://github.com/webbigdata-jp/python_sample/blob/main/C3TR_Adapter_gguf_Free_Colab_sample.ipynb)
+llama.cpp can shrink the model file with various quantization methods; this sample covers only five of them. Smaller models run faster and need less memory, but model quality also drops. 4-bit quantization (q4_0) is said to be a good balance; this sample code translates a specific sentence with every model so that you can check which one suits your work.
+- C3TR-Adapter.Q4_0.gguf 5.01 GB
+- C3TR-Adapter.Q4_1.gguf 5.5 GB
+- C3TR-Adapter.Q5_0.gguf 5.98 GB
+- C3TR-Adapter.Q5_1.gguf 6.47 GB
+- C3TR-Adapter.IQ1_S.gguf 2.16 GB (1-bit quantization; does not yet work as intended)
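As a rough illustration of the size trade-off, the sketch below narrows the files listed above to those that fit a given memory budget. The names and sizes are copied from the list; the `candidates` helper and the budget values are hypothetical, and file size is only a rough lower bound on the memory actually needed at run time.

```python
from typing import Dict, List

# File names and sizes in GB, copied from the list above.
MODEL_FILES: Dict[str, float] = {
    "C3TR-Adapter.Q4_0.gguf": 5.01,
    "C3TR-Adapter.Q4_1.gguf": 5.5,
    "C3TR-Adapter.Q5_0.gguf": 5.98,
    "C3TR-Adapter.Q5_1.gguf": 6.47,
    "C3TR-Adapter.IQ1_S.gguf": 2.16,  # noted above as not working as intended
}


def candidates(memory_budget_gb: float, skip_broken: bool = True) -> List[str]:
    """Return the model files whose size fits the budget, largest first.

    Assumption: file size is used as a crude stand-in for memory use;
    real usage is higher (context buffers and so on).
    """
    names = [
        name
        for name, size in MODEL_FILES.items()
        if size <= memory_budget_gb
        and not (skip_broken and "IQ1_S" in name)
    ]
    # Larger files are tried first because they lose less quality.
    return sorted(names, key=lambda n: MODEL_FILES[n], reverse=True)


if __name__ == "__main__":
    # Hypothetical 6 GB budget.
    print(candidates(6.0))
```

The remaining candidates can then be compared on the same sentence, as the Colab sample does.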
+### Parameters
+Adjust the following parameters:
+- Temperature (--temp): Lowering this value makes the model more likely to pick its highest-probability (i.e., more common) tokens.
+- Top P (--top_p): Lowering this value narrows the set of candidate tokens the model samples from, producing more consistent text.
+- Number of tokens to generate (-n): Reducing this value shortens the generated text and helps prevent unwanted extra text after the translation. -1 = infinity (default), -2 = until the context is filled.
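A minimal sketch of how these three parameters might be passed to the llama.cpp command-line tool from Python. Everything beyond the flags themselves is an assumption: the binary path (`./main` here; it differs between llama.cpp builds), the default values, and the placeholder prompt, which is not the model's actual prompt template. The `build_command` helper is hypothetical and only assembles the argument list; it does not run anything.

```python
import shlex
from typing import List


def build_command(
    model_path: str,
    prompt: str,
    temp: float = 0.1,      # illustrative value, not a recommendation
    top_p: float = 0.5,     # illustrative value, not a recommendation
    n_predict: int = 128,   # illustrative cap on generated tokens
    binary: str = "./main", # assumed path to the llama.cpp CLI binary
) -> List[str]:
    """Assemble a llama.cpp argument list with conservative sampling settings."""
    return [
        binary,
        "-m", model_path,        # model file to load
        "-p", prompt,            # prompt text
        "--temp", str(temp),     # lower = favor high-probability tokens
        "--top_p", str(top_p),   # lower = smaller candidate set
        "-n", str(n_predict),    # cap on generated tokens
    ]


if __name__ == "__main__":
    cmd = build_command("C3TR-Adapter.Q4_0.gguf", "PLACEHOLDER PROMPT")
    # Print a shell-quoted command line for inspection.
    print(shlex.join(cmd))
```

To actually run it, the list can be handed to `subprocess.run(cmd)` once the binary path and prompt have been replaced with real values. Keeping `-n` small is the most direct guard against extra text being generated after the translation.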