grapevine-AI/Qwen2.5-32B-Instruct-GGUF-Japanese-imatrix

grapevine-AI committed about 2 hours ago

Commit 79c0ea1 • 1 Parent(s): 50a43c3

Update README.md

Files changed (1)

README.md +2 -2

README.md CHANGED

@@ -11,7 +11,7 @@ license: apache-2.0
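 Note that the officially distributed GGUF files did not include BF16 or FP32 versions, so all of this work was done with the Q8_0 quantized model.<br>
 (The Q8_0 model was used not only for the imatrix computation but also for quantization, where the `--allow-requantize` option permits requantizing from Q8.)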
 ```
-.\llama-quantize.exe --allow-requantize --imatrix .\imatrix.dat "F:\Users\Public\Downloads\models\qwen2.5-32b-instruct-q8_0.gguf" IQ4_XS
 ```
 # Chat template
@@ -26,7 +26,7 @@ license: apache-2.0
 ```
 # Environment
-Quantization was performed using the Windows build of llama.cpp-b3621 together with the convert-hf-to-gguf.py script released alongside llama.cpp-b3472.
 # License
 Apache 2.0

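 Note that the officially distributed GGUF files did not include BF16 or FP32 versions, so all of this work was done with the Q8_0 quantized model.<br>
 (The Q8_0 model was used not only for the imatrix computation but also for quantization, where the `--allow-requantize` option permits requantizing from Q8.)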
 ```
+.\llama-quantize.exe --allow-requantize --imatrix .\imatrix.dat "F:\Users\Public\Downloads\models\qwen2.5-32b-instruct-q8_0.gguf" Q4_K_M
 ```
 # Chat template
 ```
 # Environment
+Quantization was performed using the Windows build of llama.cpp-b3621.
 # License
 Apache 2.0
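
The `llama-quantize` invocation shown in this diff can also be assembled programmatically, which makes it easy to see that the commit changes only the target quantization type. The following is a minimal sketch, not the author's workflow: `build_quantize_command` is a hypothetical helper, and the paths and flags are copied verbatim from the README command. It only builds the argument list and does not run the tool.

```python
def build_quantize_command(model_path, quant_type,
                           imatrix_path=r".\imatrix.dat",
                           exe=r".\llama-quantize.exe",
                           allow_requantize=True):
    """Hypothetical helper: return the argument list for the README's command."""
    args = [exe]
    if allow_requantize:
        # The input is already quantized (Q8_0) rather than BF16/FP32,
        # so requantization has to be explicitly allowed.
        args.append("--allow-requantize")
    # Importance matrix, input model, then the target quantization type.
    args += ["--imatrix", imatrix_path, model_path, quant_type]
    return args


# Model path copied from the README command in the diff.
model = r"F:\Users\Public\Downloads\models\qwen2.5-32b-instruct-q8_0.gguf"

old_cmd = build_quantize_command(model, "IQ4_XS")  # removed line
new_cmd = build_quantize_command(model, "Q4_K_M")  # added line

# Collect the positions where the two commands differ.
changed = [(a, b) for a, b in zip(old_cmd, new_cmd) if a != b]
print(changed)
```

To actually run the command, the returned list could be passed to `subprocess.run(new_cmd, check=True)`, assuming the executable and both files exist at those paths.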