Update README.md
README.md

```diff
@@ -14,7 +14,12 @@ This model is a [AWQ](https://arxiv.org/abs/2306.00978) quantized(miniaturized t
 
 ## Model Details
 
-Currently, this model is confirmed to work with **Colab A100** or RTX 3060 on local PC.
+Currently, this model is confirmed to work with a **Colab A100** or an RTX 3060 on a local PC.
+
+This is because AutoAWQ uses NVIDIA's PTX assembly instructions, which are supported only on sm80 and higher architectures.
+
+Free Colab (T4) is sm75, and Colab Pro (V100) is sm70.
+
 
 Quantization reduces the amount of memory required and improves execution speed, but unfortunately performance deteriorates.
 
```
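The sm-version requirement added above can be sketched as a quick compatibility check. This is a minimal illustration, not part of AutoAWQ's API: the `supports_awq` helper is hypothetical, and the per-GPU sm numbers are taken from the diff (plus sm86 for the RTX 3060, an Ampere card).

```python
# Minimal sketch of the sm80 requirement described in the README.
# `supports_awq` is a hypothetical helper, not an AutoAWQ function.
def supports_awq(sm: str) -> bool:
    """Return True if a compute capability string like 'sm80' meets sm80+."""
    return int(sm.removeprefix("sm")) >= 80

# GPUs mentioned in the README:
print(supports_awq("sm80"))  # Colab A100     -> True
print(supports_awq("sm86"))  # RTX 3060       -> True
print(supports_awq("sm75"))  # free Colab T4  -> False
print(supports_awq("sm70"))  # Colab Pro V100 -> False
```

On a live machine, `torch.cuda.get_device_capability(0)` returns the `(major, minor)` pair for the installed GPU, e.g. `(8, 0)` for an A100.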