Update README.md
README.md

```diff
@@ -14,7 +14,12 @@ This model is a [AWQ](https://arxiv.org/abs/2306.00978) quantized(miniaturized t
 
 ## Model Details
 
-Currently, this model is confirmed to work with **Colab A100** or RTX 3060 on local PC.
+Currently, this model is confirmed to work with a **Colab A100** or an RTX 3060 on a local PC.
+
+This is because AutoAWQ uses NVIDIA's PTX assembly instructions, which are supported only on sm80 and higher architectures.
+
+Free Colab (T4) is sm75, and Colab Pro (V100) is sm70.
+
 
 Quantization reduces the amount of memory required and improves execution speed, but unfortunately performance deteriorates.
 
```
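The sm-version requirement added above can be sketched as a quick compatibility check. This is a minimal illustration, not part of AutoAWQ's API: the `supports_awq` helper is hypothetical, and the per-GPU sm numbers are taken from the diff (plus sm86 for the RTX 3060, an Ampere card).

```python
# Minimal sketch of the sm80 requirement described in the README.
# `supports_awq` is a hypothetical helper, not an AutoAWQ function.
def supports_awq(sm: str) -> bool:
    """Return True if a compute capability string like 'sm80' meets sm80+."""
    return int(sm.removeprefix("sm")) >= 80

# GPUs mentioned in the README:
print(supports_awq("sm80"))  # Colab A100     -> True
print(supports_awq("sm86"))  # RTX 3060       -> True
print(supports_awq("sm75"))  # free Colab T4  -> False
print(supports_awq("sm70"))  # Colab Pro V100 -> False
```

On a live machine, `torch.cuda.get_device_capability(0)` returns the `(major, minor)` pair for the installed GPU, e.g. `(8, 0)` for an A100.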