iAkashPaul committed
Commit: 7e1dc6e
Parent(s): 9539c6d
Update README.md
README.md
CHANGED
@@ -17,5 +17,5 @@ Contains Q4 & Q8 quantized GGUFs for [google/gemma](https://huggingface.co/colle
 
 | Variant | Device | Perf |
 | - | - | - |
-| Q4 | RTX 2070S |
-| Q8 | RTX 2070S |
+| Q4 | RTX 2070S | 22 tok/s |
+| Q8 | RTX 2070S | 7 tok/s (could only offload 23/29 layers to GPU) |