rahuldshetty/gemma-7b-it-gguf-quantized


rahuldshetty committed on Feb 22

Commit 4c1410f (1 Parent(s): a74d6bb)

Update README.md

Files changed (1)

README.md +1 -0


@@ -14,6 +14,7 @@ GGUF Quantized version of [gemma-7b-it](https://huggingface.co/google/gemma-7b-i
 | Name | Quant method | Bits | Size | Use case |
 | ---- | ---- | ---- | ---- | ----- |
 | [gemma-7b-it-Q4_K_M.gguf](https://huggingface.co/rahuldshetty/gemma-7b-it-gguf-quantized/blob/main/gemma-7b-it-Q4_K_M.gguf) | Q4_K_M | 4 | 5.13 GB | medium, balanced quality - recommended |
+| [gemma-7b-it-Q8_0.gguf](https://huggingface.co/rahuldshetty/gemma-7b-it-gguf-quantized/blob/main/gemma-7b-it-Q8_0.gguf) | Q8_0 | 8 | 9.08 GB | very large, extremely low quality loss - not recommended |
 # Gemma Model Card (Taken from Official HF Repo)
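
With two quantizations now listed, a reader has to choose between them by size. The sketch below is one hedged way to do that in Python: it copies the two rows from the table in this diff and returns the largest file that fits a given size budget. The names `QUANT_FILES`, `pick_quant`, and `blob_url` are hypothetical helpers introduced for illustration and are not part of this repository. The budget here is compared against file size only, which is an assumption; actual memory use at inference time will differ.

```python
from typing import Optional

# Rows copied from the table in this diff:
# (filename, quant method, bits, size in GB).
QUANT_FILES = [
    ("gemma-7b-it-Q4_K_M.gguf", "Q4_K_M", 4, 5.13),
    ("gemma-7b-it-Q8_0.gguf", "Q8_0", 8, 9.08),
]

REPO_ID = "rahuldshetty/gemma-7b-it-gguf-quantized"


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the largest listed file whose size fits budget_gb, or None.

    Assumption: the budget is compared against file size only.
    """
    fitting = [row for row in QUANT_FILES if row[3] <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda row: row[3])[0]


def blob_url(filename: str) -> str:
    """Build the file page URL in the same form the README table uses."""
    return f"https://huggingface.co/{REPO_ID}/blob/main/{filename}"


if __name__ == "__main__":
    choice = pick_quant(6.0)
    if choice is not None:
        print(blob_url(choice))
```

Choosing the largest file that fits follows the table's own trade-off: Q8_0 loses less quality but is almost twice the size of Q4_K_M, so it is only selected when the budget allows it.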