Monor/Llama3-ChatQA-1.5-8B-gguf

Monor committed on May 8

Commit 158552c (1 parent: 8c2ee82)

Create README.md

Files changed (1)

README.md +8 -0

	@@ -0,0 +1,8 @@

+---
+license: apache-2.0
+---
+## Introduction
+GGUF quantizations of [nvidia/Llama3-ChatQA-1.5-8B](https://huggingface.co/nvidia/Llama3-ChatQA-1.5-8B) in f16, q2, q3, q4, q5, q6, and q8, produced with llama.cpp.
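
The README does not say which commands produced these files. As a rough sketch only, the Python below builds (but does not run) one llama.cpp quantize command per level, starting from an f16 GGUF. Everything specific here is an assumption: the mapping from the README's short labels to llama.cpp type names (`Q2_K`, `Q4_K_M`, and so on), the output file naming, and the binary path (`llama-quantize` in recent llama.cpp builds, `quantize` in older ones). The helper `build_quantize_commands` is hypothetical, not part of llama.cpp.

```python
from pathlib import Path

# Assumed mapping from the README's short labels to llama.cpp quantization
# type names. The README does not state which variants were actually used.
QUANT_TYPES = {
    "q2": "Q2_K",
    "q3": "Q3_K_M",
    "q4": "Q4_K_M",
    "q5": "Q5_K_M",
    "q6": "Q6_K",
    "q8": "Q8_0",
}


def build_quantize_commands(f16_gguf, model_name, out_dir=".",
                            quantize_bin="./llama-quantize"):
    """Return one argv list per quantization level, without executing anything.

    Each command follows the form: <quantize_bin> <input.gguf> <output.gguf> <type>
    """
    commands = []
    for label, qtype in QUANT_TYPES.items():
        # Hypothetical output naming: <model_name>-<label>.gguf
        target = Path(out_dir) / f"{model_name}-{label}.gguf"
        commands.append([quantize_bin, str(f16_gguf), str(target), qtype])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for cmd in build_quantize_commands(
        "Llama3-ChatQA-1.5-8B-f16.gguf", "Llama3-ChatQA-1.5-8B"
    ):
        print(" ".join(cmd))
```

Returning argv lists rather than running them keeps the sketch free of side effects; each list could be passed to `subprocess.run(cmd, check=True)` once the paths and type names have been checked against the local llama.cpp build.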