alexedelsburg/Puma-3b-GGUF


alexedelsburg committed on Sep 16, 2023

Commit d9ea9e2 • 1 Parent(s): 4ae68f2

Update README

- Fix URL to the model
- List acrastt & Bohan Du

Files changed (1)

README.md +3 -3


@@ -9,7 +9,7 @@ pipeline_tag: text-generation
 ---
 # Puma 3B - GGUF
-- Model creator: [Bohan Du](https://huggingface.co/acrastt)
+- Model creator: [Bohan Du / acrastt](https://huggingface.co/acrastt)
 - Original model: [Puma 3B](https://huggingface.co/acrastt/puma-3b)
 <!-- description start -->
@@ -75,9 +75,9 @@ For other parameters and how to use them, please refer to [the llama.cpp documen
 | Name | Quant method | Bits | Size | Max RAM required | Use case |
 | ---- | ---- | ---- | ---- | ---- | ----- |
-| [puma-3b.q4_1.gguf](https://huggingface.co/TheBloke/Puma-3b-GGML/blob/main/puma-3b.ggmlv3.q4_1.bin) | q4_1 | 4 | 2.14 GB| 4.64 GB | Original quant method, 4-bit. Higher accuracy than q4_0 but not as high as q5_0. However has quicker inference than q5 models. |
+| [puma-3b.q4_1.gguf](https://huggingface.co/alexedelsburg/Puma-3b-GGUF/blob/main/puma-3b.q4_1.gguf) | q4_1 | 4 | 2.14 GB| 4.64 GB | Original quant method, 4-bit. Higher accuracy than q4_0 but not as high as q5_0. However has quicker inference than q5 models. |
 ## Thanks
-- to [Bohan Du](https://huggingface.co/acrastt) for the Puma model
+- to [Bohan Du / acrastt](https://huggingface.co/acrastt) for the Puma model
 - to [TheBloke](https://huggingface.co/TheBloke) for all the quantized models and this model card template
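
The URL fix in this commit (a file link left pointing at the template's source repo instead of this one) is the kind of mistake a small check can catch. The following is a minimal sketch, not part of the commit: `foreign_file_links` is a hypothetical helper, and treating any `huggingface.co` URL containing `/blob/` or `/resolve/` as a file link is an assumption about how the card links to files.

```python
import re
from urllib.parse import urlparse

# Matches markdown links of the form [text](url).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")

REPO_ID = "alexedelsburg/Puma-3b-GGUF"

# The two link versions taken from the diff (table row trimmed to the link).
OLD_LINK = "[puma-3b.q4_1.gguf](https://huggingface.co/TheBloke/Puma-3b-GGML/blob/main/puma-3b.ggmlv3.q4_1.bin)"
NEW_LINK = "[puma-3b.q4_1.gguf](https://huggingface.co/alexedelsburg/Puma-3b-GGUF/blob/main/puma-3b.q4_1.gguf)"


def foreign_file_links(readme_text, repo_id):
    """Return (text, url) pairs for file links that point outside repo_id.

    Hypothetical helper: a "file link" is assumed to be any huggingface.co
    URL whose path contains /blob/ or /resolve/.
    """
    expected_prefix = "/" + repo_id + "/"
    bad = []
    for text, url in LINK_RE.findall(readme_text):
        parsed = urlparse(url)
        if parsed.netloc != "huggingface.co":
            continue  # external link, out of scope for this check
        if "/blob/" not in parsed.path and "/resolve/" not in parsed.path:
            continue  # profile or repo link, not a file link
        if not parsed.path.startswith(expected_prefix):
            bad.append((text, url))
    return bad


if __name__ == "__main__":
    print(foreign_file_links(OLD_LINK, REPO_ID))  # flags the old link
    print(foreign_file_links(NEW_LINK, REPO_ID))  # nothing flagged
```

Profile links such as the one to acrastt are skipped on purpose, since crediting another user is expected in a model card; only file links are compared against the hosting repo.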