Felladrin/gguf-Minueza-32M-Base


Felladrin committed on Apr 3

Commit e20e567 • 1 Parent(s): 87af300

Update README.md

Files changed (1)

README.md +19 -7

README.md CHANGED

@@ -5,14 +5,26 @@ base_model: Felladrin/Minueza-32M-Base
 GGUF version of [Felladrin/Minueza-32M-Base](https://huggingface.co/Felladrin/Minueza-32M-Base).
-It was not possible to quantize the model after converting it to F16/F32 GGUF, so only those versions are available, being F32 the recommended one for having better precision.
-## Recommended Inference Parameters
 ```
-temp 0.4
-min-p 0.1
-top_p 1
-top_k 0
-repeat_penalty 1.0
 ```

 GGUF version of [Felladrin/Minueza-32M-Base](https://huggingface.co/Felladrin/Minueza-32M-Base).
+It was not possible to quantize the model, so only the F16 and F32 GGUF files are available.
+## Try it with [llama.cpp](https://github.com/ggerganov/llama.cpp)
+```sh
+brew install ggerganov/ggerganov/llama.cpp
 ```
+```sh
+llama-cli \
+  --hf-repo Felladrin/gguf-Minueza-32M-Base \
+  --model Minueza-32M-Base.F32.gguf \
+  --random-prompt  \
+  --dynatemp-range 0.1-2.5 \
+  --top-k 0 \
+  --top-p 1 \
+  --min-p 0.1 \
+  --typical 0.85 \
+  --mirostat 2 \
+  --mirostat-ent 3.5 \
+  --repeat-penalty 1.1 \
+  --repeat-last-n -1 \
+  -n 256
 ```
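
For comparison, the sampling values this commit removes from the README (`temp 0.4`, `min-p 0.1`, `top_p 1`, `top_k 0`, `repeat_penalty 1.0`) could be passed to `llama-cli` roughly as sketched below. This is a minimal sketch, not part of the commit: the prompt text and the `RUN` switch are assumptions made up for illustration, and the flag spellings assume a llama.cpp build that accepts the same `--hf-repo`/`--model` combination used in the README.

```sh
#!/bin/sh
# Sketch only: the "Recommended Inference Parameters" removed by this commit,
# rewritten as llama-cli flags. Not taken from the README itself.

# Hypothetical placeholder prompt (assumption, not from the source).
PROMPT="Once upon a time"

# Collect the arguments as positional parameters so quoting is preserved.
set -- \
  --hf-repo Felladrin/gguf-Minueza-32M-Base \
  --model Minueza-32M-Base.F32.gguf \
  --prompt "$PROMPT" \
  --temp 0.4 \
  --min-p 0.1 \
  --top-p 1 \
  --top-k 0 \
  --repeat-penalty 1.0 \
  -n 256

# Print the command for review before running anything.
CMD_PREVIEW="llama-cli $*"
echo "$CMD_PREVIEW"

# Run only when explicitly requested (RUN=1) and llama-cli is installed.
if [ "${RUN:-0}" = "1" ] && command -v llama-cli >/dev/null 2>&1; then
  llama-cli "$@"
fi
```

Running the script as-is only prints the assembled command; setting `RUN=1` executes it, which downloads the model file from the Hub on first use.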