GGUF version?
#13 opened about 1 year ago
by
CeeGee
Using llama_cpp
#12 opened over 1 year ago
by
axcelkuhn
Particularly Censored.
1
#11 opened over 1 year ago
by
BingoBird
Weird responses from the LLM
#10 opened over 1 year ago
by
PoyBoi
How to generate token by token?
2
#8 opened over 1 year ago
by
YaTharThShaRma999
Question about which .bin file to use and quantization
4
#7 opened over 1 year ago
by
florestankorp
New k-quants formats
1
#6 opened over 1 year ago
by
mudler
GGML models become dumb when used in python.
2
#5 opened over 1 year ago
by
supercharge19
New 8-bit quant method: how is it performing on your CPU? (Share your tokens/s, CPU model, and --threads)
24
#2 opened over 1 year ago
by
alphaprime90