README.md · bartowski/Rhea-72b-v0.5-GGUF at main

library_name: transformers
license: apache-2.0
language:
  - en
quantized_by: bartowski
pipeline_tag: text-generation

llama.cpp Quantizations of Rhea-72b-v0.5

Using llama.cpp release b2536 for quantization.

Download a single file (not the whole branch) from the table below:

Filename	Quant type	File Size	Description
Rhea-72b-v0.5-Q8_0.gguf	Q8_0	76.82GB	Extremely high quality, generally unneeded but max available quant.
Rhea-72b-v0.5-Q6_K.gguf	Q6_K	59.31GB	Very high quality, near perfect, recommended.
Rhea-72b-v0.5-Q5_K_M.gguf	Q5_K_M	51.30GB	High quality, very usable.
Rhea-72b-v0.5-Q5_K_S.gguf	Q5_K_S	49.88GB	High quality, very usable.
Rhea-72b-v0.5-Q5_0.gguf	Q5_0	49.88GB	High quality, older format, generally not recommended.
Rhea-72b-v0.5-Q4_K_M.gguf	Q4_K_M	43.77GB	Good quality, uses about 4.83 bits per weight.
Rhea-72b-v0.5-Q4_K_S.gguf	Q4_K_S	41.28GB	Slightly lower quality with small space savings.
Rhea-72b-v0.5-IQ4_NL.gguf	IQ4_NL	41.25GB	Decent quality, similar to Q4_K_S, new method of quanting.
Rhea-72b-v0.5-IQ4_XS.gguf	IQ4_XS	39.09GB	Decent quality, new method with similar performance to Q4.
Rhea-72b-v0.5-Q4_0.gguf	Q4_0	41.00GB	Decent quality, older format, generally not recommended.
Rhea-72b-v0.5-Q3_K_L.gguf	Q3_K_L	38.48GB	Lower quality but usable, good for low RAM availability.
Rhea-72b-v0.5-Q3_K_M.gguf	Q3_K_M	35.27GB	Even lower quality.
Rhea-72b-v0.5-IQ3_M.gguf	IQ3_M	33.26GB	Medium-low quality, new method with decent performance.
Rhea-72b-v0.5-IQ3_S.gguf	IQ3_S	31.56GB	Lower quality, new method with decent performance, recommended over Q3 quants.
Rhea-72b-v0.5-Q3_K_S.gguf	Q3_K_S	31.56GB	Low quality, not recommended.
Rhea-72b-v0.5-Q2_K.gguf	Q2_K	27.08GB	Extremely low quality, not recommended.
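
As a rough illustration of picking one file from the table and fetching only that file, here is a minimal Python sketch. Several things in it are assumptions rather than part of this card: the repository id `bartowski/Rhea-72b-v0.5-GGUF`, the helper names `pick_quant`, `filename_for`, and `download_quant`, the 2 GB headroom left for context and runtime overhead, and the idea that each quant is stored as a single file named exactly as in the table. It uses `hf_hub_download` from the `huggingface_hub` package, which fetches one file instead of the whole branch.

```python
from typing import Optional

# Assumed repository id; check it against the page you are downloading from.
REPO_ID = "bartowski/Rhea-72b-v0.5-GGUF"

# (quant type, file size in GB), copied from the table above in table order.
QUANTS = [
    ("Q8_0", 76.82), ("Q6_K", 59.31), ("Q5_K_M", 51.30), ("Q5_K_S", 49.88),
    ("Q5_0", 49.88), ("Q4_K_M", 43.77), ("Q4_K_S", 41.28), ("IQ4_NL", 41.25),
    ("IQ4_XS", 39.09), ("Q4_0", 41.00), ("Q3_K_L", 38.48), ("Q3_K_M", 35.27),
    ("IQ3_M", 33.26), ("IQ3_S", 31.56), ("Q3_K_S", 31.56), ("Q2_K", 27.08),
]


def pick_quant(budget_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest quant whose file size plus headroom fits the budget.

    headroom_gb is an illustrative guess, not a measured requirement.
    When two files have the same size, the one listed first in the table wins.
    """
    fitting = [q for q in QUANTS if q[1] + headroom_gb <= budget_gb]
    if not fitting:
        return None
    # max() returns the first maximal element, which preserves table order on ties.
    return max(fitting, key=lambda q: q[1])[0]


def filename_for(quant: str) -> str:
    """Build the file name using the naming pattern shown in the table."""
    return f"Rhea-72b-v0.5-{quant}.gguf"


def download_quant(quant: str, local_dir: str = ".") -> str:
    """Download a single quant file and return its local path."""
    # Imported here so the selection helpers work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=filename_for(quant),
        local_dir=local_dir,
    )


if __name__ == "__main__":
    choice = pick_quant(budget_gb=48.0)
    print(choice, filename_for(choice) if choice else "nothing fits")
```

With a 48 GB budget, `pick_quant(48.0)` selects `Q4_K_M`; calling `download_quant("Q4_K_M")` would then fetch that one file. Treat the budget check as a first filter only, since actual memory use also depends on context length and the runtime.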

Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski