
Llamacpp Quantizations of bigstral-12b-32k-8xMoE

Using llama.cpp release b2354 for quantization.
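For reference, files like these are typically produced by running llama.cpp's quantize tool against a full-precision GGUF conversion of the original model. A minimal sketch of that step, assuming a local llama.cpp checkout built at b2354 with its `quantize` binary in the working directory (file names are illustrative):

```python
import subprocess

# Illustrative paths -- adjust to your local build and model files.
F16_GGUF = "bigstral-12b-32k-8xMoE-f16.gguf"     # full-precision GGUF from llama.cpp's convert script
OUT_GGUF = "bigstral-12b-32k-8xMoE-Q4_K_M.gguf"  # quantized output

# llama.cpp's quantize binary takes: <input.gguf> <output.gguf> <quant-type>
subprocess.run(["./quantize", F16_GGUF, OUT_GGUF, "Q4_K_M"], check=True)
```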

Original model: https://huggingface.co/bartowski/bigstral-12b-32k-8xMoE

Download a single file (not the whole branch) from the table below; a scripted download sketch follows the table:

| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| bigstral-12b-32k-8xMoE-Q8_0.gguf | Q8_0 | 86.63GB | Extremely high quality, generally unneeded but max available quant. |
| bigstral-12b-32k-8xMoE-Q6_K.gguf | Q6_K | 67.00GB | Very high quality, near perfect, recommended. |
| bigstral-12b-32k-8xMoE-Q5_K_M.gguf | Q5_K_M | 58.00GB | High quality, very usable. |
| bigstral-12b-32k-8xMoE-Q5_K_S.gguf | Q5_K_S | 56.25GB | High quality, very usable. |
| bigstral-12b-32k-8xMoE-Q5_0.gguf | Q5_0 | 56.25GB | High quality, older format, generally not recommended. |
| bigstral-12b-32k-8xMoE-Q4_K_M.gguf | Q4_K_M | 49.60GB | Good quality, similar to 4.25 bpw. |
| bigstral-12b-32k-8xMoE-Q4_K_S.gguf | Q4_K_S | 46.70GB | Slightly lower quality with small space savings. |
| bigstral-12b-32k-8xMoE-Q4_0.gguf | Q4_0 | 46.13GB | Decent quality, older format, generally not recommended. |
| bigstral-12b-32k-8xMoE-Q3_K_L.gguf | Q3_K_L | 42.16GB | Lower quality but usable, good for low RAM availability. |
| bigstral-12b-32k-8xMoE-Q3_K_M.gguf | Q3_K_M | 39.30GB | Even lower quality. |
| bigstral-12b-32k-8xMoE-Q3_K_S.gguf | Q3_K_S | 35.62GB | Low quality, not recommended. |
| bigstral-12b-32k-8xMoE-Q2_K.gguf | Q2_K | 30.17GB | Extremely low quality, not recommended. |
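
To fetch a single quant programmatically instead of cloning the whole repo, `huggingface_hub` can download one file by name. A minimal sketch (the Q4_K_M file is used purely as an example):

```python
from huggingface_hub import hf_hub_download

# Downloads only the named file, not the full branch.
path = hf_hub_download(
    repo_id="bartowski/bigstral-12b-32k-8xMoE-GGUF",
    filename="bigstral-12b-32k-8xMoE-Q4_K_M.gguf",
)
print(path)  # local cache path of the downloaded GGUF
```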

Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
