bartowski/gemma-1.1-2b-it-GGUF

Llamacpp Quantizations of gemma-1.1-2b-it

Using llama.cpp release b2589 for quantization.

Original model: https://huggingface.co/google/gemma-1.1-2b-it

Download a file (not the whole branch) from below:

| Filename | Quant type | File Size | Description |
| --- | --- | --- | --- |
| gemma-1.1-2b-it-Q8_0.gguf | Q8_0 | 2.66GB | Extremely high quality, generally unneeded but max available quant. |
| gemma-1.1-2b-it-Q6_K.gguf | Q6_K | 2.06GB | Very high quality, near perfect, recommended. |
| gemma-1.1-2b-it-Q5_K_M.gguf | Q5_K_M | 1.83GB | High quality, recommended. |
| gemma-1.1-2b-it-Q5_K_S.gguf | Q5_K_S | 1.79GB | High quality, recommended. |
| gemma-1.1-2b-it-Q5_0.gguf | Q5_0 | 1.79GB | High quality, older format, generally not recommended. |
| gemma-1.1-2b-it-Q4_K_M.gguf | Q4_K_M | 1.63GB | Good quality, uses about 4.83 bits per weight, recommended. |
| gemma-1.1-2b-it-Q4_K_S.gguf | Q4_K_S | 1.55GB | Slightly lower quality with small space savings. |
| gemma-1.1-2b-it-IQ4_NL.gguf | IQ4_NL | 1.56GB | Decent quality, similar to Q4_K_S, new method of quanting, recommended. |
| gemma-1.1-2b-it-IQ4_XS.gguf | IQ4_XS | 1.50GB | Decent quality, new method with similar performance to Q4. |
| gemma-1.1-2b-it-Q4_0.gguf | Q4_0 | 1.55GB | Decent quality, older format, generally not recommended. |
| gemma-1.1-2b-it-Q3_K_L.gguf | Q3_K_L | 1.46GB | Lower quality but usable, good for low RAM availability. |
| gemma-1.1-2b-it-Q3_K_M.gguf | Q3_K_M | 1.38GB | Even lower quality. |
| gemma-1.1-2b-it-IQ3_M.gguf | IQ3_M | 1.30GB | Medium-low quality, new method with decent performance. |
| gemma-1.1-2b-it-IQ3_S.gguf | IQ3_S | 1.28GB | Lower quality, new method with decent performance, recommended over Q3 quants. |
| gemma-1.1-2b-it-Q3_K_S.gguf | Q3_K_S | 1.28GB | Low quality, not recommended. |
| gemma-1.1-2b-it-Q2_K.gguf | Q2_K | 1.15GB | Extremely low quality, not recommended. |
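
To fetch one file rather than the whole repository, one option is the `huggingface_hub` Python package. The sketch below assumes that package is installed (`pip install huggingface_hub`) and that the repository ID is `bartowski/gemma-1.1-2b-it-GGUF`, as shown at the top of this page. The helper names `quant_filename`, `pick_quant`, and `download_quant`, and the idea of choosing by a size budget, are illustrative and not part of any library.

```python
# Sizes in GB, copied from a subset of the table above.
QUANT_SIZES_GB = {
    "Q8_0": 2.66,
    "Q6_K": 2.06,
    "Q5_K_M": 1.83,
    "Q4_K_M": 1.63,
    "IQ4_XS": 1.50,
    "IQ3_S": 1.28,
    "Q2_K": 1.15,
}

# Assumed repository ID (taken from the page header).
REPO_ID = "bartowski/gemma-1.1-2b-it-GGUF"


def quant_filename(quant: str) -> str:
    """Build the file name pattern used in the table."""
    return f"gemma-1.1-2b-it-{quant}.gguf"


def pick_quant(budget_gb: float) -> str:
    """Return the largest listed quant whose file size fits in budget_gb."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= budget_gb}
    if not fitting:
        raise ValueError(f"no listed quant fits in {budget_gb} GB")
    return max(fitting, key=fitting.get)


def download_quant(quant: str, local_dir: str = ".") -> str:
    """Download a single GGUF file and return its local path."""
    # Imported lazily so the helpers above work without the package or network.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=quant_filename(quant),
        local_dir=local_dir,
    )
```

With these helpers, `download_quant(pick_quant(1.7))` would request the Q4_K_M file, since it is the largest entry in the dictionary that fits in 1.7 GB. File size is only a rough proxy for memory use at run time, so leave headroom for context and runtime overhead.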

Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski

GGUF model size: 2.51B params. Architecture: gemma.
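
The "bits per weight" figure quoted for Q4_K_M can be related to the numbers on this page with simple arithmetic: file size in bits divided by parameter count. The sketch below assumes GB means 10^9 bytes and uses the 2.51B parameter count reported for this model; the helper name is illustrative. Expect the result to differ somewhat from the nominal 4.83 figure in the table, since the file also carries metadata and not every tensor is necessarily stored at the same precision.

```python
PARAMS = 2.51e9  # parameter count reported for this model


def effective_bits_per_weight(file_size_gb: float, params: float = PARAMS) -> float:
    """Approximate average bits per parameter, treating 1 GB as 1e9 bytes."""
    return file_size_gb * 1e9 * 8 / params


# File sizes copied from the table of quantizations.
for name, size_gb in [("Q8_0", 2.66), ("Q4_K_M", 1.63), ("Q2_K", 1.15)]:
    print(f"{name}: ~{effective_bits_per_weight(size_gb):.2f} bits per weight")
```

This is only a whole-file average, useful for comparing entries in the table against each other rather than as an exact statement of the quantization format.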
