metadata

base_model: google/gemma-7b
language:
  - en
pipeline_tag: text-generation
license: other
model_type: gemma
library_name: transformers
inference: false

Google Gemma 7B

Model creator: Google
Original model: gemma-7b
Terms of use

Description

This repo contains GGUF-format model files for Google's Gemma 7B.

Original model

Developed by: Google

Description

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

Quantization types

quantization method	bits	size	description	recommended
Q2_K	2	3.09 GB	very small, very high quality loss	❌
Q3_K_S	3	3.68 GB	very small, high quality loss	❌
Q3_K_L	3	4.4 GB	small, substantial quality loss	❌
Q4_0	4	4.81 GB	legacy; small, very high quality loss	❌
Q4_K_S	4	4.84 GB	medium, balanced quality	✅
Q4_K_M	4	5.13 GB	medium, balanced quality	✅
Q5_0	5	5.88 GB	legacy; medium, balanced quality	❌
Q5_K_S	5	5.88 GB	large, low quality loss	✅
Q5_K_M	5	6.04 GB	large, very low quality loss	✅
Q6_K	6	7.01 GB	very large, extremely low quality loss	❌
Q8_0	8	9.08 GB	very large, extremely low quality loss	❌
FP16	16	17.1 GB	enormous, negligible quality loss	❌
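
If you are unsure which file to download, the table above can be turned into a small lookup. The following is a minimal sketch, not part of this repo: the sizes and recommendation flags are copied from the table, while the `pick_quant` helper and its `headroom_gb` default are assumptions made for illustration. The headroom exists because a model needs more memory at run time than its file size (context cache, runtime buffers), and the right value depends on your setup.

```python
# (name, file size in GB, recommended) copied from the table above.
QUANTS = [
    ("Q2_K", 3.09, False),
    ("Q3_K_S", 3.68, False),
    ("Q3_K_L", 4.4, False),
    ("Q4_0", 4.81, False),
    ("Q4_K_S", 4.84, True),
    ("Q4_K_M", 5.13, True),
    ("Q5_0", 5.88, False),
    ("Q5_K_S", 5.88, True),
    ("Q5_K_M", 6.04, True),
    ("Q6_K", 7.01, False),
    ("Q8_0", 9.08, False),
    ("FP16", 17.1, False),
]


def pick_quant(memory_gb, headroom_gb=1.5, recommended_only=True):
    """Return the largest quantization whose file fits the memory budget.

    headroom_gb is a hypothetical allowance for runtime overhead,
    not a measured figure. Returns None if nothing fits.
    """
    budget = memory_gb - headroom_gb
    candidates = [
        (size, name)
        for name, size, recommended in QUANTS
        if size <= budget and (recommended or not recommended_only)
    ]
    # max() compares by size first, so the largest fitting file wins.
    return max(candidates)[1] if candidates else None


if __name__ == "__main__":
    print(pick_quant(8))  # Q5_K_M
    print(pick_quant(4))  # None
```

With 8 GB of free memory and the default headroom, the helper returns Q5_K_M; with 4 GB, no recommended file fits and it returns `None`.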

Usage

You can use this model with the latest builds of LM Studio and llama.cpp.
If you're new to the world of large language models, I recommend starting with LM Studio.
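
If you prefer scripting over a GUI, one option is the llama-cpp-python package, which provides Python bindings for llama.cpp. This is a rough sketch under assumptions rather than an official recipe: it assumes `pip install llama-cpp-python` with a build recent enough to support Gemma, and the `generate` wrapper, the context size, and the file name in the usage note are placeholders chosen for illustration.

```python
from pathlib import Path


def generate(model_path, prompt, max_tokens=128, n_ctx=2048):
    """Load a local GGUF file and return the completion text for a prompt."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"GGUF file not found: {path}")

    # Imported lazily so the path check above works even when
    # llama-cpp-python is not installed.
    from llama_cpp import Llama

    # n_ctx is the context window in tokens; 2048 is an assumed default.
    llm = Llama(model_path=str(path), n_ctx=n_ctx, verbose=False)
    result = llm(prompt, max_tokens=max_tokens)
    return result["choices"][0]["text"]
```

A call might look like `generate("gemma-7b.Q4_K_M.gguf", "Explain quantization in one sentence.")`, where the file name is hypothetical and should be replaced with the path of the file you actually downloaded.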