GGUF quants for https://huggingface.co/GritLM/GritLM-7B

GritLM is a generative representational instruction tuned language model. It unifies text representation (embedding) and text generation into a single model achieving state-of-the-art performance on both types of tasks.

Layers	Context	Template (Text Representation)	Template (Text Generation)
32	32768	<s><\|user\|> {instruction} <\|embed\|> {sample}	<s><\|user\|> {prompt} <\|assistant\|> {response}

Downloads last month: 46

GGUF

Model size

7.24B params

Architecture

llama

4-bit

5-bit

6-bit

8-bit

16-bit

Inference Examples

Text Generation

Inference API (serverless) does not yet support gguf models for this pipeline type.