Quantization made by Richard Erkhov.
Giraffe-v2-70b-32k - GGUF
- Model creator: https://huggingface.co/abacusai/
- Original model: https://huggingface.co/abacusai/Giraffe-v2-70b-32k/
| Name | Quant method | Size |
|------|--------------|------|
| Giraffe-v2-70b-32k.Q2_K.gguf | Q2_K | 23.71GB |
| Giraffe-v2-70b-32k.IQ3_XS.gguf | IQ3_XS | 26.37GB |
| Giraffe-v2-70b-32k.IQ3_S.gguf | IQ3_S | 27.86GB |
| Giraffe-v2-70b-32k.Q3_K_S.gguf | Q3_K_S | 27.86GB |
| Giraffe-v2-70b-32k.IQ3_M.gguf | IQ3_M | 28.82GB |
| Giraffe-v2-70b-32k.Q3_K.gguf | Q3_K | 30.99GB |
| Giraffe-v2-70b-32k.Q3_K_M.gguf | Q3_K_M | 30.99GB |
| Giraffe-v2-70b-32k.Q3_K_L.gguf | Q3_K_L | 33.67GB |
| Giraffe-v2-70b-32k.IQ4_XS.gguf | IQ4_XS | 34.64GB |
| Giraffe-v2-70b-32k.Q4_0.gguf | Q4_0 | 36.2GB |
| Giraffe-v2-70b-32k.IQ4_NL.gguf | IQ4_NL | 36.55GB |
| Giraffe-v2-70b-32k.Q4_K_S.gguf | Q4_K_S | 36.55GB |
| Giraffe-v2-70b-32k.Q4_K.gguf | Q4_K | 38.58GB |
| Giraffe-v2-70b-32k.Q4_K_M.gguf | Q4_K_M | 38.58GB |
| Giraffe-v2-70b-32k.Q4_1.gguf | Q4_1 | 40.2GB |
| Giraffe-v2-70b-32k.Q5_0.gguf | Q5_0 | 44.2GB |
| Giraffe-v2-70b-32k.Q5_K_S.gguf | Q5_K_S | 44.2GB |
| Giraffe-v2-70b-32k.Q5_K.gguf | Q5_K | 45.41GB |
| Giraffe-v2-70b-32k.Q5_K_M.gguf | Q5_K_M | 45.41GB |
| Giraffe-v2-70b-32k.Q5_1.gguf | Q5_1 | 48.2GB |
| Giraffe-v2-70b-32k.Q6_K.gguf | Q6_K | 52.7GB |
| Giraffe-v2-70b-32k.Q8_0.gguf | Q8_0 | 68.26GB |
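These GGUF files can be run with llama.cpp or its bindings. The snippet below is a minimal sketch using the llama-cpp-python bindings; the local file path, context size, and sampling settings are assumptions for illustration, and for a 70B model you will almost certainly want to offload layers to GPU.

```python
# Minimal sketch: running one of the GGUF quantizations with llama-cpp-python.
# Assumes the Q4_K_M file has already been downloaded locally; the path and
# generation parameters below are illustrative, not prescribed by this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="Giraffe-v2-70b-32k.Q4_K_M.gguf",  # local path to the downloaded quant
    n_ctx=32768,       # the model targets a 32k context window
    n_gpu_layers=-1,   # offload all layers to GPU if VRAM allows; use 0 for CPU-only
)

output = llm(
    "Summarize the main idea of positional interpolation in one sentence.",
    max_tokens=128,
    temperature=0.7,
)
print(output["choices"][0]["text"])
```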
Original model description:
tags:
- llama2
Model Details
Model Description
We have followed up on our previous training runs related to extending the context length of Llama models. The associated GitHub repository (https://github.com/abacusai/long-context) has some basic details on our approach and metrics. We have also published a paper on arXiv (http://arxiv.org/abs/2308.10882) that covers our experiments and analysis much more comprehensively.
- Developed by: Abacus.AI
- Model type: Transformer based autoregressive causal language model
- License: Llama 2 Community License (https://github.com/facebookresearch/llama/blob/main/LICENSE)
- Finetuned from model: Llama V2 70B
Usage
To use this model at longer context lengths, it must be patched to interpolate the extended positions. It will not work if it is simply loaded with the `AutoModel` framework of `transformers`.
For full details and usage see https://github.com/abacusai/Long-Context. The evaluation section there has detailed code for loading and patching the model for inference (or further fine-tuning). Note in particular that `max_position_embeddings` is not relevant, since the patched module dynamically reallocates the position buffers as required.
The tokenizer corresponding to this model is https://huggingface.co/abacusai/Giraffe-v1-Tokenizer.
Using the code in that repository, you can load this model as follows:

```python
from models import load_model, load_tokenizer

# Load the Giraffe v1 tokenizer and the patched 70B model.
# scale=8 interpolates the base 4k Llama 2 context to roughly 32k positions.
tokenizer = load_tokenizer()
model = load_model('abacusai/Giraffe-v2-70b-32k', scale=8)
```
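Once loaded, the patched model behaves like a standard `transformers` causal LM, so inference can follow the usual `generate` pattern. The following is a hedged sketch that assumes `load_model` returns a Hugging Face model already placed on a device; the prompt and generation settings are illustrative.

```python
# Sketch of inference after loading (assumes load_model returns a standard
# transformers causal LM; device placement and generation settings are illustrative).
import torch

prompt = "The key idea behind context-length interpolation is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```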