Quantization made by Richard Erkhov.
Giraffe-v2-70b-32k - GGUF
- Model creator: https://huggingface.co/abacusai/
- Original model: https://huggingface.co/abacusai/Giraffe-v2-70b-32k/
| Name | Quant method | Size |
|------|--------------|------|
| Giraffe-v2-70b-32k.Q2_K.gguf | Q2_K | 23.71GB |
| Giraffe-v2-70b-32k.IQ3_XS.gguf | IQ3_XS | 26.37GB |
| Giraffe-v2-70b-32k.IQ3_S.gguf | IQ3_S | 27.86GB |
| Giraffe-v2-70b-32k.Q3_K_S.gguf | Q3_K_S | 27.86GB |
| Giraffe-v2-70b-32k.IQ3_M.gguf | IQ3_M | 28.82GB |
| Giraffe-v2-70b-32k.Q3_K.gguf | Q3_K | 30.99GB |
| Giraffe-v2-70b-32k.Q3_K_M.gguf | Q3_K_M | 30.99GB |
| Giraffe-v2-70b-32k.Q3_K_L.gguf | Q3_K_L | 33.67GB |
| Giraffe-v2-70b-32k.IQ4_XS.gguf | IQ4_XS | 34.64GB |
| Giraffe-v2-70b-32k.Q4_0.gguf | Q4_0 | 36.2GB |
| Giraffe-v2-70b-32k.IQ4_NL.gguf | IQ4_NL | 36.55GB |
| Giraffe-v2-70b-32k.Q4_K_S.gguf | Q4_K_S | 36.55GB |
| Giraffe-v2-70b-32k.Q4_K.gguf | Q4_K | 38.58GB |
| Giraffe-v2-70b-32k.Q4_K_M.gguf | Q4_K_M | 38.58GB |
| Giraffe-v2-70b-32k.Q4_1.gguf | Q4_1 | 40.2GB |
| Giraffe-v2-70b-32k.Q5_0.gguf | Q5_0 | 44.2GB |
| Giraffe-v2-70b-32k.Q5_K_S.gguf | Q5_K_S | 44.2GB |
| Giraffe-v2-70b-32k.Q5_K.gguf | Q5_K | 45.41GB |
| Giraffe-v2-70b-32k.Q5_K_M.gguf | Q5_K_M | 45.41GB |
| Giraffe-v2-70b-32k.Q5_1.gguf | Q5_1 | 48.2GB |
| Giraffe-v2-70b-32k.Q6_K.gguf | Q6_K | 52.7GB |
| Giraffe-v2-70b-32k.Q8_0.gguf | Q8_0 | 68.26GB |
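These GGUF files can be run with llama.cpp or its bindings. The snippet below is a minimal sketch using the llama-cpp-python bindings; the local file path, context size, and sampling settings are assumptions for illustration, and for a 70B model you will almost certainly want to offload layers to GPU.

```python
# Minimal sketch: running one of the GGUF quantizations with llama-cpp-python.
# Assumes the Q4_K_M file has already been downloaded locally; the path and
# generation parameters below are illustrative, not prescribed by this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="Giraffe-v2-70b-32k.Q4_K_M.gguf",  # local path to the downloaded quant
    n_ctx=32768,       # the model targets a 32k context window
    n_gpu_layers=-1,   # offload all layers to GPU if VRAM allows; use 0 for CPU-only
)

output = llm(
    "Summarize the main idea of positional interpolation in one sentence.",
    max_tokens=128,
    temperature=0.7,
)
print(output["choices"][0]["text"])
```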
Original model description:
tags:
- llama2
Model Details
Model Description
We have followed up on our previous training runs related to extending the context length of Llama models. The associated GitHub repository (https://github.com/abacusai/long-context) has some basic details on our approach and metrics. We have also published a paper on arXiv (http://arxiv.org/abs/2308.10882) that covers our experiments and analysis much more comprehensively.
- Developed by: Abacus.AI
- Model type: Transformer based autoregressive causal language model
- License: Llama 2 Community License (https://github.com/facebookresearch/llama/blob/main/LICENSE)
- Finetuned from model: Llama V2 70B
Usage
To use this model at longer context lengths, it must be patched to interpolate the extended positions. It will not work if it is simply loaded with the `AutoModel` framework of `transformers`.
For full details and usage see https://github.com/abacusai/Long-Context. The evaluation section there has detailed code for loading and patching the model for inference (or further fine-tuning). Note in particular that `max_position_embeddings` is not relevant, since the patched module dynamically reallocates the position buffers as required.
The tokenizer corresponding to this model is https://huggingface.co/abacusai/Giraffe-v1-Tokenizer.
Using the code in that repository, you can load this model as follows:

```python
from models import load_model, load_tokenizer

# Load the Giraffe v1 tokenizer and the patched 70B model.
# scale=8 interpolates the base 4k Llama 2 context to roughly 32k positions.
tokenizer = load_tokenizer()
model = load_model('abacusai/Giraffe-v2-70b-32k', scale=8)
```
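Once loaded, the patched model behaves like a standard `transformers` causal LM, so inference can follow the usual `generate` pattern. The following is a hedged sketch that assumes `load_model` returns a Hugging Face model already placed on a device; the prompt and generation settings are illustrative.

```python
# Sketch of inference after loading (assumes load_model returns a standard
# transformers causal LM; device placement and generation settings are illustrative).
import torch

prompt = "The key idea behind context-length interpolation is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```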