siddartha-abacus committed
Commit 9014fc8
Parent: 47d6e6c

Create README.md

---
tags:
- llama2
---

## Model Details

### Model Description

We have followed up on our previous training runs related to extending the context length of Llama models. The associated GitHub repository

https://github.com/abacusai/long-context

has some basic details on our approach and metrics. We have also published a paper on arXiv that covers our experiments and analysis much more comprehensively:

http://arxiv.org/abs/2308.10882

- **Developed by:** [Abacus.AI](https://abacus.ai)
- **Model type:** Transformer-based autoregressive causal language model
- **License:** Llama 2 Community License: https://github.com/facebookresearch/llama/blob/main/LICENSE
- **Finetuned from model:** Llama V2 70B

### Usage

To use this model at longer context lengths, it needs to be patched so that the positional embeddings interpolate over the extended range; it will not work if it is simply loaded with the `AutoModel` framework of `transformers`. For full details and usage see:

https://github.com/abacusai/Long-Context
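The patch is based on positional interpolation: position indices are scaled down before the rotary angles are computed, so sequences longer than the original 4096-token window map back into the trained position range. A minimal, hypothetical sketch of the linear-interpolation idea (illustrative only, not the repository's code; the paper also evaluates other interpolation schemes):

```python
def rope_angles(position, dim=8, base=10000.0, scale=1.0):
    # Rotary-embedding angles for a single token position.
    # With linear position interpolation the index is divided by `scale`,
    # so positions up to scale * trained_max land inside the trained range.
    return [(position / scale) * base ** (-2 * i / dim) for i in range(dim // 2)]

# Position 32768 with scale=8 yields the same angles as position 4096
# in an unscaled model:
assert rope_angles(32768, scale=8) == rope_angles(4096)
```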

The evaluation section has detailed code for loading and patching the model for inference (or further fine-tuning). Note in particular that `max_position_embeddings` is not relevant, since the patched module dynamically reallocates the position buffers as required.
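The dynamic reallocation can be pictured with a toy cache that regrows its table of rotary angles whenever a longer sequence arrives, so no fixed `max_position_embeddings` limit applies (a hypothetical illustration, not the repository's implementation):

```python
class DynamicPositionCache:
    """Toy position-angle cache (hypothetical): grows on demand,
    so it imposes no fixed maximum sequence length."""

    def __init__(self, dim=8, base=10000.0, scale=8.0, initial_len=16):
        self.dim, self.base, self.scale = dim, base, scale
        self.angles = []  # one row of rotary angles per position
        self._grow(initial_len)

    def _grow(self, new_len):
        for pos in range(len(self.angles), new_len):
            self.angles.append([
                (pos / self.scale) * self.base ** (-2 * i / self.dim)
                for i in range(self.dim // 2)
            ])

    def __call__(self, seq_len):
        if seq_len > len(self.angles):
            self._grow(seq_len)  # reallocate the buffer as required
        return self.angles[:seq_len]

cache = DynamicPositionCache()
assert len(cache(64)) == 64  # silently grew past the initial 16 rows
```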

The tokenizer corresponding to this model is https://huggingface.co/abacusai/Giraffe-v1-Tokenizer.

Using the code in the repository you can load this model as follows:
```python
# load_model and load_tokenizer are provided by the abacusai/Long-Context repository
from models import load_model, load_tokenizer

tokenizer = load_tokenizer()
# scale=8 is the position interpolation factor (32k = 8 x the 4096 base context)
model = load_model('abacusai/Giraffe-v2-70b-32k', scale=8)
```