---
license: other
---
![Aquila_logo](./log.jpeg)
<h4 align="center">
<p>
<b>English</b> |
<a href="https://huggingface.co/BAAI/Aquila2-34B/blob/main/README_zh.md">简体中文</a> |
</p>
</h4>
We are open-sourcing our **Aquila2** series, which now includes the base language models **Aquila2-7B** and **Aquila2-34B**, the chat models **AquilaChat2-7B** and **AquilaChat2-34B**, and the long-text chat models **AquilaChat2-7B-16k** and **AquilaChat2-34B-16k**.
Additional details about the Aquila models will be presented in the official technical report. Please stay tuned for updates on our official channels.
## Updates 2024.6.6
We have updated the base language model **Aquila2-34B**. Compared with the previous version, it has the following advantages:
* A new tokenizer with a higher compression ratio:
| Tokenizer | Size | Zh | En | Code | Math | Average |
|-----------|-------|--------------------------|--------|-------|-------|---------|
| Aquila2-original | 100k | **4.70** | 4.42 | 3.20 | 3.77 | 4.02 |
| Qwen1.5 | 151k | 4.27 | 4.51 | 3.62 | 3.35 | 3.94 |
| Llama3 | 128k | 3.45 | **4.61** | 3.77 | **3.88** | 3.93 |
| Aquila2-new | 143k | 4.60 | **4.61** | **3.78** | **3.88** | **4.22** |
* The maximum sequence length supported by the model has increased from 2048 to 8192 tokens.
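The table does not state how the Average column is computed. A minimal check, assuming it is the unweighted mean of the four per-domain ratios (Zh, En, Code, Math), reproduces the published values:

```python
# Per-domain compression ratios copied from the tokenizer table: (Zh, En, Code, Math)
ratios = {
    "Aquila2-original": (4.70, 4.42, 3.20, 3.77),
    "Qwen1.5": (4.27, 4.51, 3.62, 3.35),
    "Llama3": (3.45, 4.61, 3.77, 3.88),
    "Aquila2-new": (4.60, 4.61, 3.78, 3.88),
}

# Assumption: "Average" is the unweighted mean of the four domains,
# rounded to two decimals as in the table.
averages = {name: round(sum(vals) / len(vals), 2) for name, vals in ratios.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

Under that assumption, the new tokenizer gives up a little on Chinese relative to the original one but gains on English, code, and math, which is why its average comes out highest.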
## Quick Start Aquila2-34B
### 1. Inference
Aquila2-34B is a base model, so it is best used for text continuation rather than chat.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import BitsAndBytesConfig
device = "cuda:0"

# Model name
model_name = 'BAAI/Aquila2-34B'

# Optional 4-bit (NF4) quantization config
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    # quantization_config=quantization_config,  # Uncomment this one for 4-bit quantization
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model.eval()
# Note: with 4-bit quantization enabled, some transformers versions reject .to();
# in that case, drop this call.
model.to(device)
# Example
text = "The meaning of life is"
tokens = tokenizer.encode_plus(text)['input_ids']
tokens = torch.tensor(tokens)[None,].to(device)
with torch.no_grad():
    # Greedy decoding; max_length counts prompt tokens plus generated tokens
    out = model.generate(tokens, do_sample=False, max_length=128, eos_token_id=tokenizer.eos_token_id)[0]
out = tokenizer.decode(out.cpu().numpy().tolist())
print(out)
```
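The commented-out `quantization_config` matters mostly for GPU memory. The following is a rough back-of-the-envelope sketch, not a measurement: it assumes about 34e9 parameters (taken from the model name, not an official count) and counts weights only, ignoring activations, the KV cache, and the small overhead of quantization constants. The helper `weight_memory_gib` is hypothetical and introduced only for illustration.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: nominal parameter count inferred from the name "Aquila2-34B".
NUM_PARAMS = 34e9

bf16_gib = weight_memory_gib(NUM_PARAMS, 16)  # torch.bfloat16 weights
nf4_gib = weight_memory_gib(NUM_PARAMS, 4)    # 4-bit quantized weights

print(f"bf16 weights:  ~{bf16_gib:.0f} GiB")
print(f"4-bit weights: ~{nf4_gib:.0f} GiB")
```

If the bf16 estimate exceeds the memory of a single GPU, uncommenting `quantization_config` in the loading code is one way to reduce the footprint, at some possible cost in output quality.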
## License
The Aquila2 series open-source models are licensed under the [BAAI Aquila Model Licence Agreement](https://huggingface.co/BAAI/Aquila2-34B/blob/main/BAAI-Aquila-Model-License%20-Agreement.pdf).