---
license: apache-2.0
language:
- en
tags:
- LLM
- BELLE
---
## Model Card for lyraBELLE
lyraBELLE is currently the **fastest BELLE model** available. To the best of our knowledge, it is also the **first accelerated version of BELLE**.
lyraBELLE's inference speed is about **3.3x** that of the original version.
Its main features are:
- weights: the original BELLE-7B-2M weights released by BelleGroup.
- device: Nvidia Ampere architecture or newer (e.g., A100).
Note that:
**Some interfaces and code paths are reserved for future use (see the demo below).**
- **int8 mode**: not supported yet; always set it to 0.
- **data type**: only `fp16` is available.
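The constraints above (fp16 only, int8 mode fixed at 0, Ampere or newer) could be checked before the model is loaded. The following is a minimal sketch under stated assumptions: `check_lyrabelle_config` is a hypothetical helper, not part of the lyraBELLE package, and the caller supplies the GPU's CUDA compute capability major version.

```python
def check_lyrabelle_config(data_type, int8_mode, compute_capability_major):
    """Hypothetical pre-flight check; not part of the lyraBELLE package."""
    problems = []
    if data_type != "fp16":
        problems.append(f"data_type must be 'fp16', got {data_type!r}")
    if int8_mode != 0:
        problems.append(f"int8_mode must be 0, got {int8_mode!r}")
    # Ampere GPUs (e.g., A100) report CUDA compute capability 8.x.
    if compute_capability_major < 8:
        problems.append("GPU must be Ampere (compute capability 8.x) or newer")
    if problems:
        raise ValueError("; ".join(problems))


# The settings from this card on an A100 (compute capability 8.0) pass silently.
check_lyrabelle_config("fp16", 0, 8)
```

If PyTorch is installed, the major version could be read from `torch.cuda.get_device_capability()`, which returns a `(major, minor)` tuple.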
## Speed
### test environment
- device: Nvidia A100 40G
- warmup: 10 rounds
- precision: fp16
- batch size: 64
- language: Chinese, kept the same within each batch.
- do_sample: True, so the model generates slightly different answers to the same question.
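This card does not include the benchmarking script. As an illustration only, a tokens-per-second measurement with untimed warm-up rounds might look like the sketch below. `measure_throughput` and `fake_generate` are hypothetical names, and the assumption that the generator returns one list of token ids per prompt is ours, not a statement about the lyraBELLE API.

```python
import time


def measure_throughput(generate_fn, prompts, warmup_rounds=10, timed_rounds=5):
    """Return generated tokens per second for a batch of prompts.

    `generate_fn(prompts)` is assumed to return one list of token ids per prompt.
    """
    # Warm-up rounds are executed but not timed.
    for _ in range(warmup_rounds):
        generate_fn(prompts)

    total_tokens = 0
    start = time.perf_counter()
    for _ in range(timed_rounds):
        outputs = generate_fn(prompts)
        total_tokens += sum(len(tokens) for tokens in outputs)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


# Stand-in generator so the sketch runs without a GPU: 8 "tokens" per prompt.
def fake_generate(prompts):
    time.sleep(0.001)
    return [[0] * 8 for _ in prompts]


batch = ["你好"] * 64  # batch size 64, same language throughout
tokens_per_sec = measure_throughput(fake_generate, batch)
print(f"{tokens_per_sec:.2f} tokens/sec")
```

When timing a real GPU model, the generation call should have fully finished before the clock is read, otherwise the measured time can be too short.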
|version|speed|
|:-:|:-:|
|original|826.34 tokens/sec|
|lyraBELLE|2701.71 tokens/sec|
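
The speedup implied by the two rows can be checked with a short calculation; the only inputs are the throughput figures from the table.

```python
original_tps = 826.34    # tokens/sec, original BELLE (from the table)
lyrabelle_tps = 2701.71  # tokens/sec, lyraBELLE (from the table)

speedup = lyrabelle_tps / original_tps
print(f"speedup: {speedup:.2f}x")  # prints "speedup: 3.27x"
```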
## Model Sources
- **Repository:** [https://huggingface.co/BelleGroup/BELLE-7B-2M?clone=true]
## Environment
- **Docker image available** at [https://hub.docker.com/repository/docker/bigmoyan/lyrallm/general]; pull the image with:
```
docker pull bigmoyan/lyrallm:v0.1
```
## Uses
```python
from lyraBelle import LyraBelle
data_type = "fp16"
prompts = "今天天气大概 25度,有点小雨,吹着风,我想去户外散步,应该穿什么样的衣服裤子鞋子搭配。"
model_dir = "./model"
model_name = "1-gpu-fp16.h5"
max_output_length = 512
# int8 mode is not supported yet, so the trailing int8 flag must be 0;
# data_type only supports "fp16".
model = LyraBelle(model_dir, model_name, data_type, 0)
output_texts = model.generate(prompts, output_length=max_output_length, top_k=30, top_p=0.85, temperature=0.35, repetition_penalty=1.2, do_sample=True)
print(output_texts)
```
## Demo output
### input
今天天气大概 25度,有点小雨,吹着风,我想去户外散步,应该穿什么样的衣服裤子鞋子搭配。
### output
建议穿着一件轻便的衬衫或T恤、一条牛仔裤和一双运动鞋或休闲鞋。如果下雨了可以带上一把伞。
## Citation
``` bibtex
@Misc{lyraBELLE2023,
author = {Kangjian Wu and Zhengtao Wang and Bin Wu},
title = {lyraBELLE: Accelerating BELLE by 3x+},
howpublished = {\url{https://huggingface.co/TMElyralab/lyraBELLE}},
year = {2023}
}
```
## Report bug
- Start a discussion to report any bugs: https://huggingface.co/TMElyralab/lyraBELLE/discussions
- Mark bug reports with `[bug]` in the title.