---
license: creativeml-openrail-m
language:
- en
tags:
- LLM
- BELLE
---
## Model Card for lyraBELLE
lyraBELLE is currently the **fastest BELLE model** available. To the best of our knowledge, it is the **first accelerated version of BELLE**.
lyraBELLE runs inference **100x** faster than the original version.
Among its main features are:
- weights: the original BELLE-7B-2M weights released by BelleGroup.
- device: Nvidia Ampere architecture or newer (e.g. A100)
Note that:
**Some interfaces and code paths are reserved for future use (see the demo below).**
- **int8 mode**: not supported yet; always set it to 0.
- **data type**: only `fp16` is available.
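
The constraints above can be checked before loading the model. The following is a minimal sketch and not part of lyraBELLE: the helper names `check_lyrabelle_settings` and `detect_compute_capability` are hypothetical, and the device check assumes that PyTorch is installed and that "Ampere or newer" corresponds to CUDA compute capability 8.0 or higher (A100 reports 8.0).

```python
from typing import Optional, Tuple

AMPERE_MAJOR = 8  # Ampere GPUs such as the A100 report compute capability 8.x


def check_lyrabelle_settings(
    data_type: str,
    int8_mode: int,
    compute_capability: Optional[Tuple[int, int]] = None,
) -> None:
    """Hypothetical helper: raise ValueError for settings the notes above rule out."""
    if data_type != "fp16":
        raise ValueError(f"data_type must be 'fp16', got {data_type!r}")
    if int8_mode != 0:
        raise ValueError("int8 mode is not supported yet; pass 0")
    # The device check is skipped when the capability is unknown (None).
    if compute_capability is not None and compute_capability[0] < AMPERE_MAJOR:
        major, minor = compute_capability
        raise ValueError(
            f"compute capability {major}.{minor} is older than Ampere (8.0)"
        )


def detect_compute_capability() -> Optional[Tuple[int, int]]:
    """Return (major, minor) for GPU 0 via PyTorch, or None if it cannot be read."""
    try:
        import torch  # assumption: PyTorch is available in the environment
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)
```

Calling `check_lyrabelle_settings("fp16", 0, detect_compute_capability())` before constructing the model turns an unsupported configuration into an early, readable error.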
## Speed
### test environment
- device: Nvidia A100 40G
- warmup: 10 rounds
- precision: fp16
- batch size for our version: 64 (maximum under A100 40G)
- batch size for original: xx (maximum under A100 40G)
|version|batch size|speed|
|:-:|:-:|:-:|
|original|xx|xxx|
|lyraBELLE|80|3030.36 tokens/sec|
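
This README does not describe how the tokens/sec figure was measured. The sketch below shows one common way to obtain such a number, with warmup rounds followed by timed rounds. The function `measure_throughput`, its parameters, and the stand-in `fake_generate` are all hypothetical; in practice `generate` would wrap the model call and `count_tokens` would use the model's tokenizer.

```python
import time
from typing import Callable, List, Sequence


def measure_throughput(
    generate: Callable[[Sequence[str]], List[str]],
    prompts: Sequence[str],
    count_tokens: Callable[[str], int],
    warmup_rounds: int = 10,
    timed_rounds: int = 5,
) -> float:
    """Return generated tokens per second, averaged over the timed rounds."""
    # Warmup rounds are executed but excluded from the timing.
    for _ in range(warmup_rounds):
        generate(prompts)

    total_tokens = 0
    start = time.perf_counter()
    for _ in range(timed_rounds):
        outputs = generate(prompts)
        total_tokens += sum(count_tokens(text) for text in outputs)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise RuntimeError("elapsed time too small to compute a rate")
    return total_tokens / elapsed


if __name__ == "__main__":
    def fake_generate(batch: Sequence[str]) -> List[str]:
        # Stand-in for a real model call; this is not the lyraBELLE API.
        time.sleep(0.01)
        return ["a b c d"] * len(batch)

    rate = measure_throughput(
        fake_generate,
        ["hello"] * 4,
        count_tokens=lambda text: len(text.split()),
        warmup_rounds=2,
        timed_rounds=3,
    )
    print(f"{rate:.1f} tokens/sec")
```

Counting only generated tokens and timing whole batches keeps the result comparable across batch sizes, which matters when the two versions are run at different maximum batch sizes.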
## Model Sources
- **Repository:** <https://huggingface.co/BelleGroup/BELLE-7B-2M?clone=true>
## Environment
- **docker image available** at <https://hub.docker.com/repository/docker/bigmoyan/lyrallm/general>. Pull the image with:
```
docker pull bigmoyan/lyrallm:v0.1
```
## Uses
```python
from lyraBelle import LyraBelle
data_type = "fp16"
prompts = "今天天气大概 25度,有点小雨,吹着风,我想去户外散步,应该穿什么样的衣服裤子鞋子搭配。"
model_dir = "./model"
model_name = "1-gpu-fp16.h5"
max_output_length = 512
# int8 mode is not supported yet, so the last argument stays 0; data_type only supports fp16
model = LyraBelle(model_dir, model_name, data_type, 0)
output_texts = model.generate(
    prompts,
    output_length=max_output_length,
    top_k=30,
    top_p=0.85,
    temperature=0.35,
    repetition_penalty=1.2,
    do_sample=True,
)
print(output_texts)
```
## Demo output
### input
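今天天气大概 25度,有点小雨,吹着风,我想去户外散步,应该穿什么样的衣服裤子鞋子搭配。

(English: "It is about 25 degrees today, with light rain and some wind. I want to go for a walk outdoors; what top, trousers, and shoes should I wear?")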
### output
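建议穿着一件轻便的衬衫或T恤、一条牛仔裤和一双运动鞋或休闲鞋。如果下雨了可以带上一把伞。

(English: "I suggest wearing a light shirt or T-shirt, a pair of jeans, and sneakers or casual shoes. If it rains, you can bring an umbrella.")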
## Citation
``` bibtex
@Misc{lyraBELLE2023,
author = {Kangjian Wu and Zhengtao Wang and Bin Wu},
title = {lyraBELLE: Accelerating BELLE by 100x+},
howpublished = {\url{https://huggingface.co/TMElyralab/lyraBELLE}},
year = {2023}
}
```
## Report bug
- Start a discussion to report any bugs: https://huggingface.co/TMElyralab/lyraBELLE/discussions
- Mark bug reports with `[bug]` in the title.