metadata

license: apache-2.0
language:
  - en
tags:
  - LLM
  - BELLE

Model Card for lyraBELLE

lyraBELLE is currently the fastest BELLE model available. To the best of our knowledge, it is the first accelerated version of BELLE.

lyraBELLE achieves about 3.3x the inference speed of the original version.

Among its main features are:

weights: the original BELLE-7B-2M weights released by BelleGroup.
device: Nvidia Ampere architecture or newer (e.g., A100)

Note: some interfaces and code paths are reserved for future use (see the demo below).

int8 mode: not supported yet; always set it to 0.
data type: only fp16 is available.
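
The constraints above (Ampere or newer GPU, fp16 only, int8 mode fixed at 0) can be checked before loading the model. The following is a minimal sketch and not part of lyraBELLE: `check_settings` is a hypothetical helper, and it assumes that "Ampere or newer" corresponds to CUDA compute capability 8.0 or higher (the A100 reports 8.0). The capability tuple is supplied by the caller; with PyTorch installed it could come from `torch.cuda.get_device_capability()`.

```python
from typing import List, Tuple

# Assumption: "Ampere or newer" means CUDA compute capability >= 8.0.
MIN_COMPUTE_CAPABILITY = (8, 0)


def check_settings(data_type: str, int8_mode: int,
                   compute_capability: Tuple[int, int]) -> List[str]:
    """Return a list of problems; an empty list means the settings look usable."""
    problems = []
    if data_type != "fp16":
        problems.append(f"data_type must be 'fp16', got {data_type!r}")
    if int8_mode != 0:
        problems.append(f"int8 mode is not supported yet; use 0, got {int8_mode}")
    if tuple(compute_capability) < MIN_COMPUTE_CAPABILITY:
        problems.append(
            "GPU compute capability %d.%d is older than Ampere (8.0)"
            % tuple(compute_capability)
        )
    return problems


# An A100 (compute capability 8.0) with the documented settings passes.
print(check_settings("fp16", 0, (8, 0)))
# A V100 (compute capability 7.0) with int8 enabled reports two problems.
print(check_settings("fp16", 1, (7, 0)))
```

Returning a list of messages rather than raising on the first failure lets a caller report every unsupported setting at once.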

Speed

test environment

device: Nvidia A100 40G
warmup: 10 rounds
precision: fp16
batch size: 64
language: Chinese; all prompts in a batch use the same language.
do_sample: True, so the model generates slightly different answers to the same question.

version	speed
original	826.34 tokens/sec
lyraBELLE	2701.71 tokens/sec
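
The table reports throughput in generated tokens per second. A minimal sketch of how such a figure could be measured, and of the speedup implied by the two reported numbers, is given below. `measure_throughput` and `fake_generate` are hypothetical names used only for illustration; the actual benchmark script is not part of this card, and the token-counting convention (one list of token ids per prompt) is an assumption.

```python
import time
from typing import Callable, List, Sequence


def measure_throughput(generate: Callable[[Sequence[str]], List[List[int]]],
                       batch: Sequence[str],
                       warmup_rounds: int = 10,
                       timed_rounds: int = 5) -> float:
    """Return generated tokens per second, averaged over the timed rounds."""
    # Warm-up rounds are run but not timed (compare "warmup: 10 rounds").
    for _ in range(warmup_rounds):
        generate(batch)

    total_tokens = 0
    start = time.perf_counter()
    for _ in range(timed_rounds):
        outputs = generate(batch)
        total_tokens += sum(len(tokens) for tokens in outputs)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


def fake_generate(batch: Sequence[str]) -> List[List[int]]:
    """Stand-in for a real model call: returns 8 dummy token ids per prompt."""
    time.sleep(0.001)  # keeps the elapsed time non-zero
    return [[0] * 8 for _ in batch]


throughput = measure_throughput(fake_generate, ["prompt"] * 64,
                                warmup_rounds=2, timed_rounds=3)
print(f"stand-in throughput: {throughput:.0f} tokens/sec")

# Speedup implied by the figures reported in the table.
original_tps = 826.34
lyrabelle_tps = 2701.71
speedup = lyrabelle_tps / original_tps
print(f"speedup: {speedup:.2f}x")
```

With the reported figures, the ratio works out to about 3.27x. The stand-in throughput depends on the machine and is meaningful only as a check that the harness runs.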

Model Sources

Repository: [https://huggingface.co/BelleGroup/BELLE-7B-2M?clone=true]

Environment

A Docker image is available at [https://hub.docker.com/repository/docker/bigmoyan/lyrallm/general]. Pull it with:

docker pull bigmoyan/lyrallm:v0.1

Uses


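from lyraBelle import LyraBelle

data_type = "fp16"  # only fp16 is supported
int8_mode = 0       # int8 mode is not supported yet; keep this at 0
# English: "It's about 25 degrees today, with light rain and some wind. I want to go for a walk outdoors; what clothes, trousers, and shoes should I wear?"
prompts = "今天天气大概 25度，有点小雨，吹着风，我想去户外散步，应该穿什么样的衣服裤子鞋子搭配。"
model_dir = "./model"
model_name = "1-gpu-fp16.h5"
max_output_length = 512

model = LyraBelle(model_dir, model_name, data_type, int8_mode)

# Sampling is enabled, so repeated runs may return slightly different text.
output_texts = model.generate(
    prompts,
    output_length=max_output_length,
    top_k=30,
    top_p=0.85,
    temperature=0.35,
    repetition_penalty=1.2,
    do_sample=True,
)
print(output_texts)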

Demo output

input

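今天天气大概 25度，有点小雨，吹着风，我想去户外散步，应该穿什么样的衣服裤子鞋子搭配。
(English: It's about 25 degrees today, with light rain and some wind. I want to go for a walk outdoors; what clothes, trousers, and shoes should I wear?)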

output

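建议穿着一件轻便的衬衫或T恤、一条牛仔裤和一双运动鞋或休闲鞋。如果下雨了可以带上一把伞。
(English: I suggest wearing a lightweight shirt or T-shirt, a pair of jeans, and sneakers or casual shoes. If it rains, you can bring an umbrella.)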

Citation

@Misc{lyraBELLE2023,
  author =       {Kangjian Wu and Zhengtao Wang and Bin Wu},
  title =        {lyraBELLE: Accelerating BELLE by 3x+},
  howpublished = {\url{https://huggingface.co/TMElyralab/lyraBELLE}},
  year =         {2023}
}

Report bug

To report a bug, start a discussion at https://huggingface.co/TMElyralab/lyraBELLE/discussions
Mark bug reports with [bug] in the title.