---
license: bigscience-bloom-rail-1.0
---

# Bloom CTranslate2 models

This is a collection of some of the BigScience Bloom models exported to the CTranslate2 model format, which allows these models to be loaded and run efficiently on CPU or GPU.

## Models

The models have been converted to float16 and can be loaded with any other quantization method (e.g. int8).

| Model name    | Description                              |
| ------------- | ---------------------------------------- |
| bloom-560m    | 560M parameter model pretrained on ROOTS |
| bloom-3b      | 3B parameter model pretrained on ROOTS   |
| bloomz-7b1    | 7.1B parameter model finetuned on xP3    |
| bloomz-7b1-mt | 7.1B parameter model finetuned on xP3mt  |
| mt0-xxl-mt    | 13B parameter model finetuned on xP3     |
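For reference, float16 exports like these can be produced with CTranslate2's `ct2-transformers-converter` tool. A sketch for the smallest model; the exact command used for this repository is an assumption:

```shell
# Sketch: export a BLOOM checkpoint from the Hugging Face Hub to the
# CTranslate2 format with float16 weights (assumed command, not the
# exact one used to build this repository).
ct2-transformers-converter --model bigscience/bloom-560m \
    --quantization float16 --output_dir bloom-560m
```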

## Simple code to use them

Install dependencies:

```shell
pip install huggingface_hub ctranslate2 transformers torch
```

Usage:

```python
import ctranslate2
import huggingface_hub
import transformers

model_name = "bloomz-7b1"
prompt = "Hello, I am Joan and I am from Barcelona and"

repo_id = "jordimas/bloom-ctranslate2"
output_dir = "output/"

# Download only the files that belong to the selected model.
kwargs = {
    "local_dir": output_dir,
    "local_dir_use_symlinks": False,
}
huggingface_hub.snapshot_download(repo_id=repo_id, allow_patterns=f"*{model_name}*", **kwargs)

model = f"{output_dir}{model_name}"
print(f"model: {model}")
generator = ctranslate2.Generator(model, compute_type="int8")
tokenizer = transformers.AutoTokenizer.from_pretrained(model)

# CTranslate2 expects token strings, not token ids.
start_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
results = generator.generate_batch([start_tokens], max_length=90)
result = tokenizer.decode(results[0].sequences_ids[0])
print(f"Result: {result}")
```
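One caveat with the download filter above: `allow_patterns=f"*{model_name}*"` uses shell-style globbing, so the pattern built for `bloomz-7b1` also matches the `bloomz-7b1-mt` files. The behavior can be checked locally with Python's `fnmatch` module (the file names below are illustrative, not the repository's actual layout):

```python
from fnmatch import fnmatch

# Illustrative file names; the actual repository layout may differ.
files = [
    "bloomz-7b1/model.bin",
    "bloomz-7b1-mt/model.bin",
    "bloom-560m/model.bin",
]
pattern = "*bloomz-7b1*"
matches = [f for f in files if fnmatch(f, pattern)]
print(matches)  # note: the bloomz-7b1-mt file matches too
```

To download only `bloomz-7b1`, a more specific pattern such as `"*bloomz-7b1/*"` avoids the overlap.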