BLOOM CTranslate2 models
This is a collection of some of the BigScience BLOOM models exported to the CTranslate2 model format. This format allows the models to be loaded and used efficiently on CPU or GPU.
Models
The models have been converted to float16 and can be loaded with any other quantization type (e.g. int8).
Model name | Description |
---|---|
bloom-560m | 560M parameter model pretrained on ROOTS |
bloom-3b | 3B parameter model pretrained on ROOTS |
bloomz-7b1 | 7.1B parameter model finetuned on xP3 |
bloomz-7b1-mt | 7.1B parameter model finetuned on xP3mt |
mt0-xxl-mt | 13B parameter model finetuned on xP3 |
See the repository directories for the available models.
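Because the stored weights are float16, the quantization actually used is chosen at load time through the `compute_type` argument. The following is a minimal sketch of picking one that the current device supports. `choose_compute_type` and `load_generator` are hypothetical helper names, and the preference order is an illustrative assumption, not a recommendation from the model authors.

```python
def choose_compute_type(supported,
                        preferred=("int8_float16", "int8", "float16", "float32")):
    """Return the first preferred compute type found in `supported`.

    Falls back to "default", which lets CTranslate2 decide.
    """
    for compute_type in preferred:
        if compute_type in supported:
            return compute_type
    return "default"


def load_generator(model_dir, device="cpu"):
    """Load a converted model with a compute type the device supports."""
    # Imported lazily so choose_compute_type stays dependency-free.
    import ctranslate2

    # Returns the set of compute types supported on this device.
    supported = ctranslate2.get_supported_compute_types(device)
    return ctranslate2.Generator(
        model_dir,
        device=device,
        compute_type=choose_compute_type(supported),
    )
```

For example, `load_generator(model, device="cuda")` would prefer `int8_float16` on a GPU that supports it, while on a CPU that reports only `int8` and `float32` the same call would select `int8`.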
Simple code to use them
Install dependencies:
pip install huggingface_hub ctranslate2 transformers torch
Usage:
import huggingface_hub
import ctranslate2
import transformers
model_name = "bloomz-7b1"
prompt = "Hello, I am Joan and I am from Barcelona and"
repo_id = "jordimas/bloom-ctranslate2"
snapshot_folder = huggingface_hub.snapshot_download(repo_id=repo_id, allow_patterns=f"*{model_name}*")
print(f"folder: {snapshot_folder}")
model = f"{snapshot_folder}/{model_name}"
generator = ctranslate2.Generator(model, compute_type="int8")
tokenizer = transformers.AutoTokenizer.from_pretrained(model)
start_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
results = generator.generate_batch([start_tokens], max_length=90)
result = tokenizer.decode(results[0].sequences_ids[0])
print(f"Result: {result}")
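One detail of the download step is worth checking: the pattern `*{model_name}*` is a glob, so a name that is a prefix of another model name can match both. The sketch below illustrates this with Python's `fnmatch`, on the assumption that `allow_patterns` uses the same glob-style matching. The file listing is hypothetical and only assumes that each model lives in a directory named after it, as the usage code implies.

```python
import fnmatch

# Hypothetical file listing; the real repository contents may differ.
repo_files = [
    "bloomz-7b1/model.bin",
    "bloomz-7b1/config.json",
    "bloomz-7b1-mt/model.bin",
    "bloom-560m/model.bin",
]


def select(files, pattern):
    """Return the files whose path matches the glob pattern."""
    return [f for f in files if fnmatch.fnmatch(f, pattern)]


model_name = "bloomz-7b1"

# Pattern used in the usage code: also matches the "-mt" variant.
broad = select(repo_files, f"*{model_name}*")

# Narrower pattern: only files inside the model's own directory.
narrow = select(repo_files, f"{model_name}/*")

print("broad: ", broad)
print("narrow:", narrow)
```

If the repository layout matches this assumption, passing `allow_patterns=f"{model_name}/*"` to `snapshot_download` should avoid downloading the larger sibling model by accident.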