---
license: bigscience-bloom-rail-1.0
---

# Bloom CTranslate2 models

This is a collection of some of the [BigScience BLOOM](https://huggingface.co/bigscience/bloom) models exported to the [CTranslate2](https://github.com/OpenNMT/CTranslate2) model format, which allows these models to be loaded and run efficiently on CPU or GPU.
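
As a minimal sketch of how such an export is produced (the exact invocation used for the files in this repository is an assumption), CTranslate2 provides a converter for Transformers models:

```python
from ctranslate2.converters import TransformersConverter

# Convert a Transformers checkpoint to the CTranslate2 format,
# quantizing the weights to float16 like the models in this repository.
converter = TransformersConverter("bigscience/bloom-560m")
converter.convert("bloom-560m", quantization="float16")
```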

## Models

The models have been converted to *float16* and can be loaded with any other quantization method (e.g. *int8*), as sketched after the table below.

| Model name | Description |
| --- | --- |
| [bloom-560m](https://huggingface.co/bigscience/bloom-560m) | 560M parameter model pretrained on ROOTS |
| [bloom-3b](https://huggingface.co/bigscience/bloom-3b) | 3B parameter model pretrained on ROOTS |
| [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1) | 7.1B parameter model finetuned on xP3 |
| [bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt) | 7.1B parameter model finetuned on xP3mt |
| [mt0-xxl-mt](https://huggingface.co/bigscience/mt0-xxl-mt) | 13B parameter model finetuned on xP3 |
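
As a minimal sketch of picking a different quantization at load time (assuming the model has already been downloaded to a local directory, as in the usage example below), pass the desired `compute_type` to the CTranslate2 `Generator`:

```python
import ctranslate2

# The float16 weights are converted to the requested type when loaded.
# Other valid values include "float16", "int8_float16" and "auto".
generator = ctranslate2.Generator("output/bloom-560m", compute_type="int8")
```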

## Simple code to use the models

Install the dependencies:

```shell
pip install huggingface_hub ctranslate2 transformers torch
```

Usage:

```python
import ctranslate2
import huggingface_hub
import transformers

model_name = "bloomz-7b1"
prompt = "Hello, I am Joan and I am from Barcelona and"

repo_id = "jordimas/bloom-ctranslate2"
output_dir = "output/"

# Download only the files of the selected model from this repository.
kwargs = {
    "local_dir": output_dir,
    "local_dir_use_symlinks": False,
}
huggingface_hub.snapshot_download(repo_id=repo_id, allow_patterns=f"*{model_name}*", **kwargs)

model = f"{output_dir}{model_name}"
print(f"model: {model}")

# Load the model with int8 quantization and its tokenizer.
generator = ctranslate2.Generator(model, compute_type="int8")
tokenizer = transformers.AutoTokenizer.from_pretrained(model)

# Tokenize the prompt and generate a continuation of up to 90 tokens.
start_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
results = generator.generate_batch([start_tokens], max_length=90)
result = tokenizer.decode(results[0].sequences_ids[0])
print(f"Result: {result}")
```
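
By default `generate_batch` decodes greedily. As a sketch, continuing from the example above, sampling can be enabled through the standard `sampling_topk` and `sampling_temperature` options (the values below are illustrative, not tuned):

```python
# Continuing from the example above: sample from the top 10 tokens
# at temperature 0.8 instead of decoding greedily.
results = generator.generate_batch(
    [start_tokens],
    max_length=90,
    sampling_topk=10,
    sampling_temperature=0.8,
)
print(tokenizer.decode(results[0].sequences_ids[0]))
```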