ybelkada committed
Commit: e1f323d
Parent(s): 0048a11

Update README.md (#1)
- Update README.md (c1e0422e2159df1723c73013a03d3f9df32c44c3)
README.md CHANGED
@@ -122,11 +122,11 @@ Please see [the BLOOM training README](https://github.com/bigscience-workshop/bi
 
 * ALiBI positional encodings (see [paper](https://arxiv.org/pdf/2108.12409.pdf)), with GeLU activation functions
 
-*
+* 6.3B billion parameters:
 
-*
+* 30 layers, 32 attention heads
 
-* Hidden layers are
+* Hidden layers are 4096-dimensional
 
 * Sequence length of 2048 tokens used (see [BLOOM tokenizer](https://huggingface.co/bigscience/tokenizer), [tokenizer description](#tokenization))
 
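As a quick sanity check of the architecture details this commit fills in, the listed hyper-parameters can be dropped into a `transformers` `BloomConfig` and used for a rough parameter estimate. This is only an illustrative sketch, not part of the commit: the vocabulary size is not stated in the diff (250880 is assumed from the BLOOM tokenizer), and the simple 12·h² per-block count ignores biases and layer norms, so it lands somewhat under the "6.3B" figure quoted above.

```python
# Illustrative sketch only: map the hyper-parameters from the diff onto a BloomConfig
# and make a rough, back-of-the-envelope parameter estimate.
from transformers import BloomConfig

config = BloomConfig(
    vocab_size=250880,   # assumption: not stated in the diff, taken from the BLOOM tokenizer
    hidden_size=4096,    # "Hidden layers are 4096-dimensional"
    n_layer=30,          # "30 layers"
    n_head=32,           # "32 attention heads"
)

h, num_layers = config.hidden_size, config.n_layer

# Per decoder block: QKV + output projections (~4*h*h) plus a 4x-wide MLP (~8*h*h),
# ignoring biases and layer norms.
block_params = 12 * h * h
embedding_params = config.vocab_size * h

print(f"~{num_layers * block_params / 1e9:.2f}B transformer-block parameters")  # ~6.0B
print(f"~{embedding_params / 1e9:.2f}B embedding parameters")                   # ~1.0B
```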