ybelkada committed
Commit: e1f323d
Parent(s): 0048a11

Update README.md (#1)
- Update README.md (c1e0422e2159df1723c73013a03d3f9df32c44c3)
README.md CHANGED
@@ -122,11 +122,11 @@ Please see [the BLOOM training README](https://github.com/bigscience-workshop/bi
 
 * ALiBI positional encodings (see [paper](https://arxiv.org/pdf/2108.12409.pdf)), with GeLU activation functions
 
-*
+* 6.3B billion parameters:
 
-*
+* 30 layers, 32 attention heads
 
-* Hidden layers are
+* Hidden layers are 4096-dimensional
 
 * Sequence length of 2048 tokens used (see [BLOOM tokenizer](https://huggingface.co/bigscience/tokenizer), [tokenizer description](#tokenization))
 
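As a quick sanity check of the architecture details this commit fills in, the listed hyper-parameters can be dropped into a `transformers` `BloomConfig` and used for a rough parameter estimate. This is only an illustrative sketch, not part of the commit: the vocabulary size is not stated in the diff (250880 is assumed from the BLOOM tokenizer), and the simple 12·h² per-block count ignores biases and layer norms, so it lands somewhat under the "6.3B" figure quoted above.

```python
# Illustrative sketch only: map the hyper-parameters from the diff onto a BloomConfig
# and make a rough, back-of-the-envelope parameter estimate.
from transformers import BloomConfig

config = BloomConfig(
    vocab_size=250880,   # assumption: not stated in the diff, taken from the BLOOM tokenizer
    hidden_size=4096,    # "Hidden layers are 4096-dimensional"
    n_layer=30,          # "30 layers"
    n_head=32,           # "32 attention heads"
)

h, num_layers = config.hidden_size, config.n_layer

# Per decoder block: QKV + output projections (~4*h*h) plus a 4x-wide MLP (~8*h*h),
# ignoring biases and layer norms.
block_params = 12 * h * h
embedding_params = config.vocab_size * h

print(f"~{num_layers * block_params / 1e9:.2f}B transformer-block parameters")  # ~6.0B
print(f"~{embedding_params / 1e9:.2f}B embedding parameters")                   # ~1.0B
```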