Add exact param counts (#106), opened by Muennighoff
README.md CHANGED
@@ -265,7 +265,9 @@ Please see [the BLOOM training README](https://github.com/bigscience-workshop/bi
 
 * ALiBI positional encodings (see [paper](https://arxiv.org/pdf/2108.12409.pdf)), with GeLU activation functions
 
-* 176
+* 176,247,271,424 parameters:
+
+* 3,596,615,680 embedding parameters
 
 * 70 layers, 112 attention heads
 
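
The two counts added in this diff can be cross-checked with a little arithmetic. The sketch below is not taken from the PR. It assumes a hidden size of 14,336 and a vocabulary of 250,880 tokens (neither appears in the diff), a GPT-style block with biases on every linear layer, a 4x MLP expansion, one LayerNorm after the embeddings and one before the output, and an output projection tied to the input embeddings. The function names are hypothetical.

```python
# Assumed hyperparameters: only LAYERS (70) is stated in the diff above.
HIDDEN = 14_336   # assumption: model hidden size
VOCAB = 250_880   # assumption: (padded) vocabulary size
LAYERS = 70       # from the README line "70 layers, 112 attention heads"


def embedding_params(vocab: int = VOCAB, hidden: int = HIDDEN) -> int:
    """Word-embedding matrix; ALiBi adds no learned positional parameters."""
    return vocab * hidden


def block_params(hidden: int = HIDDEN) -> int:
    """Parameters in one transformer block (assumed layout, with biases)."""
    layer_norms = 2 * (2 * hidden)                 # two LayerNorms: weight + bias
    qkv = 3 * hidden * hidden + 3 * hidden         # fused query/key/value projection
    attn_out = hidden * hidden + hidden            # attention output projection
    mlp_up = 4 * hidden * hidden + 4 * hidden      # hidden -> 4 * hidden
    mlp_down = 4 * hidden * hidden + hidden        # 4 * hidden -> hidden
    return layer_norms + qkv + attn_out + mlp_up + mlp_down


def total_params(vocab: int = VOCAB, hidden: int = HIDDEN, layers: int = LAYERS) -> int:
    """Whole model, assuming the output head reuses the embedding matrix."""
    embedding_layer_norm = 2 * hidden
    final_layer_norm = 2 * hidden
    return (
        embedding_params(vocab, hidden)
        + embedding_layer_norm
        + layers * block_params(hidden)
        + final_layer_norm
    )


if __name__ == "__main__":
    print(f"embedding parameters: {embedding_params():,}")
    print(f"total parameters:     {total_params():,}")
```

Under these assumptions the script reproduces both figures from the diff: 3,596,615,680 embedding parameters and 176,247,271,424 parameters in total.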