nilabhra rcojocaru committed on
Commit
62400f1
1 Parent(s): 8814452

Fix typo (#4)

- Fix typo (f4eb3ecc886972c7ae303f6fb4713647a96200cc)


Co-authored-by: Ruxandra Cojocaru <[email protected]>

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -181,7 +181,7 @@ Falcon2-11B is a causal decoder-only model trained on a causal language modeling
 
  The architecture is broadly adapted from the GPT-3 paper ([Brown et al., 2020](https://arxiv.org/abs/2005.14165)), with the following differences:
 
- * **Positionnal embeddings:** rotary ([Su et al., 2021](https://arxiv.org/abs/2104.09864));
+ * **Positional embeddings:** rotary ([Su et al., 2021](https://arxiv.org/abs/2104.09864));
  * **Attention:** multiquery ([Shazeer et al., 2019](https://arxiv.org/abs/1911.02150)) and FlashAttention-2 ([Dao, 2023](https://arxiv.org/abs/2307.08691));
  * **Decoder-block:** parallel attention/MLP.
 
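
For readers unfamiliar with the rotary positional embeddings named in the changed line, the following is a minimal sketch of the idea from Su et al., 2021, not Falcon2-11B's actual implementation. The function name `rotary_embed`, the helper `dot`, the `base=10000.0` default, and the choice to rotate consecutive pairs of components are all assumptions made for illustration; real implementations often pair components differently and operate on batched tensors.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles.

    Hypothetical helper for illustration; pairing and base are assumptions.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("rotary embeddings need an even dimension")
    out = []
    for i in range(0, dim, 2):
        # The pair starting at index i uses frequency base^(-i / dim).
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by angle theta.
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]

# Both pairs of positions have the same relative offset (3), so the
# attention score should match even though absolute positions differ.
score_a = dot(rotary_embed(q, 5), rotary_embed(k, 2))
score_b = dot(rotary_embed(q, 105), rotary_embed(k, 102))
```

The point of the sketch is the property that makes rotary embeddings attractive: the query-key dot product depends only on the relative offset between positions, and each rotation leaves the vector norm unchanged.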