nilabhra committed
Commit 252d556 • 1 Parent(s): 61637f6

Update README.md

Files changed (1): README.md (+4, -4)
README.md CHANGED
@@ -1,6 +1,6 @@
 # 🚀 Falcon2-11B
 
-**Falcon2-11B is a 11B parameters causal decoder-only model built by [TII](https://www.tii.ae) and trained over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the (TII Falcon License 2.0)[https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html], the permissive Apache 2.0-based software license which includes an (acceptable use policy)[https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html] that promotes the responsible use of AI.**
+**Falcon2-11B is an 11B-parameter causal decoder-only model built by [TII](https://www.tii.ae) and trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.**
 
 *Paper coming soon 😊.*
 
@@ -49,7 +49,7 @@ For fast inference with Falcon, check-out [Text Generation Inference](https://gi
 - **Developed by:** [https://www.tii.ae](https://www.tii.ae)
 - **Model type:** Causal decoder-only
 - **Language(s) (NLP):** English, German, Spanish, French, Italian, Portuguese, Polish, Dutch, Romanian, Czech, Swedish
-- **License:** (TII Falcon License 2.0)[https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html]
+- **License:** [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html)
 
 ### Model Source
 
@@ -190,7 +190,7 @@ Falcon2-11B was trained on AWS SageMaker, using on average 1024 A100 40GB GPUs i
 
 #### Software
 
-Falcon2-11B was trained a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO, high-performance Triton kernels and FlashAttention-2.
+Falcon2-11B was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO, high-performance Triton kernels, and FlashAttention-2. More details about the distributed training strategy can be found in [Almazrouei et al.](https://arxiv.org/abs/2311.16867).
 
 ## Citation
 
@@ -198,7 +198,7 @@ Falcon2-11B was trained a custom distributed training codebase, Gigatron. It use
 
 ## License
 
-Falcon2-11B is licenced under (TII Falcon License 2.0)[https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html], the permissive Apache 2.0-based software license which includes an (acceptable use policy)[https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html] that promotes the responsible use of AI.
+Falcon2-11B is licensed under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.
 
 ## Contact
 
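The README being edited is the model card for Falcon2-11B, so a minimal inference sketch may help readers of this commit. It assumes the checkpoint is published under the `tiiuae/falcon-11B` repo id and that the transformers and accelerate packages are installed; neither detail appears in this diff.

```python
# Minimal inference sketch (assumptions: "tiiuae/falcon-11B" repo id,
# transformers + accelerate installed; not taken from this diff).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-11B"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 halves memory versus fp32 for an 11B model
    device_map="auto",           # let accelerate place layers on available GPUs
)

inputs = tokenizer("The Falcon 2 models are", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```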
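The Software paragraph in the third hunk credits FlashAttention-2 kernels for training throughput. transformers can request the same kernels at inference time; the sketch below is illustrative only, keeps the assumed repo id from above, and additionally requires the flash-attn package and an Ampere-or-newer GPU.

```python
# Illustrative sketch: requesting FlashAttention-2 kernels at load time.
# Assumptions: "tiiuae/falcon-11B" repo id, flash-attn installed,
# Ampere-or-newer GPU; none of this is configured by this commit.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-11B",
    torch_dtype=torch.bfloat16,               # FA2 kernels support only fp16/bf16
    attn_implementation="flash_attention_2",  # raises if flash-attn is unavailable
    device_map="auto",
)
```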