neo-nlp-dev committed on
Commit
d7ea010
1 Parent(s): 8b31051

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -11,10 +11,12 @@ tags: []
 ## Model Description
 
 - **Developed by:** DICE Research Group (https://dice-research.org/) @ Paderborn University (https://www.uni-paderborn.de/)
-- **Model type:** GPT2 style (decoder-only) with Mixture-of-Experts layers
+- **Model type:** GPT2 style (decoder-only) with alternating Mixture-of-Experts layers
+- **Number of Experts**: 16
+- **Model Size**: 1.3 Billion Dense / 7.4 Billion Sparse
 - **Language(s) (NLP):** 160+
 - **License:** Coming soon
-- **Repository:** https://github.com/dice-group/LOLA-Megatron-DeepSpeed
+- **Repository:** https://github.com/dice-group/LOLA
 
 ## How to Get Started with the Model
 
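
The updated model-type line describes "alternating" Mixture-of-Experts layers with 16 experts. As a rough illustration only, the sketch below assumes this means every second decoder block swaps its feed-forward network for an MoE layer. The helper `moe_layer_layout`, the layer count of 24, and the choice of which blocks are MoE are all hypothetical; none of them come from the README.

```python
def moe_layer_layout(num_layers: int, moe_every: int = 2) -> list[str]:
    """Label each decoder block as 'dense' or 'moe'.

    Assumption (not from the README): every `moe_every`-th block,
    counting from 1, uses an MoE feed-forward; the rest are dense.
    """
    return ["moe" if (i + 1) % moe_every == 0 else "dense" for i in range(num_layers)]


NUM_EXPERTS = 16  # stated in the README diff
NUM_LAYERS = 24   # hypothetical value, chosen only for illustration

layout = moe_layer_layout(NUM_LAYERS)
moe_blocks = layout.count("moe")

# Show the pattern for the first few blocks and the resulting expert count.
print(layout[:4])
print(f"{moe_blocks} MoE blocks, {moe_blocks * NUM_EXPERTS} expert feed-forward networks")
```

Under these assumptions, only the MoE blocks multiply their feed-forward parameters by the number of experts, which is one way a model can report a much larger sparse size than its dense size.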