---
datasets:
- Locutusque/TM-DATA-V2
- LLM360/TxT360
- mlfoundations/dclm-baseline-1.0
- Skylion007/openwebtext
- JeanKaddour/minipile
language:
- en
license: apache-2.0
---
Still in training. Trained on about 17 billion tokens so far.