mesolitica/mallam-1.1B-4096
Text Generation
Pretrained from scratch with a 4096-token context length on 90B tokens of Malaysian text. Paper: https://huggingface.co/papers/2401.14680
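A minimal sketch of how the 4096-token context length might be respected when generating with this checkpoint. It assumes the repository loads through the standard `transformers` Auto classes, which the listing does not confirm. The helper names `max_new_tokens` and `generate` are hypothetical and introduced only for illustration.

```python
MODEL_ID = "mesolitica/mallam-1.1B-4096"  # repository name from the listing
CONTEXT_LENGTH = 4096                     # context length stated in the description


def max_new_tokens(prompt_tokens: int, requested: int,
                   context: int = CONTEXT_LENGTH) -> int:
    """Clamp the generation budget so prompt + output fits in the context window."""
    if prompt_tokens >= context:
        raise ValueError("prompt already fills the context window")
    return min(requested, context - prompt_tokens)


def generate(prompt: str, requested: int = 64) -> str:
    """Hypothetical helper: load the checkpoint and continue `prompt`.

    Assumes the checkpoint is compatible with AutoTokenizer and
    AutoModelForCausalLM; this is not confirmed by the listing.
    """
    # Imported lazily so the budget helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    budget = max_new_tokens(inputs["input_ids"].shape[1], requested)

    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=budget,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Kuala Lumpur ialah")` would download the weights on first use. The budget helper can be used on its own to check that a prompt of a given length leaves room for the requested output.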