The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25 • 82
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Aug 2 • 672
Datasets for Pretrained Thai LLM Collection List Datasets for pretrained Thai LLM by PyThaiNLP • 23 items • Updated 6 days ago • 7
BiPhone: Modeling Inter Language Phonetic Influences in Text Paper • 2307.03322 • Published Jul 6, 2023 • 7