SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain Paper • 2407.19584 • Published Jul 28 • 60
CroissantLLM: A Truly Bilingual French-English Language Model Paper • 2402.00786 • Published Feb 1 • 25
Common Corpus Collection The largest public domain dataset for training LLMs. • 27 items • Updated Jul 17 • 111