Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • Jun 23 • 32
Common Corpus Collection The largest public domain dataset for training LLMs. • 27 items • Updated Jul 17 • 111
Text-to-Image Base Models Collection All open-source text-to-image base models, with their respective licenses • 28 items • Updated May 10 • 19
OpenCLIP DataComp Collection OpenCLIP models trained on DataComp (https://huggingface.co/papers/2304.14108). • 6 items • Updated Oct 9, 2023 • 6
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023 • 47