AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies • Paper • 2408.06567 • Published 19 days ago
Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models • Article • Mar 20