Upcycling Large Language Models into Mixture of Experts • Paper 2410.07524 • Published Oct 10