Power-LM Collection Dense & MoE LLMs trained with power learning rate scheduler. • 4 items • Updated Oct 17 • 15
view article Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19 • 73