Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1 • 142
Transformer Explainer: Interactive Learning of Text-Generative Models Paper • 2408.04619 • Published Aug 8 • 154
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023 • 170
PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback Paper • 2307.14936 • Published Jul 27, 2023 • 41