Unified Normalization for Accelerating and Stabilizing Transformers Paper • 2208.01313 • Published Aug 2, 2022 • 1
Interpret Vision Transformers as ConvNets with Dynamic Convolutions Paper • 2309.10713 • Published Sep 19, 2023 • 1
Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs Paper • 2003.00152 • Published Feb 29, 2020 • 1
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization Paper • 2405.11582 • Published May 19, 2024 • 13