Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers Paper • 2211.11586 • Published Nov 17, 2022