tim-lawson's Collections

Multi-Layer Sparse Autoencoders

Single SAEs trained on the residual stream activation vectors from every transformer layer simultaneously: https://arxiv.org/abs/2409.04185
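
To make the description above concrete, here is a minimal NumPy sketch of the idea: activations from every layer are pooled into one dataset, and a single autoencoder with one shared set of weights encodes and reconstructs all of them. Everything in it is an assumption made for illustration, not the paper's implementation: the sizes, the random stand-in activations, the ReLU encoder, the tied initialization, and the names `resid`, `encode`, and `decode`. The activation function, loss, and training procedure used in the paper may differ, and no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_layers, n_tokens, d_model, n_latents = 4, 8, 16, 64

# Stand-in for residual stream activations: one vector per (layer, token).
# A real setup would record these from a transformer's forward pass.
resid = rng.normal(size=(n_layers, n_tokens, d_model))

# Pool every layer into one dataset so a single SAE sees all of them.
x = resid.reshape(n_layers * n_tokens, d_model)

# One shared set of parameters for all layers (assumed tied initialization).
W_enc = rng.normal(scale=d_model ** -0.5, size=(d_model, n_latents))
W_dec = W_enc.T.copy()
b_enc = np.zeros(n_latents)
b_dec = np.zeros(d_model)


def encode(x):
    """Map activations to non-negative latents (assumed ReLU encoder)."""
    return np.maximum((x - b_dec) @ W_enc + b_enc, 0.0)


def decode(z):
    """Reconstruct activations from latents."""
    return z @ W_dec + b_dec


z = encode(x)
x_hat = decode(z)

# Reconstruction error over all layers at once.
mse = float(np.mean((x - x_hat) ** 2))

# Because the latents are shared, their activity can be compared across layers:
# the fraction of tokens on which each latent is active, per layer.
active_per_layer = (z.reshape(n_layers, n_tokens, n_latents) > 0).mean(axis=1)
```

Sharing one dictionary across layers is what allows the last step: `active_per_layer` has one row per layer and one column per latent, so the same latent can be tracked through the depth of the model.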