tim-lawson's Collections

Multi-Layer Sparse Autoencoders with Transformers

Single sparse autoencoders (SAEs), each trained on the residual stream activation vectors from every transformer layer simultaneously. The collection also includes the underlying transformers.
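As a rough illustration of what "every layer simultaneously" means, the sketch below pools residual stream vectors from all layers into one dataset and passes each vector through one shared autoencoder. Everything here is an assumption made for illustration: the names `stack_layers` and `SparseAutoencoder`, the ReLU encoder with an L1 penalty, the toy sizes, and the random stand-in activations are hypothetical and are not taken from the models in this collection, which may use a different activation function and loss.

```python
import random


def stack_layers(acts):
    """Flatten acts[layer][token] (one vector each) into a single list of vectors.

    After this step the autoencoder no longer knows which layer a vector came from.
    """
    return [vec for layer in acts for vec in layer]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class SparseAutoencoder:
    """Hypothetical single-hidden-layer SAE: ReLU encoder, linear decoder."""

    def __init__(self, d_model, n_latents, seed=0):
        rng = random.Random(seed)
        self.W_enc = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(n_latents)]
        self.b_enc = [0.0] * n_latents
        self.W_dec = [[rng.gauss(0, 0.1) for _ in range(n_latents)] for _ in range(d_model)]
        self.b_dec = [0.0] * d_model

    def encode(self, x):
        # Non-negative latent activations, one per dictionary feature.
        pre = matvec(self.W_enc, x)
        return [max(0.0, p + b) for p, b in zip(pre, self.b_enc)]

    def decode(self, z):
        # Linear reconstruction of the input vector from the latents.
        return [r + b for r, b in zip(matvec(self.W_dec, z), self.b_dec)]

    def loss(self, x, l1_coeff=1e-3):
        # Mean squared reconstruction error plus an L1 sparsity penalty (assumed).
        z = self.encode(x)
        x_hat = self.decode(z)
        mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
        return mse + l1_coeff * sum(z)


# Toy stand-in for residual stream activations: 4 layers, 3 tokens, width 8.
n_layers, n_tokens, d_model = 4, 3, 8
rng = random.Random(1)
acts = [[[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_tokens)]
        for _ in range(n_layers)]

data = stack_layers(acts)          # vectors from all layers in one pool
sae = SparseAutoencoder(d_model, n_latents=32)
losses = [sae.loss(x) for x in data]
```

The design point this is meant to show is that one set of encoder and decoder weights is shared across layers, in contrast to training a separate SAE per layer. No training loop is included; a real implementation would minimize the loss with a gradient-based optimizer.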