SAELens
Tom Lieberum
fold in scaling by sqrt(d_model) into params
9ff4e7b
average_l0_6/params.npz filter=lfs diff=lfs merge=lfs -text
average_l0_111/params.npz filter=lfs diff=lfs merge=lfs -text
average_l0_21/params.npz filter=lfs diff=lfs merge=lfs -text
average_l0_44/params.npz filter=lfs diff=lfs merge=lfs -text