---
library_name: saelens
license: apache-2.0
datasets:
- Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024
---

# Llama-3-8B SAEs (Layer 25, Post-MLP Residual Stream)

## Introduction

We train a Gated SAE on the post-MLP residual stream of the 25th layer of the [Llama-3-8b-instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model. The SAE has a hidden dimension of 65,536 (an expansion factor of 16) and is trained on 500M tokens from the [OpenWebText corpus](https://huggingface.co/datasets/Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024).

Feature visualizations are hosted at https://www.neuronpedia.org/llama3-8b-it. The wandb run is recorded [here](https://wandb.ai/jiatongg/sae_semantic_entropy/runs/ruuu0izg?nw=nwuserjiatongg).

## Load the Model

This repository contains the following SAEs:
- blocks.25.hook_resid_post

Load these SAEs using SAELens as follows:

```python
from sae_lens import SAE

# The second argument is the SAE id, matching the hook point listed above.
sae, cfg_dict, sparsity = SAE.from_pretrained(
    "Juliushanhanhan/llama-3-8b-it-res",
    "blocks.25.hook_resid_post",
)
```

## Citation

```
@misc{jiatong_han_2024,
  author    = { {Jiatong Han} },
  title     = { llama-3-8b-it-res (Revision 53425c3) },
  year      = 2024,
  url       = { https://huggingface.co/Juliushanhanhan/llama-3-8b-it-res },
  doi       = { 10.57967/hf/2889 },
  publisher = { Hugging Face }
}
```
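
## Example Usage

As a minimal sketch (not part of the original card), the snippet below shows one way to extract residual-stream activations at `blocks.25.hook_resid_post` with TransformerLens and pass them through the SAE. The `HookedTransformer` model name, the example prompt, and the use of `sae.encode` / `sae.decode` are assumptions based on the standard SAELens and TransformerLens APIs, not instructions from this repository.

```python
# Sketch only: encode residual-stream activations at blocks.25.hook_resid_post
# with the SAE. Assumes standard TransformerLens / SAELens APIs.
import torch
from transformer_lens import HookedTransformer
from sae_lens import SAE

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model and the SAE (SAE id matches the hook point listed above).
model = HookedTransformer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", device=device)
sae, cfg_dict, sparsity = SAE.from_pretrained(
    "Juliushanhanhan/llama-3-8b-it-res",
    "blocks.25.hook_resid_post",
)
sae = sae.to(device)

hook_name = "blocks.25.hook_resid_post"
_, cache = model.run_with_cache("The quick brown fox", names_filter=hook_name)

resid = cache[hook_name]           # [batch, seq, d_model] residual-stream activations
feature_acts = sae.encode(resid)   # [batch, seq, d_sae] sparse feature activations
recon = sae.decode(feature_acts)   # reconstructed residual stream

print(feature_acts.shape, recon.shape)
```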