ArthurConmyGDM committed
Commit 1828c4a
1 Parent(s): 0c2c31e
Update README.md
README.md CHANGED
@@ -3,8 +3,6 @@ license: cc-by-4.0
 library_name: saelens
 ---
 
-⚠️ WARNING: we are still uploading these SAEs
-
 # 1. Gemma Scope
 
 Gemma Scope is a comprehensive, open suite of sparse autoencoders for Gemma 2 9B and 2B. Sparse Autoencoders are a "microscope" of sorts that can help us break down a model’s internal activations into the underlying concepts, just as biologists use microscopes to study the individual cells of plants and animals.
@@ -17,7 +15,11 @@ See our [landing page](https://huggingface.co/google/gemma-scope) for details on
 - `9b-it-`: These SAEs were trained on Gemma v2 9B instruction-tuned model.
 - `res`: These SAEs were trained on the model's residual stream.
 
-# 3.
+# 3. Why aren't there more IT SAEs?
+
+To summarise our [technical report, Section 4.5](https://storage.googleapis.com/gemma-scope/gemma-scope-report.pdf), we find the same results as [Kissane et al., 2024](https://www.alignmentforum.org/posts/fmwk6qxrpW8d4jvbd/saes-usually-transfer-between-base-and-chat-models), that SAEs trained on Gemma 2 9B base transfer very well to the IT model, and these IT SAEs only work marginally better. Therefore in many cases we expect it is sufficient to use our PT SAEs for the equivalent IT model, e.g. using the [Gemma 2 9B PT SAEs](https://huggingface.co/google/gemma-scope-9b-pt-res) to interpret Gemma 2 9B IT.
+
+# 4. Point of Contact
 
 Point of contact: Arthur Conmy
 
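The new Section 3 in this commit says that SAEs trained on the base (PT) model transfer well to the IT model. One way to make that claim concrete is to reconstruct both models' activations with the same SAE and compare the fraction of variance unexplained (FVU). The sketch below is a toy illustration only: the plain ReLU SAE, the 2-dimensional weights, the made-up activation vectors, and the helper names `relu_sae_reconstruct` and `fraction_variance_unexplained` are all assumptions for illustration. It is not the Gemma Scope architecture, not the saelens API, and not necessarily the metric used in the technical report.

```python
from typing import List, Sequence

Vector = Sequence[float]


def relu_sae_reconstruct(
    x: Vector,
    w_enc: Sequence[Vector],  # shape [n_features][d_model]
    b_enc: Vector,            # shape [n_features]
    w_dec: Sequence[Vector],  # shape [n_features][d_model]
    b_dec: Vector,            # shape [d_model]
) -> List[float]:
    """Toy ReLU SAE: f = relu(W_enc x + b_enc), x_hat = sum_i f_i * W_dec[i] + b_dec."""
    feats = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_enc, b_enc)
    ]
    return [
        sum(f * row[j] for f, row in zip(feats, w_dec)) + b_dec[j]
        for j in range(len(b_dec))
    ]


def fraction_variance_unexplained(
    acts: Sequence[Vector], recons: Sequence[Vector]
) -> float:
    """FVU = sum ||x - x_hat||^2 / sum ||x - mean(x)||^2 over the batch (lower is better)."""
    n, d = len(acts), len(acts[0])
    mean = [sum(x[j] for x in acts) / n for j in range(d)]
    resid = sum((x[j] - r[j]) ** 2 for x, r in zip(acts, recons) for j in range(d))
    total = sum((x[j] - mean[j]) ** 2 for x in acts for j in range(d))
    return resid / total


# Hypothetical toy SAE with two features aligned to the coordinate axes.
W_ENC = [[1.0, 0.0], [0.0, 1.0]]
B_ENC = [0.0, 0.0]
W_DEC = [[1.0, 0.0], [0.0, 1.0]]
B_DEC = [0.0, 0.0]

# Made-up stand-ins for residual-stream activations from a base and an IT model.
base_acts = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]]
it_acts = [[1.0, 2.0], [3.0, -1.0], [2.0, 2.0]]


def evaluate(acts: Sequence[Vector]) -> float:
    """Reconstruct every activation with the same SAE and return the FVU."""
    recons = [relu_sae_reconstruct(x, W_ENC, B_ENC, W_DEC, B_DEC) for x in acts]
    return fraction_variance_unexplained(acts, recons)


fvu_base = evaluate(base_acts)
fvu_it = evaluate(it_acts)
print(f"FVU on base activations: {fvu_base:.3f}")
print(f"FVU on IT activations:   {fvu_it:.3f}")
```

In a real comparison, the activations would come from the same layer and site of the base and IT models, and the SAE weights would be loaded from a released checkpoint. A small gap between the two FVU values is what "transfers well" would look like under this particular metric.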