license: cc-by-4.0
library_name: saelens
⚠️ WARNING: This repo has minor labelling issues, and some SAEs appear twice.
1. Gemma Scope
Gemma Scope is a comprehensive, open suite of sparse autoencoders (SAEs) for Gemma 2 9B and 2B. SAEs are a "microscope" of sorts: they help us break down a model’s internal activations into the underlying concepts, just as biologists use microscopes to study the individual cells of plants and animals.
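To make the "microscope" idea a little more concrete, here is a toy sketch of what a sparse autoencoder does: it encodes an activation vector into a larger set of mostly-zero features, then decodes those features back into the original activation. Everything in it is an assumption made for illustration: the weights are hand-picked, the names (`W_ENC`, `W_DEC`, `encode`, `decode`) are hypothetical, and a plain ReLU is used for simplicity. It is not the Gemma Scope architecture or its parameters.

```python
# Toy sparse autoencoder with hand-picked weights (NOT the Gemma Scope parameters).

def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def relu(values):
    """Zero out negative entries; this is what makes the features sparse here."""
    return [max(0.0, v) for v in values]

# Hypothetical dictionary: 4 feature directions in a 2-dimensional activation space.
W_ENC = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, 0.0],
    [0.0, -1.0],
]
B_ENC = [0.0, 0.0, 0.0, 0.0]

# Decoder maps the 4 features back to the 2-dimensional activation space.
W_DEC = [
    [1.0, 0.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
]
B_DEC = [0.0, 0.0]

def encode(activation):
    """Activation -> non-negative feature values, most of them zero."""
    pre_acts = [p + b for p, b in zip(matvec(W_ENC, activation), B_ENC)]
    return relu(pre_acts)

def decode(features):
    """Feature values -> reconstructed activation."""
    return [r + b for r, b in zip(matvec(W_DEC, features), B_DEC)]

activation = [2.0, -3.0]
features = encode(activation)
reconstruction = decode(features)
active = [i for i, f in enumerate(features) if f > 0]

print(features)        # only features 0 and 3 are non-zero
print(reconstruction)  # matches the input activation in this toy case
```

In this toy case only two of the four features fire, and each one can be read as a separate "concept" direction. Real SAEs learn their weights from model activations and reconstruct them only approximately.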
See our landing page for details on the whole suite. This repository contains one specific set of SAEs from that suite:
2. What Is gemma-scope-2b-pt-res?
- gemma-scope-: See 1.
- 2b-pt-: These SAEs were trained on Gemma v2 2B base model.
- res: These SAEs were trained on the model's residual stream.
- We include experimental SAEs trained on token embeddings in the ./embedding folder.
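As a rough illustration of the naming scheme above, the sketch below splits a repo name into its parts. `parse_repo_name` and `SITE_NAMES` are hypothetical helpers written for this README, not part of any library, and the mapping only covers the `res` site described here.

```python
# Hypothetical helper: split a Gemma Scope repo name into the parts listed above.
SITE_NAMES = {"res": "residual stream"}  # only the site described in this README

def parse_repo_name(repo_name):
    prefix = "gemma-scope-"
    if not repo_name.startswith(prefix):
        raise ValueError(f"not a Gemma Scope repo name: {repo_name!r}")
    parts = repo_name[len(prefix):].split("-")
    if len(parts) != 3:
        raise ValueError(f"expected <size>-<variant>-<site>, got {repo_name!r}")
    model_size, variant, site = parts
    return {
        "suite": "gemma-scope",
        "model_size": model_size,             # "2b" -> Gemma v2 2B
        "variant": variant,                   # "pt" -> base model
        "site": SITE_NAMES.get(site, site),   # "res" -> residual stream
    }

info = parse_repo_name("gemma-scope-2b-pt-res")
print(info)
```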
3. Which SAE is in the Neuronpedia demo?
https://huggingface.co/google/gemma-scope-2b-pt-res/tree/main/layer_20/width_16k/average_l0_71
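The URL above also shows how individual SAEs are laid out inside the repo. The sketch below pulls the folder path apart using only the Python standard library. `parse_sae_url` is a hypothetical helper, and reading the three folder names as layer index, dictionary width, and average L0 is our interpretation of the path itself.

```python
import re
from urllib.parse import urlparse

DEMO_URL = (
    "https://huggingface.co/google/gemma-scope-2b-pt-res"
    "/tree/main/layer_20/width_16k/average_l0_71"
)

def parse_sae_url(url):
    """Extract the SAE folder path and its components from a repo tree URL."""
    path = urlparse(url).path
    match = re.search(
        r"/tree/main/(layer_(\d+)/width_(\w+)/average_l0_(\d+))$", path
    )
    if match is None:
        raise ValueError(f"unrecognised SAE URL: {url!r}")
    sae_path, layer, width, average_l0 = match.groups()
    return {
        "sae_path": sae_path,            # folder path relative to the repo root
        "layer": int(layer),
        "width": width,                  # kept as a string, e.g. "16k"
        "average_l0": int(average_l0),
    }

demo = parse_sae_url(DEMO_URL)
print(demo)
```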
See also 4.
4. How can I use these SAEs straight away?
from sae_lens import SAE  # pip install sae-lens

# Load the canonical 16k-width residual-stream SAE for layer 0.
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="gemma-scope-2b-pt-res-canonical",
    sae_id="layer_0/width_16k/canonical",
)
See https://github.com/jbloomAus/SAELens for details on this library.
5. Point of Contact
Point of contact: Arthur Conmy
Contact by email:
''.join(list('moc.elgoog@ymnoc')[::-1])
HuggingFace account: https://huggingface.co/ArthurConmyGDM