transformer-lens-interpretability

Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints and activation caching. Use when reverse-engineering model algorithms, studying attention patterns, or performing activation patching experiments.

AI & Automation · 9,609 stars · 724 forks · Updated 1 month ago · MIT

Quality Score: 94/100 (weights: Stars 20%, Recency 20%, Frontmatter 20%, Documentation 15%, Issue Health 10%, License 10%, Description 5%)

Skill Content

# TransformerLens: Mechanistic Interpretability for Transformers

TransformerLens is the de facto standard library for mechanistic interpretability research on GPT-style language models. Created by Neel Nanda and maintained by Bryce Meyer, it provides clean interfaces to inspect and manipulate model internals via HookPoints on every activation.

**GitHub**: [TransformerLensOrg/TransformerLens](https://github.com/TransformerLensOrg/TransformerLens) (2,900+ stars)

## When to Use TransformerLens

**Use TransformerLens when you need to:**

- Reverse-engineer algorithms learned during training
- Perform activation patching / causal tracing experiments
- Study attention patterns and information flow
- Analyze circuits (e.g., induction heads, IOI circuit)
- Cache and inspect intermediate activations
- Apply direct logit attribution

**Consider alternatives when:**

- You need to work with non-transformer architectures → Use **nnsight** or **pyvene**
- You want to train/analyze Sparse Autoencoders → Use **SAELens**
- You need remote execution on massive models → Use **nnsight** with NDIF
- You want higher-level causal intervention abstractions → Use **pyvene**

## Installation

```bash
pip install transformer-lens
```

For development version:

```bash
pip install git+https://github.com/TransformerLensOrg/TransformerLens
```

## Core Concepts

### HookedTransformer

The main class that wraps transformer models with HookPoints on every activation:

```python
from transformer_lens import...
```
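The ideas named above (a hook point on every activation, activation caching, and activation patching) can be sketched without any dependencies. The block below is a toy illustration only, not TransformerLens code: `ToyHookedModel`, its three stages, and the `patch_mix` hook are hypothetical names invented for this sketch. Its two method names are chosen to echo the `run_with_cache` and `run_with_hooks` methods that `HookedTransformer` exposes, but the signatures here are simplified assumptions.

```python
# Toy stand-in for a hooked model. NOT TransformerLens code; all names are hypothetical.
from typing import Callable, Dict, List, Optional, Tuple

Activation = List[float]
# A hook receives (activation, hook_name) and may return a replacement activation.
Hook = Callable[[Activation, str], Optional[Activation]]


class ToyHookedModel:
    """Three named stages; every stage output passes through a hook point."""

    def __init__(self) -> None:
        self.stages: List[Tuple[str, Callable[[Activation], Activation]]] = [
            ("embed", lambda x: [2.0 * v for v in x]),
            ("mix", lambda x: [sum(x)] * len(x)),
            ("out", lambda x: [v + 1.0 for v in x]),
        ]

    def run_with_hooks(
        self, x: Activation, hooks: Optional[Dict[str, Hook]] = None
    ) -> Activation:
        hooks = hooks or {}
        for name, fn in self.stages:
            x = fn(x)
            if name in hooks:
                replacement = hooks[name](x, name)
                if replacement is not None:
                    # The hook overwrote this activation (an intervention).
                    x = replacement
        return x

    def run_with_cache(self, x: Activation) -> Tuple[Activation, Dict[str, Activation]]:
        cache: Dict[str, Activation] = {}

        def save(act: Activation, name: str) -> None:
            # Read-only hook: store a copy and leave the forward pass untouched.
            cache[name] = list(act)

        out = self.run_with_hooks(x, {name: save for name, _ in self.stages})
        return out, cache


model = ToyHookedModel()
clean_input = [1.0, 2.0]
corrupted_input = [0.0, 0.0]

# 1. Cache every activation from the clean run.
clean_out, clean_cache = model.run_with_cache(clean_input)

# 2. Baseline: run the corrupted input with no intervention.
corrupted_out = model.run_with_hooks(corrupted_input)


# 3. Activation patching: rerun the corrupted input, but overwrite the
#    "mix" activation with the value cached from the clean run.
def patch_mix(act: Activation, name: str) -> Activation:
    return clean_cache[name]


patched_out = model.run_with_hooks(corrupted_input, {"mix": patch_mix})

print(clean_out, corrupted_out, patched_out)
```

In this toy, patching the `mix` activation fully restores the clean output, which is the signal an activation patching experiment looks for: the patched site carries the information that distinguishes the clean run from the corrupted one. With a real model the restoration is usually partial and is measured with a metric such as a logit difference.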

Details

Author: Orchestra-Research
Repository: Orchestra-Research/AI-Research-SKILLs
Created: 7 months ago
Last Updated: 1 month ago
Language: TeX
License: MIT

Integrates with

Hugging Face · AI
