# Analyze with TransformerLens
Compute curvature on a TransformerLens `HookedTransformer`. This is useful for mechanistic-interpretability work where you want Hessian-related quantities while retaining access to TLens hooks.
## Install
```shell
uv add "hessian-eigenthings[transformer-lens]"
# or
pip install "hessian-eigenthings[transformer-lens]"
```
## Pattern
```python
import torch
from transformer_lens import HookedTransformer

from hessian_eigenthings.algorithms import lanczos
from hessian_eigenthings.loss_fns import tlens_loss
from hessian_eigenthings.operators import HessianOperator

model = HookedTransformer.from_pretrained("solu-1l")
model.eval()

# Any iterable of token batches works as a dataloader.
tokens = torch.randint(0, model.cfg.d_vocab, (2, 32))
dataloader = [tokens]

op = HessianOperator(model=model, dataloader=dataloader, loss_fn=tlens_loss())
result = lanczos(op, k=3, max_iter=20, tol=1e-3, seed=0)
print(result.eigenvalues)
```
`tlens_loss()` calls `model(batch, return_type="loss")` under the hood, i.e. the standard TLens shifted cross-entropy language-modeling loss.
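Operators like `HessianOperator` are typically built on Hessian-vector products, which PyTorch can form with double backward without ever materializing the full Hessian. A minimal self-contained sketch of that primitive, using a toy quadratic loss whose Hessian is known exactly (the `hvp` helper here is illustrative, not the library's API):

```python
import torch

# Toy "model": loss(w) = 0.5 * w^T A w for a fixed symmetric A,
# so the Hessian is exactly A and the HVP can be checked directly.
torch.manual_seed(0)
A = torch.randn(4, 4)
A = 0.5 * (A + A.T)  # symmetrize
w = torch.randn(4, requires_grad=True)

def loss_fn(w):
    return 0.5 * w @ A @ w

def hvp(loss, params, v):
    """Hessian-vector product via double backward: H @ v = d/dw (grad(loss) . v)."""
    (g,) = torch.autograd.grad(loss, params, create_graph=True)
    (Hv,) = torch.autograd.grad(g @ v, params)
    return Hv

v = torch.randn(4)
Hv = hvp(loss_fn(w), w, v)
assert torch.allclose(Hv, A @ v, atol=1e-5)
```

Krylov methods such as Lanczos only need this matrix-vector product, which is why the operator abstraction above never asks for the Hessian itself.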
## Per-block analysis
TLens parameter names follow the `blocks.{i}.attn.{...}` and `blocks.{i}.mlp.{...}` pattern, which makes per-block filtering straightforward:
```python
from hessian_eigenthings.param_utils import match_names

attn_op = HessianOperator(
    model=model,
    dataloader=dataloader,
    loss_fn=tlens_loss(),
    param_filter=match_names("blocks.*.attn.*"),
)
mlp_op = HessianOperator(
    model=model,
    dataloader=dataloader,
    loss_fn=tlens_loss(),
    param_filter=match_names("blocks.*.mlp.*"),
)
```
Per-block sharpness disparities, such as the eigenvalue gap between attention and MLP blocks, are exactly the setup studied by Liu et al. (2025).
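As a self-contained illustration of what `lanczos` estimates from an operator that only exposes matrix-vector products, here is a minimal Lanczos tridiagonalization in NumPy, checked against a dense eigensolver on a small symmetric matrix (illustrative, not the library's implementation):

```python
import numpy as np

def lanczos_top_eigs(matvec, dim, k=3, iters=20, seed=0):
    """Estimate the top-k eigenvalues of a symmetric operator from matvecs only."""
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(dim)
    q /= np.linalg.norm(q)
    Q = [q]
    alphas, betas = [], []
    beta, q_prev = 0.0, np.zeros(dim)
    for _ in range(min(iters, dim)):
        # Three-term recurrence: w = A q_j - beta_j q_{j-1} - alpha_j q_j
        w = matvec(q) - beta * q_prev
        alpha = q @ w
        w -= alpha * q
        # Full reorthogonalization for numerical stability.
        for qi in Q:
            w -= (qi @ w) * qi
        alphas.append(alpha)
        beta = np.linalg.norm(w)
        if beta < 1e-10:
            break
        betas.append(beta)
        q_prev, q = q, w / beta
        Q.append(q)
    # Eigenvalues of the small tridiagonal T approximate extreme eigenvalues of A.
    off = betas[: len(alphas) - 1]
    T = np.diag(alphas) + np.diag(off, 1) + np.diag(off, -1)
    return np.sort(np.linalg.eigvalsh(T))[-k:]

rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
M = 0.5 * (M + M.T)  # symmetric, like a Hessian
est = lanczos_top_eigs(lambda v: M @ v, 50, k=3)
exact = np.sort(np.linalg.eigvalsh(M))[-3:]
```

Because extreme eigenvalues converge first in Lanczos, a modest `iters` (20 in the docs example above) is often enough to compare the top-of-spectrum "sharpness" of two operators such as `attn_op` and `mlp_op`.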