news

Oct 02, 2025 New preprint: Eliciting Secret Knowledge from Language Models.
Jun 16, 2025 Starting MATS 8.0, mentored by Arthur Conmy and Sam Marks!
May 21, 2025 New preprint about extracting hidden knowledge from LLMs with mechanistic interpretability is out!
May 01, 2025 SAeUron has been accepted to ICML 2025! :tada:
Feb 03, 2025 Check out our latest preprint about unlearning in diffusion models using SAEs! :sparkles:
Jan 22, 2025 Our work Precise Parameter Localization for Textual Generation in Diffusion Models has been accepted to ICLR 2025! :singapore:
Sep 01, 2024 I have officially started a PhD :tada: