Bartosz Cywiński

Hi there! I’m a first-year PhD student at the Warsaw University of Technology advised by Prof. Tomasz Trzciński, working closely with Kamil Deja.
My research focuses on the mechanistic interpretability of deep learning generative models. Currently, I am mostly excited about using interpretability to understand and fix weird model behaviors or phenomena.
For more details, see my publications.
news
Jun 16, 2025 | Starting MATS 8.0 mentored by Arthur Conmy and Sam Marks! |
---|---|
May 21, 2025 | New preprint about extracting hidden knowledge from LLMs with mechanistic interpretability is out! |
May 01, 2025 | SAeUron got accepted to ICML 2025! |
Feb 03, 2025 | Check out our latest preprint about unlearning in diffusion models using SAEs! |
Jan 22, 2025 | Our work Precise Parameter Localization for Textual Generation in Diffusion Models has been accepted to ICLR 2025! |