Bartosz Cywiński
Hey there!
I’m a PhD student at the Warsaw University of Technology. I am also a contractor at Google DeepMind in Neel Nanda’s team.
I work on interpretability of LLMs. Currently, I am mostly excited about using interpretability to understand and fix weird model behaviors or phenomena.
news
| Date | News |
|---|---|
| Apr 09, 2026 | It’s London baby 💂 |
| Oct 02, 2025 | New preprint: Eliciting Secret Knowledge from Language Models. |
| Jun 16, 2025 | Starting MATS 8.0 mentored by Arthur Conmy and Sam Marks! |
| May 21, 2025 | New preprint about extracting hidden knowledge from LLMs with mechanistic interpretability is out! |
| May 01, 2025 | SAeUron got accepted to ICML 2025! |