Daniel Tan

AI Safety Researcher

About Me

I have a broad interest in AI alignment and AGI risk. My current focus is understanding and evaluating the legibility of models' chain-of-thought reasoning. I am also interested in steganography, prosaic interpretability, and alignment failure modes.

I am currently completing MATS 7.0 with Owain Evans. I am also a PhD student at University College London, supervised by Brooks Paige. I am supported by the Agency for Science, Technology and Research (A*STAR).

I post frequent updates on LessWrong, as well as on Twitter. Please reach out if you would like to chat!

Selected Papers

Here are some papers I've made substantial contributions to. Please refer to my Google Scholar page for a full list of publications.

Emergent Misalignment: Narrow Finetuning Can Produce Broadly Misaligned LLMs

Models finetuned to write insecure code learn to admire Nazis, among other broadly misaligned behaviors.

Analysing the Generalisation and Reliability of Steering Vectors
Accepted at NeurIPS 2024

Steering vectors are not universally effective across tasks, and they often fail to generalize even to similar instances of the same task.

Towards Generalist Robot Learning from Internet Video: A Survey
Accepted at the Journal of Artificial Intelligence Research (JAIR)

Covers the challenges, methods, and applications of using Internet image and video data to learn real-world robot tasks.

Blog Posts

Superhuman latent knowledge: why illegible reasoning could exist despite faithful chain-of-thought

Why I'm moving from mechanistic to prosaic interpretability