another quiet morning in London. the kettle's on; the work continues.

I study how AI minds go wrong.

And, more importantly, how to keep them from going wrong as they get more capable.

I'm an AI safety researcher in London, doing my PhD at UCL and working with the Center on Long-Term Risk.

My research is about building a pragmatic science of how language models generalize — why an aligned model can quietly become misaligned, what models actually learn from their data, and how to catch it before it matters. I come at it from a mix of NLP, ML, and cognitive science.

Before this: undergrad at Stanford in ML and CS, a year building robots at a Singapore startup, and earlier obsessions with mechanistic interpretability and open-ended learning. Off the clock I'm dancing, on a parkrun, climbing, cooking, or deep in some anime / sci-fi. Autism is my superpower, and I'm unapologetically curious about almost everything.

Things I've helped figure out

A few papers I'm proud of. The full list lives on Scholar.

Models trained to write insecure code learn to admire Nazis.
Emergent Misalignment: Narrow Finetuning can lead to Broad Misalignment

Steer how a model generalizes by adding one line to the training data.
Inoculation Prompting: Eliciting traits during training can suppress them at test-time

Steering vectors don't work universally; they often fail on their own task.
Analyzing the Generalization and Reliability of Steering Vectors
NeurIPS 2024

Can robots learn real-world tasks just by watching internet video?
Towards Generalist Robot Learning from Internet Video: A Survey
In proceedings, JAIR

$ ./alignment
Rather play than read? Boot a little terminal and play as a model in training. Two of the papers above, emergent misalignment and inoculation prompting, happen to you. play →

Writing

Now

CYCLE · JUNE 2026

A new chapter — I've just started a role at Arcadia Alignment. Outside the work, health has become a real joy: two years with a trainer, and lately I'm into mudgar (heavy-club training) and my first steps on the swing-dance floor.

It's been a season of growth — clearer about what I want, happier, more myself. The fuller version →

What I'm into

The stuff I'll happily talk your ear off about:

Dancing · Parkrun · Bouldering · Cooking · 3Blue1Brown · Karpathy · Avatar: TLA · Stardew Valley · Hollow Knight · ABBA · TPOT

Also kicking around: books I loved · what therapy taught me · stanzas on growth · dating profile outtakes

Let's talk.

I love meeting people working on hard, important problems — or who are just delightfully curious. Say hi.
