I'm a machine learning researcher with a background in Monte Carlo methods. I think the hardest remaining problems in building effective AI systems we can trust are fundamentally problems of sampling and inference: how to search, when to stop, and how to know what you don't know.
I have a PhD in statistics from the University of Bristol, where I worked on particle MCMC for population genetics with Christophe Andrieu and Mark Beaumont. That time could be summarised as three years of finding tricks to sample from distributions that strongly preferred being left undisturbed.
From there I spent three years at Improbable building methods for calibrating complex simulators against real-world data. This worked beautifully right up until we asked what happens when the simulator is wrong, which, for anything interesting, it invariably is. That question turned into papers at NeurIPS, UAI, and AISTATS and inspired a growing line of work on robust simulation-based inference. I then co-founded a computer vision startup pushing NeRFs and Gaussian splats to their limits, and most recently, at Amazon AGI, I worked on large multimodal models for speech and audio.
Direction
In domains with verifiable rewards like mathematics, code, games, and formal reasoning, we're seeing clear progress. Here, sampling and inference have an obvious target. We can generate candidates, search over them, allocate more compute at test time, and use verification to decide what survives. I expect this recipe to matter deeply for scientific discovery, but only if the verification process is itself reliable.
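The generate-search-verify recipe above can be sketched in a few lines. This is a minimal, illustrative best-of-N loop on a toy verifiable task (the function names, the polynomial, and the sample budget are all hypothetical, not drawn from any particular system): sample candidates, keep only those an exact verifier accepts, and allocate more compute simply by raising N.

```python
import random

def best_of_n(generate, verify, n=500, seed=0):
    """Sample n candidates, keep those the verifier accepts,
    and return the first survivor (None if none verify)."""
    rng = random.Random(seed)
    survivors = [c for c in (generate(rng) for _ in range(n)) if verify(c)]
    return survivors[0] if survivors else None

# Toy verifiable task: find an integer root of x^2 - 5x + 6.
generate = lambda rng: rng.randint(-10, 10)   # cheap candidate proposals
verify = lambda x: x**2 - 5*x + 6 == 0        # exact, reliable verifier

answer = best_of_n(generate, verify)
print(answer)  # an integer satisfying the exact check
```

The whole recipe rests on `verify` being exact: with a noisy or gameable verifier, larger N just finds candidates that fool it, which is why the reliability of verification is the binding constraint.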
I'm most interested in the cases where this recipe breaks. Many of the questions we ask intelligent systems don't come with a straightforward verifier. They involve judgment, context, preference, uncertainty, and disagreement. Current approaches often average away such complications. That can be useful, but it leaves us with systems that are confident even when the target is underspecified, which is a central difficulty for alignment.
I want to build systems that can track that underspecification explicitly: systems that know when to search further, when to ask, when to defer, and when to preserve disagreement rather than resolve it prematurely. My current work addresses the mathematical foundations of systems that can act under uncertainty without collapsing it.
Publications