I'm a machine learning researcher with a background in Monte Carlo methods. I think the hardest remaining problems in building AI systems we can trust are fundamentally problems of sampling and inference: how to search, when to stop, and how to know what you don't know.
I have a PhD in statistics from the University of Bristol, where I worked on particle MCMC for population genetics with Christophe Andrieu and Mark Beaumont. Those years could be summarised as three years of finding tricks to sample from distributions that strongly preferred to be left undisturbed.
From there I spent three years at Improbable building methods for calibrating complex simulators against real-world data. This worked beautifully right up until we asked what happens when the simulator is wrong, which, for anything interesting, it invariably is. That question turned into papers at NeurIPS, UAI, and AISTATS, and inspired a growing line of work on robust simulation-based inference: understanding when and why learned models fail, which has become a central question in AI safety. I then co-founded a computer vision startup pushing NeRFs and Gaussian splats to their limits, and most recently, at Amazon AGI, I worked on post-training for large multimodal models.
Direction
In domains with verifiable rewards — mathematics, code, and formal reasoning — progress has been extraordinary, and sampling and inference have clear traction. Accelerating scientific discovery is within reach, provided we continue to scale inference-time compute and search, and ensure that the verification process itself remains reliable.
But I'm also motivated by a longer-term conviction. Current AI captures a kind of consensus, a melange, a blob: the natural consequence of producing the output that least offends. In some domains that has been fantastically useful, particularly where there are clear correctness criteria. Yet the richest aspects of human judgment are not well preserved under aggregation: taste, context-sensitivity, and irreducible perspective. These are not noise around a mean; they are a structured landscape of meaning. I believe this space is tractable, and that calibrated uncertainty will be crucial to building it. That's where I'm focusing my research: developing the mathematical foundations for AI systems that know what they don't know.
Publications