Lorenzo Pacchiardi
Assistant Research Professor, University of Cambridge
I am an Assistant Research Professor at the Leverhulme Centre for the Future of Intelligence at the University of Cambridge. I lead a research project (funded by Open Philanthropy) on developing a benchmark for measuring the ability of LLMs to perform data science tasks. I am more broadly interested in AI evaluation, particularly in predictability and cognitive evaluation, and I closely collaborate with Prof José Hernández-Orallo and Prof Lucy Cheke. I contribute to the AI evaluation newsletter.
I am deeply familiar with EU AI policy (having been involved in several initiatives), and am one of the co-founders of the Italian AI policy think tank CePTE. I also collaborate with The Unjournal to make impactful research more rigorous, and I co-founded AcademicJobsItaly.com to make the Italian academic job market more accessible.
I previously worked on lie detection in large language models with Dr Owain Evans (through the MATS programme) and on technical standards for AI under the EU AI Act at the Future of Life Institute. I have also briefly advised RAND on AI evaluation.
I obtained a PhD in Statistics and Machine Learning at the University of Oxford, during which I worked on Bayesian simulation-based inference, generative models, and probabilistic forecasting (with applications to meteorology). My supervisors were Prof. Ritabrata Dutta (University of Warwick) and Prof. Geoff Nicholls (University of Oxford).
Before my PhD studies, I obtained a Bachelor’s degree in Physical Engineering from Politecnico di Torino (Italy) and an MSc in Physics of Complex Systems jointly from Politecnico di Torino and Université Paris-Sud (France). I did my MSc thesis at LightOn, a machine learning startup in Paris.
news
| Date | News |
|---|---|
| Apr 01, 2026 | Our paper “General Scales Unlock AI Evaluation with Explanatory and Predictive Power” has been accepted and published in Nature! 🎉 |
| Mar 01, 2026 | Our paper “No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes” has been accepted to the ICLR 2026 Trustworthy AI workshop! 🎉 |
| May 16, 2025 | Our survey on AI evaluation was accepted at the IJCAI 2025 survey track, and our PredictaBoard paper was accepted to ACL 2025 Findings. |
| Mar 11, 2025 | Our new preprint shows how to extract the most predictive and explanatory power from AI benchmarks by automatically annotating the demands posed by each question. Check it out! |
| Feb 21, 2025 | Two new arXiv preprints: one surveying AI evaluation and identifying six main paradigms, the other introducing a benchmark for jointly evaluating LLM performance and its predictability on individual instances. |