Lorenzo Pacchiardi
Assistant Research Professor, University of Cambridge
I am an Assistant Research Professor at the Leverhulme Centre for the Future of Intelligence at the University of Cambridge. I lead a research project (funded by Open Philanthropy) on developing a benchmark for measuring the ability of LLMs to perform data science tasks. I am more broadly interested in AI evaluation, particularly in predictability and cognitive evaluation, and I closely collaborate with Prof José Hernández-Orallo and Prof Lucy Cheke. I contribute to the AI evaluation newsletter.
I am deeply familiar with EU AI policy (having been involved in several initiatives), and am one of the co-founders of the Italian AI policy think tank CePTE. I also collaborate with The Unjournal to make impactful research more rigorous, and I co-founded AcademicJobsItaly.com to make the Italian academic job market more accessible.
I previously worked on lie detection in large language models with Dr Owain Evans (through the MATS programme) and on technical standards for AI under the EU AI Act at the Future of Life Institute. I have also briefly advised RAND on AI evaluation.
I obtained a PhD in Statistics and Machine Learning at the University of Oxford, during which I worked on Bayesian simulation-based inference, generative models, and probabilistic forecasting (with applications to meteorology). My supervisors were Prof. Ritabrata Dutta (University of Warwick) and Prof. Geoff Nicholls (University of Oxford).
Before my PhD studies, I obtained a Bachelor’s degree in Physical Engineering from Politecnico di Torino (Italy) and an MSc in Physics of Complex Systems jointly from Politecnico di Torino and Université Paris-Sud (France). I did my MSc thesis at LightOn, a machine learning startup in Paris.
news
| Date | News |
|---|---|
| Apr 01, 2026 | Our paper “General Scales Unlock AI Evaluation with Explanatory and Predictive Power” has been accepted and published in Nature! 🎉 |
| Mar 01, 2026 | Our paper “No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes” has been accepted to the ICLR 2026 Trustworthy AI workshop! 🎉 |
| May 16, 2025 | Our survey on AI evaluation was accepted at the IJCAI 2025 survey track, and our PredictaBoard paper was accepted to ACL 2025 Findings. |
| Mar 11, 2025 | Our new preprint shows how to extract the most predictive and explanatory power from AI benchmarks by automatically annotating the demands posed by each question. Check it out! |
| Feb 21, 2025 | Two new arXiv preprints: one surveying AI evaluation and identifying six main paradigms, the other introducing a benchmark for jointly evaluating LLM performance and its predictability on individual instances. |