- ICLR 2024How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated QuestionsLorenzo Pacchiardi , Alex James Chan , Sören Mindermann , Ilan Moscovitz , Alexa Yue Pan , Yarin Gal , Owain Evans , and Jan M. Brauner2024
Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs might "lie", for example, when instructed to output misinformation. Here, we develop a simple lie detector that requires neither access to the LLM’s activations (black-box) nor ground-truth knowledge of the fact in question. The detector works by asking a predefined set of unrelated follow-up questions after a suspected lie, and feeding the LLM’s yes/no answers into a logistic regression classifier. Despite its simplicity, this lie detector is highly accurate and surprisingly general. When trained on examples from a single setting – prompting GPT-3.5 to lie about factual questions – the detector generalises out-of-distribution to (1) other LLM architectures, (2) LLMs fine-tuned to lie, (3) sycophantic lies, and (4) lies emerging in real-life scenarios such as sales. These results indicate that LLMs have distinctive lie-related behavioural patterns, consistent across architectures and contexts, which could enable general-purpose lie detection.
- arXivLikelihood-Free Inference with Generative Neural Networks via Scoring Rule MinimizationLorenzo Pacchiardi , and Ritabrata DuttaarXiv preprint arXiv:2205.15784, 2022
Bayesian Likelihood-Free Inference methods yield posterior approximations for simulator models with intractable likelihood. Recently, many works trained neural networks to approximate either the intractable likelihood or the posterior directly. Most proposals use normalizing flows, namely neural networks parametrizing invertible maps used to transform samples from an underlying base measure; the probability density of the transformed samples is then accessible and the normalizing flow can be trained via maximum likelihood on simulated parameter-observation pairs. A recent work [Ramesh et al., 2022] approximated instead the posterior with generative networks, which drop the invertibility requirement and are thus a more flexible class of distributions scaling to high-dimensional and structured data. However, generative networks only allow sampling from the parametrized distribution; for this reason, Ramesh et al.  follows the common solution of adversarial training, where the generative network plays a min-max game against a "critic" network. This procedure is unstable and can lead to a learned distribution underestimating the uncertainty - in extreme cases collapsing to a single point. Here, we propose to approximate the posterior with generative networks trained by Scoring Rule minimization, an overlooked adversarial-free method enabling smooth training and better uncertainty quantification. In simulation studies, the Scoring Rule approach yields better performances with shorter training time with respect to the adversarial framework.
- arXivProbabilistic Forecasting with Generative Networks via Scoring Rule MinimizationLorenzo Pacchiardi , Rilwan Adewoyin , Peter Dueben , and Ritabrata DuttaarXiv preprint arXiv:2112.08217, 2022
Generative networks are often trained to minimize a statistical divergence between the reference distribution and the generative one in an adversarial setting. Some works trained instead generative networks to minimize Scoring Rules, functions assessing how well the generative distribution matches each training sample individually. We show how the Scoring Rule formulation easily extends to the so-called prequential (predictive-sequential) score, whose minimization allows performing probabilistic forecasting with generative networks. This objective leads to adversarial-free training, therefore easily avoiding uncertainty underestimation due to mode collapse, which is a common issue in the adversarial setting and undesirable for probabilistic forecasting. We provide consistency guarantees for the minimizer of the prequential score and employ that to perform probabilistic forecasting for two chaotic dynamical models and a benchmark dataset of global weather observations. For this last example, we define scoring rules for spatial data by drawing from the relevant literature, with which we obtain better uncertainty quantification with little hyperparameter tuning compared to adversarial training.
- JMLRScore Matched Neural Exponential Families for Likelihood-Free InferenceLorenzo Pacchiardi , and Ritabrata DuttaJournal of Machine Learning Research, 2022
Bayesian Likelihood-Free Inference (LFI) approaches allow to obtain posterior distributions for stochastic models with intractable likelihood, by relying on model simulations. In Approximate Bayesian Computation (ABC), a popular LFI method, summary statistics are used to reduce data dimensionality. ABC algorithms adaptively tailor simulations to the observation in order to sample from an approximate posterior, whose form depends on the chosen statistics. In this work, we introduce a new way to learn ABC statistics: we first generate parameter-simulation pairs from the model independently on the observation; then, we use Score Matching to train a neural conditional exponential family to approximate the likelihood. The exponential family is the largest class of distributions with fixed-size sufficient statistics; thus, we use them in ABC, which is intuitively appealing and has state-of-the-art performance. In parallel, we insert our likelihood approximation in an MCMC for doubly intractable distributions to draw posterior samples. We can repeat that for any number of observations with no additional model simulations, with performance comparable to related approaches. We validate our methods on toy models with known likelihood and a large-dimensional time-series model.
- arXivGeneralized Bayesian Likelihood-Free Inference Using Scoring Rules EstimatorsLorenzo Pacchiardi , and Ritabrata DuttaarXiv preprint arXiv:2104.03889, 2021
We propose a framework for Bayesian Likelihood-Free Inference (LFI) based on Generalized Bayesian Inference using scoring rules (SR). SR are used to evaluate probabilistic models given an observation; a proper SR is minimised in expectation when the model corresponds to the true data generating process for the observation. Using a strictly proper SR, for which the above minimum is unique, ensures posterior consistency of our method. As the likelihood function is intractable for LFI, we employ consistent estimators of SR using model simulations in a pseudo-marginal MCMC; we show the target of such chain converges to the exact SR posterior with increasing number of simulations. Furthermore, we note popular LFI techniques like Bayesian Synthetic Likelihood (BSL) and semiparametric BSL can be seen as special cases of our framework using only proper (but not strictly so) SR. We provide empirical results validating our consistency result and show how related approaches do not enjoy this property. Practically, we use the Energy and Kernel Scores, but our general framework sets the stage for extensions with other scoring rules.
- PLOS Comp. Biol.Using Mobility Data in the Design of Optimal Lockdown Strategies for the COVID-19 PandemicRitabrata Dutta , Susana Gomes , Dante Kalise , and Lorenzo PacchiardiPLOS Computational Biology, 2021
A mathematical model for the COVID-19 pandemic spread, which integrates age-structured Susceptible-Exposed-Infected-Recovered-Deceased dynamics with real mobile phone data accounting for the population mobility, is presented. The dynamical model adjustment is performed via Approximate Bayesian Computation. Optimal lockdown and exit strategies are determined based on nonlinear model predictive control, constrained to public-health and socio-economic factors. Through an extensive computational validation of the methodology, it is shown that it is possible to compute robust exit strategies with realistic reduced mobility values to inform public policy making, and we exemplify the applicability of the methodology using datasets from England and France.
- JSSABCpy: A High-Performance Computing Perspective to Approximate Bayesian ComputationRitabrata Dutta , Marcel Schoengens , Lorenzo Pacchiardi , Avinash Ummadisingu , Nicole Widmer , Pierre Künzli , Jukka-Pekka Onnela , and Antonietta MiraJournal of Statistical Software, 2021
ABCpy is a highly modular scientific library for approximate Bayesian computation (ABC) written in Python. The main contribution of this paper is to document a software engineering effort that enables domain scientists to easily apply ABC to their research without being ABC experts; using ABCpy they can easily run large parallel simulations without much knowledge about parallelization. Further, ABCpy enables ABC experts to easily develop new inference schemes and evaluate them in a standardized environment and to extend the library with new algorithms. These benefits come mainly from the modularity of ABCpy. We give an overview of the design of ABCpy and provide a performance evaluation concentrating on parallelization. This points us towards the inherent imbalance in some of the ABC algorithms. We develop a dynamic scheduling MPI implementation to mitigate this issue and evaluate the various ABC algorithms according to their adaptability towards high-performance computing.
- Sankhya BDistance-Learning for Approximate Bayesian Computation to Model a Volcanic EruptionLorenzo Pacchiardi , Pierre Künzli , Marcel Schoengens , Bastien Chopard , and Ritabrata DuttaSankhya B, 2020
Approximate Bayesian computation (ABC) provides us with a way to infer parameters of models, for which the likelihood function is not available, from an observation. Using ABC, which depends on many simulations from the considered model, we develop an inferential framework to learn parameters of a stochastic numerical simulator of volcanic eruption. Moreover, the model itself is parallelized using Message Passing Interface (MPI). Thus, we develop a nested-parallelized MPI communicator to handle the expensive numerical model with ABC algorithms. ABC usually relies on summary statistics of the data in order to measure the discrepancy model output and observation. However, informative summary statistics cannot be found for the considered model. We therefore develop a technique to learn a distance between model outputs based on deep metric-learning. We use this framework to learn the plume characteristics (eg. initial plume velocity) of the volcanic eruption from the tephra deposits collected by field-work associated with the 2450 BP Pululagua (Ecuador) volcanic eruption.