Statistical and machine-learning analyses often involve computationally expensive procedures: Markov chain Monte Carlo samplers that require millions of model evaluations, deep neural networks trained over many epochs, or simulation models in which a single run may take hours. In practice, any such analysis rests on choices about model structure, distributional assumptions, and parameter values that are difficult to justify with certainty. It is therefore important to ask how sensitive our conclusions are to those choices. If the analysis is expensive, however, a systematic sensitivity investigation becomes impractical: we cannot afford to rerun the full procedure for every plausible perturbation of every assumption.
One principled response to this tension is to construct a cheaper surrogate analysis, one that is fast enough to run repeatedly across a wide range of settings, and to use it to understand how the conclusions of the full analysis would vary if its inputs were changed. The surrogate is not intended to replace the full analysis but to serve as a proxy for exploring the input space efficiently. By studying the relationship between the surrogate output and the full output at a small number of carefully chosen input configurations, we can model the discrepancy between the two and obtain corrected predictions at settings where the full analysis has not been run. This approach connects ideas from sensitivity analysis (Saltelli et al., 2000), Bayesian emulation (Kennedy and O’Hagan, 2001), and approximate inference, and has been applied to a range of problems including Bayesian robustness analysis (Vernon and Gosling, 2023).
In this project, the student will develop a working understanding of the theory and practice of surrogate-based robustness analysis, and will apply these ideas to a statistical or machine-learning problem of their choice. The figure below illustrates the core idea: the posterior mode (cheap, available everywhere) tracks the posterior mean (expensive, evaluated at only three settings of \(\beta\)) closely enough to reveal the structure of the full analysis at a fraction of the cost.
Figure 1: Posterior mode (blue line, computed cheaply across the full range) used as a surrogate for the posterior mean (red points, computed at three values of the parameter β). The cheap approximation captures the trend well, and the discrepancy between the two can itself be modelled to produce a corrected surrogate.
The following are useful entry points to the literature.
This project combines mathematical reading with hands-on implementation. The student will begin by working through the key ideas in the literature, building a precise understanding of the theoretical framework before moving on to computation. The primary language for implementation is R, though Python is acceptable where the student has a strong preference.
Evidence of learning takes the form of a written report. The report should present the theoretical background clearly and concisely, describe the surrogate methodology chosen and justify that choice, report computational experiments with well-chosen visualisations, and provide a critical assessment of the approach’s limitations and potential extensions. The student will meet the supervisor fortnightly throughout the year to discuss progress.
The following indicate directions in which the project could develop; the student is encouraged to choose one and pursue it in depth, or to propose an alternative direction in consultation with the supervisor.
Sensitivity analysis via a cheap surrogate. In many statistical models, certain summary statistics (modes, maximum likelihood estimates, or Laplace approximations) can be computed cheaply relative to full posterior inference. A natural project is to choose a specific model, implement both the full and cheap analyses, run the full analysis at a small number of input configurations to calibrate the discrepancy, and then use the calibrated surrogate to map how the full posterior summary varies across the input space. Vernon and Gosling (2023) provide a worked example of this approach in the context of Bayesian robustness.
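The calibrate-then-correct workflow described above can be sketched on a toy problem. The following is a minimal illustration in Python, not the method of any cited paper. It assumes a hypothetical Poisson model with a Gamma(a, β) prior on the rate, so that the posterior mode and mean are both available in closed form. The mean stands in for the "expensive" summary and is evaluated at only three values of β, and a straight line is assumed as the discrepancy model. All names and numerical values are invented for illustration.

```python
# Toy model (hypothetical): Poisson counts with a Gamma(a, beta) prior on the rate.
# The posterior is Gamma(a + s, beta + n), where s is the sum of the n counts.
A_SHAPE = 2.0   # prior shape a (assumed value)
S_TOTAL = 30.0  # sum of the observed counts (made-up data)
N_OBS = 10      # number of observations (made-up data)


def posterior_mode(beta):
    """Cheap summary: mode of the Gamma posterior."""
    return (A_SHAPE + S_TOTAL - 1.0) / (beta + N_OBS)


def posterior_mean(beta):
    """Stand-in for the expensive summary: mean of the Gamma posterior."""
    return (A_SHAPE + S_TOTAL) / (beta + N_OBS)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Run the "expensive" analysis at three design values of beta only.
design = [0.5, 2.0, 4.0]
discrepancy = [posterior_mean(b) - posterior_mode(b) for b in design]

# Model the discrepancy as a straight line in beta (an assumed, simple choice).
intercept, slope = fit_line(design, discrepancy)


def corrected_surrogate(beta):
    """Cheap summary plus the fitted discrepancy."""
    return posterior_mode(beta) + intercept + slope * beta


# Compare raw and corrected surrogates with the full answer on a grid.
grid = [0.5 + 0.25 * i for i in range(15)]  # beta from 0.5 to 4.0
max_raw_error = max(abs(posterior_mean(b) - posterior_mode(b)) for b in grid)
max_corrected_error = max(
    abs(posterior_mean(b) - corrected_surrogate(b)) for b in grid
)
print(max_raw_error, max_corrected_error)
```

In a real project the closed-form mean would be replaced by an MCMC run, and the straight-line discrepancy by whatever model the calibration data support, for example a Gaussian process.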
Gaussian process emulation. A Gaussian process (GP) can be fitted to the outputs of an expensive analysis at a set of design points, and then used as a smooth, probabilistic surrogate for the full analysis everywhere else. A project in this direction could implement GP emulation for a statistical or machine-learning model of the student’s choice, carry out a sensitivity analysis using the emulator, and assess both the accuracy of the emulator predictions and the computational savings achieved relative to direct evaluation.
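As a rough sketch of what such an emulator involves, the Python code below fits a zero-mean Gaussian process with a squared-exponential covariance to six runs of a stand-in "expensive" function and returns a predictive mean and variance at new inputs. The function `expensive_analysis`, the design points, and the fixed hyperparameters are all assumptions made for illustration. In practice the hyperparameters would be estimated and an established GP library would normally be used rather than hand-written linear algebra.

```python
import math


def expensive_analysis(x):
    """Hypothetical stand-in for a costly analysis with a scalar input."""
    return math.sin(3.0 * x)


def sq_exp_kernel(x1, x2, variance=1.0, length_scale=0.5):
    """Squared-exponential covariance with fixed (assumed) hyperparameters."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_lower(low, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y


def solve_lower_transpose(low, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / low[i][i]
    return x


def fit_emulator(xs, ys, jitter=1e-8):
    """Factorise the design covariance matrix and precompute K^{-1} y."""
    k = [[sq_exp_kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    low = cholesky(k)
    alpha = solve_lower_transpose(low, solve_lower(low, ys))
    return xs, low, alpha


def predict(emulator, x_new):
    """Predictive mean and variance of the zero-mean GP at x_new."""
    xs, low, alpha = emulator
    k_star = [sq_exp_kernel(x_new, x) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(low, k_star)
    var = sq_exp_kernel(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)


design = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0]
outputs = [expensive_analysis(x) for x in design]
emulator = fit_emulator(design, outputs)

mean_at_design, var_at_design = predict(emulator, 0.8)  # a design point
mean_between, var_between = predict(emulator, 1.0)      # between design points
print(mean_between, var_between, expensive_analysis(1.0))
```

The predictive variance is essentially zero at the design points and grows between them. This is what allows the emulator to report its own uncertainty when it is used in place of the full analysis.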
Approximate Bayesian computation. Approximate Bayesian computation (ABC) replaces exact likelihood evaluation with a distance-based comparison of simulated and observed summary statistics, making inference tractable for models in which the likelihood is intractable. A project in this direction could implement an ABC algorithm, compare its output to exact posterior inference where feasible, and investigate the sensitivity of the approximation to the choice of summary statistics and acceptance threshold, following the framework of Csilléry et al. (2010).
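The simplest member of this family is rejection ABC, sketched below in Python for a deliberately easy problem where the exact posterior is known. The model (normal data with known variance and a normal prior on the mean), the observed summary, the tolerance, and the number of draws are all hypothetical choices made so that the ABC output can be checked against the conjugate answer. This is a minimal sketch rather than a recommended algorithm.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

N_OBS = 20           # sample size (made-up)
SIGMA = 1.0          # known observation standard deviation (assumed)
OBSERVED_MEAN = 1.2  # observed summary statistic: the sample mean (made-up)
PRIOR_MEAN = 0.0     # prior mean for theta (assumed)
PRIOR_SD = 2.0       # prior standard deviation for theta (assumed)
EPSILON = 0.1        # acceptance threshold on the summary distance
N_DRAWS = 20000      # number of prior draws


def simulate_summary(theta):
    """Simulate a data set given theta and reduce it to its sample mean."""
    sample = [random.gauss(theta, SIGMA) for _ in range(N_OBS)]
    return sum(sample) / N_OBS


# Rejection ABC: keep prior draws whose simulated summary lands
# within EPSILON of the observed summary.
accepted = []
for _ in range(N_DRAWS):
    theta = random.gauss(PRIOR_MEAN, PRIOR_SD)
    if abs(simulate_summary(theta) - OBSERVED_MEAN) < EPSILON:
        accepted.append(theta)

abc_mean = sum(accepted) / len(accepted)

# Exact conjugate posterior mean for comparison (normal prior, normal likelihood).
prior_precision = 1.0 / PRIOR_SD ** 2
data_precision = N_OBS / SIGMA ** 2
exact_mean = (prior_precision * PRIOR_MEAN + data_precision * OBSERVED_MEAN) / (
    prior_precision + data_precision
)

print(len(accepted), abc_mean, exact_mean)
```

Rerunning the sketch with a larger `EPSILON`, or with a less informative summary than the sample mean, gives a direct way to see how the approximation degrades.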
Bayes linear and variational approximations. Full posterior inference is not the only principled approach to Bayesian computation, and two well-developed alternatives occupy an interesting middle ground between the full probabilistic analysis and a purely cheap surrogate. Bayes linear methods (Goldstein and Wooff, 2007) replace the full joint probability model with a second-order specification: prior means, variances, and covariances are updated by observed data using linear adjustments, without ever committing to a full distributional form. Variational inference (Blei et al., 2017) approximates the posterior by optimising over a tractable family of distributions, converting a sampling problem into an optimisation problem and achieving substantial computational savings relative to Markov chain Monte Carlo. A project in this direction could implement one or both of these approximations for a specific model class, characterise the discrepancy between the approximation and the full posterior as a function of the model’s structural features, and investigate the conditions under which the cheaper approach is reliable for practical inference.
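To make the Bayes linear update concrete, the sketch below implements the standard adjusted expectation and adjusted variance for a scalar quantity B and a scalar data quantity D: the adjusted expectation is E(B) + Cov(B, D) Var(D)^{-1} (d - E(D)), and the adjusted variance is Var(B) - Cov(B, D) Var(D)^{-1} Cov(D, B). The example specification, in which D is B plus uncorrelated noise, and all numerical values are assumptions chosen for illustration.

```python
def bayes_linear_adjust(mean_b, var_b, cov_bd, mean_d, var_d, d_obs):
    """Bayes linear adjustment of a scalar B by a scalar D observed as d_obs.

    Only means, variances, and the covariance are required;
    no distributional form is assumed.
    """
    weight = cov_bd / var_d
    adjusted_mean = mean_b + weight * (d_obs - mean_d)
    adjusted_var = var_b - weight * cov_bd
    return adjusted_mean, adjusted_var


# Hypothetical second-order specification: D = B + e, with e uncorrelated with B,
# so E(D) = E(B), Var(D) = Var(B) + Var(e), and Cov(B, D) = Var(B).
prior_mean = 0.0
prior_var = 4.0
noise_var = 1.0
observed_d = 2.5

adjusted_mean, adjusted_var = bayes_linear_adjust(
    mean_b=prior_mean,
    var_b=prior_var,
    cov_bd=prior_var,
    mean_d=prior_mean,
    var_d=prior_var + noise_var,
    d_obs=observed_d,
)
print(adjusted_mean, adjusted_var)
```

With these values the adjusted expectation is 2.0 and the adjusted variance is 0.8, up to floating-point rounding. In the vector case the division by Var(D) becomes a matrix inverse (or generalised inverse), but the structure of the update is the same.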
Robustness of machine-learning pipelines. The surrogate idea applies beyond Bayesian statistics: in a neural network, for example, reducing the number of training epochs or hidden layers yields a cheap approximation to the full model. A project in this direction could characterise the gap between reduced and full networks as a function of the architectural or training choices, and investigate whether a corrected reduced model can match full-model performance at substantially lower cost.
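The idea of measuring the gap as a function of a training choice can be illustrated on something much smaller than a neural network. The Python sketch below is an assumed toy setup: a straight-line model fitted by gradient descent to made-up data, with 500 epochs standing in for the "full" model. It records how the training-loss gap shrinks as the reduced model is given more epochs. A real project would replace the linear model with an actual network and would examine held-out performance rather than training loss.

```python
import math

# Made-up one-dimensional regression data with a small deterministic perturbation.
XS = [i / 10 for i in range(11)]
YS = [2.0 * x + 1.0 + 0.1 * math.sin(7 * i) for i, x in enumerate(XS)]


def train(epochs, learning_rate=0.1):
    """Fit y = w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(XS)
    for _ in range(epochs):
        residuals = [w * x + b - y for x, y in zip(XS, YS)]
        grad_w = 2.0 * sum(r * x for r, x in zip(residuals, XS)) / n
        grad_b = 2.0 * sum(residuals) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def loss(params):
    """Mean squared error of the fitted line on the training data."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


# "Full" model: many epochs. "Reduced" models: a handful of epochs.
full_loss = loss(train(500))
reduced_epochs = [5, 20, 50]
gaps = [loss(train(k)) - full_loss for k in reduced_epochs]

for k, gap in zip(reduced_epochs, gaps):
    print(k, gap)
```

The list `gaps` plays the role of the discrepancy between cheap and full analyses. Modelling it as a function of the number of epochs is the analogue of the discrepancy modelling described for Bayesian surrogates.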
The following are useful but not required prior to starting:
Uncertainty quantification; Bayesian computation; sensitivity analysis; surrogate and emulator modelling.