# Professor Mark Girolami

Faculty of Natural Sciences, Department of Mathematics

Visiting Professor

### Location

539, Huxley Building, South Kensington Campus

## Publications

122 results found

Gregory A, Lau FD-H, Girolami M, Butler LJ, Elshafie MZEB et al., 2019, The synthesis of data from instrumented structures and physics-based models via Gaussian processes, Publisher: ACADEMIC PRESS INC ELSEVIER SCIENCE

WORKING PAPER

Gregory A, Lau D, Girolami M, Butler L, Elshafie M et al., 2019, The synthesis of data from instrumented structures and physics-based models via Gaussian processes, Journal of Computational Physics, Vol: 392, Pages: 248-265, ISSN: 0021-9991

At the heart of structural engineering research is the use of data obtained from physical structures such as bridges, viaducts and buildings. These data can represent how the structure responds to various stimuli over time when in operation. Many models have been proposed in the literature to represent such data, such as linear statistical models. Based upon these models, the health of the structure is reasoned about, e.g. through damage indices, changes in likelihood and statistical parameter estimates. On the other hand, physics-based models are typically used when designing structures to predict how the structure will respond to operational stimuli. These models represent how the structure responds to stimuli under idealised conditions. What remains unclear in the literature is how to combine the observed data with information from the idealised physics-based model into a model that describes the responses of the operational structure. This paper introduces a new approach which fuses together observed data from a physical structure during operation and information from a mathematical model. The observed data are combined with data simulated from the physics-based model using a multi-output Gaussian process formulation. The novelty of this method is how the information from observed data and the physics-based model is balanced to obtain a representative model of the structure's response to stimuli. We present our method using data obtained from a fibre-optic sensor network installed on experimental railway sleepers. The curvature of the sleeper at sensor and also non-sensor locations is modelled, guided by the mathematical representation. We discuss how this approach can be used to reason about changes in the structure's behaviour over time using simulations and experimental data. The results show that the methodology can accurately detect such changes. They also indicate that the methodology can infer information about changes in the parameters within the physics-based

JOURNAL ARTICLE

Oates CJ, Cockayne J, Briol F-X, Girolami M et al., 2019, Convergence rates for a class of estimators based on Stein's method, Bernoulli, Vol: 25, Pages: 1141-1159, ISSN: 1350-7265

JOURNAL ARTICLE

Oates CJ, Cockayne J, Aykroyd RG, Girolami M et al., 2019, Bayesian Probabilistic Numerical Methods in Time-Dependent State Estimation for Industrial Hydrocyclone Equipment, Publisher: AMER STATISTICAL ASSOC

WORKING PAPER

Roininen L, Girolami M, Lasanen S, Markkanen M et al., 2019, Hyperpriors for Matérn fields with applications in Bayesian inversion, Inverse Problems and Imaging, Vol: 13, Pages: 1-29, ISSN: 1930-8345

We introduce non-stationary Matérn field priors with stochastic partial differential equations, and construct correlation length-scaling with hyperpriors. We model both the hyperprior and the Matérn prior as continuous-parameter random fields. As hypermodels, we use Cauchy and Gaussian random fields, which we map suitably to a desired correlation length-scaling range. For computations, we discretise the models with finite difference methods. We consider the convergence of the discretised prior and posterior to the discretisation limit. We apply the developed methodology to certain interpolation, numerical differentiation and deconvolution problems, and show numerically that we can make Bayesian inversion which promotes competing constraints of smoothness and edge-preservation. For computing the conditional mean estimator of the posterior distribution, we use a combination of Gibbs and Metropolis-within-Gibbs sampling algorithms.

JOURNAL ARTICLE

Noonan J, Asiala SM, Grassia G, MacRitchie N, Gracie K, Carson J, Moores M, Girolami M, Bradshaw AC, Guzik TJ, Meehan GR, Scales HE, Brewer JM, McInnes IB, Sattar N, Faulds K, Garside P, Graham D, Maffia P et al., 2018, In vivo multiplex molecular imaging of vascular inflammation using surface-enhanced Raman spectroscopy, Theranostics, Vol: 8, Pages: 6195-6209, ISSN: 1838-7640

Vascular immune-inflammatory responses play a crucial role in the progression and outcome of atherosclerosis. The ability to assess localized inflammation through detection of specific vascular inflammatory biomarkers would significantly improve cardiovascular risk assessment and management; however, no multi-parameter molecular imaging technologies have been established to date. Here, we report the targeted in vivo imaging of multiple vascular biomarkers using antibody-functionalized nanoparticles and surface-enhanced Raman scattering (SERS). Methods: A series of antibody-functionalized gold nanoprobes (BFNP) were designed containing unique Raman signals in order to detect intercellular adhesion molecule 1 (ICAM-1), vascular cell adhesion molecule 1 (VCAM-1) and P-selectin using SERS. Results: SERS and BFNP were utilized to detect, discriminate and quantify ICAM-1, VCAM-1 and P-selectin in vitro on human endothelial cells and ex vivo in human coronary arteries. Ultimately, non-invasive multiplex imaging of adhesion molecules in a humanized mouse model was demonstrated in vivo following intravenous injection of the nanoprobes. Conclusion: This study demonstrates that multiplexed SERS-based molecular imaging can indicate the status of vascular inflammation in vivo and gives promise for SERS as a clinical imaging technique for cardiovascular disease in the future.

JOURNAL ARTICLE

Briol F-X, Oates CJ, Girolami M, Osborne MA, Sejdinovic D et al., Rejoinder for "Probabilistic Integration: A Role in Statistical Computation?", Statistical Science, ISSN: 0883-4237

This article is the rejoinder for the paper "Probabilistic Integration: A Role in Statistical Computation?" to appear in Statistical Science with discussion. We would first like to thank the reviewers and many of our colleagues who helped shape this paper, the editor for selecting our paper for discussion, and of course all of the discussants for their thoughtful, insightful and constructive comments. In this rejoinder, we respond to some of the points raised by the discussants and comment further on the fundamental questions underlying the paper: (i) Should Bayesian ideas be used in numerical analysis?, and (ii) If so, what role should such approaches have in statistical computation?

JOURNAL ARTICLE

Dunlop MM, Girolami MA, Stuart AM, Teckentrup AL et al., 2018, How deep are deep Gaussian processes?, Journal of Machine Learning Research, Vol: 19, ISSN: 1532-4435

Recent research has shown the potential utility of deep Gaussian processes. These deep structures are probability distributions, designed through hierarchical construction, which are conditionally Gaussian. In this paper, the current published body of work is placed in a common framework and, through recursion, several classes of deep Gaussian processes are defined. The resulting samples generated from a deep Gaussian process have a Markovian structure with respect to the depth parameter, and the effective depth of the resulting process is interpreted in terms of the ergodicity, or non-ergodicity, of the resulting Markov chain. For the classes of deep Gaussian processes introduced, we provide results concerning their ergodicity and hence their effective depth. We also demonstrate how these processes may be used for inference; in particular we show how a Metropolis-within-Gibbs construction across the levels of the hierarchy can be used to derive sampling tools which are robust to the level of resolution used to represent the functions on a computer. For illustration, we consider the effect of ergodicity in some simple numerical examples.

JOURNAL ARTICLE

Xi X, Briol F-X, Girolami M, 2018, Bayesian quadrature for multiple related integrals, 35th International Conference on Machine Learning 2018, Publisher: PMLR, Pages: 5373-5382, ISSN: 2640-3498

Bayesian probabilistic numerical methods are a set of tools providing posterior distributions on the output of numerical methods. The use of these methods is usually motivated by the fact that they can represent our uncertainty due to incomplete/finite information about the continuous mathematical problem being approximated. In this paper, we demonstrate that this paradigm can provide additional advantages, such as the possibility of transferring information between several numerical methods. This allows users to represent uncertainty in a more faithful manner and, as a by-product, provide increased numerical efficiency. We propose the first such numerical method by extending the well-known Bayesian quadrature algorithm to the case where we are interested in computing the integral of several related functions. We then prove convergence rates for the method in the well-specified and misspecified cases, and demonstrate its efficiency in the context of multi-fidelity models for complex engineering systems and a problem of global illumination in computer graphics.

CONFERENCE PAPER

Mendoza A, Roininen L, Girolami M, STATISTICAL METHODS TO ENABLE PRACTICAL ON-SITE TOMOGRAPHIC IMAGING OF WHOLE-CORE SAMPLES, SPWLA 59th Annual Symposium

CONFERENCE PAPER

Briol F-X, Oates CJ, Girolami M, Osborne MA, Sejdinovic D et al., Probabilistic Integration: A Role in Statistical Computation?, Statistical Science, ISSN: 0883-4237

A research frontier has emerged in scientific computation, wherein numerical error is regarded as a source of epistemic uncertainty that can be modelled. This raises several statistical challenges, including the design of statistical methods that enable the coherent propagation of probabilities through a (possibly deterministic) computational work-flow. This paper examines the case for probabilistic numerical methods in routine statistical computation. Our focus is on numerical integration, where a probabilistic integrator is equipped with a full distribution over its output that reflects the presence of an unknown numerical error. Our main technical contribution is to establish, for the first time, rates of posterior contraction for these methods. These show that probabilistic integrators can in principle enjoy the "best of both worlds", leveraging the sampling efficiency of Monte Carlo methods whilst providing a principled route to assess the impact of numerical error on scientific conclusions. Several substantial applications are provided for illustration and critical evaluation, including examples from statistical modelling, computer graphics and a computer model for an oil reservoir.

JOURNAL ARTICLE

Ellam L, Girolami M, Pavliotis GA, Wilson A et al., 2018, Stochastic modelling of urban structure, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, ISSN: 1364-5021

The building of mathematical and computer models of cities has a long history. The core elements are models of flows (spatial interaction) and the dynamics of structural evolution. In this article, we develop a stochastic model of urban structure to formally account for uncertainty arising from less predictable events. Standard practice has been to calibrate the spatial interaction models independently and to explore the dynamics through simulation. We present two significant results that will be transformative for both elements. First, we represent the structural variables through a single potential function and develop stochastic differential equations to model the evolution. Second, we show that the parameters of the spatial interaction model can be estimated from the structure alone, independently of flow data, using the Bayesian inferential framework. The posterior distribution is doubly intractable and poses significant computational challenges that we overcome using Markov chain Monte Carlo methods. We demonstrate our methodology with a case study on the London, UK, retail system.

JOURNAL ARTICLE

Lau D, Adams N, Girolami M, Butler L, Elshafie M et al., 2018, The role of statistics in data-centric engineering, Statistics and Probability Letters, Vol: 136, Pages: 58-62, ISSN: 0167-7152

We explore the role of statistics for Big Data analysis arising from the emerging field of Data-Centric Engineering. Using examples related to sensor-instrumented bridges, we highlight a number of issues and challenges. These are broadly categorised as relating to uncertainty, latent-structure modelling, and the synthesis of statistical models and abstract physical models. Keywords: Big Data, Data-Centric Engineering, Digital Twin, Fibre-optic sensor, Instrumented infrastructure, Statistics

JOURNAL ARTICLE

Mac Aodha O, Gibb R, Barlow KE, Browning E, Firman M, Freeman R, Harder B, Kinsey L, Mead GR, Newson SE, Pandourski I, Parsons S, Russ J, Szodoray-Paradi A, Szodoray-Paradi F, Tilova E, Girolami M, Brostow G, Jones KE et al., 2018, Bat detective - Deep learning tools for bat acoustic signal detection, PLoS Computational Biology, Vol: 14, ISSN: 1553-734X

Passive acoustic sensing has emerged as a powerful tool for quantifying anthropogenic impacts on biodiversity, especially for echolocating bat species. To better assess bat population trends there is a critical need for accurate, reliable, and open source tools that allow the detection and classification of bat calls in large collections of audio recordings. The majority of existing tools are commercial or have focused on the species classification task, neglecting the important problem of first localizing echolocation calls in audio which is particularly problematic in noisy recordings. We developed a convolutional neural network based open-source pipeline for detecting ultrasonic, full-spectrum, search-phase calls produced by echolocating bats. Our deep learning algorithms were trained on full-spectrum ultrasonic audio collected along road-transects across Europe and labelled by citizen scientists from www.batdetective.org. When compared to other existing algorithms and commercial systems, we show significantly higher detection performance of search-phase echolocation calls with our test sets. As an example application, we ran our detection pipeline on bat monitoring data collected over five years from Jersey (UK), and compared results to a widely-used commercial system. Our detection pipeline can be used for the automatic detection and monitoring of bat populations, and further facilitates their use as indicator species on a large scale. Our proposed pipeline makes only a small number of bat specific design decisions, and with appropriate training data it could be applied to detecting other species in audio. A crucial novelty of our work is showing that with careful, non-trivial, design and implementation considerations, state-of-the-art deep learning methods can be used for accurate and efficient monitoring in audio.

JOURNAL ARTICLE

Barp A, Briol F-X, Kennedy AD, Girolami M et al., 2018, Geometry and dynamics for Markov chain Monte Carlo, Annual Review of Statistics and Its Application, Vol: 5, Pages: 451-471, ISSN: 2326-8298

Markov Chain Monte Carlo methods have revolutionised mathematical computation and enabled statistical inference within many previously intractable models. In this context, Hamiltonian dynamics have been proposed as an efficient way of building chains which can explore probability densities efficiently. The method emerges from physics and geometry and these links have been extensively studied by a series of authors through the last thirty years. However, there is currently a gap between the intuitions and knowledge of users of the methodology and our deep understanding of these theoretical foundations. The aim of this review is to provide a comprehensive introduction to the geometric tools used in Hamiltonian Monte Carlo at a level accessible to statisticians, machine learners and other users of the methodology with only a basic understanding of Monte Carlo methods. This will be complemented with some discussion of the most recent advances in the field which we believe will become increasingly relevant to applied scientists.

JOURNAL ARTICLE

Lau D, Butler L, Adams N, Elshafie M, Girolami M et al., Real-time Statistical Modelling of Data Generated from Self-Sensing Bridges, Proceedings of the Institution of Civil Engineers - Civil Engineering, ISSN: 0965-089X

JOURNAL ARTICLE

Stathopoulos V, Zamora-Gutierrez V, Jones KE, Girolami M et al., 2018, Bat echolocation call identification for biodiversity monitoring: a probabilistic approach, Journal of the Royal Statistical Society Series C: Applied Statistics, Vol: 67, Pages: 165-183, ISSN: 0035-9254

Bat echolocation call identification methods are important in developing efficient cost‐effective methods for large‐scale bioacoustic surveys for global biodiversity monitoring and conservation planning. Such methods need to provide interpretable probabilistic predictions of species since they will be applied across many different taxa in a diverse set of applications and environments. We develop such a method using a multinomial probit likelihood with independent Gaussian process priors and study its feasibility on a data set from an on‐going study of 21 species, five families and 1800 bat echolocation calls collected from Mexico, a hotspot of bat biodiversity. We propose an efficient approximate inference scheme based on the expectation propagation algorithm and observe that the overall methodology significantly improves on currently adopted approaches to bat call classification by providing an approach which can be easily generalized across different species and call types and is fully probabilistic. Implementation of this method has the potential to provide robust species identification tools for biodiversity acoustic bat monitoring programmes across a range of taxa and spatial scales.

JOURNAL ARTICLE

Briol F-X, Girolami M, 2018, Bayesian Numerical Methods as a Case Study for Statistical Data Science, Conference on Statistical Data Science, Publisher: WORLD SCIENTIFIC PUBL CO PTE LTD, Pages: 99-110

CONFERENCE PAPER

Meagher JP, Damoulas T, Jones KE, Girolami M et al., 2018, Phylogenetic Gaussian Processes for Bat Echolocation, Conference on Statistical Data Science, Publisher: WORLD SCIENTIFIC PUBL CO PTE LTD, Pages: 111-124

CONFERENCE PAPER

Oates CJ, Niederer S, Lee A, Briol F-X, Girolami M et al., 2017, Probabilistic models for integration error in the assessment of functional cardiac models, Neural Information Processing Systems, Publisher: NIPS Proceedings, Pages: 110-118, ISSN: 1049-5258

This paper studies the numerical computation of integrals, representing estimates or predictions, over the output f(x) of a computational model with respect to a distribution p(dx) over uncertain inputs x to the model. For the functional cardiac models that motivate this work, neither f nor p possess a closed-form expression and evaluation of either requires ≈ 100 CPU hours, precluding standard numerical integration methods. Our proposal is to treat integration as an estimation problem, with a joint model for both the a priori unknown function f and the a priori unknown distribution p. The result is a posterior distribution over the integral that explicitly accounts for dual sources of numerical approximation error due to a severely limited computational budget. This construction is applied to account, in a statistically principled manner, for the impact of numerical errors that (at present) are confounding factors in functional cardiac model assessment.

CONFERENCE PAPER

Betancourt M, Byrne S, Livingstone S, Girolami M et al., 2017, The geometric foundations of Hamiltonian Monte Carlo, Bernoulli, Vol: 23, Pages: 2257-2298, ISSN: 1350-7265

Although Hamiltonian Monte Carlo has proven an empirical success, the lack of a rigorous theoretical understanding of the algorithm has in many ways impeded both principled developments of the method and use of the algorithm in practice. In this paper, we develop the formal foundations of the algorithm through the construction of measures on smooth manifolds, and demonstrate how the theory naturally identifies efficient implementations and motivates promising generalizations.

JOURNAL ARTICLE

Oates CJ, Niederer S, Lee A, Briol F-X, Girolami M et al., Probabilistic Models for Integration Error in the Assessment of Functional Cardiac Models, Advances in Neural Information Processing Systems (NIPS), Pages: 109-117

This paper studies the numerical computation of integrals, representing estimates or predictions, over the output $f(x)$ of a computational model with respect to a distribution $p(\mathrm{d}x)$ over uncertain inputs $x$ to the model. For the functional cardiac models that motivate this work, neither $f$ nor $p$ possess a closed-form expression and evaluation of either requires $\approx$ 100 CPU hours, precluding standard numerical integration methods. Our proposal is to treat integration as an estimation problem, with a joint model for both the a priori unknown function $f$ and the a priori unknown distribution $p$. The result is a posterior distribution over the integral that explicitly accounts for dual sources of numerical approximation error due to a severely limited computational budget. This construction is applied to account, in a statistically principled manner, for the impact of numerical errors that (at present) are confounding factors in functional cardiac model assessment.

CONFERENCE PAPER

Ellam L, Murray I, Girolami M, Strathmann H et al., 2017, A determinant-free method to simulate the parameters of large Gaussian fields, Stat

JOURNAL ARTICLE

Conrad PR, Girolami M, Sarkka S, Stuart A, Zygalakis K et al., 2017, Statistical analysis of differential equations: introducing probability measures on numerical solutions, Statistics and Computing, Vol: 27, Pages: 1065-1082, ISSN: 0960-3174

In this paper, we present a formal quantification of uncertainty induced by numerical solutions of ordinary and partial differential equation models. Numerical solutions of differential equations contain inherent uncertainties due to the finite-dimensional approximation of an unknown and implicitly defined function. When statistically analysing models based on differential equations describing physical, or other naturally occurring, phenomena, it can be important to explicitly account for the uncertainty introduced by the numerical method. Doing so enables objective determination of this source of uncertainty, relative to other uncertainties, such as those caused by data contaminated with noise or model error induced by missing physical or inadequate descriptors. As ever larger scale mathematical models are being used in the sciences, often sacrificing complete resolution of the differential equation on the grids used, formally accounting for the uncertainty in the numerical method is becoming increasingly more important. This paper provides the formal means to incorporate this uncertainty in a statistical model and its subsequent analysis. We show that a wide variety of existing solvers can be randomised, inducing a probability measure over the solutions of such differential equations. These measures exhibit contraction to a Dirac measure around the true unknown solution, where the rates of convergence are consistent with the underlying deterministic numerical method. Furthermore, we employ the method of modified equations to demonstrate enhanced rates of convergence to stochastic perturbations of the original deterministic problem. Ordinary differential equations and elliptic partial differential equations are used to illustrate the approach to quantify uncertainty in both the statistical analysis of the forward and inverse problems.

JOURNAL ARTICLE

Cockayne J, Oates C, Sullivan T, Girolami M et al., 2017, Probabilistic numerical methods for PDE-constrained Bayesian inverse problems, Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Publisher: AIP Publishing, ISSN: 1551-7616

This paper develops meshless methods for probabilistically describing discretisation error in the numerical solution of partial differential equations. This construction enables the solution of Bayesian inverse problems while accounting for the impact of the discretisation of the forward problem. In particular, this drives statistical inferences to be more conservative in the presence of significant solver error. Theoretical results are presented describing rates of convergence for the posteriors in both the forward and inverse problems. This method is tested on a challenging inverse problem with a nonlinear forward model.

CONFERENCE PAPER

Oates CJ, Girolami M, Chopin N, 2017, Control functionals for Monte Carlo integration, Journal of the Royal Statistical Society Series B: Statistical Methodology, Vol: 79, Pages: 695-718, ISSN: 1369-7412

A non‐parametric extension of control variates is presented. These leverage gradient information on the sampling density to achieve substantial variance reduction. It is not required that the sampling density be normalized. The novel contribution of this work is based on two important insights: a trade‐off between random sampling and deterministic approximation and a new gradient‐based function space derived from Stein's identity. Unlike classical control variates, our estimators improve rates of convergence, often requiring orders of magnitude fewer simulations to achieve a fixed level of precision. Theoretical and empirical results are presented, the latter focusing on integration problems arising in hierarchical models and models based on non‐linear ordinary differential equations.

JOURNAL ARTICLE

Beskos A, Girolami M, Lan S, Farrell PE, Stuart AM et al., 2017, Geometric MCMC for infinite-dimensional inverse problems, Journal of Computational Physics, Vol: 335, Pages: 327-351, ISSN: 0021-9991

Bayesian inverse problems often involve sampling posterior distributions on infinite-dimensional function spaces. Traditional Markov chain Monte Carlo (MCMC) algorithms are characterized by deteriorating mixing times upon mesh-refinement, when the finite-dimensional approximations become more accurate. Such methods are typically forced to reduce step-sizes as the discretization gets finer, and thus are expensive as a function of dimension. Recently, a new class of MCMC methods with mesh-independent convergence times has emerged. However, few of them take into account the geometry of the posterior informed by the data. At the same time, recently developed geometric MCMC algorithms have been found to be powerful in exploring complicated distributions that deviate significantly from elliptic Gaussian laws, but are in general computationally intractable for models defined in infinite dimensions. In this work, we combine geometric methods on a finite-dimensional subspace with mesh-independent infinite-dimensional approaches. Our objective is to speed up MCMC mixing times, without significantly increasing the computational cost per step (for instance, in comparison with the vanilla preconditioned Crank–Nicolson (pCN) method). This is achieved by using ideas from geometric MCMC to probe the complex structure of an intrinsic finite-dimensional subspace where most data information concentrates, while retaining robust mixing times as the dimension grows by using pCN-like methods in the complementary subspace. The resulting algorithms are demonstrated in the context of three challenging inverse problems arising in subsurface flow, heat conduction and incompressible flow control. The algorithms exhibit up to two orders of magnitude improvement in sampling efficiency when compared with the pCN method.

JOURNAL ARTICLE

Briol F-X, Oates CJ, Cockayne J, Chen WY, Girolami M et al., On the Sampling Problem for Kernel Quadrature, International Conference on Machine Learning (ICML), Publisher: PMLR, Pages: 586-595

The standard Kernel Quadrature method for numerical integration with random point sets (also called Bayesian Monte Carlo) is known to converge in root mean square error at a rate determined by the ratio $s/d$, where $s$ and $d$ encode the smoothness and dimension of the integrand. However, an empirical investigation reveals that the rate constant $C$ is highly sensitive to the distribution of the random points. In contrast to standard Monte Carlo integration, for which optimal importance sampling is well-understood, the sampling distribution that minimises $C$ for Kernel Quadrature does not admit a closed form. This paper argues that the practical choice of sampling distribution is an important open problem. One solution is considered; a novel automatic approach based on adaptive tempering and sequential Monte Carlo. Empirical results demonstrate a dramatic reduction in integration error of up to 4 orders of magnitude can be achieved with the proposed method.

CONFERENCE PAPER

Jensen K, Soguero-Ruiz C, Mikalsen KO, Lindsetmo R-O, Kouskoumvekaki I, Girolami M, Skrovseth SO, Augestad KM et al., 2017, Analysis of free text in electronic health records for identification of cancer patient trajectories, Scientific Reports, Vol: 7, ISSN: 2045-2322

With an aging patient population and increasing complexity in patient disease trajectories, physicians are often met with complex patient histories from which clinical decisions must be made. Due to the increasing rate of adverse events and hospitals facing financial penalties for readmission, there has never been a greater need to enforce evidence-led medical decision-making using available health care data. In the present work, we studied a cohort of 7,741 patients, of whom 4,080 were diagnosed with cancer, surgically treated at a University Hospital in the years 2004–2012. We have developed a methodology that allows disease trajectories of the cancer patients to be estimated from free text in electronic health records (EHRs). By using these disease trajectories, we predict 80% of patient events ahead in time. By control of confounders from 8326 quantified events, we identified 557 events that constitute high subsequent risks (risk > 20%), including six events for cancer and seven events for metastasis. We believe that the presented methodology and findings could be used to improve clinical decision support and personalize trajectories, thereby decreasing adverse events and optimizing cancer treatment.

JOURNAL ARTICLE

Ellam L, Zabaras N, Girolami M, 2016, A Bayesian approach to multiscale inverse problems with on-the-fly scale determination, Journal of Computational Physics, Vol: 326, Pages: 115-140, ISSN: 0021-9991

A Bayesian computational approach is presented to provide a multi-resolution estimate of an unknown spatially varying parameter from indirect measurement data. In particular, we are interested in spatially varying parameters with multiscale characteristics. In our work, we consider the challenge of not knowing the characteristic length scale(s) of the unknown a priori, and present an algorithm for on-the-fly scale determination. Our approach is based on representing the spatial field with a wavelet expansion. Wavelet basis functions are hierarchically structured, localized in both spatial and frequency domains and tend to provide sparse representations in that a large number of wavelet coefficients are approximately zero. For these reasons, wavelet bases are suitable for representing permeability fields with non-trivial correlation structures. Moreover, the intra-scale correlations between wavelet coefficients form a quadtree, and this structure is exploited to identify additional basis functions to refine the model. Bayesian inference is performed using a sequential Monte Carlo (SMC) sampler with a Markov Chain Monte Carlo (MCMC) transition kernel. The SMC sampler is used to move between posterior densities defined on different scales, thereby providing a computationally efficient method for adaptive refinement of the wavelet representation. We gain insight from the marginal likelihoods, by computing Bayes factors, for model comparison and model selection. The marginal likelihoods provide a termination criterion for our scale determination algorithm. The Bayesian computational approach is rather general and applicable to several inverse problems concerning the estimation of a spatially varying parameter. The approach is demonstrated with permeability estimation for groundwater flow using pressure sensor measurements.

JOURNAL ARTICLE

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
