# Dr Nikolas Kantas

Faculty of Natural SciencesDepartment of Mathematics

Senior Lecturer

//

### Contact

+44 (0)20 7594 2772n.kantas

//

### Location

538Huxley BuildingSouth Kensington Campus

//

## Publications

Publication Type
Year
to

38 results found

Ruzayqat H, Beskos A, Crisan D, Jasra A, Kantas Net al., 2023, Unbiased estimation using a class of diffusion processes, Journal of Computational Physics, Vol: 472, ISSN: 0021-9991

We study the problem of unbiased estimation of expectations with respect to(w.r.t.) $\pi$ a given, general probability measure on$(\mathbb{R}^d,\mathcal{B}(\mathbb{R}^d))$ that is absolutely continuous withrespect to a standard Gaussian measure. We focus on simulation associated to aparticular class of diffusion processes, sometimes termed theSchr\"odinger-F\"ollmer Sampler, which is a simulation technique thatapproximates the law of a particular diffusion bridge process $\{X_t\}_{t\in[0,1]}$ on $\mathbb{R}^d$, $d\in \mathbb{N}_0$. This latter process isconstructed such that, starting at $X_0=0$, one has $X_1\sim \pi$. Typically,the drift of the diffusion is intractable and, even if it were not, exactsampling of the associated diffusion is not possible. As a result,\cite{sf_orig,jiao} consider a stochastic Euler-Maruyama scheme that allows thedevelopment of biased estimators for expectations w.r.t.~$\pi$. We show thatfor this methodology to achieve a mean square error of$\mathcal{O}(\epsilon^2)$, for arbitrary $\epsilon>0$, the associated cost is$\mathcal{O}(\epsilon^{-5})$. We then introduce an alternative approach thatprovides unbiased estimates of expectations w.r.t.~$\pi$, that is, it does notsuffer from the time discretization bias or the bias related with theapproximation of the drift function. We prove that to achieve a mean squareerror of $\mathcal{O}(\epsilon^2)$, the associated cost is, with highprobability, $\mathcal{O}(\epsilon^{-2}|\log(\epsilon)|^{2+\delta})$, for any$\delta>0$. We implement our method on several examples including Bayesianinverse problems.

Journal article

Tu B, Gandy A, Kantas N, Shafei Bet al., 2022, Joint entropy search for multi-objective bayesian optimization, Advances in neural information processing systems, ISSN: 1049-5258

Many real-world problems can be phrased as a multi-objective optimizationproblem, where the goal is to identify the best set of compromises between thecompeting objectives. Multi-objective Bayesian optimization (BO) is a sampleefficient strategy that can be deployed to solve these vector-valuedoptimization problems where access is limited to a number of noisy objectivefunction evaluations. In this paper, we propose a novel information-theoreticacquisition function for BO called Joint Entropy Search (JES), which considersthe joint information gain for the optimal set of inputs and outputs. Wepresent several analytical approximations to the JES acquisition function andalso introduce an extension to the batch setting. We showcase the effectivenessof this new approach on a range of synthetic and real-world problems in termsof the hypervolume and its weighted variants.

Journal article

Borovykh A, Kantas N, Parpas P, Pavliotis GAet al., 2022, Stochastic Mirror Descent for Convex Optimization with Consensus Constraints

The mirror descent algorithm is known to be effective in applications whereit is beneficial to adapt the mirror map to the underlying geometry of theoptimization model. However, the effect of mirror maps on the geometry ofdistributed optimization problems has not been previously addressed. In thispaper we propose and study exact distributed mirror descent algorithms incontinuous-time under additive noise and present the settings that enablelinear convergence rates. Our analysis draws motivation from the augmentedLagrangian and its relation to gradient tracking. To further explore thebenefits of mirror maps in a distributed setting we present a preconditionedvariant of our algorithm with an additional mirror map over the Lagrangian dualvariables. This allows our method to adapt to the geometry of the consensusmanifold and leads to faster convergence. We illustrate the performance of thealgorithms in convex settings both with and without constraints. We alsoexplore their performance numerically in a non-convex application with neuralnetworks.

Journal article

Chak M, Kantas N, Lelièvre T, Pavliotis GAet al., 2021, Optimal friction matrix for underdamped Langevin sampling, Publisher: ArXiv

A systematic procedure for optimising the friction coefficient in underdampedLangevin dynamics as a sampling tool is given by taking the gradient of theassociated asymptotic variance with respect to friction. We give an expressionfor this gradient in terms of the solution to an appropriate Poisson equationand show that it can be approximated by short simulations of the associatedfirst variation/tangent process under concavity assumptions on the log density.Our algorithm is applied to the estimation of posterior means in Bayesianinference problems and reduced variance is demonstrated when compared to theoriginal underdamped and overdamped Langevin dynamics in both full andstochastic gradient cases.

Working paper

Ruzayqat H, Er-Raiy A, Beskos A, Crisan D, Jasra A, Kantas Net al., 2021, A lagged particle filter for stable filtering of certain high-dimensional state-space models, Publisher: ArXiv

We consider the problem of high-dimensional filtering of state-space models(SSMs) at discrete times. This problem is particularly challenging asanalytical solutions are typically not available and many numericalapproximation methods can have a cost that scales exponentially with thedimension of the hidden state. Inspired by lag-approximation methods for thesmoothing problem, we introduce a lagged approximation of the smoothingdistribution that is necessarily biased. For certain classes of SSMs,particularly those that forget the initial condition exponentially fast intime, the bias of our approximation is shown to be uniformly controlled in thedimension and exponentially small in time. We develop a sequential Monte Carlo(SMC) method to recursively estimate expectations with respect to our biasedfiltering distributions. Moreover, we prove for a class of class of SSMs thatcan contain dependencies amongst coordinates that as the dimension$d\rightarrow\infty$ the cost to achieve a stable mean square error inestimation, for classes of expectations, is of $\mathcal{O}(Nd^2)$ per-unittime, where $N$ is the number of simulated samples in the SMC algorithm. Ourmethodology is implemented on several challenging high-dimensional examplesincluding the conservative shallow-water model.

Working paper

Sharrock L, Kantas N, Parpas P, Pavliotis GAet al., 2021, Paramater estimation for the McKean-Vlasov stochastic differential equation

We consider the problem of parameter estimation for a stochasticMcKean-Vlasov equation, and the associated system of weakly interactingparticles. We first establish consistency and asymptotic normality of theoffline maximum likelihood estimator for the interacting particle system in thelimit as the number of particles $N\rightarrow\infty$. We then propose anonline estimator for the parameters of the McKean-Vlasov SDE, which evolvesaccording to a continuous-time stochastic gradient descent algorithm on theasymptotic log-likelihood of the interacting particle system. We prove thatthis estimator converges in $\mathbb{L}^1$ to the stationary points of theasymptotic log-likelihood of the McKean-Vlasov SDE in the joint limit as$N\rightarrow\infty$ and $t\rightarrow\infty$, under suitable assumptions whichguarantee ergodicity and uniform-in-time propagation of chaos. We thendemonstrate, under the additional assumption of global strong concavity, thatour estimator converges in $\mathbb{L}^2$ to the unique maximiser of thisasymptotic log-likelihood function, and establish an $\mathbb{L}^2$ convergencerate. We also obtain analogous results under the assumption that, rather thanobserving multiple trajectories of the interacting particle system, we insteadobserve multiple independent replicates of the McKean-Vlasov SDE itself or,less realistically, a single sample path of the McKean-Vlasov SDE and its law.Our theoretical results are demonstrated via two numerical examples, a linearmean field model and a stochastic opinion dynamics model.

Working paper

Beskos A, Crisan D, Jasra A, Kantas N, Ruzayqat Het al., 2021, Score-based parameter estimation for a class of continuous-time state space models, SIAM Journal on Scientific Computing, ISSN: 1064-8275

We consider the problem of parameter estimation for a class ofcontinuous-time state space models. In particular, we explore the case of apartially observed diffusion, with data also arriving according to a diffusionprocess. Based upon a standard identity of the score function, we consider twoparticle filter based methodologies to estimate the score function. Bothmethods rely on an online estimation algorithm for the score function of$\mathcal{O}(N^2)$ cost, with $N\in\mathbb{N}$ the number of particles. Thefirst approach employs a simple Euler discretization and standard particlesmoothers and is of cost $\mathcal{O}(N^2 + N\Delta_l^{-1})$ per unit time,where $\Delta_l=2^{-l}$, $l\in\mathbb{N}_0$, is the time-discretization step.The second approach is new and based upon a novel diffusion bridgeconstruction. It yields a new backward type Feynman-Kac formula incontinuous-time for the score function and is presented along with a particlemethod for its approximation. Considering a time-discretization, the cost is$\mathcal{O}(N^2\Delta_l^{-1})$ per unit time. To improve computational costs,we then consider multilevel methodologies for the score function. We illustrateour parameter estimation method via stochastic gradient approaches in severalnumerical examples.

Journal article

Knock ES, Whittles LK, Lees JA, Perez-Guzman PN, Verity R, FitzJohn RG, Gaythorpe KAM, Imai N, Hinsley W, Okell LC, Rosello A, Kantas N, Walters CE, Bhatia S, Watson OJ, Whittaker C, Cattarino L, Boonyasiri A, Djaafara BA, Fraser K, Fu H, Wang H, Xi X, Donnelly CA, Jauneikaite E, Laydon DJ, White PJ, Ghani AC, Ferguson NM, Cori A, Baguelin Met al., 2021, Key epidemiological drivers and impact of interventions in the 2020 SARS-CoV-2 epidemic in England, Science Translational Medicine, Vol: 13, Pages: 1-12, ISSN: 1946-6234

We fitted a model of SARS-CoV-2 transmission in care homes and the community to regional surveillance data for England. Compared with other approaches, our model provides a synthesis of multiple surveillance data streams into a single coherent modelling framework allowing transmission and severity to be disentangled from features of the surveillance system. Of the control measures implemented, only national lockdown brought the reproduction number (Rteff ) below 1 consistently; if introduced one week earlier it could have reduced deaths in the first wave from an estimated 48,600 to 25,600 (95% credible interval [95%CrI]: 15,900-38,400). The infection fatality ratio decreased from 1.00% (95%CrI: 0.85%-1.21%) to 0.79% (95%CrI: 0.63%-0.99%), suggesting improved clinical care. The infection fatality ratio was higher in the elderly residing in care homes (23.3%, 95%CrI: 14.7%-35.2%) than those residing in the community (7.9%, 95%CrI: 5.9%-10.3%). On 2nd December 2020 England was still far from herd immunity, with regional cumulative infection incidence between 7.6% (95%CrI: 5.4%-10.2%) and 22.3% (95%CrI: 19.4%-25.4%) of the population. Therefore, any vaccination campaign will need to achieve high coverage and a high degree of protection in vaccinated individuals to allow non-pharmaceutical interventions to be lifted without a resurgence of transmission.

Journal article

Borovykh A, Kantas N, Parpas P, Pavliotis GAet al., 2021, On stochastic mirror descent with interacting particles: Convergence properties and variance reduction, Physica D: Nonlinear Phenomena, Vol: 418, Pages: 1-21, ISSN: 0167-2789

An open problem in optimization with noisy information is the computation of an exact minimizer that is independent of the amount of noise. A standard practice in stochastic approximation algorithms is to use a decreasing step-size. This however leads to a slower convergence. A second alternative is to use a fixed step-size and run independent replicas of the algorithm and average these. A third option is to run replicas of the algorithm and allow them to interact. It is unclear which of these options works best. To address this question, we reduce the problem of the computation of an exact minimizer with noisy gradient information to the study of stochastic mirror descent with interacting particles. We study the convergence of stochastic mirror descent and make explicit the tradeoffs between communication and variance reduction. We provide theoretical and numerical evidence to suggest that interaction helps to improve convergence and reduce the variance of the estimate.

Journal article

Knock E, Whittles L, Lees J, Perez Guzman P, Verity R, Fitzjohn R, Gaythorpe K, Imai N, Hinsley W, Okell L, Rosello A, Kantas N, Walters C, Bhatia S, Watson O, Whittaker C, Cattarino L, Boonyasiri A, Djaafara A, Fraser K, Fu H, Wang H, Xi X, Donnelly C, Jauneikaite E, Laydon D, White P, Ghani A, Ferguson N, Cori A, Baguelin Met al., 2020, Report 41: The 2020 SARS-CoV-2 epidemic in England: key epidemiological drivers and impact of interventions

England has been severely affected by COVID-19. We fitted a model of SARS-CoV-2 transmission in care homes and the community to regional 2020 surveillance data. Only national lockdown brought the reproduction number below 1 consistently; introduced one week earlier in the first wave it could have reduced mortality by 23,300 deaths on average. The mean infection fatality ratio was initially ~1.3% across all regions except London and halved following clinical care improvements. The infection fatality ratio was two-fold lower throughout in London, even when adjusting for demographics. The infection fatality ratio in care homes was 2.5-times that in the elderly in the community. Population-level infection-induced immunity in England is still far from herd immunity, with regional mean cumulative attack rates ranging between 4.4% and 15.8%.

Report

Kantas N, 2020, Data-driven computational methods: parameter and operator estimations., SIAM Review, Vol: 62, Pages: 992-993, ISSN: 0036-1445

Journal article

Sharrock L, Kantas N, 2020, Two-timescale stochastic gradient descent in continuous time with applications to joint online parameter estimation and optimal sensorplacement, Publisher: arXiv

In this paper, we establish the almost sure convergence of two-timescalestochastic gradient descent algorithms in continuous time under general noiseand stability conditions, extending well known results in discrete time. Weanalyse algorithms with additive noise and those with non-additive noise. Inthe non-additive case, our analysis is carried out under the assumption thatthe noise is a continuous-time Markov process, controlled by the algorithmstates. The algorithms we consider can be applied to a broad class of bileveloptimisation problems. We study one such problem in detail, namely, the problemof joint online parameter estimation and optimal sensor placement for apartially observed diffusion process. We demonstrate how this can be formulatedas a bilevel optimisation problem, and propose a solution in the form of acontinuous-time, two-timescale, stochastic gradient descent algorithm.Furthermore, under suitable conditions on the latent signal, the filter, andthe filter derivatives, we establish almost sure convergence of the onlineparameter estimates and optimal sensor placements to the stationary points ofthe asymptotic log-likelihood and asymptotic filter covariance, respectively.We also provide numerical examples, illustrating the application of theproposed methodology to a partially observed Bene\v{s} equation, and apartially observed stochastic advection-diffusion equation.

Working paper

Sharrock L, Kantas N, 2020, Joint online parameter estimation and optimal sensor placement for the partially observed stochastic advection-diffusion equation, Publisher: arXiv

In this paper, we consider the problem of jointly performing online parameterestimation and optimal sensor placement for a partially observed infinitedimensional linear diffusion process. We present a novel solution to thisproblem in the form of a continuous-time, two-timescale stochastic gradientdescent algorithm, which recursively seeks to maximise the log-likelihood withrespect to the unknown model parameters, and to minimise the expected meansquared error of the hidden state estimate with respect to the sensorlocations. We also provide extensive numerical results illustrating theperformance of the proposed approach in the case that the hidden signal isgoverned by the two-dimensional stochastic advection-diffusion equation.

Working paper

Marowka M, Peters GW, Kantas N, Guillaume Bet al., 2020, Factor augmented bayesian cointegration model: a case study on the soybean crush spread, Journal of the Royal Statistical Society Series C: Applied Statistics, Vol: 69, Pages: 483-500, ISSN: 0035-9254

In this paper we investigate how vector autoregressive (VAR) models canbe used to study the soybean crush spread. By crush spread we mean a time se-ries marking the difference between a weighted combination of the value of soymealand soyoil to the value of the original soybeans. Commodity industry practitionersoften use fixed prescribed values for these weights, which do not take into accountany time varying effects or any financial market based dynamic features that can bediscerned from futures price data. In this work we address this issue by proposing anappropriate time series model with cointegration. Our model consists of an extensionof a particular VAR model used widely in econometrics. Our extensions are inspiredby the problem at hand and allow for a time varying covariance structure and a timevarying intercept to account for seasonality. To perform Bayesian inference we designan efficient Markov Chain Monte Carlo algorithm, which is based on the approach ofKoop et al. [2009]. Our investigations on prices obtained from futures contracts dataconfirmed that the added features in our model are useful in reliable statistical de-termination of the crush spread. Although the interest here is on the soybean crushspread, our approach is applicable also to other tradable spreads such as oil andenergy based crack or spark.

Journal article

Chak M, Kantas N, Pavliotis GA, 2020, On the generalised Langevin equation for simulated annealing, Publisher: arXiv

In this paper, we consider the generalised (higher order) Langevin equationfor the purpose of simulated annealing and optimisation of nonconvex functions.Our approach modifies the underdamped Langevin equation by replacing theBrownian noise with an appropriate Ornstein-Uhlenbeck process to account formemory in the system. Under reasonable conditions on the loss function and theannealing schedule, we establish convergence of the continuous time dynamics toa global minimum. In addition, we investigate the performance numerically andshow better performance and higher exploration of the state space compared tothe underdamped and overdamped Langevin dynamics with the same annealingschedule.

Working paper

Lu X, Adams N, Kantas N, 2019, On adaptive estimation for dynamic Bernoulli bandits, Foundations of Data Science, Vol: 1, Pages: 197-225, ISSN: 2639-8001

The multi-armed bandit (MAB) problem is a classic example of theexploration-exploitation dilemma. It is concerned with maximising the totalrewards for a gambler by sequentially pulling an arm from a multi-armed slotmachine where each arm is associated with a reward distribution. In staticMABs, the reward distributions do not change over time, while in dynamic MABs,each arm's reward distribution can change, and the optimal arm can switch overtime. Motivated by many real applications where rewards are binary, we focus ondynamic Bernoulli bandits. Standard methods like $\epsilon$-Greedy and UpperConfidence Bound (UCB), which rely on the sample mean estimator, often fail totrack changes in the underlying reward for dynamic problems. In this paper, weovercome the shortcoming of slow response to change by deploying adaptiveestimation in the standard methods and propose a new family of algorithms,which are adaptive versions of $\epsilon$-Greedy, UCB, and Thompson sampling.These new methods are simple and easy to implement. Moreover, they do notrequire any prior knowledge about the dynamic reward process, which isimportant for real applications. We examine the new algorithms numerically indifferent scenarios and the results show solid improvements of our algorithmsin dynamic environments.

Journal article

Kantas N, Parpas P, Pavliotis GA, 2019, The sharp, the flat and the shallow: Can weakly interacting agents learn to escape bad minima?

An open problem in machine learning is whether flat minima generalize betterand how to compute such minima efficiently. This is a very challenging problem.As a first step towards understanding this question we formalize it as anoptimization problem with weakly interacting agents. We review appropriatebackground material from the theory of stochastic processes and provideinsights that are relevant to practitioners. We propose an algorithmicframework for an extended stochastic gradient Langevin dynamics and illustrateits potential. The paper is written as a tutorial, and presents an alternativeuse of multi-agent learning. Our primary focus is on the design of algorithmsfor machine learning applications; however the underlying mathematicalframework is suitable for the understanding of large scale systems of agentbased models that are popular in the social sciences, economics and finance.

Conference paper

Llopis FP, Kantas N, Beskos A, Jasra Aet al., 2018, Particle filtering for stochastic Navier-Stokes signal observed with linear additive noise, SIAM Journal on Scientific Computing, Vol: 40, Pages: A1544-A1565, ISSN: 1064-8275

We consider a non-linear filtering problem, whereby the signal obeys thestochastic Navier-Stokes equations and is observed through a linear mappingwith additive noise. The setup is relevant to data assimilation for numericalweather prediction and climate modelling, where similar models are used forunknown ocean or wind velocities. We present a particle filtering methodologythat uses likelihood informed importance proposals, adaptive tempering, and asmall number of appropriate Markov Chain Monte Carlo steps. We provide adetailed design for each of these steps and show in our numerical examples thatthey are all crucial in terms of achieving good performance and efficiency.

Journal article

Marowka M, Peters GW, Kantas N, Bagnarosa Get al., 2017, Some Recent Developments in Markov Chain Monte Carlo for Cointegrated Time Series, ESAIM : Proceedings, Vol: 59, Pages: 76-103, ISSN: 1270-900X

We consider multivariate time series that exhibit reduced rank cointegration, which means a lower dimensional linear projection of the process becomes stationary. We will review recent suitable Markov Chain Monte Carlo approaches for Bayesian inference such as the Gibbs sampler of [41] and the Geodesic Hamiltonian Monte Carlo method of [3]. Then we will propose extensions that can allow the ideas in both methods to be applied for cointegrated time series with non-Gaussian noise. We illustrate the efficiency and accuracy of these extensions using appropriate numerical experiments.

Journal article

Whiteley N, Kantas N, 2017, Calculating principal eigen-functions of non-negative integral kernels: particle approximations and applications, Mathematics of Operations Research, Vol: 42, Pages: 1007-1034, ISSN: 1526-5471

Often in applications such as rare events estimation or optimal control it isrequired that one calculates the principal eigen-function and eigen-value of anon-negative integral kernel. Except in the finite-dimensional case, usuallyneither the principal eigen-function nor the eigen-value can be computedexactly. In this paper, we develop numerical approximations for thesequantities. We show how a generic interacting particle algorithm can be used todeliver numerical approximations of the eigen-quantities and the associatedso-called "twisted" Markov kernel as well as how these approximations arerelevant to the aforementioned applications. In addition, we study a collectionof random integral operators underlying the algorithm, address some of theirmean and path-wise properties, and obtain $L_{r}$ error estimates. Finally,numerical examples are provided in the context of importance sampling forcomputing tail probabilities of Markov chains and computing value functions fora class of stochastic optimal control problems.

Journal article

Beskos A, Jasra A, Kantas N, Thiery Aet al., 2016, On the convergence of adaptive sequential Monte Carlo methods, Annals of Applied Probability, Vol: 26, Pages: 1111-1146, ISSN: 1050-5164

In several implementations of Sequential Monte Carlo (SMC) methods it is natural and important, in terms of algorithmic efficiency, to exploit the information of the history of the samples to optimally tune their subsequent propagations. In this article we provide a carefully formulated asymptotic theory for a class of such adaptive SMC methods. The theoretical framework developed here will cover, under assumptions, several commonly used SMC algorithms [Chopin, Biometrika 89 (2002) 539–551; Jasra et al., Scand. J. Stat. 38 (2011) 1–22; Schäfer and Chopin, Stat. Comput. 23 (2013) 163–184]. There are only limited results about the theoretical underpinning of such adaptive methods: we will bridge this gap by providing a weak law of large numbers (WLLN) and a central limit theorem (CLT) for some of these algorithms. The latter seems to be the first result of its kind in the literature and provides a formal justification of algorithms used in many real data contexts [Jasra et al. (2011); Schäfer and Chopin (2013)]. We establish that for a general class of adaptive SMC algorithms [Chopin (2002)], the asymptotic variance of the estimators from the adaptive SMC method is identical to a “limiting” SMC algorithm which uses ideal proposal kernels. Our results are supported by application on a complex high-dimensional posterior distribution associated with the Navier–Stokes model, where adapting high-dimensional parameters of the proposal kernels is critical for the efficiency of the algorithm.

Journal article

Kantas N, Doucet A, Singh SS, Maciejowski J, Chopin Net al., 2015, On Particle Methods for Parameter Estimation in State-Space Models, Statistical Science, Vol: 30, Pages: 328-351, ISSN: 0883-4237

Nonlinear non-Gaussian state-space models are ubiquitousin statistics, econometrics, information engineering and signal processing.Particle methods, also known as Sequential Monte Carlo (SMC)methods, provide reliable numerical approximations to the associatedstate inference problems. However, in most applications, the state-spacemodel of interest also depends on unknown static parameters that needto be estimated from the data. In this context, standard particle methodsfail and it is necessary to rely on more sophisticated algorithms.The aim of this paper is to present a comprehensive review of particlemethods that have been proposed to perform static parameter estimationin state-space models. We discuss the advantages and limitationsof these methods and illustrate their performance on simple models.

Journal article

Kantas N, Beskos A, Jasra A, 2014, Sequential Monte Carlo methods for high-dimensional inverse problems: a case study for the Navier-Stokes equations, SIAM/ASA Journal on Uncertainty Quantification, Vol: 2, Pages: 464-489, ISSN: 2166-2525

We consider the inverse problem of estimating the initial condition of apartial differential equation, which is only observed through noisymeasurements at discrete time intervals. In particular, we focus on the casewhere Eulerian measurements are obtained from the time and space evolvingvector field, whose evolution obeys the two-dimensional Navier-Stokes equationsdefined on a torus. This context is particularly relevant to the area ofnumerical weather forecasting and data assimilation. We will adopt a Bayesianformulation resulting from a particular regularization that ensures the problemis well posed. In the context of Monte Carlo based inference, it is achallenging task to obtain samples from the resulting high dimensionalposterior on the initial condition. In real data assimilation applications itis common for computational methods to invoke the use of heuristics andGaussian approximations. The resulting inferences are biased and notwell-justified in the presence of non-linear dynamics and observations. On theother hand, Monte Carlo methods can be used to assimilate data in a principledmanner, but are often perceived as inefficient in this context due to thehigh-dimensionality of the problem. In this work we will propose a genericSequential Monte Carlo (SMC) sampling approach for high dimensional inverseproblems that overcomes these difficulties. The method builds upon Markov chainMonte Carlo (MCMC) techniques, which are currently considered as benchmarks forevaluating data assimilation algorithms used in practice. In our numericalexamples, the proposed SMC approach achieves the same accuracy as MCMC but in amuch more efficient manner.

Journal article

Jasra A, Kantas N, Ehrlich E, 2014, Approximate inference for observation-driven time series models with intractable likelihoods, ACM Transactions on Modeling and Computer Simulation, Vol: 24, ISSN: 1558-1195

In this article, we consider approximate Bayesian parameter inference for observation-driven time series models. Such statistical models appear in a wide variety of applications, including econometrics and applied mathematics. This article considers the scenario where the likelihood function cannot be evaluated pointwise; in such cases, one cannot perform exact statistical inference, including parameter estimation, which often requires advanced computational algorithms, such as Markov Chain Monte Carlo (MCMC). We introduce a new approximation based upon Approximate Bayesian Computation (ABC). Under some conditions, we show that as n → ∞, with n the length of the time series, the ABC posterior has, almost surely, a Maximum A Posteriori (MAP) estimator of the parameters that is often different from the true parameter. However, a noisy ABC MAP, which perturbs the original data, asymptotically converges to the true parameter, almost surely. In order to draw statistical inference, for the ABC approximation adopted, standard MCMC algorithms can have acceptance probabilities that fall at an exponential rate in n and slightly more advanced algorithms can mix poorly. We develop a new and improved MCMC kernel, which is based upon an exact approximation of a marginal algorithm, whose cost per iteration is random, but the expected cost, for good performance, is shown to be O(n2) per iteration. We implement our new MCMC kernel for parameter inference from models in econometrics.

Journal article

Ehrlich E, Jasra A, Kantas N, 2013, Gradient free parameter estimation for hidden Markov models with intractable likelihoods, Methodology and Computing in Applied Probability, Vol: 17, Pages: 315-349, ISSN: 1573-7713

In this article we focus on Maximum Likelihood estimation (MLE) for the static model parameters of hidden Markov models (HMMs). We will consider the case where one cannot or does not want to compute the conditional likelihooddensity of the observation given the hidden state because of increased computationalcomplexity or analytical intractability. Instead we will assume that one may obtainsamples from this conditional likelihood and hence use approximate Bayesiancomputation (ABC) approximations of the original HMM. Although these ABCapproximations will induce a bias, this can be controlled to arbitrary precision viaa positive parameter , so that the bias decreases with decreasing . We first establishthat when using an ABC approximation of the HMM for a fixed batch of data,then the bias of the resulting log- marginal likelihood and its gradient is no worsethan O(n), where n is the total number of data-points. Therefore, when usinggradient methods to perform MLE for the ABC approximation of the HMM, onemay expect parameter estimates of reasonable accuracy. To compute an estimate ofthe unknown and fixed model parameters, we propose a gradient approach based onsimultaneous perturbation stochastic approximation (SPSA) and Sequential MonteCarlo (SMC) for the ABC approximation of the HMM. The performance of thismethod is illustrated using two numerical examples.

Journal article

Kantas N, Singh SS, Doucet A, 2012, Distributed Maximum Likelihood for Simultaneous Self-Localization and Tracking in Sensor Networks, IEEE TRANSACTIONS ON SIGNAL PROCESSING, Vol: 60, Pages: 5038-5047, ISSN: 1053-587X

Journal article

Jasra A, Kantas N, Persing A, 2012, Bayesian Parameter Inference for Partially Observed Stopped Processes, Statistics and Computing, Vol: n/a, Pages: n/a-n/a, ISSN: 0960-3174

In this article we consider Bayesian parameter inference associated topartially-observed stochastic processes that start from a set B0 and arestopped or killed at the first hitting time of a known set A. Such processesoccur naturally within the context of a wide variety of applications. Theassociated posterior distributions are highly complex and posterior parameterinference requires the use of advanced Markov chain Monte Carlo (MCMC)techniques. Our approach uses a recently introduced simulation methodology,particle Markov chain Monte Carlo (PMCMC) (Andrieu et. al. 2010 [1]), wheresequential Monte Carlo (SMC) approximations (see Doucet et. al. 2001 [18] andLiu 2001 [27]) are embedded within MCMC. However, when the parameter ofinterest is fixed, standard SMC algorithms are not always appropriate for manystopped processes. In Chen et. al. [11] and Del Moral 2004 [15] the authorsintroduce SMC approximations of multi-level Feynman-Kac formulae, which canlead to more efficient algorithms. This is achieved by devising a sequence ofnested sets from B0 to A and then perform the resampling step only when thesamples of the process reach intermediate level sets in the sequence.Naturally, the choice of the intermediate level sets is critical to theperformance of such a scheme. In this paper, we demonstrate that multi-levelSMC algorithms can be used as a proposal in PMCMC. In addition, we propose aflexible strategy that adapts the level sets for different parameter proposals.Our methodology is illustrated on the coalescent model with migration.

Journal article

Whiteley N, Kantas N, Jasra A, 2012, Linear Variance Bounds for Particle Approximations of Time-Homogeneous Feynman-Kac Formulae, Stochastic Processes and their Applications

This article establishes sufficient conditions for a linear-in-time bound onthe non-asymptotic variance of particle approximations of time-homogeneousFeynman-Kac formulae. These formulae appear in a wide variety of applicationsincluding option pricing in finance and risk sensitive control in engineering.In direct Monte Carlo approximation of these formulae, the non-asymptoticvariance typically increases at an exponential rate in the time parameter. Itis shown that a linear bound holds when a non-negative kernel, defined by thelogarithmic potential function and Markov kernel which specify the Feynman-Kacmodel, satisfies a type of multiplicative drift condition and other regularityassumptions. Examples illustrate that these conditions are general and flexibleenough to accommodate two rather extreme cases, which can occur in the contextof a non-compact state space: 1) when the potential function is bounded above,not bounded below and the Markov kernel is not ergodic; and 2) when thepotential function is not bounded above, but the Markov kernel itself satisfiesa multiplicative drift condition.

Journal article

Kantas N, Lecchini-Visintini A, Maciejowski JM, 2010, Simulation-based Bayesian optimal design of aircraft trajectories for air traffic management, International Journal of Adaptive Control and Signal Processing, Vol: 24, Pages: 882-899, ISSN: 1099-1115

In this paper, we consider a specific Air Traffic Management problem, where multiple aircraft in a specific region are required to reach a different target zone in the minimum expected time, while maintaining safe separation. The problem is complicated by the presence of random wind disturbances. We propose a realistic policy to automatically generate optimal and safe maneuvres for each aircraft. The parameters of the optimal policy are computed using a sequential Monte Carlo approach.

Journal article

Yang Z, Kantas N, Lecchini-Visintini A, Maciejowksi JMet al., 2010, Stable Markov decision processes using predictive control, International Symposium on Mathematical Theory of Networks and Systems (MTNS 2010)

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.

Request URL: http://wlsprd.imperial.ac.uk:80/respub/WEB-INF/jsp/search-html.jsp Request URI: /respub/WEB-INF/jsp/search-html.jsp Query String: respub-action=search.html&id=00669248&limit=30&person=true