The goal of the workshop is to gather experts from different areas in this inter-disciplinary field to investigate and discuss how to harness the power of machine learning techniques to solve high-dimensional, non-linear partial differential equations (PDEs), as well as how to leverage the theory of PDEs to construct better machine learning models and study their theoretical properties.
PDEs are a dominant modelling paradigm ubiquitous throughout science, from fluid dynamics to quantum mechanics, to calculus of variations and quantitative finance. When the PDEs at hand are low-dimensional (dim = 1, 2, 3, 4), they can generally be solved numerically using a large arsenal of techniques developed over the last 150 years, including finite difference and finite element methods.
Nonetheless, many PDEs arising from complex, real world financial-engineering-physical problems are often so high-dimensional (sometimes even infinite dimensional) that classical numerical techniques are either not directly applicable or do not scale to high-resolution computations. Examples of such intractable equations include pricing and hedging with rough volatility price dynamics, non-Markovian, path-dependent stochastic control problems, and turbulent fluid flow dynamics to be solved on very fine scales.
Recent advances in Machine Learning (ML) have enabled the development of novel computational techniques for tackling PDE-based problems considered unresolvable with classical methods. Physics-informed neural networks, neural differential equations, and neural operators are among the most popular models used to tackle PDE-related problems with deep learning.
The goal of this workshop is to develop a classification of ML techniques depending on the type of PDE and to set clear new directions in the design of optimal numerical schemes, both numerically and theoretically (with convergence results). The list of participants is designed to maximise inter-disciplinarity and encourage diversity with experts in different fields, such as stochastic analysis, numerical analysis, mathematical finance and machine learning.
Dr A. Jacquier (Imperial College), Prof. J. Ruf (LSE) and Dr C. Salvi (Imperial College).
Please contact the organisers if you are interested in attending the workshop.
EPSRC, LSE, Imperial.
Confirmed speakers and schedule
|Tuesday 6th||13:30-13:45||Welcome Speech|
|Wednesday 7th||10:15-11:15||Martin Larsson|
|Wednesday 7th||11:15-11:45||Marc Sabate Vidales|
|Wednesday 7th||19:00-22:00||Workshop Dinner (by invitation only)|
|Thursday 8th||10:00-10:30||Athena Picarelli|
|Thursday 8th||11:45-12:45||Camilo Garcia Trillos|
Titles and abstracts
Title: On the choice of loss functions and initializations for deep learning-based solvers for PDEs
Abstract: In this talk we will discuss several challenges that arise when solving PDEs with deep learning-based solvers. We will begin by defining the loss function of a general PDE and discuss how this choice of loss function, and specifically the weighting of the different loss terms, can impact the accuracy of the solution. We will show how to choose an optimal weighting that corresponds to accurate solutions. Next, we will focus on the approximation of the Hamilton-Jacobi-Bellman (HJB) partial differential equation associated with optimal stabilization in the nonlinear quadratic regulator problem. It is not obvious that the neural network will converge to the correct solution with just any type of initialisation; this is particularly relevant when the solution to the HJB PDE is non-unique. We will discuss a two-step learning approach in which the model is pre-trained on a dataset obtained from solving a state-dependent Riccati equation, and we show that in this way efficient and accurate convergence can still be obtained.
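As a rough illustration of how the weighting of loss terms matters, the following minimal sketch evaluates a weighted residual-plus-boundary loss for the 1D model problem -u''(x) = f(x) with Dirichlet data. The model problem, the finite-difference derivative, the function name `weighted_pde_loss` and the weights `w_res` and `w_bc` are all assumptions made for this example; it does not reproduce the weighting strategy presented in the talk.

```python
import math

def weighted_pde_loss(u, f, interior_pts, boundary, w_res=1.0, w_bc=1.0, h=1e-3):
    """Weighted loss for -u''(x) = f(x); boundary is a list of (x, value) pairs."""
    # Interior term: mean squared PDE residual, with u'' from central differences.
    res = 0.0
    for x in interior_pts:
        u_xx = (u(x + h) - 2.0 * u(x) + u(x - h)) / h**2
        res += (-u_xx - f(x)) ** 2
    res /= len(interior_pts)
    # Boundary term: mean squared mismatch with the Dirichlet data.
    bc = sum((u(x) - g) ** 2 for x, g in boundary) / len(boundary)
    return w_res * res + w_bc * bc, res, bc

# Hypothetical test problem: f chosen so that u(x) = sin(pi x) is the solution.
f = lambda x: math.pi**2 * math.sin(math.pi * x)
exact = lambda x: math.sin(math.pi * x)
shifted = lambda x: math.sin(math.pi * x) + 0.1  # satisfies the PDE, violates the boundary data

pts = [i / 10 for i in range(1, 10)]
bdry = [(0.0, 0.0), (1.0, 0.0)]

loss_exact, _, _ = weighted_pde_loss(exact, f, pts, bdry)
loss_shift_lo, res_s, bc_s = weighted_pde_loss(shifted, f, pts, bdry, w_bc=1.0)
loss_shift_hi, _, _ = weighted_pde_loss(shifted, f, pts, bdry, w_bc=100.0)
```

The shifted candidate has a negligible interior residual, so how strongly it is penalised depends entirely on `w_bc`; this is the kind of sensitivity to weighting that the abstract refers to.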
Title: Neural Q-learning solutions to elliptic PDEs
Abstract: Solving high-dimensional partial differential equations (PDEs) is a major challenge in scientific computing. We develop a new numerical method for solving elliptic-type PDEs by adapting the Q-learning algorithm in reinforcement learning. Using a neural tangent kernel (NTK) approach, we prove that the neural network approximator for the PDE solution, trained with the Q-PDE algorithm, converges to the trajectory of an infinite-dimensional ordinary differential equation (ODE) as the number of hidden units becomes infinite. For monotone PDEs (i.e. those given by monotone operators), despite the lack of a spectral gap in the NTK, we then prove that the limit neural network, which satisfies the infinite-dimensional ODE, converges in $L^2$ to the PDE solution as the training time increases. The numerical performance of the Q-PDE algorithm is studied for several elliptic PDEs. Based on joint work with Deqing Jiang and Justin Sirignano.
Title: Deep learning-based reduced order models for scientific applications
Abstract: The solution of differential equations by means of full order models (FOMs), such as the finite element method, entails prohibitive computational costs when it comes to real-time simulations and multi-query routines. The purpose of reduced order modeling is to replace FOMs with suitable surrogates, so-called reduced order models (ROMs), characterized by much lower complexity but still able to express the physical features of the system under investigation. Conventional ROMs anchored to the assumption of linear modal superposition, such as proper orthogonal decomposition (POD), may prove inefficient when dealing with nonlinear time-dependent parametrized PDEs, especially for problems featuring coherent structures propagating over time. To enhance ROM efficiency, we propose a nonlinear approach to building ROMs by exploiting deep learning (DL) algorithms, such as convolutional neural networks (CNNs). In the resulting DL-ROM, both the nonlinear trial manifold and the nonlinear reduced dynamics are learned in a non-intrusive way by relying on DL algorithms trained on a set of FOM snapshots obtained for different parameter values. Furthermore, in the case of large-scale FOMs, a prior dimensionality reduction of the FOM snapshots through POD speeds up training and substantially decreases the network complexity. Accuracy and efficiency of the DL-ROM technique are assessed in different scientific applications aiming at solving parametrized PDE problems, e.g., in cardiac electrophysiology, computational mechanics and fluid dynamics, possibly accounting for fluid-structure interaction effects, where new queries to the DL-ROM can be computed in real time. Finally, with the aim of moving towards a rigorous justification of the mathematical foundations of DL-ROMs, error bounds are derived for the approximation of nonlinear operators by means of CNNs. The resulting error estimates provide a clear interpretation of the hyperparameters defining the neural network architecture.
Camilo Garcia Trillos
Title: Neural network approximation of some second order semilinear PDEs
Abstract: Since its inception in the early 90s, the well-known connection between second-order semilinear PDEs and Markovian BSDEs has been useful in creating numerical probabilistic methods to solve the former. Our approach to the solution of these PDEs belongs to a recent stream in the literature that uses neural networks together with the BSDE connection to define numerical methods that are robust and efficient in large dimensions. In contrast with existing works, our analysis focuses on the case where derivatives enter ‘quadratically’ in the semilinear term, covering some interesting cases in control theory. In this setting, we study both forward- and backward-type neural network-based methods. Joint work with Daniel Bussell.
Title: Data-driven schemes for Hamilton-Jacobi-Bellman equations
Abstract: In this talk I will discuss different computational aspects arising in the construction of data-driven schemes for HJB PDEs. First, I will discuss synthetic data generation through representation formulas including Pontryagin’s Maximum Principle and State-dependent Riccati Equations. This data can be used in a regression framework for which we consider different approximation architectures: polynomial approximation, tensor train decompositions, and deep neural networks. Finally, I will address the role of synthetic data in the framework of physics-informed neural networks.
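To make the idea of synthetic data from a state-dependent Riccati equation concrete, here is a minimal sketch for a scalar control problem. It assumes dynamics x' = a(x) x + b u and running cost q x^2 + r u^2; these choices, the function name `sdre_sample` and the example a(x) = x^2 are assumptions for illustration, not the setup used in the talk.

```python
import math

def sdre_sample(x, a, b=1.0, q=1.0, r=1.0):
    """Solve the scalar state-dependent Riccati equation at the state x.

    The equation 2 a(x) p - (b^2 / r) p^2 + q = 0 is solved for its
    positive (stabilising) root p(x).
    """
    ax = a(x)
    p = r * (ax + math.sqrt(ax * ax + b * b * q / r)) / (b * b)
    u = -(b / r) * p * x    # feedback control implied by p(x)
    grad_v = 2.0 * p * x    # value-function gradient implied by p(x)
    return {"x": x, "a": ax, "p": p, "u": u, "grad_v": grad_v}

# Hypothetical nonlinearity: x' = x^3 + u, written as a(x) x + b u with a(x) = x^2.
a = lambda x: x * x
dataset = [sdre_sample(-2.0 + 0.5 * i, a) for i in range(9)]
```

Each entry of `dataset` pairs a state with a control and a gradient value, which is the kind of sample that could then be fed into a regression step with a polynomial, tensor train or neural network model.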
Title: Minimum curvature flow and martingale exit times
Abstract: We study the following question: What is the largest deterministic amount of time T* that a suitably normalized martingale X can be kept inside a convex body K in d-dimensional Euclidean space? We show, in a viscosity framework, that T* equals the time it takes for the relative boundary of K to reach X(0) as it undergoes a geometric flow that we call (positive) minimum curvature flow. This result has close links to the literature on stochastic and game representations of geometric flows. Moreover, the minimum curvature flow can be viewed as an arrival time version of the Ambrosio-Soner codimension-(d-1) mean curvature flow of the 1-skeleton of K. We present preliminary sampling-based numerical approximations to the solution of the corresponding PDE. The numerical part is work in progress. This work is based on a collaboration with Camilo Garcia Trillos, Johannes Ruf, and Yufei Zhang.
Title: A deep solver for BSDEs with jumps
Abstract: The aim of this work is to propose an extension of the Deep BSDE solver by Han, E and Jentzen (2017) to the case of FBSDEs with jumps. As in the aforementioned solver, starting from a discretized version of the BSDE and parametrizing the (high-dimensional) control processes by means of a family of ANNs, the BSDE is viewed as a model-based reinforcement learning problem and the ANN parameters are fitted so as to minimize a prescribed loss function. We take into account both finite and infinite jump activity, introducing, in the latter case, an approximation of the forward process with finitely many jumps. (Joint work with A. Gnoatto and M. Patacca.)
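The discretize-then-minimize structure described above can be sketched without jumps and without neural networks. In the minimal example below, the forward process is a one-dimensional Brownian motion, and the initial value `y0` and the control `z_fn` are plain Python inputs standing in for the trainable quantities; the jump terms that are the subject of the talk are omitted. All names and the test problem are assumptions for illustration.

```python
import math
import random

def terminal_loss(y0, z_fn, f, g, T=1.0, n_steps=50, n_paths=2000, seed=0):
    """Monte Carlo estimate of E[(Y_T - g(X_T))^2] for an Euler-discretized BSDE."""
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        x, y = 0.0, y0
        for n in range(n_steps):
            t = n * dt
            dw = rng.gauss(0.0, math.sqrt(dt))
            z = z_fn(t, x)
            # Discretized backward equation, run forward in time from the guess y0.
            y = y - f(t, x, y, z) * dt + z * dw
            x = x + dw  # forward process: standard Brownian motion
        total += (y - g(x)) ** 2
    return total / n_paths

# Hypothetical test problem: zero driver and terminal condition g(x) = x^2,
# for which Y_t = X_t^2 + (T - t) and Z_t = 2 X_t solve the continuous-time BSDE.
driver = lambda t, x, y, z: 0.0
payoff = lambda x: x * x
z_exact = lambda t, x: 2.0 * x

loss_good = terminal_loss(1.0, z_exact, driver, payoff)  # correct initial value Y_0 = T
loss_bad = terminal_loss(0.0, z_exact, driver, payoff)   # wrong initial value
```

In a deep solver, `y0` and `z_fn` would be parametrized and updated by stochastic gradient descent on this loss; here they are fixed so that the effect of a wrong initial value on the loss can be seen directly.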
Title: Complexity of neural network approximations to parametric and non-parametric parabolic PDEs
Abstract: In the first part of the talk, we discuss theoretical results which ascertain that deep neural networks can approximate the solutions of parabolic PDEs without the curse of dimensionality. That is to say, under certain technical assumptions, there exists a family of feedforward neural networks such that the complexity, measured by the number of hyper parameters, grows only polynomially in the dimension of the PDE and the reciprocal of the required accuracy. This part is based on joint work with Yufei Zhang. In the second part, we study multilevel neural networks for approximating the parametric dependence of PDE solutions. This essentially requires learning a function from computationally expensive samples. To reduce the complexity, we construct multilevel estimators using a hierarchy of finite difference approximations to the PDE on refined meshes. We provide a general framework that demonstrates that the complexity can be reduced by orders of magnitude under reasonable assumptions, and give a numerical illustration with Black-Scholes-type PDEs and random feature neural networks. This part is based on joint work with Filippo de Angelis and Mike Giles.
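The telescoping idea behind multilevel estimators can be shown on a toy problem. The sketch below is a plain multilevel Monte Carlo estimate of a mean over a random parameter, with a trapezoid rule on refined meshes standing in for the hierarchy of finite difference PDE approximations; it is not the multilevel network training discussed in the talk, and every name and sample size here is an assumption.

```python
import math
import random

def trapezoid(theta, level):
    """Level-l approximation of the integral of exp(theta * x) over [0, 1], using 2**level cells."""
    n = 2 ** level
    h = 1.0 / n
    s = 0.5 * (1.0 + math.exp(theta))
    s += sum(math.exp(theta * i * h) for i in range(1, n))
    return s * h

def multilevel_mean(n_samples, seed=0):
    """Estimate E[P_L(theta)] as E[P_0] + sum over l of E[P_l - P_{l-1}], theta uniform on [0, 1]."""
    rng = random.Random(seed)
    estimate = 0.0
    for level, n in enumerate(n_samples):
        acc = 0.0
        for _ in range(n):
            theta = rng.random()
            fine = trapezoid(theta, level)
            coarse = trapezoid(theta, level - 1) if level > 0 else 0.0
            acc += fine - coarse  # correction term; small on fine levels
        estimate += acc / n
    return estimate

# Many cheap samples on the coarse mesh, few expensive ones on the fine meshes.
est = multilevel_mean([4000, 1000, 250, 60, 15])
```

Because the corrections between consecutive levels have small variance, most of the sampling budget can be spent on the cheapest level, which is the mechanism by which multilevel schemes reduce overall cost.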
Title: Gradient Boosting for Solving PDEs
Abstract: Deep learning (DL) methods have been successfully applied to solve PDEs with many interesting complexities (high-dimensional, non-linear, systems of PDEs, etc.). However, DL usually lacks proper statistical guarantees, and convergence is usually just verified in practice. In this talk, we propose a Gradient Boosting method to solve a class of PDEs. Although still preliminary, there is some hope of deriving a proper statistical analysis of the method. Numerical implementations and examples will be discussed.
Title: Signature kernel methods for path-dependent PDEs
Abstract: In this talk we will present a kernel framework for solving linear and non-linear path-dependent PDEs (PPDEs) leveraging a recently introduced class of kernels indexed on pathspace, called signature kernels. The proposed method solves an optimal recovery problem by approximating the solution of a PPDE with an element of minimal norm in the (signature) reproducing kernel Hilbert space (RKHS) constrained to satisfy the PPDE at a finite collection of collocation points. In the linear case, by the representer theorem, it can be shown that the optimisation has a unique, analytic solution expressed entirely in terms of simple linear algebra operations. Furthermore, this method can be extended to a probabilistic Bayesian framework allowing one to represent epistemic uncertainty over the approximated solution; this is done by positing a Gaussian process prior for the solution and constructing a posterior Gaussian measure by conditioning on the PPDE being satisfied at a finite collection of points, with mean function agreeing with the solution of the optimal recovery problem. In the non-linear case, the optimal recovery problem can be reformulated as a two-level optimisation that can be solved by minimising a quadratic objective subject to nonlinear constraints. Although the theoretical analysis is still ongoing, the proposed method comes with convergence guarantees and is amenable to rigorous error analysis. Finally, we will discuss some motivating examples from rough volatility and present some preliminary numerical results on path-dependent heat-type equations. This is joint work with Alexandre Pannier.
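The statement that the linear case reduces to simple linear algebra can be illustrated with the most basic optimal recovery problem: the minimum-norm RKHS function matching prescribed values at a few points. This sketch is a deliberately simplified stand-in. It uses a Gaussian kernel on the real line instead of a signature kernel on path space, and point-evaluation constraints instead of PPDE constraints; the helper names are assumptions for this example.

```python
import math

def gaussian_kernel(x, y, length=0.3):
    return math.exp(-((x - y) ** 2) / (2.0 * length**2))

def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def optimal_recovery(points, values, kernel=gaussian_kernel):
    """Minimum-norm interpolant u(x) = sum_i c_i k(x, x_i), where K c = values."""
    K = [[kernel(xi, xj) for xj in points] for xi in points]
    coeffs = solve(K, values)
    return lambda x: sum(c * kernel(x, xi) for c, xi in zip(coeffs, points))

points = [0.0, 0.25, 0.5, 0.75, 1.0]
values = [math.sin(2.0 * math.pi * x) for x in points]
u = optimal_recovery(points, values)
```

By the representer theorem the minimiser is a finite combination of kernel functions centred at the constraint points, so the whole problem is one linear solve with the Gram matrix `K`. With differential constraints, the Gram matrix would instead contain the kernel with the differential operator applied to its arguments.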
Marc Sabate Vidales
Title: Solving path dependent PDEs with LSTMs and path signatures
Abstract: Using a combination of recurrent neural networks and signature methods from rough path theory, we design efficient algorithms for solving parametric families of path-dependent partial differential equations (PPDEs) that arise in the pricing and hedging of path-dependent derivatives or from the use of non-Markovian models, such as the rough volatility models in Jacquier and Oumgari, 2019. The solutions of PPDEs are functions of time, a continuous path (the asset price history) and model parameters. As the domain of the solution is infinite-dimensional, many recently developed deep learning techniques for solving PDEs do not apply. As in Vidales et al. 2018, we identify the objective function used to learn the PPDE by using the martingale representation theorem. As a result, we can de-bias and provide confidence intervals for the neural network-based algorithm. We validate our algorithm using classical models for pricing lookback and auto-callable options and report errors for approximating both prices and hedging strategies.
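For readers unfamiliar with path signatures, the following minimal sketch computes the first two signature levels of a piecewise-linear path, building them up segment by segment with Chen's identity. The function name `signature_level2` and the example path are assumptions for illustration; the talk's algorithm combines such features with recurrent networks, which is not shown here.

```python
def signature_level2(path):
    """Signature of a piecewise-linear path, truncated at level 2.

    path is a list of points, each a list of d coordinates.
    Returns (s1, s2): the total increment and the d x d matrix of iterated integrals.
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for prev, cur in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, cur)]
        # Chen's identity: concatenating a linear segment with increment delta
        # adds s1 (x) delta + 0.5 * delta (x) delta to the second level.
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            s1[i] += delta[i]
    return s1, s2

# Hypothetical example: move right by one unit, then up by one unit.
s1, s2 = signature_level2([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
levy_area = 0.5 * (s2[0][1] - s2[1][0])
```

The antisymmetric part of the second level is the Lévy area, which depends on the order in which the path moves and is therefore a simple example of the path-dependent information that the increment alone does not carry.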
Title: Provably convergent policy gradient methods for continuous-time stochastic control
Abstract: Recently, policy gradient methods for stochastic control have attracted substantial research interest. Much of the attention and success, however, has been for the discrete-time setting. A provably convergent policy gradient method for general continuous space-time stochastic control problems has been elusive. This talk proposes a proximal gradient algorithm for finite-time horizon stochastic control problems with controlled drift and nonsmooth nonconvex objective functions. We prove under suitable conditions that the algorithm converges linearly to a stationary point of the control problem. We then discuss a PDE-based, momentum-accelerated implementation of the proposed algorithm. Numerical experiments for high-dimensional mean-field control problems are presented, which reveal that our algorithm captures important structures of the optimal policy and achieves robust performance with respect to parameter perturbation. This is joint work with Christoph Reisinger and Wolfgang Stockinger.
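The basic proximal gradient iteration, a gradient step on the smooth part followed by a proximal step on the nonsmooth part, can be sketched on a one-dimensional toy problem. The example below is a convex, finite-dimensional stand-in, not the control-space algorithm of the talk; the objective 0.5 (x - 3)^2 + |x|, the step size and all function names are assumptions.

```python
def soft_threshold(v, t):
    """Proximal map of t * |x|, evaluated at v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def proximal_gradient(grad_f, prox_g, x0, step, n_iter):
    """Iterate x <- prox_{step * g}(x - step * grad_f(x)) and record the iterates."""
    x = x0
    history = [x]
    for _ in range(n_iter):
        x = prox_g(x - step * grad_f(x), step)
        history.append(x)
    return x, history

# Minimise 0.5 * (x - 3)^2 + |x|, whose minimiser is x = 2.
x_star, hist = proximal_gradient(lambda x: x - 3.0, soft_threshold,
                                 x0=10.0, step=0.5, n_iter=60)
errors = [abs(x - 2.0) for x in hist]
```

For this toy problem the error halves at every iteration, a simple instance of the linear convergence that the abstract establishes for the control setting under suitable conditions.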