Understanding complex, irregularly sampled, high-dimensional streams of data is a central challenge in modern data science. Key to this understanding is the ability to extract actionable information from a stream and use it to make consequential decisions. Examples include summarising patients’ health records to evaluate the efficacy of treatments, or extracting information from the trajectories of stocks in order to design successful trading strategies.

Developed in the 90s by Terry Lyons as a robust solution theory for non-linear control systems driven by irregular signals, rough path theory (or more generally rough analysis, MSC2020 code 60Lxx) offers a deterministic toolbox from which it is possible to recover many classical results in stochastic analysis, without the need to use arguments specific to probability theory. Its theoretical footprint has been substantial in the study of random phenomena, notably through its presence in Martin Hairer’s Fields medal-winning work on regularity structures, which develops a rigorous framework to solve certain ill-posed stochastic PDEs.

Grounded in mathematical analysis but inspired by equations arising in probability theory, rough analysis has deep connections to branches of pure mathematics as diverse as differential geometry and algebraic combinatorics. More recently, rough analysis has become popular in Machine Learning (ML) as a tool to extract actionable information from high-dimensional, irregular time series. It can be a game changer for learning tasks with time series data, as shown by the recent and ongoing explosion of ML papers using it. Successful applications of rough analysis have been obtained in various areas of data science, including healthcare, neuroscience, computer vision and quantitative finance.

One of the catalysts was the recent development of high-performance and scalable software libraries such as *esig*, *iisignature* and *signatory*. This two-day workshop will consist of 45-minute talks from invited speakers addressing 4 main topics (see the list below). Talks will be followed by questions and a discussion about future research directions.
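To make the central object concrete: the libraries above compute the *path signature*, the sequence of iterated integrals of a stream. As a minimal illustration (not the optimized implementations those libraries provide), the depth-2 signature of a piecewise-linear path can be computed in plain NumPy using Chen’s relation, which combines the signatures of consecutive linear segments by tensor multiplication. The function name here is our own, not from any of the named libraries.

```python
import numpy as np

def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path.

    path: (n, d) array of sample points. A single linear segment with
    increment v has signature (1, v, v⊗v/2, ...); segments are combined
    with Chen's relation (multiplication in the tensor algebra).
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    S1 = np.zeros(d)            # level 1: total increment of the path
    S2 = np.zeros((d, d))       # level 2: iterated integrals ∫∫ dX ⊗ dX
    for v in np.diff(path, axis=0):                   # segment increments
        S2 += np.outer(S1, v) + np.outer(v, v) / 2.0  # Chen's relation
        S1 += v
    return S1, S2

# A small 2-d zigzag stream: level 1 is the net displacement, and the
# antisymmetric part of level 2 encodes the Lévy area.
S1, S2 = signature_depth2([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
```

A useful sanity check is the shuffle identity at depth 2: the symmetric part of the second level recovers the outer square of the first, `S2 + S2.T == np.outer(S1, S1)`.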

In this workshop we wish to bring together experts and young researchers from diverse fields of mathematics and machine learning who share an interest in applying techniques from rough analysis to challenges in data science. The main themes of the workshop will be:

- Interplay between rough paths and machine learning.
- Applications of rough paths to mathematical finance.
- Theoretical aspects of rough paths.
- Regularity structures, stochastic PDEs and machine learning.

We hope that an in-person workshop at a world-renowned institution such as Imperial College London will result in new international and multidisciplinary collaborations, and that it will generate original research that uses advanced mathematics to tackle real-world problems.

Cris Salvi (Imperial), Thomas Cass (Imperial), Blanka Horvath (TUM), Emilio Ferrucci (Oxford).


| Tuesday 26 July | |
| --- | --- |
| 08:50 – 09:00 | Welcome |
| 09:00 – 09:45 | Patrick Bonnier |
| 09:45 – 10:30 | Emilio Ferrucci |
| 10:30 – 11:00 | Coffee break |
| 11:00 – 11:45 | Josef Teichmann (online) |
| 11:45 – 12:30 | Darrick Lee (online) |
| 12:30 – 14:00 | Lunch break |
| 14:00 – 15:00 | Martin Hairer |
| 15:00 – 15:45 | Maud Lemercier |
| 15:45 – 16:15 | Coffee break |
| 16:15 – 17:00 | Harald Oberhauser |
| 17:00 – 17:45 | James Foster |


| Wednesday 27 July | |
| --- | --- |
| 09:00 – 10:00 | Terry Lyons |
| 10:00 – 10:45 | Joel Dyer |
| 10:45 – 11:15 | Coffee break |
| 11:15 – 12:00 | Christian Bayer |
| 12:00 – 12:45 | Horatio Boedihardjo |
| 12:45 – 14:00 | Lunch break |
| 14:00 – 14:45 | Nikolas Tapia (online) |
| 14:45 – 15:30 | Carlo Bellingeri (online) |
| 15:30 – 16:00 | Coffee break |
| 16:00 – 16:45 | Yue Wu |
| 16:45 – 17:30 | Andrew Alden |


### Titles and abstracts


**Patrick Bonnier: Proper Scoring Rules, Divergences, and Entropies for Paths and Time Series**

Many forecasts consist not of point predictions but concern the evolution of quantities. For example, a central bank might predict the interest rates during the next quarter, an epidemiologist might predict trajectories of infection rates, a clinician might predict the behaviour of medical markers over the next day, etc. The situation is further complicated since these forecasts sometimes only concern the approximate “shape of the future evolution” or “order of events”. Formally, such forecasts can be seen as probability measures on spaces of equivalence classes of paths modulo time-parametrization. We combine the statistical framework of proper scoring rules with classical mathematical results to derive a principled approach to decision making with such forecasts. In particular, we introduce notions of gradients, entropy, and divergence that are tailor-made to respect the underlying non-Euclidean structure.

**Emilio Ferrucci: On the Wiener Chaos Expansion of the Signature of a Gaussian Process**

This talk is based on joint work with Thomas Cass. We compute the Wiener chaos decomposition of the signature for a class of Gaussian processes, which contains fractional Brownian motion (fBm) with Hurst parameter H in (1/4, 1). At level 0, our result yields an expression for the expected signature of such processes, which determines their law [CL16]. In particular, this formula simultaneously extends both the one for fBm with H > 1/2 [BC07] and the one for Brownian motion (H = 1/2) [Faw03] to the general case H > 1/4. Other processes studied include continuous and centred Gaussian semimartingales.

**Josef Teichmann: A representation theoretic viewpoint on signatures with a view towards regularization**

By means of representation theory we construct families of kernels which can be approximated by random feature selection. This sheds new light on the randomized signature and on the regularization of learning procedures for signature approximations.

**Darrick Lee:** ~~Mapping Space Signatures~~

~~We introduce the mapping space signature, a generalization of the path signature for maps from higher-dimensional cubical domains, which is motivated by the topological perspective of iterated integrals by K. T. Chen. We show that the mapping space signature shares many of the analytic and algebraic properties of the path signature; in particular, it is universal and characteristic with respect to Jacobian equivalence classes of cubical maps. This is joint work with Chad Giusti, Vidit Nanda, and Harald Oberhauser.~~

**Martin Hairer: A concise proof of the BPHZ theorem for regularity structures**

**Maud Lemercier: Neural Stochastic PDEs**

Stochastic partial differential equations (SPDEs) are the mathematical tool of choice for modelling spatiotemporal PDE-dynamics under the influence of randomness. In this talk, I will present a novel neural architecture to learn solution operators of PDEs with (possibly stochastic) forcing from partially observed data. The proposed Neural SPDE model is capable of processing incoming sequential information arriving irregularly in time and observed at arbitrary spatial resolutions. By performing operations in the spectral domain, I will show how a Neural SPDE can be evaluated by solving a fixed point problem.

**Harald Oberhauser: Capturing graphs with hypoelliptic diffusions**

A common way to capture graph structures is through random walks. The distribution of these random walks evolves according to a diffusion equation defined using the graph Laplacian. We extend this approach by leveraging classic mathematical results about hypoelliptic diffusions. This results in a novel tensor-valued graph operator, which we call the hypoelliptic graph Laplacian. We provide theoretical guarantees and efficient low-rank approximation algorithms.

**James Foster: Cubature vs Markov Chain Monte Carlo for Bayesian Inference**

Markov Chain Monte Carlo (MCMC) is widely regarded as the “go-to” approach for computing integrals with respect to posterior distributions in Bayesian inference. Whilst there is a large variety of MCMC methods, many prominent algorithms can be viewed as approximations of stochastic differential equations (SDEs). For example, the unadjusted Langevin algorithm (ULA) is obtained as an Euler discretization of the Langevin diffusion and has seen particular interest due to its scalability and connections to the optimization literature. On the other hand, “Cubature on Wiener Space” (Lyons and Victoir, 2004) provides a powerful alternative to Monte Carlo for simulating SDEs. In the cubature paradigm, SDE solutions are represented as a cloud of particles and propagated via deterministic cubature formulae. However, such formulae can dramatically increase the number of particles, and thus SDE cubature requires efficient “distribution compression” to be practical. Fortunately, there is now a range of kernel-based compression algorithms available in the machine learning literature, such as kernel herding, thinning and recombination. In this talk, we will show that by applying cubature to ULA and employing kernel herding, we can obtain a gradient-based particle method for Bayesian inference. We shall discuss the theory underpinning this algorithm and the key properties of the Langevin diffusion that enable numerical errors to be controlled over long time horizons. Finally, we compare the proposed Langevin cubature algorithm to ULA on a simple mixture model and observe significant computational benefits.
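The ULA recursion mentioned in the abstract is short enough to sketch. The following is a minimal, assumption-laden illustration (a hypothetical toy, not the algorithm studied in the talk): ULA applied to a standard Gaussian target, where the potential is U(x) = x²/2 and the Euler step of the Langevin diffusion dXₜ = −∇U(Xₜ)dt + √2 dWₜ becomes explicit.

```python
import numpy as np

def ula(grad_U, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm: Euler discretization of the
    Langevin diffusion dX_t = -grad U(X_t) dt + sqrt(2) dW_t, whose
    stationary law is proportional to exp(-U)."""
    x = float(x0)
    samples = np.empty(n_steps)
    for k in range(n_steps):
        # one Euler step: drift down the potential plus Gaussian noise
        x = x - step * grad_U(x) + np.sqrt(2.0 * step) * rng.standard_normal()
        samples[k] = x
    return samples

# Toy target: standard Gaussian, U(x) = x^2 / 2, so grad U(x) = x.
rng = np.random.default_rng(0)
draws = ula(lambda x: x, x0=3.0, step=0.1, n_steps=20000, rng=rng)
```

Note the O(step) discretization bias that the abstract alludes to: for this target the chain’s stationary variance is 1/(1 − step/2), slightly above the target’s variance of 1, which is one reason controlling numerical error over long horizons is delicate.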

**Terry Lyons: From the mathematics of rough paths to more scalable data science**

The mathematics of rough path theory creates a framework for understanding the interactions of complex and highly oscillatory systems, and generalises the Newtonian framework of controlled differential equations to include rough multimodal evolving systems. A key feature of this theory is the development of a strong analytic theory capturing, in concrete terms, the space of (polynomial) functions on these spaces of paths. This work was built on the ideas of K. T. Chen, who studied these spaces of functions to develop a co-homology theory on loop space. The core analysis came from L. C. Young. The generating function for these polynomial functions on path space is known as the signature, and it was established by Hambly and Lyons (Annals of Math 2010) that the signature of a path of finite length is a complete invariant of the path modulo the appropriate notion of re-parametrisation. Boedihardjo and … (Advances in Maths 2016) extended this result to rough paths.

These results provide a new perspective and a graded feature set for describing complex streamed data. The first few terms in this series expansion allow high-quality local descriptions of streams. These features are expensive to compute. But crucially for machine learning, they only need to be computed once and can be used in every training cycle. In this way they can then form the basis for much more scalable machine learning algorithms. (Morrill, James, Cristopher Salvi, Patrick Kidger, and James Foster. Neural rough differential equations for long time series. In International Conference on Machine Learning, pp. 7829-7838. PMLR, 2021.)

We will survey this space.

**Joel Dyer: Simulation-based inference with path signatures**

Computer simulations are used widely across scientific disciplines, often taking the form of stochastic black-box models consuming fixed parameters and generating a random output. In general, for such models no likelihood function is available, often due to the complexity of the simulators. Consequently, it is often convenient to adopt so-called likelihood-free or simulation-based inference methods that mimic conventional likelihood-based procedures using data simulated at different parameter values. While many such approaches exist for iid data, adapting these techniques to simulators that generate sequential data can be challenging. In this talk, we will discuss our recent work on simulation-based parameter inference for dynamic, stochastic simulators with the use of path signatures. We will argue that signatures and their recent kernelisation naturally and flexibly enable both approximate Bayesian and frequentist inference with time-series simulators of different kinds, with competitive empirical performance in a variety of benchmark experiments.

**Christian Bayer: Optimal stopping with signatures**

We propose a new method for solving optimal stopping problems (such as American option pricing in finance) under minimal assumptions on the underlying stochastic process X. We consider classic and randomized stopping times represented by linear and non-linear functionals of the rough path signature 𝕏^{<∞} associated to X, and prove that maximizing over these classes of signature stopping times in fact solves the original optimal stopping problem. Using the algebraic properties of the signature, we can then recast the problem as a (deterministic) optimization problem depending only on the (truncated) expected signature. By applying a deep neural network approach to approximate the non-linear signature functionals, we can efficiently solve the optimal stopping problem numerically. The only assumption on the process X is that it is a continuous (geometric) random rough path. Hence, the theory encompasses processes such as fractional Brownian motion, which fail to be either semi-martingales or Markov processes, and can be used, in particular, for American-type option pricing in fractional models, e.g. of financial or electricity markets. (Based on joint work with Paul Hager, Sebastian Riedel, and John Schoenmakers.)

**Horatio Boedihardjo: A non-vanishing property for the signature of a bounded variation path**

Given a bounded variation path, what can we say about its signature? It is classical that the signature is a group-like element and that the n-th term of the signature decays at the factorial rate 1/n!. In this talk, we will show a third property: the sequence of signature terms cannot contain infinitely many zeros. Together with the result of Chang, Lyons and Ni, this means the signature of a reduced bounded variation path has an exact factorial decay rate. This work gives rise to many interesting questions, including what would be the complex version of the uniqueness theorem for signatures, and the analogous non-vanishing results for general geometric rough paths (even though the non-vanishing property itself is not true for general rough paths).

**Nikolas Tapia: Stability of Deep Neural Networks via discrete rough paths**

Using rough path techniques, we provide a priori estimates for the output of Deep Residual Neural Networks in terms of both the input data and the (trained) network weights. As trained network weights are typically very rough when seen as functions of the layer, we propose to derive stability bounds in terms of the total p-variation of trained weights for any p ∈ [1, 3]. Unlike the C¹-theory underlying the neural ODE literature, our estimates remain bounded even in the limiting case of weights behaving like Brownian motions, as suggested in [arXiv:2105.12245]. Mathematically, we interpret residual neural networks as solutions to (rough) difference equations, and analyse them based on recent results on discrete time signatures and rough path theory.
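The "residual network as a difference equation" viewpoint is easy to make concrete. Below is a minimal toy sketch, entirely our own construction rather than the talk's setup: a residual forward pass read as the difference equation x_{k+1} = x_k + tanh(W_k x_k), where the layer index plays the role of time and the weights act as the driving (possibly rough) signal. The 1/√depth scaling mimics Brownian-like increments in the layer variable, in the spirit of the regime the abstract describes.

```python
import numpy as np

def residual_forward(x, weights):
    """Forward pass of a toy residual network, viewed as the difference
    equation x_{k+1} = x_k + tanh(W_k x_k): layer index = time,
    layer weights W_k = driving signal."""
    x = np.asarray(x, dtype=float)
    for W in weights:
        x = x + np.tanh(W @ x)   # one residual step of the difference equation
    return x

# Hypothetical rough-weight regime: independent layer weights scaled
# like Brownian increments (1/sqrt(depth)).
rng = np.random.default_rng(42)
depth, dim = 64, 4
weights = [rng.standard_normal((dim, dim)) / np.sqrt(depth) for _ in range(depth)]
out = residual_forward(np.ones(dim), weights)
```

With this scaling the output stays bounded as depth grows, which is the qualitative phenomenon the p-variation estimates in the talk quantify.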

**Carlo Bellingeri: A Young-type Euler–Maclaurin Formula**

Considered one of the key identities in classical analysis, the Euler–Maclaurin formula is one of the standard tools to relate sums and integrals, with remarkable applications in many areas of mathematics, though with little use in stochastic analysis. In this talk, we will show how the notion of signature can generalize this identity in the context of Young integration and discuss some possible applications.

**Yue Wu: An NRDE-based model for solving path-dependent PDEs**

The path-dependent partial differential equation (PPDE) was first introduced for path-dependent derivatives in the financial market, such as Asian, barrier, and lookback options; its semilinear type was later identified as a non-Markovian BSDE. The solution of a PPDE contains an infinite-dimensional spatial variable, which makes approximating the solution extremely challenging, if not impossible. In this talk, we propose a neural rough differential equation (NRDE) based model to learn (high-dimensional) path-dependent parabolic PDEs. The resulting continuous-time model for the PDE solution has the advantages of memory efficiency and coping with variable time-frequency. Several numerical experiments are provided to validate the performance of the proposed model in comparison to strong baselines in the literature. This is joint work with Bowen Fang (University of Warwick, UK) and Hao Ni (UCL, UK).

**Andrew Alden: Model-Agnostic Pricing of Exotic Derivatives Using Signatures**

Derivative pricing can be formulated as a higher-order distribution regression problem on stochastic processes. Pricing using this model-agnostic path-wise approach involves the use of the second-order maximum mean discrepancy (MMD), a notion of distance between stochastic processes based on path signatures. Computing this distance is resource-expensive and time-consuming. Motivated by the recent successes of using neural networks to price derivatives, we speed up the computation of the MMD to facilitate the use of neural networks to address the distribution regression problem. In this talk I will discuss how we reduce the run-time for computing the second-order MMD. I will also present the results which were obtained by combining distribution regression and neural networks to price three exotic derivatives. Finally, I will discuss the robustness of our path-wise pricing framework to stochastic model parameter misspecifications. This talk is based on joint work with Carmine Ventre, Blanka Horvath, and Gordon Lee.