Title:

Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Lens 

Abstract:

We propose a unified framework to study policy evaluation (PE) and the associated temporal difference (TD) methods for reinforcement learning in continuous time and space. Mathematically, PE amounts to devising a data-driven Feynman–Kac formula without knowing any coefficients of the underlying PDE. We show that this problem is equivalent to maintaining the martingale condition of a process. From this perspective, we present two methods for designing PE algorithms. The first, which uses a “martingale loss function”, gives an interpretation of the classical gradient Monte-Carlo algorithm. The second is based on a system of equations called the “martingale orthogonality conditions”. Solving these equations in different ways recovers various classical TD algorithms, such as TD, LSTD, and GTD. We apply these results to option pricing and portfolio selection. This is joint work with Yanwei Jia.
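To make the two methods concrete, here is a minimal sketch on a toy problem that is not taken from the talk. It assumes the state is a Brownian motion on [0, T], the terminal payoff is h(x) = x^2, and there is no running reward, so the true value function is J(t, x) = x^2 + (T - t). The one-parameter family J_theta(t, x) = x^2 + theta * (T - t), the function names, and the sample sizes are all hypothetical choices made for illustration. Both estimators use only the simulated paths and the payoff, never the drift or diffusion coefficients.

```python
import numpy as np


def simulate_paths(n_paths, n_steps, T, x0, rng):
    """Simulate Brownian paths X_t = x0 + W_t on a uniform grid (toy data source)."""
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    start = np.full((n_paths, 1), x0)
    X = np.concatenate([start, x0 + np.cumsum(dW, axis=1)], axis=1)
    return X, dt


def theta_martingale_loss(X, T, dt):
    """Minimize the discretized martingale loss
    sum_k (h(X_T) - J_theta(t_k, X_k))^2 * dt over theta.
    J_theta is linear in theta, so this is a one-dimensional least squares."""
    n_paths, n_cols = X.shape
    t = np.arange(n_cols - 1) * dt
    w = T - t                                  # dJ_theta/dtheta at grid times
    target = X[:, -1:] ** 2 - X[:, :-1] ** 2   # h(X_T) minus the theta-free part
    return np.sum(w * target) / (n_paths * np.sum(w ** 2))


def theta_orthogonality(X, T, dt):
    """Solve the martingale orthogonality condition
    E[ sum_k xi_k * (J_theta(t_{k+1}, X_{k+1}) - J_theta(t_k, X_k)) ] = 0
    with test function xi_k = dJ_theta/dtheta = T - t_k."""
    n_paths, n_cols = X.shape
    t = np.arange(n_cols - 1) * dt
    xi = T - t
    incr = X[:, 1:] ** 2 - X[:, :-1] ** 2      # theta-free part of the increment
    return np.sum(xi * incr) / (n_paths * np.sum(xi) * dt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, dt = simulate_paths(n_paths=20000, n_steps=50, T=1.0, x0=0.5, rng=rng)
    # In this toy setup both estimates should be close to the true value theta = 1.
    print("martingale loss estimate:", theta_martingale_loss(X, 1.0, dt))
    print("orthogonality estimate:  ", theta_orthogonality(X, 1.0, dt))
```

Because the toy parametrization is linear in theta, both conditions can be solved in closed form here. With a nonlinear approximator one would instead take stochastic gradient or stochastic approximation steps on the same two criteria.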

Biography:

Xunyu Zhou is the Liu Family Professor of Financial Engineering and the Director of the Nie Center for Intelligent Asset Management at Columbia University. He was the Nomura Professor of Mathematical Finance at the University of Oxford before joining Columbia in 2016.

His research covers stochastic control, dynamic portfolio selection, asset pricing, behavioral finance, and time inconsistency. Currently, his research focuses on continuous-time reinforcement learning and its applications to optimization broadly and to wealth management specifically. He is a recipient of the Wolfson Research Award from The Royal Society, the Outstanding Paper Prize from SIAM, the Alexander von Humboldt Research Fellowship, and the Croucher Senior Research Fellowship. He was an invited speaker at the 2010 International Congress of Mathematicians, a Humboldt Distinguished Lecturer at Humboldt University, and an Archimedes Lecturer at Columbia. He is both an IEEE Fellow and a SIAM Fellow.

Xunyu Zhou received his PhD in Operations Research and Control Theory from Fudan University in 1989.

 


Zoom Meeting Details