News: as of 3/9/18, the course will begin on 12/10/18 and run for 8 sessions until 7/12/18. There is no lecture on 30/11/18. Please read the instructions below concerning the final project.
Mathematical Foundations of Reinforcement Learning (TCC Course 10/18-12/18)
Course Description: this course concerns multi-stage decision processes in the framework of dynamic programming and the Bellman equation, where optimal policies are synthesized based on both immediate and long-term rewards. However, the computational requirements of dynamic programming techniques can be prohibitive when the policy/state space is overwhelmingly large, the so-called "Bellman's curse of dimensionality". In this course we will overcome this difficulty by means of different techniques for the computation of suboptimal solutions to dynamic programming equations. The lectures will address theoretical, algorithmic, and computational aspects of such techniques.
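For orientation, one common form of the Bellman equation is the discounted infinite-horizon version written below. The notation is an assumption made for illustration and is not fixed by the course description: x is the state, u the control, r the immediate reward, p the transition probability, \gamma a discount factor in (0,1), and J^* the optimal value function.

```latex
J^*(x) = \max_{u \in U(x)} \Big[ r(x,u) + \gamma \sum_{x'} p(x' \mid x,u)\, J^*(x') \Big]
```

The first term in the bracket is the immediate reward and the second is the discounted long-term reward. If the state has d components and each is discretised on n grid points, a tabular representation of J^* needs n^d entries, which is one way to see the curse of dimensionality.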
Prerequisites: some general knowledge of Iterative Methods, Optimisation and Markov Chains is useful, but not essential.
1. Introduction to Dynamic Programming I: Optimal feedback control and the Bellman equation, Value and Policy Iteration.
2. Introduction to Dynamic Programming II: Finite and Infinite Horizon Control, Value and Policy Iteration.
3. Neural Networks: basic architectures, training/optimisation. Stochastic Iterative Algorithms: Stochastic Gradient Method, convergence results.
4. Optimisation (continuation) and Simulation Methods: Monte Carlo policy evaluation.
5. Approximate Dynamic Programming I: introduction and Approximate Policy Iteration.
6. Approximate Dynamic Programming II: Approximate Value Iteration.
7. Bellman Equation Methods, The Hamilton-Jacobi-Bellman PDE, Dynamic Games.
8. An Overview of Deep Reinforcement Learning. A Case Study: playing Pac-man and Tetris with Reinforcement Learning.
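As a flavour of the Value Iteration topic in the first two lectures, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than course material: the function name value_iteration, the discounted-reward setting, and the toy two-state problem (action 0 stays in the current state, action 1 switches state, and only staying in state 1 pays a reward of 1).

```python
def value_iteration(P, r, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Tabular value iteration for a finite, discounted MDP.

    P[a][s][t] is the probability of moving from state s to state t
    under action a; r[s][a] is the immediate reward (hypothetical layout).
    """
    n_states = len(r)
    n_actions = len(r[0])

    def q_value(J, s, a):
        # Immediate reward plus discounted expected value of the next state.
        return r[s][a] + gamma * sum(P[a][s][t] * J[t] for t in range(n_states))

    J = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman update: maximise over actions in every state.
        J_new = [max(q_value(J, s, a) for a in range(n_actions))
                 for s in range(n_states)]
        gap = max(abs(x - y) for x, y in zip(J_new, J))
        J = J_new
        if gap < tol:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [max(range(n_actions), key=lambda a: q_value(J, s, a))
              for s in range(n_states)]
    return J, policy


# Toy problem (assumed for illustration): 2 states, 2 actions.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
r = [
    [0.0, 0.0],  # state 0: no reward for either action
    [1.0, 0.0],  # state 1: reward 1 for staying
]

J_star, policy = value_iteration(P, r)
print(J_star, policy)
```

With gamma = 0.9 the fixed point of this toy problem can be checked by hand: staying in state 1 forever is worth 1/(1 - 0.9) = 10, and switching from state 0 is worth 0.9 × 10 = 9, so the iteration should return values close to these. The table J has one entry per state, which is exactly what becomes infeasible for large state spaces and motivates the approximate methods of lectures 5 and 6.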
[NDP] Neuro-Dynamic Programming, Dimitri P. Bertsekas and John Tsitsiklis, Athena Scientific, 1996.
[RL] Reinforcement Learning: An Introduction, R. Sutton and A. Barto, MIT Press, 2014.
[DRL] Deep Reinforcement Learning: A Brief Survey, K. Arulkumaran, M. P. Deisenroth, M. Brundage, A. A. Bharath, IEEE Signal Processing Magazine 34(6), 2017.
Assessment: individual projects on different theoretical aspects and applications of reinforcement learning. Please let me know by November 30 which topic you are choosing for your report. Submission deadline: December 20, 2018.
1. Hamilton-Jacobi-Bellman equations in optimal control (2 projects: deterministic/stochastic).
2. Implementing an RL framework for a 2D/3D minimum time problem with obstacles (2 projects).
3. Training a deep neural network with J* values (1 project).
4. A benchmark of different gradient methods (1 project).
5. Temporal difference methods (2 projects).
6. Approximation theory for neural networks (1 project).
7. Applications of your own interest (∞ projects).