Imperial College London

Emeritus Professor Berc Rustem

Faculty of Engineering, Department of Computing


Contact

 

+44 (0)20 7594 8345, b.rustem

 
 

Assistant

 

Dr Amani El-Kholy +44 (0)20 7594 8220

 

Location

 

361, Huxley Building, South Kensington Campus


Publications

Citation

BibTeX format

@article{Wiesemann:2012:10.1287/moor.1120.0566,
author = {Wiesemann, W and Kuhn, D and Rustem, B},
doi = {10.1287/moor.1120.0566},
journal = {Mathematics of Operations Research},
title = {Robust Markov Decision Processes},
url = {http://dx.doi.org/10.1287/moor.1120.0566},
year = {2012}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - Markov decision processes (MDPs) are powerful tools for decision making in uncertain dynamic environments. However, the solutions of MDPs are of limited practical use because of their sensitivity to distributional model parameters, which are typically unknown and have to be estimated by the decision maker. To counter the detrimental effects of estimation errors, we consider robust MDPs that offer probabilistic guarantees in view of the unknown parameters. To this end, we assume that an observation history of the MDP is available. Based on this history, we derive a confidence region that contains the unknown parameters with a prespecified probability 1-β. Afterward, we determine a policy that attains the highest worst-case performance over this confidence region. By construction, this policy achieves or exceeds its worst-case performance with a confidence of at least 1-β. Our method involves the solution of tractable conic programs of moderate size.
AU  - Wiesemann,W
AU  - Kuhn,D
AU  - Rustem,B
DO  - 10.1287/moor.1120.0566
PY  - 2012///
SN  - 1526-5471
TI  - Robust Markov Decision Processes
T2  - Mathematics of Operations Research
UR  - http://dx.doi.org/10.1287/moor.1120.0566
UR  - http://hdl.handle.net/10044/1/14216
ER  -