Imperial College London

Professor Yiannis Demiris

Faculty of Engineering, Department of Electrical and Electronic Engineering

Professor of Human-Centred Robotics, Head of ISN



+44 (0)20 7594 6300 · y.demiris




1011 Electrical Engineering, South Kensington Campus






BibTeX format

@inproceedings{wang2019red,
  author = {Wang, R and Ciliberto, C and Amadori, P and Demiris, Y},
  publisher = {Proceedings of International Conference on Machine Learning (ICML-2019)},
  title = {Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation},
  url = {},
  year = {2019}
}

RIS format (EndNote, RefMan)

AB - We consider the problem of imitation learning from a finite set of expert trajectories, without access to reinforcement signals. The classical approach of extracting the expert's reward function via inverse reinforcement learning, followed by reinforcement learning, is indirect and may be computationally expensive. Recent generative adversarial methods based on matching the policy distribution between the expert and the agent can be unstable during training. We propose a new framework for imitation learning by estimating the support of the expert policy to compute a fixed reward function, which allows us to re-frame imitation learning within the standard reinforcement learning setting. We demonstrate the efficacy of our reward function on both discrete and continuous domains, achieving comparable or better performance than the state of the art under different reinforcement learning algorithms.
AU - Wang,R
AU - Ciliberto,C
AU - Amadori,P
AU - Demiris,Y
PB - Proceedings of International Conference on Machine Learning (ICML-2019)
PY - 2019///
TI - Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation
UR -
ER -
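The abstract's core idea — estimating the support of the expert policy with a fixed, distillation-style reward — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the network sizes, the tanh architectures, the `SIGMA` scale, and the training hyperparameters are all assumptions chosen for readability. A predictor network is trained to match a fixed random target network only on expert state-action pairs; the prediction error then acts as a (negated) support estimate, yielding a high fixed reward on the expert's support.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for this sketch.
STATE_ACTION_DIM = 4   # dimensionality of a concatenated (state, action) vector
FEATURE_DIM = 8        # output size of the random target network
SIGMA = 1.0            # reward scale hyperparameter (assumed)

# Fixed, randomly initialised target network f(x) = tanh(W_target @ x).
W_target = rng.normal(size=(FEATURE_DIM, STATE_ACTION_DIM))

def target(x):
    return np.tanh(W_target @ x)

def predictor(x, W):
    # Predictor with the same architecture; W is the trainable parameter.
    return np.tanh(W @ x)

def train_predictor(expert_data, W, lr=0.1, epochs=500):
    # Plain per-sample gradient descent on the squared prediction error,
    # using only expert state-action pairs.
    for _ in range(epochs):
        for x in expert_data:
            y = target(x)
            pred = np.tanh(W @ x)
            err = pred - y                                   # (FEATURE_DIM,)
            grad = (err * (1.0 - pred**2))[:, None] * x[None, :]
            W -= lr * grad
    return W

def reward(x, W):
    # Fixed reward: close to 1 where the predictor matches the target
    # (i.e. on the estimated support of the expert policy), decaying
    # towards 0 as the prediction error grows.
    return float(np.exp(-SIGMA * np.sum((predictor(x, W) - target(x)) ** 2)))

# Toy expert data: state-action pairs clustered around one point.
expert = [np.array([1.0, 0.5, -0.5, 0.0]) + 0.01 * rng.normal(size=4)
          for _ in range(32)]
W_pred = train_predictor(expert, rng.normal(size=(FEATURE_DIM, STATE_ACTION_DIM)))

on_support = reward(expert[0], W_pred)                       # near 1 after training
off_support = reward(np.array([-2.0, 2.0, 2.0, -2.0]), W_pred)
```

Because the reward function is fixed once the predictor is trained, a standard reinforcement learning algorithm can then optimise it directly, which is the re-framing the abstract describes.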