Imperial College London

Professor Claudia Clopath

Faculty of Engineering, Department of Bioengineering

Professor of Computational Neuroscience
 
 
 
Contact

 

+44 (0)20 7594 1435
c.clopath

 
 
Location

 

4.09 Royal School of Mines, South Kensington Campus

Citation

BibTeX format

@inproceedings{Gogianu:2021,
author = {Gogianu, F and Berariu, T and Rosca, M and Clopath, C and Busoniu, L and Pascanu, R},
pages = {1--11},
publisher = {JMLR-JOURNAL MACHINE LEARNING RESEARCH},
title = {Spectral normalisation for deep reinforcement learning: an optimisation perspective},
url = {http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000683104603068&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=1ba7043ffcc86c417c072aa74d649202},
year = {2021}
}

RIS format (EndNote, RefMan)

TY  - CPAPER
AB  - Most of the recent deep reinforcement learning advances take an RL-centric perspective and focus on refinements of the training objective. We diverge from this view and show we can recover the performance of these developments not by changing the objective, but by regularising the value-function estimator. Constraining the Lipschitz constant of a single layer using spectral normalisation is sufficient to elevate the performance of a Categorical-DQN agent to that of a more elaborated RAINBOW agent on the challenging Atari domain. We conduct ablation studies to disentangle the various effects normalisation has on the learning dynamics and show that it is sufficient to modulate the parameter updates to recover most of the performance of spectral normalisation. These findings hint towards the need to also focus on the neural component and its learning dynamics to tackle the peculiarities of Deep Reinforcement Learning.
AU  - Gogianu,F
AU  - Berariu,T
AU  - Rosca,M
AU  - Clopath,C
AU  - Busoniu,L
AU  - Pascanu,R
EP  - 11
PB  - JMLR-JOURNAL MACHINE LEARNING RESEARCH
PY  - 2021///
SN  - 2640-3498
SP  - 1
TI  - Spectral normalisation for deep reinforcement learning: an optimisation perspective
UR  - http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000683104603068&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=1ba7043ffcc86c417c072aa74d649202
UR  - https://proceedings.mlr.press/v139/gogianu21a.html
UR  - http://hdl.handle.net/10044/1/92996
ER  - 