author = {Jamali, N and Kormushev, P and Ahmadzadeh, SR and Caldwell, DG},
title = {Covariance Analysis as a Measure of Policy Robustness in Reinforcement Learning},
year = {2014}

AB - In this paper we propose covariance analysis as a metric for reinforcement learning to improve the robustness of a learned policy. The local optima found during the exploration are analyzed in terms of the total cumulative reward and the local behavior of the system in the neighborhood of the optima. The analysis is performed in the solution space to select a policy that exhibits robustness in uncertain and noisy environments. We demonstrate the utility of the method using our previously developed system where an autonomous underwater vehicle (AUV) has to recover from a thruster failure. When a failure is detected the recovery system is invoked, which uses simulations to learn a new controller that utilizes the remaining functioning thrusters to achieve the goal of the AUV, that is, to reach a target position. In this paper, we use covariance analysis to examine the performance of the top n policies output by the previous algorithm. We propose a scoring metric that uses the output of the covariance analysis, the time it takes the AUV to reach the target position, and the distance between the target position and the AUV’s final position. The top policies are simulated in a noisy environment and evaluated using the proposed scoring metric to analyze the effect of noise on their performance. The policy that exhibits more tolerance to noise is selected. We show experimental results where covariance analysis successfully selects a more robust policy that was ranked lower by the original algorithm.
