UBC Theses and Dissertations

Beyond learning curves: understanding stochasticity and learned solution modes in reinforcement learning
Wilson, Matthew


While deep reinforcement learning (Deep RL) algorithms have successfully solved challenging decision-making and control tasks, their behavior often remains poorly understood. Studies and comparisons of algorithms often rely on impoverished, partial signals such as learning curves and individual rollout videos. In this work, we follow a line of research that examines more closely why algorithms produce different rewards from run to run and across tasks. We aim to go beyond learning curves and develop a more holistic view of both the optimization landscape of particular environments and the multimodal behaviors that algorithms produce in those environments. To this end, we develop a set of tools for comparing many runs of deep reinforcement learning algorithms and many rollouts from a single policy. We use these tools to answer a broad range of questions about RL.

Attribution 4.0 International