Beyond learning curves : understanding stochasiticity and learned solution modes in reinforcement learning

UBC Theses and Dissertations

Abstract

While deep reinforcement learning (Deep RL) algorithms have been used to successfully solve challenging decision-making and control tasks, their behavior often remains poorly understood. Studies and comparisons between algorithms are often done through impoverished and partial signals such as learning curves and individual rollout videos. In this work, we follow a tradition of work that examines more closely why algorithms produce different rewards from run to run on different tasks. We aim to go beyond learning curves and develop a more holistic view of both the optimization landscape of particular environments and the multimodal behaviors that algorithms produce in those environments. To this end, we develop a set of tools for comparing many runs of deep reinforcement learning algorithms and rollouts from a single policy. We use these tools to answer a broad range of questions about RL.

Item Metadata

Title	Beyond learning curves : understanding stochasiticity and learned solution modes in reinforcement learning
Creator	Wilson, Matthew
Supervisor	Van de Panne, M. (Michiel), 1965-
Publisher	University of British Columbia
Date Issued	2022
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2022-09-27
Provider	Vancouver : University of British Columbia Library
Rights	Attribution 4.0 International
DOI	10.14288/1.0420754
URI	http://hdl.handle.net/2429/82765
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2022-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by/4.0/
Aggregated Source Repository	DSpace
