Open Collections
UBC Theses and Dissertations
Beyond learning curves : understanding stochasiticity and learned solution modes in reinforcement learning
Wilson, Matthew
Abstract
While deep reinforcement learning (Deep RL) algorithms have been used to solve challenging decision-making and control tasks, their behavior often remains poorly understood. Studies and comparisons of algorithms often rely on impoverished and partial signals such as learning curves and individual rollout videos. In this work, we follow a tradition of research that investigates why algorithms produce different rewards from run to run on different tasks. We aim to go beyond learning curves and develop a more holistic view of both the optimization landscape of particular environments and the multimodal behaviors that algorithms produce in those environments. To this end, we develop a set of tools for comparing many runs of deep reinforcement learning algorithms, as well as rollouts from a single policy. We use these tools to answer a broad range of questions about RL.
Item Metadata
Title | Beyond learning curves : understanding stochasiticity and learned solution modes in reinforcement learning
Creator | Wilson, Matthew
Publisher | University of British Columbia
Date Issued | 2022
Language | eng
Date Available | 2022-09-27
Provider | Vancouver : University of British Columbia Library
Rights | Attribution 4.0 International
DOI | 10.14288/1.0420754
Degree Grantor | University of British Columbia
Graduation Date | 2022-11
Scholarly Level | Graduate
Aggregated Source Repository | DSpace