Applying Network Resets in Model-Based Reinforcement Learning with Deep Neural Networks in Constrained Data Settings
Akins, Seth Lincoln
Abstract
Artificial intelligence and reinforcement learning are becoming ever more prevalent in our society. In particular, applications and research involving deep neural networks have increased dramatically. These methods often require large amounts of data to train to the desired level of effectiveness, and collecting that data can be highly expensive, impractical, or outright impossible. Advancing algorithms that achieve high performance from small amounts of data is therefore essential; however, such algorithms often lose the ability to generalize to unseen data, or eventually lose their ability to learn from new data. Maintaining network plasticity while training on small data samples is a difficult problem. We propose ShrinkZero, an adaptation of the EfficientZero algorithm, which achieved state-of-the-art performance on the Atari benchmark while consuming only 100k frames of gameplay. ShrinkZero adds frequent network resets, increases the number of layers in the network, and modifies rollout lengths based on the time since the last reset. These features were shown to work well in Bigger Better Faster (BBF), a model-free algorithm that also achieved state-of-the-art performance on Atari with under 100k gameplay frames. However, ShrinkZero consistently underperforms EfficientZero and fails to leverage the resets that worked well in BBF: it achieves higher performance than both algorithms in only one Atari game and is frequently outperformed by humans by a large margin. Further work is needed to leverage these resets effectively in a model-based setting.
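As a rough illustration of the mechanisms the abstract names, the sketch below shows a shrink-and-perturb style parameter reset and a rollout-length schedule keyed to the number of gradient steps since the last reset, assuming a PyTorch implementation. The function names, interpolation coefficient, and annealing constants are hypothetical placeholders, not values taken from ShrinkZero or BBF.

```python
# Illustrative sketch of the two mechanisms named in the abstract: periodic
# shrink-and-perturb network resets and a rollout-length schedule driven by
# the number of gradient steps since the last reset. All names and constants
# are hypothetical, not the values used in ShrinkZero or BBF.
import copy
import torch
import torch.nn as nn


def shrink_and_perturb(network: nn.Module, alpha: float = 0.5) -> None:
    """Soft reset: interpolate each weight toward a freshly initialized copy,
    theta <- alpha * theta + (1 - alpha) * theta_fresh."""
    fresh = copy.deepcopy(network)
    for module in fresh.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            module.reset_parameters()  # re-draw weights from the default init
    with torch.no_grad():
        for p, p_fresh in zip(network.parameters(), fresh.parameters()):
            p.mul_(alpha).add_((1.0 - alpha) * p_fresh)


def rollout_length(steps_since_reset: int,
                   anneal_steps: int = 10_000,
                   start_len: int = 10,
                   end_len: int = 3) -> int:
    """Anneal the rollout/unroll length between resets. BBF anneals its
    n-step horizon from a longer to a shorter value after each reset; the
    schedule shape and endpoints here are placeholders."""
    frac = min(1.0, steps_since_reset / anneal_steps)
    return int(round(start_len + frac * (end_len - start_len)))
```

In a training loop, shrink_and_perturb would be invoked every fixed number of gradient steps, and rollout_length would set the unroll horizon used for each update based on how long ago the last reset occurred.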
Item Metadata
Title | Applying Network Resets in Model-Based Reinforcement Learning with Deep Neural Networks in Constrained Data Settings
Creator | Akins, Seth Lincoln
Date Issued | 2024-04
Genre |
Type |
Language | eng
Series |
Date Available | 2024-07-15
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0444155
URI |
Affiliation |
Peer Review Status | Unreviewed
Scholarly Level | Undergraduate
Copyright Holder | Seth Lincoln Akins
Rights URI |
Aggregated Source Repository | DSpace