UBC Undergraduate Research

Applying Network Resets in Model-Based Reinforcement Learning with Deep Neural Networks in Constrained Data Settings

Akins, Seth Lincoln

Abstract

Artificial intelligence and reinforcement learning are becoming increasingly prevalent in society, and applications and research involving deep neural networks have grown dramatically. These methods often require large amounts of data to train to the desired level of effectiveness, and collecting this data can be highly expensive, impractical, or outright impossible. Advancing algorithms that achieve high performance from small amounts of data is therefore essential; however, such algorithms often lose the ability to generalize to unseen data or eventually lose their ability to learn from new data. Maintaining network plasticity while training on small data samples is a difficult problem. We propose ShrinkZero, an adaptation of the EfficientZero algorithm. EfficientZero achieved state-of-the-art performance on Atari while consuming only 100k frames of gameplay. ShrinkZero adds frequent network resets, increases the number of layers in the network, and modifies the rollout lengths based on the time since the last reset. These features were shown to work well in Bigger Better Faster (BBF), a model-free algorithm that also achieved state-of-the-art performance on Atari with under 100k gameplay frames. ShrinkZero consistently underperforms EfficientZero and fails to leverage the resets that worked well in BBF. It achieves higher performance than both algorithms on only one Atari game and is frequently outperformed by humans by a large margin. Further work is needed to leverage these resets in a model-based setting.
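To make the reset-and-anneal idea concrete, the sketch below (Python/PyTorch) shows one plausible way to combine shrink-and-perturb network resets, as used in BBF, with a rollout length that depends on the number of gradient steps since the last reset. All names and constants here (shrink_and_perturb, RESET_INTERVAL, SHRINK_ALPHA, the rollout bounds) are illustrative assumptions for exposition, not the implementation or hyperparameters reported in the thesis.

```python
import copy
import torch
import torch.nn as nn

# Assumed constants for illustration only; not values from the thesis.
RESET_INTERVAL = 20_000            # gradient steps between resets (assumed)
SHRINK_ALPHA = 0.5                 # fraction of trained weights retained (assumed)
MAX_ROLLOUT, MIN_ROLLOUT = 10, 5   # assumed rollout-length bounds


def shrink_and_perturb(net: nn.Module, alpha: float = SHRINK_ALPHA) -> None:
    """Interpolate each parameter toward a freshly initialised copy,
    restoring plasticity while retaining part of what was learned."""
    fresh = copy.deepcopy(net)
    for p in fresh.parameters():            # re-initialise the copy
        if p.dim() > 1:
            nn.init.xavier_uniform_(p)
        else:
            nn.init.zeros_(p)
    with torch.no_grad():
        for p, q in zip(net.parameters(), fresh.parameters()):
            p.mul_(alpha).add_((1.0 - alpha) * q)


def rollout_length(steps_since_reset: int, anneal_steps: int = RESET_INTERVAL) -> int:
    """Anneal the unroll length from MAX_ROLLOUT down to MIN_ROLLOUT
    as training progresses after each reset."""
    frac = min(steps_since_reset / anneal_steps, 1.0)
    return round(MAX_ROLLOUT - frac * (MAX_ROLLOUT - MIN_ROLLOUT))


if __name__ == "__main__":
    net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))
    for step in range(1, 60_001):
        # ... one gradient update on `net` would happen here ...
        k = rollout_length(step % RESET_INTERVAL)  # unroll length for this update
        if step % RESET_INTERVAL == 0:
            shrink_and_perturb(net)                # periodic plasticity reset
```

In a full agent, shrink_and_perturb would be applied to the learned model and value/policy heads on the reset schedule, and rollout_length would determine how many steps the dynamics model is unrolled when constructing training targets; the exact networks reset and the direction of the annealing are design choices not specified in this abstract.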


Rights

Attribution-NonCommercial-NoDerivatives 4.0 International