UBC Theses and Dissertations


Reinforcement learning in complex environments with locally trained naïve agents

Gupta, Kashish


Reinforcement learning has long been advertised as the approach capable of intelligently mimicking and understanding human learning and behavior. While the field's advances are widely recognized, its application and extension to large, complex and highly dynamic environments remain inefficient, inaccurate or unsolved. As the complexity of objectives increases, the trend in reinforcement learning research is to tackle it with more computational power and more training samples. The inspiration for the proposed methodology is derived from human learning, where increasingly complex objectives are learned with limited computational power by re-purposing previously learned skills. An intuitive and elegant concept is presented which exploits abstract symmetries present in training environments to train a naïve agent and bypass some of the core challenges of reinforcement learning training. The naïve agent, trained in a local environment, can then be used in numerous ways to improve training in environments with high-dimensional state spaces. The proposed solution is combined with heuristic-based planning, learning from demonstration and state-space abstraction methods to demonstrate the efficacy and ease of adaptability of the concept across a range of domains. The proposed method provides a structured approach to the training process and improves the trained agent's generalization capabilities and training sample efficiency. While the presented concept benefits from additional information about the state-space structure, it is notably different from traditional reinforcement learning approaches, which augment the training process with expensive domain-specific knowledge to improve sample efficiency. Experiments and analyses are presented for challenging navigation and control environments, solved with augmented methods from the off-policy family of reinforcement learning.


Attribution-NonCommercial-NoDerivatives 4.0 International