Scalable deep reinforcement learning for physics-based motion control

Berseth, Glen

Abstract

This thesis studies the broad problem of learning robust control policies for difficult physics-based motion control tasks such as locomotion and navigation. A number of avenues are explored to assist in learning such control. In particular, are there underlying structures in the motor-learning system that enable learning solutions to complex tasks? How are animals able to learn new skills so efficiently? Animals may be learning and using implicit models of their environment to assist in planning and exploration. These potential structures motivate the design of learning systems, and in this thesis we study their effectiveness on physically simulated and robotic motor-control tasks. Five contributions that build on motion control using deep reinforcement learning are presented. First, a case study on the motion control problem of brachiation, the movement of gibbons through trees, is presented. This work compares parametric and non-parametric models for reinforcement learning. The difficulty of this motion control problem motivates separating the control problem into multiple levels. Second, a hierarchical decomposition is presented that enables efficient learning by operating across multiple time scales for a complex locomotion and navigation task. Reinforcement learning is first used to acquire a low-level, high-frequency policy for joint actuation, used for bipedal footstep-directed walking. Subsequently, an additional policy is learned that provides directed footstep plans to the first level of control in order to navigate through the environment. Third, improved action exploration methods are investigated. An explicit action-valued function is constructed using the learned model. Using this action-valued function, we can compute actions that increase the value of future states. Fourth, a new algorithm is designed to progressively learn and integrate new skills, producing a robust and multi-skilled physics-based controller. This algorithm combines the skills of experts and then applies transfer learning methods to initialize and accelerate the learning of new skills. Finally, in the last chapter, the importance of good benchmarks for improving reinforcement learning research is discussed. The computer vision community has benefited from large, carefully processed collections of data, and, similarly, reinforcement learning needs well-constructed and interesting environments to drive progress.

Item Metadata

Title	Scalable deep reinforcement learning for physics-based motion control
Creator	Berseth, Glen
Publisher	University of British Columbia
Date Issued	2019
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2019-04-10
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0378079
URI	http://hdl.handle.net/2429/69572
Degree	Doctor of Philosophy - PhD
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2019-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
