Reinforcement learning using sensorimotor traces

UBC Theses and Dissertations

Reinforcement learning using sensorimotor traces Li, Jingxian

Abstract

The skilled motions of humans and animals are the result of learning good solutions to difficult sensorimotor control problems. This thesis explores new models for using reinforcement learning to acquire motion skills, with potential applications to computer animation and robotics. Reinforcement learning offers a principled methodology for tackling control problems. However, it is difficult to apply in high-dimensional settings such as the ones we wish to explore, where the body can have many degrees of freedom, the environment can be highly complex, and the sensory representations available for perceiving the state of the body and the environment can contain further redundancies. In this context, the challenges include: a state space that cannot be fully explored; the need to model how the state of the body and the perceived state of the environment evolve together over time; and the need for solutions that work with only a small number of sensorimotor experiences. Our contribution is a reinforcement learning method that implicitly represents the current state of the body and the environment using sensorimotor traces. A distance metric is defined between the ongoing sensorimotor trace and previously experienced sensorimotor traces, and this metric is used to model the current state as a weighted mixture of past experiences. Sensorimotor traces play multiple roles in our method: they provide an embodied representation of the state (and therefore also of the value function and the optimal actions), and they provide an embodied model of the system dynamics. In our implementation, we focus specifically on learning steering behaviors for a vehicle driving along straight roads, winding roads, and through intersections. The vehicle is equipped with a set of distance sensors. We apply value iteration using off-policy experiences to produce control policies capable of steering the vehicle in a wide range of circumstances. We provide an experimental analysis of the effects of various design choices. In the future, we expect that similar ideas can be applied to other high-dimensional systems, such as bipedal systems capable of walking over variable terrain, also driven by control policies based on sensorimotor traces.
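
The abstract names three ingredients: a distance metric between sensorimotor traces, a state modeled as a weighted mixture of past experiences, and value iteration over off-policy experiences. The following is a minimal sketch of how these pieces might fit together, not the thesis's actual implementation. Everything specific here is an assumption made for illustration: fixed-length trace windows stored as arrays, a Euclidean distance with a Gaussian kernel, a small discrete action set, and all function and parameter names (`trace_weights`, `value_iteration`, `bandwidth`, `gamma`).

```python
import numpy as np


def trace_weights(query, traces, bandwidth=1.0):
    """Normalized similarity of one trace window to N stored trace windows.

    query:  shape (T, D), the ongoing trace (assumed layout: T steps, D channels)
    traces: shape (N, T, D), previously experienced traces
    """
    diffs = (traces - query).reshape(len(traces), -1)
    sq_dist = np.sum(diffs ** 2, axis=1)
    # Shift by the smallest distance so the closest trace always gets weight 1
    # before normalization; this avoids all weights underflowing to zero.
    w = np.exp(-(sq_dist - sq_dist.min()) / bandwidth ** 2)
    return w / w.sum()


def value_iteration(traces, actions, rewards, next_traces, n_actions,
                    gamma=0.9, bandwidth=1.0, n_iters=50):
    """Kernel-weighted value iteration over a fixed batch of experiences.

    Experience k is (traces[k], actions[k], rewards[k], next_traces[k]).
    Returns a value per stored trace and a table q[j, a] of action values.
    """
    n = len(traces)
    # sim_now[j, k]:  similarity of stored trace j to stored trace k
    # sim_next[k, j]: similarity of the successor of experience k to trace j
    sim_now = np.stack([trace_weights(t, traces, bandwidth) for t in traces])
    sim_next = np.stack([trace_weights(t, traces, bandwidth) for t in next_traces])

    values = np.zeros(n)
    q = np.zeros((n, n_actions))
    for _ in range(n_iters):
        # One-step backup for every experience; the successor's value is a
        # weighted mixture of the values of the stored traces.
        targets = rewards + gamma * (sim_next @ values)
        q = np.full((n, n_actions), -np.inf)
        for a in range(n_actions):
            mask = actions == a
            if mask.any():
                w = sim_now[:, mask]
                # Small constant guards against division by zero when no
                # experience with action a lies near a given trace.
                q[:, a] = (w @ targets[mask]) / (w.sum(axis=1) + 1e-12)
        values = q.max(axis=1)
    return values, q
```

Under these assumptions, a controller would compute `trace_weights` for the ongoing trace at run time, mix the rows of `q` with those weights, and pick the action with the highest mixed value. Because the backup only uses stored transitions, the experiences can come from any behavior policy, which is one way to read the abstract's reference to off-policy experiences.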

Item Metadata

Title	Reinforcement learning using sensorimotor traces
Creator	Li, Jingxian
Publisher	University of British Columbia
Date Issued	2013
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2013-12-05
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0103383
URI	http://hdl.handle.net/2429/45590
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2014-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
