Open Collections
UBC Theses and Dissertations
Reinforcement learning using sensorimotor traces
Li, Jingxian
Abstract
The skilled motions of humans and animals are the result of learning good solutions to difficult sensorimotor control problems. This thesis explores new models for using reinforcement learning to acquire motion skills, with potential applications to computer animation and robotics. Reinforcement learning offers a principled methodology for tackling control problems. However, it is difficult to apply in high-dimensional settings, such as the ones that we wish to explore, where the body can have many degrees of freedom, the environment can have significant complexity, and there can be further redundancies that exist in the sensory representations that are available to perceive the state of the body and the environment. In this context, challenges to overcome include: a state space that cannot be fully explored; the need to model how the state of the body and the perceived state of the environment evolve together over time; and solutions that can work with only a small number of sensorimotor experiences.
Our contribution is a reinforcement learning method that implicitly represents the current state of the body and the environment using sensorimotor traces. A distance metric is defined between the ongoing sensorimotor trace and previously experienced sensorimotor traces and this is used to model the current state as a weighted mixture of past experiences. Sensorimotor traces play multiple roles in our method: they provide an embodied representation of the state (and therefore also the value function and the optimal actions), and they provide an embodied model of the system dynamics.
In our implementation, we focus specifically on learning steering behaviors for a vehicle driving along straight roads, winding roads, and through intersections. The vehicle is equipped with a set of distance sensors. We apply value-iteration using off-policy experiences in order to produce control policies capable of steering the vehicle in a wide range of circumstances. An experimental analysis is provided of the effect of various design choices.
In the future we expect that similar ideas can be applied to other high-dimensional systems, such as bipedal systems that are capable of walking over variable terrain, also driven by control policies based on sensorimotor traces.
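The abstract's idea of modeling the current state as a weighted mixture of past experiences can be illustrated with a small sketch. This is not the thesis's implementation: the Euclidean distance, the Gaussian kernel, the `bandwidth` parameter, and all function names below are assumptions chosen only for illustration, and a trace is assumed to be a flat list of sensor and action readings of fixed length.

```python
import math

def trace_distance(a, b):
    """Euclidean distance between two equal-length traces (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mixture_weights(current, stored, bandwidth=1.0):
    """Normalized Gaussian-kernel weights of the current trace over stored traces."""
    dists = [trace_distance(current, s) for s in stored]
    scores = [math.exp(-((d / bandwidth) ** 2)) for d in dists]
    total = sum(scores)
    if total == 0.0:
        # All kernels underflowed: put all weight on the nearest stored trace.
        nearest = dists.index(min(dists))
        return [1.0 if i == nearest else 0.0 for i in range(len(stored))]
    return [s / total for s in scores]

def estimate_value(current, stored, values, bandwidth=1.0):
    """Value of the current trace as a weighted mixture of stored trace values."""
    weights = mixture_weights(current, stored, bandwidth)
    return sum(w * v for w, v in zip(weights, values))

# Two hypothetical stored traces with hypothetical values attached.
stored_traces = [[0.0, 0.0], [1.0, 1.0]]
stored_values = [0.0, 10.0]
midpoint_value = estimate_value([0.5, 0.5], stored_traces, stored_values)
```

Here a query trace equidistant from both stored traces receives equal weights, so its estimated value is the average of the two stored values. A smaller `bandwidth` makes the estimate behave more like a nearest-neighbor lookup.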
Item Metadata
Title |
Reinforcement learning using sensorimotor traces
|
Creator |
Li, Jingxian
|
Publisher |
University of British Columbia
|
Date Issued |
2013
|
Genre | |
Type | |
Language |
eng
|
Date Available |
2013-12-05
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivatives 4.0 International
|
DOI |
10.14288/1.0103383
|
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor |
University of British Columbia
|
Graduation Date |
2014-05
|
Campus | |
Scholarly Level |
Graduate
|
Rights URI | |
Aggregated Source Repository |
DSpace
|