Nonparametric learning from examples in very high dimensional spaces
Grudic, Gregory Zlatko
Abstract
Constructing predictive models or mappings from sample data is an important goal in
many areas of science. Most of the research in this area has been directed towards relatively low
dimensional models; however, many real world problems can be large and very high dimensional.
Such problems require general learning methods which are fundamentally different from those
used to construct low dimensional models. In this thesis a new nonparametric regression methodology
(SPORE) is proposed for very large and very high dimensional learning problems:
problems with more than 10,000 learning examples of regression data having 100 or more inputs.
The SPORE regression model is constructed incrementally by adding small, low dimensional
parametric building blocks one at a time, using the outputs of previously added blocks as inputs to
the new ones. This process forms stable regression functions from relatively few training
examples, even when there are a large number of input variables. Furthermore, SPORE demands
little computational effort to choose between candidate building blocks or inputs, making it
computationally feasible in very high dimensional spaces. SPORE challenges two basic
mainstream notions found in contemporary learning algorithms. First, it questions the need to
simultaneously fit large, high dimensional structures to model complex high dimensional
interactions. Second, it rejects the need for "greedy", computationally expensive searches used to
find the next "best" building block to add to a regression function. SPORE also allows for the
subdivision of the domain of the input space to make incremental construction both computationally
and theoretically feasible. Conditions under which the rate of convergence of the method is
independent of the dimension of the data are established. It is also shown that the computational
complexity of constructing a SPORE-type regression model is linear with respect to dimension
within each domain subdivision. In addition, conditions are given under which no domain
subdivision is necessary.
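The abstract gives no pseudocode, but the incremental construction described above can be illustrated with a small sketch. In the Python sketch below, each building block is a two-input quadratic fit by least squares, taking the previous block's output together with one raw input; the next input is chosen at random rather than by any search, and the number of blocks is fixed in advance. These choices (block form, input selection, stopping rule) are illustrative assumptions, not details of SPORE or SPORE-1.

```python
import numpy as np

def fit_block(z, xj, y):
    """Least-squares fit of a small two-input quadratic block g(z, xj) ~ y."""
    A = np.column_stack([np.ones_like(z), z, xj, z * xj, z**2, xj**2])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def eval_block(coef, z, xj):
    A = np.column_stack([np.ones_like(z), z, xj, z * xj, z**2, xj**2])
    return A @ coef

def build_cascade(X, y, n_blocks=50, seed=None):
    """Add low dimensional blocks one at a time; each block's output feeds the next."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    z = np.full(n, y.mean())          # start from a constant predictor
    blocks = []                       # list of (input index, coefficients)
    for _ in range(n_blocks):
        j = int(rng.integers(d))      # cheap choice of the next raw input (no greedy search)
        coef = fit_block(z, X[:, j], y)
        blocks.append((j, coef))
        z = eval_block(coef, z, X[:, j])
    return y.mean(), blocks

def predict(model, X):
    z0, blocks = model
    z = np.full(X.shape[0], z0)
    for j, coef in blocks:
        z = eval_block(coef, z, X[:, j])
    return z
```

Because each block fits only a handful of parameters over two inputs, the per-block cost in this sketch grows linearly with the number of training examples and is independent of how many raw inputs exist beyond the one selected, which is the spirit of the complexity claim above.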
The most basic version of this regression methodology (SPORE-1) is implemented and
empirically evaluated on four types of data sets. The SPORE-1 learning algorithm is completely
automatic and requires no manual intervention. First, SPORE-1 is applied to 10 regression
problems found in the literature and is shown to produce regression functions which are, with
respect to mean squared approximation error, as good as or better than published results on 9 of these
data sets. Second, SPORE-1 is applied to 15 new, synthetic, large, very high dimensional data sets
(40,000 learning examples of 100, 200, 400, 800, and 1600 inputs) and is shown to construct
effective regression functions in the presence of both input and output noise. Third, SPORE-1 is
used to build mappings from input/output learning examples generated by a human using tele-operation
to execute an 'object locate and approach' task sequence. SPORE-1 effectively builds this
mapping and directs a robot to autonomously execute the task demonstrated by the human
operator, even in a visually cluttered environment with an unpredictable background. Finally,
SPORE-1 is successfully applied to the 10-bit parity problem to demonstrate its efficacy on
problems which have flat low dimensional projections, thus showing that it is not subject to the
same limitations as other algorithms that build regression functions using low dimensional
parametric building blocks, added one at a time.
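For reference, the 10-bit parity data set used in the final experiment is small and easy to reproduce: all 1024 binary input combinations, each labelled with its parity. The sketch below enumerates it in Python; treating parity as a 0/1 regression target is an assumption about the encoding, which the abstract does not specify.

```python
import numpy as np
from itertools import product

def parity_dataset(n_bits=10):
    """All 2**n_bits binary input vectors with their parity bit as the target."""
    X = np.array(list(product([0.0, 1.0], repeat=n_bits)))
    y = X.sum(axis=1) % 2          # 1 if an odd number of bits are set, else 0
    return X, y

X, y = parity_dataset(10)          # 1024 examples, 10 inputs each
```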
Item Metadata

| Field | Value |
| --- | --- |
| Title | Nonparametric learning from examples in very high dimensional spaces |
| Creator | Grudic, Gregory Zlatko |
| Publisher | University of British Columbia |
| Date Issued | 1997 |
| Extent | 9489394 bytes |
| File Format | application/pdf |
| Language | eng |
| Date Available | 2009-03-31 |
| Provider | Vancouver : University of British Columbia Library |
| Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
| DOI | 10.14288/1.0065207 |
| Degree Grantor | University of British Columbia |
| Graduation Date | 1997-11 |
| Scholarly Level | Graduate |
| Aggregated Source Repository | DSpace |