NONPARAMETRIC LEARNING FROM EXAMPLES IN VERY HIGH DIMENSIONAL SPACES

By

GREGORY ZLATKO GRUDIC
B.A.Sc. (Engineering Physics), University of British Columbia, 1987
M.A.Sc. (Electrical Engineering), University of British Columbia, 1990

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
August 1997
(c) Gregory Zlatko Grudic, 1997

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Electrical and Computer Engineering
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

Date:

Abstract

Constructing predictive models or mappings from sample data is an important goal in many areas of science. Most of the research in this area has been directed towards relatively low dimensional models; however, many real world problems can be large and very high dimensional. Such problems require general learning methods which are fundamentally different from those used to construct low dimensional models. In this thesis a new nonparametric regression methodology (SPORE) is proposed for very large and very high dimensional learning problems: problems with more than 10,000 learning examples of regression data having 100 or more inputs. The SPORE regression model is constructed incrementally by adding small, low dimensional parametric building blocks one at a time, using the outputs of previously added blocks as inputs to the new ones. This process forms stable regression functions from relatively few training examples, even when there are a large number of input variables. Furthermore, SPORE demands little computational effort to choose between candidate building blocks or inputs, making it computationally feasible in very high dimensional spaces.

SPORE challenges two basic mainstream notions found in contemporary learning algorithms. First, it questions the need to simultaneously fit large high dimensional structures to model complex high dimensional interactions. Second, it rejects the need for "greedy", computationally expensive searches used to find the next "best" building block to add to a regression function. SPORE also allows for the subdivision of the domain of the input space to make incremental construction both computationally and theoretically feasible. Conditions under which the rate of convergence of the method is independent of the dimension of the data are established. It is also shown that the computational complexity of constructing a SPORE-type regression model is linear with respect to dimension within each domain subdivision. In addition, conditions are given under which no domain subdivision is necessary.
The most basic version of this regression methodology (SPORE-1) is implemented and empirically evaluated on four types of data sets. The SPORE-1 learning algorithm is completely automatic and requires no manual intervention. First, SPORE-1 is applied to 10 regression problems found in the literature and is shown to produce regression functions which are as good or better, with respect to mean squared approximation error, than published results on 9 of these data sets. Second, SPORE-1 is applied to 15 new, synthetic, large, very high dimensional data sets (40,000 learning examples of 100, 200, 400, 800, and 1600 inputs) and is shown to construct effective regression functions in the presence of both input and output noise. Third, SPORE-1 is used to build mappings from input/output learning examples generated by a human using teleoperation to execute an 'object locate and approach' task sequence. SPORE-1 effectively builds this mapping and directs a robot to autonomously execute the task demonstrated by the human operator, even in a visually cluttered environment with an unpredictable background. Finally, SPORE-1 is successfully applied to the 10-bit parity problem to demonstrate its efficacy on problems which have flat low dimensional projections, thus showing that it is not subject to the same limitations as other algorithms that build regression functions using low dimensional parametric building blocks, added one at a time.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgment
Chapter 1: Introduction
  1.1 Constructing Models from Data
  1.2 The Curse of Dimensionality
    1.2.1 Rate of Convergence of Nonparametric Regression
    1.2.2 Computational Complexity of Nonparametric Regression
  1.3 The SPORE Approximation
    1.3.1 A Conceptual Summary of SPORE
    1.3.2 Theoretical Results
    1.3.3 The SPORE-1 Regression Function
    1.3.4 Why Does SPORE-1 Work and When Does It Work Best?
    1.3.5 Experimental Evaluation of the SPORE-1 Approximation
  1.4 Plan of Presentation
Chapter 2: Review of Nonparametric Regression
  2.1 Regression Algorithms: Addressing the Curse of Dimensionality
    2.1.1 Dimensionality Reducing Algorithms
    2.1.2 Space Partitioning Algorithms
  2.2 Regression Model Variance Reduction Techniques
  2.3 Convergence Results in Nonparametric Regression
  2.4 Comparing Existing Algorithms to the SPORE
Chapter 3: The SPORE-1 Regression Function
  3.1 SPORE-1: A Cascade of Two Dimensional Functions
    3.1.1 A Cascade of Bounded Continuous 2 Dimensional Functions
    3.1.2 A Cascade of Finite Order 2 Dimensional Polynomials
  3.2 Learning Algorithm for the SPORE-1 Regression Function
    3.2.1 The SPORE-1 Learning Algorithm
  3.3 Theoretical Analysis of the SPORE-1 Learning Algorithm
    3.3.1 Convergence Results
    3.3.2 Complexity Results
    3.3.3 Internal Variance Reduction
    3.3.4 Virtual Space Partitioning
  3.4 Summary
Chapter 4: Experimental Results
  4.1 Implementation Details of the SPORE-1 Regression Function
  4.2 Evaluating SPORE-1 on Standard Regression Problems
    4.2.1 Regression Data Used by Breiman
    4.2.2 Regression Data Used by Rasmussen
    4.2.3 Regression Data Used by Jordan and Jacobs
    4.2.4 Experimental Results
  4.3 Evaluating SPORE-1 on Synthetic High Dimensional Problems
  4.4 Human-to-Robot Skill Transfer Experiments
    4.4.1 The Experimental Testbed
    4.4.2 Experimental Results
  4.5 The 10 Bit Parity Problem
  4.6 Summary
Chapter 5: The General SPORE Methodology
  5.1 The Complete SPORE Definition
  5.2 Theoretical Analysis of Rates of Convergence For Two Specific SPORE Structures
    5.2.1 The SPORE-2 Regression Function
    5.2.2 The SPORE-1 Regression Function with Domain Partitioning
  5.3 Rate of Convergence and Computational Complexity
  5.4 Summary
Chapter 6: Conclusion
  6.1 Future Work
Bibliography
Appendix A. Approximating 3 Dimensional Functions as a Cascade
Appendix B. Constructing The SPORE-2 Structure
Appendix C. Rate of Convergence
Appendix D. The Main Theoretical Constructions

List of Tables

Table 4.1 Performance of SPORE-1 on Published Data
Table 4.2 Evaluation of SPORE-1 on High Dimensional Data Sets

List of Figures

Figure 1.1 The Mapping Problem
Figure 1.2 The SPORE-1 Structure
Figure 2.1 Model Variance
Figure 3.1 The SPORE-1 Regression Function
Figure 3.2 A Cascade Representation of a Polynomial Term
Figure 4.1 Increase in Approximation Size with Data Dimension
Figure 4.2 The Human to Robot Skill Transfer Test Bed
Figure 4.3 A "Tall Spruce Tree"
Figure 4.4 A Red and Blue Striped Tower
Figure 4.5 A Red and Blue Striped Tower with Holes
Figure 5.1 The SPORE Structure
Figure 5.2 A Tree-Like Subdivision of Input Space
Figure 5.3 A 4 Dimensional Example of SPORE-2

Acknowledgment

Many thanks to my supervisor Peter Lawrence for his support, patience, and above all, for never showing any doubt. His perspective on my research is invaluable to me and I could not have hoped for a better supervisor.

I am eternally grateful to my wife, Jane Mulligan, for being my best friend and most valued colleague.
Much of the work presented here is shaped by her insights, and she has spent many long hours reading and re-reading various versions of this thesis.

This thesis is dedicated to my mother Julijana and my father Josip.

Chapter 1: Introduction

All scientific predictions are based on models of the phenomena being predicted. This is equally true when one is predicting the path of a projectile as when one is trying to predict tomorrow's weather. All of these models are constructed starting from empirical observations of the phenomena of interest. These empirical observations, in whatever form they occur, are in essence input and output examples of what the model is intended to predict. The resulting predictive models can be of causal relationships between input and output or vice versa (such as predicting the path of a projectile given initial velocity, or conversely, predicting the initial velocity given its path), or they can be models reflecting some correlation between variables arbitrarily labelled as inputs and outputs, that have other, as yet unidentified, causal factors. Whatever the nature of the phenomenon in question, the goal is always the same: to convert empirical observations into a model structure which allows accurate prediction of future observations.

Building models from data often has two specific goals: the first is to predict output values given future inputs; the second is to use the resulting model to interpret or analyze the data. Interpretation of models involves analyzing which inputs most affect the output, as well as how these inputs interact with one another. Such analysis can reveal important information about the phenomenon that generated the data. This is especially true when relatively few input variables can effectively predict output values. However, the specific focus of this thesis is model prediction accuracy, and thus the topic of data analysis is largely ignored. One justification for this is that the models we are interested in are very high dimensional, making analysis difficult when many inputs combine in complicated nonlinear ways.

Most major theoretical and practical work on general methods for building models has been done in mathematics, in the form of function approximation, and in statistics, in the form of regression. More recently, the neural network and machine learning communities have addressed this general modeling problem, mainly using variations of tools which have their origins in mathematics and statistics. Many of these endeavors have met with much success, and many diverse fields, ranging from the physical to social sciences, have benefited from the modeling tools that have emerged.

In studying the literature on general modeling methods, it is apparent that most of the theoretical and practical work in this area has been applied to problems which are relatively small and low dimensional. Relatively little effort has been specifically directed towards general modeling methods where there are many potential inputs, and the model is to be built based on a very large data set. Numerous examples of large high dimensional regression problems can be found in such diverse fields as meteorology, economics, and robotics. In meteorology, one can imagine predicting tomorrow's weather given temperature, pressure, humidity, and cloud covering readings from hundreds of weather stations surrounding the area of interest.
In economics, predicting the major economic indicators based on a detailed time sampling of various economic indices can be a profitable venture. In robotics, building mappings between a robot's sensor inputs and actuator outputs can be considered to be a problem in regression, if, as in human-to-robot skill transfer, one has access to examples of the mapping (see Chapter 4 and [Grudic and Lawrence, 1995] [Grudic and Lawrence, 1996] for details). All of these regression problems have one characteristic in common: a large number of factors influence the output, and therefore, it is not easy to choose a small set of inputs which effectively explain the output.

The goal of this thesis is the theoretical and experimental study of building very high dimensional models from very large data sets. The type of modeling problem addressed typically has at least 100 inputs, and there are 10,000 or more empirical input-to-output examples of the phenomenon of interest. To this end, a methodology, termed SPORE, for building such regression models is presented and theoretically and experimentally analyzed. From a practical standpoint the end result of this thesis is a regression algorithm and structure, called SPORE-1, which can be used to build such large models. A key characteristic of SPORE-1 is that no a priori knowledge of the phenomenon of interest is required for model construction: all that the user must provide is a file containing the learning data.

1.1 Constructing Models from Data

The problem being addressed in this thesis is deceptively simple to state. There is a data set containing M input/output examples (x_i, y_i) for i = 1, ..., M, symbolized by {x_i, y_i}_{i=1}^M. The inputs x = (x_1, ..., x_N) are N dimensional and the outputs y = (y_1, ..., y_k) are k dimensional. The goal is to find some mapping function f̂ such that:

\hat{y} = \hat{f}(x)     (1.1)

where ŷ is some estimate of y given x. Once constructed, f̂ can then be used to predict y given some future input x. In the general case, the data {x_i, y_i}_{i=1}^M is generated by some unknown phenomenon or function f(χ), where χ = (u, v) is a P dimensional input vector. χ can have 2 types of inputs: inputs u which do not appear in the data set {x_i, y_i}_{i=1}^M, and inputs v which do appear in the data {x_i, y_i}_{i=1}^M. Therefore, one is often required to construct f̂ using an incomplete set of inputs (i.e. we may not have access to the inputs u). A further difficulty arises when the accessible inputs x include some unknown set of inputs which do not affect the output vector y (i.e. not all inputs found in x will also be found in v). This is diagrammatically represented in Figure 1.1. Not knowing which inputs actually have some definable relationship to the output is just one of the potential difficulties encountered in the general model construction problem.

[Figure 1.1: The Mapping Problem]
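To make the setup of Figure 1.1 concrete, the short Python sketch below is an illustrative example only (the toy target function and all names are invented here, not taken from the thesis): the recorded inputs x contain the relevant inputs v together with irrelevant ones, while an unobserved input u also drives the output and therefore looks like noise to any model built from {x_i, y_i} alone.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 1000                                      # number of input/output examples

# Unknown phenomenon f(chi), chi = (u, v): u is never recorded, v is.
u = rng.normal(size=M)                        # hidden input (not in the data set)
v = rng.uniform(-1, 1, size=(M, 3))           # recorded inputs that affect the output
irrelevant = rng.uniform(-1, 1, size=(M, 2))  # recorded inputs with no effect on y

# A toy target function, invented purely for illustration.
y = np.sin(v[:, 0]) + v[:, 1] * v[:, 2] + 0.3 * u

# The learner only ever sees (x_i, y_i): x mixes relevant and irrelevant inputs,
# and the contribution of the hidden input u is indistinguishable from noise.
x = np.hstack([v, irrelevant])
print(x.shape, y.shape)                       # (1000, 5) (1000,)
```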
The process of constructing models from data has been studied in a variety of fields and, not surprisingly, has been given a different name in each field. In applied mathematics this process is termed function approximation, and one speaks of approximating the dependent variable y as some function of the independent variable x. In statistics, the process of model building is called regression, and the terminology used is as follows: estimate the responses y given the explanatory or predictor variables x. More recently, the machine learning and neural network communities have termed this type of model building supervised learning, and typically refer to learning the mapping between input variables x and output variables y. In this thesis, the process of building models from data is studied from the point of view of applied mathematics (function approximation) and statistics (regression).

Referring to Figure 1.1, function approximation typically deals with the case of u = ∅ (i.e. the empty set) and v = x. Thus, function approximation assumes that the independent variables x exactly define the dependent variables y, and the goal is to determine how well one can model f given some defined structure for the approximating function f̂. Regression, on the other hand, deals with the case when u ≠ ∅ and/or v ≠ x. Typically, this is represented in the following form:

y = f(x) + \epsilon     (1.2)

where ε is a noise term or, more formally, a random variable obeying some probability distribution. In most instances, ε is assumed to be normally distributed with mean zero and fixed standard deviation. Hence, from the perspective of regression, not knowing whether the given inputs completely define the output is treated as estimation noise.

There are two basic approaches to constructing the function f̂: parametric and nonparametric. In the parametric approach, knowledge about the nature of the problem is used to define the structure and size of the function f̂. Typically f̂ will contain a set of parameters that need to be assigned values using some numerical algorithm such as gradient descent, least squares, or simulated annealing (to mention just a few). For example, if f̂ is a polynomial with a finite number of terms, then the parameters which are "learned" using sample data are the polynomial coefficients. A key characteristic of the parametric method is that because both structure and size are fixed, the space or class of functions representable by any single model f̂ is inherently constrained. It is because of this characteristic that parametric models are often fast to compute and extremely effective when they are representative of the phenomenon being studied. Parametric models are in fact the best form of model to use when the nature of the problem is well understood.

In contrast, nonparametric models are most useful when the nature of the problem is poorly understood. In nonparametric modeling, the basic structure of the model is usually defined, but not the number of parameters in the final constructed function f̂. One example of a nonparametric model is a polynomial which does not have a fixed number of coefficients. A nonparametric model grows in size in order to effectively model the desired data. Thus, in the case of a polynomial, new terms are added until a sufficient model is constructed. In theory, a nonparametric model is able to represent a greater space of functions than a parametric model. However, because nonparametric regression does not have a fixed parameter space, the task of constructing models becomes considerably more difficult than with parametric models. As a result, nonparametric models are typically more time consuming to construct. The specific problem being addressed in this thesis is model construction when little or no information about the phenomenon being studied is available. Thus, the main focus of this thesis is nonparametric model construction, or nonparametric regression.
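The following sketch is an illustration of the parametric/nonparametric contrast on a single input variable, not an algorithm from the thesis: the parametric fit fixes the polynomial degree in advance, while the simple nonparametric variant keeps growing the polynomial until the training error falls below a tolerance (both the tolerance and the toy data are assumptions made for this example).

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 200)
y = np.sin(3 * x) + 0.1 * rng.normal(size=200)   # y = f(x) + noise

# Parametric: the structure and size (degree 3) are fixed in advance.
parametric_coeffs = np.polyfit(x, y, deg=3)

# Nonparametric (in spirit): grow the model, adding terms until it suffices.
tolerance = 0.02
for degree in range(1, 20):
    coeffs = np.polyfit(x, y, deg=degree)
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    if mse < tolerance:        # stop growing once the fit is "sufficient"
        break

print("parametric degree: 3, nonparametric degree chosen:", degree)
```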
As indicated above, there are difficulties associated with nonparametric regression. One of the most pervasive of these difficulties is the curse of dimensionality [Bellman, 1961], which is briefly addressed below.

1.2 The Curse of Dimensionality

From a practical standpoint, the curse of dimensionality has two major implications for high dimensional nonparametric regression. The first results from the rate of convergence of nonparametric regression, and the second from the computational complexity of nonparametric regression. If we are to build regression algorithms which work well in high dimensional spaces, we must find ways of addressing both of these issues.

1.2.1 Rate of Convergence of Nonparametric Regression

There are a number of theoretical results associated with the rate of convergence of nonparametric regression (see Section 2.3 of Chapter 2 for details). In this thesis, we are mainly concerned with rate of convergence results which measure the decrease in approximation error as the number of sample points used to construct the regression function increases. For example, Stone [Stone, 1982] has shown that in general, and under appropriate regularity conditions, the number of sample points required to approximate a bounded continuous N dimensional function which is at least once differentiable and defined over a closed domain grows exponentially with dimension N. This means that it is not feasible to densely sample high dimensional space (e.g. 10 or more input variables), and therefore, in order to approximate high dimensional functions, one must impose some type of smoothness constraint on the approximation [Friedman, 1994b]. In fact, all learning methods impose such a constraint. Nonparametric methods differ from parametric methods in that they impose fewer such constraints, and are thus able to form more flexible approximations.

Imposing a smoothness constraint on the approximation means that fewer sample points are required to determine the parameters of the approximation. However, as is argued throughout this thesis and elsewhere [Friedman, 1994b], the problem associated with approximating high dimensional functions is not strictly due to the number of input variables, but rather due to the underlying complexity of the function being approximated. We measure complexity as the number of samples necessary to build an effective nonparametric approximation (see Section 4.3 of Chapter 4 for an example). Clearly it is possible to have a very complicated 1 dimensional function, or a very simple high dimensional function. Thus, one can argue that a regression problem is difficult because it requires a large number of learning examples, which is not by necessity related to the dimension of the problem. Our goal, as model builders, should therefore be to find regression algorithms which perform well in high dimensional spaces when relatively few learning examples (i.e. tens of thousands) are sufficient to build adequate models.

The goal, when formulating structures for very high dimensional nonparametric regression, is two-fold. First, we must ensure that the rate of convergence of the regression function is not overly dependent on the dimension of the underlying function which generated the learning data. In other words, the rate at which the nonparametric regression function converges to some stable function should not depend adversely on the dimension of the problem.
Second, one must have an understanding of what class of functions is representable by a given nonparametric regression technique. If the data being used to construct the regression function is generated by a function which belongs to this class, then the regression function will converge to it. In this case the approximation error converges to zero. However, if the function generating the sample points does not belong to this class of functions, then the approximation error will converge to some finite non-zero value. From a practical standpoint, if this approximation error is greater than the maximum error that is appropriate for the regression problem at hand, then the regression function being used is inadequate for the task. Recently, much theoretical work has been done to define classes of functions which have corresponding regression functions with rates of convergence independent of dimension (see Section 2.3 of Chapter 2 for details). One of the main theoretical contributions of this thesis is that we establish a rate of convergence result which is independent of dimension on function spaces which have not been previously studied in this way.

1.2.2 Computational Complexity of Nonparametric Regression

The second aspect of the curse of dimensionality is the computational complexity of constructing high dimensional regression functions. Generally speaking, there are two commonly used methods for constructing nonparametric regression functions. The first builds regression functions by including all input variables simultaneously, thus building one or more large component functions which may be combined to form the final regression function. The second uses some form of variable selection to build up the regression function using smaller functional subunits. Both of these approaches can be effective for relatively low dimensional problems.

However, consider the problem of approximating a function of 1000 variables. If one uses the first approach and attempts to fit all the variables simultaneously, then even if the regression function is a simple second order polynomial (i.e. one which has relatively few nonlinear terms) of 1000 variables, the regression algorithm needs to solve simultaneously for over 500,000 model parameters (calculated using the identity m = (s + v)! / (s! v!), where m is the number of polynomial terms and hence the number of model parameters, s is the order of the polynomial, and v is the number of input variables). Whether solved via gradient descent or least squares, the computational cost of simultaneously fitting 500,000 parameters is not insignificant [Press et al., 1988]. Thus even a simple simultaneous fit of a second order polynomial of 1000 variables is computationally significant, and adding more nonlinear terms makes the problem correspondingly more expensive (for example, a third order polynomial of 1000 variables has more than 1.6 x 10^8 model parameters). The computational cost becomes even more significant when the appropriate model structure is unknown (thus making the problem nonparametric), and many different types of large nonlinear models need to be constructed before an adequate one is found.
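The counts quoted in this section are easy to verify with a few lines of Python (a simple arithmetic check, not part of the thesis; it also includes the variable-selection counts used in the argument that follows):

```python
from math import comb, factorial

def n_poly_terms(order, n_vars):
    """Number of multinomial terms: m = (s + v)! / (s! * v!)."""
    return factorial(order + n_vars) // (factorial(order) * factorial(n_vars))

print(n_poly_terms(2, 1000))   # 501501    -> "over 500,000 model parameters"
print(n_poly_terms(3, 1000))   # 167668501 -> "more than 1.6 x 10^8 parameters"

# Exhaustive selection of the best q = 500 of N = 1000 inputs:
print(comb(1000, 500))         # an astronomically large number of model fits

# Suboptimal backward elimination: (1/2)(N(N+1) - (q-1)q) model constructions.
N, q = 1000, 500
print((N * (N + 1) - (q - 1) * q) // 2)   # 375750
```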
An additional difficulty associated with attempting to simultaneously fit many variables is due to the large number of training examples needed to form a stable fit. This results because a simultaneous fit of a large number of model inputs implies that many model parameters need to be fit at the same time (as discussed in the previous paragraph), which in turn requires many training samples for the model to converge to a stable solution [Friedman, 1994b] [Breiman, 1996b]. In terms of computational complexity, a large number of learning examples results in an increase in the computational complexity of the learning algorithm.

Next consider the 1000 variable regression problem using some method of variable selection. In the literature, variable selection is typically achieved by choosing the best q of N possible inputs. Thus, in order to determine the optimal set of q input variables, the number of model constructions and evaluations required is the binomial coefficient N!/(q!(N-q)!) ("N choose q"). As q and N become large, this search for an optimal solution quickly becomes impractical. For example, choosing the best 500 out of a possible 1000 variables requires the evaluation of "1000 choose 500" model constructions (usually nonlinear models are needed, and hence each individual construction can itself be computationally expensive); such an exhaustive search is not currently feasible, and is unlikely to be in the near future. Even if one chooses the suboptimal solution of removing one variable at a time from the model, one would require (1/2)(N(N+1) - (q-1)q) model constructions and evaluations. Hence, using this method to find the "best" 500 out of 1000 inputs requires 375,750 model constructions and evaluations. Even this suboptimal search strategy is computationally difficult given the computational complexity of constructing and evaluating nonlinear models of more than 500 variables. Similar computational complexity arguments can be made for any method which attempts to choose the "best" (by whatever measure one uses) low dimensional projections of high dimensional data (see Section 2.1.1 of Chapter 2).

Any practical nonparametric algorithm required to work on high dimensional regression functions must deal with the above complexity issues. In the next section, the SPORE approximation is introduced as a methodology designed to address both the computational complexity and rate of convergence problems associated with high dimensional nonparametric regression.

1.3 The SPORE Approximation

When we speak of the difficulties inherent in high dimensional nonparametric regression (i.e. the curse of dimensionality), we are referring to the difficulties of applying existing approximation techniques to high dimensional problems. Any approximation problem is potentially difficult. In fact, one dimensional problems are potentially just as difficult as one thousand dimensional problems, and the intrinsic difficulty of any approximation problem is directly related to the number of learning examples required to build a sufficient approximation. Thus, the curse of dimensionality is only a curse because most algorithms have been designed to work well in low dimensional spaces; there is no theoretical reason why a one dimensional function requiring 50,000 training examples to build a sufficient model should be any easier to approximate than a one thousand dimensional function requiring the same number of training samples.

We propose the following methodology to address high dimensional nonparametric regression problems.
First, in order to ensure that the number of learning samples required to build a sufficient model depends on the complexity of the regression problem and not on how many input variables there are, we build approximations by projecting the high dimensional learning data onto low dimensional parametric building blocks. Thus, the number of learning samples required to form stable approximations depends on the number of parameters in our parametric building blocks, and not on the dimension of the learning data. The result is that the rate of convergence of the algorithm to some fixed error is in fact independent of dimension. This use of low dimensional parametric projections has been extensively used in the past (see Chapter 2, Section 2.1.1.1 for details); however, it has invariably involved a systematic search for building blocks and associated inputs for the next best (assuming some measure of best) building block to add. As we have argued in Section 1.2.2, these types of search procedures are not feasible when we have very high dimensional input spaces. Therefore, such search strategies are, of necessity, not used in the algorithms proposed in this thesis.

The second key aspect of our methodology is that the algorithmic procedure we use to determine which building block to add next, or what its inputs will be, is not dependent on the number of input variables. This may appear to be an impossible or impractical algorithmic characteristic; however, from a theoretical standpoint, there is no reason why computationally expensive variable or building block selection is necessary. This is illustrated in the Appendices of this thesis (see Section 5.2.1 of Chapter 5), where we present an approximation algorithm which is theoretically guaranteed to construct uniformly converging approximations to any bounded continuous function of arbitrary dimension, while always using the same building blocks and a random ordering of input variables. The arguments in these Appendices demonstrate that computationally expensive searches for building blocks and their inputs are not a theoretical necessity. A practical example of this aspect of our methodology is the SPORE-1 algorithm (see Section 1.3.3), where the same parametric building block is used throughout the algorithm, and the order in which input variables are added to the building blocks is random.

The third and final key component of our methodology is that the input space of the regression function is sub-divided in order to avoid flat or zero projections from high dimensional learning data onto our low dimensional parametric building blocks. This is a necessary algorithmic component because it gives convergence to zero error when other algorithms which use low dimensional projections do not (see Section 4.5 of Chapter 4). In fact, one of the main criticisms of algorithms which use lower dimensional building blocks is that they do not work well on high dimensional data. In the appendices of this thesis we demonstrate a systematic method of space subdivision which gives theoretically proven convergence (see Section 5.2.1 of Chapter 5). In addition, we show that the benefits of space subdivision can be obtained with minimal computational effort and without actually subdividing the input space (see Section 3.3.4 of Chapter 3). We refer to this as Virtual Space Partitioning, and show that it is a direct consequence of using bootstrap sampling techniques [Efron, 1983] to randomly subdivide the input space when adding parametric building blocks to the regression function.
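Bootstrap resampling of the kind referred to here is simple to state in code. The sketch below is a generic illustration of bootstrap sampling [Efron, 1983], not the SPORE-1 implementation: each new building block is fitted on a different random resample of the training set, which is what produces the "virtual" random subdivision of the input space.

```python
import numpy as np

def bootstrap_sample(x, y, rng):
    """Draw M examples with replacement.  If each new building block is fit on
    a different resample, each block effectively sees a (randomly) different
    subset of the original training points."""
    M = len(y)
    idx = rng.integers(0, M, size=M)
    return x[idx], y[idx]

rng = np.random.default_rng(2)
x = np.arange(10).reshape(10, 1).astype(float)
y = np.arange(10).astype(float)
xb, yb = bootstrap_sample(x, y, rng)
print(sorted(set(int(v) for v in yb)))   # some points repeat, others are left out
```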
Science often works by pushing a thesis to an extreme point of view in order to establish conditions under which this view is valid, as well as conditions under which it breaks down. This document is an example of this form of scientific endeavor. The thesis being put forward, and upon which the SPORE approximation is based, is the following:

Very high dimensional nonparametric regression models can be effectively constructed by using a sufficient number of small, low dimensional parametric building blocks which are added to the model one at a time, using the outputs of previously added blocks as inputs to the new ones. Furthermore, one can use a random selection process to choose candidate building blocks (picked from some small finite set) and their inputs, thus little or no computational effort is required to determine what to fit next.

Our objective is to both experimentally and theoretically establish under which conditions this thesis is valid, and to determine how restrictive these conditions are. It is argued throughout this document that the principal implication of the above thesis is that very high dimensional nonparametric regression is feasible, both in terms of rate of convergence and computational complexity. The surprising conclusion is that it is both feasible and effective on a large class of functions.

The Space Partitioning, self-Organizing, and dimensionality REducing, or SPORE, algorithmic philosophy defines our basic methodology for nonparametric regression. As the name suggests, the input space of the regression data is partitioned or divided into regions where it is possible to reduce the dimension of the problem by constructing the approximation only a few variables at a time. SPORE approximations are self-organizing, or equivalently nonparametric, in that the size of the resulting regression function is defined by the learning or regression data, and is not defined a priori. In Chapter 5, the general SPORE methodology is defined and theoretically analyzed. In the following, a brief description of the basic concepts behind SPORE is presented.

1.3.1 A Conceptual Summary of SPORE

We view SPORE as a methodology for designing learning algorithms which construct SPORE-type regression functions or approximations. The SPORE methodology has five basic characteristics which can be summarized as follows:

C1. The SPORE approximation consists of a finite number of functional units. These functional units are either structurally all identical (for example 2 dimensional, third order polynomials as in Chapter 4) or of different types (for example some may belong to a class of polynomials of a given range of dimension and order, while others may belong to one or more classes of radial basis functions of some range of dimension and number of basis units). From a practical perspective, these functional units should have the following property: the process of individually fitting these functional units to input/output data should be relatively easy when compared to the complete regression problem.

C2. The functional units are added to the approximation one unit at a time, and are not modified after they are added.
Input variables, as well as the outputs of previously added functional units, can serve as inputs to new units. The addition of functional units stops when a desired residual error is reached.

C3. Little or no computational effort is spent on selecting inputs to structural units.

C4. Little or no computational effort is spent on selecting which type of structural unit is added next.

C5. The input space is partitioned in order to reduce approximation error.

Thus, dimensionality reduction is implied by characteristics C1 and C2, self-organization is implied by C2, and space partitioning is implied by C5 (thus the name SPORE as defined above). One should note that characteristics C1 and C2 are common to a number of other nonparametric regression architectures. In Chapter 2, these architectures are discussed in detail. Two notable examples are the Group Method of Data Handling (GMDH) [Ivankhnenko, 1971] (as well as various other related algorithms), and Cascade Correlation [Fahlman and Lebiere, 1990]. All of these algorithms build successively on structures, using the outputs of previously constructed structural units as inputs to new units. However, it is the further incorporation of characteristics C3 and C4 which makes SPORE different from all other approaches to nonparametric regression. As described in Section 2.4 of Chapter 2, these two characteristics lead to learning algorithms which are significantly different from those studied previously. They are also a key to formulating a practical algorithm for very high dimensional regression. As argued in Section 1.2.2, the computational effort required for variable selection makes it difficult, if not impossible, to do in very high dimensional spaces. The same argument applies to structural unit selection in high dimensional spaces. The purpose of the space partitioning characteristic, C5, is to ensure that the SPORE regression function is a universal approximator, or in other words is (theoretically) able to approximate any bounded continuous function defined over a finite domain in N dimensional space (see Chapter 5). As is briefly outlined in the next section, it is also used to avoid problems with rate of convergence in very high dimensional regression.

1.3.2 Theoretical Results

As stated previously, Stone [Stone, 1982] has shown that, in general and under appropriate regularity conditions, the number of sample points required to approximate a bounded continuous N dimensional function which is at least once differentiable and defined over a closed domain grows exponentially with dimension N. The main theoretical result of this thesis is the following: we prove that such functions have a finite number of sub-domains where there exists at least one type of regression function (e.g. one of the SPORE structures defined in Chapter 5) which has a rate of convergence independent of dimension. The SPORE approximation is in fact built by partitioning the input space into subdomains where the rate of convergence to zero approximation error is independent of dimension. This result is significant from both a theoretical and practical point of view. Theoretically it demonstrates that much broader function classes can have rates of convergence independent of dimension (see Section 2.3 of Chapter 2 for details). In practice it points the way to practical nonparametric regression algorithms which can work in very high dimensional spaces.
A second theoretical result given in this thesis concerns the computational complexity required to build a SPORE structure. As stated above, SPORE subdivides the input space of the function being approximated into a finite number of subdomains where the rate of convergence is independent of dimension. In Chapter 5, it is further shown that the computational complexity of constructing SPORE in each of these subdomains is on the order of O(N x C), where N is the dimension of the data and C is the average complexity of adding a functional unit (C is independent of N). This indicates that, at least within each subdomain, the computational complexity of constructing an approximation is at most linear with dimension. This is a promising result from the point of view of practical nonparametric regression. One theoretical question which this thesis does not address is how many such subdomains are necessary. Having desirable theoretical properties in each subdomain may not mean much if there are exponentially many such domains in practice. Although this remains an open theoretical question, in Chapter 5 some theoretical justification is given which suggests that functions requiring many subdomains are unlikely. Furthermore, in Chapter 3 and Chapter 5 we establish conditions under which subdivision is not necessary. It is interesting to observe that none of the empirical evaluations of the SPORE approximation required domain subdivision (see Chapter 4 for details). Thus, for at least the regression problems studied in this thesis, one domain is sufficient (although virtual space subdivision was necessary to obtain these results: see Section 3.3.4 of Chapter 3).

1.3.3 The SPORE-1 Regression Function

In Chapter 3, the most basic form of SPORE, termed SPORE-1, is defined and a construction or learning algorithm is given. All of the experimental evaluations presented in this thesis (see Chapter 4) are done using SPORE-1. A brief description of SPORE-1 is given next.

[Figure 1.2: The SPORE-1 Structure]

As shown in Figure 1.2, SPORE-1 consists of a cascade of 2 dimensional functions g_l(·), which are scaled (by α_l) and summed to produce the regression function f̂(x); the subscripts k_0, k_1, ..., k_L ∈ {1, ..., N} serve to identify the input variables (x_1, ..., x_N). The exact algorithm used to construct the SPORE-1 approximation is fully defined in Chapter 5. The points of note in the learning algorithm are as follows:

1. the functions (building blocks) g_l(·) are 2-dimensional polynomials, all having the same number of coefficients;
2. the functions g_l(·) are added one at a time in order g_1(·), g_2(·), ..., g_L(·) (the learning data determines how many levels are constructed);
3. the order of inputs (x_1, ..., x_N) is random, and any single input may enter the cascade at many levels; and,
4. the input space is randomly subdivided (using bootstrap) whenever a new g_l(·) is added.

Hence, the SPORE-1 learning algorithm conforms to the general SPORE methodology given in Section 1.3.1. Consider the simple nature of SPORE-1: the algorithm does not use a complicated variable selection technique (it is in fact random variable selection), and simple parametric functions (all identical) are incrementally added to the regression function.
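The cascade of Figure 1.2 can be sketched in a few lines of code. The sketch below is a simplified reading of that figure and not the thesis's implementation (Chapter 3 gives the actual algorithm): it assumes each level is a two-input cubic polynomial taking the previous level's output and one randomly ordered input variable, with the level outputs scaled by weights α_l and summed; all coefficients and weights here are made up for illustration.

```python
import numpy as np

def cubic_poly2(a, b, coeffs):
    """A two-input cubic polynomial: all terms a^i * b^j with i + j <= 3."""
    terms = [a**i * b**j for i in range(4) for j in range(4) if i + j <= 3]
    return sum(c * t for c, t in zip(coeffs, terms))

def spore1_predict(x, levels, alphas):
    """Evaluate a SPORE-1-style cascade on one input vector x.

    `levels[0]` pairs two input variables; every later level pairs the previous
    level's output with one more (randomly ordered) input variable.  The level
    outputs are scaled by the weights alpha_l and summed, as in Figure 1.2."""
    (k0, k1), coeffs0 = levels[0]
    z = cubic_poly2(x[k0], x[k1], coeffs0)
    outputs = [z]
    for k, coeffs in levels[1:]:
        z = cubic_poly2(z, x[k], coeffs)      # previous output cascades forward
        outputs.append(z)
    return float(np.dot(alphas, outputs))

# Toy usage with made-up coefficients (10 terms per 2-D cubic polynomial).
rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, size=8)
levels = [((0, 5), rng.normal(size=10))] + \
         [(int(k), rng.normal(size=10)) for k in rng.permutation(8)]
alphas = np.ones(len(levels)) / len(levels)   # a simple weighted average
print(spore1_predict(x, levels, alphas))
```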
In fact, SPORE-1 is an experiment in pushing the lower bounds of algorithmic and structural simplicity, the goal being to test the efficacy of the general SPORE methodology at this most basic level. However, as summarized in Section 1.3.5, this surprisingly simple SPORE-1 structure can indeed form effective nonparametric regression functions.

1.3.4 Why Does SPORE-1 Work and When Does It Work Best?

Given the simple structure of SPORE-1, it is not unreasonable to ask why SPORE-1 is able to form effective regression functions. Roughly speaking there are three basic reasons for this (see Chapter 3 for details). These are briefly addressed below.

First, the addition of each new building block to the structure defined in Figure 1.2 can have, on average, only one of 2 consequences: 1) it can reduce the model error because of the addition of the new input variable (chosen randomly) to the regression function; or 2) it can leave the model error unchanged, because if the addition of the new input variable does not contribute to a better model, then the new building block g_l(·) simply passes the output of the previous level g_{l-1}(·) to the input of the next level g_{l+1}(·) (i.e. g_l(·) = g_{l-1}(·)). The result is that the order in which input variables are added to the regression function does not significantly affect model error, as long as the relevant variables are eventually incorporated.

The second reason why SPORE-1 is effective is that it forms stable high dimensional regression functions using relatively few learning examples. This is because 1) high dimensional data is projected onto low dimensional parametric building blocks g_l(·), which implies that relatively few training examples are required to form stable model parameters [Friedman, 1994b], and 2) the outputs from these building blocks are combined using a weighted average (in the form of the scaling factors α_l), which tends to further stabilize the regression function [Breiman, 1996b] (see Section 3.3.3 of Chapter 3).

The third reason why SPORE-1 is effective is that the low dimensional projections (parametric building blocks) g_l(·) tend to have un-correlated residual errors and are unlikely to be flat (i.e. g_l(·) = g_{l-1}(·)) if the input x_{k_l} to g_l(·) has an effect on the output. The reason for this is the random subdivision of the input space resulting from the use of bootstrap samples of the training data whenever a new level g_l(·) is added (see Section 3.3.3 and Section 3.3.4 of Chapter 3). The result is that adding new levels (building blocks) to the regression function tends to improve the approximation, as long as the input variables continue to affect the output.

The next question one might ask is the following: what types of problems is SPORE-1 best suited for? Given that the SPORE-1 structure consists of a long cascade of parametric building blocks (see Figure 1.2), with each level in the cascade incorporating additional input variables into the regression function, it is reasonable to assume that SPORE-1 is best suited to regression problems where many input variables are required to produce an effective model, and the relevant input variables have approximately equally significant effects on the output. In addition, if all of the parametric building blocks g_l(·) are continuous (which they are in the current implementation; see Chapter 3 for details), in the ideal case, the target function would also be continuous.
Although this is a compelling argument for defining an ideal class of target functions, there is significant experimental evidence that, in practice, SPORE-1 is not limited to only these types of regression problems. This is briefly discussed in the next section.

1.3.5 Experimental Evaluation of the SPORE-1 Approximation

The SPORE-1 approximation is evaluated experimentally in Chapter 4. First, SPORE-1 is compared with other algorithms by applying it to 10 regression problems found in the literature [Grudic and Lawrence, 1997]. This initial evaluation shows that SPORE-1, on average, does at least as well or better (with respect to mean squared approximation error) than published results on 9 of these data sets. All of these published regression problems are relatively small and low dimensional, and thus not the type of problems SPORE-1 was designed for. Hence it is interesting to observe SPORE-1's effectiveness on such problems.

Next, SPORE is evaluated on 15 synthetic, large, high dimensional data sets (40,000 learning examples of data having 100, 200, 400, 800 and 1,600 inputs) [Grudic and Lawrence, 1997]. This synthetic data is generated specifically to evaluate SPORE-1 on the type of data sets it was designed for. Some of these data sets have output noise (i.e. the ε in equation (1.2)), while others include input variables which do not affect the output (i.e. inputs in x which are not in v in Section 1.1). Synthetic data allows us to test theoretical limits with full knowledge of what the best attainable approximation errors are. The results from these experiments indicate that SPORE-1 closely approaches these best possible approximation errors. As no data sets of sufficient size and dimensionality were found in the literature, this is the first use of this data for evaluation of high dimensional regression.

Next, SPORE-1 is applied to 3 high dimensional (1024 pixel inputs) human-to-robot skill transfer problems. These experiments involve having a human operator demonstrate a mobile robot "locate object and approach object" motion sequence using video input. During this demonstration, the human uses only the robot's sensors as inputs, and controls the robot's actuator as output. The demonstration generates a learning data set by recording the image pixels seen by the human, as well as the corresponding output commands, during each time step of the demonstration. This data set is used to construct a mapping between sensor inputs and actuator outputs, which, when constructed, is used to autonomously control the robot. Experimental results indicate that the robot is able to autonomously accomplish the desired task. Although, as indicated in Chapter 4, the task being transferred from human to robot is relatively simple, these experiments demonstrate the ability of SPORE-1 to form very high dimensional regression functions based on real world learning data.

Finally, SPORE-1 is applied to the 10-bit parity problem [Grudic and Lawrence, 1997]. We chose this problem to demonstrate that SPORE-1 does not have the same limitations as other systems which build regression functions using small dimensional units which are added one at a time.

1.4 Plan of Presentation

In Chapter 2 a review of nonparametric regression is presented. In Chapter 3 the simplest version of SPORE (SPORE-1) is described, and theoretical results are given. In Chapter 4 an empirical evaluation of SPORE-1 is presented.
In Chapter 5 the general SPORE methodology is defined and the basic theory behind it is described. Chapter 6 contains a summary of the conclusions and major technical contributions of this thesis, and includes a proposal for future theoretical and empirical evaluation of SPORE.

Chapter 2: Review of Nonparametric Regression

This chapter gives a brief review of nonparametric regression. In Section 2.1, a discussion of various nonparametric regression algorithms is given. In Section 2.2, the problem of overfitting in nonparametric regression is discussed and various published solutions are mentioned. In Section 2.3, convergence results for various regression functions are discussed. Finally, in Section 2.4, a comparison is made between published regression algorithms and the regression methodology proposed in this thesis (SPORE).

2.1 Regression Algorithms: Addressing the Curse of Dimensionality

As discussed in Chapter 1, the curse of dimensionality has two serious detrimental consequences for practical high dimensional nonparametric regression. The first is the potential necessity for an impractically large learning data set, and the second is the computational complexity associated with a large number of input variables. Many algorithms have been presented in the literature which are designed to deal with these two difficulties. Good reviews and descriptions of many types of regression and approximation algorithms can be found in the literature [Kwok and Yeung, 1997] [Haykin, 1994] [Friedman, 1994b] [Hastie and Tibshirani, 1994] [Alfeld, 1989]. In the following, we discuss some representative examples of these algorithms. All of the algorithms discussed below have been demonstrated to be useful on a variety of different types of regression problems. Each has its own particular strengths and weaknesses; however, none of these algorithms have been demonstrated to be effective on the large, very high dimensional nonparametric regression problems studied in this thesis.

For the purposes of this thesis, we view nonparametric learning algorithms as broadly grouped into two major categories. The first is termed dimensionality reducing algorithms, and the second is called space partitioning algorithms.

2.1.1 Dimensionality Reducing Algorithms

The goal in dimensionality reduction is to break a large regression problem into a number of smaller regression problems. The end result of this is a reduction in the number of parameters which are being fitted to the learning data. A reduced parameter set implies that algorithms which fit learning data to these parameters can potentially be more efficient. In addition, fewer parameters generally require fewer learning samples in order to achieve stable parameter values. In the following, we briefly discuss some of the better known examples of dimensionality reducing algorithms. For organizational purposes, we divide these algorithms into two types.

2.1.1.1 Sum of Approximated Lower Dimensional Projections

Let f(x), where x = (x_1, ..., x_N) ∈ D ⊆ R^N, be an N dimensional function, and let f̂(x) be some approximation of f(x). Then in one form of dimensionality reduction, f̂(x) has the following structure:

\hat{f}(x) = \sum_{i_1 \in \{1,\ldots,N\}} \hat{f}_{i_1}(x_{i_1}) + \sum_{i_1, i_2 \in \{1,\ldots,N\}} \hat{f}_{i_1 i_2}(x_{i_1}, x_{i_2}) + \ldots + \sum_{i_1, \ldots, i_m \in \{1,\ldots,N\}} \hat{f}_{i_1 \ldots i_m}(x_{i_1}, \ldots, x_{i_m})     (2.1)

where, for all m ∈ {1, ..., N}, the functions f̂_{i_1...i_m} are m dimensional functions defined over the domain D of f(x).
Generally speaking, the goal is to have m much smaller than N, and to find some finite set of functions f̂_{i_1...i_m} which sum to effectively approximate the target function. The hope is that effective approximations are possible using functions which are either lower dimensional or have some fixed finite parametric structure. Examples of this type of dimensionality reduction include Generalized Additive Models [Hastie and Tibshirani, 1986], MARS [Friedman, 1991], GMDH [Ivankhnenko, 1971], ASPN [Elder and Brown, 1995], and SONN [Tenorio and Lee, 1990]. These are briefly discussed below.

In Generalized Additive Models [Hastie and Tibshirani, 1986] [Hastie and Tibshirani, 1990], regression functions are constructed using a sum of smooth 1 dimensional functions, i.e. the first sum in (2.1). The 1 dimensional functions f̂_{i_1} are nonparametrically constructed, one at a time, using a scatterplot smoother. One advantage of this technique is that the construction of the functions f̂_{i_1} is completely automated, requiring no intervention in the learning process. In addition, the resulting regression model can often be effectively interpreted, allowing a statistician to analyze significant factors involved in model generation. One limitation of this method is that higher dimensional interactions are difficult to model; however, in many lower dimensional regression problems this is not a significant concern.

In Multivariate Adaptive Regression Splines (MARS) [Friedman, 1991], regression functions are constructed using a sum of products of spline functions. These functions are built in a forward stepwise hierarchical manner which is based on recursive partitioning (see Section 2.1.2). MARS is able to automatically account for interactions between variables, in a controlled manner, allowing for effective model interpretation. A limitation of MARS is that its computational complexity increases rapidly with dimension, making large regression problems (i.e. thousands of learning samples) of more than about 20 input variables impractical when high dimensional interactions occur. However, in many applications, low dimensional interactions are sufficient to form effective approximations.

Polynomials have been widely used to build regression models. Polynomial models have the following form (note that these types of multidimensional polynomials are often termed multinomials):

\hat{f}(x) = \sum_{\substack{i_1, \ldots, i_N \in \{0, \ldots, M\} \\ i_1 + \ldots + i_N \le M}} a_{i_1 \ldots i_N} \, x_1^{i_1} \cdots x_N^{i_N}     (2.2)

where M is the order of the polynomial. These models are built by fitting the parameters a_{i_1...i_N} to the learning data. One difficulty with this general polynomial formulation is that the number of parameters a_{i_1...i_N} increases rapidly with polynomial order and dimension, thus making it difficult to determine which subset of parameters is significant. Many iterative schemes have been developed which attempt to choose the most relevant parameters, and construct the approximation a few terms at a time.
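A generic example of such an iterative scheme is forward stepwise selection of multinomial terms (this sketch illustrates the general idea only; it is not any specific published algorithm, and the toy data, term budget, and helper names are invented here): at each step, the candidate term that most reduces the residual error is added to the model.

```python
import numpy as np
from itertools import combinations_with_replacement

def forward_stepwise_poly(x, y, max_order=2, n_terms=5):
    """Greedily add multinomial terms x_{i1}*...*x_{ik} one at a time,
    always choosing the candidate that most reduces squared residual error."""
    M, N = x.shape
    # Candidate terms: products of up to `max_order` input variables.
    candidates = [c for k in range(1, max_order + 1)
                  for c in combinations_with_replacement(range(N), k)]
    columns = [np.ones(M)]                     # start from a constant term
    chosen = []
    for _ in range(n_terms):
        best = None
        for c in candidates:
            if c in chosen:
                continue
            trial = np.column_stack(columns + [np.prod(x[:, c], axis=1)])
            coeffs, *_ = np.linalg.lstsq(trial, y, rcond=None)
            err = np.mean((trial @ coeffs - y) ** 2)
            if best is None or err < best[0]:
                best = (err, c)
        chosen.append(best[1])
        columns.append(np.prod(x[:, best[1]], axis=1))
    return chosen

rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, size=(300, 6))
y = 2 * x[:, 0] * x[:, 1] + x[:, 3] + 0.05 * rng.normal(size=300)
print(forward_stepwise_poly(x, y))   # should pick terms like (0, 1) and (3,)
```

Note how the inner loop re-fits a model for every remaining candidate term at every step; this is exactly the kind of search whose cost grows quickly with dimension, as discussed in Section 1.2.2.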
The outputs of this first layer are then used as inputs to the next layer, where they are once more exhaustively combined two inputs at a time. This process of adding layers continues until further additions to the polynomial do not reduce the approximation error. There have been many  Chapter 2: Review of Nonparametric Regression  21  enhancements to this original algorithm since it was introduced by Ivakhnenko [Farlow, 1984]. Some specific examples include the Algorithm for Synthesis of Polynomial Networks (ASPN) [Elder and Brown, 1995], which extends the paradigm to include 3 dimensional polynomials, and the Self Organizing Neural Network (SONN) [Tenorio and Lee, 1990], which allows for basis functions other than polynomials and uses a Simulated Annealing technique to search for an optimal structure. Other variations of this include the method by Sanger [Sanger, 1991] where subnetworks of polynomials (or other basis functions) are grown below the previously added terms which exhibit maximum variance in approximation error. A l l of these algorithms have demonstrated both good generalization and interpretation properties, on many types of regression problems. One common characteristic of these methods is that they all use some form of search in variable space (and often basis function space as well) to define the structure of the resulting polynomial. Ultimately, due to the computational complexity of such searches (see Section 1.2.2 of Chapter 1), these algorithms are not practical for the very high dimensional regression problems studied in this thesis. 2.1.1.2 S u m m e d H i g h D i m e n s i o n a l B a s i s F u n c t i o n s  A second general type of dimensionality reducing algorithm constructs regression functions that have the following structure: f{x)  = X S (h (x)) q  (2.3)  q  q=0  where g are 1 dimensional functions, the basis functions h are functions of a known type q  chosen a priori, and the number of terms M is chosen large enough to give effective approximations. Examples of these regression functions include Projection Pursuit [Friedman and Stuetzle, 1981], Radial Basis Functions [Hardy, 1971], Hinging Hyperplanes [Breiman, 1993], Wavelets [Sjoberg et al., 1995] [Juditsky et al., 1995] [Zhang, 1997], Cascade Correlation [Fahlman and Lebiere, 1990], and other adaptive Neural Network type algorithms [Reed, 1993] [Igelnik and Pao, 1995] [Yao and Lui, 1997]. These are briefly discussed below. In Projection Pursuit [Friedman and Stuetzle, 1981], the functions h are linear, having q  the form h (x) = q  }  a -• x , where the parameters a are chosen to maximize error (  t  {  reduction, given that the functions g are restricted to be smooth and nonparametric. This  Chapter 2: Review of Nonparametric Regression  22  minimization of error process can be accomplished using Newton-Raphson [Press et al., 1988] on an appropriate cost function. The residual errors are updated after the construction of each h and q  g  function, and the construction of new functions stops when the approximation error is  sufficiently small. Projection Pursuit is often used as an analysis tool because determining which projection axis (defined by function h ) gives the most error reduction, can shed light on which q  independent variables are important and how they interact. 
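The following is a hedged sketch of fitting a single projection pursuit term to a residual. The crude random search over directions stands in for the Newton-Raphson optimization mentioned above, and the low-order polynomial stands in for a nonparametric scatterplot smoother; both are simplifying assumptions.

```python
import numpy as np

def fit_pp_term(X, r, n_directions=200, degree=3, rng=None):
    """Fit one projection pursuit term g(a . x) to the residual r."""
    rng = rng or np.random.default_rng(0)
    best = None
    for _ in range(n_directions):
        a = rng.normal(size=X.shape[1])
        a /= np.linalg.norm(a)                     # candidate projection axis
        z = X @ a
        Z = np.vander(z, degree + 1)               # polynomial "smoother" in z
        coef, *_ = np.linalg.lstsq(Z, r, rcond=None)
        sse = np.sum((Z @ coef - r) ** 2)
        if best is None or sse < best[0]:
            best = (sse, a, coef)
    return best                                    # (error, direction, smoother coefficients)
```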
A limitation of this method is that all approximations are made along N dimensional linear projection axes (defined by functions  h ), q  making the construction of these axes computationally difficult for very high dimensional regression problems (N > 100). Cascade-Correlation [Fahlman and Lebiere, 1990] is an example of a nonparametric Artificial Neural Network regression function. Neural networks are generally implemented using sigmoidal basis functions (artificial neuron) of the form s(z) = 1/(1 + e ) , where z  z = ot + Y!i- i i' a  0  i'  u  w  '  t n  i  u  D e  ' g n  t n e  inputs to the basis function and the parameters a  ;  being adjusted to fit the training data. Neural networks often consist of layers of these sigmoidal functions, with each layer containing a potentially large number of neurons or sigmoidal units, and the output of lower layers serving as inputs to higher layers (each unit in the first layer having as its inputs all of the predictor variables). The Cascade-Correlation algorithm is designed to automatically determine the network structure based on the training data. Cascade-Correlation adds hidden units (i.e. units not directly connected to the output) to the network one at a time, with each hidden unit having an input connection from all the previously added hidden units, as well as all of the predictor variables. Thus the number of inputs into hidden units grows linearly with the number of levels added to the network, with the first unit having N inputs (one from each of the predictor variables). In addition, each time a hidden unit is added, many potential candidate units are trained using the Widrow-Hoff delta rule [Widrow et al., 1976] starting from randomly chosen weights oc . Only (  the one which best correlates with the current residual error is added as the next hidden unit. The efficacy of Cascade-Correlation on high dimensional data has been widely demonstrated in the literature [Michie et al., 1994]. However, because each hidden unit accepts inputs from all predictor variables, as well as the outputs of all existing hidden units, the method exhibits high computational complexity and potential numerical instability (due to a large number of parameters), when the regression problem is very high dimensional. As a result, Cascade-  Chapter 2: Review of Nonparametric Regression  23  Correlation is unsuitable when large numbers of hidden units are necessary for effective approximation, and thus is not ideally suited for the very high dimensional regression problems studied in this thesis. Other nonparametric artificial neural network algorithms work by either starting with a large network and pruning until an effective approximation has been achieved  (pruning  algorithms [Reed, 1993]), or by starting with a small network and adding interconnections and basis functions (artificial neurons) until no further error reduction is possible  (constructive  algorithms [Kwok and Yeung, 1997]). Both constructive and pruning type neural network algorithm have the same goal: to find a neural network structure which is large enough to effectively model the learning data, while small enough to ensure that over-fitting does not occur (see Section 2.2 for a discussion of variance reduction techniques and the problem of over-fitting). Other learning algorithms use genetic programming techniques [Yao and Lui, 1997] and stochastic basis function generation techniques [Igelnik and Pao, 1995] to achieve the same goal. 
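As an illustration of the constructive approach, the sketch below grows a network one hidden unit at a time in the spirit of Cascade-Correlation. The pool of randomly weighted candidate units (rather than candidates trained with the Widrow-Hoff rule), the plain least-squares output layer, and the pool size are all simplifying assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grow_cascade(X, y, n_hidden=5, n_candidates=10, rng=None):
    """Add hidden units one at a time; each new unit sees the raw inputs plus all
    previously added hidden units, and the candidate whose output correlates best
    with the current residual is installed."""
    rng = rng or np.random.default_rng(0)
    H = np.empty((len(X), 0))                          # outputs of installed hidden units
    for _ in range(n_hidden):
        Z = np.column_stack([np.ones(len(X)), X, H])   # bias + inputs + hidden outputs
        w_out, *_ = np.linalg.lstsq(Z, y, rcond=None)  # linear output layer
        resid = y - Z @ w_out
        best_h, best_c = None, -np.inf
        for _ in range(n_candidates):
            w = rng.normal(scale=0.5, size=Z.shape[1])     # random candidate weights
            h = sigmoid(Z @ w)
            c = abs(np.corrcoef(h, resid)[0, 1])           # correlation with residual
            if c > best_c:
                best_h, best_c = h, c
        H = np.column_stack([H, best_h])                   # install the winning unit
    Z = np.column_stack([np.ones(len(X)), X, H])
    w_out, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w_out, H
```

In a faithful implementation each candidate's input weights would be trained to maximize this correlation before the comparison is made.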
These algorithms have been shown to perform well when relatively small numbers of artificial neurons effectively model the learning data. However, when large numbers of artificial neurons are necessary, the learning process can become unstable (small changes in learning data leading to large changes in model structure [Breiman, 1996b]) because of the large number of parameters in the neural network. This makes these types of neural network learning algorithms unsuitable for the large high dimensional problems studied in this thesis. In Radial Basis Function methods [Jackson, 1988] [Hardy, 1971] [Poggio and Girosi, 1989] [Broomhead and Lowe, 1988] [Casdagli, 1989] [Renals andRohwer, 1989] [Poggio and Girosi, 1990] [Edelman and Poggio, 1990] [Mel and Kock, 1990] [Saha and Keeler, 1990] [Girosi, 1994], the functions h have the form h (x) q  q  = <p(||x - x ||), where <j) is afunction which  varies according to some distance metric from a center point x . The functions g take the form q  g (x) q  = a • h (x) q  q  q  where the parameters a, are defined by the training data. Radial Basis  Functions become nonparametric when the number and type of basis functions h , and the q  location of the centers x  q  is determined by the learning data. A limitation of Radial Basis  Functions is that they all use some measure of distance between points in space, and in high dimensional space, almost every learning sample point is closer to the sample boundary than to another learning point [Friedman, 1994b]. This makes radial basis functions not ideally suited for very high dimensional nonparametric regression.  Chapter 2: Review of Nonparametric Regression  24  Hinging Hyperplanes [Breiman, 1993] represent regression functions as a linear superposition of basis functions which consist of 2 hinging hyperplanes. These hyperplanes are hinged in the sense that two planes in TV dimensional space must always intersect at a line (or hinge) in N dimensional space. The learning algorithm works by finding the hinge between 2 hyperplanes which best fit the current residual error, thus producing a basis function. The next basis function is then produced in a similar manner using the new residual errors resulting because of the addition of the previous basis function. If N is small, the hyperplanes are chosen to be N dimensional, otherwise the best M <N dimensions are chosen for the hyperplanes. As is argued in Chapter 1, such searches make very high dimensional regression computationally difficult. Thus Hinging Hyperplanes are not an ideal candidate for the type of regression problems studied in this thesis. However, the computational efficiency of Hinging Hyperplanes on certain high dimensional problems (see [Breiman, 1993] for examples), make it an important potential candidate for many regression problems. Wavelets [Sjoberg et al., 1995] [Juditsky et al., 1995] have also been effectively used in nonparametric regression. Wavelets have the attractive property that they are spatially adaptive, and thus they are effective at approximating functions which are locally spiky (i.e. are smooth throughout most of the input space, but have certain regions where they are rapidly changing). Wavelet shrinking algorithms have demonstrated good performance on relatively low dimensional problems (3 inputs or less) [Sjoberg et al., 1995]. However, wavelet algorithms which demonstrate good performance on very high dimensional problems have not yet been proposed in the literature.  
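As a concrete instance of the summed-basis-function form (2.3), the following sketch fits Gaussian radial basis functions with centers chosen from the training points. The Gaussian kernel, the fixed width, and the small ridge term are illustrative assumptions, not prescriptions from the cited work.

```python
import numpy as np

def rbf_design(X, centers, width):
    """phi(||x - x_q||) for a Gaussian radial basis, one column per center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf(X, y, centers, width, ridge=1e-6):
    """Solve for the linear weights a_q in sum_q a_q * h_q(x)."""
    Phi = rbf_design(X, centers, width)
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(300, 4))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.normal(size=300)
centers = X[rng.choice(len(X), size=30, replace=False)]   # centers taken from the data
w = fit_rbf(X, y, centers, width=0.5)
y_hat = rbf_design(X, centers, 0.5) @ w
```

With the centers taken from the learning data, the number and placement of the basis functions is itself determined by the data, which is what makes the method nonparametric in the sense used above.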
2.1.2 Space Partitioning Algorithms

In space partitioning algorithms the domain D of the function being approximated is divided into a finite number of disjoint subdomains D_i such that ∪_i D_i = D. Then, in each subdomain D_i, a regression function is constructed such that:

$$\forall \mathbf{x} \in D_i, \qquad \hat{y} = \hat{f}_i(\mathbf{x}) \qquad (2.4)$$

Often, the functions f̂_i(x) are simply constant over the domain D_i. There are two main types of space partitioning algorithms: recursive partitioning or decision tree algorithms, and nearest neighbor algorithms.

Examples of recursive partitioning algorithms, or Decision Trees [Michie et al., 1994], include Classification and Regression Trees (CART) [Breiman et al., 1984] and C4.5 [Quinlan, 1993]. Recursive partitioning works as follows: starting with the entire domain D, subdivide it into two daughter subregions such that a parametric regression function (usually chosen to be a single constant value) fitted in each region reduces the overall approximation error. This process of binary subdivision continues with each daughter region (creating new daughter subregions) until some predefined cost trade-off between approximation error and number of subregions is met. Recursive partitioning algorithms have proven to be very useful, especially for classification problems. In addition, decision trees can be easily interpreted, allowing analysis of how various predictor variables interact with one another. However, because functions are approximated by many discontinuous subregions, recursive partitioning methods tend not to do well on regression problems which are continuous [Friedman, 1991]. This is especially true when there are many predictor variables.

Another notable example of a tree-structured space partitioning architecture is the Hierarchical Mixtures of Experts (HME) model, which uses the Expectation-Maximization (EM) learning algorithm [Jordan and Jacobs, 1994]. This is a parametric approach that uses a fixed number of experts (or parametric building blocks) which are hierarchically combined using gating networks to form the final regression function. In general, each "expert" is trained to work in a particular region of the input space, and the gating networks combine these regions to form a global approximation. The HME model has been successfully applied to many learning problems; however, because it uses a parametric learning algorithm, it is not directly applicable to the nonparametric regression problems studied in this thesis.

In the simplest implementation, K-Nearest Neighbor algorithms work as follows: given a point x, find the K points in the learning set which are nearest to this point, according to some distance metric. Then the regression estimate at x becomes the average (perhaps a weighted average) response value of these K nearest neighbors [Michie et al., 1994]. A criticism of Nearest Neighbor algorithms has been that they often exhibit poor generalization [Lowe, 1995]. This problem is mainly due to a poor choice of distance metrics (or kernels). Recently, algorithmic improvements have been suggested which allow the learning data to define the most appropriate distance metric [Lowe, 1995] [Friedman, 1994a] [Hastie and Tibshirani, 1996], the result being much better generalization, especially for classification problems.
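A minimal sketch of this simplest form of K-Nearest Neighbor regression follows; the Euclidean metric and unweighted average used here are exactly the kind of fixed, data-independent choices that the improved methods cited above replace.

```python
import numpy as np

def knn_predict(x, X_train, y_train, k=5):
    """K-Nearest Neighbor regression estimate at a single query point x."""
    dist = np.sqrt(((X_train - x) ** 2).sum(axis=1))   # Euclidean distance to every sample
    nearest = np.argsort(dist)[:k]                     # indices of the k closest samples
    return y_train[nearest].mean()                     # unweighted average response

rng = np.random.default_rng(2)
X_train = rng.uniform(-1, 1, size=(500, 3))
y_train = X_train[:, 0] ** 2 + np.sin(X_train[:, 1]) + 0.1 * rng.normal(size=500)
print(knn_predict(np.array([0.2, -0.4, 0.9]), X_train, y_train, k=7))
```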
However, as with regression trees, because functions are approximated by many discontinuous subregions, Nearest Neighbor algorithms have yet to demonstrate good performance on regression problems which are very high dimensional (greater than 100 inputs) and continuous.

2.2 Regression Model Variance Reduction Techniques

For any given set of learning data, there are infinitely many regression models which fit this data exactly. An example of this is given in Figure 2.1, where the 5 data points are fitted exactly with 3 different model types: the piece-wise linear fit model, the smooth fit model (smoothness here refers to how rapidly the model varies within its domain rather than how many times it is differentiable), and the high variance fit model. These 3 models all describe the learning data exactly; however, they interpolate between the data points in very different ways. Most likely, at most one of these models appropriately describes the target function that generated the data. In fact, only a small subset of all models which fit any given training data accurately also effectively describe the target function that actually generated the data. Thus, one of the more difficult problems in nonparametric regression is choosing the most appropriate regression model. The problem becomes even more difficult when the training data is noisy, in which case exactly modeling the learning data can lead to greater approximation errors. This makes the regression model search space even larger because, in addition to models which exactly fit the training data, the search space is extended to include those models which closely fit the training data. As an example of this, the smooth approximation model in Figure 2.1 would be just one model which can be used to model the noisy sample points. Any number of small variations to this smooth approximation of the sample points could equally well fit the data points. A model with high variance is one which is not stable with respect to which set of training samples is used to construct it. The more variance a model has, the more sensitive it is to small changes in learning data (this notion of model variance is related to model stability as defined in [Breiman, 1996b]). If a model has high variance then two different sets of learning data randomly generated by the same target function can produce two very different regression models. Often, as with the high variance fit model in Figure 2.1, both models exactly fit their respective learning data, yet neither interpolates effectively between data points (this is often referred to as model over-fitting). Thus, an important goal in regression is to reduce model variance. For most regression problems, failing other evidence, the model with least variance is often the one which most smoothly interpolates through data points (for example the smooth fit model in Figure 2.1).

Figure 2.1: Model Variance

The variance reduction problem is related to the bias-variance trade-off [Friedman, 1994b], which divides the approximation error into a bias term and a variance term. The bias term defines how closely the regression model fits the learning data, and the variance term defines how much variance there is in the regression model. For example, the three models in Figure 2.1 that fit the data exactly all have a zero bias error, but have varying degrees of variance.
Conversely, the smooth approximation model in Figure 2.1 has a nonzero bias error because it doesn't pass exactly through sample points, however, if the data is assumed to be noisy, its variance error may be the smallest of all three models. Thus, the smooth approximation model may be the most appropriate model when data is noisy. A variety of techniques have been developed for nonparametric model variance reduction. In this thesis we group these methods into two categories: penalty based variance averaging  based variance  reduction.  reduction  and  These are briefly discussed below.  Methods which use penalties for variance reduction work by giving a specific cost to an increase in model complexity. The basic reason for this is that an increase in model complexity (i.e. an increase in the number of model parameters) often means an increase in model variance. Thus, the trade-off between the error reduction obtained by the addition of new model terms, and the increase in complexity due to the addition of these new terms, must be analyzed to determine if this increase in model complexity is justified. Examples of these types of penalty based variance reduction techniques include the generalized  cross  validation  criteria used in M A R S  Chapter 2: Review of Nonparametric Regression  [Friedman, 1991], and the minimum  description  28  length  criteria used in S O N N [Tenorio and  Lee, 1990]. Generalized cross validation works by multiplying the average squared residual error of the training data by a penalty factor which increases as terms are added to the M A R S model. Similarly, minimum description length attempts to find the smallest model which effectively fits the training data. Other examples of penalty based variance reduction techniques can be found in [Friedman, 1994b]. One key advantage of variance reduction techniques which use penalties, is that a single regression model is constructed, thus allowing for potentially useful model interpretation. However, one of the key problems associated with penalty based variance reduction techniques is that they are inherently unstable. Breiman [Breiman, 1996b] has shown that this form of model selection is unstable because small changes in learning data can lead to large changes in model structure, making it difficult to choose the most appropriate model. In contrast to penalty based variance reduction techniques, averaging methods work by forming ensembles of many different regression functions, and modeling a final predictor value as a weighted average of all the regression functions. One of the first noted examples of this is Stacked Generalization [Wolpert, 1992] where potentially nonlinear combinations of regression functions, each trained on some subset of the total data, are used to form the overall regression function. This idea is simplified by Breiman in Stacked Regression [Breiman, 1996c], where the stacked regression functions are simply weighted by non-negative real values. Theoretical justification for such weighted averaging is given by Perrone [Perrone, 1993], where derivations are given for the optimal weighted average of a fixed number of regression functions. In addition, Breiman has shown that averaging can effectively stabilize predictors which are inherently unstable [Breiman, 1996b]. One example of an averaging variance reduction technique is Breiman's bootstrap aggregation or Bagging [Breiman, 1996a]. 
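To make the averaging idea concrete, the following sketch fits many copies of an arbitrary base learner to bootstrap resamples of the training set and averages their predictions, anticipating the bagging procedure described next. The quadratic base learner, the equal weights, and the ensemble size are illustrative assumptions.

```python
import numpy as np

def fit_quadratic(X, y):
    """An arbitrary base learner: least-squares fit of a quadratic in each input."""
    A = np.column_stack([np.ones(len(X)), X, X ** 2])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_quadratic(coef, X):
    A = np.column_stack([np.ones(len(X)), X, X ** 2])
    return A @ coef

def bagged_predict(X_train, y_train, X_test, n_models=25, rng=None):
    """Average the predictions of base learners fit to bootstrap resamples."""
    rng = rng or np.random.default_rng(0)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X_train), size=len(X_train))   # bootstrap sample
        coef = fit_quadratic(X_train[idx], y_train[idx])
        preds.append(predict_quadratic(coef, X_test))
    return np.mean(preds, axis=0)
```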
Bagging uses the average value of many regression functions, each trained using a bootstrap sample of the learning data, to represent the final regression function. The key idea here being that the bootstrap samples tend to create regression functions which have uncorrelated error residuals, thus averaging their values tends to cancel out approximation errors, giving potentially better regression functions when compared with a single regression. Bagging has proven to be an effective variance reduction technique, and variations of it have been successfully applied to many types of regression problems [Parmanto et al., 1996]. However, one significant drawback of all averaging methods is that they do not easily allow for  Chapter 2: Review of Nonparametric Regression  29  interpretation of the resulting regression model.  2.3 Convergence Results in Nonparametric Regression In this section, we consider three main types of known convergence results for nonparametric regression. First we briefly discuss some known results in representability, or what function class a given regression function is theoretically able to converge to. Second, we look at two types of rate of convergence results. The first measures how, given an infinite number of learning sample points, the approximation error decreases with the size of the regression model. The second measures how the approximation error decreases as the number of sample points of the target function increases. Examples of these types of convergence results are given below. Given a particular regression function, one of the first theoretical questions which is asked is what class of functions  can it represent!  Indeed, the most common question posed is whether  the regression model is a universal approximator; i.e. whether it is able to form arbitrarily good approximations of bounded continuous functions defined over a finite domain. Examples of universal approximators include polynomials [Lorentz, 1986], radial basis functions [Girosi, 1994], artificial neural networks [Cybenko, 1989] [Hornik et al., 1989], space partitioning techniques [Breiman et al., 1984], projection pursuit [Huber, 1985], and wavelets [Sjoberg et al., 1995] [Juditsky et al., 1995]. These representation results are important because they establish what the respective regression functions are capable of representing. However, representation results in themselves do not establish the rate at which these regression functions converge to the target function. The first type of rate of convergence in approximation  result we consider  error as the number of parameters  is one which measures the  in the regression  function  increases.  decrease  From a  practical standpoint, this rate of convergence measure tells how the size of the regression model depends on the dimension of the target function. In particular, from the standpoint of high dimensional regression, we can ask the following question: "under what conditions is the rate of convergence of the regression model independent of dimension?" A general answer to this question is the following: by limiting the class of target functions which one is attempting to approximate, many regression functions have a rate of convergence which is independent of dimension [Girosi, 1994]. This includes radial basis functions [Girosi, 1994], artificial neural networks [Barron, 1993], hinging hyperplanes [Breiman, 1993], and projection pursuit  Chapter 2: Review of Nonparametric Regression  30  [Jones, 1992]. 
A l l of these theoretical results demonstrate rates of convergence independent of dimension are based on a lemma which can be jointly attributed to Muarey [Barron, 1993], Jones [Jones, 1992] and Barron [Barron, 1993] [Girosi, 1994]. In all these results, the class of target functions is limited to those functions for which the magnitude of the first moment of the Fourier transform is finite and independent of dimension. In addition, all of these results are based on greedy  algorithms which search an infinite function space for the globally optimal regression  function. Because of this large search problem, it is not clear how computationally practical these convergence results are [Barron, 1993] [Breiman, 1993] [Juditsky et al., 1995]. The second decreases  rate of convergence  with the number  of sample  result  we discuss  here is how approximation  points of the target function.  error  Stone [Stone, 1982] showed  that, in the general case of a target function which is at least once differentiable, the optimal rate of convergence as a function of the number of sample points, does depend on dimension. In fact, this dependence is significant enough that general very high dimensional nonparametric regression requires an impractical number of sample points (i.e. the curse of dimensionality). Once more, by putting more limits on the target function, it is possible to improve on these rates. There are examples in the literature which demonstrate function spaces, and corresponding regression functions, which have rates of convergence that are independent of dimension. One example of this is the projection pursuit regression function when the target function is limited to the class of projection pursuit regression functions with specific constraints [Chen, 1991a]. A second example is an interaction spline model, where the target function is sufficiently differentiable [Chen, 1991b]. Such theoretical results are useful because they can lead to practical implementations which are effective in very high dimensional spaces. For further discussions on rate of convergence results, see Section 5.3 of Chapter 5.  2.4 C o m p a r i n g Existing Algorithms to the S P O R E In this chapter, many different approaches to nonparametric regression have been discussed. In addition, we have discussed the concept of model variance reduction and how it is done in the literature. Finally, we have outlined some of the basic theoretical results in nonparametric regression. In this section, we compare SPORE to other published regression methodologies. First we outline which existing learning algorithms most closely .resemble SPORE-type learning algorithms, and how they are fundamentally different. Then we discuss how the variance  Chapter 2: Review of Nonparametric Regression  31  reduction methodology of SPORE relates to those of existing variance reduction techniques. Finally, this section concludes with a brief description of how the theoretical properties of SPORE differ from those of other published regression functions. The two types of regression functions which are most closely related to the regression functions studied in this thesis (SPORE), are Cascade-Correlation [Fahlman and Lebiere, 1990], and G M D H [Ivankhnenko, 1971] type algorithms such as A S P N [Elder and Brown, 1995] and S O N N [Tenorio and Lee, 1990]. A l l of these algorithms build successively on structures, using the outputs of previously constructed structural units as inputs to new units. 
However, from the conceptual standpoint of how units are added, and how their inputs are chosen, SPORE differs significantly from these regression functions. These regression functions attempt to build an "optimal" structure by searching for the "best" set of functional units, and/or the best set of inputs to each structural unit. As argued in Section 1.2.2 of Chapter 1, such searches are not computationally feasible in very high dimensional spaces. In the SPORE methodology, this philosophical perspective of searching for optimal (or close to optimal) structures is abandoned because our aim is specifically to address the very high dimensional regression problem. Instead, S P O R E algorithms spend minimal computational effort on searches in function space and input space, concentrating more on constructing structures which produce robust approximations without the need for costly search strategies. From the standpoint of how variance reduction is achieved in the SPORE methodology, we refer the reader to the specific example of SPORE-1 in Chapter 3, and the related discussion in Section 3.3.3 of that chapter. In comparison to other algorithms, SPORE-1 achieves effective variance reduction using a methodology which is similar to Bagging [Breiman, 1996a], but does so with only a single regression function, instead of an aggregate ensemble of regression functions. This is accomplished by constructing each structural unit using a bootstrap sample of the training data. The output of SPORE-1 is the weighted average (each unit is weighted proportionally to the inverse of the squared residual error it has on the training data) of each of these simple structural units. This produces an effective variance reduction methodology (here termed internal variance reduction) which differs significantly from other proposed averaging methods. In Section 5.3 of Chapter 5, the rate of convergence (as a function of the number of sample points) results for the general SPORE-type learning algorithms are given. The rate of convergence is shown to be independent of dimension, but on different function spaces from those studied in  Chapter 2: Review of Nonparametric Regression  32  the past. Instead of putting greater limits on the function type [Chen, 1991a] or requiring it to be more differentiable [Chen, 1991b], we assume a more general class of bounded continuous functions, which are at least once differentiable. We obtain a rate of convergence results independent of dimension by limiting the target function's domain of definition in specific ways. Because of this, regression functions based on SPORE have very different convergence properties when compared with other published results.  Chapter 3: The SPORE-1 Regression Function  The general SPORE methodology outlined in Chapter 1 can be implemented in a variety of ways. In this chapter, the simplest implementation of SPORE, termed SPORE-1, is defined and analyzed. In Section 1.3 of Chapter 1, an intuitive justification for the SPORE-1 structure was presented. In this chapter, we present a theoretical analysis of SPORE-1, with particular emphasis on rate of convergence and complexity properties. In Section 3.1 we examine the representation capabilities of the SPORE-1 regression function. In Section 3.2, the SPORE-1 learning algorithm, which is used to construct the regression function, is defined. In Section 3.3 an analysis of the theoretical properties of the SPORE-1 learning algorithm is given. 
Finally, a summary of the main results of this Chapter is given in Section 3.4.

3.1 SPORE-1: A Cascade of Two Dimensional Functions

The SPORE-1 structure is a cascade of two dimensional functions, with the output of each level of the cascade feeding directly to one of the inputs of the next level. This is diagrammatically represented in Figure 3.1. In the following, a complete description of this basic structure is given, followed by an analysis of the class of functions which can be represented by this SPORE-1 structure.

Let R_L(x_1, ..., x_N) be an N dimensional SPORE-1 regression function constructed using an L level cascade of 2 dimensional functions. Thus, ŷ = R_L(x) estimates the dependent variable y given the independent variables x = (x_1, ..., x_N), and has the following form:

$$R_L(x_1, \ldots, x_N) \;=\; \alpha_1 \cdot g_1(x_{k_0}, x_{k_1}) \;+\; \alpha_2 \cdot g_2(g_1, x_{k_2}) \;+\; \sum_{l=3}^{L} \alpha_l \cdot g_l(g_{l-1}, x_{k_l}) \qquad (3.1)$$

where L defines the number of levels in the cascade or, equivalently, the number of 2 dimensional functions g_l(·); the α_l are real valued scaling factors; and the subscripts k_0, ..., k_L ∈ {1, ..., N} serve to identify the input variables (x_1, ..., x_N).

Figure 3.1: The SPORE-1 Regression

Given the above definition, the question we pose is: what class of functions can be represented by the simple cascade given in (3.1)? The answer to this question depends on what class of 2 dimensional functions g_l(·) are used in the cascade. In the following, two classes of 2 dimensional functions are considered. First we consider the case of g_l(·) belonging to the class of bounded continuous 2 dimensional functions defined on a closed domain in ℝ². In the second case g_l(·) is limited to the class of 2 dimensional polynomials of finite order greater than 1. The interesting conclusion is that in both instances the cascade in (3.1) is shown to be a universal approximator, i.e. capable of approximating any bounded continuous multidimensional function defined over a finite domain.

3.1.1 A Cascade of Bounded Continuous 2 Dimensional Functions

In the following theorem, it is assumed that the 2 dimensional functions g_l(·) belong to the class of 2 dimensional bounded continuous functions. Under these conditions, it is shown that any continuous, bounded N dimensional function, defined over a closed domain, can be represented using a cascade of 2N² + 3N g_l(·) functions. Since this is the broadest class of bounded continuous g_l(·) functions, this result defines a bound on the number of cascade levels necessary to exactly represent a bounded continuous multidimensional function. Other smaller classes of bounded continuous functions, such as polynomials, will require more functions.

THEOREM 1. Let f(x_1, ..., x_N) be a bounded continuous function defined on a closed
Then, there exist 2N  + 3N bounded continuous  2  gi(-) functions  which  2 are at most 2 dimensional,  and 2N  +3N  real valued numbers a.[, such that:  2N + 3N-2 2  f(x ,...,x ) i  N  = a g (x r  l  k o  ,x  k i  )  + a -g (g ,x )+ 2  2  ]  £  k2  <*f8i(8i-i> k)  (-)  x  3  2  /= 3  Proof: The proof is based on Kolmogorov's superposition theorem [Kolmogorov, 1957], which states that any bounded continuous TV dimensional function, f(x , x  x ), defined on a N  closed domain, can be represented as: 2N  f N  \  (3.3) q-=0  where the 1 dimensional functions \|/ and  V =1  are bounded and continuous (for more recent  versions of this theorem see [Lorentz, 1986]). Thus, for our proof we must show that (3.2) and (3.3) are equivalent for N>2. We show this by equating all g (-) functions in (3.2) to the t  appropriate functions in (3.3). First make the following assignment: 8\ = <t>0l( l) t 02( 2)' x  Then, for Ze {2, ...,N-  +  ( )  ex, = 0  x  (3.4)  1} and p = I + 1, let £/ = S/-1 + M V '  =  i  a  0  ( 3  -  5 )  Thus, completing the definitions to level / = TV, let 8N = Yo(fftf-i)>  N=  A  (-)  1  3 6  Therefore, TV levels are required to build the q = 0 part of (3.3). What remains are the q = 1,  2TV parts to be constructed. Consider the q = 1 part. For level / = N +1 , let 8  Then, for le {N + 2, ...,2N}  N +  i  =  <hi(*i)>  N +\ =  a  (-) 3  0  7  and p = / - A T , let  */ = S / - i  + (  MV'  i  a  =  "  0  (3  8)  Thus, completing the definitions to level / = 2N + 1 , let 82N+1  = Vl(82N)>  a  2N \ +  =  1  (-) 3  9  Chapter 3: The SPORE-1 Regression Function  Sl-\  0 + x +(0-g _ ) l  l  + (0-x  36  1  l  •£/_,)  a =0 t  +  l  x  Si 0 + x + (0- ) x  gl  + (l - z j • g ) + ... /  l  x  Si+i  0 + x, + ( O g  / + 1  ) + (l  -g  / + ]  ) + ... «/ 2 =  1  +  Figure 3.2  A Cascade Representation of a Polynomial T e r m  Therefore, 2N + 1 levels are required to build the q = 0 and q = 1 parts of (3.3). What remains are the q = 2,  2N parts to be constructed. The construction of these parts is identical to that  of q = 1, which took N + 1 levels of (3.2). Thus, the total number of levels required is 2N(N  +l)+N  = 2N + 3N. 2  This completes the proof of T H E O R E M 1. Q.E.D.  T H E O R E M 1 establishes a bound on the sufficient number of levels required to exactly represent any bounded continuous function defined over a finite domain in 3 or more dimensions. However, the only restriction on the functions g (-) is that they be some 2 dimensional function. ;  In addition, the proof of T H E O R E M 1 is based on Kolmogorov's superposition theorem, and it is a known result that the 1 dimensional functions used in Kolmogorov's theorem are often highly non-smooth [Vitushkin, 1954] [Vitushkin and Henkin, 1967] and therefore not easy to represent in practice.  3.1.2 A Cascade of Finite Order 2 Dimensional Polynomials In this section the functions g {-) are further restricted to belong to the class of finite order t  polynomials. Thus T H E O R E M 2 deals with a more practical scenario in that the functions { - ) gl  are of a class which makes them easier to use in practice. T H E O R E M 2. Let f(x , x  A cz 3i . Let (-) N  gl  be a 2 dimensional  x ) be a continuous N  polynomial  function  defined on a compact  of finite order greater  subset  than 1. 
Then, for each  Chapter 3: The SPORE-J Regression Function  37  £ > 0, there exists a finite number, L, of functions  \f{x ,  £/(•) and real numbers  ...,x )-R {x ,  x  N  L  OC such that /(  (3.10)  ...,x )\<z  x  N  where / ^ ( x ) is defined in (3.1).  Proof: It is a known result that a continuous function / ( * , , ...,x ) N  compact subspace can be approximated by polynomials in x  (see Theorem 6 on page 10  ...,x  u  defined over a  N  in [Lorentz, 1986] for a statement of the appropriate theorem). This representation has the form £  f(x ...,x )= v  N  ...,i e N  where b^  b _ {0, 1,2, ...} i i t  t i i t  (3.11)  -xi-x%  - are polynomial coefficients. Thus, we shall prove T H E O R E M 2 by showing that t  any polynomial in x ,  x  x  N  can be represented by the cascade structure defined in (3.1), where  the ( - ) are 2 dimensional polynomials of finite order greater than 1. B y this constraint, the g[  simplest form of functions used in the cascade is: 2 = a 0 0 + a ] 0 u + a 0 1 v + a uv  gl(u,v)  2  + a M +n2 a  (3-12)  v  20  lx  This and every other 2 dimensional polynomial of order greater than 1 must at least have the terms a w and a uv. 10  Given these two terms, it is simple to show that any term in (3.11) can be  lx  equated to the cascade structure (3.1). A s an example, consider the polynomial term 3 35  3  • We first construct the JC, part of this term as outlined in Figure 3.2. Starting at any  ^335 ' i 2 3 x x  x  arbitrary level / , set a = 0 , x t  k  = x,, a  for levels p = / + 1 and p = / + 2 set a  p  1 0  = 1, and all other parameters of ( - ) to zero. Then, g[  = 0, x  gp(-) to zero. This gives us the x  {  P  part of b  =x, a  k  3 3 3 5  x  u  = 1 , and all other parameters of  3 35  • xxx. {  follows. For levels p = / + 3 to p = / + 5 set a  p  2  3  3  Next, we construct the x part as  = 0, x  2  k  =x, a 2  u  = 1, and all other  parameters of g (-) to zero. Next, for the x^ part, for levels p = / + 6 to p = 1 + 9 set a p  x  =x , a  k  3  oc = b p  3 3 5  p  = 0,  = 1, and all other parameters of g (-) to zero. Finally, for level p = /+10,set  n  p  , x  k  = x , a = 1, and all other parameters of g p (-) to zero. This gives us the 3 3 5 3  P  completed term b  3 3 5  n  •xxx x  2  3  . In a similar manner, we can construct any other term in the polyno-  mial (3.11). Hence, given that we can construct any polynomial term using sections of the cascade structure in (3.1), and because (3.1) is a sum of such sections, it follows that any polynomial structure can be represented by the cascade structure defined in (3.1). This completes the proof of  Chapter 3: The SPORE-1 Regression Function  38  T H E O R E M 2. Q.E.D.  3.2 Learning Algorithm for the SPORE-1 Regression Function In the previous section we have shown that the SPORE-1 regression function is a universal approximator. However, we have not shown how to construct the functions g (-) and real ;  numbers a . One such algorithm is presented next. t  3.2.1 The SPORE-1 Learning Algorithm We assume that M  L  2 sets: a training set T  T  M  v  learning input/output examples (x , y ) e T have been divided into q  containing M  L  input/output pairs, and a validation set T containing  T  v  input/output pairs. In Chapter 4, various methodologies are described for dividing learning  data into training and validation sets. There are 3 preset learning parameters, symbolized by depth j , depth , 2  and e. 
The function of each of these parameters is explained in the following  algorithmic description, and is further elaborated on below. A single cascade of  functions is  constructed in a series of sections. A section of a cascade between levels i and  symbolized by  Rj, is defined as follows:  l  j  l= i  When the section being constructed can no longer reduce mean squared error, the learning outputs y are replaced by residual errors due to Rj and a new section is begun starting from the last error l  reducing function gj(-).  Throughout the remainder of the chapter we assume that the functions  are 2 dimensional polynomials of finite order: I +  g {u,v) t  =  j<  K  2  fly-MV  (3.14)  i, ye {0, 1,2, ...'} where K is the order of the polynomial, and a- are its coefficients. Stated in detail, the construction of the regression functions, R (x), L  proceeds as follows (note that each SPORE-1 regression  function has only one output, so for problems with multiple outputs, one regression function must be constructed for each output):  Chapter 3: The SPORE-J Regression Function  39  Algorithm 1: SPORE-1 S T E P 1: Initialize algorithm:  Initialize the 2 counters i = 1 and p = 1, where / and p are used  as follows: / indexes the function g-(-) at the start of the current section; p is a subscript indicating that section p of the cascade will be constructed in step 2. Throughout the algorithm the subscripts k ,  k (used to identify the independent variables) are randomly  0  L  selected, one at a time, from the set {1,  N} , without  replacement. When the set is empty,  N} , and the process of selecting these subscripts continues.  it is re-initialized to {1,  S T E P 2: Construct new section: Construct, in order, the functions £,•(•)>£,• + ](•)> • • •, gj( •), where j - i> depth j, as follows:  1. Obtain a bootstrap sample set T  (i.e. M  T  randomly chosen samples with replacement)  T  from the training set T . T  2. Given r , using singular value decomposition, determine the model parameters r  {a } pa  of the function g(-) which minimize the following least squared error criteria:  Ho-} =  a r  g  M I N  (3.15)  H } 0  The construction stops with function gy(-) when the change in cascade error on the validation set T ,  over the past depth + 1 level additions, is less than some small, positive number £.  v  }  This is defined as follows:  MSE( Rj)r l  1 depth j  where R = l  MSE(R )  q Tv  a (  / ' £/(') ^  s t  I  MSE{ R ) l  q Tv  \q = j - 1 - depth/ n  e  (3.16)  <e  v  JJ  constructed cascade between levels i and q; and  is the mean squared error of this cascade on the validation set T . v  addition of each new level, the scaling factors a , for / = l  With the  j, are recomputed as follows:  Chapter 3: The SPORE-1 Regression Function  40  (MSE( (-)) 8l  )"'  r  a, =  -  (3.17)  1  ZWSEig (-)) )-\ q  rT  q= i  where MSE( (-)) g[  is the mean squared error of the single function g (-)  r  t  on the training set  T . Hence, a is simply the inverse of the normalized mean squared error of level / , and its T  t  purpose is to give more weight to levels which contribute more significantly to error reduction. STEP 3: Prune section: Next prune the current section back to the level which has the best mean squared error as follows. Find the minimum  level s e {/- 1, i,  j}  which gives the least  mean squared error of the cascade on the validation set T , in the following sense: v  choose^<r s.t. 
V ( r e {i,...,j}),  Delete all functions g ] ( • ) , g J+  i+ 2  - scaling factors CL , for / = i, T  MSE{ R )r <MSE(R )r  (3.18)  l  s  v  t  v  (0> •••> g / ' ) • Set j - s and, using (3.17), recalculate the j . The current state of the cascade is now given by:  /= l  Note that if s = / - 1 then all of the functions just constructed in step 2 are deleted. STEP 4: Update learning outputs: Update the output values, y •, of both the validation set T  v  the training set T  and  as follows:  T  s V((x,y)6 r  v  u r » = > y = y-X<*/•*/(•)  (3.20)  l= i  The sets T  v  and T  T  now contain the residual approximation errors after the cascade has been  constructed to pruned level s. Let tion set T  v  be the variance of the output values, y •, of the valida-  (as defined in step 1, the index p refers to the section just constructed). Thus, p  p  is equivalent to the mean squared error of the approximation R (x) on the validation set s  p  p  T. v  is used in STEP 5 to establish a stopping condition under which the construction of the  Chapter 3: The SPORE-1 Regression Function  41  cascade stops. STEP 5: Check  stopping  condition:  Go back to STEP 2, constructing at least depth  2  +1  sections, until the following condition is met: f  \ 1 depth*  (3.21)  <£  p-i \y = p - 1 - depth  2  Therefore, the algorithm will terminate when the mean squared error fails to improve by a factor of e, over the last depth  2  + 1 sections. Before executing STEP 2, set / = s + 1 and  increment the section counter p to initialize the next section. Upon termination, set L = s, the total number of levels constructed. This completes the description of the algorithm. As indicated above, the learning parameters depth j , depth , 2  and £, are all used to define  conditions under which the construction of the cascade stops. In particular, £ defines the relative decrease in error required to justify the addition of more g (-) functions to the cascade. The t  learning parameter depth  }  defines how many levels of the cascade are used to determine the  termination conditions of STEP 2. Similarly, depth  2  and £ are used in STEP 5 to define how  many times STEP 2 is executed before the algorithm is executed. The theoretical significance of these learning parameters is discussed in Section 3.3.1. In Chapter 4, the actual setting of these values is discussed.  3.3 Theoretical Analysis of the SPORE-1 Learning Algorithm In this section we analyze some theoretical properties of the SPORE-1 learning algorithm. In Section 3.3.1 we give convergence and rate of convergence results for the algorithm. In Section 3.3.2 we give complexity results. In Section 3.3.3, we discuss the internal variance reduction property of SPORE-1. And in Section 3.3.4 we define the virtual space partitioning property of SPORE-1.  Chapter 3: The SPORE-J Regression Function  42  3.3.1 Convergence Results We begin the theoretical analysis of SPORE-1 by analyzing its convergence properties. In T H E O R E M 3 we prove that, under appropriate regularity assumptions, the SPORE-1 algorithm monotonically converges to some finite approximation error. In T H E O R E M 4 we give sufficient conditions for the approximation error to converge to zero. In T H E O R E M 5 we extend this result by showing that the rate (as a function of the number of learning sample points) at which the learning algorithm converges to this approximation error, is independent of the dimension of the dimension of the target function. 
The proofs of T H E O R E M 3, T H E O R E M 4 and T H E O R E M 5 are based on 4 lemmas which we state and prove below. The reader should note that in the following, the learning data F updated learning  data  L  always refers to the  or residuals as defined in S T E P 4 of the learning algorithm.  In the first lemma, we show that the approximation error on any bootstrap sample used in the construction of a function £/(•)> must either decrease or stay the same; it can never increase. L E M M A A . For all / > 1, the following index for the summation  V  is true (note that the subscript  q serves as an  ):  (3.22)  Proof: From S T E P 2 of the SPORE-1 learning algorithm, the parameters of g ,(•) are chosen [+  as follows: =  i ij} a  a r  g  M I L  W  (3.23)  ^Z(8i \(u,v)-y ) +  lr  q  B  Given that u = g and v = x l  T  we could set coefficients of all terms with x  k  k/  ^ to zero if they  increase error. Therefore, the error can never increase. Similarly, error will decrease if and only if v = x  contributes to error reduction within the parametric structure of g  k  / + 1  ( - ) - Thus,  L E M M A A is true by definition. Q.E.D. The implication of L E M M A A is that the mean squared error of any new level MSE(g  l  +  j(-)) „, on any bootstrap sample T , is as least as small as the mean squared error of T  the previous level. Thus, even if v = x  k  I + 1 , the parametric form of g  [+  does not contribute to the lowering of error at level  , (•) ensures that for any bootstrap T , the previous best error is T  Chapter 3: The SPORE-1 Regression Function  43  maintained. Next we examine how the expected error, on the entire learning data set Y , behaves with L  the addition of a new function g  t +  , (•) to the cascade. This requires us to make some assumptions  about the function which originally generated the learning data. Assumption A: We will assume that all the learning data Y , and therefore all of the data L  in the training  set Y  T  and the validation  set Y ,  are independently  y  generated  by some  unknown  function: y = /(x) where f(x)  is bounded,  continuous  some noise term with the property the probability variance,  density function D).  V(xe  (3.24)  + e(x)  and defined over some closed domain that V(x £ D),  E(e(x))  N  = 0. Furthermore,  define fix)  as  and e(x)  is  we assume that  o/e(x) is continuous,  which defines the distribution  Note that one can equivalently  D cz 3l ,  and has finite  follows:  oo  /(x)  = E[y\x)  =  \ yh(y\x)iy(dy)  (3.25)  —oo  where, h(y\x)  is a continuous  some measure on  probability  density function  with bounded  variance,  and q>(dy) is  .  In the following lemma, we use Assumption A to show that the expected approximation error on the learning data either decreases or stays the same. L E M M A B. Given Assumption A, and for sufficiently then for all l>  1, the following E[£(8i i(8i, +  large sample sizes M  T  and  My,  is true:  xJ kl  -y/]  < £ [ ] • > / . ( • ) -y ) ]  0-26)  2  q  Proof: From L E M M A A , we know that L E M M A B is true if the learning set Y is equivaL  lent to the bootstrap training set: i.e. Y = Y . Hence, it is reasonable to assume that it will also be T  true if r Y  T  r  L  is a "sufficiently good" representation of the distribution of points in the training set  and the validation set Y . Thus, we prove this lemma by showing, given Assumption A, Y v  becomes sufficiently close (in the probability distribution sense) to Y  T  T  M. 
v  and Y  v  for large M  T  and  Consider any arbitrary point x, e D. From the Central Limit Theorem, we know the follow-  ing: if y,,  y  n  are independent random observations from a population with probability  Chapter 3: The SPORE-1 Regression Function  44  1  2  function h(y\x ) for which the variance o [y] is finite, the sample mean y = ~2"-  l s  x  approximately normally distributed when the sample size is reasonably large, with mean E[y] 2  and variance (a [ y ] ) / « [Nester et al., 1990]. Therefore, if we limit all learning sample points to be generated from any single point X j e D and we make M  and M  sufficiently large, the  following is true due to Assumption A and the CentralE Limit Theorem;  (3.27)  T  = \^L(8i(-)-y )  ^8i \isi,x J-y )  2  +  kl  v  q  q  •r  —'  — I -— i  This is true because, at any single point x, e D, the construction of the function g (g , x l+ x  t  as defined in (3.23) is equivalent to finding the mean value of the bias coefficient a  )  k  0 0  ; i.e.  8i+i(8i> ic! ,) reduces to a single mean value which is equivalent to the mean of a one x  dimensional distribution. In addition, assuming that the variable x  k  ^ does not contribute to  reduction of error, then because g (-) and f(x) are both bounded and continuous over a finite t  domain, and because the distribution of e(x) is also continuous, it must be the case that (3.27) is also true for any finite local region (within the domain D) about any arbitrary point x, 6 D . This is true because the continuity constraints on g {-) l+x  imply that as any point (w,, V j ) approaches  another point (u , v ) , then g (u , v,) must approach g , ( M , V ) ; hence, this is analogous 2  l+ l  2  l  l+  2  2  to taking an elastic surface and smoothly bending it according to the constraints of (3.23). Furthermore, under the same conditions, because this is true for any arbitrary point X j e D, (3.27) must be true over the entire domain D (i.e. simply stretch the above elastic surface to include all of D). Thus, at minimum, the expected approximation error will remain unchanged. Similarly, due to the construction of g  E\T(Si (8i,  x  +i  is true if and only if the variable of the parametric structure of g  (g , x ^ ) as defined in (3.23), the condition ;  )-  k  k  y/1 < 4Z(S/(0 - v/1  (3.28)  serves to reduce the approximation error within the context  |  l + x  l + x  (g , x t  k  |  ) . This completes the proof of L E M M A B. Q.E.D.  Next we extend the result of the previous lemma to show that the expected approximation error, over the entire domain D of the target function, either decreases or stays the same. L E M M A C. Let f (x) u  residuals STEP 4,  represent  the updated error function  in STEP 4. Thus before STEP 4 is first executed f (x) u  computes:  after the computation  = /(x),  and the execution  of of  Chapter 3: The SPORE-1 Regression Function  45  = /(x)-£a  f (x) u  (3.29)  •*,(•),  z  /= i and thereafter f (x)  is updated as follows:  u  s  f (x)^f (x)-^a (-) u  u  (3.30)  r§l  /= i Then, given Assumption A, and for sufficiently large sample sizes M  T  and M , for every level v  I > 1, the following is true:  E\\(8i D  ,(*/, x J-f {x)) dD\  < E\\{ {-)-f {x)) dD\ D  2  +  ki  (3.31)  2  u  gl  u  Proof: From Assumption A we know that the samples y have mean f (x). Furtherq  more, because f(x)  and 2/-  i i '£/(') a  a  r  u  bounded and continuous, then so is f (x).  
e  u  The the  proof of L E M M A C is identical to the proof of L E M M A B , with the exception that the approximation errors are now integrated over the entire domain D and not simply summed over the learning set T . Q.E.D. L  Next, using the above lemmas, we show that the SPORE-1 learning algorithm must converge to some finite approximation error. It is important to establish that SPORE-1 converges to some error function because this allows us to establish conditions under which the algorithm terminates. Algorithms which do not terminate are rarely useful in regression. T H E O R E M 3. Given Assumption A, and for sufficiently large sample sizes M  T  then for every £ > 0 f e e  and M , v  there exists a level I such that the following is true:  EU(R D  <£  (x)-R (x)) dD 2  l  +  p  t  (3.32)  where p > 1 is any positive integer (note that this theorem implies a monotonically converging sequence).  Proof: Consider the construction of the function R (x) during the first execution of S T E P L  2. Let  Chapter 3: The SPORE-1 Regression Function  46  I  R (x)  = X <* -8 (-)  t  P  (3-33)  P  p= i  and /+1  = 5>VM-> P  ( 3  = \  -  3 4 )  Note that we have put a prime on a' to distinguish that they are different from a . Therefore, p  /+1  /  i? (x)-/?,(*) = 2 <V M ' ) - I /+1  a , •*„(•)  (3.35)  Simplifying we get: */ i(*)-*/(x) =  -S/ i(-J-  +  +  E (a'p-V-M ) -  (  3  3  6  )  where, in the limit of M —> oo, T  JU (-)-/(x)) ^D 2  p  a „ ->  = -r-e2  V  X(J(*,(•)-/(x)) dDj 2  q= i D  I  (3.37)  X % q= i  and ,„V'  .2  « ' P "> ITT  ; = TTT^ ,2  X (/(*,(•)-/(x)) dDJ 2  q= i D  (- > 3  38  £ 01  q= i  where, for all /? > 0 ®  p  = {\(g (-)-Hx))dDJ p  D  Therefore, we can write oc' - a as p  (3.39)  Chapter 3: The SPORE-1 Regression Function  47  (  ^  \q = i  J  1  p a  p -  a  p  /+1  /  q= i  q=i  //+ I  2*,J  -O  ^ =i  Y  I*,  \  '  (3.40)  I*,  Simplifying, we get V  f -q>  *\  /+ l  (3.41)  /+1  5>« because /+ I  a'  (3.42)  Therefore, we can simplify (3.36) as follows: substituting cc' - a Ri iW-R[(x)  =  +  a'  /+1  -g (-)-a' /+1  p  X  /+1 P  with -oc'  / +  <VM'>  jOC^ (3.43)  = I  Substituting in Equation (3.33) we get:  /? (x)-/? (x) = a ' /+1  z  z+1  -g^O-a^j/tyx)  (3.44)  which simplifies to R (x)-R (x) l+l  =  t  a' (g (-)-RL(x)) l+l  Finally, from L E M M A C, and assuming that  l+l  (3.45)  > 0 for all / (note that if this assumption is  false, then T H E O R E M 3 is tme by default due to L E M M A C), as / - » « , E[O  / + 1  -<D ]->0 z  (3.46)  therefore, we have an infinite sum of a constant, giving infinity, or /+ l  ct  <7  (3.47)  Chapter 3: The SPORE-1 Regression Function  48  and as a result  /+1  (3.48)  -»0.  -  V  X*, q=  J  l  Therefore, as / —> °o, Equation (3.45) becomes: £(/?  z + 1  (3.49)  (x)-/e (x))->0. z  Thus, T H E O R E M 3 is true during the first execution of execution of STEP 2, and because of the continuity arguments in L E M M A C, the same argument can be made during each subsequent execution of STEP 2. This completes the proof of T H E O R E M 3. Q.E.D. T H E O R E M 3 establishes that the SPORE-1 Learning Algorithm converges to some function ^ ( x ) as / —> ~ . Thus, for any fixed sequence k , Q  , the approximation error  converges to the following: (3.50) D  In addition, T H E O R E M 3 allows us to define theoretically optimal values for the learning parameters depth j, depth , and e (see STEP 2 and STEP 5 of the SPORE-1 Learning 2  Algorithm). 
First consider $\varepsilon$, which defines a cut-off on the minimum allowable decrease in approximation error. From the proof of THEOREM 3, we know that this rate of decrease monotonically converges to zero as the number of levels grows. Hence, by stopping the algorithm when some $\varepsilon$ rate of decrease is reached, we know that any future expected decrease will not exceed this amount. Thus, in order to achieve the error given in (3.50), we must let $\varepsilon \rightarrow 0$. In addition, the true rate of potential error reduction is measured only when $p \rightarrow \infty$ in (3.32). Therefore, the parameters $depth_1$ and $depth_2$, which are related to $p$, should also be large in order to achieve (3.50).

In the next theorem, we establish sufficient conditions under which the approximation error is reduced when a new level is added to the regression function. In order to do this, we need to make a further assumption, as follows.

Assumption B: For simplicity, we define the following:

$$\Psi = E\Big[\int_D \big(g_{l+1}(\cdot) - f_u(x)\big)^2 \, dD\Big] \qquad (3.51)$$

$$\Upsilon = E\Big[\int_D \big(g_{l+1}(\cdot) - f_u(x)\big)\big(R_l(x) - f(x)\big) \, dD\Big] \qquad (3.52)$$

$$\Omega = E\Big[\int_D \big(R_l(x) - f(x)\big)^2 \, dD\Big] \qquad (3.53)$$

Given the above definitions, the following 3 conditions are assumed to be true. First, the approximation error of $R_l(x)$ is greater than the correlation between the errors due to the new level function $g_{l+1}(\cdot)$ and the approximation error due to $R_l(x)$. This can be stated as follows ($f_u(x)$ is defined in LEMMA C):

$$\Omega > \Upsilon \qquad (3.54)$$

In other words, this condition is satisfied when the errors due to $g_{l+1}(\cdot)$ are sufficiently uncorrelated with those due to $R_l(x)$. The second condition requires that the correlation between the errors due to the new level function $g_{l+1}(\cdot)$ and the approximation error due to $R_l(x)$ satisfy the following:

$$\Psi > 2\Upsilon - \Omega \qquad (3.55)$$

Once again, this condition is satisfied when the errors due to $g_{l+1}(\cdot)$ are sufficiently uncorrelated with those due to $R_l(x)$. The final condition which we need to meet is that the scaling factor $\alpha'_{l+1}$ satisfies the following:

$$\alpha'_{l+1} \;\le\; \frac{-(\Upsilon - \Omega) - \sqrt{(\Upsilon - \Omega)^2 - (\Psi - 2\Upsilon + \Omega)(1 - e)\,\Omega}}{\Psi - 2\Upsilon + \Omega} \qquad (3.56)$$

where $e = 1 - \delta$, and $\delta \in \mathbb{R}$ is an arbitrarily small positive real number defining the amount of error reduction due to the addition of a new level to the regression function. If we let $l \rightarrow \infty$, and thus $\alpha'_{l+1} \rightarrow 0$ (see proof of THEOREM 3), this condition is also satisfied when the errors due to $g_{l+1}(\cdot)$ are sufficiently uncorrelated with those due to $R_l(x)$. Hence all 3 conditions require that the approximation errors of a new level are uncorrelated with those of previous levels. As discussed in Section 3.3.3, the SPORE-1 learning algorithm encourages this condition to occur through the use of bootstrap training samples.

Using Assumption B, we now state the following theorem.

THEOREM 4. Given Assumption A and Assumption B, and for sufficiently large sample sizes $M_T$ and $M_v$, for every $l \ge 1$ the following is true:

$$E\Big[\int_D \big(R_{l+1}(x) - f(x)\big)^2 \, dD\Big] \;\le\; e \cdot E\Big[\int_D \big(R_l(x) - f(x)\big)^2 \, dD\Big] \qquad (3.57)$$

where $e = 1 - \delta$ is defined in Assumption B.

Proof: From (3.44) we can write $R_{l+1}(x)$ as follows:

$$R_{l+1}(x) = \alpha'_{l+1} \cdot g_{l+1}(\cdot) - \alpha'_{l+1} R_l(x) + R_l(x) \qquad (3.58)$$

This can be written as:

$$R_{l+1}(x) = \alpha'_{l+1} \cdot g_{l+1}(\cdot) + (1 - \alpha'_{l+1}) R_l(x) \qquad (3.59)$$

In addition, we can write $(R_{l+1}(x) - f(x))$ as:

$$\big(R_{l+1}(x) - f(x)\big) = \alpha'_{l+1}\big(g_{l+1}(\cdot) - f_u(x)\big) + (1 - \alpha'_{l+1})\big(R_l(x) - f(x)\big) \qquad (3.60)$$
and $(R_{l+1}(x) - f(x))^2$ becomes:

$$\big(R_{l+1}(x) - f(x)\big)^2 = (\alpha'_{l+1})^2\big(g_{l+1}(\cdot) - f_u(x)\big)^2 + 2\alpha'_{l+1}(1 - \alpha'_{l+1})\big(g_{l+1}(\cdot) - f_u(x)\big)\big(R_l(x) - f(x)\big) + (1 - \alpha'_{l+1})^2\big(R_l(x) - f(x)\big)^2 \qquad (3.61)$$

Using the above identities and the definitions given in (3.51), (3.52) and (3.53), rewriting (3.57) we get:

$$(\alpha'_{l+1})^2\,\Psi + 2\alpha'_{l+1}(1 - \alpha'_{l+1})\,\Upsilon + (1 - \alpha'_{l+1})^2\,\Omega \;\le\; e\,\Omega \qquad (3.62)$$

Solving for $\alpha'_{l+1}$, we get

$$\alpha'_{l+1} \;\le\; \frac{-(\Upsilon - \Omega) - \sqrt{(\Upsilon - \Omega)^2 - (\Psi - 2\Upsilon + \Omega)(1 - e)\,\Omega}}{\Psi - 2\Upsilon + \Omega} \qquad (3.63)$$

The above is true given that Assumption B is satisfied. This completes the proof of THEOREM 4. Q.E.D.

In the last two theorems, we have established conditions under which the SPORE-1 learning algorithm converges. Next, we quantify the rate of convergence of this algorithm. More precisely, we measure how the approximation error decreases as the number of sample points used during the construction process increases. As discussed in Section 2.3 of Chapter 2, this type of rate of convergence analysis is an important indication of an algorithm's usefulness in high dimensional regression. In the following lemma and theorem, we establish that the rate of convergence of the SPORE-1 Learning Algorithm, to the finite approximation error given by (3.50), is independent of the dimension of the function being approximated.

LEMMA D. Let the optimal $g_l(\cdot)$ function be defined as

$${}^{opt}g_l(u, v) = \sum_{\substack{i + j \le K \\ i, j \in \{0, 1, 2, \ldots\}}} {}^{opt}a_{ij} \cdot u^i v^j \qquad (3.64)$$

where the coefficients $\{{}^{opt}a_{ij}\}$ are chosen using an infinite number of training sample points as follows:

$$\{{}^{opt}a_{ij}\} = \underset{\{a_{ij}\}}{\operatorname{argmin}} \int_D \big({}^{opt}g_l(u, v) - f_u(x)\big)^2 \, dD \qquad (3.65)$$

and $f_u(x)$ is the residual error function defined in LEMMA C. Similarly, let the approximated $g_l(\cdot)$ function be defined as

$$g_l(u, v) = \sum_{\substack{i + j \le K \\ i, j \in \{0, 1, 2, \ldots\}}} a_{ij} \cdot u^i v^j \qquad (3.66)$$

where the coefficients $\{a_{ij}\}$ are chosen using a finite number of training sample points as follows:

$$\{a_{ij}\} = \underset{\{a_{ij}\}}{\operatorname{argmin}} \sum_{T_L} \big(g_l(u, v) - y_q\big)^2 \qquad (3.67)$$

Given Assumption A, and for a sufficiently large sample size, the rate of convergence of the function $g_l(\cdot)$ to the optimal function ${}^{opt}g_l(u, v)$, as a function of the number of sample points $M_T$, is independent of the dimension $N$ of the target function $f(x)$ (we assume $N \ge 2$). More precisely, there exists a constant $K$, independent of $N$, such that the rate of convergence is given by:

$$E\Big[\int_D \big(g_l(\cdot) - {}^{opt}g_l(\cdot)\big)^2 \, dD\Big] \;\le\; \frac{K \cdot (T_{g_l(\cdot)})^2}{M_T} \qquad (3.68)$$

where $T_{g_l(\cdot)}$ is the number of terms in the polynomial $g_l(\cdot)$.

Proof: We first prove this lemma for the first level $l = 1$. The coefficients of ${}^{opt}g_1(u, v)$ are given, using an infinite sample size, by the following:

$$\{{}^{opt}a_{ij}\} = \underset{\{a_{ij}\}}{\operatorname{argmin}} \int_D \big({}^{opt}g_1(x_{k_0}, x_{k_1}) - f(x)\big)^2 \, dD \qquad (3.69)$$

Similarly, the coefficients for $g_1(\cdot)$ are given using a finite sample size $M_T$, as follows:

$$\{a_{ij}\} = \underset{\{a_{ij}\}}{\operatorname{argmin}} \sum_{T_L} \big(g_1(x_{k_0}, x_{k_1}) - y_q\big)^2 \qquad (3.70)$$

Thus, given that $g_1(\cdot)$ converges to ${}^{opt}g_1(\cdot)$ as $M_T \rightarrow \infty$ (this is true because $g_1(\cdot)$ is continuous and so is $f(x)$), we can view $g_1(\cdot)$ as being an approximation of ${}^{opt}g_1(\cdot)$.
Given this view, the coefficients $\{a_{ij}\}$ are chosen based on sample outputs $y_q$ which are generated using the following density function:

$$h_{g_1}\big(y \,\big|\, (x_{k_0}, x_{k_1})\big) = \frac{\int_{D'} \big({}^{opt}g_1(x_{k_0}, x_{k_1}) - f(x)\big) \, dD'}{\int_{D'} dD'} \qquad (3.71)$$

where $f(x)$ is defined in (3.25) of Assumption A, and $D'$ is the domain $D$ given that $(x_{k_0}, x_{k_1})$ is fixed (hence $D'$ is an $(N-2)$ dimensional space, spanning the $N-2$ dimensions of $D$ which do not include the axes $(x_{k_0}, x_{k_1})$). Within this context, and by Assumption A, $h_{g_1}(y|(x_{k_0}, x_{k_1}))$ must be continuous with finite variance, and an equivalent representation for ${}^{opt}g_1(x_{k_0}, x_{k_1})$ is:

$${}^{opt}g_1(x_{k_0}, x_{k_1}) = E\big[y \,\big|\, (x_{k_0}, x_{k_1})\big] = \int y \, h_{g_1}\big(y \,\big|\, (x_{k_0}, x_{k_1})\big) \, \varphi(dy) \qquad (3.72)$$

Thus, we view the sample outputs $y_q$ as being generated as follows:

$$y = {}^{opt}g_1(x_{k_0}, x_{k_1}) + \varepsilon(x_{k_0}, x_{k_1}) \qquad (3.73)$$

where the density function of the error term $\varepsilon(x_{k_0}, x_{k_1})$ is continuous and has finite variance $\sigma^2_\varepsilon(x_{k_0}, x_{k_1})$ dependent on $(x_{k_0}, x_{k_1})$. Note that the variance $\sigma^2_\varepsilon(x_{k_0}, x_{k_1})$ is independent of the dimension $N$ because it is calculated using the distribution given in (3.71), which integrates and divides out $N-2$ dimensions of $f(x)$. Now, using the equality $\sigma^2(y) = E(y^2) - (E(y))^2$, we can write

$$E\big[(g_1(\cdot) - {}^{opt}g_1(\cdot))^2\big] = \sigma^2\big[g_1(\cdot) - {}^{opt}g_1(\cdot)\big] + \big(E\big[g_1(\cdot) - {}^{opt}g_1(\cdot)\big]\big)^2 \qquad (3.74)$$

From the Central Limit Theorem we know that $E[g_1(\cdot) - {}^{opt}g_1(\cdot)] = 0$ for each point $(x_{k_0}, x_{k_1})$. In addition, because ${}^{opt}g_1(x_{k_0}, x_{k_1})$ is constant for each fixed point $(x_{k_0}, x_{k_1})$, we can write (3.74) as follows (note that we are using the equality $\sigma^2(y + c) = \sigma^2(y)$ for $c$ constant):

$$E\big[(g_1(\cdot) - {}^{opt}g_1(\cdot))^2\big] = \sigma^2\big[g_1(\cdot) - {}^{opt}g_1(\cdot)\big] = \sigma^2\big(g_1(\cdot)\big) \qquad (3.75)$$

Next, if we symbolize the terms of $g_1(\cdot)$ as

$$g_1(\cdot) = \sum_{k=1}^{T_{g_1(\cdot)}} \theta_k \qquad (3.76)$$

then, using the identity $\sigma^2\big(\sum_i c_i \theta_i\big) = \sum_i \sum_j c_i c_j \sigma(\theta_i, \theta_j)$, where $\sigma(\theta_i, \theta_j)$ is the covariance between $\theta_i$ and $\theta_j$, we obtain

$$E\big[(g_1(\cdot) - {}^{opt}g_1(\cdot))^2\big] = \sigma^2\Big(\sum_{k=1}^{T_{g_1(\cdot)}} \theta_k\Big) = \sum_{k=1}^{T_{g_1(\cdot)}} \sum_{j=1}^{T_{g_1(\cdot)}} \sigma(\theta_k, \theta_j) \qquad (3.77)$$

Now, for all terms $i = j = k$, where $k \in \{1, \ldots, T_{g_1(\cdot)}\}$, the covariance is given by the Central Limit Theorem as (i.e. the variance of a distribution decreases as $\sigma^2(y)/n$ for $n$ samples, if the original distribution had variance $\sigma^2(y)$):

$$\sigma(\theta_k, \theta_k) = \sigma^2(\theta_k) = \frac{\sigma^2_\varepsilon(x_{k_0}, x_{k_1})}{M_T} \qquad (3.78)$$

where $\sigma^2_\varepsilon(x_{k_0}, x_{k_1})$ is the variance of the noise term defined in (3.73). Similarly, for all terms $i \ne j$ the covariance is:

$$\sigma(\theta_i, \theta_j) = \frac{\sigma^2_{ij}(x_{k_0}, x_{k_1})}{M_T} \qquad (3.79)$$

where $\sigma^2_{ij}(x_{k_0}, x_{k_1})$ is some initial covariance between $\theta_i$ and $\theta_j$ (note that $\sigma^2_{ij}(x_{k_0}, x_{k_1})$ measures the covariance between terms of the 2 dimensional polynomial $g_1(\cdot)$, and is therefore independent of $N$). Thus, if we let $\sigma^2_{Max}(x_{k_0}, x_{k_1})$ be the maximum covariance over all of $\sigma^2_{ij}(x_{k_0}, x_{k_1})$ and $\sigma^2_\varepsilon(x_{k_0}, x_{k_1})$, we can write (3.77) as:

$$E\big[(g_1(\cdot) - {}^{opt}g_1(\cdot))^2\big] \;\le\; \frac{(T_{g_1(\cdot)})^2 \, \sigma^2_{Max}(x_{k_0}, x_{k_1})}{M_T} \qquad (3.80)$$

Therefore, integrating over $D$, we get

$$E\Big[\int_D \big(g_1(\cdot) - {}^{opt}g_1(\cdot)\big)^2 \, dD\Big] \;\le\; \frac{K \cdot (T_{g_1(\cdot)})^2}{M_T} \qquad (3.81)$$

where $K = \int_D \sigma^2_{Max}(x_{k_0}, x_{k_1}) \, dD$ is independent of $N$. Thus, this proves LEMMA D for the function $g_1(\cdot)$. The proof for all other levels $l > 1$ follows the same logic and is not given here. This completes the proof of LEMMA D. Q.E.D.
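The dimension-independence argument of LEMMA D can also be checked numerically: only the two inputs $(u, v)$ ever enter the least squares fit, so the estimation error of the 10 polynomial coefficients shrinks roughly as $1/M_T$ no matter how large $N$ is. The following Monte Carlo sketch is ours (the noise level, coefficient values and trial counts are arbitrary) and is not an experiment reported in this thesis.

```python
import numpy as np

rng = np.random.default_rng(1)

def design(u, v):
    # the 10 monomials u^i v^j with i + j <= 3
    return np.column_stack([u**i * v**j for i in range(4) for j in range(4) if i + j <= 3])

def coef_error(M_T, sigma=0.5, trials=200):
    """Average squared coefficient-estimation error of the 2-D cubic fit from M_T noisy samples."""
    true_coef = rng.normal(size=10)
    errs = []
    for _ in range(trials):
        u, v = rng.uniform(-1, 1, M_T), rng.uniform(-1, 1, M_T)
        A = design(u, v)
        y = A @ true_coef + rng.normal(scale=sigma, size=M_T)   # noise plays the role of epsilon
        est, *_ = np.linalg.lstsq(A, y, rcond=None)
        errs.append(np.mean((est - true_coef) ** 2))
    return np.mean(errs)

for M_T in (100, 400, 1600):
    print(M_T, coef_error(M_T))   # the error shrinks roughly like 1/M_T; N never enters the fit
```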
In LEMMA D we proved that the rate of convergence of each function $g_l(\cdot)$ is independent of the dimension $N$ of the target function. In the following theorem, we extend this to show that the rate of convergence of $R_l(x)$ is also independent of $N$.

THEOREM 5. Let ${}^{opt}R_{L_1}(x) = \sum_{l=1}^{L_1} {}^{opt}\alpha_l \cdot {}^{opt}g_l(\cdot)$ be the optimal constructed cascade (to level $L_1$) after the first time that STEP 2 of the SPORE-1 learning algorithm is executed; thus ${}^{opt}R_{L_1}(x)$ is constructed using an infinite number of sample points, and ${}^{opt}\alpha_l$ is defined as:

$${}^{opt}\alpha_l = \frac{\Big[\int_D \big({}^{opt}g_l(\cdot) - f(x)\big)^2 dD\Big]^{-1}}{\sum_{j=1}^{L_1} \Big[\int_D \big({}^{opt}g_j(\cdot) - f(x)\big)^2 dD\Big]^{-1}} \qquad (3.82)$$

where ${}^{opt}g_l(\cdot)$ is defined in LEMMA D. Similarly, let $R_{L_1}(x) = \sum_{l=1}^{L_1} \alpha_l \cdot g_l(\cdot)$ be the constructed cascade (using the usual finite number of sample points) after the first time that STEP 2 of the SPORE-1 learning algorithm is executed. Then, given Assumption A and for sufficiently large sample sizes $M_T$, the rate of convergence, as a function of the number of sample points $M_T$, of the function $R_{L_1}(x)$ to the optimal function ${}^{opt}R_{L_1}(x)$ is independent of the dimension $N$ of the target function $f(x)$ (we assume $N \ge 2$). More precisely, there exists a constant $K$, independent of $N$, such that the rate of convergence is given by:

$$E\Big[\int_D \big(R_{L_1}(x) - {}^{opt}R_{L_1}(x)\big)^2 \, dD\Big] \;\le\; \frac{K \cdot (T_{g_l(\cdot)})^2}{M_T} \qquad (3.83)$$

where $T_{g_l(\cdot)}$ is the number of terms in the polynomial $g_l(\cdot)$.

Proof: Using the identity $\sigma^2(y) = E(y^2) - (E(y))^2$, we write

$$E\big[(R_{L_1}(x) - {}^{opt}R_{L_1}(x))^2\big] = \sigma^2\big(R_{L_1}(x) - {}^{opt}R_{L_1}(x)\big) + \big(E\big[R_{L_1}(x) - {}^{opt}R_{L_1}(x)\big]\big)^2 \qquad (3.84)$$

From the Central Limit Theorem we know that $E[R_{L_1}(x) - {}^{opt}R_{L_1}(x)] = 0$ for each point $x$. In addition, because ${}^{opt}R_{L_1}(x)$ is constant for each fixed point $x$, we can write (3.84) as follows (note that we are using the equality $\sigma^2(y + c) = \sigma^2(y)$ for $c$ constant):

$$E\big[(R_{L_1}(x) - {}^{opt}R_{L_1}(x))^2\big] = \sigma^2\big(R_{L_1}(x)\big) \qquad (3.85)$$

Next, given that $R_{L_1}(x) = \sum_{l=1}^{L_1} \alpha_l \cdot g_l(\cdot)$, we can write the above as:

$$E\big[(R_{L_1}(x) - {}^{opt}R_{L_1}(x))^2\big] = \sigma^2\Big(\sum_{l=1}^{L_1} \alpha_l\, g_l(\cdot)\Big) = \sum_{i=1}^{L_1} \sum_{j=1}^{L_1} \alpha_i \alpha_j\, \sigma(g_i, g_j) \qquad (3.86)$$

For each point $x$, let $\rho_{Max}(x)$ be the maximum covariance over all $\sigma(g_i, g_j)$. Then, given that $\sum_l \alpha_l = 1$ by definition, we can write the above as:

$$E\big[(R_{L_1}(x) - {}^{opt}R_{L_1}(x))^2\big] \;\le\; \Big(\sum_{i=1}^{L_1} \alpha_i\Big)\Big(\sum_{j=1}^{L_1} \alpha_j\Big)\rho_{Max}(x) = \rho_{Max}(x) \qquad (3.87)$$

From LEMMA D, we know that for $\sigma(g_i, g_i) = \sigma^2(g_i)$, the variance has the form:

$$\sigma^2(g_i) \;\le\; \frac{(T_{g_i(\cdot)})^2\, \sigma^2_{Max}(\cdot)}{M_T} \qquad (3.88)$$

Thus, given that the functions $g_l(\cdot)$ are constructed one at a time using a least squared error fit, all other covariances $\sigma(g_i, g_j)$ must have the same form given in (3.88). Therefore, substituting $\rho_{Max}(x)$ with the maximum $\sigma(g_i, g_j)$, (3.87) becomes:

$$E\big[(R_{L_1}(x) - {}^{opt}R_{L_1}(x))^2\big] \;\le\; \frac{(T_{g(\cdot)})^2\, \sigma^2_{Max}(\cdot)}{M_T} \qquad (3.89)$$

Therefore, integrating over $D$, we get

$$E\Big[\int_D \big(R_{L_1}(x) - {}^{opt}R_{L_1}(x)\big)^2 \, dD\Big] \;\le\; \frac{K \cdot (T_{g(\cdot)})^2}{M_T} \qquad (3.90)$$

where $K = \int_D \sigma^2_{Max}(\cdot)\, dD$ is independent of $N$ (see proof of LEMMA D). This completes the proof of THEOREM 5. Q.E.D.

THEOREM 5 therefore shows that the rate of convergence of the error function to the optimal error function is independent of the dimension of the function being approximated. This result is important because it establishes that, for the SPORE-1 Learning Algorithm, the rate at which the constructed cascade approaches the best regression function is independent of the dimension of the function being approximated. Thus the number of sample points required to build an optimal SPORE-1 regression function does not explode with dimension.
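For illustration, the sketch below computes cascade weights of the normalized inverse-error form used above for ${}^{opt}\alpha_l$. The per-level error values in the example are made up, and the function is a minimal example of the weighting rule rather than the SPORE-1 implementation.

```python
import numpy as np

def level_weights(level_errors):
    """Weights alpha_l proportional to the inverse of each level's integrated squared error,
    normalized to sum to one (the form assumed for the optimal cascade weights)."""
    inv = 1.0 / np.asarray(level_errors, dtype=float)
    return inv / inv.sum()

# example: three levels with decreasing residual error on a validation set
print(level_weights([0.30, 0.12, 0.05]))   # later, more accurate levels receive larger weight
```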
3.3.2 Complexity Results

An important property of a learning algorithm for the construction of high dimensional regression functions is how its complexity grows with the number of input variables. Given that the number of parameters of each function $g_l(\cdot)$ is fixed, the computational cost of adding a level $l$ is the same for all levels. In the following, we symbolize the computational complexity of adding a new level to the regression function as $C$. The cost $C$ is dominated by the computations required to determine the polynomial coefficients of $g_l(\cdot)$ (see STEP 2 of the SPORE-1 Learning Algorithm). In the following theorem, we calculate the computational cost of the SPORE-1 learning algorithm.

THEOREM 6. For any fixed sequence $k_0, \ldots, k_L$ of indices defining the order of the independent variables (see STEP 1 of the SPORE-1 Learning Algorithm), and any $\varepsilon > 0$, the computational cost to achieve the convergence

$$E\Big[\int_D \big(R_{l+1}(x) - R_l(x)\big)^2 \, dD\Big] < \varepsilon \qquad (3.91)$$

is $\mu N C$, where $N$ is the dimension of $f(x)$, $C$ is the complexity of adding a level to the cascade, and $\mu > 0$ ($\mu \in \mathbb{R}$) is independent of the dimension $N$.

Proof: From THEOREM 3, we know that a level $l$ satisfying the above condition must exist. Furthermore, due to THEOREM 3, there must be a sequence of positive real numbers $\varepsilon_0, \ldots, \varepsilon_k$, where each $\varepsilon_j$ is the expected value of $\int_D (R_{l+1}(x) - R_l(x))^2\, dD$ after the $j$-th cycle through the independent variables, (3.92) such that $\varepsilon_{j+1} \le \varepsilon_j$ and $\varepsilon_k < \varepsilon$. Thus $k$ is the number of times one must cycle through the $N$ independent variables before the desired convergence is achieved. Note that $k$ depends on how the function $f(x)$ projects, at each specific level, onto the 2 dimensional functions $g_l(\cdot)$. Thus $k$ depends on the distributions $h_g(y|\cdot)$ (see LEMMA D), and not on the dimension of $f(x)$. Therefore we complete the proof of THEOREM 6 by choosing $\mu > k$. Q.E.D.

Thus, THEOREM 6 shows that the computational cost of building a SPORE-1 regression function is linear with respect to the number of predictor variables.

Next consider the effect of the number of sample points $M_T$ used in the construction of the regression function on computational complexity. The main effect of sample size is on the construction of each level of the regression function. Thus, the cost $C$ of adding a single level is some function of the number of sample points, i.e. $C(M_T)$. The actual relationship depends on how the coefficients of $g_l(\cdot)$ are calculated. In the SPORE-1 algorithm the preferred method is Singular Value Decomposition (SVD), and therefore $C(M_T)$ is the cost of doing one SVD calculation (see [Golub and Van Loan, 1993] for details).

3.3.3 Internal Variance Reduction

As outlined in Section 2.2 of Chapter 2, variance reduction is fundamental to constructing regression functions which have low approximation errors. One of the more effective means of variance reduction is to use many different regression functions to create a single prediction using a weighted average of all their outputs. The resulting regression function is referred to as an ensemble regression function. Thus, given $P$ different $N$ dimensional regression functions symbolized by $f_i(x)$, the ensemble regression function $\bar{f}(x)$ is given by:

$$\bar{f}(x) = \sum_{i=1}^{P} \beta_i\, f_i(x) \qquad (3.93)$$

where the $\beta_i$ are some positive scaling coefficients.
Perrone [Perrone, 1993] has shown that optimal variance reduction is achieved when the $\beta_i$ are calculated as follows:

$$\beta_i = \frac{\sum_j C^{-1}_{ij}}{\sum_k \sum_j C^{-1}_{kj}} \qquad (3.94)$$

where $C_{ij}$ is the correlation between the errors of the $f_i(x)$ and $f_j(x)$ regression functions. Furthermore, if the residual errors of any two different regression functions $f_i(x)$ and $f_j(x)$ (i.e. $i \ne j$) are uncorrelated, (3.94) becomes:

$$\beta_i = \frac{\big[E\big((f_i - f)^2\big)\big]^{-1}}{\sum_j \big[E\big((f_j - f)^2\big)\big]^{-1}} \qquad (3.95)$$

This calculation of $\beta_i$ is equivalent to the calculation of the scaling factors $\alpha_l$ computed in STEP 2 of the SPORE-1 learning algorithm. Thus, the SPORE-1 regression function will exhibit optimal variance reduction if all $g_l(\cdot)$ have residual approximation errors which are uncorrelated. Although the learning algorithm does not guarantee that the approximation errors of all of the functions $g_l(\cdot)$ are uncorrelated, this condition is implicitly encouraged through the use of different bootstrap samples and predictor variable inputs for the construction of each level. In THEOREM 4, we showed that error reduction (or equivalently variance reduction) can occur even if two functions $g_i(\cdot)$ and $g_j(\cdot)$ have some correlated residual errors, as long as the conditions in Assumption B are satisfied.

Therefore, SPORE-1 exhibits variance reduction which is internal to its structure. It does not rely on a large ensemble of different high dimensional regression functions, using instead a weighted average of the 2 dimensional functions within its structure. We refer to this as the internal variance reduction property of the SPORE-1 learning algorithm. In Chapter 4, we experimentally verify that this internal variance reduction property is effective in significantly reducing approximation errors.

3.3.4 Virtual Space Partitioning

Consider the pathological situation where

$$\{{}^{opt}a_{ij}\} = \underset{\{a_{ij}\}}{\operatorname{argmin}} \int_D \big({}^{opt}g_l(\cdot) - f(x)\big)^2 \, dD = \{0, \ldots, 0\} \qquad (3.96)$$

Given this condition, we cannot expect a reduction in approximation error to occur, because the optimal solution is $g_l(\cdot) = 0$. In Chapter 5 this condition is avoided by partitioning the space $D$ into a finite number of subdomains where this cannot happen (note that in Chapter 5 it is shown that these subdomains must exist). However, even without space partitioning, this pathological condition is unlikely, because the actual calculation of the coefficients of $g_l(\cdot)$ is done using a bootstrap sample of the training data. Due to replacement during sampling, on average each bootstrap sample contains roughly 63% of the training data [Efron, 1983]. Therefore, a bootstrap sample is essentially a random partitioning of the input space. We term this phenomenon virtual space partitioning.

In Chapter 4, we experimentally verify that virtual space partitioning causes the SPORE-1 regression function to converge to zero approximation error, even when condition (3.96) holds. We verify this by applying SPORE-1 to the 10 bit parity problem. As outlined in Section 4.5 of Chapter 4, a major deficiency of most learning algorithms which build a regression function using lower dimensional functional units is that they cannot represent parity functions that are of higher dimension than their largest dimensional functional unit.
In being able to approximate the 10 bit parity problem arbitrarily well using only 2 dimensional functional units, the virtual space partitioning property of the SPORE-1 algorithm serves to overcome the deficiencies present in other dimensionality-reducing regression functions.

3.4 Summary

This chapter constitutes our first attempt at a theoretical verification of the thesis presented in Section 1.3 of Chapter 1. First, we have shown that the SPORE-1 structure is a universal approximator. Second, we have shown that, under appropriate conditions, the rate of convergence of the SPORE-1 learning algorithm to some finite approximation error is independent of the dimension of the target function. Third, we have given sufficient conditions under which the approximation error converges to zero. Finally, we have shown that the complexity of constructing the SPORE-1 regression function is at most linear in the dimension of the target function.

Chapter 4: Experimental Results

In this chapter, the SPORE-1 learning algorithm is experimentally evaluated. In Section 4.1, the implementation details of SPORE-1 are described. In Section 4.2, the algorithm is evaluated on ten well known regression problems. In Section 4.3, the algorithm is evaluated on very high dimensional regression data. In Section 4.4, SPORE-1 is applied to 3 very high dimensional human-to-robot skill transfer problems. In Section 4.5, SPORE-1 is applied to the 10-bit parity problem. Finally, Section 4.6 concludes the chapter with a summary of experimental results.

4.1 Implementation Details of the SPORE-1 Regression Function

In this section we give the implementation details of the SPORE-1 algorithm described in Section 3.2 of Chapter 3. In all of the experiments described in this chapter, the same version of SPORE-1 is used. Thus, the same class of two dimensional functions $g_l(\cdot)$ is used, with identical settings for the learning parameters $depth_1$, $depth_2$, and $\varepsilon$. Even though the regression data presented in this chapter vary widely in dimension and number of learning samples, no fine tuning of the SPORE-1 algorithm was required. The SPORE-1 algorithm is completely automated, requiring no manual intervention.

The functions $g_l(\cdot)$ are modeled as third order 2 dimensional polynomials as follows:

$$g_l(u, v) = a_{00} + a_{10}u + a_{01}v + a_{11}uv + a_{20}u^2 + a_{02}v^2 + a_{21}u^2v + a_{12}uv^2 + a_{30}u^3 + a_{03}v^3 \qquad (4.1)$$

As indicated in Chapter 3, this choice for $g_l(\cdot)$ is not unique. In THEOREM 2 of Chapter 3 (Section 3.1.2) we showed that $g_l(\cdot)$ must belong to the class of 2 dimensional polynomials of order greater than one. Our original reason for not choosing second order polynomials was that third order polynomials are the smallest order polynomials which incorporate asymmetric terms (i.e. terms such as $uv^2$, which we assumed would allow for faster convergence), which second order polynomials do not. However, preliminary experiments (these results are not reported here) with second order polynomials suggest that they produce approximations which are comparable (with respect to mean squared error) to those produced by third order polynomials. In addition, second order polynomials often produced regression functions which were smaller than those produced by third order polynomials (however, this may not prove true if we exclude coefficients which are insignificantly small; we have not attempted such an analysis here).
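A minimal way to represent and evaluate the building block of equation (4.1) is sketched below; the coefficient values in the example are arbitrary and purely illustrative.

```python
def g(u, v, a):
    """Evaluate the third-order two-dimensional polynomial of equation (4.1).
    `a` maps exponent pairs (i, j), with i + j <= 3, to the coefficients a_ij."""
    return sum(coef * (u ** i) * (v ** j) for (i, j), coef in a.items())

# example with arbitrary (illustrative) coefficient values
coeffs = {(0, 0): 0.1, (1, 0): -0.4, (0, 1): 0.7, (1, 1): 0.2, (2, 0): 0.05,
          (0, 2): -0.3, (2, 1): 0.0, (1, 2): 0.15, (3, 0): 0.01, (0, 3): -0.02}
print(g(0.5, -1.0, coeffs))
```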
Our original reason for not choosing higher order polynomials is our goal of testing the SPORE methodology under the simplest implementation, i.e. very simple polynomial building blocks (only 10 coefficients). As with second order polynomials, preliminary experiments (these are not reported here) indicated that fourth, fifth and sixth order polynomials produce regression functions which are comparable (with respect to mean squared error) with those produced with third order polynomials. The main difference is that higher order polynomials produce regression functions which are larger (once again, this may not prove true if we exclude coefficients which are insignificantly small). Thus, even from a practical perspective, it appears that the choice of $g_l(\cdot)$ may be arbitrary, as long as it belongs to the class of 2 dimensional polynomials of order greater than one.

The three learning parameters were assigned the following values: $depth_1 = 25$, $depth_2 = 6$, and $\varepsilon = 0.0001$ (see Section 3.2 of Chapter 3). These values were consistently found to be sufficiently close to the theoretically motivated values given in Section 3.3.1 of Chapter 3; i.e. theory suggests that large positive values should be assigned to $depth_1$ and $depth_2$, while $\varepsilon > 0$ should be made as small as possible (see STEP 2 and STEP 5 of the SPORE-1 Learning Algorithm in Section 3.2 of Chapter 3).

4.2 Evaluating SPORE-1 on Standard Regression Problems

In this section we evaluate the efficacy of the SPORE-1 algorithm by applying it to 10 well known regression problems from the literature (see Table 4.1) [Grudic and Lawrence, 1997]. These problems were chosen because they have been extensively studied, with many different learning algorithms applied to them. All of these regression examples are comparatively small, ranging in dimension from 4 input variables to 16, with learning sample sizes ranging from 80 to 15,000. Although these regression problems are not the type of large, very high dimensional problems that SPORE-1 was designed for (i.e. more than 100 input variables with about 40,000 learning examples), they are among the largest regression problems available in the literature. In addition, they allow us to directly compare the performance of SPORE-1 to many different learning algorithms. Our comparisons are based on 3 different publications: Breiman [Breiman, 1996a], Rasmussen [Rasmussen, 1996], and Jordan and Jacobs [Jordan and Jacobs, 1994]. In the following, we give a brief description of the regression data used in each of these publications.

4.2.1 Regression Data Used by Breiman

Breiman studied 5 different regression problems in [Breiman, 1996a]. His goal was to determine how Bagging (see Section 2.2 of Chapter 2) affects the approximation error of CART (see Section 2.1.2 of Chapter 2). Since Bagging is a method of improving predictor accuracy through variance reduction, comparing SPORE-1 to the Bagged CART algorithm allows us to directly study the efficacy of the internal variance reduction property of SPORE-1 (see Section 3.3.3 of Chapter 3). Of the 5 regression data sets studied in [Breiman, 1996a], we used the Boston Housing, Friedman #1, Friedman #2, and Friedman #3 data. A brief description of these data sets is given below.

• Boston Housing Data: This data has 12 predictor variables, which constitute various socio-economic indicators in the Boston area, and 1 output variable, which is the median housing price in the corresponding area.
There are a total of 506 examples in this data set, 51 of which are randomly selected to be used as a test set.

• Friedman #1: This is a simulated data set which appeared in [Friedman, 1991]. It has 10 independent predictor variables which are uniformly distributed over $0 \le x_i \le 1$, for all $i = 1, \ldots, 10$. The output is defined by the following equation:

$$y = 10\sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + \varepsilon \qquad (4.2)$$

The noise term $\varepsilon$ is Normally distributed with mean zero and a variance of 1. Half of the 10 prediction variables ($x_6, \ldots, x_{10}$) do not affect the output $y$, so this data is a good test of the ability of the SPORE-1 learning algorithm to ignore variables which do not contribute to reduced approximation error. This data has a total of 200 learning examples and 1000 test examples, all randomly generated using (4.2).

• Friedman #2 and Friedman #3: This is simulated data having 4 inputs and 1 output. The data was used in [Friedman, 1991] and simulates the impedance and phase shift in an alternating current circuit. The Friedman #2 data is generated using the following:

$$y_2 = \Big(x_1^2 + \big(x_2 x_3 - (1/(x_2 x_4))\big)^2\Big)^{1/2} + \varepsilon_2 \qquad (4.3)$$

The Friedman #3 data is generated using the following:

$$y_3 = \operatorname{atan}\!\left(\frac{x_2 x_3 - (1/(x_2 x_4))}{x_1}\right) + \varepsilon_3 \qquad (4.4)$$

The variables $x_1, x_2, x_3, x_4$ are uniformly distributed as follows:

$$0 \le x_1 \le 100, \quad 20 \le \frac{x_2}{2\pi} \le 280, \quad 0 \le x_3 \le 1, \quad 1 \le x_4 \le 11 \qquad (4.5)$$

The noise terms $\varepsilon_2$ and $\varepsilon_3$ are Normally distributed with mean zero and variances $\sigma_2^2$ and $\sigma_3^2$ respectively. The variances are chosen so that the signal to noise ratios are 3 to 1; i.e. the ratios of variances are chosen to be $(\sigma_2^2/\sigma_{y_2}^2) = (\sigma_3^2/\sigma_{y_3}^2) = 1/3$, where $\sigma_{y_2}^2$ and $\sigma_{y_3}^2$ are the variances of the outputs of (4.3) and (4.4) respectively. The learning sets for these data sets had 200 randomly generated samples, while the test data had 1000 randomly generated examples.

In Table 4.1 we compare the performance of SPORE-1 on these data sets to the Bagged CART algorithm. We chose these data sets because the manner in which they were used was fully specified, allowing us to directly compare prediction error results (full specification of the use of the fifth data set, Ozone, was not available in [Breiman, 1996a]).

4.2.2 Regression Data Used by Rasmussen

Rasmussen [Rasmussen, 1996] studied five data sets from real-world regression examples. These included Auto Price data, CPU data, Boston Housing data, MPG data, and Servo data. We give a brief description of these data sets below.

• Auto Price Data: This data set relates the list price of a 1985 automobile to various attributes such as its size, weight and engine capacity. The data set has 10 real inputs and 1 output. There are 80 learning examples and 79 test examples.

• CPU Data: This data set relates CPU performance to various attributes such as maximum and minimum memory size. The data set has 6 real inputs and 1 output. There are 104 learning examples and 105 test examples.

• Boston Housing Data: This data set is a variation of the Boston Housing data described in the previous section. The data set has 12 real inputs, 1 binary input, and 1 output. There are 80 learning examples and 79 test examples.

• MPG Data: This data relates automobile fuel consumption to such attributes as car weight, model year, and horsepower.
The data set has 3 real inputs, 6 binary inputs, and 1 output. There are 256 learning examples and 250 test examples.

• Servo Data: This data relates the rise time of a servomechanism to gain settings and the choice of mechanical linkages. The data set has 2 real inputs, 10 binary inputs, and 1 output. There are 80 learning examples and 79 test examples.

Rasmussen applied 5 different learning algorithms to this data: Monte Carlo, Gaussian Evidence, Backpropagation, MARS, and Gaussian Process. In his paper, Rasmussen was testing the efficacy of these algorithms as the amount of learning data is increased, hence he tested them on various sizes of learning data. Since we are not concerned with data-limited regression in this thesis, we limited our comparisons to the results obtained on the largest learning data sizes. In Table 4.1, SPORE-1 is always compared to the algorithm which achieved the lowest errors on the corresponding data (this is symbolized by the brackets around the published error results). This gives a direct comparison of SPORE-1 with the best algorithm applied to this regression data.

4.2.3 Regression Data Used by Jordan and Jacobs

Finally, we compared SPORE-1 to the forward dynamics data studied by Jordan and Jacobs [Jordan and Jacobs, 1994]. This data was generated by a simulation of a 4 joint robot arm moving in 3 dimensional space. The goal is to predict the acceleration of the 4 joints, given as input the position, velocity, and applied torque of each joint (i.e. a total of 12 predictor variables). There are 15,000 training examples and 5,000 validation examples. This data is highly nonlinear and is a good test of a learning algorithm's ability to predict nonlinear phenomena. Jordan and Jacobs applied 7 learning algorithms to this data: linear regression, Backpropagation, two versions of the Hierarchical Mixtures of Experts (HME) algorithm, two versions of CART, and the MARS algorithm. In Table 4.1 we compare SPORE-1 to the Backpropagation algorithm, which achieved the lowest approximation error of the algorithms studied by Jordan and Jacobs.

4.2.4 Experimental Results

Next, we detail our experimental results on the above regression data. All experimental results reported in Table 4.1 are obtained under the exact experimental conditions reported in the literature. This allows a direct comparison of the performance of SPORE-1 to other algorithms. All learning times reported in this section are for SPORE-1 programmed in C and running on a Pentium Pro 150 using LINUX.

The data sets studied by Breiman [Breiman, 1996a] and Rasmussen [Rasmussen, 1996] have relatively few training examples, and therefore we constructed the SPORE-1 approximations using 10 fold cross-validation (10 fold cross-validation is one effective method of getting good approximations from a small number of data points). The 10 fold cross-validation procedure used is the following: the learning data set was divided into 10 approximately equally distributed sets, and then 10 regression functions were constructed using, in turn, 9 of these sets as a training set and the remaining set as a validation set (see Section 3.2.1 of Chapter 3 for definitions of learning, training and validation data). For each test data point (test data was not used during learning), the outputs of the 10 regression functions were averaged to produce the final approximation output, for which error results are reported.
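The following sketch illustrates this 10 fold cross-validation procedure. Here `fit` is a hypothetical placeholder standing in for the SPORE-1 learner (it is assumed to take training and validation sets and return a callable model); it is not part of the actual implementation.

```python
import numpy as np

def ten_fold_ensemble(X, y, fit, rng=np.random.default_rng(0)):
    """Build 10 regression functions, each trained on 9 folds with the remaining fold
    held out as a validation set, and return a predictor that averages their outputs."""
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, 10)
    models = []
    for i in range(10):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(10) if j != i])
        models.append(fit(X[train], y[train], X[val], y[val]))
    return lambda X_new: np.mean([m(X_new) for m in models], axis=0)
```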
To test reproducibility, 100 independent approximations (independent with respect to the random sequences of input variables and bootstrap samples, as defined in Section 3.2 of Chapter 3) were generated using the learning data: Table 4.1 reports the best test set error, along with the average and standard deviation (s.d.) over the 100 independent runs.

For the data sets studied by Breiman, we followed the same experimental procedure used by Breiman [Breiman, 1996a]. Briefly, this procedure is summarized as follows. For each of the 4 data sets, 100 trials were set up using 100 learning and 100 test sets created randomly according to the guidelines outlined in Section 4.2.1 and in [Breiman, 1996a]. For each of the 100 learning sets a regression function was constructed (using 10 fold cross-validation as described above) and evaluated on the corresponding test set. Experimental results for the 100 experiments are given in Table 4.1. The previously published error for the Breiman [Breiman, 1996a] data refers to the reported average bagged error.
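As an illustration of how one such trial can be set up, the sketch below generates Friedman #1 learning and test sets according to equation (4.2); the random seed and structure are ours, and the sketch is not the code used in this thesis.

```python
import numpy as np

def friedman1(n, rng):
    """Generate n samples of the Friedman #1 data of equation (4.2); inputs x6..x10 are pure noise."""
    X = rng.uniform(0, 1, size=(n, 10))
    y = (10 * np.sin(np.pi * X[:, 0] * X[:, 1]) + 20 * (X[:, 2] - 0.5) ** 2
         + 10 * X[:, 3] + 5 * X[:, 4] + rng.normal(0, 1, n))
    return X, y

rng = np.random.default_rng(0)
# one of the 100 trials: a fresh 200-example learning set and 1000-example test set
X_learn, y_learn = friedman1(200, rng)
X_test, y_test = friedman1(1000, rng)
```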
Finally, for the forward dynamics data studied by Jordon and Jacobs [Jordan and Jacobs, 1994], our evaluation was done using the experimental setup described in their paper for the on-line back-propagation algorithm: learning was done using 15,000 training examples, and learning stopped when the error could no longer be reduced on 5000 validation examples. The SPORE-1 approximation for this data consisted of a single cascade per output: due to the large  Chapter 4: Experimental Results  67  Table 4.1 Performance of SPORE-1 of Published Data SPORE-1 Error (100/10 Runs) Published Source  Breiman. IWda  Rasmussen, 1996  .lord.in and Jacobs. |<W  Data  Pub. Lrror  best  a\e.  s.d.  Housing  11.6  1.95  9.4  8.4  Friedman 1  6.1  1.77  2.88  0.43  Friedman 2  22,100  16,733  19,603  1,413  Friedman 3  0.0242  0.014  0.0189  0.0024  Auto Price  (0.29)  0.21  0.29  0.03  CPU  (0.17)  0.086  0.14  0.025  House  (0.15)  0.128  0.145  0.007  MPG  (0.10)  0.008  0.1  0.004  Servo  (0.15)  0.198  0.25  0.02  For. Dyn.  (0.09)  0.0378  0.041  0.009  Ave. Size of Cascade  90 K-byte (8500 model parameters)  1 M-byte (94,000 parameters)  data size, 10 fold cross validation was not necessary. In order to estimate the reproducibility of SPORE-1 on this data, we constructed 10 independent approximations. The best and average relative error on the validation set (in accordance with [Jordan and Jacobs, 1994]), and the standard deviation over these 10 independent experiments (time constraints did not allow us to do 100 experiments), is reported in Table 4.1. The previously published error, shown in Table 4.1, is the best relative error (on the validation data) of the 7 algorithms studied in [Jordan and Jacobs, 1994]. The forward dynamics data consists of 12 inputs and 4 outputs, which requires 4 SPORE-1 cascades (1 for each output). The algorithm required about 10 hours of computation to build all 4 cascades, and the average size of each cascade was about 1 M-byte (i.e. about 94,000 model parameters). From Table 4.1 we conclude that the SPORE-1 errors on all 10 runs are consistently low, thus the stochastic effect of the learning algorithm had negligible effect on approximation error. From Table 4.1, it is evident that the SPORE-1 algorithm demonstrated as good or better error results on all but 1 (the servo data) of the data sets. However, this result should be interpreted with caution. The specific goal of SPORE-1 is to build a regression function which best represents  Chapter 4: Experimental Results  68  the learning data in the mean squared error sense. The SPORE-1 learning algorithm continues to add levels until the mean squared error can no longer be reduced: this is beneficial if one wants the "best" approximation, but detrimental if one wants a representation of fixed size. Most algorithms referred to in Table 4.1 are parametric and therefore of fixed size. Two exceptions are M A R S and CART. However, even comparing these to SPORE-1 should be done with caution: both M A R S and CART can be used to determine the significance of various inputs and how they interact with one another. Currently SPORE-1 has no such data interpretation features. Nevertheless, we cannot ignore the fact that SPORE-1 consistently produced regression functions which had low prediction errors. Thus, even though SPORE-1 was not designed for small, low dimensional problems, it should be considered a strong candidate for such regression problems if prediction error is important.  
4.3 Evaluating SPORE-1 on Synthetic High Dimensional Problems

In this section, our goal is to study the SPORE-1 algorithm when applied to very high dimensional regression data, under various noise conditions [Grudic and Lawrence, 1997]. All learning times reported in this section are for SPORE-1 running on a Pentium Pro 150 using LINUX. Since no large, high dimensional regression data examples were found in the literature, we applied SPORE-1 to fifteen artificial data sets ranging from 100 to 1,600 input dimensions. For five of these data sets no noise was present, while the remaining ten data sets had noise. For the first five of these "noisy" data sets, noise was present only in the output, while the remaining five data sets had both input and output noise. Input noise was generated by simply allowing only half of all inputs to contribute to the output (thus half of the inputs used in constructing the approximation had no effect on the output). This allows us to determine how effectively the SPORE-1 learning algorithm is able to account for variables which have no predictive power. Output noise was generated using a normal distribution with the standard deviation selected to give a signal to noise ratio of 3 to 1, thus implying that the true underlying function accounts for 90% of the variance in the learning and test data. Specifically, we generated the regression data as follows:

$$y = f(x_1, \ldots, x_N) + \varepsilon \qquad (4.6)$$

where $f(x_1, \ldots, x_N)$ is the target function, and $\varepsilon$ is a Gaussian noise term with mean zero and variance $\sigma_\varepsilon^2$. Under output noise conditions, we chose $\sigma_\varepsilon^2$ such that $(\sigma_f^2 / \sigma_\varepsilon^2) = (3)^2$, where $\sigma_f^2$ is the variance of $f(x_1, \ldots, x_N)$. Thus, because $\varepsilon$ and $f(x_1, \ldots, x_N)$ are independent, we know that

$$\frac{\sigma_\varepsilon^2}{\sigma_y^2} = \frac{\sigma_\varepsilon^2}{\sigma_f^2 + \sigma_\varepsilon^2} = \frac{1}{9 + 1} = 0.1 \qquad (4.7)$$

where $\sigma_y^2$ is the variance of the output $y$. Therefore, given that relative error is defined as

$$\frac{E\big[(f - \hat{f})^2\big]}{\sigma_y^2} \qquad (4.8)$$

where $\hat{f}$ is the constructed regression function, theoretically the best relative error achievable by any approximation is 0.1. The target function $f(x_1, \ldots, x_N)$ is generated as follows:

$$f(x_1, \ldots, x_N) = \frac{1}{1 + \cos\Big(\sum_{i=1}^{N} r_i(x_i)\cos\big(s_i(x_i)\big)\Big)} \qquad (4.9)$$

where $N$ is the input dimension of the data, while $r_i(x_i)$ and $s_i(x_i)$ are continuous 1 dimensional functions which are nonlinear and all different from one another. Equation (4.9) allows us to generate highly nonlinear data of any dimension, while at the same time controlling the complexity of the data via the appropriate selection of the functions $r_i(x_i)$ and $s_i(x_i)$. By data complexity, we are referring to the number of random sample points required to build sufficient nonparametric models of the data, with respect to mean squared error. The more complex the generating function, the more data points are required for nonparametric modeling. For the purposes of this thesis, we have chosen the functions $r_i(x_i)$ and $s_i(x_i)$ such that 40,000 random sample points are sufficient to effectively model the data.

Our choice of the functions $r_i(x_i)$ and $s_i(x_i)$ was done as follows. The functions $r_i(x_i)$ and $s_i(x_i)$ are modeled using linear splines. The spline knot locations were randomly chosen using a uniform random distribution. The number of knots was the same for each function.
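A minimal sketch of how such a family of random piecewise-linear functions can be built is given below. The knot counts and knot values here are arbitrary choices of ours; the thesis specifies only that knot locations are uniformly random and that the number of knots controls the complexity of the data.

```python
import numpy as np

def make_spline_functions(num_funcs, num_knots, rng):
    """Build a family of random piecewise-linear functions on [0, 1], one per predictor:
    knot locations drawn uniformly, with a random (illustrative) value at each knot."""
    funcs = []
    for _ in range(num_funcs):
        knots = np.sort(rng.uniform(0.0, 1.0, num_knots))
        knots[0], knots[-1] = 0.0, 1.0            # cover the whole interval
        values = rng.uniform(-1.0, 1.0, num_knots)
        funcs.append(lambda x, k=knots, v=values: np.interp(x, k, v))
    return funcs

rng = np.random.default_rng(0)
r = make_spline_functions(num_funcs=100, num_knots=20, rng=rng)   # e.g. one r_i per input
s = make_spline_functions(num_funcs=100, num_knots=20, rng=rng)   # and one s_i per input
```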
In addition, each of the predictor variables $x_i$ was modeled as a 2 dimensional parametric function of the form $x_i = \psi_i(\omega_1, \omega_2)$, where the functions $\psi_i(\omega_1, \omega_2)$ are different for each predictor variable $x_i$. The parametric functions $\psi_i(\omega_1, \omega_2)$ are modeled as 2 dimensional linear splines, with knot positions randomly chosen using a uniform random number generator (the number of knots is the same in each of the functions $\psi_i(\omega_1, \omega_2)$, but their locations are all different). Hence, the complexity of the target function $f(x_1, \ldots, x_N)$ in (4.9) is directly controlled by the number of knots in the functions $r_i(x_i)$, $s_i(x_i)$ and $\psi_i(\omega_1, \omega_2)$, and increasing the number of knots increases the number of points required to achieve effective approximations of the target function. One should note that, even though the target function is essentially 2 dimensional in the parameter space $(\omega_1, \omega_2)$, this information is not used by SPORE-1; i.e. the regression data is provided in the form of $N$ dimensional predictor variables $x_i$ and one output variable $y$. Therefore a learning algorithm sees the resulting regression problem as $N$ dimensional.

In order to pick a target function complexity such that 40,000 sample points are required to generate an effective regression model, we followed this procedure. The number of knots used to represent the functions $r_i(x_i)$, $s_i(x_i)$ and $\psi_i(\omega_1, \omega_2)$ was increased until a nearest neighbor approximation (implemented using a simple K-Nearest Neighbor algorithm with K = 1; see Chapter 2 for details), built in the 2 dimensional $(\omega_1, \omega_2)$ parameter space, exceeded a relative mean squared error of 0.01 (relative mean squared error is defined, in the usual sense, as the mean squared error on the test data divided by the variance of the test data). In building this K-Nearest Neighbor approximation, 40,000 random training samples were used, and the relative error was calculated using 5000 randomly picked points. A nearest neighbor approximation gives a good indication of the maximum approximation error, because the 40,000 learning points densely sample the 2 dimensional parameter space $(\omega_1, \omega_2)$, where the nearest neighbor calculations are done (a nearest neighbor approximation done in the $N$ dimensional predictor variables $x_i$ would not be nearly as effective, for reasons outlined in Chapter 2). This procedure gives us a good estimate of the maximum approximation error that the SPORE-1 algorithm should attain; the relative mean squared error of a SPORE-1 approximation, or any effective approximation method, should not exceed 0.01 on any of the high dimensional data presented in this section.

The simulation results for the 15 high dimensional data sets are presented in Table 4.2. For each data set there are 40,000 learning examples and 40,000 testing examples. Each learning set is further divided into 30,000 training examples and 10,000 validation examples: these are used to construct one cascade which forms the approximation for that data set. The three rightmost columns of Table 4.2 contain the test set average relative mean squared error and corresponding standard deviations over 10 independent runs. From Table 4.2 it is evident that the relative error is small when no noise is present, and falls well below the expected maximum error value of 0.01.
With the addition of noise, the relative error approaches the theoretical limit (due to the 3 to 1 signal to noise ratio) of 0.1 (see the explanation given above). Learning time for each data set with no noise was approximately 7 hours, and produced cascades which were between 2 and 6 M-bytes in size (i.e. between approximately 190,000 and 570,000 model parameters). In contrast, learning time for each data set containing noise was about 1 hour, and produced approximations of between 200 and 600 K-bytes in size (i.e. between approximately 19,000 and 57,000 model parameters). The drop in learning time and approximation size occurs because the SPORE-1 learning algorithm stops adding levels when the error on the validation set can no longer be reduced. When the data has output noise, this condition is reached sooner than when there is no noise. The reason for this is that interpolation between sample points is no longer possible when noise is present, once a given level of approximation error (i.e. 0.1) is reached.

Table 4.2 Evaluation of SPORE-1 on High Dimensional Data Sets

  Dimension   No Noise Data             Output Noise Data       Input/Output Noise Data
  100         0.00139 (s.d. 0.00006)    0.1049 (s.d. 0.0003)    0.1048 (s.d. 0.0004)
  200         0.0018  (s.d. 0.0002)     0.1064 (s.d. 0.0003)    0.1055 (s.d. 0.0002)
  400         0.0009  (s.d. 0.0002)     0.1041 (s.d. 0.0002)    0.1064 (s.d. 0.0003)
  800         0.0044  (s.d. 0.0006)     0.1111 (s.d. 0.0003)    0.1038 (s.d. 0.0003)
  1,600       0.0011  (s.d. 0.0003)     0.1087 (s.d. 0.0003)    0.1042 (s.d. 0.0003)

The results in Table 4.2 demonstrate that SPORE-1 can potentially be used on large, high dimensional regression problems. It is interesting to observe that the dimension of the data did not adversely affect either learning time or the size of the approximation. This is made evident in Figure 4.1, where we plot approximation size as a function of data dimension. The "error bars" indicate the variation in approximation size for each specific data type. These experimental results support the theory developed in Chapter 3, where we showed that the rate of convergence of SPORE-1 is independent of dimension, and that complexity is at most linear with dimension.

Figure 4.1 Increase in Approximation Size with Data Dimension (panels: a) Output Noise, b) Input/Output Noise, c) No Noise; horizontal axis: Data Dimension)

A second interesting observation is that SPORE-1 is effective in dealing with both input and output noise, even on regression functions which are very high dimensional.

4.4 Human-to-Robot Skill Transfer Experiments

We chose human-to-robot skill transfer as a real world test of the SPORE-1 learning algorithm. Our motivation for this is that human-to-robot skill transfer is an example of a regression problem which is very high dimensional (1024 inputs), and exhibits both input and output noise characteristics [Grudic and Lawrence, 1995] [Grudic and Lawrence, 1996]. We approach the problem of programming robots that can work in unstructured environments from the point of view of human-to-robot skill transfer. In this paradigm, the human transfers his or her skills to a robot by giving the robot a finite number of examples of how the desired manipulation task is done.
The key to the success of this type of robot programming is that the robot must be able to generalize the desired task, given only the human expert's examples, even when the unstructured environments of interest change after learning has occurred.

Other researchers have had success with similar approaches to the robot programming problem. We mention only a few examples here. Perhaps the most related work is that of Pomerleau [Pomerleau, 1993b] [Pomerleau, 1993a], where he successfully uses a 3 layer perceptron network to control the CMU ALVINN autonomous driving system as it drives along a road. Pomerleau's approach differs from that presented here in that 1) we use a nonparametric approximation approach and 2) the human expert supplying the training data uses only the sensor inputs being fed to the approximation (Pomerleau does not use this restriction). However, as with the approach presented here, Pomerleau uses a large number of sensor inputs (960) to effectively generalize the desired task. Other related work includes that of Kosuge et al. [Kosuge et al., 1991], where a priori knowledge of the task is incorporated with human generated examples to transfer a skill. Other researchers use neural networks to encode the human's skills: for the deburring task, Shimokura and Liu [Shimokura and Liu, 1991], and for compliance control, Asada [Asada, 1990]. Lui and Asada [Lui and Asada, 1992] use process dynamics for transferring manipulative skills. Yang et al. [Yang et al., 1994] use a multidimensional Hidden Markov Model to represent the transferred skill. Rouse et al. [Rouse et al., 1989] also consider symbol processing models as useful tools for capturing human skills and knowledge.

The approach to human-to-robot skill transfer proposed in this section assumes the following: The human operator can accomplish the desired manipulation task by using only the robot's sensors to obtain information about the world, and using only the robot's actuators to manipulate world objects. Thus, the human has only the robot's perspective on gathering world information and manipulating world objects, and no other. This implies that if the human can accomplish the task, then the robot should also be able to accomplish it.

This assumption has at least one important implication for the choice of a learning system, or approximation framework, for generating the sensor to actuator mappings. Humans are usually only experts at manipulation tasks which they can anthropomorphize to their natural frame of reference. In other words, we are accustomed to particular human sensory and action modalities, and as the information for the task becomes more sparse, the task becomes more difficult. As a result, when a human expert is demonstrating a task which is sufficiently complex, it is likely that he or she is using a large number of sensor inputs of various types, including both visual and tactile. Thus, the type of human-to-robot skill transfer proposed here requires an approximation framework which is highly flexible with respect to the number and variety of sensor inputs, and yet is practical to implement.

Figure 4.2 The Human to Robot Skill Transfer Test Bed

In this section, we describe a human-to-robot skill transfer testbed which we have implemented, as well as outline our first skill transfer experiments done in an unstructured environment using the SPORE-1 algorithm.
The skill which we have chosen to demonstrate is loosely based on a typical "object locate and approach" task, and was chosen due to its inherently unstructured characteristics. However, the framework we are in the process of formulating should be generalizable to a large variety of human skills.

4.4.1 The Experimental Testbed

A small scale proof-of-concept testbed for human-to-robot skill transfer has been implemented (see Figure 4.2), consisting of a tracked mobile robot which is approximately 35 centimeters long and 23 centimeters wide. The robot is equipped with 1) left and right tracks, each with independent proportional forward/backward motion similar to that found in forestry excavators; 2) a gripper (not shown in Figure 4.2) to allow it to grasp objects; and 3) an NTSC color camera. The leftmost image in Figure 4.2 shows the robot's workspace and the rightmost image shows a close-up of the mobile robot. Initially this robot is intended to operate in a flat 2.5 meter by 1.8 meter workspace which has randomly placed objects. The intended task of the robot is to manipulate the target objects in a manner defined by a human operator who first accomplishes the desired task a number of times via teleoperation. The human operator observes images received from the camera located on the robot (at a frame rate of approximately 20 to 30 frames per second), and operates two joysticks which control the robot's left/right and forward/backward motion, as well as the open/close gripper motion. A SPARCstation 20, equipped with a frame grabber, is used to display the on-board camera image to the human operator as he or she demonstrates the desired task. The learning and subsequent autonomous control of the robot are also performed by the SPARCstation 20.

The testbed currently uses various small objects, including lego models and model "trees", as objects which must be located and approached. There are 2 lego objects: a red and blue striped tower, and a red and blue striped tower with holes. The "trees" are currently of three different types (tall spruce, short spruce, and short birch), and our goal is to teach the robot to differentiate between them. This testbed is considered unstructured because 1) although there are only three different types of trees and two lego models, no two objects are identical either in height, color distribution, or in texture, making manual encoding of a model difficult (especially for the model trees); and 2) the image background of the objects is a robotics lab which is continually changing as equipment is moved around and people move throughout the lab.

4.4.2 Experimental Results

The first skill which has been successfully transferred from human to robot is described in Figure 4.3. This skill is loosely related to the "tree tending skill," which requires the selection of a certain size and species of tree for inspection. The skill involves turning the robot counterclockwise in place until at least one tall spruce is within the camera's field of view, and then maneuvering the robot into a position where the tree can be inspected. Figure 4.3 shows several typical camera views which the operator sees as the task of locating and moving towards a tall spruce is performed.
Starting at the leftmost image of Figure 4.3, the robot is turning counterclockwise searching for a tree; in the third image a tall spruce has been found and the robot begins to move towards it; in the sixth image the spruce is within inspecting distance and the robot turns off to the left in search of another tall spruce. Note that this first task does not involve avoiding obstacles or actually stopping to inspect the tree; it simply requires the robot to find and move towards a tall spruce tree in the visually cluttered environment of the robotics lab.

Figure 4.3 A "Tall Spruce Tree"

The image which the human operator, as well as the robot, observes while executing the desired task, consists of 1024 pixels derived by sampling a 32 pixel by 32 pixel region in the center of the original colour NTSC image (see Figure 4.3 for gray-scale examples). The sampled pixel resolution is 8 bits: 3 bits for red, 3 bits for green, and 2 bits for blue. There is no image processing done prior to feeding these pixel values to the SPORE approximation; what the human sees is exactly what is passed to the approximator. In order to generalize the desired task the robot must approximate two mappings, one to control the robot's forward and backward motion, and one to control the robot's left and right motion. Both of these mappings have 1024 inputs and 1 output. As usual, two SPORE-1 cascades are generated, one for each dependent variable.

The training data generated by the human operator consisted of demonstrating the above skill a total of 4 times, from 4 random starting positions of the robot, and 4 random positions of the spruce. Due to image resolution limits, the distance between the robot and the "tree" was always 1.5 meters or less. It took approximately 2 minutes (equivalent to 2180 camera frames or input/output examples) for the human operator to do these 4 demonstrations. The SPORE-1 regression function was constructed by randomly subdividing these 2180 samples into 218 validation samples and 1962 training samples. The resulting approximation took about 5 minutes of SPARCstation 20 CPU time to generate, and took about 250 K-bytes of memory space (i.e. approximately 24,000 model parameters). The length of time to evaluate the resulting approximation for one instance of sensor inputs is 191 milliseconds.

Next we asked the following question: given only a few examples of the desired skill, and the raw sensor inputs used by the human, can the SPORE-1 regression algorithm effectively generalize the desired skill? In the case of the above experiment, the human operator generated only 4 examples of the desired task, and the SPORE approximation was fed raw pixel data with no intensity compensation or any other type of signal preprocessing. In addition, one can see from Figure 4.3 that the image generated by the frame grabber inside the SPARCstation 20 is extremely noisy. Nonetheless, when we ran 30 experiments where the robot used the SPORE approximation to autonomously accomplish the transferred skill, the robot never failed to locate and move towards the tree. These 30 experiments were done with random placements of tree and robot (as with the training set, the spruce was always within 1.5 meters of the robot).

Figure 4.4 A Red and Blue Striped Tower
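For concreteness, the following minimal sketch shows one way the 1024-dimensional input vector described above (a 32 pixel by 32 pixel region with 3-3-2 bit colour sampling) could be formed from a camera frame. The function name pack_332 and the use of NumPy are assumptions made for illustration only; this is not the code used in the SPORE-1 experiments.

```python
import numpy as np

def pack_332(rgb_patch):
    """Pack a 32x32x3 uint8 RGB patch into a 1024-dimensional input vector.

    Each pixel is reduced to 8 bits: the 3 most significant bits of red,
    the 3 most significant bits of green, and the 2 most significant bits
    of blue (an assumed reading of the 3-3-2 encoding described above).
    """
    r = rgb_patch[:, :, 0] >> 5           # keep top 3 bits of red
    g = rgb_patch[:, :, 1] >> 5           # keep top 3 bits of green
    b = rgb_patch[:, :, 2] >> 6           # keep top 2 bits of blue
    packed = (r << 5) | (g << 2) | b      # one byte per pixel
    return packed.astype(np.float64).ravel()

# Example: a random frame standing in for the centre region of a camera image.
frame = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
x = pack_332(frame)
assert x.shape == (1024,)
```

Each of the two SPORE-1 cascades described above would then be trained on pairs of such 1024-dimensional vectors and the corresponding joystick value.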
It is interesting to note that the robot does not approach, for instance, upside-down spruce trees, and does not seem to be solely color sensitive; the shape of the tree and its texture are significant as well. We may therefore conclude that, to the extent that the robot acquired the desired skill, it was able to execute it within an unstructured environment.

Next, we repeated the same experiments with the two lego objects. Figure 4.4 and Figure 4.5 show grey-scale camera frame examples of approach sequences to two objects we have experimented with. The object in Figure 4.4 is a red and blue striped tower, and the object in Figure 4.5 is also a red and blue striped tower but has a number of holes through it, which allow background scenes to be observed. For the object in Figure 4.4, the teleoperator generated 4632 camera images during 4 approaches. Similarly, for the object in Figure 4.5, the teleoperator generated 4805 images during 4 example approaches. The variation in camera frame counts is simply a result of the random placement of the vehicle and object. The SPORE-1 approximation was generated for both objects, using a random selection of 10% of the learning points as a validation set, and the rest as a training set. The generation of the 4 approach examples took the teleoperator approximately 5 minutes for each object, while the off-line learning of SPORE-1 required approximately 4 additional minutes on the SPARCstation 20, for each object. The size of the approximation generated for the object in Figure 4.4 is 489 K-bytes (i.e. approximately 44,000 model parameters), while the size of the approximation generated for the object in Figure 4.5 is 552 K-bytes (i.e. approximately 52,000 model parameters). The time required to evaluate either approximation (once it has been constructed) for an observed image is less than 120 milliseconds for either object (run on a SPARCstation 20).

Figure 4.5 A Red and Blue Striped Tower with Holes

Once more, in order to evaluate the ability of the robot to autonomously execute an approach sequence toward an object, we ran 30 experiments for each of the two objects described above. Each experiment involved placing the vehicle and object in random positions, and then seeing whether the robot would autonomously be able to locate and approach the object. For the object shown in Figure 4.4 the robot successfully executed the task in 29 out of 30 trials, while for the object shown in Figure 4.5, the robot was successful in approaching the object in 27 out of 30 trials (only one object was present at each trial). Note that the initial distance between the vehicle and target object was always less than 1.5 meters, because at distances greater than this, the object is not easily distinguishable, even by a human operator. From these experimental results, we can conclude that the SPORE-1 approximation was successful in generalizing all of the above locate and approach tasks, given relatively few noisy examples and raw pixel inputs. This is promising for the many difficult robot programming problems facing robotics researchers.

4.5 The 10 Bit Parity Problem

One of the main criticisms of algorithms which construct high dimensional regression functions using smaller dimensional functional units is that they cannot model interactions which have an intrinsic dimension that is greater than the dimension of the highest dimensional functional unit.
For example, the classic implementation GMDH [Ivankhnenko, 1971] has 2 dimensional functional units, and therefore can at most model the 2-bit parity problem [Elder IV and Brown, 1995]. Similarly, ASPN can at most model the 3-bit parity problem because its functional units are at most 3 dimensional [Elder IV and Brown, 1995]. Note that the N-bit parity problem is one of determining whether an odd or even number of the N bits in a word is high: the target function being, for example, set to 1 if an even number of bits are high, and -1 otherwise.

Using this same train of thought, one may conclude that because SPORE-1 uses 2 dimensional functional units, it should not be able to model parity problems for words of 3 or more bits. This, however, is not the case. As discussed in Section 3.3.4 of Chapter 3, the virtual space partitioning property of SPORE-1 can allow the algorithm to converge to zero error, even when the following pathological condition occurs:

{a_ij} = argmin_{a_ij} ∫_D (g_l(u, v) - f(x))^2 dD = {0, ..., 0}   (4.10)

The above condition occurs in a parity problem of more than 2 bits because the target function value is equally likely to be -1 or 1, for each point (u, v) in the training set. Thus, given that the training set consists of an exhaustive set of parity points, and the training data is applied without modification (i.e. a bootstrap sample is not used), the function g_l(u, v) which always gives the best squared error fit to the training data is the zero function g_l(u, v) = 0.

In order to demonstrate the ability of SPORE-1 to represent 3 or more bit parity functions, we applied it to the 10-bit parity problem [Grudic and Lawrence, 1997]. The training data and the validation data were made to be identical, with each containing the exhaustive 1024 examples of the 10-bit parity function. Our goal was to determine how well SPORE-1 could fit these learning examples. The results achieved were that the relative mean squared error on the learning data was 9.8 × 10^{-9}, indicating that, to within the floating point precision of the CPU, the SPORE-1 algorithm was able to exactly model the 10-bit parity function. However, the rate at which SPORE-1 converged to this exact solution was rather slow: it took about 2 hours and the final approximation was over a megabyte in size (i.e. approximately 94,000 model parameters). If, on the other hand, we stopped the learning algorithm when the parity problem was actually solved (i.e. when the maximum approximation error is less than 1, meaning that every instance of the learning data has been appropriately mapped), the learning time is only 10 minutes, and the size of the approximation is about 40 K-bytes (i.e. approximately 3,800 model parameters).

4.6 Summary

This chapter is our first attempt at an experimental verification of our thesis presented in Section 1.3 of Chapter 1. The SPORE-1 algorithm is our most extreme example of the general SPORE methodology. The structural units are small and all the same (i.e. third order two dimensional polynomials), and no input selection is ever done. We have demonstrated its successful performance on problems ranging from relatively small 6 dimensional regression problems having only 104 learning examples, to 1,600 dimensional regression problems with 40,000 learning examples. Its performance compared favorably to published results on well known data sets in the literature.
We introduced new very high dimensional data and demonstrated that SPORE-1's construction times were independent of dimension. A real world example of human-to-robot skill transfer was also successful despite the raw image pixels used as inputs. Finally, we demonstrated that SPORE-1 does not have the same limitations on parity type problems as other systems which build regression functions using small dimensional functions which are added one at a time.

Chapter 5: The General SPORE Methodology

In Chapter 3 the SPORE-1 learning algorithm was presented, and its basic theoretical characteristics were explored. In this chapter the most general form of the SPORE methodology is presented and the theoretical characteristics of learning algorithms which construct SPORE-type regression functions are presented. In particular, this chapter deals with the rate of convergence and computational complexity properties of the general SPORE learning algorithms, with particular emphasis on how these depend on the dimension of the function being approximated.

5.1 The Complete SPORE Definition

The most general form of the SPORE structure (see Figure 5.1) has C cascades of D dimensional functions. In Figure 5.1, the D dimensional functions are symbolized by g_{i,j}, where i serves to identify one of the C cascades, and j identifies the level of the cascade where the function g_{i,j} is located. In the general case, there are C ≥ 1 cascades of D ≥ 2 dimensional functions, and the dimensions of the functions g_{i,j} need not all be the same. As a specific example, the SPORE-1 structure in Chapter 3 has C = 1 and D = 2 for all g_{i,j} functions. The columns in Figure 5.1 symbolize cascades of functions, and the rows symbolize levels in each of the cascades. Thus, the SPORE structure depicted in Figure 5.1 has C cascades, with the first cascade having L_1 levels, and the final cascade having L_C levels.

Figure 5.1 The SPORE Structure

There are 2 types of inputs to the functions g_{i,j}. The first type are symbolized by X_{i,j} = (x_{k_1}, ..., x_{k_b}), where (x_{k_1}, ..., x_{k_b}) are some subset (perhaps randomly selected) of all of the model input variables (or independent variables). The second type are symbolized by G_{i,j} = (g_{r_1,q_1}, ..., g_{r_J,q_J}), where (g_{r_1,q_1}, ..., g_{r_J,q_J}) is some subset of outputs of functions which have already been added to the structure (note that g_{r,q} refers to the output of the function from level q of cascade r).

The final regression function, symbolized by f in Figure 5.1, is some weighted sum of all constructed functions g_{i,j}. This is symbolized by the vector:

W = (w_{1,1} · g_{1,1}, w_{1,2} · g_{1,2}, ..., w_{C,L_C} · g_{C,L_C})   (5.1)

Given this definition of the general SPORE structure outlined in Figure 5.1, we next give a description of how a general implementation of a SPORE learning algorithm would construct this structure.

The specific structure of the functions g_{i,j} is either 1) fixed throughout the SPORE structure (as with SPORE-1); 2) chosen randomly, either with or without replacement, from a set of possible candidate functions (if without replacement, then reinitializing the list after all have been chosen); or 3) chosen in a specific order from a predefined sequence.
Similarly, the inputs to each g_{i,j} function are either 1) chosen in a specific order (going back to the first one in the list after all have been chosen); or 2) chosen randomly, with or without replacement, from a list of all possible inputs (if without replacement, then reinitializing the list after all have been chosen). This methodology for selecting inputs and function types implies that no time consuming search occurs during the process of constructing the structure.

In all instances of SPORE, the function g_{1,1} is constructed first, its inputs being some D_{1,1} ≥ 2 dimensional projection of the N dimensional learning data. The parameters of the function g_{1,1} are determined using some subset (perhaps even all) of points in a training set containing examples of inputs and outputs. Once the initial function g_{1,1} is constructed, the remaining functions are constructed, one at a time, with the only stipulation being that no function g_{i,j} is constructed until all of its inputs have been constructed. As with the initial function g_{1,1}, the parameters of each function g_{i,j} are determined using some subset (or perhaps all) of the points in a training set, projected onto the inputs (X_{i,j}, G_{i,j}) of g_{i,j}.

The weights in the vector W are either all the same, or chosen using some criteria such as w_{i,j} being proportional to the inverse of the approximation error of g_{i,j} on the training data (note that this is how the weights are chosen in the SPORE-1 learning algorithm).

During construction the desired learning outputs are periodically replaced by the residual errors resulting from subtracting out the approximation due to the structure already constructed. This usually happens when the approximation errors can no longer be reduced by fitting the parameters of the current g_{i,j} function to the current residual errors in the data set. Further additions to the structure are then added to reduce these residual errors.

In the general case, the SPORE construction terminates when the approximation error, on some validation set, has been reduced to some acceptable level. An exception to this rule is when the algorithm allows for the subdivision of the input space of the target function into disjoint subdomains, in which case new SPORE structures are then constructed in each resulting subdomain. The reader should note that the above description conforms with the general SPORE characteristics outlined in Section 1.3.1 of Chapter 1.

5.2 Theoretical Analysis of Rates of Convergence For Two Specific SPORE Structures

In this chapter we study the rate of convergence properties of two specific examples of SPORE: the SPORE-2 algorithm and an extension of the SPORE-1 algorithm defined in Chapter 3. As discussed in Section 2.3 of Chapter 2, there are two types of rate of convergence results. The first assumes an infinite number of learning examples and measures how approximation error decreases as new structures (or parameters) are added to the regression model. The second assumes a finite number of learning samples and measures how approximation error decreases as the learning sample size increases. In this section we address the second type of convergence result.

In Section 5.2.1, we describe and theoretically analyze the SPORE-2 regression function. Specifically, our goal is to define function spaces for which the rate of convergence of SPORE-2 is independent of dimension.
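To make the construction procedure of Section 5.1 concrete before turning to this analysis, the sketch below shows one possible, deliberately simplified reading of the generic loop: third order two dimensional units are fitted one at a time, each new unit takes the previous unit's output together with a randomly chosen raw input (so no search is performed), later units are fitted to the residual errors left by earlier ones, and construction stops when the validation error no longer improves. The function names are hypothetical, and the sketch treats every level as a residual-fitting step, which simplifies the weighting and residual-replacement schedule described above; it is not the SPORE-1 or SPORE-2 implementation itself.

```python
import numpy as np

def fit_poly2d(u, v, target):
    """Least-squares fit of a third order bivariate polynomial to `target`."""
    def terms(a, b):
        return np.stack([a**i * b**j for i in range(4) for j in range(4 - i)], axis=1)
    coef, *_ = np.linalg.lstsq(terms(u, v), target, rcond=None)
    return lambda a, b: terms(a, b) @ coef

def build_cascade(X, y, X_val, y_val, max_levels=100, seed=0):
    """Fit a cascade of 2-D units, choosing one raw input at random per level."""
    rng = np.random.default_rng(seed)
    units = []
    resid, resid_val = y.astype(float), y_val.astype(float)
    carry, carry_val = X[:, 0].astype(float), X_val[:, 0].astype(float)  # seed input
    best_val = np.mean(resid_val**2)
    for _ in range(max_levels):
        k = int(rng.integers(X.shape[1]))              # random input, no search
        g = fit_poly2d(carry, X[:, k], resid)
        out, out_val = g(carry, X[:, k]), g(carry_val, X_val[:, k])
        if np.mean((resid_val - out_val)**2) >= best_val:
            break                                      # validation error stopped improving
        resid, resid_val = resid - out, resid_val - out_val
        carry, carry_val = out, out_val
        best_val = np.mean(resid_val**2)
        units.append((g, k))
    return units
```

A prediction for a new input vector would then be obtained by running it through the same chain of units and summing the level outputs.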
In function spaces where such a dimension-independent rate of convergence holds, it is potentially possible to avoid both the computational complexity and large sample size aspects of the curse of dimensionality. Our aim is to analyze theoretically motivated regression functions which point the way to real implementations that are feasible. The theoretical constructions presented should be viewed with this in mind. In Section 5.2.2, we extend the SPORE-1 regression function described in Chapter 3 to include the partitioning of the input space into disjoint subdomains. This subdivision of the input space guarantees that the approximation error converges to zero, while simultaneously preserving the rate of convergence and complexity properties of the SPORE-1 learning algorithm within each of these subdomains.

5.2.1 The SPORE-2 Regression Function

As the primary goal of this thesis is to study very high dimensional nonparametric regression, in this section we explore function spaces where such large problems are potentially tractable. In the previous two chapters we have demonstrated the value of building cascades of lower dimensional functions as approximations to high dimensional functions. In this section we extend this basic idea to more complex cascaded structures, and include space partitioning as a method to ensure that the approximation error always converges to zero. As is outlined below, an additional benefit of space partitioning is that the rate of convergence of the regression function is independent of dimension, within each resulting subdomain.

The SPORE-2 regression model consists of a partitioner and a structure. The structure consists of a cascade of 2 dimensional functions, and the partitioner is a method of subdividing the domain of the target function into regions where the cascade is an effective regression function. The algorithm for building the SPORE-2 regression function is based first on an algorithm which decomposes a continuous 3 dimensional function U(x, y, z) into a cascaded 2 dimensional function having the structural form W(V(x, y), z). Such a decomposition of a 3 dimensional function into two 2 dimensional functions is possible because we partition the domain of U(x, y, z) into a finite number of regions where it can be represented to arbitrary accuracy by W(V(x, y), z). This partitioning is the subject of the following theorem.

THEOREM I: Let U(x, y, z) be a bounded continuous 3 dimensional function defined over a finite dense domain U ⊂ ℜ^3. Then for every arbitrary small real ε > 0, there exist a finite number of dense disjoint domains U_i, where ∪_i U_i = U, and corresponding bounded continuous 2 dimensional functions W_i and V_i, such that for any (x, y, z) ∈ U_i the following is true:

|W_i(V_i(x, y), z) - U(x, y, z)| < ε   (5.2)

The proof of THEOREM I is given in Appendix A, where it is restated in a more precise, but equivalent, way. The proof is based on four algorithmic constructions which are presented in Appendix D. These constructions are titled Construction I through Construction IV, and they collectively serve to decompose the function U(x, y, z) into the functions W_i(V_i(x, y), z), as well as define and build the disjoint domains U_i.

Figure 5.2 A Tree-Like Subdivision of Input Space

Using the 3 dimensional decomposition and domain partitioning outlined in THEOREM I, we can begin to define the domain partitioning nature of the complete N dimensional SPORE-2 algorithm.
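Before moving to the N dimensional case, the following short check illustrates the cascaded form W(V(x, y), z) of THEOREM I for a particularly easy target, U(x, y, z) = (x + y)z, where a single domain suffices; the names are illustrative only, and the general case of the theorem is precisely what requires the partition into subdomains U_i.

```python
import numpy as np

U = lambda x, y, z: (x + y) * z   # a target for which the cascaded form is exact
V = lambda x, y: x + y            # inner 2 dimensional function
W = lambda v, z: v * z            # outer 2 dimensional function

rng = np.random.default_rng(1)
x, y, z = rng.uniform(-1.0, 1.0, size=(3, 1000))
# |W(V(x, y), z) - U(x, y, z)| is zero everywhere for this particular U.
assert np.allclose(W(V(x, y), z), U(x, y, z))
```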
The SPORE-2 partitioner works by subdividing the input space of the target function into a tree-like subdivision. This subdivision is represented using the tree notation ρ_1σ_1...ρ_lσ_l, where l represents the level of the tree, and the symbols ρ and σ refer to branch numbers at various levels of the tree. Thus, each branch of the tree (or equivalently, each path through the tree) defines a subdomain of the target function. An example of such a subdivision of space is given in Figure 5.2. Because each tree branch defines a unique subdomain, we speak of a point (x_0, ..., x_{N-1}) being an element of the subdomain corresponding to a tree branch with index ρ_1σ_1...ρ_lσ_l.

Given this description of domain subdivision, we can define the structure built by the SPORE-2 regression algorithm within each of these subdomains. The partitioner works by subdividing the domain of the function being estimated into a set of disjoint subdomains, such that in each subdomain, one can estimate the function using a superposition, or summation, of a cascade of 2 dimensional functions. The 2 dimensional functions are of two types, symbolized by g_{ρ_1σ_1...ρ_lσ_l} and h_{ρ_1σ_1...ρ_lσ_l}, where ρ_1σ_1...ρ_lσ_l is the tree subdivision of the input space defined above. Thus, we can view the functions g_{ρ_1σ_1...ρ_lσ_l} and h_{ρ_1σ_1...ρ_lσ_l} as acting on the subdomain of θ(x_0, ..., x_{N-1}) defined by tree branch ρ_1σ_1...ρ_lσ_l.

The SPORE-2 structure is defined as follows. Let θ_n(x_0, ..., x_{N-1}) be the SPORE-2 estimate of some unknown function θ(x_0, ..., x_{N-1}), which is based on n random samples of θ(x_0, ..., x_{N-1}). Then, if (x_0, ..., x_{N-1}) is an element of the subdomain corresponding to a tree branch with index ρ_1σ_1...ρ_{l_max}σ_{l_max}, the SPORE-2 regression function θ_n(x_0, ..., x_{N-1}) has the following expansion:

θ_n(x_0, ..., x_{N-1}) = g_{ρ_1σ_1}(h_{ρ_1σ_1}(x_0, x_1), x_1) + g_{ρ_1σ_1ρ_2σ_2}(h_{ρ_1σ_1ρ_2σ_2}(g_{ρ_1σ_1}, x_2), x_3) + Σ_{l=3}^{l_max} g_{ρ_1σ_1...ρ_lσ_l}(h_{ρ_1σ_1...ρ_lσ_l}(g_{ρ_1σ_1...ρ_{l-1}σ_{l-1}}, x_{l mod N}), x_{(l+1) mod N})   (5.3)

where,
1. A mod N is the integer remainder of A ÷ N, and is used to cycle through the indices of the independent variables (x_0, ..., x_{N-1}) (hence the order of input variables is predefined);
2. the subscript notation ρ_1σ_1...ρ_{l_max}σ_{l_max} is described above (see Appendix B for a more complete specification) and serves to identify the two dimensional functions g_{ρ_1σ_1...ρ_lσ_l} and h_{ρ_1σ_1...ρ_lσ_l} which act on the specific subdomain of θ(x_0, ..., x_{N-1}) referred to by branch ρ_1σ_1...ρ_lσ_l;
3. and l_max is an integer defining the maximum number of terms (or levels of the tree structure) required in order to achieve some approximation error bound

max_{x ∈ D} |θ_n(x) - θ(x)| < ε   (5.4)

where ε is an arbitrarily small (for sufficiently large n) positive real number, and D is the domain of the unknown function θ(x_0, ..., x_{N-1}).

Figure 5.3 A 4 Dimensional Example of SPORE-2

A 4 dimensional example of a SPORE-2 structure is shown in Figure 5.3, where the tree branch is defined by ρ_l = 1 and σ_l = 1, for all l = 1, ..., l_max. Note that this branch is identified by the topmost branch in Figure 5.2. Given the above description of SPORE-2 (a minimal evaluation sketch of the expansion (5.3) is given below), we now state the second theoretical result of this section.
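The following minimal sketch shows how the expansion (5.3) might be evaluated for a single input vector, assuming the branch containing the point has already been identified and its 2 dimensional functions have already been constructed. The list-based representation, the handling of the first level, and the toy functions at the end are illustrative assumptions only.

```python
def evaluate_branch(x, h_funcs, g_funcs):
    """Evaluate the expansion (5.3) for one input vector x = (x_0, ..., x_{N-1}).

    h_funcs[l-1] and g_funcs[l-1] stand in for the fitted 2-D functions of level l
    on the branch containing x (an assumed representation, one (h, g) pair per level).
    """
    N, L = len(x), len(g_funcs)
    # Level 1: h acts on (x_0, x_1) and g on (h, x_1), following the reading of (5.3) above.
    h_out = h_funcs[0](x[0], x[1 % N])
    g_out = g_funcs[0](h_out, x[1 % N])
    total = g_out
    # Levels l = 2, ..., l_max: h takes the previous level's g output and x_{l mod N},
    # g takes h's output and x_{(l+1) mod N}; the level outputs are summed.
    for l in range(2, L + 1):
        h_out = h_funcs[l - 1](g_out, x[l % N])
        g_out = g_funcs[l - 1](h_out, x[(l + 1) % N])
        total += g_out
    return total

# Toy 2-D functions standing in for fitted units on one branch:
add = lambda a, b: a + b
mul = lambda a, b: a * b
print(evaluate_branch([0.2, 0.5, 1.0, -0.3], [add, mul], [mul, add]))
```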
In THEOREM II, we prove that the SPORE-2 algorithm produces an approximation which is a universal approximator.

THEOREM II: Let θ(x_0, ..., x_{N-1}) be a bounded continuous N dimensional function defined over a finite dense domain D ⊂ ℜ^N. Then, for every arbitrarily small real ε > 0, there exist a finite number of dense disjoint domains S_k, where ∪_k S_k = D, each of which corresponds to some branch ρ_1σ_1...ρ_{l_max}σ_{l_max}, and corresponding bounded continuous 2 dimensional functions g_{ρ_1σ_1...ρ_lσ_l} and h_{ρ_1σ_1...ρ_lσ_l}, such that, for any (x_0, ..., x_{N-1}) ∈ S_k the following is true:

|θ_n(x) - θ(x)| < ε   (5.5)

where θ_n(x) is defined in (5.3), and we let n → ∞.

The proof of THEOREM II is given in Appendix B, where it is restated in a more precise, but equivalent, way. The proof uses the result of THEOREM I, and is based on five algorithmic constructions which are presented in Appendix D. These constructions are titled Construction I through Construction V, and they collectively serve to decompose the function θ(x) into the functions θ_n(x), as well as define and build the disjoint domains S_k. Note that Construction I through Construction IV are constructions which are also used in the proof of THEOREM I.

The proof of THEOREM II defines a sufficient algorithm for constructing the universal approximation θ_n(x), given that the number of sample points, n, is infinite. In THEOREM III we extend this result by assuming a finite learning sample size n, analyzing how the approximation error decreases as the sample size increases. We state this theorem next.

THEOREM III: Let θ_n(x), θ(x), and S_k be as defined in THEOREM II, with the further assumptions that θ(x) is p-times differentiable (p ≥ 1), and is the mean value of some probability function as follows:

θ(x) = ∫ y h(y|x) φ(dy)   (5.6)

where h(y|x) is some probability function, and φ(dy) is some measure defined on ℜ. Assume regularity conditions 1-3 (establishing optimal rates of convergence), defined by Stone [Stone, 1982], are true for the target function θ(x) within a specific subdomain S_k. Let n be the number of samples of the target function θ(x) on S_k. Then, as n increases, the rate of convergence of the regression function θ_n(x) to the target function θ(x), on the subdomain S_k, is given by:

||θ_n(x) - θ(x)||_∞ ≤ O((n^{-1} log(n))^r)   (5.7)

where r = p/(2p + 3), and ||·||_∞ is the L^∞ norm.

The proof of THEOREM III is given in Appendix C, where it is restated in a more precise, but equivalent, way. The three regularity conditions of Stone [Stone, 1982] which we use in the statement of the theorem simply impose certain restrictions on the probability distributions associated with the target function θ(x) (see [Stone, 1982] for details). The first condition imposes constraints on the density function h(y|x), and is required to verify that the sequence (n^{-1} log(n))^r is a lower convergence sequence (i.e. the best convergence sequence). The second condition imposes further constraints on h(y|x) to ensure that the rate defined by the sequence (n^{-1} log(n))^r is achievable. Examples of density functions which satisfy these two conditions are the Normal distribution and the Bernoulli distribution.
The final condition used by Stone puts constraints on the asymptotic distribution of the independent variables, x, of the training samples. This condition is required to ensure that the rate of convergence is both optimal and achievable. An example domain D ⊂ ℜ^N which satisfies this condition would be any polyhedral domain in ℜ^N where the distribution of points x is uniform, and where the learning samples are independent and identically distributed. A more general example of a sufficient domain D ⊂ ℜ^N can be found in [Stone, 1982].

THEOREM III establishes that the closed domain of any function which is bounded and at least once differentiable (and under appropriate regularity conditions of Stone [Stone, 1982]) is densely covered by a finite number of subdomains where the rate of convergence of our regression model is independent of dimension. This is an important result because it is not intuitively obvious. Previously, Stone [Stone, 1982] showed that the optimal rate of convergence, over the entire domain of such functions, does depend on dimension. Specifically, for an N dimensional function, the rate of convergence established by Stone is ||θ_n(x) - θ(x)||_∞ ≤ O((n^{-1} log(n))^v), where v = p/(2p + N). Knowing that every such function has a finite number of subdomains where there exists at least one type of regression function (for example the SPORE-2 regression function studied in this section) which has a rate of convergence independent of dimension has important practical implications if these domains can be identified. For example, suppose that we have a regression problem which has a rate of convergence that depends adversely on dimension over its whole domain. Furthermore, suppose that we can divide the domain of this regression problem into two regions, with each region having a rate of convergence independent of dimension. Then our original "hard" regression problem has been converted into two "easier" regression problems.

The constructions given in Appendix D of this thesis provide a sufficient algorithm for finding domains which have rates of convergence independent of dimension. These theoretical constructions can serve as a basis for practical algorithms for very high dimensional regression, and in fact are the motivation behind the implementation of the SPORE-1 algorithm.

A further implication of THEOREM III of Appendix C is that it demonstrates that it is possible to measure the complexity of regression in terms of the minimum number of subdomains which have a rate of convergence independent of dimension, rather than the dimension of the regression problem. The more subdomains, the more complex the regression problem. This measure of complexity may have important implications for the practical implementation of very high dimensional regression. This is especially true if we can establish the complexity of a regression problem using only the given training examples. Our hope is that the constructions contained in Appendix D of this thesis can serve as the basis for such a data based complexity measure. As described above, these constructions collectively serve to decompose the function θ(x) into the functions θ_n(x), while at the same time building the disjoint domains S_k which have the desired rate of convergence properties.
Since these constructions are based on learning examples of the target function θ(x) (see proof of THEOREM III in Appendix C), they constitute a sufficient algorithm for such a data based complexity measure. However, the practicality of such a complexity measure has yet to be demonstrated.

5.2.2 The SPORE-1 Regression Function with Domain Partitioning

The rate of convergence result for the SPORE-2 algorithm is based on two key observations. First, the SPORE-2 algorithm works by constructing the approximation one functional unit at a time, and the rate of convergence of each of these individual functional units is independent of dimension. Secondly, the domain of the function being approximated is subdivided to ensure that, as each functional unit is constructed, the approximation error over the particular subdivision acted on by the functional unit must decrease. The result is that within each subdomain, the approximation error converges to zero, and the rate at which it does so is independent of dimension.

Given the definition of the SPORE-1 algorithm in Chapter 3, it is not difficult to envision a modification to the SPORE-1 algorithm which will also give it a rate of convergence to zero error which is independent of dimension. First, in THEOREM 5 of Chapter 3, we showed that the rate of convergence, to a fixed approximation error, of SPORE-1 is independent of the dimension of the function being approximated. Therefore, if we designate each single execution of the SPORE-1 algorithm to be a single equivalent functional unit, then we can achieve a rate of convergence to zero error by subdividing the input space such that, the next time SPORE-1 is run on any subdivision, the resulting approximation error over that subdivision will decrease. One method of choosing such a subdivision is to do the following binary split of the input space. Let R_L(x) be the constructed SPORE-1 regression function which acts on the domain D. Then, create 2 subdomains, D_1 and D_2, such that D_1 ∪ D_2 = D, as follows:

x ∈ D_1 iff ((R_L(x) - f(x)) ≥ 0)
x ∈ D_2 iff ((R_L(x) - f(x)) < 0)   (5.8)

(Note that D_1 and D_2 may not be topologically connected, whereas the subdivisions of most implementations of regression trees [Breiman et al., 1984] are. This may or may not be a significant advantage depending on how hard it is to construct such domains in practice. This is a topic for future research.)

The subdivision in (5.8) ensures that the maximum expected approximation error, within each subdomain D_i (i ∈ {1, 2}), is given by:

∫_{D_i} ((R_L(x) - f(x)) - A_i)^2 dD_i ≤ ∫_{D_i} (R_L(x) - f(x))^2 dD_i   (5.9)

where A_i ∈ ℜ is a constant defined as follows:

A_i = argmin_{A_i} ∫_{D_i} ((R_L(x) - f(x)) - A_i)^2 dD_i   (5.10)

(Note that A_i is simply the average value of (R_L(x) - f(x)) over each region D_i.) Hence, such a subdivision ensures the reduction of error after each execution of the SPORE-1 algorithm. Using THEOREM 5 of Chapter 3, this gives us the following rate of convergence (assuming we subdivide after each execution of SPORE-1):

∫_{D_{η(m)}} (R_L(x) - f(x))^2 dD_{η(m)} ≤ K / M_{η(m)} + ε   (5.11)

where
1. ε is any arbitrarily small positive real number;
2. D_{η(m)} is any resulting subdomain after m domain subdivisions;
3. M_{η(m)} is the number of training sample points in domain D_{η(m)}.
Thus, the rate of convergence to any arbitrarily small error ε is independent of the dimension of the function being approximated.

5.3 Rate of Convergence and Computational Complexity

In this chapter we have given two specific examples of the SPORE structure. We have shown that both examples have a rate of convergence which is independent of the dimension of the function being approximated as a result of two important characteristics. The first is that the regression function is constructed one functional unit at a time, and the rate of convergence of each of these individual functional units is independent of dimension. The second is that the domain of the function being approximated is subdivided to ensure that as each functional unit is constructed, the overall approximation error decreases. Thus, as long as these two characteristics are maintained, any resulting SPORE implementation will have a rate of convergence independent of dimension.

In THEOREM 6 of Chapter 3, we showed that the computational complexity of the SPORE-1 algorithm is linear with respect to the dimension of the function being approximated. This is a powerful practical result because it makes regression functions which have thousands of inputs at least potentially feasible from a computational standpoint. This is experimentally supported by the high dimensional regression examples given in Chapter 4. The SPORE-1 algorithm achieves a computational complexity which is linear with dimension because the drop in approximation error, due to the addition of a new level, does not depend on the dimension of the function being approximated. Instead, it depends on the probability distribution of the function space of f(x) around the 2 dimensional space that the error function is being projected on; i.e. the 2 dimensional space defined by the independent variables of the function g_l(·) (see proof of THEOREM 6 in Chapter 3). Thus, if the variance of this probability distribution is high, this is an indication that f(x) maps poorly onto g_l(·) and many levels in the SPORE-1 structure may be required to get a good fit. Thus, the error drop achieved by cycling through all independent variables of f(x) is not dependent on the dimension of f(x), but rather on the way it projects onto the 2 dimensional input space of g_l(·). Hence the linear growth in computational complexity.

Following this argument, we can show that, within each subdomain defined in Section 5.2.1, the computational complexity of the SPORE-2 algorithm is also linear with dimension. Thus, the total computational cost of the SPORE-2 algorithm is dependent on the number of subdomains required (which may in fact depend on dimension). Similar arguments are valid for the computational cost analysis of any SPORE structure. Thus, we conclude that within each subdivision of a general SPORE-type structure, the computational cost of the algorithm is linear with respect to the number of input variables.

Finally, we ask the following question: under what conditions is the subdivision of space not necessary in order to achieve convergence to zero error?
For the SPORE-1 algorithm, this question is answered in THEOREM 4 of Chapter 3, where we showed that as long as the approximation errors due to the addition of a level to the regression function are sufficiently uncorrelated with respect to current approximation errors, approximation errors will continue to converge to zero. Another interpretation of this is simply that the new level is contributing to overall error reduction. An analysis of the SPORE-2 constructions gives an equally simple interpretation of what constitutes continued error reduction without subdivision. In order for SPORE-2 to reduce error without space subdivision, the projection of the current error function onto its two dimensional functions must always contribute to overall error reduction. The only condition under which error reduction will not occur is if the projection of the current error function onto the appropriate 2 dimensional space produces functions g_{ρ_1σ_1...ρ_lσ_l} and h_{ρ_1σ_1...ρ_lσ_l} which are identically zero (see Appendix B for details). Similarly, the SPORE-1 learning algorithm will also fail to converge if g_l(·) = 0. Therefore, for both algorithms, subdivision is required when the residual errors produce two dimensional functions which are identically zero. As described in Section 3.3.4 of Chapter 3, SPORE-1 uses bootstrap samples to make this pathological condition unlikely. Perhaps it is also possible to use bootstrap samples to achieve similar results with the SPORE-2 algorithm, and more generally with other SPORE-type algorithms. This is an open theoretical question.

5.4 Summary

In this chapter we have shown that the general SPORE structure addresses the two major difficulties associated with high dimensional nonparametric regression. Both of these can be attributed to the curse of dimensionality [Bellman, 1961]. The first difficulty is associated with the potentially intractable number of sample points required to approximate high dimensional functions. The second is associated with the computational complexity inherent in building very high dimensional regression functions. Both difficulties are addressed in the SPORE structure via the sequential addition of small functional units in a way which decreases the residual approximation error. We have shown under which regularity conditions (Stone [Stone, 1982]) this leads to a rate of convergence (as a function of the number of training sample points) independent of dimension. In addition, we have demonstrated that under appropriate conditions, the computational complexity of the general SPORE structure is linear with respect to the dimension of the function being approximated. Thus, this chapter, building upon our results in Chapter 3, constitutes our second argument for a theoretical verification of our thesis presented in Section 1.3 of Chapter 1.

The theoretical analysis presented in this chapter is an attempt to fully understand the fundamental characteristics of a learning algorithm. Many practical implementations of learning algorithms are not based on a firm understanding of their theoretical characteristics, often resulting in ad hoc heuristics which have no firm foundations.
Perhaps one of the most significant differences between the SPORE methodology and the other learning methodologies which it loosely resembles (see Section 2.4 of Chapter 2) results from the fact that we have specifically used a theoretical analysis of the characteristics of very high dimensional regression to define our methodology.

Chapter 6: Conclusion

The focus of this thesis has been the construction of models (nonparametric regression functions) which have a large number of inputs, and where little a priori knowledge is available about what constitutes a good choice for model structure. Examples of large high dimensional nonparametric regression problems can be found in such diverse fields as meteorology, economics, and robotics. However, general practical methods for constructing these high dimensional models have yet to be widely proposed in the literature.

It has been shown in this thesis that very high dimensional nonparametric regression can be effectively accomplished using a sufficient number of small, low dimensional parametric building blocks, which are fitted to the regression data one at a time. Furthermore, we have demonstrated that both variables and building blocks can be added in a random order to the approximation, thus little or no computational effort is required to determine what to fit next. Projection onto low dimensional parametric building blocks was shown by proof and example to produce stable regression functions from relatively few sample points. It was demonstrated that the random selection of building blocks and their inputs leads to algorithms that are computationally feasible in very high dimensional spaces.

This document is a theoretical and experimental evaluation of a general methodology (SPORE) which exhibits the above characteristics. The most significant result of this thesis is that this simple algorithmic structure leads to regression functions which are effective approximators in very high dimensional spaces. The successful application of nonparametric regression to high dimensional problems, such as the 1,600 dimensional problem defined in Section 4.3 of Chapter 4, has not been demonstrated in the past without using some form of dimensionality reduction requiring a priori knowledge. The main reason for this is that previously published techniques have been designed with lower dimensional problems in mind, resulting in algorithms which utilize search strategies in input space, thus making them computationally infeasible in very high dimensional spaces.

The main contributions of this thesis are the following:

I. A new methodology is proposed (SPORE) for constructing very high dimensional regression functions. The SPORE approach to nonparametric regression differs from conventional approaches. Emphasis is shifted from searching through a large parameter space for an optimal structure, to an algorithmic approach which projects learning data onto low dimensional structures, without resorting to computationally expensive searches. This makes SPORE particularly suited to the large regression problems considered in this thesis.

II. A theoretical analysis of SPORE demonstrated that it is a universal approximator.
Furthermore, we showed that the domain of the target function can be subdivided into a finite number of regions where the rate of convergence (as a function of the number of learning examples) to any bounded continuous target function (at least once differentiable) is independent of the number of input variables (i.e. dimension). This theoretical result is significant because it shows a rate of convergence which has not previously been demonstrated for high dimensional nonparametric regression problems. In addition, we showed that the SPORE structure has a computational complexity which is linear with dimension in each of these subdomains. This is also a significant theoretical result because previously proposed multidimensional nonparametric regression algorithms are not computationally practical in higher dimensional spaces.

III. The simplest example of SPORE, called SPORE-1, has been implemented and evaluated. The SPORE-1 learning algorithm is completely automated, requiring no user intervention. Theoretical analysis of the SPORE-1 learning algorithm gives conditions under which the approximation error converges to zero (without domain partitioning), with a rate of convergence which is independent of the number of input variables (i.e. dimension). Experimental evaluation of SPORE-1 on regression problems which have been previously studied in the literature shows that SPORE-1 compares favorably with, and often outperforms, other published regression algorithms. Out of ten regression examples, SPORE-1 did as well or better than published results on all but one of these examples. Evaluation of SPORE-1 on the 10-bit parity problem showed that it can be applied to problems which have flat low dimensional projections, and therefore is not subject to the same limitations as other methods which construct approximations using low dimensional structural units added sequentially. Evaluation of SPORE-1 on very high dimensional data (100 to 1600 inputs) showed that SPORE-1 performs well under noise conditions in both input variables and output values, and that the dimension of the data had no direct effect on learning times or the size of the approximation. Application of SPORE-1 to real-world human-to-robot skill transfer problems, using raw image pixels, showed that SPORE-1 is effective on very high dimensional (1024 inputs), noisy, real world problems. These human-to-robot skill transfer experiments demonstrated a promising (and theoretically understood) solution for a wide variety of complicated learning/programming tasks, which could not previously be addressed in this manner.

6.1 Future Work

SPORE defines a general methodology for constructing regression functions. The SPORE-1 algorithm, having only a simple 2 dimensional cascaded structure with only one type of functional unit, is the most basic example of this general methodology. Our intention is to implement and empirically evaluate other examples of SPORE. In particular, we intend to implement a SPORE algorithm which has many types of functional units, where each functional unit can accept inputs from more than one existing unit. Such a structure would exploit more aspects of the general SPORE methodology, while still maintaining the computational capabilities of the SPORE-1 algorithm.

In addition, the SPORE structures presented in this thesis have been designed as long, cascaded chains which can only be evaluated in a serial manner.
This may not be a suitable computational structure for applications which require fast calculations (e.g. real-time control systems). For such applications, computationally parallel SPORE structures may be more appropriate. The theoretical and empirical evaluation of a parallel SPORE algorithm is a good topic for further investigation.

The main focus of this thesis has been building nonparametric regression functions for the purpose of predicting future outputs given new inputs. However, traditionally many nonparametric regression techniques have been used as analytical tools to determine which inputs are important and how they interact. It would be interesting to investigate the possibility of using SPORE as such an analytical tool. If this were possible, it would allow analysis of input spaces for problems which have many input variables.

An additional topic for further study is a more complete analysis of the theoretical properties of the SPORE structure. This thesis gives rate of convergence results as a function of the number of learning examples, and gives conditions under which these rates are valid. Although empirical and theoretical evidence suggests that these conditions are not overly restrictive, there are as yet no theoretical results which analyze how useful or interesting this class of functions is. In addition, no rate of convergence results, as a function of the number of parameters in the SPORE regression function, are given in this thesis. Both of these theoretical questions are topics for future research.

A specific theoretical investigation which this thesis has not addressed is the full computational cost of a SPORE learning algorithm which uses domain subdivision. We know that the computational cost of a SPORE construction is linear with dimension in each of the resulting subdivisions. However, we don't know how many subdivisions are required for a specific class of target functions. Empirical evidence presented in this thesis suggests that subdivision of the input space is not necessary (at least for the regression examples studied here); however, this may not be true in the general case. In fact, regression examples may exist where an exponential growth in the number of subdomains may be needed as the dimension of the target function increases. This is clearly a topic for future investigation.

Finally, in this thesis, we demonstrated that it is possible to do very high dimensional human-to-robot skill transfer if the operator demonstrates the task using only the robot's sensors and actuators. This is a promising result which we have only started to explore. Many more experiments need to be done in order to fully evaluate the efficacy of this approach to robot programming.

Bibliography

Alfeld, P. (1989). Scattered data interpolation in three or more variables. In Lyche, T. and Schumaker, L. L., editors, Mathematical Methods in Computer Aided Geometric Design, pages 1-33. Academic Press, Boston, MA.

Asada, H. (1990). Teaching and learning of compliance using neural nets: Representation and generation of nonlinear compliance. In Proceedings of the 1990 IEEE International Conference on Robotics and Automation, pages 1237-1244. IEEE.

Barron, A. R. (1993). Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans. Inform. Theory, 39(3):930-945.

Barron, A. R. (1994). Approximation and estimation bounds for artificial neural networks. Machine Learning, 14:115-133.

Bellman, R. E. (1961).
Adaptive Control Processes. Princeton University Press.

Breiman, L. (1993). Hinging hyperplanes for regression, classification, and function approximation. IEEE Trans. Inform. Theory, 39(3):999-1013.

Breiman, L. (1996a). Bagging predictors. Machine Learning, 24:123-140.

Breiman, L. (1996b). Heuristics of instability and stabilization in model selection. Ann. Statist., 24(6):2350-2383.

Breiman, L. (1996c). Stacked regressions. Machine Learning, 24:49-64.

Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. (1984). Classification and Regression Trees. Wadsworth International Group, Belmont, California.

Broomhead, D. S. and Lowe, D. (1988). Multivariable functional interpolation and adaptive networks. Complex Systems, 2:321-355.

Casdagli, M. (1989). Nonlinear prediction of chaotic time-series. Physica D, 35:335-356.

Chen, H. (1991a). Estimation of a projection-pursuit type regression model. Ann. Statist., 19(1):142-157.

Chen, Z. (1991b). Interactive spline models and their convergence rates. Ann. Statist., 19(4):1855-1868.

Cybenko, G. (1989). Approximations by superpositions of a sigmoidal function. Math. Contr. Signals Syst., 2:303-314.

Edelman, S. and Poggio, T. (1990). Bringing the grandmother back into the picture: A memory-based view of object recognition. MIT A.I. Memo No. 1181.

Efron, B. (1983). Estimating the error rate of a prediction rule: Some improvements on cross-validation. Journal of the American Statistical Association, 78:316-331.

Elder IV, J. F. and Brown, D. E. (1995). Induction and polynomial networks. In IEEE Int. Conf. on Sys. Man and Cyb., pages 874-879.

Fahlman, S. E. and Lebiere, C. (1990). The cascade-correlation learning architecture. In Advances in Neural Information Processing Systems 2, pages 524-532.

Farlow, S. J. (1984). The gmdh algorithm. In Farlow, S. J., editor, Self-Organizing Methods in Modeling: GMDH Type Algorithms, pages 1-24. Marcel Dekker, Inc., New York, NY.

Friedman, J. H. (1991). Multivariate adaptive regression splines. Ann. Statist., 19:1-141.

Friedman, J. H. (1994a). Flexible metric nearest neighbor classification. Technical Report, Stanford University, Department of Statistics.

Friedman, J. H. (1994b). An overview of predictive learning and function approximation. In Cherkassky, V., Friedman, J. H., and Wechsler, H., editors, From Statistics to Neural Networks, pages 1-61. Springer-Verlag.

Friedman, J. H. and Stuetzle, W. (1981). Projection pursuit regression. J. Amer. Statist. Assoc., 76(376):817-823.

Girosi, F. (1994). Regularization theory, radial basis functions and networks. In Cherkassky, V., Friedman, J. H., and Wechsler, H., editors, From Statistics to Neural Networks, pages 167-187. Springer-Verlag.

Golub, G. H. and Van Loan, C. F. (1993). Matrix Computations. The Johns Hopkins University Press, Baltimore and London.

Grudic, G. Z. and Lawrence, P. D. (1995). Human-to-robot skill transfer via teleoperation. In IEEE Int. Conf. on Sys. Man and Cyb.

Grudic, G. Z. and Lawrence, P. D. (1996). Human-to-robot skill transfer using the spore approximation. In IEEE Int. Conf. on Rob. and Aut.

Grudic, G. Z. and Lawrence, P. D. (1997). Is nonparametric learning practical in very high dimensional spaces? In To Appear: Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97), Nagoya, Aichi, Japan.

Hardy, R. L. (1971). Multiquadric equations of topology and other irregular surfaces. J. Geophys.
Res., 76:1905-1915.

Hastie, T. J. and Tibshirani, R. J. (1986). Generalized additive models. Statistical Science, 1(3):297-318.

Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models. Chapman and Hall.

Hastie, T. J. and Tibshirani, R. J. (1994). Nonparametric regression and classification. In Cherkassky, V., Friedman, J. H., and Wechsler, H., editors, From Statistics to Neural Networks, pages 1-61. Springer-Verlag.

Hastie, T. J. and Tibshirani, R. J. (1996). Discriminant adaptive nearest neighbor classification and regression. In Advances in Neural Information Processing Systems 8, pages 409-415. MIT Press, Cambridge MA.

Haykin, S. S. (1994). Neural networks: a comprehensive foundation. Macmillan, New York.

Hornik, K., Stinchcombe, M., and White, H. (1989). Multi-layer feedforward networks are universal approximators. Neural Networks, 2:359-366.

Huber, P. J. (1985). Projection pursuit. Ann. Statist., 13:435-475.

Igelnik, B. and Pao, Y.-H. (1995). Stochastic choice of basis functions in adaptive function approximation and the functional-link net. IEEE Transactions on Neural Networks, 6(6):1320-1329.

Ivankhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Trans. Systems, Man, and Cybernetics, SMC-12:364-378.

Jackson, I. R. H. (1988). Convergence properties of radial basis functions. Constructive Approximation, 4:243-264.

Jones, L. K. (1992). A simple lemma on greedy approximation in hilbert space and convergence rates for projection pursuit regression and neural network training. Ann. Statist., 20(1):608-613.

Jordan, M. I. and Jacobs, R. A. (1994). Hierarchical mixtures of experts and the em algorithm. Neural Computation, 6:181-214.

Juditsky, A., Hjalmarsson, H., Benveniste, A., Delyon, B., Ljung, L., Sjoberg, J., and Zhang, Q. (1995). Nonlinear black-box modeling in system identification: Mathematical foundations. Automatica: Special Issue on Trends in System Identification, 31(12):1725-1750.

Kolmogorov, A. N. (1957). On the representation of continuous functions of many variables by superposition of functions of one variable and addition. Dokl. Akad. Nauk SSSR, 114:953-956.

Kosuge, K., Fukuda, T., and Asada, H. (1991). Acquisition of human skills for robotic systems. In Proceedings of the 1991 IEEE International Symposium on Intelligent Control, pages 469-474. IEEE.

Kwok, T.-Y. and Yeung, D.-Y. (1997). Constructive algorithms for structural learning in feedforward neural networks for regression problems. IEEE Transactions on Neural Networks, 8(3):630-645.

Lorentz, G. G. (1986). Approximation of Functions. Chelsea Publishing Co., New York.

Lowe, D. (1995). Similarity metric learning for a variable-kernel classifier. Neural Computation, 7(1):72-85.

Lui, S. and Asada, H. (1992). Transferring manipulation skills to robots: Representation and acquisition of manipulation skills using process dynamics. J. Dynamic Syst. Measur. Contr., 114(2):220-228.

Malitz, J. (1987). Introduction to Mathematical Logic. Springer-Verlag, New York, NY.

Mel, B. W. and Koch, C. (1990). Sigma-pi learning: On radial basis functions and cortical associative learning. In Advances in Neural Information Processing Systems 2, pages 474-481.

Michie, D., Spiegelhalter, D. J., and Taylor, C. C. (1994). Machine Learning, Neural and Statistical Classification. Ellis Horwood, New York, NY.

Neter, J., Wasserman, W., and Kutner, M. H. (1990).
Applied Regression,  Neural  Analysis  of Variance,  and Experimental  Design.  Linear  Statistical  Models:  IRWIN.  Parmanto, B . , Munro, P. W., and Doyle, H . R. (1996). Improving committee diagnosis with resampling techniques. In Touretzky, D. S., Mozer, M . C , and Hasselmo, M . E., editors, Advances  in Neural  Information  Perrone, M . P. (1993). Improving Reduction  with  Extensions  Processing  Regression  Systems 8. Estimation:  to General  Convex  MIT Press, Cambridge M A . Averaging  Measure  Methods  for  Optimization.  Variance  PhD thesis,  Department of Physics, Brown University. Poggio, T. and Girosi, F. (1989). A theory of networks for approximation and learning. MIT A.I. Memo No. 1140.  Poggio, T. and Girosi, F. (1990). Extensions of a theory of networks for approximation and learning: Dimensionality reduction and clustering. MIT A.I. Memo No. 1167. Pomerleau, D. A . (1993a). Input reconstruction reliability estimation. In NIPS5, pages 279-286. Morgan Kaufmann Pub. Pomerleau, D. A. (1993b). Neural  Network  Perception  for  Mobile  Robot  Kluwer  Guidance.  Academic Publishers, Boston/Dordrecht/London. Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. (1988). Numerical  Recipes  in  C. Cambridge University Press, New York NY. Quinlan, J. R. (1993). C4.5: Programs  for Machine  Learning.  Morgan Kaufmann, San Mateo,  CA. Rasmussen, C. A. (1996). A practical monte carlo implementation of bayesian learning. In Touretzky, D. S., Mozer, M . C , and Hasselmo, M . E., editors, Advances Information  Processing  Systems 8. MIT Press, Cambridge M A .  in  Neural  Bibliography  104  Reed, R. (1993). Paining algorithms - a survey. IEEE  Transactions  on Neural  Networks,  4(5):740-747. Renals, S. and Rohwer, R. (1989). Phoneme classification experiments using radial basis functions. In Proceedings  of the International  Joint  Conference  on Neural  Networks,  pages 1-461-1-467. Rouse, W. B., Hammer, J. M . , and Lewis, C. M . (1989). On capturing human skills and knowledge: Algorithmic approaches to model identification. IEEE Cybernet.,  Trans.  Syst. Man  19(3): 558-573.  Saha, A . and Keeler, J . D . (1990). Algorithms for better representation and faster learning in radial basis function networks. In Advances  in Neural  Information  Processing  Systems 2,  pages 482-489. Sanger, T. D. (1991). A tree-structured adaptive network for function approximation in highdimensional spaces. IEEE Transactions  on Neural  Networks,  2(2).  Shimokura, K . and Liu, S. (1991). Programming deburring robots based on human demonstration with direct burr size measurements. In Proceedings Conference  on Robotics  and Automation,  of the 1994 IEEE  International  pages 469^474. IEEE.  Sjoberg, J., Zhang, Q., Ljung, L., Benveniste, A., Delyon, B., Glorennec, P. Y., Hjalmarsson, H , and Juditsky, A . (1995). Nonlinear black-box modeling in system identification: A unified overview. Automatica:  Special  Issue on Trends in System Identification,  31(12) :1691-  1724. Stone, C . J . (1982). Optimal global rates of convergence for nonparametric regression. Ann. Statist.,  10:1040-1053.  Tenorio, M . F. and Lee, W.-T. (1990). Self-organizing network for optimum supervised learning. IEEE  Trans, on Neur. Net.,  1 (1): 100-110.  Vitushkin, A . G . (1954). On hilbert's thirteenth problem. Dokl. Akad. Nauk. SSSR, 95:701-704. Vitushkin, A . G. and Henkin, G. M . (1967). Linear superposition of functions. Russian Surveys,  22:77-125.  Math.  Bibliography  105  Widrow, B., McCool, J. M . 
Widrow, B., McCool, J. M., Larimore, M. G., and Johnson, C. R. (1976). Stationary and nonstationary learning characteristics of the LMS adaptive filter. Proc. IEEE, 64(8):1151-1162.
Wolpert, D. (1992). Stacked generalization. Neural Networks, 5.
Yang, J., Xu, Y., and Chen, C. S. (1994). Hidden Markov model approach to skill learning and its application to telerobotics. IEEE Trans. Robotics and Automat., 10(5):421-631.
Yao, X. and Lui, Y. (1997). A new evolutionary system for evolving artificial neural networks. IEEE Transactions on Neural Networks, 8(3):694-713.
Zhang, Q. (1997). Using wavelet network in nonparametric estimation. IEEE Transactions on Neural Networks, 8(2):227-236.

Appendix A. Approximating 3 Dimensional Functions as a Cascade

THEOREM I: Let $U(x,y,z)$ be a bounded continuous 3 dimensional function defined on the dense domain $(x,y,z) \in U$, where $U \subset \Re^3$ and $U$ has finite non-zero volume. Then there exists, for all $b \in \{1,2,\ldots,b_{max}\}$ where $b_{max} > 0$ ($b_{max} \in \mathbf{I}$),

a. a set of bounded continuous 2 dimensional functions $V_b(x,y)$ defined on $(x,y) \in S_b$, where $S_b \subset \Re^2$ and $\forall i,j \in \{1,2,\ldots,b_{max}\}\ [(i \neq j) \Rightarrow S_i \cap S_j = \emptyset]$; and,

b. a second set of bounded continuous 2 dimensional functions $W_b(V_b,z)$ defined on $(V_b,z) \in Q_b$, where $Q_b \subset \Re^2$,

such that, for all $\epsilon > 0$ ($\epsilon \in \Re$), the following is true:

$$\forall (x,y,z) \in U\ \big[\exists b \in \{1,2,\ldots,b_{max}\}\ \big[((x,y) \in S_b) \wedge (|W_b(V_b(x,y),z) - U(x,y,z)| < \epsilon)\big]\big].$$

The following are some basic definitions which are used throughout the proof of THEOREM I.

Let $\delta > 0$ be an arbitrarily small real number. Create a uniformly spaced grid of points in $\Re^3$ at all points $(l\cdot\delta,\, m\cdot\delta,\, n\cdot\delta)$, such that $l,m,n \in \mathbf{I}$. Define the set $\mathbf{U}$ to be:

$$\mathbf{U} = \{(l,m,n) \mid (l,m,n \in \mathbf{I}) \wedge ((l\cdot\delta,\, m\cdot\delta,\, n\cdot\delta) \in U)\}, \qquad (2A)$$

where the set $U$ is defined in THEOREM I. Let $l_1, m_1 \in \mathbf{I}$ be fixed such that $\exists n[(l_1,m_1,n) \in \mathbf{U}]$. Then $U(l_1\cdot\delta,\, m_1\cdot\delta,\, n\cdot\delta)$ (the function $U(x,y,z)$ is defined in THEOREM I) is a 1 dimensional function defined for all $n$ such that $(l_1,m_1,n) \in \mathbf{U}$. Let $l_1, m_1, l_2, m_2 \in \mathbf{I}$ be fixed such that $\exists n[((l_1,m_1,n) \in \mathbf{U}) \wedge ((l_2,m_2,n) \in \mathbf{U})]$. Now one can define a norm, symbolized by $\|\cdot\|_{max}$, between the two 1 dimensional functions $U(l_1\cdot\delta,\, m_1\cdot\delta,\, n\cdot\delta)$ and $U(l_2\cdot\delta,\, m_2\cdot\delta,\, n\cdot\delta)$ as:

$$\|U(l_1\cdot\delta, m_1\cdot\delta, n\cdot\delta) - U(l_2\cdot\delta, m_2\cdot\delta, n\cdot\delta)\|_{max} = \max_{n \in \mathbf{N}} |U(l_1\cdot\delta, m_1\cdot\delta, n\cdot\delta) - U(l_2\cdot\delta, m_2\cdot\delta, n\cdot\delta)|, \qquad (3A)$$

where the set $\mathbf{N}$ is defined as:

$$\mathbf{N} = \{n \mid (n \in \mathbf{I}) \wedge ((l_1,m_1,n) \in \mathbf{U}) \wedge ((l_2,m_2,n) \in \mathbf{U})\}. \qquad (4A)$$

Note that the norm

$$\|U(l_1\cdot\delta, m_1\cdot\delta, n\cdot\delta) - U(l_2\cdot\delta, m_2\cdot\delta, n\cdot\delta)\|_{max} \qquad (5A)$$

is defined if and only if $\exists n[((l_1,m_1,n) \in \mathbf{U}) \wedge ((l_2,m_2,n) \in \mathbf{U})]$.

The concept of an S-ordering of the 1 dimensional functions $U(l\cdot\delta,\, m\cdot\delta,\, n\cdot\delta)$ (defined for all $n$ such that $(l,m,n) \in \mathbf{U}$) with respect to the norm $\|\cdot\|_{max}$ is defined next (note that an S-ordering is similar to a well ordering defined in [Malitz, 1987, pp. 21-23], with the addition of some other conditions given below). Define the set $\mathbf{T}$ to be:

$$\mathbf{T} = \{(l,m) \mid (l,m \in \mathbf{I}) \wedge (\exists n[(l,m,n) \in \mathbf{U}])\}. \qquad (6A)$$

Let $\mathbf{S} \subseteq \mathbf{T}$ represent a set of 1 dimensional functions $U(l\cdot\delta,\, m\cdot\delta,\, n\cdot\delta)$, where $(l,m) \in \mathbf{S}$, which are S-ordered. Let the set $\mathbf{V}^1 \subseteq \mathbf{S}$ be defined such that $\forall (l,m) \in \mathbf{V}^1$ the 1 dimensional functions $U(l\cdot\delta,\, m\cdot\delta,\, n\cdot\delta)$ are first in the S-ordering.
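As a concrete aside before the remaining sets $\mathbf{V}^p$ of the ordering are defined, the following sketch shows how the norm of (3A) can be evaluated on sampled data. It is an illustration only, not part of the thesis; the dictionary U_grid, keyed by integer grid indices (l, m, n), and the function name are assumptions made here for readability.

    # Illustration only (assumed data layout): the grid norm of equation (3A).
    # U_grid maps integer grid indices (l, m, n) to samples of U(l*delta, m*delta, n*delta).
    # The norm between the 1-D functions at grid columns (l1, m1) and (l2, m2) is taken
    # over the shared n indices, i.e. the set N of equation (4A).

    def max_norm(U_grid, col1, col2):
        """Return the (3A) distance between two grid columns, or None if undefined (5A)."""
        shared_n = [n for (l, m, n) in U_grid if (l, m) == col1 and col2 + (n,) in U_grid]
        if not shared_n:
            return None  # no common n index: the norm does not exist, see (5A)
        return max(abs(U_grid[col1 + (n,)] - U_grid[col2 + (n,)]) for n in shared_n)

Grid columns whose pairwise distance under this norm is zero are exactly the ones that the S-ordering places in the same set $\mathbf{V}^p$ (condition b below).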
Let the set $\mathbf{V}^p \subseteq \mathbf{S}$ be defined such that $\forall (l^p,m^p) \in \mathbf{V}^p$ the 1 dimensional functions $U(l^p\cdot\delta,\, m^p\cdot\delta,\, n\cdot\delta)$ are in order $p$ (where $p \in \mathbf{I}$, $p > 1$) of the S-ordering. Let the set $\mathbf{V}^{p_m} \subseteq \mathbf{S}$ (where $p_m \in \mathbf{I}$, $p_m > 1$) be defined such that $\forall (l^{p_m},m^{p_m}) \in \mathbf{V}^{p_m}$ the 1 dimensional functions $U(l^{p_m}\cdot\delta,\, m^{p_m}\cdot\delta,\, n\cdot\delta)$ are last in the S-ordering (because $\mathbf{T}$ is a set which has a finite number of elements, $\mathbf{S} \subseteq \mathbf{T}$ also has a finite number of elements and therefore there must exist a set $\mathbf{V}^{p_m} \subseteq \mathbf{S}$ which is last in the S-ordering). Then the S-ordering of the functions $U(l\cdot\delta,\, m\cdot\delta,\, n\cdot\delta)$, where $(l,m) \in \mathbf{S}$, is defined such that the set $\mathbf{S}$, and the sets $\mathbf{V}^p$ ($\forall p \in \{1,\ldots,p_m\}$), must satisfy the following conditions:

Cond a. $\forall (l_1,m_1),(l_2,m_2) \in \mathbf{S}\ [\exists n \in \mathbf{I}\ [((l_1,m_1,n) \in \mathbf{U}) \wedge ((l_2,m_2,n) \in \mathbf{U})]]$.

Cond b. $\forall p \in \{1,2,\ldots,p_m\}$, if $(l_1,m_1) \in \mathbf{V}^p$ and $(l_2,m_2) \in \mathbf{V}^p$ then

$$\|U(l_1\cdot\delta, m_1\cdot\delta, n\cdot\delta) - U(l_2\cdot\delta, m_2\cdot\delta, n\cdot\delta)\|_{max} = 0. \qquad (7A)$$

Also, $\forall p \in \{1,2,\ldots,p_m\}$, if $(l_1,m_1) \in \mathbf{V}^p$ and $(l_3,m_3) \in \mathbf{S}$ with $(l_3,m_3) \notin \mathbf{V}^p$, then

$$\|U(l_1\cdot\delta, m_1\cdot\delta, n\cdot\delta) - U(l_3\cdot\delta, m_3\cdot\delta, n\cdot\delta)\|_{max} \neq 0. \qquad (8A)$$

Cond c. $\forall p_1 \in \{1,2,\ldots,p_m-2\}$, $\forall p_2 \in \{p_1+1,\ldots,p_m-1\}$, $\forall p_3 \in \{p_2+1,\ldots,p_m\}$, if $(l_1,m_1) \in \mathbf{V}^{p_1}$, $(l_2,m_2) \in \mathbf{V}^{p_2}$, and $(l_3,m_3) \in \mathbf{V}^{p_3}$ then

$$\|U(l_3\cdot\delta, m_3\cdot\delta, n\cdot\delta) - U(l_1\cdot\delta, m_1\cdot\delta, n\cdot\delta)\|_{max} \geq \|U(l_2\cdot\delta, m_2\cdot\delta, n\cdot\delta) - U(l_1\cdot\delta, m_1\cdot\delta, n\cdot\delta)\|_{max}. \qquad (9A)$$

Cond d. Choose $\delta > 0$ ($\delta \in \Re$) such that $\forall (x_1,y_1,z_1),(x_2,y_2,z_2) \in U$, if $\|(x_1,y_1,z_1) - (x_2,y_2,z_2)\| < \delta\cdot\sqrt{2}$, then (note that $\epsilon$ is defined in THEOREM I)

$$|U(x_1,y_1,z_1) - U(x_2,y_2,z_2)| < \frac{\epsilon}{5}. \qquad (10A)$$

It is always possible to choose such a $\delta$ because $U(x,y,z)$ is bounded and uniformly continuous on $U$. Given the above $\delta$, $\forall p \in \{1,2,\ldots,p_m-1\}$, if $(l_1,m_1) \in \mathbf{V}^p$ and $(l_2,m_2) \in \mathbf{V}^{p+1}$ then the following condition must hold:

$$\|U(l_1\cdot\delta, m_1\cdot\delta, n\cdot\delta) - U(l_2\cdot\delta, m_2\cdot\delta, n\cdot\delta)\|_{max} < \frac{\epsilon}{5}. \qquad (11A)$$

Proof of THEOREM I: The proof of THEOREM I is based on constructions of the functions $V_b(x,y)$ and $W_b(V_b,z)$, and the domains $S_b$ (for all $b \in \{1,2,\ldots,b_{max}\}$). These constructions are given in CONSTRUCTION I, CONSTRUCTION II, CONSTRUCTION III, and CONSTRUCTION IV of Appendix D.

In CONSTRUCTION I the discrete functions $V_b(x,y)$ and $W_b(V_b,z)$, along with the discrete sets $\mathbf{S}_b$ (for all $b \in \{1,2,\ldots,b_{max}\}$), are constructed. In CONSTRUCTION II the discrete sets $\mathbf{S}_b$ are used to construct the domains $S_b$. In CONSTRUCTION III the discrete functions $V_b(x,y)$ (for all $b \in \{1,2,\ldots,b_{max}\}$) are used to construct the continuous functions $V_b(x,y)$, and finally, in CONSTRUCTION IV the discrete functions $W_b(V_b,z)$ are used to construct the continuous functions $W_b(V_b(x,y),z)$.

One should first note that a maximum value for $b \in \{1,2,\ldots,b_{max}\}$ must exist because, in STEP 3 of CONSTRUCTION I, the number of elements in the set $\mathbf{T}_1$ is finite, and because in STEP 9 of CONSTRUCTION I, $\mathbf{T}_b$ is updated using the equation $\mathbf{T}_b = \mathbf{T}_{b-1} - \mathbf{S}_{b-1}$, where
6  6  ft  is always a non-empty S-ordering set (non-empty because, by the definition of an S-ordering given on page 108, if T j _ i is non-empty, then there is always at least one element in T&_i which satisfies the conditions of an S-ordering and is therefore a member of the set Sj_i). 109  Because CONSTRUCTION II chooses a value for b £ {1, 2 , . . . , 6  } which minimizes  m a x  the function  P = ,,min  \\(x,y)  - (/ • S,m • 8)\\,  it must be the case that V(x,y,z) £ U, 3b £ { 1 , 2 , . . . , 6  max  (12A)  } such that (x,y) £ 5 j . Thus the  constructed approximation spans the domain of definition of the function being approximated. Also, because the value of b chosen is one which is minimum in magnitude, it must also be true that Vi, j £ { 1 , 2 , . . . , b }[(i  ^ j)  max  S  t  D Sj = 0] (i.e. V(x, y, z) £ U, b is unique and  therefore S is also unique). b  Given that (x,y,z) W (V (x, b  y), z)  b  U  and (x,y)  £  S, b  in CONSTRUCTION IV the function  is calculated using the equation: =(W (V m • <5) - W ( V , n • 6)) • v(Vb(xy)-V )  W (V (x,y),z) b  £  s  6  b  6l>  6  6o  0  bo  +  s ( z - n 6 )  where the function V (x,y)  (  1  3  A  )  is calculated, from CONSTRUCTION III, using the equation:  b  V (x, y) =(V (/i •tf,m i • *) - V ( / • <5, m • *)) • * ( ~ * ' ) + a  b  6  6  0  x  l  8  0  (14A) (V (/ • 5, m • 5) - V ( / • .5, m • 6)) • 6  2  2  6  0  S y { y  0  ~ ™° '  6 )  + V (/ • 6  0  m • 5). 0  One should note that the function Vt(.x,?/) is continuous and bounded on (x,y) £ S , and the b  function W (V (x,y), b  1. U[x,y,z) 2.  b  z)  is continuous and bounded on (Vj,z) £ $6, because:  is continuous and bounded on (z,?/,z) £  Condition d of an S-ordering given on page 108; 110  (7;  3.  STEP 6 of CONSTRUCTION I;  4. - The tessellations grids used in CONSTRUCTION III and CONSTRUCTION IV; and, 5.  STEP 2.a and STEP 2.b of both CONSTRUCTION III and CONSTRUCTION IV. Inserting equation (14A) into equation (13A), one gets: W (V (x,y),z) b  b  •6)-U (V ,n -6))-sv  =(W (V ,ni 6  6 l  b  bo  0  (Vfc(/i • 8, mi • 8) - V ( / • 8, m 6  0  0  • 8))  s (x x  - l • 6) 0  6  8  V  (V (/ 6  • 8, m  2  2  • 8) - V ( / • 8, m f c  0  0  • 8))  s (y - m y  0  8  8  V  ( V i ( / • 8,m 0  0  • 8)  +  (15A)  • 8) - V ] bo  8v (U {V ,n -8)-W (V ,n -8)) b  b2  2  b  bo  •^  0  ' ^ + W (V 6  f e o  ,n  • 8).  0  Therefore,  \m(V (x,y),z)-U(x,y,z)  ( W ( V , ? i i • 8) - W j ( V , n o • 6)) • sy 6  6 l  6o  b  (Vb(h • 8, mi • 8) - V ( / • 8,m 6  0  • 8))  0  s (x x  8 fe  2  2  • 8) - V ( / f e  • 8, mp • 8))  0  s (y - ?np • 8) y  8y (V (/ 6  8  -6,7710 -8) - V  0  6 o  )  + s {z z  (U (V ,n -6)-U {V ,n -8)) b  b2  2  b  W/j(V ,?2o • 8) feo  bo  0  8  V  ( V ( / • 8, m  - l • 8)  — no • 8)  0  -U(x,y,z)  (16A) Using the Schwartz inequality and taking out the sy, s , s , and s terms (this can be done x  y  z  because, from CONSTRUCTION III and CONSTRUCTION IV, \s \ = \s \ = Is, v  s  1),  one gets: \W (V {x,y),z)-U(x,y,z)\ b  b  <  ( W ( V , n i -8) - W ( V 6  6 l  6  6 o  ,n  0  •<$))•  ( V ( / i • 6, m i • 8) - V ( / • 8, mp • 6)) fe  f e  8  '  V  (W (V ,ni-<S)-W (V 6  6 l  6  (V (/ • S, m ft  2  (x - lp • 8)  0  f e o  ,n -6))0  • 8) - V (lp • 8, mp • 8))  2  +  8  (y - nip- 8)  b  8y  '  8  +  (U (V ,n -8)-U (Vb ,no-8))b  bl  1  b  0  (V (lp-8,m -8)-V ) 8v b  0  +  bo  ( W ( V , n - -6) - W ( V 6  62  2  6  ,n  0  -6))  (z - np • 8)  +  -U(x,y,z)  W & ( V , n • 8) feo  6 o  0  (17A) Because of: 1. Condition d of an S-ordering given on page 108; 2.  S T E P 6 of CONSTRUCTION I; and,  3.  
S T E P 2.a of both CONSTRUCTION III and CONSTRUCTION IV,  the following inequalities are true: | W & ( V , r i i • 8) 6 l  W (V ,72 6  6O  0  •<$)!<{,  5  Because of: 1. S T E P 2 of CONSTRUCTION I; and, 112  (18A)  2.  S T E P 1 of both CONSTRUCTION III and CONSTRUCTION IV,  the following inequalities are true: (x -lo-  6)  8 (V -  m •8) 0  8 -  n  0  8  Because of: 1. S T E P 1 of CONSTRUCTION I; and, 2.  S T E P 2 of CONSTRUCTION I; and,  3.  S T E P 1 of both CONSTRUCTION III and CONSTRUCTION IV,  the following inequality is true:  \V {V ,n -6)-U{x,y,z)\ b  bo  0  <  ^.  And finally, because of: 1.  Condition d of an S-ordering given on page 108;  2.  S T E P 6.iii of C O N S T R U C T I O N I;  3.  S T E P 2.a of both CONSTRUCTION III and CONSTRUCTION IV; and,  4.  S T E P 2 of CONSTRUCTION IV,  113  the following inequalities are true: (V (/i  8, mi  6)-Mk  (Mh  8, 7712  8)-V ( o  6  • 8,  m  0  • 8, mo  l  b  S)) S))  Sv  (Vj(/  0  8, mo  *)-v ) i o  Sv  < 1.  Inserting the above inequalities into equation (17A), one gets: \W (V (x,y),z)-U(x,y,z)\ b  <^ + ^+ J+ ^ +  b  This completes the proof of T H E O R E M I • .  114  Appendix B. Constructing the SPORE-2 Structure In the proposed structure the approximation of f(x ,..., ZJV-I) is given by the following 0  expansion (see Figure 1 for a 4 dimensional example of this): If  is an element  (xrj, • • •, N-i) x  P\o~i • • • Pi f(x ,.. 0  .,XN-I)  of the subdomain  corresponding  to tree branch  cri , then =g  P l  a (h 1  9pia p (T2 1  ^  2  xi), x ) +  a ( o, x  P l  2  1  (j pl<TlP2<T2 V Pl&l l  9pi(Ti...a,  l  \h  ...ai  pifTl  ^  1 ^)l X  \ h <r ... p i  1  (  < T ( l  3  )  niOd  _ ,X( 1 )  l  ( J V ) ) )  m  o  c  j  J ,  m  0  f  j  1=3  \J  {N)  ,  (23A)  where, a.  A mod TV = integer remainder of (A -f- TV), and is used as an index to cycle through the independent variables of f(xo, • • •, £_/v_i);  b.  the subscript notation  /?i  <TI . . .  /9;<r;  (/ e { 1 , . . . ,  l  m a x  })  is standard tree notation (see complete  definition below) and serves to identify the two dimensional functions hp^ai-.-piui  g  P i a i  ...piai  and  which act on the specific subdomain o f / ( x o , . . . , XJV_I) referred to by branch  P\(T\ • • • picrr,  c.  and,  lmax  is an integer defining the maximum number of terms (or levels of the tree structure)  required in order, to achieve some maximum approximation error max /(x) - /(x) < e  115  (24A)  where e is an arbitrary small positive real number, andV is the domain of the function / ( x ) . The most basic structural characteristic of the proposed iV dimensional model is its tree structure.  This structure is here referred to as the approximation  tree.  A n example of the  proposed model approximation tree structure is given in Figure 2. Two things are represented via this tree structure: 1) variables of the function being approximated are selected and acted upon by functions at each tree node; and, 2) the domains spanned by these variables are being divided up at each node in such a way that each point in the domain of the function being approximated represents one and only one path through the tree.  These characteristics are  explained in detail below. Tree Notation: Each level of the proposed model has 2 types of branches: p branches and a branches. These two types of branches, as is described later, are differentiated from one another by the type of discriminant  functions  (which are functions that dictate which path through the  tree is taken) associated with them. 
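Before the branch labeling is made precise, it may help to see how the expansion (23A) is evaluated once a branch has been selected and its two dimensional building blocks are in hand. The sketch below is an illustration under assumptions, not the thesis's implementation: spore2_predict and the list-of-callables interface are invented names, the plain l mod N cycling stands in for the index bookkeeping of (23A), and choosing which branch (and hence which h, g pairs) applies to a given input is the job of the discriminant functions just described.

    # Illustrative sketch (assumed names): evaluating the cascade of equation (23A)
    # along one branch of the approximation tree.  Each level l contributes two fitted
    # 2-D functions: h_l, which takes the previous h output and one cycled input
    # variable, and g_l, which produces that level's additive correction.

    def spore2_predict(x, h_funcs, g_funcs):
        """x: input vector of length N >= 3; h_funcs, g_funcs: one pair of
        2-argument callables per level, assumed to belong to the branch containing x."""
        N = len(x)
        h_prev = h_funcs[0](x[0], x[1])        # level 1: h(x0, x1)
        y_hat = g_funcs[0](h_prev, x[2 % N])   # level 1 term: g(h, x2)
        for l in range(2, len(h_funcs) + 1):   # levels 2, 3, ..., cycling through inputs
            h_prev = h_funcs[l - 1](h_prev, x[l % N])
            y_hat = y_hat + g_funcs[l - 1](h_prev, x[(l + 1) % N])
        return y_hat

The point of the structure is visible in the sketch: every additional level only ever fits a two dimensional function of (previous output, one input variable), regardless of how large N is.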
Note that the length of a branch path and the number of branches emanating from a particular tree node is not fixed a priori, and is determined based on the mapping of the function being approximated, as the model is being built. The branches of the tree are labeled as follows: let i\, i , ..., iii be positive integers, then 2  level 1 branches are identified by the labeling pi — i\ for p branches and p\o\ = ii,%2 for a branches; level 2 branches are identified by the labeling p\(j\p  2  and p\o\p (J 2  2  labeling piai...  = u,«2,«3 for p branches  for a branches; similarly, level / branches are identified by the  = ii,i2,h,H  p\ = %i,% ,..., i i-i 2  2  for p branches and pio-y...  a branches. 1 1 6  p\o\ = ii,i.2,  • • - ,121-1,121 for  The above is an example of how a 4 dimensional function f(x , x , x ,x ) Q  x  2  3  is approximated in the proposed structure.  The approximation is symbolized above by / . The above example shows the calculation of the approximation / along one path of the approximation tree. This path passes through the branches p\ = 1, p\(T = 1,1, p\0- p 1  and Picr p a2 1  2  = 1,1,1,1 (see Figure 2), and continues on through to branch p j C i  .. - P L ^ L  1  —  2  1,1,1,  = 1,1, - - -, 1,1, where  the subscript L defines the number of levels of the tree required to get a sufficiently good approximation along this path of the tree. As explained in the body, the proposed approximation approximates functions 2 dimensions at a time. These 2 dimensional functions are symbolized as (J  ,,,  P1<T1  PI<TI  and h  p i a i  p i a i  (the integer / defines the level of  the tree). At tree branch p a = 1,1 of level / = 1, the functions are symbolized by <h ^ and h ; at tree branch 1  Picr p o1  P1V1  2  2  1  1:1  = 1,1,1,1 of level / = 2, the functions are symbolized by <7i,i,i,i and • • - P L & L  = 1,1,..., 1,1 at level / =  L ,  ^1,1,1,1;  and so on through to branch  where the functions are symbolized by <7i,i,...,i,i and  Figure 1 4-D Example of the Proposed Approximation  1 1 7  •  D.P  Rr  M  °1  R  P,a  Pi iPf; 0  Pfff  = 1,1,1  l P 2  P l  a  2  l P p  2  =1.1.1.1^  = 1,1,2,1 ^  pa, = 1 , 1 / P,°,P, = 1,1,2  P ,Pf2 = a  1  X  \  1,1,2,2  \ P,a  P.°lPf\ = 1,1,2,3  =1,2^  \  P,= 2 P  p , c f = 2,1,1  l°l f 2 P  =  2  . ' > 1  1  1  2  pa 1  I  =2,1  ppp? = 2.1 A l , , 2  ,P,a  lP2  = 2.1.2  <  Pl°lPf 2 = 2.1.2.2  Branches  o Branches  Branches  Branches  Level 2  Level 1  The above is an example of the tree structure of the proposed model which partitions the input space of the function being approximated into subdomains that allow the function to be approximated using a superposition of a cascade of 2 dimensional functions. The example tree given shows only 2 levels. The notation used in the diagram is explained in detail within the text of the paper. Figure 2 The Space Partitioning Characteristic of the Proposed M o d e l  Each branch of the approximation tree represents a specific subdomain of the function being 118  approximated. Thus, one can define the notion of £ tree branch p\a\ ... p\o\  . . . , .TJV-I)  (XQ,  (25A)  to mean that the point (XQ, . . . , x-yv-i) (which is within the domain of the  dimensional function  being approximated) is contained within the subdomain represented by branch p\o\...  p\o\. For  / = 2, this subdomain is defined as follows: (xo, •  • •  i ^ A f - i ) £ tree branch p\o\p (J2  ((x ,x ,... 3  4  ((x ,xi,X4,x ,... 
0  where the domains R , - D Pl  ,.T/Y_I)  ,XJV_I)  5  P l C T l  ^  2  £  i2p  A (x ,xi) £ D )A  R  £  Pl  1 ( T l  p  A (h  1  2  1  £ D  (xo,xi),X2)  Plffl  2  P l < T l P 2 a 2  and the functions h  , R <jip -, Dp <T p Pl  (26A)  piai  0  are defined in THEO-  Pl<Tl  2CT2  R E M II. For / > 2, the domain represented by branch p\a\...  ),  p\o\ is defined as follows:  (XQ, . . . , x/y-i) £ tree branch p\o\ ... p\o\ ((x , . . . , x / v _ i ) £ tree branch p\a\... 0  {( il X  J • • • ) iN-2)  ^ •^pi(T ...p|  X  1  where the domains R ...p Dp ... , Pl(Ji  z i , . . . ,iN-2,ji  ar  n  iai  Pl(Tl  A ( / l p T l ...pi-KTl-l l (  the functions  (27A)  /?;_icr/_i)A 5^jl ) /ip  l C T l  ^  pi<T\...pi(Tl)l  D  ...p,_  1 (  r,_ , 1  and the subscripts  e defined in T H E O R E M II. Note that the integer subscripts z i , . . . , I N - 2 , J I  are used to keep track of which variables of the function being approximated are used at which node of the approximation tree. Also, note that the modulo function used in equation (23A) and in the statement of T H E O R E M II, is used to cycle through all possible variables as often as is necessary to produce an approximation which is within some maximum allowable error. T H E O R E M II: Let f(xo,xi,... , .r/y_i) be a bounded continuous (N  > 3) defined on the dense domain  V, where V C 119  N dimensional  and V has finite  non-zero  function size.  For  all I G 1 such that I > 1, let i\, io, • • •, i N-2 £ { 0 , l , . . . , i V — 1} and j\, ji G { 0 , 1 , . . . , N — 1 swcn £na£,  '  a.  ji  = (/ mod N), and j  b.  V m , n G {1, 2,. . . , TV — 2} [(m ^ n => i  c.  Vm G { 1 , 2 , . . . , T V - 2 } A Vn G {1, 2}[i  2.  3.  ^ i ) A (m > n => i  m  n  V/9i  m  > i )];  a n  n  d  + jn].  m  Then there  1.  = ((/ + 1) mod N);  2  exists,  G {1,2,3,. . .}, a set of domains R  C  Pl  K ^  -  3  where  a.  V i , j G { l , 2 , 3 , . . . } [ i + j => R n Rj = 0], and  b.  3 G {l,2,3,...}[( >i)=>(R  t  ?  Pl  =<&)];  Pl  V/OI,<TI  G {1,2,3,...}, a set of bounded continuous  defined  on (xo,x\)  GD  2 dimensional  functions  h (xQ,xi Pl(Tl  where  Pl(Tl  a.  D  C 3? ,  b.  V/>i,i,j G { l , 2 , . . . } [ i ^ j =4> D nD j  c.  3i G { l , 2 , 3 , . . . } [ ( < 7 > O ^ ( £ p « r = 0 ) ] ;  2  Piai  Pli  1  V/?I,<TI  G {1,2, ,3,...},  defined  on (h  a.  Qpiai  b.  3 G{l^,...}^! > 0 ?  C 9£ , 2  1  a set of bounded  £  ,xz)  PlC7l  = 0], and,  Pl  1  continuous  wnere  and (Q  P l < 7 l  =0)];  120  2 dimensional  functions  g  P i a i  (h  p i a i  ,  x  2  4.  V/ 6 {2,3,4,...} andVpi,ai,...,pi  a.  5.  Piai  C^  Pl  2  wnere  V/>i,CTi,...,/»j_i,<7i_i,i,.7 e {1,2,....} {} 7^ i) ^  b.  £ {1,2,...}, a set of domains R ...  (-Rpi<Ti...pi_ <T|_ii l~l Rp ai...pi- a - j 1  1  1  t  = 0)  1  , and,  3 e {i,2,3,...}[( >i)=>(i2 a ....p = 0)]; ?  /0/  Pl  VI £ {2,3,4,...} and V/3i, 2 dimensional Dp\o\  functions  <7i,  1  I  /> , cr , • • • ,Y>/,cr; £ {1,2,3,. ..}, a set of bounded 2  2  h ... (h < :.. _ _ ,x ) pl(Tl  Piai  pl  Tl  Pl  iai  1  . . . p\o-\ where  defined  J1  continuous  on (h ... _ _ , Plffl  Pl  l(ri  xjf) £  1  1  &•  -Dpi (Ti ...p((T; C K ,  b.  V/Ji,cri,...,/Ji,i,j  c.  3i G { 1 , 2 , 3 , . . . > i ) =^(Z>  £ {l,2,...}[(i^j) Pl(Tl  (£>pi <Ti...pii H D ,,, j PlfTl  .;.p  ana",  lffl  —  pl  0)], and  = 0)];  ,  6. V/ £ {2,3,4,...} and \/pi, ai, p , cr , ...,pi,a 2  2 dimensional  functions  2  t  g a ...pia (h a ... a ,Xj ) Pl  1  l  Pl  1  Pl  l  £ {1,2,3,...}, a set of bounded defined  2  on (h „. 
„ xj ) Piai  Pl  n  2  £  continuous Qp^.-piai,  where  3-  Qp\ct\...picri CL 9?", and  b.  3 £ {l,2,3,...}[(a >i)=>(Qp <r ... ?  /  1  such that (given all of the above definitions),  1  Plff|  = 0)];  for any arbitrarily  .121  small e  max  £ St where e  max  > 0,  the following  is true:  V(x ,xi,...  G T>, 3l  ,XN-I)  0  (V/ G {1,2, 3 , . . . , l }(3(pi  G { l , 2, 3,...} A a, G {1, 2, 3,...})))  m  f(x ,xi,:..  ,x _i)  0  where fl\ is defined  as  i. far I = 1, ((x ,xi) ii.  V/ G {2,3, 4 , . . . , l }, m  "(x ,...,  xTV-I)  0  -  N  (28A)  \  follows:  GD  0  G {1,2,3,...}  m  P l ( T l  A (x ,...  € R ) =>• fi\ -  3  if(xQ,....  ,x  JV_I  g a (h (x ,x ),x ),  Pl  Pl  ) G tree branch p\o~\...  an element of tree branch  — 9pi<jj...pi<Ti(l P\(J\ l  1  pl<Tl  0  1  p\o-\ (note that the concept of  p\o~\ ... p\a\" is defined on page 119), then  ...pi<Tl V'Pl (Tl...p(_lCT(_l , jl X  (29k)  ) 5 J2 ) • X  Proof: The proof of T H E O R E M II is based on constructions of the functions and g  P i a i  ...  p m  ,  and the domains R ... Pl(7l  '  Pl  2  and D ,„ plCTl  h  _,  PlCru  Pl(Tl  (for all / G {1,2,3,...} and  piCTl  V/9i, a i , . . . , pi, ai G {1,2,3,...}). These constructions are given in CONSTRUCTION V of Appendix D . L E M M A I: Let (x\, x\,..., Then there exists a domain of  R  N  S T E P 3.ii of CONSTRUCTION  r  N  1  1  C $t ~ N  Pl  G$  x _) 3  3  be a single point which satisfies equation  such that (x\, x\,...,  V are satisfied.  122  \f_i)  x  GR  Pl  and the  (83A).  conditions  Proof of L E M M A I: Let 7 G 3? such that 7 > 0. Then, define the domain R  Pl  R  P1  = {(x ,...,x _ )\((x ,...,x _ ) i  N  1  3  N  as follows:  G ^ " ^ A  1  G 2>])A  (3(x ,xi,x )[(x ,... 0  2  0  (30A) (||(x , • • - , N - l ) X  3  (Vi  G {0,...,/9i  Note that one can always choose a domain R  Pl  -  ( l,  < 7) A  • • • , N-l)\\  x  X  -1}[(X ,...,XJV_I) 3  which satisfies the above condition because the  domain T> of the function being approximated, is dense, and because equation (83A) is true. Note that the above R , by definition, satisfies the condition of S T E P 3.ii.a and the condition Pl  of S T E P 3.ii.b of CONSTRUCTION V . Given the above definition of R , for 7 = 0 equation (85A) becomes: Pl  V (xo,xi,x ) Pl  Because f(xo,x\,...,  2  = f(x ,xi,X2,xl,x\  . . . ,x _ ).  (31A)  1  0  N  1  XJV-I) is continuous within the dense domain T>, it follows then that as  7 approaches zero, rn  , a x | 7 ( a ; o , x i , X 2 ) - /(a; , i . • . , ?  x  Pl  0  0.  N-I)\  X  By the definition of continuous functions, this implies that V (xa,xi,x ) Pl  2  (32A) is also continuous.  Thus, because the domain V is dense, there must exist a 7 > 0 such that equation (88A) is satisfied. Therefore, there must exist a domain R  Pl  which satisfies the condition of S T E P 3.ii.d  of CONSTRUCTION V . Finally, if one chooses any value of 7 > 0 which satisfies equation (88A), then the condition of S T E P 3.ii.c of CONSTRUCTION V is also satisfied because R  Pl  123  is dense and because the  domain V of the function being approximated, f(xo,  is also dense and finite in  x \ , . . . ,XN-I),  size. This completes the proof of L E M M A I. As shown in the proof of L E M M A I, the function V  Pl  (XQ,X\,  X) 2  is continuous and therefore,  by T H E O R E M I, the construction in STEP 3.iii of CONSTRUCTION V is valid. 
Also, by T H E O R E M I, the functions h \,(xQ,xi)  and g b(h b, 2)  0  A  Pl  Pl  continuous and therefore the  constructed in STEP 3.iv of CONSTRUCTION V as  function f (x ,x r_ ), PltTl  a r e  x  pi  1  /pi(Ti(»0,  •••  , N-l)  = f( 0,  x  x  is also continuous within the domain D o m  ~ 9p aA Pi(rA 0, l), 2),  • • • , N-l) x  Pl(Tl  fl  x  x  (33A)  x  1  . Note that, because the domains  D ,R ,U Pl(Tl  Pl  and V are all dense and of finite size, then by STEP 3.iv of CONSTRUCTION V , the domain Dom,,^, is also dense and of finite size. Also, by T H E O R E M I and STEP 3.iv of CONSTRUCTION V , the union of the domains D o m V=  | J  Pl(7l  span the domain V; i.e.  Dom .  (34A)  i;  i€{l,2,...} jG{l,2,...} Next, by T H E O R E M I and STEP 3.iii of CONSTRUCTION V , the following is true: max  \V (x ,x ,x ) Pl  0  1  - g a (h < (x ,x ),x )\  2  Pl  1  pi  Tl  0  1  < ep.  2  (35A)  Thus, given that equation (88A) of CONSTRUCTION V is satisfied, one can rewrite equation (88A) as follows: max max  \V (x ,xi,X2) Pl  0  |/(.T , 0  • • • .ZJV-I)  - g , (h (xo,xi),X2)\ Pl< 1  Pl<Tl  -  V {x ,xi,X2)\ Pl  0  < P • (max  + \f(x ,...,x _i)\ 0  N  (36A) 124  Using the Schwartz inequality, the above equation becomes: f(x ,.  max Dom„, 1  +  V (x ,x ,X2)-  < f3 • ( max \Dom  \f(x ,...,  a w _ i ) | I, J  Pl  0  Pl  1  0  (37A)  9 a (h a (xo,xi),x ) Pl  - V (x ,xi,X2)  . . ,XN_I)  0  Pl  1  2  0  w a i  or, simplifying the equation, one gets max Donip  • • • ,X _i)  f(xo,  -  N  <  g a {h a {xo,Xi),X ) Pl  1  pi  1  2  lCTl  (38A) (3 • ( max \f(x ,. ,Dom j  . . ,2.^-1)1  0  , ,  P 1 CT  which can be expressed as max Dom . 0  f (xQ,...,x -i) PlCTl  </?•(  N  max Dom„,  \f(x ,. 0  ..,XJV-I)|  (39A)  )•  CT1  a  Now consider the statement and proof of L E M M A II. L E M M A II: Let (x^,..., (94A).  Then  R ai... - aiPl  Pl 1  lPl  there  exists  x\ _^j  £  N  a domain  $t ~ N  be a single  2  C  R - ... _ _ Pl(J  i  pt  lUl  lPl  point  which  such  that  o / S T E P 7.iiL2 of CONSTRUCTION  and conditions  V are  satisfies  (x^,...,  equation x} _^J N  £  satisfied.  Proof of L E M M A II: The lemma is first proven for the case / = 2. Let 7 £ 9£ such that 7  > 0. Then, define the domain R R  as follows:  P l ( T l P 2  . •, Xi _ ) I ( 2 ^ ,. . ., X{ _ ) £ 3?  j (x ,.  N  il  2  (3(x ,Xj )[(x ,.\. jl  2  ((XQ, . . . ,  0  XN-I)  N  .,x _i) N  2  ^A  £ D])A  £ tree branch  (40A)  pia\)A  .)-(4.-^L)|<7)A (Vi £ {0,...,/)2 - l } ^ , - , , . . . , ^ ^ ) ^i2pi i])}ffl  125  Note that one can always choose a domain R  which satisfies the above condition because,  PlfflP2  as shown above, the domain, D o m  P l ( r i  , of the function being approximated, f  (^0, • • •, N-i), x  PlUl  is dense and because equation (94A) is true. Note that the above R  ,  by definition, satisfies  PiaiP2  the condition of S T E P 7.iii.2.a and the condition of S T E P 7.iii.2.b of CONSTRUCTION V . Given the above definition of R  ,  for 7 = 0 equation (97A) becomes:  Pl(TlP2  = f „ (xl,x\,x ,X3,x\,...,x _ ).  ^ i<T {hp ,X2,X3) P  lP2  Pl  1  2  Because the function being approximated, f i( o, Pl(Tl  N  ••  x  puT  domain D o m  (41A)  1  L!TL  , and because the function h  1  ^'JV-i).  •  is continuous within the dense  is continuous (as proven in T H E O R E M I), it  pj(Ti  follows then that as 7 approaches zero, max \V p (hp ,X2,X3) PllXl  PI  "\  P  2  -f  Pl(Tl  LLTL  (an, a i , • • •, z / v - i ) | -» 0.  
(42A)  2  X2, £ 3 ) is also contin-  By the definition of continuous functions, this implies that V p (h , Pl<Tl  uous. Thus, because the domain Dom  Pl(Ti  2  Pl(Tl  is dense, there must exist a 7 > 0 such that equation  (100A) is satisfied. Therefore, there must exist a domain -Rp  1(T]/92  which satisfies the condition  of S T E P 7.iii.2.d of CONSTRUCTION V . Finally, if one chooses any value of 7 > 0 which satisfies equation (100A), then the condition of S T E P 7.iii.2.c of CONSTRUCTION V is also satisfied because R  Pl(7lP2  domain D o m  P l J ]  of the function being approximated, f  (xrj, • • •, N-i), x  pl<Ti  is dense and the is also dense and  finite in size. This completes the proof of L E M M A II for the case / = 2. The proof of L E M M A II for the cases / 6 {3,4,...} is given next. 126  As shown above, the function V j (h ,x ,£3) Pl( lP2  Piai  is continuous and therefore, by THE-  2  O R E M I, the construction in STEP 7.iii.3 of CONSTRUCTION V is valid. Also, by THEO R E M I, the functions h  and g a a  PiaiP2(J2  fpi<T a (xo, lP2  SJV-I),  2  fpi<Tip cr 2  ( Q, • • • X  2  Pl  lP2  are continuous and therefore the function  2  constructed in STEP 7.iii.4 of CONSTRUCTION V as , a-'Af — l )  j  =  Pl  <7]  (XQ, . . . , X y y r _ 1 ) —-  is also continuous within the domain Dom  g u\ a  lP2  R  2  P  l  ai  P  2  P2  O pi<Tip cr l  2  2  2  5^3),  and V are all dense and of finite size, then by STEP 7.iii.4 of C O N -  ,U  STRUCTION V , the domain D o m  Pl  o\p o 2  2  is also dense and of finite size. Also, by T H E O R E M  I and STEP 7.iii.4 of CONSTRUCTION V , the union of the domains D o m domain D o m  P l ( T l  pi(Tl  =  Therefore, given that f <j cr (xo,... Pl  Pl(Tl  p  2(T2  P l ( T l P 2 ( T 2  span the  ; i.e. Dom  Dom  (43A)  Note that, because the domains  .  PlCJlP2CT2  Dpi<T cr ,  pl  lP2  [ J Dom ^{1,2,..} ./ G {1-2....}  (44 A)  P l ( T l i r  , xyy_i) is continuous within the dense finite domain  2  , the proof of L E M M A II for the cases / £ {3,4,...} follows by induction the proof  for the case / = 2. Note that for / £ {3,4,...} the domains R <T ... i pl  1  P  —  {(^ii? • • • , ix-i)\{x%\i x  (3(X  J I  ,.T  J 2  )[(X , 0  ••  R  •) i _ ) E  ...  2  x  N  P i a i  2  are constructed as follows:  P l  A  . . . ,ia?AT_i) £ D])  ((xo, • • •, XN-I) £ tree branch p\(?\ ... 0/_i<7/_i)A  (45A)  /  ||(x,-,,..., xtN_2) - (xl,..., x\N_2) || Vi £ {0,. . . ,pi - l } [ ( x . , . . . , x _ ) i tl  127  J N  2  <  7A  R <j ... „ cT - i] Pl  1  Pl  1  l  1  }•  Also, note that, by induction, and by T H E O R E M I and S T E P 7.iii.4 of CONSTRUCTION V , the union of the domains Dom  Dom  Pl(ri  Pl(Tl  ...p  ...p _ I  1(ri  _  ((T(  ]  span the domain Domp ... _ - _ ; i.e. 1<7l  =  Dom  ( J  /?l0  (0(  lO  l  . ... _ ,_ i 1  w  l£r  1  1  (46A)  r  zG{l,2,...} jG{l,2,...} This completes the proof of L E M M A II. Next, by T H E O R E M I, and S T E P 7.iii.3 of CONSTRUCTION V , the following is true for the case 1=2: x )\<e . 
3  p  (47A)  U o mCTJP2 a~2  Thus, given that equation (lOOA) of CONSTRUCTION V is satisfied, one can rewrite equation (100A) as follows: \fp T (xo,...,x ^i)- Pp a p (hp „ ,x ,xs)\  max  +  /  1<  1  1  N  1  2  1  1  2  UOHlp^ tjy P2&2  max  \ Pp <rip2 /  1  ( 0, l, X ) — Qp^ai P2O2 x  x  2  P - I  max  |/p  lCTl  i] Pi<JiP2(T2 l  {hpitri, 2), z) I 5i x  x  (A<2\\  (a [.T , • • • , z / y _ i , 0  y J J O I T l p ^ cr^ p2 o"2  Using the Schwartz inequality, the above equation becomes: max Dom  /PI<TI(ZO,  • • • , N-l) x  -  Vp p {hp X2,xz)+ l(Tl  2  iUl1  PlCTlP2 CT2  { G-> li 2^ x  ~Ppi<J\P2  x  x  P-[  9p\o\p2<72  (^piT1P2U2  max \f (x ,... Piai  {hpi&i, 2~) x  1  s)<  x  (49A)  ,x _i)\],  0  N  or, simplifying the equation, one gets max Dom P i a 1 P2 "2  fpiUl { 0l X  • ••5  N —l )  X  9pi<TlP2<72  0 Pl<riP2&2 ( ^ P l « T l l  ,  2)l  X  Z)<  X  (50A) P-\  max \fp (xo,. . . ,x -i] 1<Tl  N  . L/OITlrti <T 1 Do (TO 128  Therefore, for the cases / £ {3,4,...}, it follows by induction that max  jp\G\...px-xUl-x  Dom  Qp\<J\...plVl  (3 • I  ( 0l  X  {] pi(Jl...pi<T l  l  P1CT1  —1)~  (hpi<Tl...pi-lCTl-l  ) jl)-I  X  • • • •>  x  ...p  (51 A)  J2)  X  \fpi<JI••.pi-i(Ti-i{ Qi  max  \Dom  •••)N  X  x  N  )  /  1(r(  or, simply, max  Ip\<Ti...pi<jX Qi fp x  Dom (3 • [  i N—l)  •••  x  < (52A)  \f <T ... - cT -. ( 0,-••, N-l)\)'  max  Pl  1  Pl  1  l  1  x  x  \Dom ... Piai  J  Pl<Tl  Thus, given equation (39A), it follows that  Dor  max 1>\  1  fpi(Tl...pi(Tl{ 0-> X  • • • 1 N — l) X  </?'•(  max  |/(s , •• 0  (53A)  -,XN-l]  Dom,  "i • • • PI  Therefore, because 0 < (3 < 1 and because the function being approximated, f( o,. x  •.,  ZJV-I)  is bounded, it follows that there exists an / E {1,2,...} such that, for any arbitrarily small > 0,  (3 •  max \ Dom  |/(x ,...,a;jv-i)| < £ r 0  (54A)  c  Thus, because equation (34A) and equation (46A) are true, it must be the case that equation (28A) of T H E O R E M II is satisfied. This completes the proof of T H E O R E M II • .  129  Appendix C. Rate of Convergence The approximations developed in the previous two appendices are here be applied to nonparametric regression. Let (X, Y) be a pair of random variables such that X is of dimension TV, where iV > 3, and Y is of dimension one. Let 6(X) = E(Y\X)  be the regression function of Y on the  measurement variable X. Let 9 , n > 1, denote an estimation of 9 given the n random samples n  (X  u  Yi),...,  (X ,Y ) n  n  of the distribution (X, Y).  Let the estimators 9 be based on the nonparametric model given in equation (23A) of n  Appendix B. Let the two dimensional functions g <r ...p <r and h ,,, Pl  1  l  l  Pl(Tl  Pl(Tl  of 9 be constructed as n  defined in CONSTRUCTION V of Appendix D with 0(x) = / ( x ) , x = (x , xi,...,  x _i)  0  N  G V.  Assume that the function 9 = f, and the domain V, are as defined in T H E O R E M II, with the additional constraint that 9 is p-times differentiable (p > 1) on V. As defined in [Stone, 1982], assume that the distribution of Y depends on the value of the measurement variable x and takes the form h{y\x.,t)4>(dy), where <f> is a measure on 9t and t = 0(x) is the mean of the distribution as defined by: (55A) T H E O R E M III: Assume that conditions 1-3 defined in [Stone, 1982] are true. Assume that 9 and 9  n  are as defined above.  Then, there exist a finite set of dense domains S  k  C V, where  K  k  G { 1 , . . . 
, K} (K finite), and | J S = T>, such that, given n random samples of 9 from any k  k=l 130  domain r =  S}., the rate of convergence  of 6 to 9 on S is 9 — 9 n  <0^(n)  n  k  where  log ( n ) ^  1  p/(2p + 3).  Proof: Because the dense domains D o m  pi(Tl  are finite in number and span V (see  Pi(Ti  equation (46A)), in order to prove this theorem, it is sufficient to establish the rate of convergence of the error functions  f  i( o-> • • • > N-i)> where n is the number of random samples from x  pitJi  x  pl(T  the corresponding domain 5& = D o m  ...  pi(Tl  As defined in equations (91A) and (103A) of  Plffr  CONSTRUCTION V in Appendix D, the error function after the construction of / levels is given by: fp * ...p * ( o, • • -, N-i) =9(x ,... x  1  1  l  x  l  , X J V - I )  0  9p a {hp ff (xo,x ),x )  -  l  1  1  1  1  +  2  9p\<J\P2ff2\' p\<JiP2'J2 (hp , 2), i) I X  l  +  X  iai  ^^9pi(Ti...ai(hp <T ...tTi(hp <Ti...(T ^ , r 1  1  {i  1)  w ) ) '  mod  x  1  i  x  ((;+i)  mod  w ) )  >  (56A) We prove T H E O R E M III by showing that  • . . , N-i)\\  \\f^ ... (x , 1<ri  pi<ri  < O^n)'  x  log  1  OQ  0  given n random samples from the corresponding domain Sk = D o m  Pl(Tl  ...  PiCTr  As defined in CONSTRUCTION V of Appendix D, the functions h ,,. Piai  in (56A) are constructed from the 3 dimensional functions  V  Pl  and  (n))"  and  Pl(Ti  g  Pl  1(  l  o ...p j 1  l (  l  (see  V ... _ x _ Piai  P l  lPl  equations (85A) and (97A)) which are in turn constructed from the function 0(x) = / ( x ) being approximated. However, 0(x) is unknown, and therefore, be approximated from Dom  (0lffl  ...  Pj(Ti  ^ p <T ...pi- ai- pi >  1  1  1  1  n  random samples from the domain  T  Pl  V  Pl  and  V  I) D o m  Pl(Tl  must  ... _ _  PlUl  Pl  xUl  and  lPl  T  ...  Pl(Tl  Pl  D  respectively (see equations (87A) and (98A)). We denote this by defining V and pi  t  0D  e  approximations of V  Pl  and  V  131  ... _ < _ip ,  Pl<Tl  Pl  1  Tl  l  respectively, based on  n from  there respective domains. The following lemma establishes the optimal rate of convergence for the estimation of \\VPl  -^"J^  and \\VPiai...Pl_1(ri_lPl - V^iai_pl_1<Tl_lPl W^, as a function of  n random samples. L E M M A III: Let the functions using the local average each estimator number  Pl  estimator  V \ and V  of points  V  andV  *  used in constructing  the  piCTl  Pl  1(Tl  of [Stone, 1982].  ... _ _  p  ^. _ _ ,  pitJl  Pl  l(Tl  s  lPl  C(n  _ 1  for all I  lPl  Then the optimal  £ { 2 , . . . , l }, be estimated m  L°° rate of convergence  of  l o g n ) , where r = p/(2p + 3), and n is the r  estimator.  Proof of L E M M A III: This lemma is proven by showing that, given the assumption in Theorem III, all functions V  Pl  and V ,,, _ _ pi(7l  First consider the estimation of V . Pl  V  Pl  pi  1(7l  ipi  meet the conditions defined in [Stone, 1982].  By equation (85A) and equation (55A), the expected value  takes the following form: / R, •PI  V (x ,xi,x ) Pl  0  6(x ,xi,..., 0  2  pi  R, •PI  Therefore, the distribution of V  Pl  (57A)  is of the form /  -h(y\-x.,t)dR  Pl  <f>(dy).  132  (58A)  Because (58A) is an appropriately scaled version of h(y\x.,t)</>(dy), it follows that given that h(y\x.,t)<f>(dy)  meets conditions 1-3 defined in [Stone, 1982], then the distribution in (58A)  must also meet these conditions. Also, note that because 6 is p-times differentiable, then given the above calculation, it must be the case that V  Pl  given that V  Pl  is also p-times differentiable.  
Therefore,  is a p-times differentiable 3 dimensional regression function, then by Theorem  1 of [Stone, 1982], the optimal L°° rate of convergence is \\V — ^ " J l  < O(n logn) , _ 1  Pl  r  where r = p/(2p + 3). Next consider the construction of the 2 dimensional functions h  and g  Pl<Tl  CONSTRUCTION V , and functions h ... PltTl  and g  PltTl  PltTl  ...  Pltri  ,  Pl(Jl  in S T E P 3-iii of  for all / £ { 2 , . . . , l }, in S T E P m  7.iii.3 of CONSTRUCTION V . As defined in CONSTRUCTION III and CONSTRUCTION IV, these functions are constructed using linear interpolation based on V . Because of equations Pl  (91 A), (97A), and (103A), in order to complete the proof of L E M M A III, we must ensure that the 2 dimensional functions h  , g  P l ( T l  P l t T l  , h ... j Pl/Jl  Pll  l  and g cx... a Px  Pl  {  are at least p-times  differentiable (that being the number of times that the regression function 9 is differentiable). Without loss of generality, this condition can be satisfied by representing h and g j ... (j , PlC 1  Pl  l  P l C T l  , g  ,  PlCTl  h .„ Pl(Tl  PllTl  for all / £ { 2 , . . . , l }, using, for example, 2 dimensional spline approximations, m  of differentiability p or higher (i.e. instead of the linear interpolation scheme defined in CONSTRUCTION III and CONSTRUCTION IV). Given that the functions  f  1  l p  l C J l  ...  P l  _ cT _ 1  l  1  of equation (97A) are at least p-times differentiable,  the proof of the lemma for the functions V ... _ _ , Piai  pl  iai  lPl  for all / £ { 2 , . . . , l } , follows the m  same steps outlined above for V , and is therefore not be presented here. This completes the Pl  133  proof of L E M M A III. Given L E M M A III, we can establish the rate of convergence of the error functions Consider the rate of convergence of the first function  fp <T ...p ai( o-,----> N-i)x  1  1  x  l  f  Us-  .  Pl<Jl  ing an analysis similar to that found in [Barron, 1994] and referring to CONSTRUCTION V of Appendix D, the  norm of /™  is given by (unless otherwise stated, the Loo norm is over  1(Tl  the domain of the respective function \\r;^  Letting  —^"Jl  e = \\V p  Pl  <  f ,,, ( o,  0(ji~  l  a  x  P[ai  ^ II / - ^  L  for H / = 1, 2,...):  • • • , N-i),  x  Pl(Tl  " o o + \\v  - v  L  n  P1  pi  (  59A)  and using equations (88A) and (39A), equation  \ogn)  r  (59A) can be written as:  < m\oo  n PiMoo  + *v  (60A)  In order to guarantee convergence we choose an a 6 $ such that 0 < (3 < a < 1 and equation (60A)  satisfies:  1 1 / ^ 1 1 0 0 ^ 1 1 ^ 0 0  Similarly, the Loo norm of  ||/pi<7lP2<T2  Once again, letting  f  Pl(TlP2(T2  ^  <«ll*lloo  —' P  Pl(7lP2  (  6  1  A  )  is given by:  lloo — 11/piO-l ~~ ^ V ^ ^ H o O + H^PlO-lPS  e = \\V p  +  p  i  a  l  P  2  ^  O(n logn) _ 1  ~~  r  ^  P i  <7i  p \ | QQ 2  (62A)  and using equations (100A)  and (50A), equation (62A) can be written as:  11/pUp, J o o  ^  J  L + **< "WK* lloo ^ « | ^ l l o o 2  134  (63A)  Thus, by induction, the  \\f  — \\J pi<Ti...pi- cn-i  || J p a ...pi<ri\\ 1  1<Tl  00  ...  P((Ti  is given by:  —v  II <llf™  n  1  norm of /p  <P\\fp\, ... - * - \L Pl  1  IIQQ ~  ' picri—pi  1  1  I I 4-Wv  l  1  I  1  II  n  p\<J\...pi  r  p a ...pi\\ 1  i  +h  1  I  0 0  (64A)  <«ll/; . ...p _ . 
_ IL<«'Pii 1  \\'  -V  1  0 0  Therefore, construction stops at level / when, at level / + 1, equation (64A) is no longer true; this is defined by the following condition: ||ipi<7i...pi<Ti  H o o —  l^\\fp cr ...pi-iai1  1  1  IIQO +  P  e  —  l l ^ l l o o  a  AND  IQQ ^  \\fpiCT ...pi a 1  +1  (65A)  l+1  a  Halloo  The stopping condition given in equation (65A) is met if and only if the number of points used in constructing Vp ...p p ir7l  IK.,..P,  iai  is insufficient for error reduction; in other words, the following is true:  l+1  - ^ ; U . . . P J L < o(n-Ho n)  i  r  + 1  g  Using the inequality \\fp, ... _ _ \\ fTl  pl  lCTl  1  < a  1  oo  \\9\\ ,  l  OQ  -0\\&w,\L  H I / P % , . . P J L  then for all fp „.p 1(7l  l(7l  ( ) 66A  satisfying equation  (64A), the following is true:  /^'-^IL +^ a ^ l L Solving equation (67A) for a ^ P H o o , we get:  1 3 5  (67A)  Using the inequality ||/™ ... _ 1<Tl  P(  _ 1|^ < of \\0\\ , and inserting equation (68A) into equation 1  :1(T(  1  OQ  (64A), we get the final result: H/piaL.-PifflHoo  ^  P  _  V a  < (9(n This completes the proof of T H E O R E M III • .  136  p  _1  +  P  6  log n)  Appendix D. The Main Theoretical Constructions CONSTRUCTION I: The Construction of V ( x , y ) , W (V ,z), and S 6  6  STEP 1. Choose 6. > 0 (8 £ 9fc) such that V(x ,yi, zi),(x ,y , x  IK^i,Vi,zi)  - (x ,y' ,z )\\ 2  2  2  2  6  6  z ) £ U, if 2  < 8 • y/2, then  2  \U(x ,y ,z )-U(x ,y ,z )\ 1  1  1  2  2  < | .  2  (70A)  STEP 2. Create a uniformly spaced grid of points in 3? at all points (/ • 8,m • S,n • 8), such 3  that / , m , n £ 1. Define the set U to be  U =•{(*, m,ri)|(Z,m,ra £ 1) A ((/ • 8, m • 8,n • 8) £ U)}.  [Thus  the set U contains  uniformly  all points  spaced grid of points  the square brackets  indicates  which are within  in 3? defined above. 3  (71A)  the set U and which (Note that the italic  lie on the text within  comments.)]  STEP 3. Define the set T i to be  T i = { ( / i , m i ) | ( / i , m i £ 1) A (3n £  [The set T i represents  the set of all one dimensional  m n) u  £ U])}.  functions  (72A)  U{1\ • 8, mi • 8,n • 8),  where (l\,m{) £ ~X\.] STEP 4. Let 6 = 1. [The  index b indicates  discrete  set S& are  that discrete functions  constructed.] 137  Vft(x,y) and W J ( V J , Z ) , along  with the  S T E P 5. Choose an arbitrary point (!o,mo) G T j . Construct the set Sj such that: a.  S  b.  (k,m )  c.  V(/i,??2i)  6  C T . fc  G Si;  0  E Sj, the 1 dimensional functions U(l\ • 8,mi • 8,n • 8) form an S-  ordering as defined above; d.  G Tfc[(/ ,«i2) 0  V(/2,m2)  the 1 dimensional functions U(h • 8, rn • 8, n • 8) 2  2  cannot be inserted within the S-ordering of the U{li • 8, mi • 8,n • 8), where  [Thus  the  set  S  represents  b  1 dimensional  functions  m i ) G S;,.  an  S-ordering  of  the functions  U(li  • 8, mi  • 8,n • 8)  (V(?i,mi)"e S ).] b  S T E P 6. Let  C S  be the set of all points (J ,?™ ) G S 1  6  1  fe  which are first in the S-  ordering of the 1 dimensional functions U(l • 8,m • 8,n • 8) (V(/, m ) E S ). Similarly, b  Vp G { 2 , 3 , . . . , p } , let Vf C S be the set of all points (l ,m ) p  m  fc  p  G S which are fc  in position p of the S-ordering of the 1 dimensional functions U(l • 8,m • 8,n • 8) (V(/, m ) G S ). E X E C U T E the following steps: 6  i.  G  V(/ ,m ) 1  1  V ^ / -8,m 1  ii.  l  set the value of the 2 dimensional function Vj to be  -8)  = 0.  Let p = 2.  iii. WHILE p < p  m  E X E C U T E the following steps: 138  1.  Let (p-^mP- ) 1  G  V p . 
V(l m ) 1  p  p  V  set the value of the 2 dimensional  P  G  }  b  function V to be: b  V (l  -8,m -8)  p  •W  = V^P-  1  p  b  (73A)  • 8) + a,  1  where a = \\U(l -6,m p  8)-U(l ~  -8,n-  p  2.  Let p = p + 1.  Vp  G  p  { l , 2 , . . . , p } , V(l ,m ) p  p  G  m  • 8,m -  1  p  • 6,n-  1  (74A)  8)\\  Vf and Vn such that  G  U , set the  value of the 2 dimensional function Wj to be: W (V (/ F E  [Therefore, I  [(/, ra, n)  U(l  at G  6  P  • 6,  all  m  • 8), n-8)=  p  points  U{l  • 8, m  p  (/ • 8,m • 8,n • 8),  U], the function  WJ(VJ(Z •  (75A)  -8,n-8).  p  where  (/, m)  G  S  b  and  6, m • <5), n • <5)« egua/ to the  n  G  function  • 6,m • 8,n • 8).[  Define.the set Q  b  as:  Q& = j(<*,/3)|Vp G { 1 , 2 , . . . ,p } m  j v ( / , m ) G V£ p  p  [Vn such that (l , m ', n) G U p  (76A)  p  [(a = V ( P - 6 , m • 8)) A (ft = n • 8)]\ p  6  [Therefore, function  Q  b  contains  the set of all possible  W (V ,n • 8).[ b  b  1 3 9  independent  variable  values of the  S T E P 7. Let 6  m a x  = 6.  [b a.x represents  the current  m  S  b  S T E P 8. Let  maximum  number  ofV (x,  y) and  b  z) functions,  WJ(VJ,  and  sets.] 6 =  [Thus  6+1.  the next set ofV (x,y)  constructed S T E P 9. Let T  =  b  -  T _i b  where (h,m )  Vft(x, y) and ^  functions,  b  and. the next set S  b  will  be  S _i. 4  the set of all one dimensional  £ T i , which  b  fc  b  next.]  /T/ie se? Tj represents  S T E P 10. IF T  and l/l (y ,z)  b  WJ'(VJ, 2),  functions  U(l  • S,m  b  b  have not been used in the construction  and sets S . b  W/ien  Tj =  0 //ze construction  • 6,n • 6),  of the  is  functions  complete.]  0, GOTO S T E P 5  E L S E E N D CONSTRUCTION I. /"STEP 5 through U(l\  • 6,mi  functions  S T E P 9 are repeated until all one dimensional  • <5, n • S), where (h,mi) V {x,  y) and  b  WJ(VJ, 2),  functions  £ T i , have been used in the construction  of the  and sete S .] b  C O N S T R U C T I O N II: The Calculation of the Domains S  b  The function determine  which  S T E P 1.  Let  of CONSTRUCTION domain  II is, given a point (x, y) £ 5t  S , b £ {1,2,..., b b  m a x  such that 3z[(x, y, z) £  2  ] , the point (x, y) belongs  to.  / m i n =iG{i,...,o . , [,M IK^^) - ( -.^ -^)ll /)• } y^rajes, ?  m i n  /  n  r  c  max  140  m  U],  (  77A  )  [Therefore, ', Pmin is the distance (l .8,m b  6  • 8) where (h,m )  b  b  S  b  from  the point  {1, 2 , . . . ,  and be  £ 9ft  2  (x,y) b  m a x  to the closest  point  }.]  S T E P 2. Let ^min be the minimum value of i £ { 1 , 2 , . . . , &max} such that  (/,m)gSv  (78A)  — Pmin-  [Thus,  i[  is the first i which satisfies  m n  each point  m  n  The point  each point  is therefore in domain  (x,y)  (x,y)  belongs  Therefore,  p = p [. m  (x, y), there is one and only one assigned value for  S T E P 3. Let b = i i . [Therefore,  the above condition  n  for  i in-7 m  S .. b  to one and only one domain  S .] b  STEP 4. E N D CONSTRUCTION II.  C O N S T R U C T I O N III: The Calculation of the functions  The function  of CONSTRUCTION  III  is, given  a point  (x,y)  G S  b  V {x,y) b  calculate  the  function  V (x,y). b  S T E P 1. Given the triangular tessellations grid in Figure 3 find the triangle which contains the point (x, y) (if the point lies at the intersection of 2 or more triangles, arbitrarily choose any one of these triangles). Let (IQ • 8,mo • 8), (li • 8, m\ • 8), and (l • 8,m • 8) be 2  2  the vertices, as defined in Figure 4, of the triangle containing the point (x,r/). 
Let Vj(/o • 8, mo • 8), Vfc(/i • 8, mi • 8), and V (l • 8, m • 8) be the values of the function b  at these vertices. 141  2  2  S T E P 2. Then V (x,y).is  calculated as follows:  b  V (x,  y)  b  =(V (/i • 6, m • «S) - Vj(/ • 8,m • 8)) 6  0  x  Q  • *( -J° S  X  ' > 6  +  (79A) (V (/ • 8, m • 8) - V ( / • 8, m • 6)) • * * 6  2  2  6  where for an upper triangle s  triangle  x  = s  y  0  0  6  0  0  (as defined in Figure 4) s = s = — 1 and for a lower x  y  (h,^)  then the value of Vj at that point must be chosen such that:  a.  V z , j G {0,1,2} \\V (li-6,mi-6)-Vb(lj  b.  the resulting surface V (x,y)  • f>,mj • 6)\\ < 8 , and  b  b  vertices  ~ ™° ' ^ + V ( / • 8, m • 8),  = +1. Note that if any of the points (/o,mo), ( I i , m i ) , or  are not elements of  [Therefore  ( y  the function  of the triangle  Vb(x,y)  ((x,y)  E Sb) is continuous.  is calculated  which contains  the point  as a linear  interpolation  between the  (x,y).]  STEP 3. E N D CONSTRUCTION III. C O N S T R U C T I O N IV: The Calculation of the functions  The function  of CONSTRUCTION  IV is, given V (x,y) b  W (V (x,y),z) b  and z, where (x,y,z)  b  E U,  calculate  W (V (x,y),z). b  b  S T E P 1. Given the triangular tessellations grid in Figure 5 find the triangle which contains the point (V (x, y), z) (if the point lies at the intersection of 2 or more triangles, arbitrarily b  choose any one of these triangles). Let (Vj ,nn • 8), [V ,ni 0  bl  • 8), and (N ,n b2  2  • 8) be  the vertices, as defined in Figure 6, of the triangle containing the point (xo,yo).  Let  Wj(V6 ,no • 8), Wj(V6j,ni • 8), and Wfc(Vj ,n2 • 8) be the values of the function Wj at 0  2  these vertices. 142  S T E P 2. Then W (y (x,y), z) is calculated as follows: b  b  W (V (x,y),z) b  sv(Vb(x,y)  =(W (V ,m -5)-W (V ,n6-5))  b  J  6  6 l  (W (V ,n 6  fc2  2  • 5)  6 o  W (V ,77  -  6 o  6  0  - V ) fco  <5y  -6))-  S z < y Z  W ( V , n • 5),  +  6  fco  0  (80A) where for an upper triangle triangle  sy = s  z  (as defined in Figure 6) sy = s = — 1 and for a Zower z  = +1, and  min  ||V&  CK||,  for sy = —1  —Vj ||,  tor sy = +1  0  mm  —  0  (a,^)eQ6,Q'>V  v  y  6 o  Note that if any of the points (V , no • <*>), (V^,rai• 6), or (Vfe , n • 6) are not elements bo  2  2  of Q j , then the value of Wj at that point must be chosen such that: G {0,l,2}[||w (V ,n - -6) - W ( v , n j  a. b.  6  6i  f  6  < f ] , and  6 y  the resulting surface W (V (x,y), z), V(x, y, z)[((x, y) G Sj) A (x,y,z) G (7], is b  b  continuous. Also, if the ^ does not exist as calculated in (81 A), then choose 6y such that:  $v = |W (V ,rai -6)- W (V , n -6) |. 6  [Therefore the vertices  the function  6l  6  fco  W (V (x, y), z) is calculated  of the triangle  b  b  which contains  S T E P 3. E N D CONSTRUCTION IV. 1 4 3  the point  0  as a linear interpolation (V (x,y), z).] b  (82A)  between  Triangular tessellations grid used to construct the function Vj(x,y).  Note that there are two  types of triangles formed using this tessellations gird: a lower triangle and an upper triangle (see Figure 4).  Figure 3 Triangular Tessellation Grid for  Vb(x,y)  C O N S T R U C T I O N V : Approximating Functions of N Dimensions Using Functions of 2 Dimensions (JV > 3)  S T E P 1. Let Ro = 0. [The first domain  S T E P 2. Let  P  l  RQ is initialized  to be an empty set.]  = 0.  [The integer  p\ is used as a pointer  to the domains  R .] PL  S T E P 3. W H I L E it is true that pi  3(aJ , • • 3  -,XN-I)  ; , . . . ,XN-I) 3  £ [j  Rt  I A (3(x ,x ,x )[(x ,... 
0  1  2  0  ,£JV-I) €  V])  i=0  (83A) 144  2  0  0  2  upper triangle  lower triangle  The point  £ Sfc can be located in either a lower triangle or an upper triangle of the  tessellations grid defined in Figure 3. For both types of triangles, vertex "0" is located at point (fa • 8, rnQ • 8) £ IR and the value of the function Vj at this point is V&(7rj • 6, mo • 6); vertex 2  "1" is located at point (fa • 8,m\- 8) £ !Jc and the value of the function  at this point is  2  Vj(/i • 8, mi -8); and finally, vertex "2" is located at point (fa • 8, m • 8) £ 5t and the value of 2  2  the function V& at this point is Vb(h~ 8,m  2  • 8).  Figure 4 L o w e r A n d Upper Triangles of  S  b  E X E C U T E the following steps: [Therefore,  being approximated  will be spanned,  of all the domains  i.  Let  pi  [This  ii.  =  V  ?  R  Pl  the entire domain  along  XJV-I),  of the  function  by the  union  pi}.]  1.  the construction  C  of definition  the axis (x$, x±,...,  Ri, for all i £ { 0 , 1 , . . . ,  pi +  initiates  Choose a.  o / S T E P 3,  at the completion  $t, ~ N  3  £  [Thus the domains  of the next domain  R .] Pl  such that {l,...,(pi-l)}[R nR•pi l  R  Pl  l  ~  are all disjoint,  II.] 145  satisfying  definition  La of  THEOREM  Triangular tessellations grid used to construct the function  VVj,(Vb(x,y),z).  Note that there are  two types of triangles formed using this tessellations gird: a lower triangle and an upper triangle (see Figure 6). Note also that the vertical lines are not regularly spaced: the vertical line spacing is defined by the value of a in S T E P 6.iii of CONSTRUCTION I.  Figure 5 Triangular Tessellation Grid for Wb(Vb(x, b.  y),  3(x ,xi,X2)[((xo,xi,...  , X J V - I )  E T>) A ( ( x , x , . . . ,XN-I)  [Thus  contain  points  0  the domains  definition  R  Pl  of the function  being  3  correspond  e  R )], Pl  >  and  to the domain  of  approximated.]  c.  The domain R  d.  The boundaries of the domains R  pi  which  4  z)  is dense. Pl  approximation, along the axes  (x , 0  are chosen such that a 3 dimensional  XI,  x ), 2  of the function  /(x ,..., 0  ZJV-I)  within the domain R , is at least better than no approximation at all. Or, Pl  put formally, this condition is satisfied as follows. Define the 3 dimensional  0  1  lower triangle The point (Vb(x,y),z)  upper triangle  can be located in either a lower triangle or an upper triangle of the  tessellations grid defined in Figure 5. For both types of triangles, vertex "0" is located at point (Vfc , no • 8) and the value of the function  at this point is W& (V& ,720 • 8); vertex "1" is located at  0  0  point (Vjj ,n\ -8) and the value of the function  at this point is W ^ V ^ , n\ • 8); and finally, vertex  "2" is located at point (Vj , n • 8) and the value of the function 2  at this point is W&(V&, n • 8).  2  2  (Vb(x,y),z)  Figure 6 L o w e r A n d U p p e r Triangles of function V (xo,xi,x ), Pl  U  = {(x ,xi,x )\(( o,xi,... 0  defined on the domain  2  x  Pl  ,XN-I)  2  2  N  3  (84A)  E R )},  G £>) A ((x ,... ,x _ )  Pl  x  to be J f{xo,xi,.. V (XQ,X ,X ) Pi  1  =  2  .,XN-i)dR pi i  R  0  I  (85A)  dR  pi  Ro  Let P G 3? such that 0 < P < 1, and let tp G 3ft such that  0 < £p < P • ( where the domain T  Pl  Pi  T  =  {{XQ,XI,.  . . ,XN-i)\{[x ,Xi,.. Q  max  \f(xo,  • • • ,XN-I)\  (86A)  ),  is defined as  . ,XN-\)  G V) A ((x ,£4, • • • 3  €  R )}Pl  (87A) 147  Then, the domains R  must be chosen such that  Pl  max \f(x ,...  
STEP 4. $\forall i \in \{(\rho_1 + 1), (\rho_1 + 2), \ldots\}$, let $R_i = \emptyset$.
[This satisfies definition 1.b of THEOREM II.]

STEP 5. Let $l = 2$.
[This initiates the construction of level $l = 2$ of the approximation tree.]

STEP 6. Let $i_1, i_2, \ldots, i_{N-2} \in \{0, 1, \ldots, N-1\}$ and $j_1, j_2 \in \{0, 1, \ldots, N-1\}$ such that,

a. $j_1 = (l \bmod (N-1))$, and $j_2 = ((l+1) \bmod (N-1))$;

b. $\forall m, n \in \{1, 2, \ldots, N-2\}\ \big[(m \neq n \Rightarrow i_m \neq i_n) \wedge (m > n \Rightarrow i_m > i_n)\big]$; and,

c. $\forall m \in \{1, 2, \ldots, N-2\}$ and $\forall n \in \{1, 2\}\ \big[i_m \neq j_n\big]$.
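STEP 6 above simply rotates the pair of coordinate axes that the next layer of 2-dimensional building blocks acts on, and lists the remaining axes in increasing order. A minimal sketch of this index bookkeeping follows; the return format is an assumption made for illustration.

```python
def select_axes(l, N):
    """STEP 6: choose the two active axes (j1, j2) for tree level l and the
    remaining N-2 axes that index the domains R at this level."""
    j1 = l % (N - 1)                            # condition (a)
    j2 = (l + 1) % (N - 1)                      # condition (a)
    i_axes = sorted(set(range(N)) - {j1, j2})   # conditions (b) and (c)
    return j1, j2, i_axes
```

For example, `select_axes(2, 5)` gives `(2, 3, [0, 1, 4])`: the level-2 blocks act along $x_2$ and $x_3$, while the domains $R$ at that level are subsets of the $(x_0, x_1, x_4)$ axes.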
STEP 7. For all $\rho_1, \sigma_1, \ldots, \rho_{l-1}, \sigma_{l-1} \in \{1, 2, 3, \ldots\}$ and $\forall (x_0, \ldots, x_{N-1}) \in$ tree branch $\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}$ (note that for the case $l = 2$, $\rho_1, \sigma_1, \ldots, \rho_{l-1}, \sigma_{l-1}$ is read as $\rho_1, \sigma_1$, and tree branch $\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}$ is read as tree branch $\rho_1 \sigma_1$), such that

a. $$\max_{(x_0, \ldots, x_{N-1}) \in \mathrm{Dom}_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}} |f_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}(x_0, \ldots, x_{N-1})| > \epsilon_{max}, \tag{93A}$$

where the domain $\mathrm{Dom}_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}$ is defined in STEP 3.iv and STEP 7.iii.4 of this CONSTRUCTION, and,

b. $R_{\rho_1 \sigma_1 \ldots \rho_{l-1}} \neq \emptyset$ and $D_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}} \neq \emptyset$,

EXECUTE the following steps:
[The goal of the following steps is to continue the construction of the tree until the approximation error in each branch $\rho_1, \sigma_1, \ldots, \rho_{l-1}, \sigma_{l-1}$ is less than the maximum allowable error. The steps executed are similar to the first 4 steps of this CONSTRUCTION, with the exception that the function being approximated is denoted by $f_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}(x_0, \ldots, x_{N-1})$.]

i. Let $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} 0} = \emptyset$.
[The first domain $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} 0}$ is initialized to be an empty set.]

ii. Let $\rho_l = 0$.
[The integer $\rho_l$ is used as a pointer to the domains $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$.]

iii. WHILE it is true that,

$$\exists (x_{i_1}, \ldots, x_{i_{N-2}})\left[\left((x_{i_1}, \ldots, x_{i_{N-2}}) \notin \bigcup_{i=0}^{\rho_l} R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} i}\right) \wedge \Big(\exists (x_{j_1}, x_{j_2})\big[((x_0, \ldots, x_{N-1}) \in \mathcal{D}) \wedge ((x_0, \ldots, x_{N-1}) \in \text{tree branch } \rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1})\big]\Big)\right] \tag{94A}$$

EXECUTE the following steps:
[Therefore, at the completion of STEP 7.iii, the entire domain of definition of the function $f_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}(x_0, \ldots, x_{N-1})$ will be spanned, along the axes $(x_{i_1}, x_{i_2}, \ldots, x_{i_{N-2}})$, by the union of all the domains $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} i}$, for all $i \in \{0, 1, \ldots, \rho_l\}$. Note that the goal of this step is similar to the goal of STEP 3.]

1. Let $\rho_l = \rho_l + 1$.
[This initiates the construction of the next domain $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$.]

2. Choose $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l} \subset \Re^{N-2}$ such that

a. $\forall i \in \{1, \ldots, (\rho_l - 1)\}\ \big[R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} i} \cap R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l} = \emptyset\big]$, satisfying definition 4.a of THEOREM II.
[Thus the domains $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$ are all disjoint.]

b. $\exists (x_{j_1}, x_{j_2})\ \big[((x_0, \ldots, x_{N-1}) \in \mathcal{D}) \wedge ((x_0, \ldots, x_{N-1}) \in \text{tree branch } \rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}) \wedge ((x_{i_1}, \ldots, x_{i_{N-2}}) \in R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l})\big]$, and
[Thus the domains $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$ contain points which correspond to the domain of definition of the function being approximated.]

c. The domain $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$ is dense.

d. The boundaries of the domains $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$ are chosen such that a 3 dimensional approximation, along the axes $(h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1}, x_{j_2})$, of the function $f_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}(x_0, \ldots, x_{N-1})$ within a domain $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$, is at least better than no approximation at all. Or, put formally, this condition is satisfied as follows. Define the 3 dimensional function

$$\Psi_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}(h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1}, x_{j_2}), \tag{95A}$$

defined on the domain

$$U_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l} = \big\{(h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1}, x_{j_2}) \mid ((x_0, \ldots, x_{N-1}) \in \mathcal{D}) \wedge ((x_0, \ldots, x_{N-1}) \in \text{tree branch } \rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}) \wedge ((x_{i_1}, \ldots, x_{i_{N-2}}) \in R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l})\big\}, \tag{96A}$$

to be

$$\Psi_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}(h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1}, x_{j_2}) = \frac{\displaystyle\int_{\Pi} f_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}\, d\Pi}{\displaystyle\int_{\Pi} d\Pi}, \tag{97A}$$

where $\Pi = T_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l} \cap U_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$ and the domain $T_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$ is defined as

$$T_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l} = \big\{(x_0, \ldots, x_{N-1}) \mid ((x_0, \ldots, x_{N-1}) \in \mathcal{D}) \wedge ((x_0, \ldots, x_{N-1}) \in \text{tree branch } \rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}) \wedge ((x_{i_1}, \ldots, x_{i_{N-2}}) \in R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l})\big\}. \tag{98A}$$
Let $\beta \in \Re$ such that $0 < \beta < 1$, and let $\epsilon_\beta \in \Re$ such that

$$0 < \epsilon_\beta < \beta \cdot \left( \max_{(x_{i_1}, \ldots, x_{i_{N-2}}) \in T_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}} |f_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}(x_0, \ldots, x_{N-1})| \right). \tag{99A}$$

Then, the domains $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$ must be chosen such that

$$\max_{T_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}} \big| f_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}(x_0, \ldots, x_{N-1}) - \Psi_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}(h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1}, x_{j_2}) \big| + \epsilon_\beta \leq \max_{T_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}} \big| f_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}(x_0, \ldots, x_{N-1}) \big|. \tag{100A}$$

[In LEMMA II it is shown that such a domain $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}$ must exist. The proof of LEMMA II is based on the construction of a domain which satisfies all of the above conditions.]

3. Let $W(x, y, z) = \Psi_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}(h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1}, x_{j_2})$ and

$$U = \big\{(x, y, z) \mid ((x_0, \ldots, x_{N-1}) \in \mathcal{D}) \wedge ((x_0, \ldots, x_{N-1}) \in \text{tree branch } \rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}) \wedge ((x_{i_1}, \ldots, x_{i_{N-2}}) \in R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}) \wedge (x = h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}) \wedge (y = x_{j_1}) \wedge (z = x_{j_2})\big\}. \tag{101A}$$

Use CONSTRUCTION I, CONSTRUCTION II, CONSTRUCTION III and CONSTRUCTION IV to construct the functions $V_b(x, y)$ and $W_b(V_b, z)$, and the domain $S_b$ ($\forall b \in \{1, 2, \ldots, b_{max}\}$). Note that the $\epsilon$ used in CONSTRUCTION I must be chosen to satisfy equation (99A) of this CONSTRUCTION; i.e. $\epsilon \leq \epsilon_\beta$. For all $b \in \{1, 2, \ldots, b_{max}\}$, let $h_{\rho_1 \sigma_1 \ldots \rho_l b}(h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1}) = V_b(x, y)$, $g_{\rho_1 \sigma_1 \ldots \rho_l b}(h_{\rho_1 \sigma_1 \ldots \rho_l b}, x_{j_2}) = W_b(V_b, z)$, and $D_{\rho_1 \sigma_1 \ldots \rho_l b} = S_b$. Also, $\forall b \in \{(b_{max}+1), (b_{max}+2), \ldots\}$ let $D_{\rho_1 \sigma_1 \ldots \rho_l b} = \emptyset$.
[Thus the domains $D_{\rho_1 \sigma_1 \ldots \rho_l b}$ and the functions $h_{\rho_1 \sigma_1 \ldots \rho_l b}(h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1})$ satisfy definition 5 of THEOREM II, and the functions $g_{\rho_1 \sigma_1 \ldots \rho_l b}(h_{\rho_1 \sigma_1 \ldots \rho_l b}, x_{j_2})$ satisfy definition 6 of THEOREM II (for all $b \in \{1, 2, \ldots, b_{max}\}$).]

4. For all $\sigma_l \in \{1, 2, \ldots, b_{max}\}$ define the $N$ dimensional function $f_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}(x_0, \ldots, x_{N-1})$, $\forall (x_0, \ldots, x_{N-1}) \in \mathrm{Dom}_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}$, where

$$\mathrm{Dom}_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l} = \big\{(x_0, \ldots, x_{N-1}) \mid ((x_0, \ldots, x_{N-1}) \in \mathcal{D}) \wedge ((x_{i_1}, \ldots, x_{i_{N-2}}) \in R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}) \wedge ((x_0, \ldots, x_{N-1}) \in \text{tree branch } \rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}) \wedge ((h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1}, x_{j_2}) \in U_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} \rho_l}) \wedge ((h_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}, x_{j_1}) \in D_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l})\big\}, \tag{102A}$$

as

$$f_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}(x_0, \ldots, x_{N-1}) = f_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1}}(x_0, \ldots, x_{N-1}) - g_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}(h_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}, x_{j_2}). \tag{103A}$$

[Thus, the function $f_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}(x_0, \ldots, x_{N-1})$ represents the approximation error along branch $\rho_1 \sigma_1 \ldots \rho_l \sigma_l$ for level $l$ in equation (28A) of THEOREM II. Note that if

$$\max_{(x_0, \ldots, x_{N-1}) \in \mathrm{Dom}_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}} |f_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}(x_0, \ldots, x_{N-1})| \leq \epsilon_{max},$$

then the approximation along branch $\rho_1 \sigma_1 \ldots \rho_l \sigma_l$ is complete and the branch need not be extended (a finite-sample sketch of this update and test follows STEP 7).]

iv. $\forall i \in \{(\rho_l + 1), (\rho_l + 2), \ldots\}$, let $R_{\rho_1 \sigma_1 \ldots \rho_{l-1} \sigma_{l-1} i} = \emptyset$.
[This satisfies definition 4.b of THEOREM II.]
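On sampled data, the level-$l$ residual update (103A) and the completion test are the only operations needed to decide whether a branch must be extended further. The following is a minimal sketch under the assumption that the level-$l$ interpolants are available as Python callables `h` and `g` (standing for $h_{\rho_1\sigma_1\ldots\rho_l\sigma_l}$ and $g_{\rho_1\sigma_1\ldots\rho_l\sigma_l}$ as built by CONSTRUCTIONS I to IV); the function name and argument layout are assumptions for illustration.

```python
import numpy as np

def extend_branch(prev_residual, prev_h_vals, X, h, g, j1, j2, eps_max):
    """Update a branch residual via equation (103A) and test completion.

    prev_residual[k] : f_{rho1 sigma1 ... rho_{l-1} sigma_{l-1}} at sample X[k]
    prev_h_vals[k]   : h_{rho1 sigma1 ... rho_{l-1} sigma_{l-1}} at sample X[k]
    """
    h_vals = np.array([h(hp, x[j1]) for hp, x in zip(prev_h_vals, X)])
    g_vals = np.array([g(hv, x[j2]) for hv, x in zip(h_vals, X)])
    residual = prev_residual - g_vals                 # equation (103A)
    complete = np.max(np.abs(residual)) <= eps_max    # branch need not be extended
    return residual, h_vals, complete
```

At level 1 the same update applies with `prev_residual` replaced by samples of $f$ itself and `h(x[0], x[1])` in place of `h(hp, x[j1])`, per equation (91A).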
STEP 8. IF $\exists \rho_1, \sigma_1, \ldots, \rho_l, \sigma_l \in \{1, 2, 3, \ldots\}$ such that, $\forall (x_0, \ldots, x_{N-1}) \in$ tree branch $\rho_1 \sigma_1 \ldots \rho_l \sigma_l$,
[The goal of this step is to search out branches $\rho_1, \sigma_1, \ldots, \rho_l, \sigma_l$ which still have an approximation error which is greater than the maximum allowable error.]

a. $$\max_{(x_0, \ldots, x_{N-1}) \in \mathrm{Dom}_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}} |f_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}(x_0, \ldots, x_{N-1})| > \epsilon_{max}, \tag{104A}$$

where the domain $\mathrm{Dom}_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l}$ is defined in STEP 3.iv and STEP 7.iii.4 of this CONSTRUCTION, and,

b. $R_{\rho_1 \sigma_1 \ldots \rho_l} \neq \emptyset$ and $D_{\rho_1 \sigma_1 \ldots \rho_l \sigma_l} \neq \emptyset$,

THEN EXECUTE the following steps:

i. Let $l = l + 1$.
[This initiates the construction of level $l$ of the approximation tree.]

ii. GOTO STEP 6.

ELSE END CONSTRUCTION V.
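Once CONSTRUCTION V terminates, equations (91A) and (103A) telescope: along any branch, $f$ equals the sum of the $g$-terms added at each level plus a final residual of magnitude at most $\epsilon_{max}$. The sketch below shows how such a finished branch would be evaluated at a point; the list-of-tuples representation of a branch is an assumption made for illustration, not the thesis's data structure.

```python
def evaluate_branch(x, levels):
    """Evaluate the approximation of f(x) carried by one branch of the tree.

    levels : list of (h, g, j1, j2) tuples, one per level, where h and g are
             the 2-D interpolants of that level and (j1, j2) are its active axes.
    """
    total, h_prev = 0.0, None
    for level, (h, g, j1, j2) in enumerate(levels, start=1):
        if level == 1:
            h_val = h(x[0], x[1])        # h_{rho1 sigma1}(x0, x1)
            total += g(h_val, x[2])      # g-term of equation (91A)
        else:
            h_val = h(h_prev, x[j1])     # h takes the previous level's output
            total += g(h_val, x[j2])     # g-term of equation (103A)
        h_prev = h_val
    return total
```

Evaluating a deeper level requires only the previous level's $h$ value and two coordinates, which is what keeps every fitted block 2-dimensional regardless of $N$.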
