UBC Theses and Dissertations
Genetic algorithm for feature selection and weighting for off-line character recognition
Hussein, Faten T.
Abstract
Computer-based pattern recognition is a process that involves several sub-processes, including pre-processing, feature extraction, classification, and post-processing. This thesis is concerned with the feature selection and feature weighting processes. Feature extraction is the measurement of certain attributes of the target pattern. Classification uses the values of these attributes to assign a class to the input pattern. In our view, selecting and weighting the right set of features is the hardest part of building a pattern recognition system. The ultimate aim of our research is to automate feature selection and weighting within the context of character/symbol recognition systems. Our chosen optimization method for feature selection and weighting is the genetic algorithm approach. Feature weighting is the general case of feature selection, and hence it is expected to perform at least as well as feature selection. The initial purpose of this study was to test the validity of this hypothesis within the context of character recognition systems, using genetic algorithms. However, our results show that this hypothesis does not hold. We carried out two sets of experimental studies. The first set compares the performance of Genetic Algorithm (GA)-based feature selection to GA-based feature weighting under various circumstances. The second set evaluates the better method (which turned out to be feature selection) in terms of the optimality of its results and its computation time. The results of these studies also show that (a) in the presence of redundant or irrelevant features, feature set selection prior to classification is important for k-nearest neighbor classifiers; and (b) GA is an effective method for feature selection, with performance comparable to that of exhaustive search. However, the scalability of GA to high-dimensional problems, although far superior to that of exhaustive search, is still an open problem.
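To make the relationship between the two methods concrete, here is a minimal Python sketch. It is not the thesis's implementation. It assumes a k-nearest neighbor classifier with a weighted squared Euclidean distance, so a binary mask (selection) is simply a weight vector restricted to 0 and 1. The toy data, the function names (`knn_accuracy`, `make_data`, `ga_select`), and all GA settings (population size, tournament selection, one-point crossover, bit-flip mutation, elitism) are illustrative assumptions and do not come from the thesis.

```python
import random

def weighted_sq_dist(a, b, weights):
    # Weighted squared Euclidean distance; a weight of 0 drops that feature.
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def knn_accuracy(weights, train, valid, k=3):
    # Fraction of validation samples classified correctly by a k-NN majority vote.
    correct = 0
    for x, label in valid:
        nearest = sorted(train, key=lambda s: weighted_sq_dist(x, s[0], weights))[:k]
        votes = [lbl for _, lbl in nearest]
        correct += max(set(votes), key=votes.count) == label
    return correct / len(valid)

def make_data(rng, n, n_noise):
    # Hypothetical toy data: feature 0 carries the class signal, the rest are noise.
    data = []
    for _ in range(n):
        label = rng.randrange(2)
        features = [label + rng.gauss(0, 0.3)] + [rng.gauss(0, 1.0) for _ in range(n_noise)]
        data.append((features, label))
    return data

def ga_select(train, valid, n_features, rng, pop_size=12, generations=15, p_mut=0.1):
    # Simple GA over binary masks; fitness is k-NN accuracy on the validation set.
    def fitness(mask):
        return knn_accuracy(mask, train, valid)

    # Seed with the all-features mask; with elitism the result is never worse than it.
    pop = [[1] * n_features]
    pop += [[rng.randrange(2) for _ in range(n_features)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        next_pop = [max(pop, key=fitness)]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 2), key=fitness)  # binary tournament
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, n_features)         # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    return best, fitness(best)

rng = random.Random(0)
train = make_data(rng, 40, n_noise=5)
valid = make_data(rng, 20, n_noise=5)
all_features = [1] * 6
best_mask, best_acc = ga_select(train, valid, 6, rng)
print("selected mask:", best_mask)
print("accuracy with mask:", best_acc, "with all features:", knn_accuracy(all_features, train, valid))
```

A GA for feature weighting would use the same loop with real-valued genes in place of bits, which enlarges the search space considerably. Scoring on a single validation split, as done here, is a simplification chosen to keep the sketch short.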
Item Metadata
Title |
Genetic algorithm for feature selection and weighting for off-line character recognition
|
Creator |
Hussein, Faten T.
|
Publisher |
University of British Columbia
|
Date Issued |
2002
|
Extent |
5722150 bytes
|
File Format |
application/pdf
|
Language |
eng
|
Date Available |
2009-08-12
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
|
DOI |
10.14288/1.0065127
|
Degree Grantor |
University of British Columbia
|
Graduation Date |
2002-05
|
Scholarly Level |
Graduate
|
Aggregated Source Repository |
DSpace
|