Open Collections
UBC Theses and Dissertations
An expert system for the recognition of general symbols
Ahmed, Maher M.
Abstract
This thesis addresses the problem of automatically recognizing any handwritten symbol. The number of different handwriting styles illustrates the difficulties that an automatic recognizer must cope with. For example, some handwritten styles of the capital letter A are: [picture]
The two main technical approaches to pattern recognition are statistical pattern recognition and structural pattern recognition.
Statistical pattern recognition systems use pixel-level features for recognition, such as moments, histograms, Fourier transforms, and the percentage of ink pixels in different zones. Although statistical techniques, including Artificial Neural Networks (ANNs), achieve high recognition rates, they have disadvantages: they require a very large amount of training data, and they cannot justify their answers. In addition, their output is only a classification, not a description of the actual pattern.
In contrast, structural pattern recognition techniques extract commonly used descriptions of the patterns (structural features), such as loops, end points, and arcs. After extracting these features, the system finds the common relationships among them.
In this research, the structural approach was used to develop an expert system that extracts structural features (descriptions) from the symbol at each stage of recognition. The system automatically recognizes handwritten symbols, assuming that the symbols are isolated. It is unique in that it is not limited to a specific application: it can be used to recognize any general symbol of any language.
To obtain a representation of a symbol, the system performs four basic steps. First, it rotates the symbol around its central point until its principal axis either aligns with the vertical axis or lies at a multiple of 20° to it. Second, it scales the symbol to a predefined size. Third, it thins the symbol using a novel rule-based thinning system developed in this research; the resulting thinned image consists of the central lines of the image. Finally, it extracts and describes the thinned symbol in terms of strokes, which are approximated by sets of line segments.
The resulting representation is compared with the stored models of the different symbols in the system's knowledge base; many models are stored for each symbol. The results depend on a threshold. Using a low threshold decreases the space accepted for each symbol, which increases both the rejection rate and the recognition rate.
The system was tested with 5726 handwritten English characters, all taken from the test data (binary data) of the Center of Excellence for Document Analysis and Recognition (CEDAR) database. When the system had learned an average of 97 models per symbol and used a low threshold, the recognition rate was 95% and the rejection rate was 16.1%. With a threshold of 100, the recognition rate was 87.6% and the rejection rate was 0%.
The recognition rate can be increased by storing more models for each symbol or by increasing the rejection rate. The system can learn new symbols simply by adding models for them to the knowledge base. The system is implemented in C++ and runs on a 120 MHz Pentium PC.
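
The threshold trade-off reported in the abstract can be illustrated with a minimal sketch of nearest-model matching with a reject option. This is not the thesis's implementation (which is in C++ and matches stroke descriptions). Everything here is an assumption made for illustration: the two-number feature vectors, the Euclidean distance, the function names `classify` and `rates`, the toy data, and the choice to measure the recognition rate over accepted samples only. The abstract does not define the threshold scale, so the threshold below is simply a maximum allowed distance to the best model.

```python
from math import dist

def classify(features, models, threshold):
    """Return the label of the nearest stored model, or None to reject.

    `models` is a list of (label, feature_vector) pairs. The sample is
    rejected when even the best model is farther away than `threshold`.
    """
    best_label, best_distance = None, float("inf")
    for label, model in models:
        d = dist(features, model)  # Euclidean distance (an assumption)
        if d < best_distance:
            best_label, best_distance = label, d
    return best_label if best_distance <= threshold else None

def rates(samples, models, threshold):
    """Return (recognition_rate, rejection_rate) as fractions.

    Assumption: the recognition rate is computed over accepted samples
    only, and the rejection rate over all samples.
    """
    accepted = correct = 0
    for features, true_label in samples:
        predicted = classify(features, models, threshold)
        if predicted is None:
            continue  # rejected sample
        accepted += 1
        correct += predicted == true_label
    rejected = len(samples) - accepted
    recognition = correct / accepted if accepted else 0.0
    return recognition, rejected / len(samples)

# Hypothetical toy data: one stored model per symbol, four test samples.
MODELS = [("A", (0.0, 0.0)), ("B", (10.0, 0.0))]
SAMPLES = [
    ((0.5, 0.0), "A"),
    ((9.5, 0.0), "B"),
    ((4.0, 0.0), "B"),  # far from both models, and nearer to the wrong one
    ((1.0, 0.0), "A"),
]

if __name__ == "__main__":
    for threshold in (2.0, 100.0):
        recognition, rejection = rates(SAMPLES, MODELS, threshold)
        print(f"threshold={threshold}: recognition={recognition:.2f}, "
              f"rejection={rejection:.2f}")
```

With the strict threshold, the ambiguous third sample is rejected, so every accepted sample is classified correctly. With the loose threshold nothing is rejected, but the ambiguous sample is accepted and misclassified. This has the same shape as the reported results: a higher recognition rate at the cost of a higher rejection rate.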
Item Metadata
Title | An expert system for the recognition of general symbols
Creator | Ahmed, Maher M.
Publisher | University of British Columbia
Date Issued | 1999
Extent | 7618527 bytes
Genre |
Type |
File Format | application/pdf
Language | eng
Date Available | 2009-07-02
Provider | Vancouver : University of British Columbia Library
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI | 10.14288/1.0065333
URI |
Degree |
Program |
Affiliation |
Degree Grantor | University of British Columbia
Graduation Date | 1999-11
Campus |
Scholarly Level | Graduate
Aggregated Source Repository | DSpace