UBC Theses and Dissertations

Schema labelling applied to hand-printed Chinese character recognition. Bult, Timothy Paul, 1987.

Full Text
SCHEMA LABELLING APPLIED TO HAND-PRINTED CHINESE CHARACTER RECOGNITION

by

TIMOTHY PAUL BULT

Maîtrise, Université de Grenoble, 1981

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, DEPARTMENT OF COMPUTER SCIENCE

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
April, 1987
© Timothy Bult, 1987

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Computer Science
The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3

Abstract

Hand-printed Chinese character recognition presents an interesting problem for Artificial Intelligence research. Input data in the form of arrays of pixel values cannot be directly mapped to unique character identifications because of the complexity of the characters. Thus, intermediate data structures are necessary, which in turn lead to a need to represent knowledge of the characters' composition. Building the intermediate constructs for these hand-printed characters necessarily involves choices among ambiguities, the set of which is so large that an efficient search algorithm becomes central to the recognition process.

Schema labelling is a theory of how knowledge should be organized for recognition tasks in which composition structure is inherent in the domain, the composition entails ambiguity, and the ambiguity generates large search spaces. This thesis describes an implementation of an enhanced version of schema labelling for Chinese characters. The specific problems addressed by the enhancements, with some success, are (i) the segmentation of real images into objects usable by the schema system, (ii) the definition of schemas which adequately describe the generic composition of hand-printed Chinese characters, as well as common variations or vagaries, and (iii) the inclusion of sufficient "control knowledge" to prevent combinatorial explosion of the backtracking recognition process.

Test characters for recognition systems can be classified along several dimensions. On the spectrum from type-set, through hand-printed, to hand-written forms, our system was tested on restricted hand-print, at a level somewhat more difficult than is normally attempted. On the spectrum of input types, from grey-scale pixel input through on-line stroke representations, our system was fully tested only at the high end, with complete synthetic strokes. We obtained a success rate of 57%, 12 out of the 21 characters tested. The principal successes of the work are that characters of the complexity tested could be recognized at all, and the impact schema labelling techniques had on that recognition.
Table of Contents

Abstract
Table of Contents
List of Figures
Acknowledgements
Chapter 1 — INTRODUCTION: Knowledge Representation and Chinese Characters
  1.1 Why Apply Schema Labelling To Chinese Characters?
Chapter 2 — BACKGROUND: Knowledge Representation for Chinese Characters
  2.1 Character Recognition
    2.1.1 Binary Image Template Matching (Casey and Nagy, 1966)
    2.1.2 Parametric Templates (Yamamoto et al., 1981)
    2.1.3 Syntactic Construction (Stallings, 1965)
    2.1.4 A Brief Summary of Character Recognition
  2.2 A Scenic Detour into the Constraints Problem
    2.2.1 One Binary Constraint (Huffman, 1971; Clowes, 1971)
    2.2.2 Development of Network Consistency Algorithms
    2.2.3 Full Constraint Lattice (Freuder, 1978)
    2.2.4 Hierarchy and Hypotheticality (Brooks, 1981)
  2.3 Knowledge Representation Applied to Character Recognition
    2.3.1 Hierarchical Relaxation (Hayes, 1980)
    2.3.2 Frames for Fortran (Brady and Wielinga, 1978)
  2.4 Schema Labelling (Havens, 1985)
Chapter 3 — SCHEMAS FOR CHINESE CHARACTERS
  3.1 Goal
  3.2 Overview of the System
  3.3 Knowledge of Chinese Character Composition
  3.4 Sample Session: The Character 'Yin'
    3.4.1 Input — the image processing
    3.4.2 Making new instances and variables
  3.5 From a Character on Paper to Schema Instances
    3.5.1 Image Processing
    3.5.2 Schema Instance Extensions
    3.5.3 Schema Class Extensions
    3.5.4 Control of the Backtracking Search
Chapter 4 — RESULTS: Success of Implementation and Application
  4.1 Ease of Coding of Chinese Character Specifications
  4.2 Success of Character Recognition
Chapter 5 — CONCLUSION
  5.1 Summary
  5.2 Suggested Future Work
Bibliography
Appendix A: Knowledge Representation Code
Appendix B: Printouts of Sample Instances

List of Figures

Figure 1. Examples of Hand-Printed Chinese Characters
Figure 2. Character Histograms (from Yamamoto et al., 1981)
Figure 3. Recognition by Syntactic Construction
Figure 4. Stallings' Segmentation of Strokes
Figure 5. Node Segmentation (from Stallings, 1966)
Figure 6. Overview of System Operation
Figure 7. Composition Hierarchy
Figure 8. Valid 'Groups' of Strokes (from Rankin, 1965)
Figure 9. Different Strokes
Figure 10. A 3-Bar Stroke with Bar-ends
Figure 11. The Segmentation
Figure 12. A 'Variable' Link Between 2 Instances
Figure 13. Top of Final Network for 'Yin'
Figure 14. A Digitized Character Image
Figure 15. Contours Found by Edge Detector
Figure 16. Smoothed Contours of a Character
Figure 17. Segmented Contours of a Character
Figure 18. The Labelset and Variables of a Bar Instance
Figure 19. Some Variables of the Bar Schema Class
Figure 20. A Composition Rule for the Stroke Schema Class
Figure 21. A Constraint on the Group Schema Class
Figure 22. Linearity of Time and Space versus Input Variables
Figure 23. Erratic Script Style of Chinese Writing (from Wang, 1970)
Figure 24. Printout of the Stroke Schema Class
Figure 25. Printout of a Sample Instance
Acknowledgements

My thanks go to the other graduate students of UBC Computer Science, for hours of ideas and arguments; to Theresa Fong, Lindsey Wey, and Carol Whitehead for years of help in the office; to the Computer Science Department, the University of British Columbia, and the Natural Sciences and Engineering Research Council for money; to the UBC Laboratory for Computational Vision and Bell-Northern Research for hardware, software, time, and space; to my brothers of the Sigma Chi Fraternity for where, when, how, and why to study; to Tai Chi master Steve Malliaris for the other sides of life; to my parents for motivation; to Gloria for the final impetus; to Robin McGillveray for running around; to Mark Turchan, Daniel Zlatin, Sameh Rabie, and Dick Peacocke for critical comments; and especially to Bill Havens, for teaching me how to do research, for ideas, for meta-ideas, and for patience.

Timothy Bult

Chapter 1 — INTRODUCTION: Knowledge Representation and Chinese Characters

In the pursuit of learning, every day something is acquired.
In the pursuit of the Tao, every day something is dropped.
Less and less is done
Until non-action is achieved.
When nothing is done, nothing is left undone.
The world is ruled by letting things take their course.
It cannot be ruled by interfering.

Lao Zi, from the Dào Dé Jīng

The goal of this thesis was to investigate the utility of schema labelling theory as a knowledge representation tool for recognition of hand-printed Chinese characters. Schema labelling is appropriate to the problem because the complexity and variability of Chinese hand-print suggest a "knowledge-based" approach. Relevant concerns therefore include both the understanding of the natural composition structure of the input data, and intelligent control of the search for a correct interpretation of it. These are the keystones of schema labelling theory.

This thesis addresses "the recognition problem": the ability to identify an element of a group of things as being of that group, including an interpretation of what distinguishes it from the other members. For example, knowledge of the letter "t" should enable identification of the first letter of this word as a "t", be the word type-set, typed, printed, written, or scrawled. Identification of a group's elements characterizes many problems in computational vision, natural language processing, and possibly even planning, where one must recognize what the world state is, and then recognize what actions are appropriate. The recognition problem is therefore an important application for knowledge representation theories.

In the schema-based paradigm that this thesis adopts (see Havens, 1985), the objects that the system recognizes are represented by instances of schema classes, where a schema class represents a group of similar objects. Objects in any world are related geometrically, and by other characteristics, to other objects; similarly, schema instances are linked together to represent these relations. This leads to a representation of knowledge as a set of classes and instances, where the classes are specifications for all the possible instances and their relations. Schema classes must provide a model for generating new instances based on the cues given by the existing instances. For example, instances for coat hangers, various clothing, and a confined space, appropriately related, should suggest a "closet" instance.
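To make the cue idea concrete, here is a minimal sketch in Python (ours, not part of the thesis; the class names and the cue table are hypothetical): a table maps sets of observed component classes to a parent class worth hypothesizing.

# Illustrative sketch only: a cue table maps a set of observed component
# classes to a parent schema class worth hypothesizing. All names here
# are hypothetical.
CUE_TABLE = {
    frozenset({"coat-hanger", "clothing", "confined-space"}): "closet",
    frozenset({"stroke", "connection"}): "group",
}

def hypothesize(instances):
    # Collect the classes of the instances seen so far, then return every
    # parent class whose cue set is covered by them.
    observed = frozenset(inst["class"] for inst in instances)
    return [parent for cue, parent in CUE_TABLE.items() if cue <= observed]

scene = [{"class": "coat-hanger"}, {"class": "clothing"}, {"class": "confined-space"}]
print(hypothesize(scene))  # -> ['closet']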
Classes must also detail how to determine the relations possible between instances (Dahl and Nygaard, 1976; Minsky, 1975; Goldberg and Robson, 1983).

The first subgoal of the recognition process, in this paradigm, is to create instances representing the direct but relatively low-level perceptions or "cues" given by the sensory mechanism. In this thesis, we have applied the Marr-Hildreth (1979) theory of image-discontinuity detection, and contour-representation work by Mackworth and Mokhtarian (1984), to build low-level schema instances for the rest of the recognition process to use. Given this set of instances as input, the second subgoal of the recognition process is to create a set of instances and relations which adequately describe the input, in the terms provided by the schema classes. This description is a network of instances, ranging from the input data, through various intermediate objects, up to high-level structures such as the Chinese characters of our application.

We have developed a complete framework of schema classes for Chinese characters, based on an analysis of linguistic studies, and the stroke-level studies of pattern recognition researchers. This specification required extensions to Havens' schema labelling theory, to handle real number variables, to handle complex constraints not feasibly represented by finite sets of label tuples, and to represent knowledge of control of the search process.

Chapter Two of this thesis describes some previous work on the recognition problem, especially as related to Chinese characters and schema-based knowledge representation for computer vision. Chapter Three presents our system, including a description of the extensions to standard schema classes and instances which were developed. (For those interested in the implementation details, Appendices A and B describe the system's data structures and programs more deeply.) Chapter Four presents the results of our work: the entire system was implemented, and several test cases run, though not with raw data. The success rate on synthetic data was 57%, the failures all being rejected for exceeding memory or processing time limits. There were no misrecognized characters, and the complexity of characters attempted was higher than that of the type-set characters usually seen in character recognition research. Chapter Five presents a summary of the research, and some ideas for further, related work.

1.1 Why Apply Schema Labelling To Chinese Characters?

Chinese characters are the most complex set of written symbols in use today, and they record the language of a quarter of the world's population. There exist about 50,000 characters in the language, though ordinary use requires fewer than 5,000. Like the letters of the English alphabet, Chinese characters are usually made by putting strokes of ink on a piece of paper, crossing and touching and aligning them in patterns of size, direction, and position. The characters are geometric arrangements of one or two "components", often called "radicals". Each component is either a "group" or else a pair of sub-components in one of four simple configurations. Each "group" is a pattern of connected strokes of about the same complexity as English letters (see Rankin, 1965, and Chapter Three of this thesis for further explanation). Figure 1 gives some typical examples of hand-printed Chinese characters.

Figure 1. Examples of Hand-Printed Chinese Characters
Irregularity in the size, shape, angles, connections, and positions of strokes can make the characters quite difficult to read — so difficult that nobody has yet written a computer program capable of reading typical hand-written Chinese at all, though some researchers have achieved rates of 90% recognition on heavily restricted hand-printed Chinese (e.g., Yamamoto et al., 1981).

The lack of success with less restricted, hand-printed Chinese characters using conventional pattern recognition techniques is one reason for attempting them, as a challenge to schema labelling theory. Also, as a problem domain, they offer interesting natural obstacles of ambiguity, complexity, size of search space, and a hierarchy of composition. Though not a "toy world", they do avoid certain problems of complete vision, such as perception of depth, colour, shading, and light sources, and three-dimensional modeling. Thus hand-printed Chinese character recognition provides a good research domain for Artificial Intelligence, isolating some real problems from the baffling immensity of all the problems. Also, an automatic Chinese reader might well be commercially valuable: governments need to process Chinese documents (IBM, 1962), computer manufacturers and users need input mechanisms that do not require hundreds of push-keys (Trotter, 1981; Yuen, 1983), and a technology capable of reading Chinese would probably apply very well to simpler character sets like our own, with subsequent benefits for postal automation, data entry, and office systems.

Chapter 2 — BACKGROUND: Knowledge Representation for Chinese Characters

Why is the sea king of a hundred streams?
Because it lies below them.
Therefore it is the king of a hundred streams.
If the sage would guide the people, he must serve with humility.
If he would lead them, he must follow behind.
In this way when the sage rules, the people will not feel oppressed;
When he stands before them, they will not be harmed.
The whole world will support him and will not tire of him.
Because he does not compete,
He does not meet competition.

Lao Zi, from the Dào Dé Jīng

This chapter presents a history of the increasing sophistication and generalization in knowledge representation strategies applicable to Chinese character recognition. Because we are concerned with this concrete problem, we do not attempt to cover the entire field of work in knowledge representation, much of which would not directly apply. Because we are concerned with knowledge representation, we ignore most of the commercial character recognition implementations, which recognize certain character types quickly and consistently but add little to theoretical understanding. The content and ordering of the review is roughly chronological, moving from primitive character-recognizing machines that hold very little "knowledge", through the development of constraint-based, object-centered knowledge representations, and finally to a few powerful knowledge-based systems applied to complex character recognition.

2.1 Character Recognition

2.1.1 Binary Image Template Matching (Casey and Nagy, 1966)

Casey and Nagy wrote the first computer program for reading Chinese characters, published in 1966. By then, a science of statistical pattern recognition was becoming well-established, with sophisticated algorithms for identifying a mildly distorted image with one of a dictionary of templates.
Casey and Nagy applied some of these algorithms, writing a program that compared the character it was trying to read, in a 25x25 binary array format, with masks of 40 selected points, each representing a different character in a lexicon of 1000 characters, until a mask matched the input to above some threshold. A first stage classified the character as one of 64 groups, using the same masking technique. While any rotation or stretching of the input characters would disable the method, it allowed some translation error, by fitting the masks at 9 different offsets (each one "bit" away from "perfect registration"), taking the best of these matches as the score for the mask. The 40 points for each mask were selected by a program which guaranteed, by exhaustive search through a "training" set, that the matches to given standard training characters were good enough for the corresponding masks and bad for all the other masks.

The algorithm's time complexity was linear in the size of the lexicon, hence fast hardware could guarantee a minimum reading speed independent of the input characters' complexity. Unfortunately, it made some irrecoverable errors, and rejected some characters as unreadable; these problems totalled about 8% of the reads attempted during experimentation. The cause of the failures was the minor deviation of the input characters from the standards: variations in stroke width, clipped corners, and sometimes incorrect masks. For type-set characters from a specified font, one could reduce these error and reject rates with better quantization of input and more careful mask selection.

The knowledge representation strategy employed by Casey and Nagy was quite oblique and minimal, making the process applicable only to single, completely specified Chinese character type fonts, ignorant of their complex structure, and dependent on high quality input. While providing a commercial arrow to follow towards efficient automatic readers of typed or typeset Chinese (see next section), the method is not applicable to handwritten characters.

2.1.2 Parametric Templates (Yamamoto et al., 1981)

Statistical pattern-matching character recognition techniques developed into a powerful tool for all except hand-written characters. Recently, Yamamoto's system achieved over 90% recognition of very neatly hand-printed Chinese characters. The program parameterizes its input as "feature vectors" and compares these against a dictionary formed by averaging the feature vectors of training sets. The features used are all histograms, along various axes of the character, of various simple measurements made perpendicular to the histogram axis: the number of pixels on strokes in the direction of the axis, as a measure of stroke directions; the amount of white space pinched between the strokes crossing the axis; and the number of crossing strokes.

Figure 2 (taken from Yamamoto et al., 1981) shows the histograms produced for these measures for a sample character. On the left is the outline of an input character, assumed to be represented as an array of black and white pixels, about 40 rows by 40 columns. In the center are three pictures showing the determination of the three types of histogram: counts of black pixels in four directions, measures of "pinched" white pixels in three directions, and counts of crossing strokes in four directions. On the right is a sample match of such a histogram to a model from the program's dictionary.

Figure 2. Character Histograms (from Yamamoto et al., 1981)
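As a concrete illustration of two of these three feature types, consider the following Python sketch (ours, not Yamamoto's code; the representation of the input as rows of 0/1 pixels is an assumption):

def row_histograms(image):
    # For each row of a binary image (1 = ink), compute two of Yamamoto's
    # feature types: the count of black pixels, and the count of crossing
    # strokes (runs of black pixels). The "pinched white space" measure
    # is omitted from this sketch.
    black_counts, crossing_counts = [], []
    for row in image:
        black_counts.append(sum(row))
        # A crossing stroke shows up as a white-to-black transition.
        shifted = [0] + row[:-1]
        crossing_counts.append(sum(1 for prev, cur in zip(shifted, row)
                                   if prev == 0 and cur == 1))
    return black_counts, crossing_counts

image = [[0, 1, 1, 0, 0, 1, 0],
         [0, 1, 0, 0, 0, 1, 0],
         [1, 1, 1, 1, 1, 1, 1]]
print(row_histograms(image))  # -> ([3, 2, 7], [2, 2, 1])

The same counts taken down the columns, and along diagonals, give the histograms in the other directions.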
The method allows some stretching of characters, irregularity of stroke width and location, and even some holes or glitches. However, it still views characters as variations on image templates, and founders on varied styles or large variation of simple visual parameters like size or rotation. We view the problem as a lack of appropriate knowledge representation; the program still works in the image domain rather than in the scene domain (Mackworth and Havens, 1983), with image features rather than natural composition structures. Human beings often express their knowledge of characters in terms of a hierarchy of the component parts of characters and the appearance of their component strokes; they describe strokes as geometrical objects or productions of mechanical actions; only the bottom-level visual objects in this hierarchy, such as the edges between black ink and background white, are related directly to input images. Hierarchical, scene domain knowledge should yield a more natural, and hence more powerful, automatic character recognition.

2.1.3 Syntactic Construction (Stallings, 1965)

William Stallings wrote a program for reading machine-printed Chinese through the construction of a data structure to represent the input character. Stallings' program views character components as graphs, with the nodes being the connections between strokes, and the arcs being the segments of strokes between connections. The geometric relations between components of a character form a binary tree, with the component-graphs as terminal nodes, as in Figure 3.

Figure 3. Recognition by Syntactic Construction: a sample input (our writing, but conforming to the requirements of Stallings' program) and the graph built by the program (stroke segments quantized in 8 directions, nodes labelled according to number of connections)

Construction of this graph is a major part of Stallings' program. It is done by tracking opposite sides of strokes, in search of the stroke connections which form the nodes of the graphs representing components. As shown in Figure 4, a stroke will usually widen at its connection with another stroke; the node is noticed by his program when this widening surpasses a specified threshold. Complete description of a node is made by further tracking to link the super-threshold crossings together, as shown in Figure 5.

Figure 4. Stallings' Segmentation of Strokes

Figure 5. Node Segmentation (from Stallings, 1966)

The program achieved over 90% recognition of machine-printed characters of various fonts; the great majority of errors arose from incorrect nodes: confusions where two nodes were practically adjacent, or missing nodes where strokes "should have" but did not quite meet. Some ambiguities arose in that different characters can have the same graph, because lengths and angles of strokes impart distinctions transparent to the graph structuring.

Stallings did represent some natural knowledge of the characters in his program, by creating data structures directly representing the connections between strokes and the components they form. Missing from his system were structural descriptions of the strokes themselves, and the ability to recover from errors of segmentation, wrong choices made by the low-level tracking routines; this made the process unresponsive to the ambiguities of handwriting and poor digitization.
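The widening test can be sketched as follows (our illustration, not Stallings' code; we assume the width of the stroke has already been measured at sample points while tracking its two sides):

def junction_nodes(widths, nominal, ratio=1.5):
    # Flag positions where the tracked stroke width exceeds the nominal
    # width by a given ratio; in Stallings' scheme such super-threshold
    # widenings mark candidate connections with other strokes. The ratio
    # is an assumed tuning parameter.
    threshold = ratio * nominal
    nodes, inside = [], False
    for i, w in enumerate(widths):
        if w > threshold and not inside:
            nodes.append(i)  # record only the start of each widening
            inside = True
        elif w <= threshold:
            inside = False
    return nodes

profile = [4, 4, 5, 9, 10, 9, 4, 4, 11, 4]
print(junction_nodes(profile, nominal=4))  # -> [3, 8]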
Hierarchical description and recognition of characters may be made more powerful by iteratively hypothesizing instances at higher levels, maintaining consistency between the hypotheses, and retracting them if they turn out to be wrong. Attempts to represent knowledge in this flexible way awaited the Artificial Intelligence (AI) research of the 1970's.

2.1.4 A Brief Summary of Character Recognition

Character recognition was until recently a problem confined largely to classical pattern recognition research. For a detailed history, see one of the excellent surveys and analyses (Stallings, 1976, or Mori et al., 1984, for Chinese characters; Suen et al., 1980, or Ullman, 1982, for general character recognition). It began with direct matching of templates to images, restricting the application to single, rather clean fonts. Researchers later developed many parameterizations of input images that could be matched against more sophisticated templates, as in Yamamoto's work (see Section 2.1.2). The emphasis of this kind of research has usually been to find a feature or small set of features which a low-level algorithm, usually embodied in a piece of digital hardware, can extract very quickly from low-resolution input images, typically 25x25 pixels per character, and which distinguishes all the characters of the language from each other. Statistical analysis of feature matches has yielded up to 99.99% recognition of neatly hand-printed Chinese characters (Mori et al., 1984), therefore such features seem indeed to exist for type-set, typed, and very neatly blocked and printed characters. They do, however, elude researchers of handwriting or general hand-print recognition.

A "descriptive" approach began in parallel with that of statistical pattern recognition (Mermelstein and Eden, 1964), attempting to formalize recognition of strokes, letters, and words on the basis of their structure, rather than on statistical properties. Stallings' work (1965) seems to have been the only attempt to apply these "syntactic methods" to Chinese characters before 1981, but the approach has now become common in Pattern Recognition proceedings. Unfortunately, the algorithms generally have no means of detecting their own errors, hence cannot backtrack to recover from mistaken descriptions. Building a syntactic description of an input character image should involve knowledge-directed search through the many possible descriptions that might be built. Thresholds and irrevocable heuristic guesses may either miss valid though improbable search paths, or leave too large a search space for any computer to scan by blind backtracking:

Statistical, Boolean, syntactic, and other methods have achieved worthwhile, but limited success in character recognition. Even for a specialized area such as hand-print recognition, there is no generally accepted opinion as to which method is best. Continued interest in using a mixture of different methods suggests that it is possible to obtain better results than any single known method gives. The fact that commercial use of character recognition is still very limited, despite colossal and indeed exploding requirements for data capture, suggests that available technology has cost-effectiveness problems. A few years ago, these problems might have been blamed at least partly on the cost of circuitry, but microelectronic progress has tended to reduce this cost greatly.
A more plausible view is that in the area of character recognition some vital computational principles have not yet been discovered, or at least, have not yet been fully mastered. If this view is correct, then research into basic principles is still needed in this area. (Ullman, 1982)

The control and data structures developed by Artificial Intelligence research may lead to the missing abilities.

2.2 A Scenic Detour into the Constraints Problem

Many AI problems essentially consist of finding values for a set of variables from given domains, subject to various constraints between the variables (see Nudel, 1983, for a review of this paradigm). Generally, the variables and their domains form the inputs to a consistency algorithm, and the constraints represent the knowledge being applied to solve the problem. Often the domains are finite, so the constraints can be represented by simple sets of legal tuples of domain elements. In computational vision, the domain elements are sometimes called "labels" and constraint satisfaction is seen as a "labelling" problem. Various special cases of the constraints problem characterize much of the AI research conducted during the 1970's. As this thesis uses a consistency maintenance theory adapted from such a constraint system (see Section 2.4), we present some notes on the history of constraint satisfaction algorithms.

2.2.1 One Binary Constraint (Huffman, 1971; Clowes, 1971)

Huffman and Clowes wrote similar programs which analyzed line drawings of groups of blocks to determine some of the three-dimensional arrangements implied by the lines arranged on the two-dimensional paper. Their "variables" were the junctions, or line intersections. Their domains, compiled for each of a set of easily distinguishable junction types, were labels for the different three-dimensional corners that can project onto the observed junction in the image plane. The different junction labels were derived from labels for their connected lines: + marks a convex edge, - a concave edge, and > an edge occluding an area around to the right of the "arrow" represented by the '>' symbol. For example, a forked junction could be labelled + + +, meaning that all three lines at the junction come from convex edges. This is the junction you see if you look at a corner of a box which is pointing at you, with all three adjacent faces of the box visible. Several other label combinations are possible for a fork junction, mixing concave and occluding edge labels. Each junction label implies the identification of the adjoined lines as concave, convex, or occluding edges. Final assignment of values to all the variables, from these domains, follows a single binary constraint: that the two junctions terminating each line both give the line the same interpretation. Huffman's algorithm for finding labellings consistent with this constraint was exhaustive search of the cross product of the domains, which worked satisfactorily for the few variables and small domains considered.

Huffman and Clowes' work was a first step toward constraint-based vision systems. In such a system, segmentation routines elicit ambiguous features, or variables, from raw sensory input; constraints representing knowledge of the world and the mapping from images to that world are applied to the variables by an efficient algorithm, eventually yielding an unambiguous interpretation of the part of the world that had been seen.
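A minimal Python rendering of this formulation (ours, not Huffman's or Clowes' program; the two junction domains are illustrative fragments, not the full catalogues): each variable is a junction, each domain value assigns a label to every line at that junction, and the single binary constraint checks that the two ends of each shared line agree.

from itertools import product

# Each junction's domain: candidate labellings, mapping line name to an
# edge label ('+' convex, '-' concave, '>' occluding). Fragments only.
domains = {
    "J1": [{"a": "+", "b": "+"}, {"a": "-", "b": "+"}],
    "J2": [{"b": "+", "c": "-"}, {"b": "-", "c": "-"}],
}
lines = {"b": ("J1", "J2")}  # lines terminated by two junctions

def consistent(assignment):
    # The single binary constraint: both junctions terminating a line
    # must give the line the same interpretation.
    return all(assignment[j1][line] == assignment[j2][line]
               for line, (j1, j2) in lines.items())

# Exhaustive search of the cross product of the domains, as Huffman did.
names = list(domains)
solutions = [dict(zip(names, choice))
             for choice in product(*domains.values())
             if consistent(dict(zip(names, choice)))]
print(len(solutions))  # -> 2: only labellings agreeing on line 'b' survive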
Some of what remained to be discovered, after Huffman and Clowes' work, was efficient algorithms for constraining the variable values, the use of more complex constraints, and a general algorithm for creating the variables to begin with — the segmentation of the input and the composition of hypotheses.

2.2.2 Development of Network Consistency Algorithms

Waltz (1972) devised a "filtering algorithm" which efficiently reduces the ambiguities of variables subject to binary constraints, later shown by Mackworth and Freuder (1985) to be linear in the number of variables if the graph formed using the variables as nodes and the constraints as arcs is planar. The refined algorithm is also linear in the number of arcs for non-planar graphs. Montanari (1974) analyzed the application of binary constraints to picture processing, presenting an algorithm for transforming certain n-ary constraint problems into binary-constraint networks.

Mackworth developed a new consistency algorithm for n-ary constraints, as part of a model-based recognition system called MAPSEE (1977). The system transforms a conservative segmentation of hand-drawn sketch maps into variables representing the clearly distinguishable lines and regions depicted in the map. Each of the variables initially has a set of possible values, called "labels", which are its possible interpretations as part of the map. For example, a region of the map might be "lake", "land", or "sea". The consistency algorithm reduces the ambiguity of the variables' possible values according to constraints representing knowledge of the conventions involved in such maps. For example, the regions on opposite sides of a line cannot both represent "sea", though one might be "sea" if the other is "land". Ambiguities remaining after the consistency process cue further segmentation of the input, forming a "cycle of perception" (Mackworth, 1978).

2.2.3 Full Constraint Lattice (Freuder, 1978)

Freuder published an algorithm, calling it "k-consistency", which, given a set of variables with finite domains, and a set of relations of any arity over these variables, finds all the tuples of labels over the full set of variables which satisfy all the relations. That is, all the constraints are met for every tuple in the resultant giant relation over all the variables. The algorithm works in time O(nⁿ), where n is the number of variables (Seidel, 1984), assuming a fixed bound on the size of the domains. Freuder notes several refinements that would improve the algorithm's efficiency, including not adding tautological nodes, and different strategies for linking nodes. The algorithm guarantees a correct solution for a very general class of constraint problem, which makes it an important achievement. However, it does not meet all the needs of the recognition process: a structure for the set of variables, to express knowledge of the composition of simple objects into complex ones; knowledge-based methods for creating and possibly destroying variables, to allow hypotheses and reprises; and constraints on very large or infinite variable domains, to include real number measurements.

2.2.4 Hierarchy and Hypotheticality (Brooks, 1981)

Several AI systems (Reddy, 1973; Davis and Henderson, 1979; Mackworth and Havens, 1983) have enriched the representation of the constraints problem by the use of hierarchies of the variables, building the higher levels of these data structures by revocable hypothesis.
Rosenfeld and Zucker (1976) have applied relaxation methods to hierarchical recognition to represent knowledge of likelihood and for more efficient search. Tsotsos (1984) has applied systems involving several orthogonal hierarchies to analysis of time-varying medical information. Systems using hierarchical representations, where the nodes contain variable slots and procedures for filling them, are usually called "frame-based" (Minsky, 1975). Frames are separated into "classes", which hold the constraints and the specifications of how the variables may be composed, and "instances", which are the data structures representing the actual objects being viewed.

Brooks' ACRONYM is one of the most sophisticated and fully implemented frame- and constraint-based vision systems to date. It enables definition of classes of three-dimensional objects based on generalized cylinders, with constraints on subparts being algebraic equalities or inequalities of mathematical expressions including functions like min, max, sin, and cos, as well as standard arithmetic. For each object, there is a hierarchy of constraints specifying the object in degrees of detail; for example, a set of constraints in an airplane object could specialize an instance to be a particular style of passenger craft as opposed to a military bomber, and other constraints could determine the make and model. ACRONYM simplifies the constraint expressions and finds, for each object instance hypothesized, the most specialized set of constraints satisfied. For each of the object classes, ACRONYM predicts possible image features using knowledge of geometrical projection. These predictions are a formalism for producing the ground hypotheses of the process, as matches of the predicted features to the actual data. The matches push the system into the "cycle of perception" (Mackworth, 1978), hypothesizing more object instances, applying the constraints stored in the object classes, and predicting more image features to seek out and incorporate.

There are difficulties in applying the ACRONYM knowledge representation to character recognition. Its algorithm for predicting possible image features is based on a transformation from three-dimensional solids, represented as generalized cylinders, to perspective images, not on two-dimensional objects like characters, and not on something general enough to allow both. ACRONYM's constraints are all over numerical variables, though processed symbolically, whereas Chinese characters and their components have labels and symbolic constraints upon them. The system does contribute the useful separation of classes from instances, and of low-level perceptions from hypothesized objects. It also contains a strong attempt to formalize the concepts of object and constraint.

2.3 Knowledge Representation Applied to Character Recognition

2.3.1 Hierarchical Relaxation (Hayes, 1980)

English handwriting has much of the variation, and hence ambiguity, of Chinese characters, requiring flexible control and a knowledge of composition in order to read it. By "composition" we mean our understanding of objects as having component parts. Complex objects are made of subparts which themselves are made of subparts, and so on down to form a "composition hierarchy". As letters are parts of words, smaller objects like segments of strokes and connections between them are parts of letters. Hayes wrote a program for reading neatly handwritten words.
It constructs a graph whose nodes represent hypotheses of recognition of critical points on strokes, strokes themselves, and finally letters, hence a composition hierarchy of three levels. Each node has a list of labels, which one may interpret as values for the variables represented by the nodes. Each label has a numerical value attached to it which indicates the "certainty" of that label being the correct one. The program builds the whole graph without eliminating any competing hypotheses, with heuristics to initialize the certainty factors. Since no search is done, there is no pruning of the hypotheses to be tested. However, two processes lead to efficient winnowing of the label sets once the whole set is built up: one modifies the certainty factors according to global formulae, proposed by Hayes, that heuristically move the values closer to 1 for correct labels and 0 for wrong ones; the other process removes labels whose certainty drops below some threshold.

Hayes' work contributes to the idea of using a composition hierarchy and general constraint satisfaction algorithms, but lacks the ability to prune the search space of possible object hypotheses during their construction. In many recognition tasks, including our own, the set of conceivable composition hierarchies is too large to build explicitly before applying constraints.

2.3.2 Frames for Fortran (Brady and Wielinga, 1978)

Brady and Wielinga wrote a program to read hand-printed Fortran coding sheets, using an implementation of frames. They defined frames for the stroke-primitives forming the input to the recognition process, a frame for the Fortran program which constitutes the recognition goal, and more frames for many intermediate-level objects like stroke junctions, words, and operators. The frames contain procedures and predicates which help to instantiate the frames, for example a set of actions to take when a particular slot of a frame is filled. The authors described the problem of general control of the instantiation process, suggesting a "research director" process to administer the various subprocesses incited by new instantiations.

Their work identifies a number of important general problems and principles for computational vision which this thesis addresses: that the vision problem involves an enormous quantity of input data and surprisingly complex computation, for example a million-pixel input image producing thousands of features for input to the recognition cycle; and that perception requires a hierarchy of intermediate levels between some direct sensation, like pixels, and complex objects like English characters:

We have genuinely been surprised at the sheer number of intermediate levels that we have been forced to represent explicitly in our program. If this is true for hand-printed characters, then the number needed to perceive natural scenes must be quite staggering. (Brady and Wielinga, 1978, p 291)

For reasons of convenience, the Brady/Wielinga system considers strokes as primitive percepts, while naturally written strokes are actually very complicated objects. The image processing done to extract strokes was motivated by the application alone; a complete theory of visual recognition will include a robust and general algorithm for segmentation. The flow of control presented by Brady and Wielinga seems very complex and possibly chaotic. The intelligent control of this flow must be explicit, clear, and well-defined.
2.4 Schema Labelling (Havens, 1985)

Havens has developed a schema-based theory of knowledge representation for recognition. This thesis is an implementation of his theory, with some extensions. While the implementation, and indeed Havens' theory, provide more power and formalism than much previous work, schemas have a long history; in 1920 Henry Head published a book including a theory of how the human nervous system remembers and recognizes the patterns of activity it incites or suffers:

But in both cases the image, whether it be visual or motor, is not the fundamental standard against which all postural changes are measured. Every recognizable change enters into consciousness already charged with its relation to something that has gone before, just as on a taxi meter the distance is presented to us already transformed into shillings and pence. So the final product of the tests for the appreciation of posture or passive movement rises into consciousness as a measured postural change. For this combined standard, against which all subsequent changes of posture are measured before they enter consciousness, we propose the word "schema". By means of perpetual alterations in position we are always building up a postural model of ourselves which constantly changes. Every new posture of movement is recorded on this plastic schema, and the activity of the cortex brings every fresh group of sensations evoked by altered posture into relation with it. Immediate postural recognition follows as soon as the relation is complete. (Head, 1920, pp 605-606)

Schemas have percolated through psychology (Bartlett, 1932) into Artificial Intelligence (Minsky, 1975; Schank and Abelson, 1977; Havens, 1983; Mackworth and Havens, 1983). In computer vision, they reflect a concern with the recognition of complex objects. As discussed earlier, complex objects are conveniently represented by composition hierarchies, where the top nodes of the hierarchies are the things one wishes to recognize, the intermediate nodes represent components, and the bottom corresponds, by understood processes, to a set of image features. Schemas are network structures with nodes representing objects, and links representing their relationships, including, but not confined to, physical composition. The recognition problem can be seen as the process of building correct schema instances, based on schema classes (or prototype schemas), from ambiguous, noisy, and incomplete input.
It has been advocated strongly that knowledge of composition be more formally developed in frame systems (Brachman, 1985), and Havens' theory makes some progress in doing so, by defining composition rules as a formal component of prototype schemas. The components form a generic composition hierarchy of schemas (see Figure 7, page 24), which the recognition process uses to build a composition hierarchy of instances (see Figure 13, page 33). The specialization process ensures that the network of instances created by the composition process obeys a set of constraints on their interpretation. Each constraint is a relation on a set of constituent classes, giving the "legal" tuples of labels, or "legal interpretations" of instances of the class. Each schema class is used to represent a "type" of object, and hence each class has a set of labels to identify the different "interpretations", or elements of the "type". For example, an "automobile" schema class might include labels "Mustang", "Eagle", and "Bobcat". Whereas the composition rules govern how schema instances may be created and linked, the specialization process operates on the knowledge of which labels are allowed for various compositions. A prototype schema therefore consists of a set of labels, a set of constraints, a set of components, and a set of composition rules. A schema instance consists of a reference to its prototype schema, a set of currently applicable constraints, and a set of labels consistent with those constraints. The instances are built by the composition process, and their coherence as descriptions of the world is maintained by the specialization process. The following chapter describes how these structures and processes have been extended and implemented for Chinese character recognition. 19 Chapter 3 S C H E M A S FOR CHINESE C H A R A C T E R S The Too is forever undefined. Small though it is in the unformed state, it cannot be grasped. If kings and lords could harness it. The ten thousand things would naturally obey. Heaven and earth would come together A nd gentle rain fall. Men would need no more instruction and all things would take their course. Once the whole is divided, the parts need names. There are already enough names. One must know when to stop. Knowing when to stop averts trouble. Too in the world is like a river flowing home to the sea. Lao Zi, from the Dao D£ Jihg This chapter presents the extensions made to schema labelling to perform Chinese character recognition, as well as the specific knowledge used for the recognition. We begin with a statement of our goal for the program and follow this with a brief overview of the system. Next we describe the various sources of knowledge about the structure of Chinese characters used in defining our schema classes. To aid in understanding the definitions of the data structures and control knowledge, an example of the system's operation then follows. Finally, we describe in detail the knowledge representation method used: the production of low-level schema instances, the extensions to standard schema data structures, and our control of the recognition process. 20 3.1 Goal We require an extension of schema labelling that is flexible enough to represent the generic structure of hand-printed Chinese characters, with sufficient control knowledge to enable their recognition. 
This required extended types for the schema instance "variables", or components, more complex types of constraint, and extended facilities for representing control knowledge, to prevent explosion of the search process when faced with the vast search space provided by the ambiguities and complexities of hand-print. Isolated analysis of a single pixel in an image of a character cannot determine the character of which it is a part, or even what type of stroke it comes from; this ambiguity generates a search space: what strokes one might hypothesize to exist on the basis of the pixels seen, or more generally, what objects one might hypothesize to exist on the basis of the subparts already found. The program must represent this ambiguity and search efficiently for correct hypotheses; the alternative to such controlled search is either incorrect guesses, or floundering through a colossal search space. The major result of our work on this goal has been the structural representation of Chinese characters, using the extensions to schema labelling just mentioned, in a form which includes at least some knowledge for control of the search process, achieving some recognition success thereby. 3.2 Overview of the System The system is a computer program whose input is an array of pixel grey-scale values for a digitized image of a Chinese character, and whose output is a network of schema instances, including one character instance, hopefully with a unique label correctly identifying the character of the image. By the way, the bold italic font, as in the word character, is used throughout this thesis to identify schema classes we have defined, as in the composition hierarchy of Figure 7, page 24. There are several natural stages to the recognition process, shown in Figure 6. The rectangles in the figure represent the data structures created by the various stages, starting with the pixel grey-scale array in the upper left-hand corner, and ending with the character instance in the lower-right. The entire system, designed as shown, was implemented and unit-tested. Note however, that systematic, recorded testing was performed only with synthetic data at the stroke instance level. 21 Figure 6 Overview of System Operation Scanned Character Contour-Segment end Locus Instance Network I One-Labelled Character Instance 22 First, an edge detection program (see the uppermost oval in Figure 6) converts the grey-scale image into a set of "contours'* — sequences of edge segments making up the outlines of the character's strokes. Secondly, a segmentation program divides the contours at natural points, creating a network of instances of two schema classes: locus (meaning "particular location") instances representing the points of contours where they are segmented, and contour segment instances representing the bits of contour between loci. Thirdly, a schema labelling interpreter is invoked which builds up the network of instances, adding higher and higher level instances representing greater and greater conglomerates of parts, until a character instance appears with a single label. The interpreter uses a knowledge base composed of a set of schema class definitions (Havens, 1985). These classes contain the knowledge of composition of Chinese characters, and constraints on the labels or names of various subparts, and, very importantly, knowledge of control for the process of searching for correct instance creations and linkages. 
The input set of low-level instances to the interpreter yields a huge search space of possible higher-level instances to create; searching this space effectively is a crucial part of the recognition problem. In Section 3.5.4 we describe this search problem more fully, along with our attempt at a solution. 3.3 Knowledge of Chinese Character Composition The definition of schema classes for recognition of Chinese characters requires more than a dictionary of the symbols or even simple familiarity with writing them. Several sources of knowledge were tapped for clear and useful descriptions of the characters' structure. While most people who read or write Chinese are consciously aware of a division into "radicals" and other components, and a further division into strokes, there are other important layers in the composition of the characters which must be explicitly detailed for complete recognition. The following sources were combined to produce the complete composition hierarchy, shown in Figure 7. 23 Figure 7. Composition Hierarchy C O M P O S I T I O N H I E R A R C H Y 24 In 1965, Rankin completed a linguistics thesis at the University of Pennsylvania (Rankin, 1965) presenting a generative grammar for Chinese characters, a very regular and succinct formalism describing all the tens of thousands of characters in the language, except for a few hundred oddly constructed ones. His description comprises two levels, a finite state automaton for building "groups" of "strokes" (his terminology), and a short context-free grammar for combining "groups" into "components" and "components" into full "characters", based on a few simple geometric relations. Figure 8 is a part of Rankin's automaton network. Each node represents a particular, valid group. Each group is generated from another or from a single stroke by the addition of a particular stroke in a particular position; the arrows in the diagram represent this etymology. In all, Rankin lists about 400 valid groups, with a maximum of seven strokes in each. Figure 8. Valid'Groups'of Strokes (from Rankin, 1965) We have defined a schema class called group, which specifies the schemas to which group instances may and should link, including constituent strokes, superpart components, stroke connections, and numbers for position and size. For example, a given group instance might link to 25 two stroke instances, a connection instance which represents the geometric relation between two strokes, a pair of cartesian coordinates for the center of the group, and a number representing the area taken up by the group. "Components'' in Rankin's grammar are sets of up to five "groups", placed vertically to each other, or horizontally, or with one group enclosing the others. Components, combined in the same three ways as groups are combined, form characters. Rankin added some restrictions which cause the grammar to generate almost the exact set of all Chinese characters. The excess generations are "well-formed" but "meaningless" characters. We based our schema specification of Chinese characters on Rankin's grammar, such that our composition rules can produce all the structures Rankin considers well-formed. In addition, our labelling constraints simply leave out the meaningless combinations of subpart labels, which prevents them from being produced as interpretations. We also defined a schema class called component representing the "components" of Chinese characters, not to be confused with the use of the word "component" for the links between schema instances! 
Component instances have links to the groups which compose them, to the character instance which they compose, and to numbers for their position and size. The composition rules implement Rankin's geometric constraints, while the labels and corresponding labelling constraints of the component class implement the semantic notions of meaningfulness. Finally, the top-level character class links to components, position, and area, to summarize the whole structure of a recognized character. We have in Rankin's work the foundation for an explicit structural representation for practically all of the Chinese characters, down to the level of detail where "strokes" are the primitives. The model includes knowledge of the hierarchy of structure from strokes through intermediate groups of strokes and up to full characters. It specifies the geometric relations between the constituents and allows the notion of meaningfulness to be conveniently defined in terms of the structures built up by the geometric compositions. Realistic character recognition cannot blithely begin at the stroke level; recognition of strokes is a difficult problem of itself, because of stroke overlaps and intersections, crude or complex 26 penmanship, and wildly varying angles. Freehand Chinese strokes on paper do obey certain structural rules, and a restricted set of these, describing fairly neatly hand-printed characters, has here been found amenable to implementation in our schema system. Figure 9 shows examples of the strokes we have specified. The selection of primitives were chosen from a linguistic reference (Leon, 1981, p 382), though the labelling scheme is our own. Figure 9. Different Strokes What are the component parts of a stroke on paper? Every Chinese stroke can be described naturally as a sequence of one to four bars. Our bar schema class represents these elongated blobs of ink. A bar is a two-dimensional area, but approximates a short circular arc — that is, it looks like a bit of curved line of constant curvature. The various bar labels arise from different lengths, angles of orientation, and sign of curvature of the various possible bars. While Chinese tradition gives almost poetic names for these, we have used a simple, efficient coding scheme where V stands for a long 27 vertical bar, V stands for a long horizontal bar, and so on for about 15 labels. Each stroke is uniquely identified by a sequence of these bar labels. Even low-level structures like our bars are broken down into subparts. The links from a bar instance are to two bar-sides and two bar-ends, which all together make up the contour around the outside of the bar. Bar-ends are labelled either "free" or "bent", and bar-sides have the same labels as bars. See Figure 10 for a sample decomposition of a stroke into three bars. The bars, BAR-1, BAR-2, and BAR-3, are indicated, as is one bar-end called END-OF-BAR1, and the labels for all four bar-ends in the stroke. This structure of three appropriately constructed and labelled bars uniquely identifies and describes this Chinese stroke. Figure 10. A 3-Bar Stroke with Bar-ends Connections between strokes usually interrupt the bar sides at "joints", a joint being the gap in a bar-side where another stroke has crossed it, as in Figure 11. We have defined a schema class called joint to represent these gaps. 
Even low-level structures like our bars are broken down into subparts. The links from a bar instance are to two bar-sides and two bar-ends, which all together make up the contour around the outside of the bar. Bar-ends are labelled either "free" or "bent", and bar-sides have the same labels as bars. See Figure 10 for a sample decomposition of a stroke into three bars. The bars, BAR-1, BAR-2, and BAR-3, are indicated, as is one bar-end called END-OF-BAR1, and the labels for all four bar-ends in the stroke. This structure of three appropriately constructed and labelled bars uniquely identifies and describes this Chinese stroke.

Figure 10. A 3-Bar Stroke with Bar-ends

Connections between strokes usually interrupt the bar-sides at "joints", a joint being the gap in a bar-side where another stroke has crossed it, as in Figure 11. We have defined a schema class called joint to represent these gaps. Joint instances link to the following: a number for the size of the gap, the two points on the stroke contours which bracket the gap, and four instances representing segments of that contour which end at the two joint points. Bar-end instances are constructed similarly to joints, though they often have only one locus instead of two.

Figure 11. The Segmentation

Identifying the bar-ends and joints is a difficult problem, as the contours of the strokes in hand-printed Chinese tend to be rounded rather than abruptly bent, and certainly do not have right angles at the pixel level for every bend at the stroke level. Mackworth and Mokhtarian (1984) have suggested that one represent curves, like the outlines of strokes, using a parameterization not by cartesian coordinates, but by curvature and arclength. This leads easily to an object-centered representation and a segmentation of curves into contour segments of monotonically changing curvature, separated by the critical contour points we call loci, which are the extrema of curvature on the original contour. One expects endpoints and the outsides of bends to appear as positive maxima of curvature, the insides of bends to appear as negative minima, and joints to appear as nearby pairs of negative minima.

The locus and contour segment classes form the bottom of our composition hierarchy. The locus instances contain only their position and curvature, in addition to "upward" or "sideways" links to contour segments, joints, and bar-ends; that is, loci are not structurally decomposed in this schema system. Rather, they are the primitives from which joints and bar-ends are hypothesized. The contour segments, primitives for the hypothesizing of bar-sides, have only their length and endpoint loci in addition to "parent" bar-side links.

In summary, we have defined schema classes called character, component, group, stroke, connection, bar, bar-side, bar-end, joint, locus, and contour segment (see Figure 7, page 24). These represent, with their labels, links, constraints and composition rules, the structure of hand-printed Chinese characters from the level of quite simple geometric primitives through to identification of the characters.

3.4 Sample Session: The Character 'Yin'

Here is an example of a complete run of our system, applied to the character Yin. The example is meant to make the schema classes more comprehensible, and to impart a sense of how the system works. After this section we shall describe the image processing and the schema system in detail.

Yin is an ancient word, with an original concrete meaning of "the leeward side of a hill", and another meaning of "the feminine principle", as in the Yin-Yang dichotomy popular in Chinese philosophy (Lao Zi, 600 BC). The modern character, shown above, has two principal components, a mound-radical positioned left of a moon-radical. The moon-radical decomposes to a shell around a smaller "sub-component" called the two-radical: two small horizontal bars, the character for the number 'two'. Incidentally, the whole character actually functions inside some other characters as a subcomponent.

3.4.1 Input — the image processing

Ideally, the raw data for the system is a digital image of a character. For the tests which were run to completion, the input instances were hand-computed, unambiguous segmentations (the capacity for dealing with ambiguities is shown for higher levels of the composition lattice).
Although the thrust and testing of this thesis are the processing of the schema instances after segmentation, the image processing was nevertheless designed and implemented as well, in an effort to find a model for recognition all the way from the image up to a structured description of the character.

Finer descriptive detail will be given in the next section, but the general flow of the image-processing sub-system is to locate lightness changes in the image, describe them as connected dots, smooth these connected paths into circular arcs, and segment using this description to produce contour segment and locus instances.

To speed the recognition, a character instance and a preset maximum number of group instances are added to the input instances. This addition may be viewed as the knowledge, or expectation, of the fact that a character should be seen. The "guess" focuses the search, as the schema instance interpreter can be prevented from trying to create other character or group hypotheses, using control predicates which we describe later, and thus finish its work more quickly. Essentially, the search for correct hypotheses is done both bottom-up, from the instances produced by segmentation, and top-down, from the initial, unlabelled character instance. At the end of recognition, the character instance's labelset has been refined down to the unique label identifying the input character.

3.4.2 Making new instances and variables

The principal module of the recognition system is an interpreter which tries to expand a given network of instances. Its basic repeated action is to make a link from one instance to another, possibly creating the second instance to do it. This is done in six stages:

(1) Choose an instance to work on.
(2) Choose one of the instance's potential links to try to fill.
(3) Find an instance which can fill that link.
(4) Check the two instances and their link against the composition rules.
(5) Apply the labelling constraints to remove inconsistent labels.
(6) If any labelset reduces to nil, remove a link and try another instead.

The "links" of this system are called "variables" because, like the variables of programming languages, they contain values — in our case the values are schema instances.
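As a concrete illustration of this six-stage action, here is a self-contained toy in LISP. It is only a sketch under simplifying assumptions (instances are reduced to a labelset and an alist of bound variables, and the composition rule and labelling constraint are passed in as functions), not the interpreter's actual code.

    (defstruct toy-instance name labelset variables)

    (defun try-link (from slot candidates rule constraint)
      "Stages 3 to 6 for one link: find a filler among CANDIDATES for SLOT
    of FROM, check RULE, bind, then filter FROM's labelset with CONSTRAINT.
    Returns the filler on success; on failure the link is undone."
      (dolist (filler candidates nil)
        (when (funcall rule from filler)                        ; stage 4
          (push (cons slot filler) (toy-instance-variables from))
          (let ((survivors (remove-if-not                       ; stage 5
                            (lambda (label) (funcall constraint label filler))
                            (toy-instance-labelset from))))
            (cond (survivors
                   (setf (toy-instance-labelset from) survivors)
                   (return filler))
                  (t                                            ; stage 6
                   (pop (toy-instance-variables from))))))))

    ;; A bar-side accepting a vertical contour segment, dropping label 'h':
    ;; (let ((bs (make-toy-instance :name 'barside-1 :labelset '(h v iv))))
    ;;   (try-link bs 'cs1 '(cs-3)
    ;;             (lambda (from filler) (declare (ignore from filler)) t)
    ;;             (lambda (label filler) (declare (ignore filler))
    ;;               (not (eq label 'h))))
    ;;   (toy-instance-labelset bs))   =>  (V IV)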
Let us describe the process of "adding a variable" for a particular instance-expansion during the recognition of 'Yin'. As mentioned above, the inputs are the contour segment and locus instances produced by image processing, plus a few hypothetical higher-level objects. A first candidate instance is chosen arbitrarily; we shall consider for the example a contour segment called CS-3. The interpreter consults the contour segment schema class and chooses to attempt a "bar-side" link, as these are the first links specified for the contour segment schema class, in its class definition.

To attach a bar-side variable to CS-3, it must find a bar-side instance. Finding none in the current network (which, you remember, consisted on input of nothing but contour segments and loci), it creates a new one, calling it BARSIDE-1. The composition rules of the contour segment class are checked to see if CS-3 can have a bar-side variable holding BARSIDE-1; the check succeeds and the link is made, as in Figure 12.

Figure 12. A 'Variable' Link Between 2 Instances

The system might have failed if the chosen bar-side already had links to loci or contour segments far away from CS-3, because of a rule which states that the contour segments composing a bar-side must be contiguous. But in this case the bar-side is easily accepted, as it is the first such attachment and has no competing links with which it could fail a composition rule.

Finally, the specialization process is activated. There is a constraint which restricts the labels a bar-side may have, based on the labels of its first contour segment: for example, a horizontal bar-side cannot begin with a segment labelled "vertical", so assuming that CS-3 does have a "vertical" label, BARSIDE-1's "horizontal" label is eliminated. Several labels do remain, so the addition of the new variable succeeds. The interpreter can go on to another link.

Recognition of 'Yin' continues by the addition of a "bar-side" variable to each of the contour segments, and creation of bar-end instances for attachment to those, then creation of bars to combine pairs of bar-sides, strokes to link abutted bars, connections to group together overlapping or nearly overlapping strokes, and so on up to links between the character instance and its components. Figure 13 is a trimmed version of the final instance network for 'Yin', showing the top few levels of the composition hierarchy.

Figure 13. Top of Final Network for 'Yin'

This chapter so far should have given the reader familiarity with the terms involved in Chinese character recognition by the schema-based system, and with the general principle of expansion of the schema instance network until recognition is achieved. The following sections will present the processes and data structures of the implementation in more detail.

3.5 From a Character on Paper to Schema Instances

3.5.1 Image Processing

We print test characters for the system on stiff paper, with a thick felt pen (see Figure 14). An Optronics System C-4500 Digitizing Scanner converts a character into a standard image file (Havens et al., 1982) of grey-level pixels, using a VAX 11/780 computer running the Berkeley 4.2 UNIX operating system at the UBC Laboratory for Computational Vision. The stiff paper was necessary for convenient mounting in the digitizer, and the thickness of the felt pen was necessary to compensate for the digitizer's insufficient resolution of 10 pixels per millimetre.

Figure 14. A Digitized Character Image

The system's first subprogram finds the contours of strokes by tracking zero-crossing contours of the Laplacian of the Gaussian of the image at a fixed spatial frequency. This technique is derived from the theory of edge detection developed by David Marr and colleagues at MIT (Marr and Hildreth, 1979; Hildreth, 1980). We chose a frequency giving maximum sensitivity to an edge 2 pixels across, about the sharpest that the digital masks could approximate, as the reflectance changes for characters inked onto white paper are of course quite abrupt. First an image file is made, using a program written by Alan Carter at UBC, holding first-order approximations to the gradients of the zero crossings, a measure which appeals to our intuitive notion of the "strength" of an intensity change. Our program then extracts chains of pixels from the zero-crossing file, choosing 4-connected pixels above a particular gradient threshold. These chains "represent" the "contours" that the system "sees" in the image. Figure 15 is a picture of the contours found for the sample character.

Figure 15. Contours Found By Edge Detector

A second program, written by Farzin Mokhtarian (Mackworth and Mokhtarian, 1984), calculates the curvature at all points of the contours, using a Gaussian convolution of fixed spatial frequency.
This provides a description of the contours in terms which make it easy to segment at the points where the contours bend sharply, which we call loci. The loci mark the endpoints of strokes and the joints where strokes curve or intersect. See Figure 16, for example, where the significant features of the character's outline remain, but the irregularities of the original and digital images are largely removed.

Figure 16. Smoothed Contours of a Character

As the points of the contours of strokes which most interest a character recognizer are the sharp curves — the ends and bends and intersections of strokes — the extrema of curvature are useful points at which to segment. A short LISP subprogram converts the curvature descriptions of the stroke contours into locus and contour segment schema instances by scanning the list of curvature values for positive maxima and negative minima, building contour segment instances as it scans and linking them to locus instances created at the extrema. Figure 17 shows the loci and contour segments found for the sample character.

Figure 17. Segmented Contours of a Character
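A minimal sketch of such a scanner follows, assuming the curvature description is simply a list of curvature samples along one contour; the real subprogram builds full schema instances rather than returning indices.

    (defun curvature-extremum-p (a b c)
      "True when the middle sample B is a positive maximum or a negative
    minimum relative to its neighbours A and C."
      (or (and (> b 0) (>= b a) (>= b c))
          (and (< b 0) (<= b a) (<= b c))))

    (defun segment-contour (curvatures)
      "Return the indices of the loci along a list of curvature samples;
    the stretches between successive loci become contour segments."
      (loop for (a b c) on curvatures
            for i from 1
            while c
            when (curvature-extremum-p a b c) collect i))

    ;; (segment-contour '(0.0 0.1 0.9 0.2 -0.1 -0.8 -0.2 0.0))  =>  (2 5)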
3.5.2 Schema Instance Extensions

When the system has finished, its description of the recognized character is a network of schema instances, ranging from the input loci and contour segments up to a single character instance. These instances are the principal unit of representation for the characters being recognized, so we shall describe their structure here in detail. Havens has defined schema instances as structures derived from prototype schemas, created by a composition process and modified by a labelling process (Havens, 1985). For each instance, he includes a reference to its prototype schema, a set of applicable labelling constraints, and a labelset. In this thesis we have extended some of these aspects of instance structure. A description of these extensions follows.

(1) variables (a list of the instance's currently filled variables)

The variables are a schema instance's links to other instances, corresponding to Havens' "components". Whereas components are featureless links, our system's links are named, typed, and structured, to give the linking mechanism more power. We view such a link as a variable, which may be created or destroyed at any time during the recognition process. A schema class specifies the variables which its instances may hold, including information about what the variables may bind to. In general, each variable of an instance will be bound to another instance.

(2) constraints (the set of constraints applicable to the instance)

Our system provides three types of constraint: explicit label tuples as in Havens' theory, predicates, and generative functions, all applied to the labels of the variables held by an instance. These labels are the possible interpretations of the instances bound to the variables. Definitions for the different constraints actually reside at the level of schema classes (what Havens calls 'prototype schemas'), and will be described later. Within a given instance, all constraints produce the same structure: a set of tuples of labels, representing the label combinations which satisfy a constraining relation over some particular subset of the instance's variables. Some of the constraints are as small as a single value for a single variable; some allow hundreds of tuples over the labels of a dozen variables.

A typical constraint is ternary, allowing pairs of labels for two components, and giving to each pair a label for a special variable which holds the parent instance itself. For example, we present some of the tuples from a constraint in a character instance, over the variables holding the character itself, a "subcomp" component, and a "leftcomp" component, these last guaranteed by the composition rules to be appropriately positioned to form a character. The first tuple of the constraint indicates that the character may be labelled 'yin', as long as its subcomp component may be a 'moon-radical' and its leftcomp component a 'mound-radical'.

A CONSTRAINT OVER THE VARIABLES (character subcomp leftcomp)
((yin moon-radical mound-radical)
 (good child-radical woman-radical)
 (bestow precious-radical ci-phonetic)
 ... )

(3) labelset (a list of legal tuples over the external variables)

A simple finite set of labels is often insufficient to uniquely disambiguate an instance. For example, numbers representing size, weight, or age might be useful addenda to the label representing a person's identity. For this reason we extend the labelset of an instance to be a set of tuples: the cartesian product of the label sets of a specified set of "external variables", reduced by the labelling constraints of the schema class. The labels of a single class may come from a finite set, as in traditional constraint systems like Waltz's, or else they may come from an infinite set such as the real numbers, or even from a set of structures such as LISP lists. From the above example of a schema for a person, a possible labelset for a particular instance might be {(Timothy medium average 23) (Ian medium average 23)}. Here the system has not yet decided which person the instance represents, Ian or Timothy, but at least it does know the person's size, weight, and age!

While the larger labelset does not necessarily further disambiguate the instance, it does make access to important information about the instance simple — everything the recognition process or the system user may want to know about an instance, for building or disambiguating other instances, or for verification of the program, can be put in the label tuples.

For Chinese characters as described herein, most of the external variables represent metrics like the lengths of bars and the positions of connections. Note the labelset of the following bar instance, refined already to a single tuple where the principal label is 'v' for 'vertical', both ends are 'free', the length is 0.79, the 'pos1' or 'position of first endpoint' is at coordinates (0.5, 0.9), and so on. The contents of all the bar's variables are listed below the labelset.

**********************************
INSTANCE bar-2
LABELSET: (v free free 0.79 (0.5 . 0.9) (0.5 . 0.1) (0.9 0.5 0.1 0.5))
OVER EXTERNAL VARIABLES: (bar, end1, end2, length, pos1, pos2, space)
INTERNAL VARIABLES:
'side1' = bar-side-3, labelled (v)
'iside' = bar-side-4, labelled (iv)
'curvature' = ((0.0))
'stroke' = stroke-2, labelled (v)

Figure 18. The Labelset and Variables of a Bar Instance
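The following toy LISP sketch shows the flavour of these extended labelsets and of reduction by an explicit tuple constraint. The structure and the label lists are illustrative only (a labelset is abbreviated here to the list of principal labels), not the thesis system's actual representation.

    (defstruct (inst (:constructor make-inst (class labelset)))
      class
      labelset)   ; abbreviated: a list of principal labels

    (defparameter *char-constraint*
      '((yin    moon-radical     mound-radical)
        (good   child-radical    woman-radical)
        (bestow precious-radical ci-phonetic))
      "Tuples over (character subcomp leftcomp), as in the text above.")

    (defun prune-by-constraint (char-inst subcomp-labels leftcomp-labels)
      "Keep only the character labels supported by some constraint tuple
    whose component labels survive in the components' labelsets."
      (setf (inst-labelset char-inst)
            (remove-if-not
             (lambda (char-label)
               (some (lambda (tuple)
                       (and (eq (first tuple) char-label)
                            (member (second tuple) subcomp-labels)
                            (member (third tuple) leftcomp-labels)))
                     *char-constraint*))
             (inst-labelset char-inst))))

    ;; (prune-by-constraint (make-inst 'character '(yin good bestow))
    ;;                      '(moon-radical) '(mound-radical woman-radical))
    ;; =>  (YIN)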
3.5.3 Schema Class Extensions

The schema classes are data structures holding the knowledge which is necessary to create, and keep consistent, the schema instances. They are essentially prototypes for generating the instance network. Havens has defined a prototype schema as a structure including a labelset (identifying the various possible subclasses which the instances of the class might represent), a set of components, a set of composition rules governing these components, and a set of constraints over the labelsets of the components (Havens, 1985). In our system, several extensions have been made to these definitions. They provide mechanisms for greater control of the composition process, some new types of labelling constraints, and support for the schema instance extensions already described.

(1) variable prototypes (a list of specifications for the variables which an instance of the schema class may hold)

Variables are the links between schema instances, representing the relationships between recognized objects. Whereas Havens lists the schema classes which may be linked to, this system treats the links themselves as data structures of some complexity. The prototype of a variable is not a "type" in the programming language sense, but a specification of how variables may be created for instances of a given schema class, and of which instances those variables may then bind to.

The variable prototypes of this system allow the distinction between different constituents of the same schema class. For example, the bars which make up the strokes of Chinese characters are drawn in particular directions, hence each bar has a "beginning" and an "end" (see Figure 10, page 28). As shown in Figure 19, the bar schema class has variable prototypes "end1" and "end2", both allowing the end-of-bar schema class. The bar class also has a variable prototype called "stroke", which allows its variables to hold instances of the stroke schema class, and a variable prototype called "length", whose variables can hold numbers.

In this system there are various ways to bind variables; when defining a set of schema classes, one chooses a "fill method" for each variable prototype, as sketched below. The default method is to scan the already existing instances for one that satisfies the class's composition rules. One may specify a creation fill method, causing a featureless new instance to be created and used for the variable if no already existing one fits. The computation method gives an instance for the variable by calling a function with other variables as arguments. This is used, for example, to fill the "length" variable of an instance made of components with "sublength" variables (length = (sum sublength1 sublength2 ...)). Some variables are computed by the segmentation routines which create the initial instances, before the recognition machine begins its work — these variables have a bottom-up fill method. Finally, some variables are merely the inverse links of other variables, bound automatically when their inverses are bound. Thus when a man has a son, the boy automatically obtains a father. This method is called inversion. Before linking any variable, the recognition process consults its fill method, and binds accordingly.
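A toy dispatch over the fill methods might look as follows; here a prototype is abbreviated to a list (name fill-method &optional function), the bottom-up and inversion methods are omitted, and all names are illustrative.

    (defun fill-variable (proto existing create-fn compute-args)
      "Return a filler for the variable prototype PROTO according to its
    fill method. EXISTING holds candidate instances already known to
    satisfy the composition rules."
      (destructuring-bind (name method &optional fn) proto
        (declare (ignore name))
        (ecase method
          (scan        (first existing))            ; reuse an existing instance
          (creation    (or (first existing)         ; else create a fresh one
                           (funcall create-fn)))
          (computation (apply fn compute-args)))))  ; e.g. a length from sublengths

    ;; (fill-variable '(length computation +) nil nil '(0.3 0.49))
    ;; =>  0.79 (up to floating-point rounding)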
(2) external variables (a list of variable prototypes)

These are simply a subset of all the variable prototypes, the ones which are used to build the labelsets of instances of the schema class, as previously described on page 39. Having "external" variables explicit in the labelset gives two benefits: the labelset gives a fuller description of its instance, for presentation to the system user or for use by other instances, and external variables allow a useful form for the composition rules, described next.

(3) composition rules (a list of valid "composition prototypes" for instances of this class)

Each schema instance must satisfy at least one of its class's composition rules. In this system, each composition rule is (a) a list of variable prototypes whose instantiations might comprise a valid composition, and (b) a list of predicates on the instances to be held by these variables, predicates which must be satisfied to make the composition valid. For a rule to be satisfied, every one of its predicates for which all argument variables have been bound must be satisfied.

To represent a restriction on composition, one writes a LISP predicate for a particular list of variables, where the predicate is applied to the labels of the instances held by the variables. For example, one composition rule in the Chinese character specifications allows only values of the variable "curvature" greater than a particular threshold, by giving the rule a predicate as follows:

(sharp-enough curvature)

where the function definition is:

(defun sharp-enough (k) (greaterp k min-k))

Some natural restrictions apply to subparts of subparts. For example, the composition rule for strokes demands that the first end of its first bar and the last end of its last bar be labelled "free" (see Figure 10, page 28). This is represented by allowing the composition predicates to apply to any external variable of the instance held by a variable, hence the predicates:

(free? (end1 bar1))
(free? (end2 bar4))

where the function definition is:

(defun free? (end-label) (eq end-label 'free))

Strokes potentially have variables "bar1", "bar2", "bar3", and "bar4". The composition predicate (free? (end1 bar1)) guarantees that the external variable "end1" of the instance held by the "bar1" variable of the stroke has a "free" label.

Unfortunately, some compositions depend on the identity rather than on the labels of their constituent instances, that is, on the instances held by their variables. For a stroke composed of two bars, the end of the first bar must be the same instance as the start of the second bar, not merely have the same label (e.g. "bent"). Also, the efficiency of the recognition process can be heavily marred if certain variables are bound before others, again independently of their labels. Both of these problems are solved in this system by the use of three special composition predicates which operate on the identities of the instances held by the variables: eq, neq, and prec, meaning equality, inequality, and precedence of binding. For a concrete example of these predicate types, see Figure 20 for a composition rule for "strokes" of Chinese characters. It guarantees that "end1" of "bar1" will be free, that "end2" of "bar1" will be the same instance as "end1" of "bar2", that the "bar1" variable will be bound before, or preceding, "bar2", and so on.

((bar1 bar2 bar3 bar4 length space group)
 ((free (end1 bar1))
  (eq (end2 bar1) (end1 bar2))
  (eq (end2 bar2) (end1 bar3))
  (eq (end2 bar3) (end1 bar4))
  (free (end2 bar4))
  (prec bar1 bar2) (prec bar2 bar3) (prec bar3 bar4)
  (neq bar1 bar2) (neq bar2 bar3) (neq bar3 bar4)
  (neq bar1 bar3) (neq bar1 bar4) (neq bar2 bar4)))

Figure 20. A Composition Rule for the Stroke Schema Class

The composition rules express possible bindings of variables, but do nothing to reduce the labelset of an instance from the cartesian product of its external variables' labelsets. Schema labelling (Havens, 1985) uses constraints on label sets to achieve this reduction.
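The rule-checking discipline, in which only those predicates whose argument variables are all bound are tested, can be sketched in a few lines of LISP. This toy version takes bindings as an alist, flattens the nested variable specifications to simple names, and ignores the prec predicate, which needs binding-order history; all names are illustrative.

    (defun neq (a b) (not (eq a b)))

    (defun rule-satisfied-p (predicates bindings)
      "PREDICATES are forms (fn var ...); BINDINGS is an alist from
    variable names to instances. A predicate with an unbound argument
    is skipped, as the text above requires."
      (every (lambda (pred)
               (destructuring-bind (fn . vars) pred
                 (let ((vals (mapcar (lambda (v) (cdr (assoc v bindings)))
                                     vars)))
                   (or (some #'null vals)   ; an argument is unbound: skip
                       (apply fn vals)))))
             predicates))

    ;; (rule-satisfied-p '((eq end2-bar1 end1-bar2) (neq bar1 bar2))
    ;;                   '((end2-bar1 . e1) (end1-bar2 . e1)
    ;;                     (bar1 . b1) (bar2 . b2)))
    ;; =>  T, and a predicate mentioning a still-unbound bar3 is skipped.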
(4) constraints (a list of constraints on the variables of the class)

In addition to the finite sets of tuples which form the constraints of standard schema theory (see the last section of Chapter Two), we have implemented constraints defined as predicates and functions written in LISP. To control application of the constraints more flexibly than by the presence of the affected variables, we have added a means of ignoring a constraint under special circumstances: a constraint may be excluded by the presence of particular variables; thus each constraint includes a list of "excluded variables". For example, in the schema class for a group of strokes, the constraint applied to determine the group's label depends on how many strokes there are in the group. The set of three-stroke groups is represented by constraints which exclude the variable "stroke4":

(exclusive-tuples (s4)
  (group s1 s2 s3 c12 c23)
  ((hlr h l r 1/2-1/3 1/2-0)
   (hrOu h rO u 2/3-1/3 2/3-1/2)
   (hvh-c h v h 1/2-1/2 1-1/2)
   (hLjh-a hL j h 1-0 1/3-1/2)
   (lvh l h v 1/2-0 1/3-1/2)
   (1H1H5 1H 1H 5 1-1/3 1-1/2)))

Figure 21. A Constraint on the Group Schema Class

Figure 21 shows a constraint of type "exclusive-tuples", where s4 (a "fourth stroke") is the only excluded variable. If the s4 variable is present, hence bound to some particular stroke, this constraint is not applied. The constraint is a set of valid tuples over the variables shown on the second line: group, s1, s2, s3, c12 (a "connection" between strokes 1 and 2), and c23 (a connection between strokes 2 and 3). The labels in the "group" column of the tuples represent the various groups of Chinese strokes made of exactly three strokes and the two indicated connections between them.

The constraints are the knowledge that identifies some labels as correct while forcing others to disappear. In this system, a constraint can be a list of valid tuples over the label sets of the given variables. It can also be a predicate over them, which succeeds if and only if its arguments meet the constraint. For example,

(lowerleftof (position Ucomp) (position subcomp))

guarantees the right configuration of subcomponents by checking their positions with a simple geometrical formula called "lowerleftof" in LISP. Finally, a constraint can be a function, giving the label of one variable on the basis of the labels of the others; for example,

(length = (sum (length bar1) (length bar2)))

computes the length of a stroke instance using its subpart lengths and the LISP function "sum". The two latter constraint types are major extensions to schema labelling, to deal with real-valued labels, which could lead to tuple constraints with an infinite number of tuples. For example, some Chinese characters are distinguished only by the relative lengths of their component strokes, and some by the relative positions of their component groups; one can express these constraints on the character's label by predicates on the real numbers representing its lengths and component positions, while a list of valid (label length) or (label position) tuples would be infinite.
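Minimal LISP versions of the two non-tuple constraint types might look as follows, with positions as (x . y) dotted pairs as in the instance printouts. The geometry of lowerleftof is assumed here (the thesis does not give its definition), so this is only a plausible reconstruction.

    (defun lowerleftof (pos-a pos-b)
      "A predicate constraint: true when POS-A lies below and to the left
    of POS-B (assuming y increases upward)."
      (and (< (car pos-a) (car pos-b))
           (< (cdr pos-a) (cdr pos-b))))

    (defun stroke-length (&rest bar-lengths)
      "A function constraint: a stroke's length as the sum of its bars'."
      (apply #'+ bar-lengths))

    ;; (lowerleftof '(0.2 . 0.3) '(0.6 . 0.7))  =>  T
    ;; (stroke-length 0.3 0.49)                 =>  0.79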
(5) search control predicates (three functions for control of the hypothesis process)

The algorithm using schemas to recognize objects must decide which variable of which instance to try to bind next. This is the question of the order in which hypotheses should be made. We define three predicates to help make this decision: expand-new-component? tells whether to move control to the instance linked to, or to retain control at the instance linked from; relinquish-control? tells whether or not to relinquish control back to the instance which gave control to the current instance; and satisfied? decides whether or not to remove the instance from further attempts to add variables to it.

The control predicates are used to reduce the combinatorics of arbitrary hypothesis-making for Chinese character recognition. For example, if the program is given a list of bar instances already linked into strokes, it methodically creates, links, and completes all possible connection instances using simple control predicates — without any backtracking required. The bar class's expand-new-component? is always true for "connection" variables, so that the creation of a connection instance leads immediately to an attempt to complete it. A connection is neither satisfied? nor does it relinquish-control? until it holds two bars and the position of their intersection. As connections expand none of their constituents, control always returns to the bar which created the connection instance; thus all connections are created and made consistent (assigned their appropriate labels) before any attempt is made to complete higher-level instances which depend on connections.

Visualize an initial set of instances, linked together into a small network, and the network growing in controlled bursts at points allowed by the control predicates. A set of bars grows a shell of connection instances first, then moves on to groups, components, and finally a unique character instance at the top.
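The connection discipline just described can be sketched as follows, with an instance abbreviated to an alist of its bound variables; the accessor and class names are illustrative.

    (defun bound-p (inst var)
      "The binding of VAR in the toy instance INST, or NIL."
      (cdr (assoc var inst)))

    (defun connection-satisfied-p (conn)
      "A connection is finished only once it holds two bars and the
    position of their intersection."
      (and (bound-p conn 'bar1)
           (bound-p conn 'bar2)
           (bound-p conn 'position)))

    (defun connection-relinquish-control-p (conn)
      (connection-satisfied-p conn))        ; give control back only when done

    (defun bar-expand-new-component-p (var)
      (eq var 'connection))                 ; bars always descend into connections

    ;; (connection-satisfied-p '((bar1 . b1) (bar2 . b2)))  =>  NIL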
3.5.4 Control of the Backtracking Search

The composition process makes "hypotheses" — that is, it creates schema instances to represent objects it thinks it sees, and also variables to link them together. A "possible hypothesis" for the system in a given state is a specification that a certain instance may be linked to a certain other instance by creation of a certain variable. The latter instance may have to be newly created. The proposed variable and its binding must conform to a variable prototype of the first instance's schema class, and to all composition rules governing that variable prototype.

For typical AI problems, the set of possible sequences of hypotheses ("S"), given a set of input instances, is an enormous search space requiring vigorous intelligent pruning (Nudel, 1983). This system's algorithm is based on simple backtracking search through all of "S", and is thus guaranteed to find a consistent "solution" if one exists, at the risk of terribly long computation. The theoretical worst case remains a traversal of the entire tree of possible hypothesis sequences — at least exponential in the size of the input. Three refinements to standard backtracking made the algorithm efficient enough to recognize 57% of the tested synthetic characters within reasonable memory and time limits. The first two refinements are due to standard schema theory; the last is the major contribution of this work.

First, the use of labelsets for the schema instances coalesces groups of many hypotheses into single ones (Mackworth and Havens, 1983); for example, upon recognizing four wheels and a connecting body, one may hypothesize an "automobile" instance, and because of the labelling of the "automobile" schema class, one need not separately hypothesize a Ford Mustang, a Mercury Capri, a Toyota Corolla, and so on. For Chinese character recognition, this method was applied at all levels, so that 35 stroke types, 240 possible "groups", and over 100 components and characters were all represented by only four schema classes. Wherever a set of hypotheses might be made, the search tree has a branching factor equal to the cardinality of that set. Labelling may reduce the set to a single hypothesis, where the ambiguity of that hypothesis is reduced by a later process — that of consistency maintenance with the class's constraints. Labelling may thus reduce the branching factor of the search of "S" by a constant factor, hence exponentially reducing the full size of the search, assuming that the consistency process does not increase the search complexity.

Second, the algorithm groups as one hypothesis the variables allowed by separate composition rules of the same schema class, potentially saving another constant branching factor. This is done by assigning backtracking choicepoints to the stack of possible variable bindings rather than to the stack of composition rules, as is usually done. For example, there are several ways to group components to make a Chinese character. Here are three of them:

component1 left-of component2
component1 above component2
component1 surrounding component2

Suppose the character in question is of the "surround" type, obeying the third composition rule stated above. Backtracking through all possible hypotheses could perform the following sequence of actions:

make a variable for 'component1'
make a variable for 'component2'
test if component1 is left-of component2
FAIL (remove the variables)
make a variable for 'component1'
make a variable for 'component2'
test if component1 is above component2
FAIL (remove the variables)
make a variable for 'component1'
make a variable for 'component2'
test if component1 surrounds component2
SUCCEED

For each hypothesis, our system checks all the rules, a small finite set for each schema class, until it finds one that succeeds:

make a variable for 'component1'
make a variable for 'component2'
test if component1 is left-of component2
NO, SO TRY ANOTHER
test if component1 is above component2
NO, SO TRY ANOTHER
test if component1 surrounds component2
SUCCEED

Thus, our system does not fail to the point of removing variables even once for this example, while the simpler backtracking scheme fails twice. The difference is significant if many instances are considered for each variable binding, as the failures of each rule will be multiplied by the number of binding combinations made. The attempting of all rules for each set of bindings reduces this pathology at an acceptable cost: traversal of the fixed-size set of composition rules for a schema class.
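In LISP, trying every composition rule on one set of bindings before failing is as simple as the following toy; the geometric predicates are illustrative stand-ins for the three rules above.

    (defun first-satisfied-rule (rules comp1 comp2)
      "Return the first rule satisfied by the bound components, or NIL,
    without undoing the bindings between attempts."
      (find-if (lambda (rule) (funcall rule comp1 comp2)) rules))

    ;; Toy stand-ins, with positions as (x . y) pairs:
    (defun left-of-p (a b) (< (car a) (- (car b) 0.3)))
    (defun above-p (a b) (> (cdr a) (+ (cdr b) 0.3)))
    (defun surrounds-p (a b) (declare (ignore a b)) t)  ; always-true stand-in

    ;; (first-satisfied-rule (list #'left-of-p #'above-p #'surrounds-p)
    ;;                       '(0.5 . 0.5) '(0.5 . 0.5))
    ;; succeeds on the third rule with no variables removed.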
Third, the use of "control predicates" allows heuristic switching between top-down and bottom-up search, or between depth-first and breadth-first search. This flexibility is vital to the use of backtracking for the composition process. If expand-new-component? always returns true, the search through the space of possible hypotheses will be depth-first, always building on the latest hypothesis made; conversely, when it returns false, control remains at the same level, adding variables to the same instance or to others created at the same time, until no more can be done with them — breadth-first search. Relinquish-control? can be used to guarantee that links are all formed either top-down or bottom-up. If it returns true whenever an instance's composition state indicates that the next variable binding is "down" to a lower-level schema class, search will be consistently bottom-up. If relinquish-control? returns false whenever lower-level variable links are next on the composition list, then the search will proceed top-down, trying to complete these links. These predicates enable schema classes to determine the flow of control on the basis of the state of the instance currently enjoying it; hence the more "intelligent" one makes these predicates, the sooner the process may find the solution.

The actual behavior of the composition process depends on how well the labelsets and control predicates are used, but the experience with Chinese characters gives some evidence that we can obtain computation with a reasonable branching factor, giving time and space usage proportionate to the size of the input plus the knowledge base, after careful and extensive use and debugging of the control knowledge representation techniques described. See the following chapter for a summary of this evidence.

Chapter 4
RESULTS: Success of Implementation and Application

Under heaven all can see beauty as beauty only because there is ugliness.
All can know good as good only because there is evil.
Therefore having and not having arise together.
Difficult and easy complement each other.
Long and short contrast each other;
High and low rest upon each other;
Front and back follow one another.
Therefore the sage goes about doing nothing, teaching no-talking.
The ten thousand things rise and fall without cease,
Creating, yet not possessing.
Working, yet not taking credit.
Work is done, then forgotten.
Therefore it lasts forever.

Lao Zi, from the Dao De Jing

4.1 Ease of Coding of Chinese Character Specifications

Once the extended schema theory had been implemented, and appropriate sources of knowledge about Chinese character composition found (Rankin, 1965; Leon, 1981), basic description of the Chinese characters was straightforward. All the group labels were coded, sufficient according to Rankin to build all but a few hundred of the 50,000 characters in the lexicon. About 100 important components and characters were explicitly coded. Variables were used to link representations of physical components, physical supercomponents, metric properties, and related objects in a uniform manner — a confounding which led to convenient constraint descriptions, but sometimes to too large a number of variables for efficient processing.

One difficulty encountered using this methodology was in choosing between composition rules and labelling constraints for representing knowledge. While in many cases the processes are naturally distinct, sometimes labels affect composition rules, and composition affects the legal labellings. For example, the value attached to a "length" variable for a stroke depends on the stroke's composition, the length being the sum of the lengths of the stroke's constituent bars. Also, the name of a stroke, which is a label for the stroke schema, depends on the number of bars making up the stroke, the angles they meet at, and on their labels; thus composition rules affect the labelling process. As an example of how labels influence the application of the composition rules, note the rule that the ends of the bars making up a stroke must be labelled "free" at the stroke's ends and "bent" within the stroke; that is, the composition rule refers to the labels, which are decided by the constraints.
To prevent the backtracking search for correct hypotheses from thrashing, the schema classes had to be designed with the recognition process in mind, just as Prolog programmers normally cannot ignore the procedural aspects of their logical specifications. For example, the "fill method" indicators allowed the knowledge base to control how schema instances were generated, preventing gross inefficiency; their intelligent application became essential to use of the system. Methods could be changed, but this would require reprogramming of the character specifications, because the procedural aspects of the specifications (e.g. the ordering of rules and variables) were necessarily tuned to the general flow of process implied by the fill method indicators. For completely top-down recognition, one would have to change most of the fill method indicators and composition rules. There were choices to be made at all levels; for example, in the final system, stroke instances create group instances, while the bars which compose strokes construct connections. An alternate method is to have the connections created top-down, by the group instances which might require them. The design of a good "fill strategy" was not trivial; control knowledge was vital to the program's ability to identify characters within the bounds of the computer's memory and reasonable processing time. This made the schema class definition a programming task — not the solely descriptive task one might wish for.

Because the principal act of the recognition algorithm is the linking of one instance to another, much of the knowledge about objects had to consist of constraints on these links. Thus, unfortunately, most of the description of the locus schema class is rules for construction of joints and bar-ends from locus instances — descriptions of the links between locus and other instances. One might wish that this type of knowledge were separate from the description of the loci themselves, but that was impossible given the structure of this system.

4.2 Success of Character Recognition

Despite the described difficulties in coding the schema system, it was successful in limited tests of Chinese character recognition. The system was tested on 21 characters; it correctly identified 12 of them. The following paragraphs will discuss the correct identifications, and the reasons we have perceived for the failures.

In a first phase, schemas were defined for the complete composition hierarchy presented in Chapter Three (see Figure 7, page 24). Classes, variables, simple composition rules and constraints were written and syntactically debugged, and their parameters were tuned to produce ground-level instances which appeared reasonable. Unfortunately, after this phase, the system as a whole did not work. The program produced thousands of misguided hypotheses; the backtracking thrashed too badly for reasonable recognition to occur. Although the schema classes were correct descriptions of their domain, they were procedurally inept.

In a second phase, refinement of the schemas was begun at the bottom of the hierarchy, the level of the locus and contour segments. After a significant amount of work, reasonable stroke instances could sometimes be produced from small and simple input images. Full characters still provided too large a search space for the system to handle, apparently because of its lack of control knowledge.
In a third phase, a representative set of simple, synthetic stroke descriptions was used as input, and refinement of the schema classes was done from the stroke level up. This work was continued until several characters could be recognized within reasonable limits on processing time and memory use. None of the test characters caused a failure of the system other than computation beyond the arbitrary space and time limits; the failures seem to have been the result of a lack of control over the search space — not a lack of programming means to achieve such control. Unfortunately, these are observations of what the system actually did, not theoretical or even exhaustively tested proof of the system's procedural adequacy. What we do have here is some evidence that the control knowledge representation techniques, and our application of them, have improved the recognition process over blind backtracking, and that the backtracking allows recognition of more complex characters than non-searching systems can manage.

Now let us examine the successful results which were obtained. The following pages contain pictorial representations of the synthetic characters used as test inputs — those which the program succeeded in recognizing — with printouts of the character instances returned and a few measures of the amount of computation done. The pictures were produced by a simple graphics program which takes stroke instances as input, and draws approximations to them using circular arcs for the contours, center lines, and endpoints. The labelsets of characters include the values of the variables holding the positions and areas of the characters. Note that the constraints have reduced each instance to a unique label, the identification of the character. The "input instances" include some top-down knowledge in the form of empty group and character instances to focus the recognition process on its goal. The "number of variable additions" is a count of the number of links formed between instances, including those variables which were later removed. The "number of backtracking fails" refers to the points where the process realizes it has made a mistake and undoes a former decision to try another, possibly removing a variable. Backtracking is the principal source of inefficiency in the system, and the careful coding of constraints and composition rules is the principal influence on the number of backtracking failures.

**********************************
INSTANCE character-1
LABELSET: (character position area)
((sheng-born (0.54 . 0.55) 0.47))
VARIABLES:
'position' = (((0.54 . 0.55)))
'area' = ((0.4736))
'comp' = component-1, labelled (sheng-born)
NUMBER OF INPUT INSTANCES: 16
NUMBER OF VARIABLE ADDITIONS: 99
NUMBER OF BACKTRACKING FAILS: 7
TOTAL PROCESSING TIME: 490 seconds
MAXIMUM MEMORY USE: 3032 pages

**********************************
INSTANCE character-2
LABELSET: (character position area)
((zi-child (0.56 . 0.54) 0.5))
VARIABLES:
'position' = (((0.565 . 0.54025)))
'area' = ((0.501795))
'comp' = component-1, labelled (zi-child)
NUMBER OF INPUT INSTANCES: 12
NUMBER OF VARIABLE ADDITIONS: 53
NUMBER OF BACKTRACKING FAILS: 0
TOTAL PROCESSING TIME: 348 seconds
MAXIMUM MEMORY USE: 2381 pages
**********************************
INSTANCE character-3
LABELSET: (character position area)
((tu-earth (0.5 . 0.5) 0.63))
VARIABLES:
'position' = (((0.5 . 0.5)))
'area' = ((0.64))
'comp' = component-1, labelled (tu-earth)
NUMBER OF INPUT INSTANCES: 10
NUMBER OF VARIABLE ADDITIONS: 53
NUMBER OF BACKTRACKING FAILS: 0
TOTAL PROCESSING TIME: 249 seconds
MAXIMUM MEMORY USE: 1707 pages

**********************************
INSTANCE character-4
LABELSET: (character position area)
((hao-good (0.4 . 0.5) 0.74))
VARIABLES:
'position' = (((0.43 . 0.51)))
'area' = ((0.74))
'comp' = component-1, labelled (hao-good)
NUMBER OF INPUT INSTANCES: 19
NUMBER OF VARIABLE ADDITIONS: 131
NUMBER OF BACKTRACKING FAILS: 67
TOTAL PROCESSING TIME: 1654 seconds
MAXIMUM MEMORY USE: 3827 pages

**********************************
INSTANCE character-5
LABELSET: (character position area)
((li-kilometer (0.5 . 0.52) 0.53))
VARIABLES:
'position' = (((0.51 . 0.52)))
'area' = ((0.5304))
'comp' = component-1, labelled (li-kilometre)
NUMBER OF INPUT INSTANCES: 23
NUMBER OF VARIABLE ADDITIONS: 209
NUMBER OF BACKTRACKING FAILS: 39
TOTAL PROCESSING TIME: 1390 seconds
MAXIMUM MEMORY USE: 2947 pages

**********************************
INSTANCE character-6
LABELSET: (character position area)
((wang-king (0.52 . 0.5) 0.48))
VARIABLES:
'position' = (((0.525 . 0.5)))
'area' = ((0.4818))
'comp' = component-1, labelled (wang-king)
NUMBER OF INPUT INSTANCES: 13
NUMBER OF VARIABLE ADDITIONS: 76
NUMBER OF BACKTRACKING FAILS: 0
TOTAL PROCESSING TIME: 355 seconds
MAXIMUM MEMORY USE: 2281 pages

**********************************
INSTANCE character-7
LABELSET: (character position area)
((zhong-middle (0.54 . 0.51) 0.46))
VARIABLES:
'position' = (((0.54025 . 0.515)))
'area' = ((0.463245))
'comp' = component-1, labelled (zhong-middle)
NUMBER OF INPUT INSTANCES: 14
NUMBER OF VARIABLE ADDITIONS: 118
NUMBER OF BACKTRACKING FAILS: 0
TOTAL PROCESSING TIME: 497 seconds
MAXIMUM MEMORY USE: 3016 pages

**********************************
INSTANCE character-8
LABELSET: (character position area)
((ren-people (0.53 . 0.54) 0.41))
VARIABLES:
'position' = (((0.53 . 0.545)))
'area' = ((0.4154))
'comp' = component-1, labelled (ren-people)
NUMBER OF INPUT INSTANCES: 7
NUMBER OF VARIABLE ADDITIONS: 30
NUMBER OF BACKTRACKING FAILS: 0
TOTAL PROCESSING TIME: 193 seconds
MAXIMUM MEMORY USE: 1443 pages

**********************************
INSTANCE character-9
LABELSET: (character position area)
((shi-ten (0.5 . 0.5) 0.63))
VARIABLES:
'position' = (((0.5 . 0.5)))
'area' = ((0.64))
'comp' = component-1, labelled (shi-ten)
NUMBER OF INPUT INSTANCES: 7
NUMBER OF VARIABLE ADDITIONS: 30
NUMBER OF BACKTRACKING FAILS: 0
TOTAL PROCESSING TIME: 168 seconds
MAXIMUM MEMORY USE: 1218 pages

**********************************
INSTANCE character-10
LABELSET: (character position area)
((qian-thousand (0.5 . 0.51) 0.48))
VARIABLES:
'position' = (((0.51 . 0.515)))
'area' = ((0.486))
'comp' = component-1, labelled (qian-thousand)
NUMBER OF INPUT INSTANCES: 10
NUMBER OF VARIABLE ADDITIONS: 53
NUMBER OF BACKTRACKING FAILS: 0
TOTAL PROCESSING TIME: 254 seconds
MAXIMUM MEMORY USE: 1756 pages

**********************************
INSTANCE character-11
LABELSET: (character position area)
((tai-too (0.62 . 0.55) 0.43))
VARIABLES:
'position' = (((0.625 . 0.55)))
'area' = ((0.438))
'comp' = component-4, labelled (tai-too)
NUMBER OF INPUT INSTANCES: 13
NUMBER OF VARIABLE ADDITIONS: 33
NUMBER OF BACKTRACKING FAILS: 11
TOTAL PROCESSING TIME: 720 seconds
MAXIMUM MEMORY USE: 6000 pages
**********************************
INSTANCE character-12
LABELSET: (character position area)
((yin-yin (0.45 . 0.5) 0.46))
VARIABLES:
'position' = (((0.45025 . 0.50025)))
'area' = ((0.46268025))
'comp' = component-8, labelled (yin-yin)
NUMBER OF INPUT INSTANCES: 23
NUMBER OF VARIABLE ADDITIONS: 95
NUMBER OF BACKTRACKING FAILS: 50
TOTAL PROCESSING TIME: 1895 seconds
MAXIMUM MEMORY USE: 3359 pages

As mentioned above, the classes and their rules were refined until some success was achieved within reasonable limits on time and space. The change in the recognition time as the knowledge base matured was dramatic, but much work remains before the system might efficiently handle the whole lexicon, especially from raw pixel input data. For example, a careful reordering of variable specifications in a schema class often eliminated the creation of thousands of erroneous instances and variable links during one recognition attempt, cutting recognition time by hours. However, the next character attempted might cycle for hours without success, revealing yet another inefficiency. Just as the search space is huge, the variety of applicable control knowledge seems to be huge as well. The correctly identified characters used computer time and space in roughly linear proportion to the size of their input, that is, the number of input instances (see Figure 22). Generally, the maintenance of the complex network of instances, with all its labels and their constraints, is expensive in time and space, but seems from the admittedly small test space to be at least of good computational order when enough intelligence is coded to control the search for good variable bindings.

Input      Character   CPU Time Used   Maximum Memory Usage
Instances  Number      (seconds)       (512 Kbyte pages)
 7          9            168            1218
 7          8            193            1443
10          3            249            1707
10         10            254            1756
12          2            348            2381
13          6            355            2281
13         11            720            6000
14          7            497            3016
16          1            490            3032
19          4           1654            3827
23          5           1390            2947
23         12           1895            3359

Correlation for CPU Time vs. Input Variables: 0.74
Correlation for Memory vs. Input Variables: 0.87 (worst case, Character #11, excluded)

Figure 22. Linearity of Time and Space versus Input Variables

The complexity of Chinese characters, when described as completely as the schema theory demands, makes the coding of the specifications a large task, and more specialized hardware for the algorithms a necessity if the system is to compete with commercial ones. The system's power does emerge in the completeness of the descriptions it achieves for the examples chosen: the character instances, aside from their final unique label of identification, embody the character's complete and natural decomposition into components, groups, strokes, and so on down to image features. Should the average stroke length or maximum curvature be required, they are obtainable from the character's schema instance and its components. The recognition is an understanding of the structural nature of the characters rather than a simple identification of an element of a set of visual patterns.

Chapter 5
CONCLUSION

Heaven and earth are ruthless;
They see the ten thousand things as dummies.
The wise are ruthless;
They see the people as dummies.
The space between heaven and earth is like a bellows.
The shape changes but not the form;
The more it moves, the more it yields.
More words count less.
Hold fast to the center.

Lao Zi, from the Dao De Jing

5.1 Summary

A recognition machine for some hand-printed Chinese characters was written using schema labelling techniques. It correctly identified 12 out of the 21 characters attempted, using synthetic stroke-level instances as input.
In the other 9 cases, execution was aborted because it exceeded reasonable memory or processing-time limits. No characters were actually misrecognized. A subsystem for producing lower-level schema instances directly from grey-scale pixel data was designed, implemented, and tested, though sufficient control knowledge to enable searching through the full set of ambiguities was coded only for the higher levels of the composition hierarchy.

The definition of schemas was extended to include more complex labelsets, and different types of labelling constraints to deal with potentially infinite variable domains. These extensions to schema theory make it more powerful for describing and recognizing complex objects, specifically hand-printed Chinese characters. The mechanism of variable links allows the intuitive flavour of representation by semantic networks, such as the composition hierarchy, while Havens' schema labelling techniques maintain the necessary efficiency of the consistency process.

Control of the search through the large hypothesis space, for the composition process, was extended by the application of "control predicates", the "fill methods" of variables, and the "composition predicates". Without such guidance, the backtracking recognition process exploded in time and space. With them, it succeeded for some synthetic stroke-level input, in time and space manageable by a minicomputer.

Recognition machines for Chinese characters with natural knowledge of their composition, allowing for natural variations in their appearance, may recognize more complex characters than conventional pattern recognition techniques can. Efficient systems which plough very well through thousands of typed glyphs usually fail when applied to handwritten text. Humans identify Chinese characters by recognition of their composition: a character is its components, which are their subcomponents, which are their strokes, which are their endpoints and sides and connections. Character recognizers which represent these levels explicitly are more able to search through the possible compositions, as suggested by noisy data, to find consistent interpretations.

5.2 Suggested Future Work

Thirty spokes share the wheel's hub;
It is the center hole that makes it useful.
Shape clay into a vessel;
It is the space within that makes it useful.
Cut doors and windows for a room;
It is the holes which make it useful.
Therefore profit comes from what is there;
Usefulness from what is not there.

Lao Zi, from the Dao De Jing

Schema theory, as discussed here, lacks a method of learning new schema classes: all class definitions must be provided directly. An algorithm generating schema classes from example instances would be a valuable extension. Problems would arise in "learning" the predicates and functions which fill the composition rules. Use of these arbitrary LISP functions should be changed to something more easily learned; for example, they might be composed from a library of precompiled functions.

Many subprograms of the system were not refined for efficiency: one could add statistical matching to rank hypotheses, tighten some of the LISP code to use less memory and do less garbage collection, do the low-level image processing and segmentation in specialized hardware or an array processor, and port the recognition code to a fast LISP machine.
Given these improvements and further coding of characters, the program might compete commercially with those that do not deal with ambiguity, by actually searching the space of possible hypotheses, and hence recognizing more complex, ambiguous characters.

While some nonsensical input can be tolerated by the algorithm, by building satisfactory recognition on the good instances, a lack of necessary input instances causes the process to fail. More knowledge of when appropriate instances might be created, upon suggestion by context, would be a valuable extension to the system.

Finally, there are complexities of handwriting which are not even attempted by this system, such as continuous script, extreme rotation or stretching of character components, and so on (see Figure 23).

Figure 23. Erratic Script Style of Chinese Writing (from Wang, 1970)

Better stop short than fill to the brim.
Oversharpen the blade, and the edge will soon blunt.
Amass a store of gold and jade, and no one can protect it.
Claim wealth and titles, and disaster will follow.
Retire when the work is done.
This is the way of heaven.

Lao Zi, from the Dao De Jing

Bibliography

Bartlett, F.C., 1932, Remembering, (Cambridge University Press, Cambridge, 1932)

Beijing Foreign Language Institute, 1978, The Chinese-English Dictionary (Han-Ying Cidian), (Commercial Press, Hong Kong, 1979)

Beijing Language Institute, 1975, "Table of Chinese Character Frequency", (BeiJing JinHua YinShangChang, Beijing, 1975)

Brachman, Ronald J., AT&T Bell Labs, 1985, "'I Lied about the Trees' Or, Defaults and Definitions in Knowledge Representation", The AI Magazine, volume 6 number 3, (AAAI, Menlo Park CA, Fall 1985) pp 80-93

Brady, J.M. and Wielinga, B.J., University of Essex, 1978, "Reading the Writing on the Wall", Computer Vision Systems, (Academic Press, 1978) pp 283-297

Brooks, Rodney A. et al., Stanford, 1979, "The ACRONYM Model-Based Vision System", IJCAI-79

Brooks, Rodney A., Stanford AI Laboratory, 1981, "Symbolic Reasoning Among 3-D Models and 2-D Images", in Artificial Intelligence 17, special volume on computer vision, (North-Holland Publishing Company, Amsterdam, 1981) pp 285-348

Brown, M.K. and Ganapathy, S., Bell Labs, 1983, "Preprocessing techniques for cursive script word recognition", Pattern Recognition, volume 16 number 5, (Pergamon Press, Oxford, 1983) pp 447-458

Casey, R. and Nagy, G., IBM Watson Research Center, 1966, "Recognition of Printed Chinese Characters", Transactions on Electronic Computers, volume EC-15 number 1, (IEEE, 1966) pp 91-101

Chen Heqin, Society for the Advancement of Chinese Education, 1939, Modern Chinese Vocabulary, (Commercial Press, 1939)

Chui, T., University of British Columbia, 1976, "Real-Time Computer Recognition of Chinese Characters", Electrical Engineering Master's thesis, (UBC, 1976)

Cohen, Paul R. and Feigenbaum, Edward A., editors, Stanford, 1982, The Handbook of Artificial Intelligence, volume 3, (Heuristic Press, Stanford, 1982)

Davis, Larry and Henderson, Thomas, 1979, "Hierarchical Constraint Processes for Shape Analysis", University of Texas Technical Report 115 (1979)

Davis, Larry and Rosenfeld, Azriel, 1977, "Hierarchical Relaxation for Waveform Parsing", University of Maryland Technical Report 568 (1977)
University of Southern California, 1981, "The Design and an Example Use of Hearsay-III", IJCAI-81

Freuder, Eugene C., 1976, "A Computer System for Visual Recognition using Active Knowledge", MIT AI Lab Technical Report 345 (1976)

Freuder, Eugene C., University of New Hampshire, 1978, "Synthesizing Constraint Expressions", Communications of the ACM, volume 21 number 11, (ACM, November 1978) pp 958-966

Glicksman, Jay, University of British Columbia, 1982, "A Cooperative Scheme for Image Understanding using Multiple Sources of Information", Computer Science PhD thesis (UBC, 1982)

Goldberg, A. and Robson, D., 1983, Smalltalk-80: The Language and its Implementation, (Addison-Wesley, Reading, Massachusetts, 1983)

Gregg, James R. and Heath, Gordon, 1964, The Eye and Sight, (D.C. Heath, Boston, 1964)

Hanson, A.R., Hampshire College, 1978, "Visions: A Computer System for Interpreting Scenes", Computer Vision Systems, (Academic Press, 1978) pp 303-334

Havens, William S., UBC, 1978, "A Procedural Model of Recognition for Machine Perception", Computer Science PhD thesis (UBC, 1978)

Havens, William S., Carter, Alan, and Kingdon, Stewart, UBC, 1982, "Standard Image Files: Generic Operations for Image Datatypes", UBC TR-82-10 (UBC, Vancouver, 1982)

Havens, William S., University of British Columbia, 1983, "Recognition Mechanisms for Schema-Based Knowledge Representations", Computers and Mathematics with Applications, volume 9 number 1, (Pergamon Press, Great Britain, 1983) pp 185-199

Havens, William S., University of British Columbia, 1985, "A Theory of Schema Labelling", Computational Intelligence, volume 1 number 3, (National Research Council, Canada, 1985) pp 127-139

Havens, William S., University of British Columbia, 1986, "PIPS: A Portable Image Processing System", IEEE Computer Society Workshop on Visual Languages, Dallas, Texas (June, 1986)

Hayes, Kenneth Jr., University of Maryland, 1980, "Reading Handwritten Words Using Hierarchical Relaxation", Computer Graphics and Image Processing 14, (Academic Press, New York, 1980) pp 344-364

Head, Henry, Oxford University, 1920, Studies in Neurology, (Oxford, 1920)

Hildreth, Ellen C., MIT, 1980, "Implementation of a Theory of Edge Detection", MIT-AI-TR-579 (MIT, 1980)

IBM Survey Report, 1962, "Survey of the need for language translation", Planning Research Corporation, RC-634, March 12 (IBM, 1962), quoted in (Casey and Nagy, 1966)

Jaynes, Julian, 1976, The Origin of Consciousness in the Breakdown of the Bicameral Mind, (Houghton Mifflin Company, Boston, 1976) see pp 63-66

King, G. and Chang, H.W., 1963, "Machine Translation of Chinese", Scientific American, June 1963, volume 208 number 6, (New York, 1963)

Lao Zi, about 600 BC, Dao De Jing ("The Way-of-Morality Classic"), Feng Gia-fu and Jane English, translators, (Random House, New York, 1972)

Leon, N.H., 1981, Character Indexes of Modern Chinese, (Curzon, London, 1981)

Mackworth, Alan K., UBC, 1975, "Consistency in Networks of Relations", TR75-3 (UBC, Vancouver, July 1975)

Mackworth, Alan K., UBC, 1977, "On Reading Sketch Maps", IJCAI-77, (CMU, Pittsburgh, 1977) pp 598-606

Mackworth, Alan K., UBC, 1978, "Vision Research Strategy: Black Magic, Metaphors, Mechanisms, Miniworlds and Maps", Computer Vision Systems, edited by Hanson and Riseman, (Academic Press, New York, 1978) pp 53-59

Mackworth, Alan K. and Havens, William S., UBC, 1983, "Representing Knowledge of the Visual World", Computer, volume 16 number 10, (IEEE, October 1983) pp 90-96

Mackworth, Alan K.
and Mokhtarian, Farzin, UBC, 1984, "Scale-Based Descriptions of Planar Curves", UBC CPSC TR-84-1 (UBC, Vancouver, 1984)

Mackworth, Alan K. and Freuder, Eugene C., 1985, "The Complexity of Some Polynomial Network Consistency Algorithms for Constraint Satisfaction Problems", Artificial Intelligence, volume 25 number 1, (North-Holland Publishers, Amsterdam, January 1985) pp 65-74

Marr, David, and Hildreth, Ellen, MIT, 1979, "Theory of Edge Detection", MIT AI Lab Memo 518 (April, 1979)

Marr, David, 1982, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, (W.H. Freeman, San Francisco, 1982)

Mermelstein and Eden, 1964, "Experiments on Computer Recognition of Connected Handwritten Words"

Miller, J., 1981, "Global Precedence in Attention and Decision", Journal of Experimental Psychology: Human Perception and Performance, volume 7, (1981) pp 1161-1174

Minsky, Marvin, MIT, 1975, "A Framework for Representing Knowledge", The Psychology of Computer Vision, edited by Patrick Winston, (McGraw-Hill, New York, 1975)

Montanari, U., 1974, "Networks of constraints: Fundamental properties and applications to picture processing", Information Sciences 7, number 2, (April, 1974) pp 95-132

Mori, Shunji, Yamamoto, Kazuhiko, and Yasuda, Michio, 1984, "Research on Machine Recognition of Handprinted Chinese", Pattern Analysis and Machine Intelligence, volume 6 number 4, (IEEE, New York, July 1984) pp 386-405

Mulder, Jan, UBC, 1985, "Representing Ambiguity and Hypotheticality in Visual Knowledge", PhD thesis (UBC, Vancouver, 1985)

Mylopoulos, John et al., University of Toronto, 1983, "Building Knowledge-Based Systems: The PSN Experience", Computer, volume 16 number 10, (IEEE, October 1983) pp 83-89

Nakato, Y. et al., Hitachi Labs, 1973, "Improvement of Chinese Character Recognition using Projection Profiles", Pattern Recognition I, proceedings, (IEEE, New York, 1973) pp 172-178

Nudel, Bernard, Rutgers University, 1983, "Consistent-Labelling Problems and their Algorithms: Expected Complexities and Theory-Based Heuristics", Artificial Intelligence, volume 21 (North-Holland, Amsterdam, 1983) pp 135-178

Ogawa, H. and Taniguchi, K., Fukui University, 1982, "Thinning and Stroke Segmentation for Handwritten Chinese Character Recognition", Pattern Recognition, volume 15 number 4, (Pergamon Press, Great Britain, 1982) pp 299-308

Rankin, Bunyan Kirk III, 1965, "A Linguistic Study of the Formation of Chinese Characters", PhD linguistics thesis, (University of Pennsylvania, 1965)

Reddy, D. Raj et al., 1973, "The Hearsay Speech Understanding System: an example of the recognition process", IJCAI-3, (IJCAI, 1973) pp 185-193

Riggs, L., 1965, "Visual Acuity", Vision and Visual Perception, C. Graham, editor, (Wiley, New York, 1965)

Roberts, L., MIT, 1963, "Machine Perception of Three-Dimensional Solids", Lincoln Laboratory TR 315 (MIT, May 1963)

Rosenfeld, Azriel, Hummel, R.A., and Zucker, S.W., 1976, "Scene Labelling by Relaxation Operators", IEEE Transactions on Systems, Man, and Cybernetics, SMC-6 (IEEE, New York, 1976) pp 420-433

Rubin, Steven M., Bell Labs, 1981, "A Text Recognition System with Generalized Context", Canadian Information Processing, volume 57 (1981)

Sakai, Kunio et al., Toshiba R + D Center, "An Optical Chinese Character Reader"

Schank, Roger C.
and Abelson, R.P., 1977, Scripts, Plans, Goals, and Understanding, (Lawrence Erlbaum Associates, Hillsdale NJ, 1977)

Seidel, R., Cornell University, 1984, "On the Complexity to Achieve K-Consistency", Cornell Computer Science Department, (unpublished technical report)

Stallings, William, 1972, "Recognition of Printed Chinese Characters by Automatic Pattern Analysis", Computer Graphics and Image Processing, (Academic Press, 1972) pp 47-65

Stallings, William, Arlington Center for Naval Analyses, 1976, "Approaches to Chinese Character Recognition", Pattern Recognition, volume 8, (Pergamon Press, Great Britain, 1976) pp 87-98

Suen, C.Y., Berthod, M., and Mori, S., 1980, "Automatic Recognition of Handprinted Characters - the state of the art", IEEE Proceedings, volume 68 number 4, (IEEE, New York, 1980) p 469

Trotter, Robert J., 1981, "Computers with Character", Science News, volume 120, July 1981, pp 26-28

Tsotsos, John K., University of Toronto, 1984, "Representational Axes and Temporal Cooperative Processes", RCBV-TR-84-2, Computer Science Department, (U of T, Toronto, April 1984)

Ullman, J., 1982, "Advances in Character Recognition", Applications of Pattern Recognition, K.S. Fu, editor, (CRC Press, 1982)

Waltz, David L., MIT, 1972, "Generating Semantic Descriptions of Scenes with Shadows", MIT-AI-TR-271 (MIT, Cambridge, 1972)

Wang Fangyu, Yale University, 1970, Introduction to Chinese Cursive Script, (Yale, 1970)

Winograd, Terry, 1975, "Frame Representations and the Procedural-Declarative Controversy", in Representation and Understanding, edited by D.G. Bobrow and A. Collins, (Academic Press, New York, 1975) pp 185-210

Wohn Kwangyoen, University of Wisconsin, 1980, "On Recognizing Korean Characters", (Computer Science Department, University of Wisconsin, unpublished draft)

Yamamoto, E. et al., Fujitsu, 1981, "Handwritten Kanji Character Recognition using the Features Extracted from Multiple Standpoints", Pattern Recognition and Image Processing, (IEEE, 1981) pp 131-136

Yuen, C.K., University of Hong Kong, 1983, "Dual-stroke method for Chinese character input", Centre of Computer Studies and Applications TR-A6-83, (Hong Kong, 1983)

Zucker, Steven, McGill University, 1978, "Vertical and Horizontal Processes in Low Level Vision", in Computer Vision Systems, Hanson and Riseman, editors, (Academic Press, 1978) pp 187-195

APPENDIX A
Knowledge Representation Code

This system was implemented as a set of LISP functions which allow definition and creation of schemas, and which execute the recognition algorithm presented in Chapter Three. This appendix describes the data structures which implement the schema classes and instances.

A schema class is a LISP atom with ten attributes saved on its property list: name, labels, instances, variable types, external variables, composition rules, constraint prototypes, and the three control predicates expand-component?, relinquish-control?, and satisfied?. The name is a simple LISP string; the labels are a list of LISP strings; the instances are a list of instances as described later.
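As a minimal sketch only (this is not the thesis's MKCLASS, and the property-list key names are illustrative assumptions), such a class might be laid out on a property list as follows:

(defun make-schema-class-sketch (name labels variable-types external-variables
                                 composition-rules constraint-prototypes
                                 expand-component? relinquish-control? satisfied?)
  ;; Store the ten attributes of a schema class on the property
  ;; list of a LISP atom named after the class.
  (let ((class (intern (string-upcase name))))
    (setf (get class 'name) name)                               ; a LISP string
    (setf (get class 'labels) labels)                           ; a list of strings
    (setf (get class 'instances) nil)                           ; grows during recognition
    (setf (get class 'variable-types) variable-types)
    (setf (get class 'external-variables) external-variables)
    (setf (get class 'composition-rules) composition-rules)
    (setf (get class 'constraint-prototypes) constraint-prototypes)
    (setf (get class 'expand-component?) expand-component?)     ; the three
    (setf (get class 'relinquish-control?) relinquish-control?) ; control
    (setf (get class 'satisfied?) satisfied?)                   ; predicates
    class))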
Each element of the list of variable prototypes is an atom with five things on its property list: a name for the variable class; the schema class it may contain; a "fill method" indicator specifying how the variable may be instantiated (as the result of a composition function, by creation of a schema instance to fill it, by scanning existing schema instances for one that works, which is the default, by the low-level input-creation routines, or as the inverse of another variable link, which may be ignored); a list of the constraint prototypes covering the variable; and another list of just those constraint prototypes of type "tuple", defined below.

The external variables of a schema class are a list of those variables of instances of the schema class which must be accessible to other instances, for satisfaction of composition and constraint predicates and for calculation of computed variables.

Each element of the list of composition rules comprises a list of variable prototypes forming a possibly valid composition, together with a list of predicates defining the validity of the composition. Each predicate is represented by a list of the variable prototypes providing the schema instances whose labelsets are the arguments for the predicate, and the name of a LISP function embodying the predicate itself. As many composition constraints depend on the identity of instances and on their order of linking, three special predicates ('eq' for equality, 'neq' for inequality, and 'prec' for precedence of linking) operate, for efficiency, on the instances themselves rather than on their labelsets.

Each element of the list of constraint prototypes of a schema class is of the type "predicate" or "function", as "tuple" constraints are compiled directly into constraint nodes linked to the variable prototypes they cover. Each non-tuple constraint prototype is an atom with four things on its property list: the type of the constraint (predicate or function), a list of the variables it covers, a list of the "excluded" variables which would make the constraint inapplicable, and the name of a LISP function embodying the constraint.
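As a minimal sketch (again not the thesis code), the three special composition predicates might be tested directly on instances as shown below; the linking-order argument, a list of instances in the order they were linked, is an assumption for illustration:

(defun check-special-predicate (op inst1 inst2 linking-order)
  ;; 'eq, 'neq, and 'prec compare schema instances themselves,
  ;; bypassing their labelsets.
  (case op
    (eq   (eq inst1 inst2))               ; the same instance fills both roles
    (neq  (not (eq inst1 inst2)))         ; two distinct instances
    (prec (and (member inst2              ; inst1 was linked before inst2
                       (cdr (member inst1 linking-order)))
               t))
    (t (error "unknown special predicate: ~a" op))))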
Figure 24 is an example of the definition of a schema class, drawn from the Chinese character recognition application.

Figure 24. Printout of the Stroke Schema Class

(mkclass 'stroke
  '(h v l d r o s u 6 7 3 4 5 h3 hL hV j l1 lH lR r0 v1 v7 vH
    hD1 hL1 hLC hR0 hV1 hV7 hVH vH0 hVH0 lHL1 vHL1)
  '((bar1 bar) (bar2 bar) (bar3 bar) (bar4 bar)
    (length number compute) (space number-list compute)
    (conns connection ignore) (group group create))
  '(length space)
  '(((bar1 bar2 bar3 bar4 length space group)
     ((free (end1 bar1))
      (eq (end2 bar1) (end1 bar2))
      (eq (end2 bar2) (end1 bar3))
      (eq (end2 bar3) (end1 bar4))
      (free (end2 bar4))
      (prec bar1 bar2)
      (prec bar2 bar3)
      (prec bar3 bar4)
      (neq bar1 bar2)
      (neq bar2 bar3)
      (neq bar3 bar4)
      (neq bar1 bar3)
      (neq bar1 bar4)
      (neq bar2 bar4))))
  '((exclusive-predicate (stroke bar1) (bar2) eq)
    (exclusive-tuples (bar3) (stroke bar1 bar2)
      ((vH v h) (lR l r) (lH l h) (hL h l) (hV h v) (h3 h 3)
       (r0 r 0) (v7 v 7) (j v 1) (v1 v 1) (l1 l 1)))
    (exclusive-tuples (bar4) (stroke bar1 bar2 bar3)
      ((hD1 h d 1) (hL1 h l 1) (hLC h l d) (hR0 h r 0)
       (hV1 h v 1) (hV7 h v 7) (hVH h v h) (vH0 v h 0)))
    (tuples (stroke bar1 bar2 bar3 bar4)
      ((hVH0 h v h 0) (lHL1 l h l 1) (vHL1 v h l 1)))
    (function length sum ((length bar1) (length bar2) (length bar3) (length bar4)))
    (exclusive-function length sum ((length bar1) (length bar2) (length bar3)) (bar4))
    (exclusive-function length sum ((length bar1) (length bar2)) (bar3))
    (exclusive-function length quote ((length bar1)) (bar2))
    (function space merge-space ((space bar1) (space bar2) (space bar3) (space bar4)))
    (exclusive-function space merge-space ((space bar1) (space bar2) (space bar3)) (bar4))
    (exclusive-function space merge-space ((space bar1) (space bar2)) (bar3))
    (exclusive-function space quote ((space bar1)) (bar2)))
  'never2
  'always1
  'always1)

A schema instance is a LISP atom with four things on its property list: its class, variables, composition state, and labelset. The class is a structure as defined above. Each of the variables is an atom with four things on its own property list: the variable's prototype; the instance contained by the variable; that instance's tuples; and the constraints governing the variable. Some of the constraints are compiled ones, kept in the schema's class; some, in the form of predicates and functions, are created after the variable is added to the instance.

The composition state of a schema instance is a LISP cell with two pointers: one to a list of variable classes, indicating the next starting point for a search to find a new variable to add to the instance, and one to a list of instances of the schema class required by the first variable class in the former list, indicating the next starting point for a search for an instance to fill that variable. If the instance list is empty, the variable is to be filled by computation or by instance creation.

The labelset of an instance is a list of tuples, each tuple being a list of the first elements of tuples from the external variables, consistent with all the currently applicable constraints.
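As a minimal sketch (not the thesis code), the two-pointer composition state described above might be represented and stepped through as follows; the property-list keys and helper names are illustrative assumptions consistent with the earlier sketch:

(defun instances-of (class)
  ;; Existing instances of a schema class, assumed stored under an
  ;; 'instances property as in the earlier sketch.
  (get class 'instances))

(defun contained-class (variable-class)
  ;; The schema class a variable of this class may contain, assumed
  ;; stored on the variable prototype's property list.
  (get variable-class 'contained-class))

(defun make-composition-state (variable-classes)
  ;; CAR: variable classes still to be tried;
  ;; CDR: candidate instances for the first of those classes.
  (cons variable-classes nil))

(defun advance-variable (state)
  ;; Move on to the next variable class and point at the existing
  ;; instances of the schema class it requires. An empty candidate
  ;; list means the variable is to be filled by computation or by
  ;; creating a fresh instance.
  (rplaca state (cdr (car state)))
  (rplacd state (and (car state)
                     (instances-of (contained-class (car (car state))))))
  state)

(defun next-candidate (state)
  ;; Pop the next candidate instance for the current variable.
  (let ((candidate (car (cdr state))))
    (rplacd state (cdr (cdr state)))
    candidate))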
Figure 25 is a sample printout of an instance produced during recognition of a Chinese character.

Figure 25. Printout of a Sample Instance

********** INSTANCE contour-segment0246
LABELSET: (contour-segment length curvature orientation)
((5 0.06743652641427311 3.2242462 266.8132280669122)
 (4 0.06743652641427311 3.2242462 266.8132280669122)
 (3 0.06743652641427311 3.2242462 266.8132280669122))
NEXT VARIABLE TO TRY FILLING: variable 'cendpoint' holding locus0228
VARIABLES:
'length' = ((0.06743652641427311))
'curvature' = ((3.2242462))
'cendpoint' = locus0232, labelled (max)
'ccendpoint' = locus0228, labelled (max)
'orientation' = ((266.8132280669122))
'barside' = bar-side0258, labelled (3 4 5)
CONSTRAINT LATTICE:
predicate0249
  variables: length contour-segment
  tuples: ((0.06743652641427311 5) (0.06743652641427311 4)
           (0.06743652641427311 3) (0.06743652641427311 2)
           (0.06743652641427311 1) (0.06743652641427311 0)
           (0.06743652641427311 7) (0.06743652641427311 6))
predicate0251
  variables: curvature contour-segment
  tuples: ((3.2242462 5) (3.2242462 4) (3.2242462 3) (3.2242462 2)
           (3.2242462 1) (3.2242462 0) (3.2242462 7) (3.2242462 6)
           (3.2242462 r) (3.2242462 v) (3.2242462 l) (3.2242462 ih)
           (3.2242462 io) (3.2242462 iv) (3.2242462 u) (3.2242462 h))
predicate0257
  variables: orientation contour-segment
  tuples: ((266.8132280669122 5) (266.8132280669122 4)
           (266.8132280669122 3) (266.8132280669122 o)
           (266.8132280669122 r) (266.8132280669122 v)
           (266.8132280669122 l))

Notice that some of the variables are filled with normal schema instances, and some with numerical values computed from other variables. The 'cendpoint' and 'ccendpoint' variables hold the two ends of the contour segment; the 'barside' variable holds the bar-side which starts with this contour segment. Notice also that only three labels, of the 18 possible for such segments, have survived the three constraints applied. The first constraint is a predicate disallowing overly short strokes and overly long "dots". The second distinguishes strokes bent to the left from those bent to the right, a distinction significant in Chinese characters, much as the distinction between a hand-printed 'q' and a hand-printed 'g' is in English. The third constraint filters labels according to the segment's orientation on the page, distinguishing, for example, horizontal from vertical.

The recognition machine is one LISP function and its many subfunctions. RECOGNIZE takes as input a list of instances, and returns an expanded list of instances when it can no longer create any variables consistently. The user of the machine must define the schema classes representing the system's knowledge; the input instances provide the links into this network of knowledge. For this purpose, the functions MKCLASS and MKINSTANCE take simple descriptions of classes and instances as input and produce the representations above, including the compilation of tuple constraints and the various cross-links between data structures. These latter functions required some 1000 lines of commented LISP code; the composition and constraining algorithms required about another 1000 lines.

APPENDIX B
Printouts of Sample Instances

Here is a printout of the top-level instances produced by the recognition of a character called TAI. Each instance appears in this printout as a title line (e.g. ********** <name-of-instance>) followed by either two or three sections presenting the internal structure of the instance at the time of the printout, which in this case is after recognition is complete.
The first section is called LABELSET: it first lists the variables over which the labelset is constructed, and then enumerates the full set of tuples of values for those variables. As described in Chapter Three, the labelset is over a special variable holding the instance itself, and a list of "external variables" specified in the schema class. For example, in the CHARACTER-1 instance below, the external variables are "position", holding only one value, the cartesian coordinates (0.62 . 0.55), and "area", holding the number 0.43. Where there is only one value for each external variable, as in COMPONENT-1 below, these values are printed for the first label-tuple only.

The second section of each instance printout is the full set of variables currently held by the instance, including the external variables. Each variable's name is listed to the left of a representation of its value: a list of the labels for the instance held by the variable. For example, in the COMPONENT-1 instance below, one can see that the "group" variable holds the instance GROUP-1, which currently has the symbols '4' and '5' as labels.

Finally, where there are labelling constraints applicable to the instance, as for COMPONENT-2 below, the constraints are listed. Each constraint is presented as a list of the variables it covers, and a list of the tuples of values for these variables which were allowed by the constraint, according to its prototype in the instance's schema class.

********** INSTANCE character-1
LABELSET: (character position area)
((tai-too (0.62 . 0.55) 0.43))
VARIABLES:
'position' = (((0.625 . 0.55)))
'area' = ((0.438))
'comp' = component-4, labelled (tai-too)

********** INSTANCE component-1
LABELSET: (component position area)
((4 (0.5 . 0.3) 0.0) (5))
VARIABLES:
'group' = group-1, labelled (4 5)
'position' = (((0.5 . 0.305)))
'area' = ((0.002))
'supercomp' = component-2, labelled (tai-too)

********** INSTANCE component-2
LABELSET: (component position area)
((tai-too (0.62 . 0.55) 0.43))
VARIABLES:
'subcomp' = component-1, labelled (4 5)
'ocomp' = component-3, labelled (big-rad)
'position' = (((0.625 . 0.55)))
'area' = ((0.438))
'supercomp' = component-4, labelled (tai-too)
CONSTRAINTS:
variables: ocomp subcomp component
tuples: ((enclosure-rad jade-rad guo-country)
         (cheng-become h-box han-phon)
         (hVld two-rad moon-rad)
         (big-rad 5 tai-too))

********** INSTANCE component-3
LABELSET: (component position area)
((big-rad (0.62 . 0.55) 0.43))
VARIABLES:
'group' = group-2, labelled (hlr)
'position' = (((0.625 . 0.55)))
'area' = ((0.438))
'supercomp' = component-2, labelled (tai-too)

********** INSTANCE component-4
LABELSET: (component position area)
((tai-too (0.62 . 0.55) 0.43))
VARIABLES:
'subcomp' = component-2, labelled (tai-too)
'position' = (((0.625 . 0.55)))
'area' = ((0.438))
'char' = character-1, labelled (tai-too)

********** INSTANCE group-1
LABELSET: (group position area)
((4 (0.5 . 0.3) 0.0) (5))
VARIABLES:
'position' = (((0.5 . 0.305)))
'area' = ((0.002))
'space' = (((0.33 0.48 0.28 0.52)))
's1' = stroke-1, labelled (6 4 5)
'comp' = component-1, labelled (4 5)

********** INSTANCE group-2
LABELSET: (group position area)
((hlr (0.62 . 0.55) 0.43))
VARIABLES:
'position' = (((0.625 . 0.55)))
'area' = ((0.438))
'space' = (((0.85 0.26 0.25 0.99)))
's1' = stroke-2, labelled (h u)
's2' = stroke-3, labelled (v l d)
's3' = stroke-4, labelled (r s)
'c12' = connection-1, labelled (|1/3-1/3| |1/3-1/2| |1/2-1/3| |1/2-1/2|)
'c23' = connection-3, labelled (|1/3-0| |1/2-0| |2/3-0|)
'comp' = component-3, labelled (big-rad)
CONSTRAINTS:
variables: group s1 s2 s3 c12 c23
tuples: ((hlr h l r |1/2-1/3| |1/2-0|)
         (hr0u h r0 u |2/3-1/3| |2/3-1/2|)
         (hvh-ch h v h |1/2-1/2| |1-1/2|)
         (hLjh-a hL j h |1-0| |1/3-1/2|)
         (lvh l v h |1/2-0| |1/3-1/2|)
         (lHlH5 lH lH 5 |1-1/3| |1-1/2|))

********** INSTANCE connection-1
LABELSET: (connection)
((|1/3-1/3|) (|1/3-1/2|) (|1/2-1/3|) (|1/2-1/2|))
VARIABLES:
'bar1' = bar-2, labelled (h u)
'frac1' = ((0.3802262666010821))
'frac2' = ((0.3851451057550418))
'intersection' = ((((0.53756517461879 . 0.6266158386620757) 0.3802262666010821 0.3851451057550418)))
'bar2' = bar-3, labelled (v l d)
'stroke1' = stroke-2, labelled (h u)
'stroke2' = stroke-3, labelled (v l d)
'position' = (((0.53756517461879 . 0.6266158386620757)))
'group' = group-2, labelled (hlr)

********** INSTANCE connection-2
LABELSET: (connection)
((|1/3-1/3|) (|1/3-1/2|) (|1/2-1/3|) (|1/2-1/2|))
VARIABLES:
'bar2' = bar-2, labelled (h u)
'frac1' = ((0.3851451057550418))
'frac2' = ((0.3802262666010821))
'intersection' = ((((0.53756517461879 . 0.6266158386620757) 0.3851451057550418 0.3802262666010821)))
'bar1' = bar-3, labelled (v l d)
'stroke1' = stroke-3, labelled (v l d)
'stroke2' = stroke-2, labelled (h u)
'position' = (((0.53756517461879 . 0.6266158386620757)))

********** INSTANCE connection-3
LABELSET: (connection)
((|1/3-0|) (|1/2-0|) (|2/3-0|))
VARIABLES:
'bar1' = bar-3, labelled (v l d)
'frac1' = ((0.4238143289606458))
'frac2' = ((-0.3118062563067608))
'intersection' = ((((0.528284561049445 . 0.6041876892028254) 0.4238143289606458 -0.3118062563067608)))
'bar2' = bar-4, labelled (r s)
'stroke1' = stroke-3, labelled (v l d)
'stroke2' = stroke-4, labelled (r s)
'position' = (((0.528284561049445 . 0.6041876892028254)))
'group' = group-2, labelled (hlr)

********** INSTANCE connection-4
LABELSET: (connection)
((|0-1/3|) (|0-1/2|) (|0-2/3|))
VARIABLES:
'bar2' = bar-3, labelled (v l d)
'frac1' = ((-0.3118062563067608))
'frac2' = ((0.4238143289606458))
'intersection' = ((((0.528284561049445 . 0.6041876892028254) -0.3118062563067608 0.4238143289606458)))
'bar1' = bar-4, labelled (r s)
'stroke1' = stroke-4, labelled (r s)
'stroke2' = stroke-3, labelled (v l d)
'position' = (((0.528284561049445 . 0.6041876892028254)))
