Machine recognition of typewritten characters based on shape descriptors

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Machine recognition of typewritten characters based on shape descriptors Kanciar, Eugene J.A.

Abstract

An optical character recognition technique for typewritten letters was developed with application to a personal reading machine for the blind. The feature extraction process defined a character in terms of lines and shapes which are closely related to a person's description of form. The system was developed to identify all upper and lower-case typewritten characters in the alphabet. A letter was described by any combination of seven basic features, usually in a 3 x 3 feature matrix. The extraction of topological (or structural) properties had several advantages; a very small feature dictionary with about 100 code-word entries; quick and simple training procedure for a new font; and, a strong capability to handle character deformities. A separate technique, based on edge examination, was developed to identify characters with prominent diagonal features. Sequential classification was employed throughout the entire system so that recognition was made once a sufficiently unique measure was satisfied. Tests on both repeated characters and typewritten passages produced approximately 97% accuracy when the system was applied to three fonts which varied from a stylized to a serifless print. For a scanning rate of 60 wpm, a recognition speed of two characters per second was achieved. The system was developed on a PDP-12 computer and is fully compatible for realization on a PDP-8 computer with 8K of memory.

Item Metadata

Title	Machine recognition of typewritten characters based on shape descriptors
Creator	Kanciar, Eugene J.A.
Publisher	University of British Columbia
Date Issued	1974
Description	An optical character recognition technique for typewritten letters was developed with application to a personal reading machine for the blind. The feature extraction process defined a character in terms of lines and shapes which are closely related to a person's description of form. The system was developed to identify all upper and lower-case typewritten characters in the alphabet. A letter was described by any combination of seven basic features, usually in a 3 x 3 feature matrix. The extraction of topological (or structural) properties had several advantages; a very small feature dictionary with about 100 code-word entries; quick and simple training procedure for a new font; and, a strong capability to handle character deformities. A separate technique, based on edge examination, was developed to identify characters with prominent diagonal features. Sequential classification was employed throughout the entire system so that recognition was made once a sufficiently unique measure was satisfied. Tests on both repeated characters and typewritten passages produced approximately 97% accuracy when the system was applied to three fonts which varied from a stylized to a serifless print. For a scanning rate of 60 wpm, a recognition speed of two characters per second was achieved. The system was developed on a PDP-12 computer and is fully compatible for realization on a PDP-8 computer with 8K of memory.
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2010-01-20
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0065610
URI	http://hdl.handle.net/2429/18757
Degree (Theses)	Master of Applied Science - MASc
Program (Theses)	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical Engineering, Department of
Degree Grantor	University of British Columbia
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

UBC_1974_A7 K35.pdf -- 5.65MB

Item Citations and Data

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.

Open Collections

UBC Theses and Dissertations

Machine recognition of typewritten characters based on shape descriptors Kanciar, Eugene J.A.

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights