Investigation of time-domain measurements for analysis and machine recognition of speech

UBC Theses and Dissertations

Ito, Mabo Robert

Abstract

At present in speech analysis and mechanical speech recognition work, spectral measurements are the conventional form of signal representation, and acoustical descriptions of speech sounds are usually given in terms of this form of representation. In this thesis, certain time-domain measurements are investigated as an alternative form of signal representation and as a basis for acoustical characterization of speech sounds. The primary measurements studied are the short-time averages of the zero-crossing rate of the acoustic waveform and the distribution patterns of the time intervals between zero-crossings. These measurements are found to be easy to implement with digital techniques and are implemented through digital computer simulation. Other advantages of these measurements include effectiveness in handling the large intensity range of speech sounds and the ability to track rapid transient phenomena such as the release of unvoiced stops. Computer software for an interactive graphics facility was developed for acquisition, presentation, manipulation and analysis of the acoustic speech data. One of the pattern analysis programs, for the display of time-interval distribution data, yielded a visual presentation which could be compared to frequency spectrograms. Theoretical expressions are developed to relate the time-domain and spectral representations for some phone types, and these relationships are compared with experimental results. These theoretical expressions show that important spectral characterization features are accounted for. These findings, combined with empirical observation of the utility of the time-domain signal representation in phonetic characterization, indicate that this form of representation is a useful alternative to the spectral representation.

The speech materials employed were selected to study temporal structures and contextual variations of acoustic properties and to provide quantitative data useful for word recognition applications. The vowels, fricatives and stops were the main phoneme classes studied. Quantitative data on the acoustic properties of the selected phonemes are presented and discussed in terms of i) our own spectral data, ii) other data reported in the literature and iii) simple production models. The time-domain signal representation was found to provide an effective means of analyzing and characterizing the acoustically complex stops and voiced fricatives. For the vowels and unvoiced fricatives, which are well suited to spectral analysis, the time-domain measurements were found to yield very simple and direct characterization features. Some limited phonemic decomposition and machine recognition work is described which demonstrates the design of useful characterization features and provides a basis for further work.
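
The two primary measurements named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the 20 ms frame and 10 ms hop, and the use of NumPy are all assumptions chosen for illustration. It computes a short-time average of the zero-crossing rate and a histogram of the time intervals between successive zero-crossings.

```python
import numpy as np

def zero_crossing_indices(x):
    """Indices i where the sign changes between x[i] and x[i + 1]."""
    signs = np.signbit(x)
    return np.nonzero(signs[1:] != signs[:-1])[0]

def short_time_zcr(x, fs, frame_ms=20.0, hop_ms=10.0):
    """Zero-crossings per second, averaged over each short frame.

    frame_ms and hop_ms are illustrative defaults, not values from the thesis.
    """
    frame = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    # Mark each sample boundary that contains a sign change.
    crossings = np.zeros(len(x) - 1)
    crossings[zero_crossing_indices(x)] = 1.0
    rates = []
    for start in range(0, len(crossings) - frame + 1, hop):
        count = crossings[start:start + frame].sum()
        rates.append(count * fs / frame)
    return np.array(rates)

def interval_histogram(x, fs, bins):
    """Histogram of the time intervals (in seconds) between successive zero-crossings."""
    idx = zero_crossing_indices(x)
    intervals = np.diff(idx) / fs
    counts, edges = np.histogram(intervals, bins=bins)
    return counts, edges

# Usage on a synthetic 100 Hz tone sampled at 8 kHz (hypothetical test signal).
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 100 * t + 0.1)
rates = short_time_zcr(tone, fs)
counts, edges = interval_histogram(tone, fs, bins=np.array([0.0, 0.004, 0.006, 0.01]))
```

Because both functions depend only on the sign of the waveform, multiplying the signal by any positive gain leaves the results unchanged, which is one way to read the abstract's remark about handling the large intensity range of speech sounds.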

Item Metadata

Title	Investigation of time-domain measurements for analysis and machine recognition of speech
Creator	Ito, Mabo Robert
Publisher	University of British Columbia
Date Issued	1971
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2011-04-21
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0101825
URI	http://hdl.handle.net/2429/33942
Degree	Doctor of Philosophy - PhD
Program	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

UBC_1971_A1 I86.pdf -- 15.9MB
