Open Collections

UBC Theses and Dissertations

Identification of invariant acoustic cues in stop consonants using the Wigner distribution

Garudadri, Harinath

Abstract

It is a common belief that there are invariant acoustic patterns in speech signals, which can be related to their phonetic description. These patterns are expected to remain invariant, independent of the language, speaker, phonetic context, etc. Although many investigations based on short-time spectral analysis have established the feasibility of extracting invariant cues in certain contexts, they could not provide a set of invariant cues in any given phonetic context. In this thesis, the Wigner distribution (WD) was used to analyze speech signals for the first time, to investigate acoustic invariance. The WD, like the spectrogram, provides a time-frequency description of the signal. Unlike the spectrogram, it provides correct marginals in the time and frequency domains, but it is not a positive distribution. It is demonstrated here that the partially smoothed WD, in which both the properties of positivity and correct marginals are sacrificed to some extent, provides a better time-frequency resolution than short-time spectral analysis methods. An implementation and an interpretation of the partially smoothed WD are presented. The choice of smoothing parameters and the nature of cross-term suppression in a partially smoothed WD are discussed in detail. It is shown that the cross-terms in a partially smoothed WD do not mask the underlying nature of a signal in the time-frequency plane. A partially smoothed WD was used to investigate acoustic invariance in voiceless, unaspirated stop consonants spoken by native speakers of English, Telugu and French. Contrary to reports in the literature, it was shown that the features "diffuse-rising" and "compact" spectral shapes were not unique to alveolar and velar places of articulation, respectively, but depended on the vowel context.
The resulting ambiguities when specifying the place of articulation were resolved using Formant Onset Duration (time taken for the steady state formants to occur in the vocal tract after the consonantal release) and F₂ of the following vowel. The place of articulation was specified correctly for 86% of the tokens. Unlike in other investigations, the errors in specifying the place of articulation were uniformly distributed over all vowel contexts.
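
The properties of the WD mentioned in the abstract (correct time marginal, non-positivity, and cross-term suppression by smoothing) can be illustrated with a small sketch. This is not the implementation presented in the thesis: the function names `pseudo_wigner` and `smooth_time`, the lag window, the two-tone test signal, and the plain moving average over time frames are all assumptions chosen for illustration, and the thesis's actual smoothing parameters are not given in this record.

```python
import cmath
import math


def pseudo_wigner(x, half_lags):
    """Discrete pseudo-Wigner distribution of a complex sequence x (illustrative).

    Returns W[n][k] with n in range(len(x)) and k in range(K), K = 2*half_lags + 1.
    Bin k corresponds to k / (2*K) cycles per sample.
    Samples outside the sequence are treated as zero.
    """
    n_samples = len(x)
    n_bins = 2 * half_lags + 1
    w = []
    for n in range(n_samples):
        # Local autocorrelation r[m] = x[n+m] * conj(x[n-m]) inside the lag window.
        r = {}
        for m in range(-half_lags, half_lags + 1):
            if 0 <= n + m < n_samples and 0 <= n - m < n_samples:
                r[m] = x[n + m] * x[n - m].conjugate()
        row = []
        for k in range(n_bins):
            acc = sum(v * cmath.exp(-2j * math.pi * k * m / n_bins)
                      for m, v in r.items())
            row.append(acc.real)  # r is Hermitian in m, so the transform is real
        w.append(row)
    return w


def smooth_time(w, half_width):
    """Moving average over neighbouring time frames (hypothetical smoothing choice)."""
    out = []
    for n in range(len(w)):
        lo, hi = max(0, n - half_width), min(len(w), n + half_width + 1)
        out.append([sum(w[i][k] for i in range(lo, hi)) / (hi - lo)
                    for k in range(len(w[0]))])
    return out


# Assumed test signal: two complex tones centred on bins 4 and 12.
M = 16           # half-width of the lag window
K = 2 * M + 1    # number of frequency bins
N = 64           # number of samples


def tone(k0):
    return [cmath.exp(1j * math.pi * k0 * n / K) for n in range(N)]


x = [a + b for a, b in zip(tone(4), tone(12))]
W = pseudo_wigner(x, M)
S = smooth_time(W, 4)

print("auto-term  W[20][4]:", round(W[20][4], 3))
print("cross-term W[20][8]:", round(W[20][8], 3))
print("smoothed   S[24][8]:", round(S[24][8], 3))
```

In this toy case, summing a row of `W` over all bins gives `K * abs(x[n])**2`, which is the discrete counterpart of the correct time marginal. The value at bin 8, midway between the two tones, is a cross-term that oscillates in time and goes negative, so the distribution is not positive. Averaging over nine frames shrinks that cross-term while leaving the auto-terms at bins 4 and 12 unchanged away from the signal edges, at the cost that the time marginal is no longer guaranteed.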

Item Metadata

Title	Identification of invariant acoustic cues in stop consonants using the Wigner distribution
Creator	Garudadri, Harinath
Publisher	University of British Columbia
Date Issued	1987
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2010-09-28
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0098033
URI	http://hdl.handle.net/2429/28786
Degree (Theses)	Doctor of Philosophy - PhD
Program (Theses)	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical Engineering, Department of
Degree Grantor	University of British Columbia
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

UBC_1988_A1 G37.pdf -- 7.11MB
