Open Collections

UBC Theses and Dissertations
The importance of temporal synchrony in word recognition. Pass, Hollis Elizabeth (1998)


THE IMPORTANCE OF TEMPORAL SYNCHRONY IN WORD RECOGNITION

HOLLIS ELIZABETH PASS

B.Sc., Dalhousie University, 1988

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, THE FACULTY OF MEDICINE (School of Audiology and Speech Sciences)

We accept this thesis as conforming to the required standard

UNIVERSITY OF BRITISH COLUMBIA

October 1998

© Hollis Elizabeth Pass, 1998

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of [illegible], The University of British Columbia, Vancouver, Canada. DE-6 (2/88)

ABSTRACT

The primary purpose of this study was to examine the effect of temporal asynchrony on recognition of words presented in background noise. This was done by determining if participants' ability to recognize and repeat words from monaural, high- and low-context sentences worsened when: (1) the signal-to-noise ratio decreased (where the noise was multi-talker "babble" presented simultaneously with the signal); and (2) the degree of temporal asynchrony ("jitter") increased. The other purpose of this study was to determine if monaural temporal processing ability (as measured by performance on the "jitter" task) correlated with variables such as gap threshold, age, education, or vocabulary size. In this experiment, 16 young, normal-hearing participants (ages 18-35) repeated the last word of SPIN-R sentences altered by differing degrees of temporal jitter.
In addition to one condition where the SPIN-R sentences were unchanged, three jitter conditions were tested: two were moderate degrees of jitter that differed in sound quality and method of creation, and one was a highly jittered condition. Jittered and unjittered stimuli were presented in signal-to-noise ratios of +4 and +8 dB, to examine the interaction effects of temporal asynchrony and background noise on word recognition. Participants also performed a gap detection task (Schneider, 1994), the 20-item Mill Hill Vocabulary test (Raven, 1938), and pure-tone air conduction and SRT tasks. It was found that degree of temporal jitter, signal-to-noise ratio and context all had significant effects on participants' abilities to recognize sentence-final words. One of the moderate jitter conditions affected participants' performance on low-context sentences. Specifically, with a moderate amount of jitter and without context, participants could not accurately repeat all sounds in the target word, and tended to replace one or more sounds with others containing similar phonetic features. The highly jittered condition adversely affected performance on both the high- and low-context sentences; that is, with this degree of jitter, context was no longer enough to help listeners anticipate the sentence-final words, and perceptual errors increased dramatically - particularly in the lower signal-to-noise ratio. Participants' performance on the jittered SPIN task resembled that of elderly listeners on unjittered sentences (Pichora-Fuller, Schneider, & Daneman, 1995). This suggests that the external jitter tested in this study may be similar to that hypothesized to characterize the aging auditory system, and which is thought to interfere with the ability of the elderly to understand language spoken in noisy environments. It was also found that performance on the SPIN task did not correlate with gap threshold, age, years of education, or vocabulary-test score. 
However, the participants in this study were of similar ages and years of education, and performed similarly on the gap and vocabulary tests. (In addition, a learning effect was found for the gap detection task.) Thus this sample was not representative of all possible participants, and conclusions about correlations must be made cautiously.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS
1. LITERATURE REVIEW
   1.1 Introduction
   1.2 Evidence of the Importance of Temporal Information in Understanding Spoken Language
   1.3 The Anatomy and Physiology of Temporal Processing
   1.4 Potential Sources of Temporal Asynchrony in the Young and Elderly
   1.5 Evidence of Temporal Asynchrony in the Elderly and Language Impaired
      1.5.1 Monaural Temporal Processing Tasks
         1.5.1.1 Gap Detection
         1.5.1.2 Frequency Discrimination
         1.5.1.3 Temporal Ordering
         1.5.1.4 Temporally Modulated Transfer Functions (TMTFs)
         1.5.1.5 Compressed and Expanded Speech Rates
      1.5.2 Binaural Temporal Processing Tasks
         1.5.2.1 Localization
         1.5.2.2 Auditory Scene Analysis and Binaural Unmasking
         1.5.2.3 Binaural Gap Detection
   1.6 Models of Temporal Processing
      1.6.1 Monaural Models
      1.6.2 Binaural Models
   1.7 Summary
   1.8 Hypotheses
2. METHODS
   2.1 Objectives
   2.2 Experimental Design
   2.3 The Pilot Study
      2.3.1 Participants in the Pilot Study
      2.3.2 Materials for the Pilot Study
         2.3.2.1 The SPIN-R Sentences: An Overview
         2.3.2.2 Preparation of Jittered Stimuli
         2.3.2.3 Calibrating the Sound Level of the Stimuli
      2.3.3 Apparatus and Physical Setting for the Pilot Study
      2.3.4 Procedure for the Pilot Study
   2.4 The Main Experiment
      2.4.1 Participants in the Main Experiment
      2.4.2 Materials for the Main Experiment
      2.4.3 Apparatus and Physical Setting for the Main Experiment
      2.4.4 Procedure for the Main Experiment
         2.4.4.1 The Gap Detection Test
         2.4.4.2 The SPIN Task
3. RESULTS
   3.1 Introduction
   3.2 Scoring Procedure
   3.3 Results of the Pilot Study
      3.3.1 Types of Errors Made by Pilot Participants
   3.4 Results of the Main Experiment
      3.4.1 Effect of Jitter Condition on Sentence-Final Word-Recognition Scores
      3.4.2 Effect of Signal-to-Noise Ratio on Sentence-Final Word-Recognition Scores
      3.4.3 Effect of Context on Sentence-Final Word-Recognition Scores
      3.4.4 Interaction of Jitter and Signal-to-Noise-Ratio Conditions
      3.4.5 Interaction of Jitter and Context Conditions
      3.4.6 Interaction of Signal-to-Noise-Ratio and Context Conditions
      3.4.7 Interaction of Jitter, Signal-to-Noise-Ratio, and Context Conditions
      3.4.8 Correlation between Performance, Gap Threshold, and Other Variables
      3.4.9 The Effect of Practice on Gap Detection Thresholds
      3.4.10 The Difference Between Word-Recognition Scores in High- and Low-Context Conditions
         3.4.10.1 Effect of Jitter Condition on the Difference Between Word-Recognition Scores in High- and Low-Context Conditions
         3.4.10.2 Interaction Between Jitter Condition and Signal-to-Noise Ratio, and its Effect on the Difference Between Word-Recognition Scores in High- and Low-Context Conditions
      3.4.11 Types of Errors Made by Participants in the Main Experiment
4. DISCUSSION
   4.1 Review of Hypotheses
   4.2 Summary of Results
      4.2.1 Null Hypothesis 1: Performance on Temporally Jittered Sentences
      4.2.2 Null Hypothesis 2: Correlation of Performance with Gap Threshold
      4.2.3 Null Hypothesis 3: Correlation of Performance with Age, Years of Education, and Vocabulary-test Scores
      4.2.4 Null Hypothesis 4: Perceptual Equivalence of the Moderate Jitter Conditions
   4.3 Conclusions
      4.3.1 The Effect of Temporal Asynchrony on Word Recognition
      4.3.2 The Effect of Context on Word Recognition
      4.3.3 The Effect of S:N Ratio on Word Recognition
   4.4 Future Research Directions
REFERENCES
APPENDIX A: Participants' Pure-Tone Thresholds (dB HL) for Right (R) and Left (L) Ears
APPENDIX B: Participants' Characteristics
APPENDIX C: Forms of the Revised SPIN Test
APPENDIX D: Time-Amplitude Waveforms of Jittered and Unjittered Tones and Speech Stimuli
APPENDIX E: Spectrograms of Jittered and Unjittered Tones and Speech Stimuli
APPENDIX F: Connections on the Tucker Davis Technologies Modules
APPENDIX G: Instructions to Participants for the SPIN and GAP Tasks
APPENDIX H: Percent-Correct Scores of Participants for High-Context, Low-Context, and All Sentences
APPENDIX I: Order of Jitter Conditions and SPIN Forms for each Participant
APPENDIX J: Models of Temporal Resolution

LIST OF TABLES

Table 1. Mean Percent-Correct Word-Recognition Score for Each Jitter Condition
Table 2. Mean Percent-Correct Word-Recognition Score for Each Jitter Condition in Each S:N-Ratio Condition
Table 3. Mean Percent-Correct Word-Recognition Score for Each Jitter Condition in Each Context Condition
Table 4. Mean Percent-Correct Word-Recognition Score for Each S:N-Ratio Condition in Each Context Condition
Table 5. Mean Percent-Correct Word-Recognition Score for Each Jitter Condition in Each S:N-Ratio and Context Condition
Table 6. Mean Difference Between High- and Low-Context Word-Recognition Scores for Each Jitter Condition
Table 7. Mean Difference Between High- and Low-Context Word-Recognition Scores for Each Jitter Condition in Each S:N-Ratio Condition

LIST OF FIGURES

Figure 1. Mean Percent-Correct Word-Recognition Score in Each Jitter Condition
Figure 2. Mean Percent-Correct Word-Recognition Score in Each Jitter and S:N Condition
Figure 3. Mean Percent-Correct Word-Recognition Score in Each Jitter and Context Condition
Figure 4. Mean Percent-Correct Word-Recognition Score in Each S:N and Context Condition
Figure 5. Mean Percent-Correct Word-Recognition Score in Each Jitter, S:N (dB) and Context Condition
Figure 6. Gap Threshold Changes Across Test Trials for 3 Participants
Figure 7. Mean Difference Between High- and Low-Context Word-Recognition Scores for Each Jitter Condition
Figure 8. Mean Difference Between High- and Low-Context Word-Recognition Scores for Each Jitter and S:N Condition

ACKNOWLEDGEMENTS

I would like to thank Kathy Pichora-Fuller, Jeff Small, Rushen Shi, and Bruce Schneider for their input and guidance during this project. I am also grateful for the assistance and emotional support of my friends at UBC (Michelle, Kristin, Val, and Rhea), as well as my family and friends scattered throughout Canada, the United States, and Israel. I could not have completed this task without them. This research was supported by a grant from the Natural Sciences and Engineering Research Council of Canada.

1. LITERATURE REVIEW

1.1 Introduction

Spoken language, our primary mode of communication, consists of utterances containing sequences of words of one or more syllables; each syllable, in turn, contains sequences of phonemes.
At the prosodic level of speech (i.e., on a time scale large enough for linguistic elements to be identified), temporal characteristics - that is, how words and syllables change over time - frequently convey meaning to the listener (Ladefoged, 1993). For example, when one particular word in an utterance is stressed, its duration and amplitude may increase slightly; the listener can use this temporal information to determine what the speaker wished to emphasize in that utterance. Within a word, individual syllables may also be stressed; the listener can use this temporal information to determine whether the word is a noun or verb. Each syllable is comprised of phonemes, whose pronunciation is strongly affected by the preceding and following phonemes (Moore, 1989). Thus, temporal ordering plays a crucial role in the way speech sounds are made, and can also affect the way we perceive those sounds (Moore, 1989). Up to 30 phonemes per second can occur in running speech (Liberman et al., 1967, as cited in Moore, 1989). Clearly, then, our ability to understand spoken language depends greatly on the capacity to interpret many temporal aspects of very rapidly changing acoustic information. Thus temporal processing, from the segmental to the prosodic level of speech, is crucial for understanding spoken language.

Phoneme pronunciation, syllables, and words are not the only elements of running speech that change over time. On a much smaller time scale (e.g., over milliseconds), the fine structure of speech also changes. To understand spoken language, the listener must process temporal information at this level as well (Shannon, Zeng, & Wygonski, 1992; Shannon, Zeng, Kamath, Wygonski, & Ekelid, 1995; Drullman, 1995; Freyman, Nerbonne, & Cote, 1991; Van Tasell, Greenfield, Logemann, & Nielson, 1992; Van Tasell, Soli, Kirby, & Widin, 1987). This rudimentary, sub-prosodic, and sub-phonemic temporal processing is the central topic of this chapter.
In the discussion that follows, the terms temporal processing and temporal resolution refer to the listener's ability to detect temporal information and temporal changes at a rudimentary, sub-prosodic, and sub-phonemic level (i.e., over a period of milliseconds). The phrases temporal asynchrony and temporal jitter will refer to corruption in the processing of sub-prosodic, sub-phonemic temporal information.

For years, it was thought that spectral information (i.e., the formants and formant transitions of individual sounds) was the most important information for the interpretation of speech (Erber, 1972, as cited in Shannon et al., 1992). Temporal information was considered important only at the prosodic level. Theories attempting to explain how the auditory system encoded language therefore focussed on the processing of spectral information. Place theories, for example, posited that the different frequencies of speech sounds excited different, tonotopically-organized areas of the cochlea and auditory neurons; thus the location of the excitation indicated which frequencies were being detected (e.g., Whitfield, 1970, as cited in Greenberg, 1996). Further, nerve firing rate increased as the intensity level of the incoming stimulus increased, thereby coding which frequencies had the most energy (i.e., where the formants were). Place theories have failed, however, because this type of physiological coding is insufficient to explain certain behavioral phenomena, such as why humans can still understand language spoken at high (conversational) decibel levels, at which firing rates saturate. At such levels, the nerves reach their maximum firing (i.e., saturation) rate, and thus the rates no longer indicate where formants occur (Greenberg, 1996). Without such crucial spectral information, we should not be able to decipher speech sounds at high intensities. The fact that we can suggests that spectral cues are not the only cues we use to understand language (Greenberg, 1996).
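The saturation argument can be made concrete with a toy rate-level function. This is an illustrative sketch only: the sigmoid-like shape and every parameter value (threshold, saturation level, maximum rate) are assumptions chosen for demonstration, not physiological measurements.

```python
import numpy as np

def firing_rate(level_db, threshold=20.0, saturation=60.0, r_max=200.0):
    """Toy auditory-nerve rate-level function (spikes/s).

    Rate grows linearly between an assumed threshold (20 dB) and an
    assumed saturation level (60 dB), then stays pinned at r_max.
    All parameter values are illustrative, not measured.
    """
    x = np.clip((np.asarray(level_db, dtype=float) - threshold)
                / (saturation - threshold), 0.0, 1.0)
    return r_max * x

# At a quiet presentation level, a formant peak and a spectral trough
# 10 dB apart produce clearly different firing rates...
quiet_peak, quiet_trough = firing_rate(40.0), firing_rate(30.0)

# ...but at a conversational level both drive the fibre at its maximum
# rate, so the rate code no longer marks where the formant is.
loud_peak, loud_trough = firing_rate(75.0), firing_rate(65.0)
```

In this toy model the quiet-level rates differ (100 vs. 50 spikes/s) while both conversational-level rates are pinned at 200 spikes/s, which is the behavioral puzzle the place-coding account cannot resolve.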
It also suggests that we do not depend solely upon place of neural excitation to identify spectral information. This is supported by recent studies, which have demonstrated, for example, that temporal ordering can affect how we perceive pitch (Darwin, Hukin, & Al-Khatib, 1995; Darwin & Ciocca, 1992). Thus both spectral cues and place theories are insufficient to explain how humans perceive speech.

Speech-recognition devices that are modelled upon place theories fail to simulate language processing in different environments. For example, they cannot encode language spoken in the presence of loud background noise because both speech and background noise are detected equally (Greenberg, 1996; Greenberg, 1997). In physiological terms, a listener relying solely on spectral (or place) cues in this situation would be unable to separate important spectral information, such as the formants of speech sounds, from intense background noise, because both noise and speech would cause neurons to fire. Nevertheless, most spoken language occurs in the presence of some kind of background noise, and most normal-hearing listeners are able to understand it (CHABA, 1988; Moore, 1989). Thus, speech-recognition models must account for the fact that spectral information alone is not enough to encode spoken language, particularly that spoken in the presence of noise.

Listeners must be using other types of information to recognize words spoken in noise. Several studies suggest that it is temporal information[1] - received both monaurally and binaurally - that allows humans to do this (e.g., Pichora-Fuller & Schneider, 1992; Pichora-Fuller, Schneider, & Daneman, 1995; Shannon et al., 1995). This is supported by the success of speech-recognition devices that take temporal processes, such as neural phase-locking and temporal integration, into account (e.g., Patterson & Allerhand, 1995; Ghitza, 1993); such devices more accurately simulate language processing in the presence of background noise.
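Speech-in-noise conditions like those used in this study rest on a standard rms-power definition of the signal-to-noise ratio. The sketch below shows that calculation in generic form; it is not the calibration procedure actually used in this experiment (described in Chapter 2), and the tone and white-noise stand-ins for speech and babble are assumptions made purely for illustration.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise rms ratio equals `snr_db`,
    then return (mixture, scaled_noise). Assumes zero-mean arrays of
    equal length."""
    rms_speech = np.sqrt(np.mean(speech ** 2))
    rms_noise = np.sqrt(np.mean(noise ** 2))
    target_noise_rms = rms_speech / (10.0 ** (snr_db / 20.0))
    scaled_noise = noise * (target_noise_rms / rms_noise)
    return speech + scaled_noise, scaled_noise

# Illustrative stand-ins: a tone for "speech", white noise for "babble".
rng = np.random.default_rng(1)
fs = 16000
t = np.arange(0, 0.5, 1 / fs)
speech = np.sin(2 * np.pi * 440 * t)
babble = rng.standard_normal(t.shape)

# A +4 dB S:N condition, as in the lower of the two ratios used here.
mixture, scaled = mix_at_snr(speech, babble, snr_db=4.0)
```

Lowering `snr_db` raises the noise floor relative to the signal, which is the manipulation that makes the word-recognition task harder.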
If, as these studies suggest, temporal information is important for understanding language spoken in noise, a disruption in temporal processing ability (i.e., the presence of temporal asynchrony) should interfere with the ability to perceive speech in noise. Many studies have noted the difficulties elderly people have in perceiving speech in noise, even in the absence of clinically-significant hearing threshold elevation (e.g., Pichora-Fuller et al., 1995; CHABA, 1988). One explanation is that these difficulties are caused by an age-related "slowing" in perceptual and cognitive abilities (Salthouse, 1991, 1994, as cited in Wingfield, Tun, Koh, & Rosen, submitted). An alternative explanation exists, however. If temporal processing abilities are necessary to perceive speech in noise, and if these abilities deteriorate with age (as will be shown in the following sections), perhaps it is the presence of internal temporal asynchrony - not simply "slowing" - that causes elderly listeners to have difficulty understanding language spoken in noise. It is also possible that "timing" (temporal synchronization) and "time" (slowing) present coexisting problems.

The current study was designed to evaluate the effect of temporal asynchrony on the recognition of language spoken in noise. The purpose of this study was to determine if the presence of temporal asynchrony affects the listener's ability to recognize and repeat words from high- and low-context sentences presented in multi-talker, background "babble." (In this experiment, the temporal asynchrony was externally added to the signal, to simulate internal temporal asynchrony. For a detailed description of the experimental stimuli, refer to Chapter 2.) The results of this study will be used to support the notion that elderly listeners have difficulty understanding language spoken in noisy environments because of the presence of age-related, internal temporal asynchrony.

In this chapter, the theory that temporal information and temporal processing are crucial for understanding spoken language will be supported, using the results from many experimental studies. The physiological processes (as well as the associated anatomical structures) by which temporal information is conveyed to the auditory cortex will be reviewed. Evidence of how these processes and anatomical structures deteriorate with age will be described, to support the idea that elderly listeners have difficulty understanding language in noisy environments because of poor temporal processing abilities. This notion will be further supported by evidence obtained from psychoacoustic studies, which reveal that aging listeners do, in fact, have poorer monaural and binaural temporal resolution. Theoretical models that attempt to explain monaural and binaural language processing will be discussed, with special emphasis on those that take temporal processing into account. The theory upon which the present experiment was based will be justified using one of these models. Finally, null and experimental hypotheses for this experiment will be stated.

[1] Note that we use more than temporal information to perceive speech in noise. For example, we tend to perceptually group sounds together that have similar fundamental frequencies (Culling & Summerfield, 1995; Culling & Darwin, 1993, 1994). This is a process that relies more on spectral information than on temporal information. However, both types of information are necessary to perceive speech in noise.

1.2 Evidence of the Importance of Temporal Information in Understanding Spoken Language

We rely on temporal processing abilities to interpret two particular characteristics of sounds: (1) the outer, temporal "envelope" of the speech signal, which changes slowly over time, and (2) the inner, "fine structure" of the signal, which changes rapidly over time (Viemeister & Plack, 1993; Shannon et al., 1995; Shannon et al., 1992; Drullman, 1995).
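The envelope/fine-structure distinction can be illustrated numerically. In the sketch below, a 1 kHz carrier (the fine structure) is modulated by a 10 Hz envelope, and the envelope is then recovered as the magnitude of the FFT-based analytic signal (the same computation performed by scipy.signal.hilbert). The specific frequencies, duration, and modulation depth are arbitrary choices made for the demonstration.

```python
import numpy as np

fs = 16000
t = np.arange(0, 0.2, 1 / fs)                        # 0.2 s of signal
f_mod, f_carrier = 10.0, 1000.0                      # modulation and carrier frequencies
envelope = 1.0 + 0.8 * np.cos(2 * np.pi * f_mod * t)  # slow temporal envelope
x = envelope * np.sin(2 * np.pi * f_carrier * t)      # rapid fine structure

# Analytic signal via the FFT: keep DC and Nyquist, zero the negative
# frequencies, double the positive ones, and inverse-transform.
N = len(x)
spectrum = np.fft.fft(x)
weights = np.zeros(N)
weights[0] = weights[N // 2] = 1.0    # N is even here
weights[1:N // 2] = 2.0
analytic = np.fft.ifft(spectrum * weights)

recovered = np.abs(analytic)          # magnitude tracks the envelope
max_error = np.max(np.abs(recovered - envelope))
```

Because the modulation rate is far below the carrier, the magnitude of the analytic signal reproduces the 10 Hz envelope essentially exactly, while the phase carries the fine structure; this separation is what lets experiments (such as those reviewed below) manipulate one cue while preserving the other.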
The amplitude of the temporal envelope varies with a frequency called the modulation frequency; the fine-structure amplitude varies with a frequency called the carrier frequency (Viemeister & Plack, 1993). The importance of these temporal characteristics for speech perception has been demonstrated by many investigators. In particular, the envelope of the signal has been shown to play a crucial role in speech intelligibility. Erber (1979, as cited in Van Tasell et al., 1987) hypothesized that this was so because the envelope reveals information such as number of syllables, syllabic stress, duration of consonants, and even emotional state. In fact, several studies have established that, in the absence of spectral and fine-structure temporal information, phonemes, syllables, words, and even running speech can still be interpreted using the temporal envelope.

Shannon et al. (1992), for instance, used the envelopes of 16 word-medial consonants to modulate a high-frequency train of biphasic pulses (which therefore replaced the temporal fine-structure and frequency cues). Participants with auditory brainstem implants were asked to identify each consonant from a list of possible alternatives. The investigators found that participants could still recognize the consonants with a high degree of accuracy. Specifically, voicing and manner distinctions were easiest to identify using the temporal envelope. Place distinctions, however, required more spectral information to be identified; performance in this area was significantly poorer. Thus, the temporal envelope was crucial for interpreting voicing and manner distinctions in consonants, and useful (although less so) for determining place distinctions.

Shannon et al. (1995) performed a similar study, in which they maintained amplitude and temporal cues of medial consonants, vowels, and simple sentences while systematically changing the degree of spectral information.
In this experiment, amplitude envelopes of speech sounds (obtained using low-pass filters) were used to modulate white noise of various bandwidths. Participants were asked to identify 16 consonants and 8 vowels from a list of all possible alternatives; sentences were heard once, whereupon subjects were asked to repeat as many of the words as possible in each sentence. The investigators found that voicing and manner distinctions were made with over 90% accuracy with minimal spectral information, while place information was identified with less than 70% accuracy, even with the maximum amount of spectral information. In addition, 10-30% of words in sentences were identified using minimal spectral information. Thus, temporal cues - and hence the ability to process them - play an important role in the deciphering of voice, manner, and place distinctions of consonants and vowels, as well as (to a lesser extent) words in running speech.

Van Tasell et al. (1987) also found that consonant identification was possible using the temporal envelopes of medial consonants and a fine structure of noise. In addition, they noted that participants' types of responses could be grouped according to three dimensions of the speech envelope. One dimension concerned the voicing distinction, which was noticeable because voiced medial consonants were longer than unvoiced, and had greater amplitudes. Another envelope dimension used by participants distinguished the sonorant consonants from all others; the investigators determined that the envelopes of such consonants have much greater amplitudes than others, with a flatter shape in the peaks. The third dimension used by participants separated the voiceless stops /p, t, k/ from all others; the investigators hypothesized that this was because such consonants have an initial, low-amplitude phase (corresponding to articulator closure) followed by an abrupt, high-amplitude phase (for the release burst).
Thus, in this study, all consonants had particular temporal characteristics which, if retained in the envelope of the waveform, made identification possible. Clearly, an inability to process such temporal information would lead to difficulties in speech perception.

In a later, similar study, Van Tasell et al. (1992) found that recognition of speech sounds using only temporal information improved with practice. They also discovered that, even with practice, identification of the place feature of consonants still remained poor. Thus, as noted by Shannon et al. (1992, 1995), place information for speech sounds is determined more by spectral than temporal information.

Freyman et al. (1991) revealed the importance of waveform shape by examining medial-consonant identification when the consonantal segment of the waveform was amplified by 10 dB. These investigators found that performance was enhanced for some consonants (particularly voiced stops), and hindered for others (glides and voiced fricatives). Thus, when the temporal relationship between sounds was altered, identification of those sounds was significantly affected.

Few studies have examined the way fine-structure temporal cues are used to enhance speech recognition, or the relative importance of the peaks and troughs in the temporal envelope. Drullman (1995) compared fine-structure and envelope cues to determine which was most important in speech perception. He also hypothesized that the peaks and troughs of the envelope were processed differently and so contributed different amounts to the intelligibility of the signal. Drullman assessed the recognition of sentences whose fine structure remained unchanged while their temporal envelopes were formed by: (1) combining speech and noise; (2) removing the troughs; (3) removing the peaks; and (4) creating a block-pulse version of the envelope (refer to Drullman for details).
Subjects heard each sentence once, and were asked to repeat as many of the words as possible. Performance in recognizing these stimuli was compared to performance on sentences with intact speech envelopes and noise fine structures. Drullman found that, when the speech envelope is intact, the fine structure is of little importance for almost perfect recognition; however, when the envelope is altered and the fine structure is intact, accurate interpretation is minimal. He also found (as did Freyman et al., 1991) that listeners are extremely sensitive to changes in the temporal envelope. Finally, Drullman discovered that listeners rely more upon envelope peaks than troughs to interpret speech signals.

Thus the intelligibility of a speech signal depends greatly on its temporal characteristics, which include the envelope amplitude, shape, and peaks, as well as relative amplitudes across segments. A complex system of anatomical structures and physiological responses is necessary to convey this subtle, rapidly-changing information to the auditory cortex. The following section reviews what is known about the anatomy and physiology of auditory temporal processing.

1.3 The Anatomy and Physiology of Temporal Processing

Speech sounds consist of travelling waves of air pressure. To be detected by the peripheral auditory system, these waves must pass through the external auditory meatus and vibrate the tympanic membrane. The vibrations are transmitted via the three middle-ear ossicles (malleus, incus, and stapes) to the oval window of the cochlea, which lies in the inner ear. The vibrations push the oval window in and out. These movements create a pressure wave that travels along the length of the basilar membrane, from base to apex. (The mechanical properties of the basilar membrane cause the basal portion to vibrate maximally in response to high frequencies, and the apical portion to respond maximally to low frequencies.)
The Organ of Corti lies between the basilar membrane and the tectorial membrane; it contains inner and outer hair cells. The inner hair cells are in contact with the overlying tectorial membrane, as well as with underlying auditory nerve fibers. When the basilar membrane is displaced upward, the Organ of Corti moves upward as well, causing the inner hair cells to be displaced by a shearing action between the basilar and tectorial membranes. This displacement leads to the release of chemical neurotransmitters, which, when released in great enough quantities, will cause the auditory nerve fibers to fire. In this way, the mechanical vibrations of sound are converted into electrical impulses (Moore, 1989).

Neurotransmitter substance is released in greatest amounts during the maximum displacement of the inner hair cells; this occurs during the point of maximum positive displacement of the basilar membrane. Since this neurotransmitter release causes auditory nerve cells to fire, the cells tend to fire at the same, positive phase of the wave travelling across the basilar membrane. Thus the temporal characteristics of the travelling wave - and hence the sound wave - are preserved in the firing pattern of auditory nerve fibers. This phenomenon is called phase-locking; it occurs for frequencies below 4 or 5 kHz, and is one of the most important ways that fine-structure temporal information is conveyed to the auditory cortex (Greenberg, 1996; Moore, 1989).

Electrical impulses are conducted along the fibers of the cochlear nerve to the cochlear nucleus on either side of the brainstem. The cochlear nucleus processes this information by extracting relevant acoustic features (Greenberg, 1996). The anteroventral, posteroventral, and dorsal cochlear nuclei, of which the cochlear nucleus is comprised, all respond differently to the incoming impulses. The anteroventral cochlear neurons are thought to be necessary for binaural analyses of sounds (Greenberg, 1996).
Such analyses allow the listener to separate relevant signals, such as speech, from background noise, and to localize sound sources. (For more information, refer to section 1.5.2.1, "Localization".)

The posteroventral cochlear neurons are believed to assist with the coding of spectral information, and with the extraction of relevant information from background noise (Greenberg, 1996). Two types of neurons, called choppers and onset units, can be found in the posteroventral cochlear nucleus. Choppers discharge regularly, independently of the frequency of the incoming stimulus, and phase-lock to amplitude-modulation frequencies below 1 kHz (Greenberg, 1996). Onset units respond with great precision at the onset of a stimulus, then gradually stop responding over several milliseconds. Their onset response occurs most often with amplitude-modulation frequencies above 1.5 kHz. Like choppers, onset units can phase-lock to modulation frequencies below 1 kHz with great precision (i.e., on almost every modulation cycle of the waveform). Both onset units and choppers appear to respond best to low-frequency modulations, such as the fundamental frequencies of the human voice (Greenberg, 1996).

The dorsal cochlear neurons may minimize background noise, thus allowing relevant information (e.g., speech) to be extracted from the noise (Rhode & Greenberg, 1991, as cited in Greenberg, 1996). This function would be crucial in the understanding of language spoken in noisy environments. However, since most information about the dorsal cochlear nucleus has been obtained through studies of animals, this conclusion cannot be confirmed.

The entire auditory system - particularly the cochlear nucleus and auditory cortex - responds most intensely to a stimulus' onset; if a sound continues, unchanged, for over 50 msec, the system stops responding to it (Greenberg, 1996).
The system therefore responds primarily to novel sounds, and tends to extract information about changes in the incoming stimulus over time. The glottal pulses of vocal folds are regularly-occurring stimulus onsets; each pulse causes nerves to fire in the inner ear and cochlear nucleus. Thus glottal pulses keep the auditory system constantly "alert" to the stimulus of the human voice (Greenberg, 1996). This, in turn, may help us to perceive speech in noisy environments.

As suggested by the activity of neurons in the cochlear nuclei, much of the auditory system is able to phase-lock to incoming impulses. Thus temporal precision is maintained (and likely necessary) at all levels of the auditory pathway (Greenberg, 1996). Neural phase-locking ensures that the incoming stimulus is encoded accurately, and causes the auditory system to synchronize with low frequencies, such as the fundamental frequencies of the human voice. Thus phase-locking likely helps listeners perceive relevant signals, such as speech, in the presence of background noise (Greenberg, 1996). A deterioration in temporal precision at any point of this pathway would interfere with phase-locking; this, in turn, would decrease the accuracy of the encoding process, and so affect the listener's ability to perceive speech in noise.

Because temporal precision is necessary at so many points in the auditory system, there are many places where temporal asynchrony may occur. The following section considers the anatomical structures and physiological processes that may introduce asynchrony in the processing of temporal information, both in young and aging auditory systems.

1.4 Potential Sources of Temporal Asynchrony in the Young and Elderly

The anatomy and physiology of the auditory system have inherent potential for adding temporal asynchrony to the signal (Viemeister & Plack, 1993).
First, because the basilar membrane is often described as a series of bandpass filters (Fletcher, 1940, as cited in Moore, 1989) that transmit slow, but not rapid, temporal fluctuations, reproduction of the temporal characteristics of the sound wave is inexact, even before the vibration is transduced to electrical impulses (Viemeister & Plack, 1993). Second, the maximum neuronal firing rates are limited by inherent properties of the hair cells, synapses, and neurons. This limits the modulation frequencies that can be transmitted for the temporal envelopes of incoming stimuli. Once again, temporal precision is compromised. Third, the fact that nerve fibers fire fully or not at all means that movements of the basilar membrane representing subtle temporal changes in the sound wave will not always be transduced into neural activity. Fourth, the auditory system is thought to integrate temporal information by "averaging" nerve firing rates over time and across many fibers; if this is true, the act of averaging will compromise temporal precision, and thus temporal acuity. Finally, the bandpass filters of the cochlea continue to respond to a stimulus after it has stopped (a phenomenon called ringing) (Viemeister & Plack, 1993). The low-frequency filters respond for longer periods of time than the high-frequency filters; thus, at lower frequencies, temporal information may be distorted as these fibers continue to respond to a stimulus, even in its absence (Viemeister & Plack, 1993).

In addition to these inherent imperfections, the structures of the auditory system degenerate with age (Schneider, 1997). Schneider hypothesized that any degeneration of the mechanical structures of the inner ear must somehow affect hearing, although not necessarily in ways that are detectable via audiometric threshold tests.
Perhaps, then, these age-related changes in structure are responsible for the degeneration in temporal resolution that has been found in elderly people (Schneider, 1997). (For evidence of temporal asynchrony in the aged and language-impaired, refer to the following section.)

Many structures in the cochlea change with age. Both inner hair cells, especially in the basal region, and outer hair cells are lost (Willott, 1991, as cited in Schneider, 1997). As mentioned in the previous section, inner hair cells detect vibrations and convert sound pressure into neural activity; outer hair cells affect the mechanical properties of the Organ of Corti, such that gain is added to the incoming signal (Schneider, 1997). Spiral ganglion cells also disappear, particularly in the basal region of the cochlea. The spiral ligament shows degeneration (e.g., Wright & Schuknecht, 1972, as cited in Schneider, 1997); this structure attaches the basilar membrane and Reissner's membrane to the bony labyrinth. The basilar membrane thickens and calcifies (e.g., Nadol, 1979, as cited in Schneider, 1997). The stria vascularis, which is responsible for maintaining potentials in the fluids of the cochlea, exhibits loss of capillaries and degeneration. Finally, the highest possible compound action potential (a measure of combined neural responses) decreases with age; the firing thresholds of individual nerves increase (so they fire less often in response to auditory stimuli); nerves are not as well "tuned" to their characteristic frequencies at these firing thresholds; and the compound action potential grows more slowly as stimulus intensity increases (Hellstrom & Schmeidt, 1990, and Schmeidt, Mills, & Adams, 1990, as cited in Schneider, 1997).

These anatomical structures and physiological processes may not be the only ones to deteriorate with age.
All structures of the auditory system that are capable of phase-locking to the stimulus may also deteriorate, causing a decrease in temporal precision and encoding accuracy. Such structures include the cochlear nuclei, superior olivary nuclei, inferior colliculi, and medial geniculate nuclei. At this time, data is lacking on how these structures change with age; however, given the substantial evidence of the degeneration of other auditory structures in the elderly, these, too, are likely affected by age.

Clearly, there are several important auditory structures whose damage or degeneration could significantly affect many aspects of hearing and, in turn, speech perception. It is entirely possible that these physiological changes affect temporal resolution in particular, but this has yet to be determined. In spite of the lack of knowledge about the real causes of temporal asynchrony, however, many experimental tasks have been devised to reveal degeneration in temporal processes that may be attributable to loss of neural synchronization. These tasks are described in the following section.

1.5 Evidence of Temporal Asynchrony in the Elderly and Language Impaired

Various monaural and binaural temporal processing tasks have been used to assess listeners' temporal resolution. In these experiments, it has been found that elderly listeners have poorer temporal resolution than young listeners, even when they do not have significantly elevated pure-tone thresholds in the speech range. The following sections describe how these tasks are performed and what aspects of temporal resolution they measure.

1.5.1 Monaural Temporal Processing Tasks

Many tasks have been devised to assess monaural temporal resolution. Each of the following sections describes one such task, as well as how various types of listeners (young, old, and language-impaired) perform on that task.
1.5.1.1 Gap Detection

The gap detection test determines the smallest silent interval between two sound stimuli that a listener can detect. In this task, the listener is presented with two sounds: one is interrupted by the gap and the other is continuous. The order of presentation for gap and non-gap stimuli varies randomly across trials. The gap and non-gap stimuli must have equal magnitudes; otherwise the listener can use intensity differences to discriminate them. The sounds bracketing the gap are called markers. The markers can be broadband noise, narrowband noise, sinusoidal pure tones, tone pips (very short pure tones), or clicks. The durations, types, and phases of the markers may vary relative to each other.

Young, normal-hearing listeners' gap thresholds for broadband noise or click markers are 2-3 msec (e.g., Plomp, 1964, Ronken, 1970, and Penner, 1977, as cited in Moore, 1989). Studies have revealed, however, that listeners depend most on the high-frequency portions of broadband markers (e.g., Fitzgibbons, 1983, Shailer & Moore, 1983, Buus & Florentine, 1985, and Formby & Muir, 1988, as cited in Schneider, Pichora-Fuller, Kowalchuk, & Lamb, 1994). Thus, for elderly people who have high-frequency hearing losses, gap thresholds for broadband markers may be larger simply because the high-frequency components are not audible.

Young listeners' gap thresholds for narrowband noise markers have been found to increase as the center frequency of the markers decreases (e.g., Fitzgibbons & Wightman, 1982, Shailer & Moore, 1983, Buus & Florentine, 1985, as cited in Moore, 1989). This is due to the fact that the rate of amplitude fluctuation in noise bands increases as the center frequency increases. Thus low-frequency noise bands have slow amplitude fluctuations, which could easily be confused with the gap stimuli.
In addition, ringing (the ongoing response of the cochlea after a stimulus has stopped) lasts longer at lower frequencies; this means that, for the gap, the dip perceived in the amplitude envelope of the stimulus will not be very deep. As a result of these characteristics of the cochlea and noise markers, gaps between low-frequency narrowband markers must be relatively large to be detected. For example, Shailer and Moore (1983, as cited in Moore, 1989) found that gap thresholds for narrowband noise markers centered at 200 Hz could be as large as 22 msec. Note that, for higher-frequency narrowband markers, gap thresholds for young listeners have been found to be similar to (but slightly larger than) those for broadband or click markers (e.g., 3 msec rather than 2 msec).

Another disadvantage in using narrowband noise in gap detection tasks is the occurrence of spectral splatter (energy outside the desired bandwidth), which occurs when the stimulus is turned on or off quickly. Spectral splatter can be detected by the listener. Sinusoidal pure-tone and tone pip markers also have spectral splatter if they are turned on or off quickly (i.e., with less than 10 msec rise or fall time). To mask this splatter, the markers can be presented in background noise that has a notch of the same bandwidth as the markers' bandwidth. Alternatively, the spectral splatter of pure tones can be minimized if the tone onsets and offsets are shaped using Gaussian envelopes (Schneider, Pichora-Fuller, Kowalchuk, & Lamb, 1994).

There is another potential confound when pure-tone markers are used in gap detection experiments. Shailer and Moore (1987, as cited in Moore, 1989) found that the gap threshold is strongly affected by the relative phases of the markers, and by the size of the gap relative to the pure-tone period. (For details, refer to Moore, 1989.)
Once they had taken these potential confounds into account, Shailer and Moore determined that, for pure-tone markers of 400, 1000, and 2000 Hz, the gap threshold was approximately 5 msec for young listeners, and the average gap threshold over all frequencies was 4.5 msec.

Tone pips have been hypothesized to cause less ringing than longer pure-tone markers, simply because they are shorter in duration (Schneider et al., 1994). These investigators used Gaussian-enveloped tone pips to compare gap thresholds of young and elderly listeners. They found that the gap thresholds of older listeners were approximately twice those of young listeners, and more variable. Further, this difference was not related to audiometric measures of hearing thresholds in the elderly listeners. Other studies have confirmed that gap detection is not necessarily highly correlated with degree of hearing impairment (e.g., Strouse, Ashmead, Ohde, & Grantham, 1998; Schneider, Speranza, & Pichora-Fuller, 1994; Fitzgibbons & Gordon-Salant, 1994; Moore, Peters, & Glasberg, 1992, as cited in Schneider, 1997). Overall, there is agreement that gap-detection thresholds increase with age. Further, some studies have correlated gap detection with the ability to identify speech in noise (e.g., Tyler, Summerfield, Wood, & Fernandes, 1982). Other studies have not found this correlation (Strouse et al., 1998). Thus the poorer gap thresholds of elderly listeners may be partially responsible for (or at least correlated with) the difficulties they experience when trying to comprehend spoken language in noisy environments.

Binaural gap detection has recently been investigated, where the markers are presented from right and left headphones or speakers in a free field (Phillips, Taylor, Hall, Carr, & Mossop, 1997; Phillips, Hall, Harrington, & Taylor, 1998). For more information, refer to the section entitled "Binaural Gap Detection" in section 1.5.2.
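The construction of a gap-detection stimulus of the kind described above - two identical Gaussian-enveloped tone-pip markers separated by a silent interval - can be sketched in a few lines. This is an illustrative construction only; the sampling rate, marker frequency, and envelope width below are assumed values, not the parameters used by Schneider et al. (1994):

```python
import math

def gaussian_tone_pip(freq_hz, dur_s, fs, sigma_frac=0.15):
    """A pure tone shaped by a Gaussian envelope centred on the pip,
    which minimizes spectral splatter at onset and offset."""
    n = int(dur_s * fs)
    centre = (n - 1) / 2.0
    sigma = sigma_frac * n  # envelope width, an assumed fraction of the pip
    return [math.exp(-0.5 * ((i - centre) / sigma) ** 2)
            * math.sin(2 * math.pi * freq_hz * i / fs)
            for i in range(n)]

def gap_stimulus(freq_hz, pip_dur_s, gap_s, fs=44100):
    """Leading marker + silent gap + identical trailing marker."""
    pip = gaussian_tone_pip(freq_hz, pip_dur_s, fs)
    silence = [0.0] * int(gap_s * fs)
    return pip + silence + pip

# e.g. 2 kHz pips of 5 msec each, separated by a 3 msec gap
stim = gap_stimulus(2000.0, 0.005, 0.003)
```

In an adaptive procedure, the gap duration would then be varied across trials to converge on the listener's threshold, with a gapless counterpart (the two pips abutted) serving as the comparison stimulus.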
1.5.1.2 Frequency Discrimination

Frequency discrimination tasks assess the smallest frequency difference (the difference limen) or frequency modulation rate the listener can detect. The frequency difference limen is found by presenting two tones (one after the other) with slightly different frequencies, and asking the participant to identify the tone with the higher pitch. Frequency difference limens are very small for young listeners; for example, at a sound level of 60-70 dB SPL, the frequency difference limen for young listeners at 1000 Hz is approximately 2 Hz (Moore, 1989). This difference limen increases in a linear fashion with age (Konig, 1957, as cited in Schneider, 1997) and independently of hearing loss (Abel, Krever, & Alberti, 1990, as cited in Schneider, 1997). Difference limens in elderly listeners are noticeably larger for low frequencies than for high frequencies, and larger than those of young listeners at all frequencies (Moore & Peters, 1992). This is thought to be related to loss of neural synchrony (Schneider, 1997).

Specifically, at low frequencies, neural responses phase-lock to the stimulus (Greenberg, 1996). Asynchrony in the neural responses, however, would diminish phase-locking, and thus reduce precision when the listener had to discriminate between similar frequencies. (This lack of precision would cause difficulty if, for example, the listener was trying to hear the fundamental frequency of a voice in similar, low-frequency background noise.) At high frequencies (greater than 4 or 5 kHz), phase-locking does not occur; therefore, the presence of neural asynchrony would not have such a deleterious effect on frequency discrimination for high-frequency stimuli. Thus, for frequencies less than 4 or 5 kHz, frequency discrimination tasks can reveal the presence of increased temporal jitter in neural impulses.
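The mechanism proposed here - jitter in neural firing times degrading phase-locking - can be illustrated with a toy simulation (my own construction, not drawn from the cited studies). Spikes are generated once per cycle of a 500 Hz tone at a fixed phase, Gaussian timing jitter is added, and synchrony is summarized with the standard vector-strength statistic (1 = perfect phase-locking, near 0 = none):

```python
import math
import random

def vector_strength(spike_times, freq_hz):
    """Vector strength of spike times relative to a sinusoid of freq_hz:
    the length of the mean resultant vector of the spike phases."""
    phases = [2 * math.pi * freq_hz * t for t in spike_times]
    c = sum(math.cos(p) for p in phases) / len(phases)
    s = sum(math.sin(p) for p in phases) / len(phases)
    return math.hypot(c, s)

def phase_locked_spikes(freq_hz, n_cycles, jitter_sd_s, rng):
    """One spike per stimulus cycle at the positive peak, each displaced
    by Gaussian timing jitter with standard deviation jitter_sd_s."""
    period = 1.0 / freq_hz
    return [k * period + 0.25 * period + rng.gauss(0.0, jitter_sd_s)
            for k in range(n_cycles)]

rng = random.Random(0)
clean = phase_locked_spikes(500.0, 2000, jitter_sd_s=0.0, rng=rng)
jittered = phase_locked_spikes(500.0, 2000, jitter_sd_s=0.0005, rng=rng)
```

With half a millisecond of jitter (a quarter of the 500 Hz period), vector strength collapses from 1.0 to roughly 0.3, illustrating how even sub-millisecond asynchrony could erode the timing code on which low-frequency discrimination is thought to depend.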
1.5.1.3 Temporal Ordering

The ability to determine the temporal order of sounds is affected by the types of sounds used in the task. If the sounds form a coherent class (i.e., with similar temporal and spectral characteristics), such as a series of musical notes, temporal order can be determined for items 10-50 msec in duration (Hirsh, 1959, and Winckel, 1967, as cited in Moore, 1989). If different classes of sounds are used, temporal ordering becomes much more difficult (e.g., Ladefoged & Broadbent, 1960, as cited in Moore, 1989). This is also true of patterns consisting of high and low tones; specifically, listeners cannot order these tones relative to each other when they hear a mixed sequence (Bregman & Campbell, 1971, as cited in Moore, 1989).[2] If these tones are connected by frequency glides, however, temporal ordering becomes easier (Bregman & Dannenbring, 1973, as cited in Moore, 1989). Note that, with frequency glides, the high/low tone sequence more closely resembles speech, in which most sounds are connected by frequency glides (Moore, 1989). Thus these glides may be crucial for perceiving the temporal order of speech sounds.

Trainor and Trehub (1989, as cited in Schneider, 1997) found that older adults made more mistakes than young adults in discriminating temporal sequences and identifying temporal orders. Other studies have confirmed this (Humes & Christopherson, 1991, and Neils, Newman, Hill, & Weiler, 1991, as cited in Schneider, 1997). These temporal ordering deficits were not significantly correlated with hearing thresholds (Trainor & Trehub).[3] Thus loss of temporal-ordering ability seems to be age-related. Given the significance of temporal ordering when listening to the rapidly changing sounds of running speech, this deficit may very well affect speech perception in the elderly.

1.5.1.4 Temporally Modulated Transfer Functions (TMTFs)

The TMTF is a measure of modulation-depth detection threshold, plotted as a function of modulation rate.
To find the TMTF, white noise is amplitude modulated at various frequencies, and the minimum modulation depth (degree of amplitude modulation) that must be present for detection at each frequency is determined. For young listeners, at modulation frequencies below 16 Hz, the modulation-depth threshold is independent of modulation frequency. Between 16 Hz and approximately 1000 Hz, threshold increases as frequency increases. Beyond 1000 Hz, the modulation is no longer detectable (Bacon & Viemeister, 1985, as cited in Moore, 1989).

TMTFs were originally devised to account for temporal resolution in linear models of temporal processing (Moore, 1989; Viemeister & Plack, 1993). (For information on linear and nonlinear models, refer to section 1.6, "Models of Temporal Processing.") Nevertheless, such functions can reveal the ability (or loss of ability) to resolve the temporal envelope of sounds. Thus TMTFs could be used to compare the temporal resolution of elderly listeners to that of young listeners. As yet, this has not been done.

[2] The listener will also have difficulty determining the order of rapidly presented tones if auditory streaming occurs; this is a psychoacoustic phenomenon in which the listener groups the tones into two separate "streams," or auditory "objects," based on their acoustic attributes (Bregman, 1978, as cited in Moore, 1989). Auditory streaming occurs if the tones differ greatly in intensity, frequency, or location (Moore, 1989).

[3] It is unknown, however, how performance on these tasks correlated with cognitive measures such as attention and working memory.

1.5.1.5 Compressed and Expanded Speech Rates

The temporal characteristics of a speech signal, which include envelope amplitude, shape and peaks, as well as relative amplitudes across segments, change very rapidly. Our ability to understand spoken language is thus partially determined by our ability to process this temporal information quickly.
The importance of fast temporal resolution can be seen in studies which assess the effect of speech rate on language comprehension and word recognition. For more than 20 years, the prevailing method of altering speech rate has involved a periodic sampling procedure, whereby electronic devices divide the speech signal into many samples, and insert silent periods every nth sample (for speech rate expansion) or remove every nth sample and compress the remainder (for speech compression); n, in this method, is determined by the degree of compression or expansion (Letowski & Poch, 1996; deHaan, 1977). Prior to using this method, investigators primarily used "speeded speech" (deHaan, 1977), whereby the speed of a playback device was increased beyond the normal playing speed. It is believed by some investigators (e.g., deHaan, 1977; Wingfield et al., submitted) that speeded speech increases both rate and pitch of the stimuli, while compressed (or expanded) speech changes only the rate.[4] For this reason, the sampling method of compression and expansion is believed to be more intelligible and is now the preferred method in psychoacoustic research (deHaan, 1977). All studies reviewed in this chapter used the periodic sampling method for speech compression or expansion (unless otherwise noted).

Most studies concerning speech rate have focussed on the effect of speech compression on comprehension or word recognition, because this measurement is assumed to reflect the effectiveness of the listener's temporal resolution (Letowski & Poch, 1996; Fitzgibbons & Gordon-Salant, 1994). In general, these studies have found that, as speech rate increases, comprehension, word recognition, and word recall decrease, more so in elderly listeners than in young listeners.
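The periodic sampling procedure described above can be sketched as follows. This is a simplified illustration under the assumption that "compression to 60%" means retaining 60% of each sampling frame; real devices also smooth the splice points to avoid audible discontinuities, which is omitted here:

```python
def compress(samples, keep_ratio, frame_len=441):
    """Periodic-sampling time compression: divide the signal into
    fixed-length frames and keep only the first keep_ratio of each,
    discarding the rest (the 'discard interval')."""
    kept = int(frame_len * keep_ratio)
    out = []
    for start in range(0, len(samples), frame_len):
        out.extend(samples[start:start + kept])
    return out

def expand(samples, insert_ratio, frame_len=441):
    """Periodic-sampling expansion: insert a silent period after
    every frame, lengthening the signal without altering pitch."""
    pad = [0.0] * int(frame_len * insert_ratio)
    out = []
    for start in range(0, len(samples), frame_len):
        out.extend(samples[start:start + frame_len])
        out.extend(pad)
    return out
```

Because each surviving fragment is played back at its original sampling rate, the waveform within a fragment is untouched; only the total duration changes, which is why this method (unlike speeded playback) is taken to alter rate without altering pitch.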
Letowski and Poch (1996), for example, examined the effect of compression rate and discard interval length (the size of the intervals discarded from the compressed speech) on the comprehension of middle-aged and older listeners. They found that performance of both groups deteriorated with increasing compression rate and discard interval length, with the older listeners performing significantly worse than the middle-aged listeners. Further, performance of the middle-aged participants closely resembled that of young participants from an earlier, similar study (Poch, 1992, as cited in Letowski & Poch, 1996).

[4] This conclusion is not necessarily true of all rate-altering methods, nor is it agreed upon by all researchers. Some investigators believe, for example, that compression of a waveform reduces the wavelengths of the sounds, and that this must, in turn, increase the frequency.

Wingfield et al. (submitted) found that increasing degrees of compression led to poorer recall of spoken paragraphs in young as well as old listeners (with the older listeners performing more poorly than the young listeners). They hypothesized, however, that lack of processing time was partially responsible for this performance. Thus they restored processing time by inserting silent intervals at linguistically-relevant places (clause boundaries). They found that performance of young listeners returned to normal, whereas that of old listeners improved but did not return to normal. Thus, both the ability to process the temporally-altered speech and additional processing time were necessary for the elderly to recall compressed language.

Many other studies have examined the effect of compression and expansion on the comprehension or word recognition of young, middle-aged, and elderly listeners, with varying results. Schmitt and McCroskey (1981), for example, found that elderly listeners' comprehension improved at compressed (60%) and expanded (140%) rates.
However, since the task consisted of pointing to a picture best representing the utterance, the investigators concluded that the improved performance may have been the result of the listeners having supportive visual cues. Schmitt (1983) compared the comprehension of compressed and expanded speech by two age groups (65-74 and 75-84). He found that both groups performed more poorly with 60% compression than with normal speech rates; the younger group performed increasingly better at expansion rates of 140% and 180%; the older group performed better at the 140% rate and worse at the 180% rate; and the older group performed more poorly than the younger group at all altered rates.

Schmitt and Carroll (1985) assessed the effect of "naturally" compressed and expanded speech on the comprehension of elderly listeners (ages 65-74). This rate alteration was accomplished by recording a speaker talking quickly at a rate equivalent to 60% compression, and talking slowly to simulate expansion rates of 140% and 180%. The investigators found that comprehension was poorer at the compressed rate, but unaffected by the expanded rates. Schmitt and Moore (1989) then examined the effect of this naturally compressed and expanded speech on the comprehension of an older age group (ages 75-84). Again, they found that comprehension was worse for compressed speech and unaffected by expansion. They concluded that their results supported the "slowing" theory of aging, whereby the central nervous system slows with age, causing elderly listeners to require more processing time and to have more difficulty comprehending rapidly changing acoustic stimuli.

King and Behnke (1989) hypothesized that different kinds of listening were affected in different ways by compressed speech.
Using young participants, they examined the effect of speech compression on three kinds of listening: comprehensive listening, which is done when the listener attempts to understand and, later, recall a message; short-term listening, which refers to the initial, brief (40-second) period when spoken information is processed and retained; and interpretive listening, where the listener attempts to make inferences about the spoken information (King & Behnke, 1989). They found that, as speech rate increased, comprehensive listening deteriorated but interpretive and short-term listening were not affected until the listeners were presented with a very high degree of compression (60%). Thus, even in young listeners, comprehension of language is adversely affected by temporal distortion.

The use of context in the recognition of compressed speech was examined by Newman (1982), who presented the Speech Perception in Noise (SPIN) test at a highly compressed rate (60%) to young listeners. (The SPIN test contains high-context sentences, which allow the listener to predict the last word, as well as low-context sentences, which do not; for more information, refer to Chapter 2.) Newman found that supportive context allowed listeners to recognize sentence-final words with 88% accuracy at the compressed rate, but that lack of context caused a significant decrease in word recognition at this compression rate. (In other words, the listeners could not use their linguistic and world knowledge to compensate for the lack of appropriate temporal cues in the compressed speech.) Newman concluded that when speech is temporally distorted, linguistic and semantic information are crucial for accurate word recognition.

Some researchers have found that, just as an increased speech rate interferes with comprehension, a decreased speech rate seems to improve comprehension. (Recall, for example, the younger listeners in Schmitt's 1983 study.)
Griffiths (1992) examined the effects of reduced and increased speech rate on the listening comprehension of young, non-native English speakers. He found that an expanded speech rate resulted in better comprehension than did normal or compressed speech rates. This confirmed an earlier study Griffiths had done (1990), in which a slower speech rate was found to have a facilitative effect on learning by non-native English speakers. (Note, however, that these participants were limited in their language, rather than their temporal processing, abilities. Note, too, that these findings did not coincide with those of Schmitt and Carroll (1985) or Schmitt and Moore (1989).)

Temporal distortions of speech have also been found to affect the comprehension of spoken language by Alzheimer's Dementia (AD) and aphasic patients. Some researchers have found that slowed speech (where vowels are lengthened and silences inserted between words and phrases) facilitates syntactic comprehension of listeners with Wernicke's aphasia (Blumstein, Katz, Goodglass, Shrier, & Dworetsky, 1985). Other investigators, however, have not achieved the same results, instead finding great variability in preferred listening rates (Riensche, Wohlert, & Porch, 1983). Slow speech rates have also been found to facilitate word recognition and comprehension in AD participants, provided working memory is mostly intact; however, slow rates had no effect when working memory was compromised (Small, Andersen, & Kempler, 1997; Small, Kemper, & Lyons, 1997). Tomoeda, Bayles, Boone, Kaszniak, and Slauson (1990) also did not find slowed speech facilitative for AD participants' comprehension, but they did not determine if working memory was a factor in their results.

Timing affects the comprehension of children with language impairments as well. Weismer and Hesketh (1996) found that slowed speech rates facilitated the acquisition of new words by children with specific language impairments.
At faster speech rates, however, participants' learning decreased. Blosser, Weidner, and Dinero (1976) found the same results using speech that had been slowed via tape recordings.

These studies reveal that speech rate affects word recognition, recall, and comprehension in both young and elderly listeners, and that the temporal processing abilities of elderly, aphasic, AD, and language-impaired listeners are poorer than those of young, normal listeners. The difficulties that elderly and impaired listeners experience in understanding spoken language provide support for the notion that monaural temporal processing abilities are crucial for word recognition and comprehension.

1.5.2 Binaural Temporal Processing Tasks

Many experimental tasks have been designed to assess the listener's ability to compare binaural temporal cues. Three such tasks are discussed here. Two of them (localization and binaural unmasking) have been used more extensively than the third (binaural gap detection). The following sections describe each of these tasks, as well as how various types of listeners (young, old, and language-impaired) perform on them.

1.5.2.1 Localization

Localization is the ability to locate an object in space relative to oneself. This is done by comparing the arrival times, phases, and intensities of sounds arriving at both ears. For example, if the sound is louder or arrives first at the right ear, the listener will think the sound image is to the right. Clearly, accuracy in localization tasks depends on accurate binaural temporal resolution (as well as the ability to attend selectively). Studies of localization have used a variety of stimuli, including pure tones (e.g., Mills, 1960, as cited in Moore, 1989). Pure-tone tasks have revealed that we use intensity differences to locate high-frequency sound sources and phase or time differences to locate low-frequency sources (i.e., below 1500 Hz).
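The size of the low-frequency time cue can be made concrete with the standard spherical-head (Woodworth) approximation of the interaural time difference; the head radius used below is an assumed typical adult value, not a figure from the cited studies:

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # speed of sound in air at roughly 20 degrees C
HEAD_RADIUS_M = 0.0875      # assumed average adult head radius

def itd_seconds(azimuth_deg):
    """Woodworth's approximation of the interaural time difference for a
    distant source: (r / c) * (theta + sin(theta)), where 0 degrees is
    straight ahead and 90 degrees is directly to one side."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))
```

A source straight ahead produces no interaural time difference, while a source at 90 degrees azimuth yields roughly 0.65 msec; the timing differences the binaural system must resolve for intermediate azimuths are smaller still, which is why intact binaural temporal precision matters for localization.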
The ability to use localization to track "moving" auditory images (clicks presented through speakers, with varying delays between the clicks) develops with age (Cranford, Morgan, Scudder, & Moore, 1993). Little is known, however, about how this ability changes in later years. One study (Noble, Byrne, & LePage, 1994, as cited in Schneider, 1997) revealed that localization is poorer for elderly adults with hearing impairments; the investigators suggested that age may have been a factor, as well. More work must be done in this area before any relationship can be established between age and poor localization abilities.

Interestingly, children with specific language impairments (SLI) have been found to perform poorly in localization tasks (Visto, Cranford, & Scudder, 1996). These children (along with normal, language- and chronologically-age-matched children) were required to track moving auditory images. The SLI children performed as well as the language-matched children (who were younger), but poorer than the age-matched children. The investigators concluded that SLI children are not able to "dynamically" process binaural temporal information as well as normal children, and that perhaps this limitation was affecting their ability to dynamically process the temporal information in speech sounds.

To conclude, although few experiments have been conducted to study how localization abilities change with age, and although we cannot conclude that poor localization affects the elderly's ability to perceive speech in noise, it seems reasonable to expect that there might be such effects.

1.5.2.2 Auditory Scene Analysis and Binaural Unmasking

Auditory scene analysis is the analysis the brain performs on acoustic information to reconstruct the external, auditory environment. This analysis could not be accomplished without the process of binaural unmasking, which is the ability to discriminate a signal from background noise ("unmask" it) using interaural cues.
Clearly, this ability is also advantageous if one is to understand speech in noisy environments. Binaural unmasking ability is measured using masking level differences (MLDs). When a pure tone is presented to one ear simultaneously with a white noise masker, the tone must be at a certain intensity to be just masked by the noise (i.e., if the tone were slightly less intense, it would cease to be detectable). If the tone is set to this threshold level (with the noise still present), and the same noise is presented to the other ear, the listener will suddenly be able to hear the tone again; in other words, the threshold level for the masked tone drops in the presence of identical, binaural (diotic) noise. The difference between the original threshold and the new one is called the masking-level difference. In addition to pure tones, MLDs have been observed for complex tones, clicks, and speech sounds (Moore, 1989). The larger the MLD, the better the system is at resolving binaural temporal information. MLDs for young listeners range between 3 and 15 dB, with the largest MLDs occurring at low frequencies (less than 500 Hz) and the smallest occurring at frequencies above 1500 Hz (Moore, 1989). Elderly listeners have much smaller MLDs than do young listeners (e.g., Strouse et al., 1998). The size of this difference in MLDs depends on the particular interaural differences in the signal and noise (Pichora-Fuller & Schneider, 1991, as cited in Pichora-Fuller & Schneider, 1992). The reduced MLDs of the elderly listeners suggest that these participants are poorer at resolving binaural temporal information, possibly due to temporal asynchrony in the firing of the neurons (Schneider, 1997). Thus the presence of temporal jitter may affect binaural unmasking ability, which in turn affects the listener's ability to perceive speech in noise.
This, again, supports the notion that temporal asynchrony is one of the reasons that elderly listeners find comprehending language spoken in noise so effortful.

1.5.2.3 Binaural Gap Detection

Gap thresholds of young listeners have been evaluated for binaural as well as monaural marker presentation. Leading and trailing markers were presented via right and left headphones (Phillips et al., 1997) and right and left speakers (Phillips et al., 1998). For both types of presentation, binaural gap thresholds were considerably larger than monaural thresholds. Phillips et al. (1997) hypothesized that monaural and binaural gap detection are therefore performed by two different temporal processes. Specifically, when the signals remain within one perceptual channel (monaural), a "discontinuity detection" process is used, and when the signals activate two perceptual channels (binaural), a "relative timing" process is used. This binaural, relative-timing process was found to be sensitive to first-marker duration: for short marker durations (5-10 msec), gap thresholds were approximately 30 msec. This threshold value is similar to the typical voice-onset time that differentiates voiced from unvoiced stops in the English language (Phillips et al., 1997). Thus the ability to discriminate gaps in binaural stimuli may be closely related to the ability to perceive and discriminate some classes of speech sounds. These studies of localization, binaural unmasking, and gap detection reveal that the binaural temporal processing abilities of elderly and language-impaired listeners are poorer than those of young, normal listeners. Since the ability to resolve binaural information has been linked to the ability to perceive speech in noise, the poorer resolving power of elderly listeners (i.e., the presence of temporal asynchrony in their auditory systems) may explain why these listeners have difficulty understanding language spoken in noisy environments.
1.6 Models of Temporal Processing

Many theoretical models have been devised to account for monaural and binaural psychoacoustic phenomena such as those discussed above. The prevailing monaural and binaural models will be described in the sections which follow, with special emphasis on those that take temporal processes into account. The theory upon which the present experiment is based will also be described.

1.6.1 Monaural Models

Models of monaural temporal resolution attempt to describe how temporal characteristics of sounds are processed by each ear, both peripherally and centrally. Most models assume that, in the internal representation of the sound, temporal information is "smoothed" such that rapid changes are removed and slower changes remain (Moore, 1989). Two monaural temporal processing models are depicted in Appendix J. In Model 1 (Rodenburg, 1977, and Viemeister, 1979, as cited in Moore, 1989), the stimulus first encounters a bandpass filter; this stage simulates the function of the peripheral auditory filters. Although only one filter is shown in the model, in fact there would be many adjacent filters - each one corresponding to a critical bandwidth. The output of the bandpass filter then goes through a device that alters it in a nonlinear fashion; in Model 1, this is a half-wave rectifier, which allows positive portions of the waveform through but not negative portions. The nonlinear device is meant to simulate the (nonlinear) response of auditory neurons, which tend to fire at the peaks of the stimulus waveform (e.g., as they do when they are phase-locked to the stimulus). The output from the nonlinear device represents the amplitude of the original signal, and therefore contains information about the signal's temporal envelope (Viemeister & Plack, 1993). This output passes through a low-pass filter, which smoothes the signal by removing rapid changes in envelope amplitude and allowing slower changes to remain.
The low-pass filter output is thus very similar to the temporal envelope of the initial, band-pass filter output. The last part of the model is a decision device, which performs different functions depending on the temporal task required of the listener. If, for example, the listener must discern the presence of a gap in a sinusoidal stimulus (as in the gap detection test), the decision device must detect a particular dip in the amplitude envelope emerging from the low-pass filter. A slightly different version of this model can be seen in Model 2 in Appendix J. In this model, the stimulus still encounters a bandpass filter and a nonlinear device, but the nonlinearity creates an output meant to simulate the power, rather than the amplitude, of the bandpass output. It does this by squaring the output from the bandpass filter; note, as with the half-wave rectifier in Model 1, that the output of the nonlinear device is still positive. The third stage is a "temporal integrator," which smoothes (removes rapid changes from) the power-like output of the nonlinear device, just as the low-pass filter does for the amplitude-like output in Model 1. The final stage in this model is the same decision device as that in Model 1. Other monaural models exist (e.g., Shannon, 1986, as cited in Moore, 1989) which also try to take into account the conversion of sound pressure into neural activity. Neural activity does not respond in a linear fashion to an increase in stimulus intensity. As intensity increases, neural activity increases, but more slowly. Thus the conversion of sound pressure to neural spikes is represented in these models as a nonlinear compressive function, which acts on the stimulus before it reaches the smoothing mechanism (the temporal integrator or low-pass filter). At present, each model has its strengths, but none can fully explain all the empirical perceptual data that exists.
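The stages of Model 1 (bandpass filtering, half-wave rectification, low-pass smoothing, and a decision device) can be sketched in code. This is a minimal illustration under assumed parameter values, not the published model: the bandpass stage is skipped because the input is already a narrowband tone, the low-pass filter is a simple one-pole smoother, and the decision device simply checks for a dip in the envelope, as it would in a gap detection task.

```python
import numpy as np

def half_wave_rectify(x):
    """Nonlinear device: pass positive portions of the waveform only."""
    return np.maximum(x, 0.0)

def lowpass_envelope(x, fs, cutoff_hz):
    """One-pole low-pass filter: removes rapid envelope changes and keeps
    slow ones (a stand-in for the model's smoothing stage)."""
    alpha = 1.0 - np.exp(-2.0 * np.pi * cutoff_hz / fs)
    y = np.empty_like(x)
    acc = 0.0
    for i, v in enumerate(x):
        acc += alpha * (v - acc)
        y[i] = acc
    return y

def gap_detected(envelope, dip_threshold):
    """Decision device: report a gap if the envelope dips below threshold."""
    return bool(np.any(envelope < dip_threshold))

fs = 20000
t = np.arange(0, 0.3, 1.0 / fs)
tone = np.sin(2 * np.pi * 1000 * t)
tone[(t > 0.14) & (t < 0.16)] = 0.0          # insert a 20-msec silent gap

envelope = lowpass_envelope(half_wave_rectify(tone), fs, cutoff_hz=100)
```

With these assumed settings, the smoothed envelope settles near the mean of the rectified tone during the steady portions and collapses during the silent gap, so the decision stage flags the dip only where the gap occurs.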
For example, while all three models account for perceptual behavior that occurs within one frequency bandwidth, none can explain the temporal comparisons that have been found to occur across bandwidths (e.g., Green, 1973, and Pisoni, 1977, as cited in Viemeister & Plack, 1993; Zera & Green, 1993).5 The models do reveal, however, the complexity of the processing performed on the sounds arriving at each ear.

5 More recent models, upon which successful speech-recognition devices are based, account for these temporal comparisons (e.g., Patterson & Allerhand, 1995).

1.6.2 Binaural Models

Binaural interaction models of temporal processing were created to account for phenomena that occur when particular combinations of sounds reach both ears. Specifically, these models have tried to account for binaural masking-level differences (MLDs). Durlach's (1972) equalization and cancellation model, depicted in Appendix J, is one such model. In this model, the stimulus (assumed to be a tone masked by noise, for the purpose of explaining MLDs) arrives at both ears and is filtered by the familiar bandpass filter. Then each signal is slightly corrupted by temporal asynchrony, or "jitter," in both amplitude and time: each signal is multiplied by a random amplitude factor of (1 - ε) and delayed by a random delay factor δ. (The jitter factors differ for each ear.) The values of ε and δ for each ear are independent of each other and of the values in the other ear. They are assumed to be generated randomly from Gaussian distributions of possible values (one for ε and one for δ) with means of zero (Colburn & Durlach, 1978). After being temporally corrupted, the jittered output from each ear is transformed by the equalization mechanism such that the masker noise from one ear (or channel) is made the same as that in the other ear.
The equalizer is able to perform certain operations, such as shifting phases, adjusting amplitudes, or delaying one channel relative to the other, to accomplish this transformation. The amount by which one channel can be delayed relative to the other (the internal interaural delay) is thought to be that which maximizes the final signal-to-noise ratio in the output of the cancellation mechanism; this delay value has been hypothesized to be quite limited in magnitude (i.e., less than 1 msec). Recent studies have revealed this value to be closer to 2.5 msec (Pichora-Fuller & Schneider, 1992). In the cancellation mechanism, the output that arrived via one channel is subtracted from that which arrived via the other channel. In this way, the masker noise is eliminated while the signal remains. (Note that, because of temporal corruption, the cancellation will not be perfect.) The cancellation-mechanism output then reaches a decision device that also receives input from the monaural channels. The MLD is predicted to be the signal-to-noise ratio of the cancellation-mechanism output divided by the signal-to-noise ratio from one of the monaural channels (Moore, 1989). Durlach's model has one great strength: it allows quantitative predictions to be made about MLDs and other psychoacoustic phenomena. However, the underlying physiology of the auditory pathways is not fully taken into account (Colburn & Durlach, 1978). Specifically, Durlach identifies none of the physiological processes by which equalization and cancellation are achieved, nor does he describe the anatomical and physiological sources of temporal jitter. (Note, however, that his model preceded relevant physiological studies.) For a detailed description of this model, refer to Durlach (1972). In his model, Durlach hypothesized that the amount of temporal jitter introduced into each channel was independent of the internal interaural delay (added by the equalizer).
This has been found to be true for older listeners, but not young ones (Pichora-Fuller & Schneider, 1992). For older listeners, it was found that MLDs (which are large if temporal jitter is small) did not change with (external) interaural delay; in young listeners, however, MLDs changed as interaural delays changed. Pichora-Fuller and Schneider concluded that these empirical data could be explained by Durlach's model if temporal jitter was independent of internal interaural delay in the elderly, but increased with internal interaural delay in young listeners. If this modification to Durlach's model is true, then introducing a constant level of external jitter to a speech signal should cause young listeners to perform like elderly listeners (assuming that the temporal jitter is created using Durlach's random amplitude and delay factors from Gaussian distributions6). It is this hypothesis that forms the basis for the experiment described in the following chapter. For detailed predictions concerning the performance of young listeners with jittered speech signals, refer to section 1.8, "Hypotheses."

6 Note that this type of jitter would cause random fluctuations in amplitude, but would not affect the overall amplitude of the signal. A variation in overall amplitude would change the magnitude of the signal, which would introduce spectral as well as temporal distortions (Viemeister & Plack, 1993). Thus the type of jitter created using Durlach's method would introduce only temporal distortions.

1.7 Summary

Monaural and binaural temporal resolution (as measured by the aforementioned tasks) have been linked with the ability to recognize, recall, and understand spoken language, both in noise and in quiet. The presence of temporal asynchrony can interfere with accurate temporal resolution. There are many inherent anatomical and physiological sources of temporal asynchrony in the auditory system. Ample evidence shows that these structures and processes deteriorate as we age; thus the temporal asynchrony in our perceptual system is likely to increase with age. This deterioration clearly affects temporal resolution; therefore it undoubtedly affects word recognition and comprehension abilities as well. Thus, the difficulties older people have in understanding language spoken in noisy environments may not be caused by audiometrically-measured hearing loss or cognitive factors alone, but also by increased jitter in their auditory systems. It follows, then, that if jitter is introduced artificially to a speech signal and presented to young, normal-hearing listeners, they should perform as older listeners do without added jitter (e.g., Pichora-Fuller et al., 1995). If this jitter is created using a method reminiscent of Durlach's (1972) random amplitude values, it will change the local properties of the temporal envelope in a random fashion, but will not alter the overall magnitude of the signal. Further, such jittering will randomly affect the timing of the signal, but will not reduce the processing time available to the listeners as compression does (e.g., Wingfield et al., submitted). Finally, this jittering method will not alter the perceived pitch of the stimuli, because the temporal changes will occur at random intervals, in random amounts. Thus, in this experiment, we applied temporal jitter to the SPIN-R sentences, and presented these sentences in varying amounts of background noise to young, normal-hearing listeners. This allowed us to simulate internal temporal jitter, and to observe the effect of this jitter on speech perception and the ability to use context to recognize words spoken in noise.

1.8 Hypotheses

In the present study, the following null and research hypotheses were tested.

Null Hypothesis 1.
In signal-to-noise ratios that permit near-ceiling performance of young, normal-hearing listeners on monaural auditory processing tasks, performance will remain essentially unchanged when the signal is temporally jittered.

Accompanying Research Hypotheses. In signal-to-noise ratios that permit near-ceiling performance of young, normal-hearing listeners on monaural auditory processing tasks, performance will worsen when the signal is temporally jittered. Specifically, when the signal contains a moderate amount of temporal jitter, performance will resemble that of old, normal-hearing listeners from previous studies (i.e., listeners will have poorer speech perception, but will be able to use contextual cues to recognize words). When the signal contains a high amount of temporal jitter, performance will resemble that of old, presbycusic listeners from previous studies (i.e., the greater degree of jitter will interfere, as presbycusis does, with both speech perception and the ability to use contextual cues to recognize words).

Null Hypothesis 2. Performance on monaural auditory processing tasks involving temporal jitter will not be correlated with gap detection thresholds.

Accompanying Research Hypothesis. Performance on monaural auditory processing tasks involving temporal jitter will be inversely correlated with gap detection thresholds; that is, the smaller the gap detection threshold, the better the performance on the tasks. Small gap detection thresholds are believed, by some researchers, to reflect superior temporal resolution. Therefore listeners with smaller gap detection thresholds should perform better than listeners with larger gap detection thresholds on auditory processing tasks such as those in the present study.

Null Hypothesis 3. Performance on monaural auditory processing tasks involving temporal jitter will not be correlated with age, years of education, or vocabulary-test scores within a young, normal-hearing group.
Accompanying Research Hypothesis. Performance on monaural auditory processing tasks involving temporal jitter will be correlated with age, education, and vocabulary; specifically, when the signal contains high amounts of temporal jitter, performance on high-context sentences will improve as age, vocabulary scores, and years of education increase. This hypothesis is based on the expectation that older listeners will have greater amounts of world knowledge than younger listeners, and such knowledge will help them use context more effectively to recognize words. Likewise, listeners with larger vocabularies or more years of education may also use context more effectively than listeners with smaller vocabularies or fewer years of education.

Null Hypothesis 4. The two moderate jitter conditions tested in this experiment will be equally easy to perceive, in spite of their qualitative differences.

Accompanying Research Hypothesis. Performance in the two moderate jitter conditions will be different; specifically, the high-SD moderate jitter condition (described in the Methods chapter) will prove more difficult to perceive than the high-BW condition, and so will result in poorer performance, particularly on low-context sentences. This hypothesis was based on the experimenter's subjective observation that high-SD jitter, which had a muffled, "warbled" sound quality, seemed more difficult to perceive than high-BW jitter, which sounded as if static were interfering with the signal. (Refer to the Methods chapter for details.)

2. METHODS

2.1 Objectives

The first objective of this study was to determine if participants' ability to recognize and repeat words from monaural, high- and low-context sentences worsens when: (1) the signal-to-noise ratio decreases (where the noise is multi-talker "babble" presented simultaneously with the signal); and (2) the degree of temporal asynchrony (hereafter called "jitter") increases.
The second objective was to determine if participants' performance correlated with variables such as pure-tone threshold, speech recognition threshold, age, education, vocabulary test scores, and gap detection threshold. To answer these questions, a pilot study and a main experiment were conducted. The following sections describe these studies.

2.2 Experimental Design

Before the objectives stated above could be met, a pilot study was conducted whose purpose was to confirm that the chosen degrees of jitter, when interacting with selected signal-to-noise ratios (S:N), were likely to mimic the performance of old participants with good or impaired hearing. Once the appropriate S:N ratios were found, the main experiment was conducted using these S:N ratios for the presentation of the stimuli. The following sections describe the pilot and main experiments in detail.

2.3 The Pilot Study

The participants, materials, setting, and procedures of the pilot study are described in the sections which follow. The stimuli that were jittered for this experiment are described in section 2.3.2.1, "The SPIN-R Sentences: An Overview." The process by which the stimuli were jittered, and the selected degrees of jitter, are described in section 2.3.2.2, "Preparation of Jittered Stimuli."

2.3.1 Participants in the Pilot Study

Five people participated in the pilot study for this experiment. All were female, native English speakers between the ages of 24 and 31, with 15 to 20 years of education. All had pure-tone air-conduction thresholds within normal limits bilaterally (i.e., thresholds were below 25 dB HL for frequencies between 250 and 4000 Hz).

2.3.2 Materials for the Pilot Study

Eight forms of the Revised Test of Speech Perception in Noise (SPIN-R) (Bilger, Nuetzel, Rabinowitz, & Rzeczkowski, 1984) were used; see Appendix C for the contents of each form. Each form consists of 50 sentences, 25 of which are high-context and 25 of which are low-context.
(For an overview of the development and content of these sentences, refer to the following section.) The sentences were presented monaurally in the presence of background, multi-talker "babble." An in-house computer program was used to present the sentences and babble to the listener at an experimenter-specified signal-to-noise ratio. The SPIN-R sentences were presented in four conditions. In one condition, the sentences were unaltered. In the other three conditions, the sentences were altered using differing degrees of temporal jitter. For a description of these three jittered conditions, refer to section 2.3.2.2, "Preparation of Jittered Stimuli." Other aural or written materials used with participants included the Spondaic Word Lists (Hirsh, Davis, Silverman, Reynolds, Eldert, & Benson, 1952), which were used to determine participants' bilateral speech recognition thresholds (SRTs); the gap detection test (Schneider et al., 1994); the Mill Hill vocabulary test (Raven, 1938); and a Hearing and Language History questionnaire that was devised in-house. Pure-tone air-conduction thresholds were also found for each participant. The pure-tone threshold and SRT tests were done to make sure that participants had normal hearing and speech recognition thresholds; elevated thresholds on these tests are correlated with reduced ability to perceive speech in noise (e.g., Tyler et al., 1982). The vocabulary test was done to ensure that all participants had approximately the same vocabulary; vocabulary size is correlated with verbal IQ, which affects the listener's familiarity with words on the SPIN-R Test. Word familiarity, in turn, affects the listener's ability to use context to identify the sentence-final words (Kalikow, Stevens, & Elliott, 1977; Bilger et al., 1984). The gap detection test was done to ensure that listeners' monaural temporal processing abilities were within the normal range.
This test was also done to determine if gap thresholds correlated with the ability to perceive speech in noise, as has been found in other studies (e.g., Tyler et al., 1982). For information on how these materials were used, refer to section 2.3.4, "Procedure for the Pilot Study," and section 2.4.4, "Procedure for the Main Experiment."

2.3.2.1 The SPIN-R Sentences: An Overview

The ability to understand language in competing background noise has been assessed in young listeners, old normal-hearing listeners, and/or hearing-impaired listeners of various ages (e.g., Kalikow et al., 1977; Newman, 1982; Bilger et al., 1984; Gordon-Salant, Bell, Humes, Schum, & Bilger, 1989, as cited in Pichora-Fuller et al., 1995). In most studies, the materials used to assess this ability were either the Speech Perception in Noise (SPIN) Test (Kalikow et al., 1977) or the Revised SPIN (SPIN-R) Test (Bilger et al., 1984). The original SPIN Test consists of 10 sets of 50 tape-recorded sentences. In 25 of the sentences of each set, the sentence-final word is predictable (i.e., primed) from the context; in the other 25, the final word is not predictable. Thus, in the high-context sentences, the listener can use the context to anticipate the final word, and may respond correctly even if the target word was not heard completely. In the low-context sentences, the listener must use primarily phonetic cues to determine the final word. The SPIN Test reveals how well the listener uses context to understand the message, and how well the listener recognizes words in running speech when there is background noise and no supportive context. The SPIN Test yields two scores: a percent correct for high-context sentences, and a percent correct for low-context sentences. The difference between these two scores reveals the listener's ability to use context to anticipate the last word (Kalikow et al., 1977).
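The two SPIN scores, and the context-benefit difference derived from them, can be illustrated with a small calculation. The response data below are invented for illustration (1 means the sentence-final word was repeated correctly); only the scoring logic reflects the test as described.

```python
# Hypothetical scoring for one 50-sentence SPIN form:
# 25 high-context and 25 low-context sentences.
high_context_correct = [1] * 20 + [0] * 5    # 20 of 25 final words repeated
low_context_correct = [1] * 13 + [0] * 12    # 13 of 25 final words repeated

def percent_correct(responses):
    """Percent of sentence-final words correctly repeated."""
    return 100.0 * sum(responses) / len(responses)

high_score = percent_correct(high_context_correct)   # 80.0
low_score = percent_correct(low_context_correct)     # 52.0

# The difference between the two scores indexes how much the listener
# gained from supportive context.
context_benefit = high_score - low_score             # 28.0
```

A listener who recognizes words well but cannot exploit context would show two similar scores and a small difference; a listener who leans heavily on context would show a large difference, as in this invented example.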
The original SPIN sentences are presented with background "babble" noise, which consists of 12 people reading passages aloud simultaneously. (This babble was used to simulate realistic listening conditions: most listeners, in everyday situations, must perceive speech in the presence of many competing signals that tend to mask the speech (Kalikow et al., 1977).) The S:N ratio can be varied to determine the effect of different amounts of background noise on speech perception and language comprehension. Bilger et al. (1984) found that, although the original SPIN sentences were reliable, the sentence sets were not equivalent for hearing-impaired listeners. They discarded two sets of sentences that deviated greatly from the others (as did Kalikow et al. in the original experiment). Then they rearranged the content of the remaining sets until they had eight sets that were equivalent at a S:N ratio of +8 dB. These eight sets are called the Revised SPIN (SPIN-R) Test; they are presented with background babble noise that consists of eight people talking, rather than twelve. Because the babble spectrum changes over time (as does spoken language), its synchronization with the sentences must be fixed; otherwise, lulls in the babble would vary in how they coincided with particular portions of the sentence, thereby introducing unwanted variability in recognition (Bilger et al., 1984). Using the SPIN-R Test, Pichora-Fuller et al. (1995) demonstrated that age affects the ability to recognize sentence-final words. They found that older, normal-hearing subjects performed more poorly than young subjects at the same signal-to-noise ratio, both for high- and low-context sentences. Further, they found that older, presbycusic listeners performed more poorly than old, normal-hearing or young listeners in these conditions.
In addition, the older subjects used context more effectively in those signal-to-noise conditions where young and old did equally poorly on low-context sentences. Thus age affected participants' ability to recognize words, and perhaps led them to be better at using context in conditions where words would be ambiguous without supportive context. Newman (1982) temporally compressed the SPIN sentences to 60% of the normal rate (using a speech time compressor/expander) to determine the effect on young listeners' word recognition. He found that, in the presence of this temporal stress, listeners relied greatly on context to interpret the distorted speech. (Note, however, that Newman did not specify the S:N ratio he used.) Thus the young participants' performance resembled that of the older participants in Pichora-Fuller et al.'s (1995) study, suggesting that external temporal distortion can simulate in young, normal-hearing listeners the behavior that older listeners exhibit in listening conditions without external temporal distortion of the signal. If this is so, perhaps internal temporal distortion is one of the underlying causes of the difficulty elderly people experience when trying to comprehend language spoken in noise. To investigate this notion, the present study used SPIN-R sentences that were distorted with temporal jitter, to observe how this jitter affected young, normal-hearing listeners' ability to recognize words presented in noise. The following section describes this external temporal jitter, as well as the preparation of the three degrees of jitter used in this study.

2.3.2.2 Preparation of Jittered Stimuli

A signal, such as a SPIN sentence, can be represented temporally on a graph of amplitude vs. time (refer to Appendix D for such graphs of jittered and unjittered pure tones and speech sounds).
This graph consists of many data points (in the case of the SPIN-R sentences, 20,000 per second, as determined by the sampling rate at the time of digital recording). Each point on the graph has an amplitude and a time coordinate. When digital signals are played, these many points are "smoothed" (connected). Thus the temporal envelope of the digital speech signal is based on the amplitude and time coordinates of every point on the graph. A temporally-asynchronous (i.e., jittered) signal is one in which the amplitude values have been moved from their original temporal positions. The overall amplitude of the speech signal, as well as the perceived pitch of the words, remains unchanged. The temporal structure of a jittered signal differs from that of the original, thus distorting the original sound quality. The SPIN-R sentences were jittered using a program written by Bruce Schneider (1997). As discussed above, this program distorted the temporal structure of the original signal by variably delaying the time coordinates of various amplitude values in the signal file. The delay affecting each coordinate could be positive (such that the point moved forward on the time axis) or negative (such that the point moved backward). In naturally-occurring, internal jitter, such as that hypothesized to exist in the elderly, the size of each delay value and the number of times this value changes are thought to vary randomly over time (Durlach, 1972; Pichora-Fuller & Schneider, 1992). If one assumes that the size of each delay value varies randomly in the same way that the amplitude of a band-limited noise varies randomly, one could generate random delay values using the Gaussian distribution of band-limited white noise. In this case, the possible range of delay values would be determined by the standard deviation of the distribution. For example, in a distribution with a large standard deviation (e.g., 5),7 the possible variation from the mean would be large.
Thus amplitudes that deviated greatly from the mean - and therefore large positive and negative delay values - would be possible. If delay values were generated from such a distribution and applied to points in a signal, the resulting degree of jitter (and therefore distortion) would be high. Thus a high standard deviation creates a distortion that may mimic an internal, highly asynchronous temporal processor. Similarly, delay values generated from a distribution with a small standard deviation (e.g., 1) would be small, and the resulting degree of jitter would be small.8 To carry this notion still further, the (random) rate at which the delay values change could be generated using the frequency bandwidth of the noise distribution. Delay values generated from a distribution that has a bandwidth with a high-frequency upper cut-off (e.g., 500 Hz) would vary quickly over time; the resulting degree of jitter (and therefore distortion) would be high. Thus, as with a high standard deviation, a high-frequency bandwidth creates a distortion that may mimic an internal, highly asynchronous temporal processor. Similarly, delay values generated from a distribution that has a bandwidth with a low-frequency upper cut-off (e.g., 100 Hz) would vary more slowly over time. The degree of jitter resulting from such values would be smaller. Thus, using this model, one could employ both standard deviation and frequency bandwidth cut-off values to generate delay sizes and rates of delay variation; these values could then be applied to a speech signal to jitter it.

7 A standard deviation of 1 is the size of one data point in the digital file. Since the file contains 20,000 data points per second, a standard deviation of 1 is equivalent to a 0.05 msec standard deviation, and a standard deviation of 5 is equivalent to a 0.25 msec standard deviation.

8 Note that a standard deviation of 0 would imply no temporal deviation at all; i.e., the original signal would be unchanged.
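The delay-generation scheme described above can be sketched as follows. This is a minimal illustration of the idea, not Schneider's (1997) program: delay values are drawn as band-limited Gaussian noise with a chosen standard deviation (in samples) and upper frequency cut-off, and the signal is then resampled at the displaced time points by linear interpolation. The FFT-based band-limiting, the interpolation method, and all function names are assumptions.

```python
import numpy as np

def bandlimited_gaussian_delays(n, fs, bandwidth_hz, sd_samples, seed=0):
    """Gaussian delay track, band-limited to `bandwidth_hz`, rescaled so
    its standard deviation equals `sd_samples` (delays in sample units)."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[freqs > bandwidth_hz] = 0.0       # apply the upper cut-off
    delays = np.fft.irfft(spectrum, n)
    return delays * (sd_samples / delays.std())

def jitter_signal(signal, fs, sd_samples, bandwidth_hz, seed=0):
    """Resample `signal` at randomly displaced time points (positive
    delays move a point forward in time, negative ones backward)."""
    n = len(signal)
    delays = bandlimited_gaussian_delays(n, fs, bandwidth_hz, sd_samples, seed)
    t = np.arange(n, dtype=float)
    return np.interp(t + delays, t, signal)

fs = 20000                                     # SPIN-R sampling rate
t = np.arange(0, 0.1, 1.0 / fs)
tone = np.sin(2 * np.pi * 500 * t)
# SD = 5 samples (0.25 msec), BW = 100 Hz: the "high-SD" moderate condition
jittered = jitter_signal(tone, fs, sd_samples=5, bandwidth_hz=100)
```

Because the delays only displace existing sample values, the output stays within the amplitude range of the input, and setting the standard deviation to 0 returns the signal unchanged, consistent with footnote 8.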
The jitter program designed by Schneider (1997) creates a Gaussian distribution of white-noise amplitude values using standard deviation and frequency bandwidth values (entered by the experimenter). Then it generates, from this distribution, jitter delay values that vary randomly in size and change at a specified rate, and applies these delay values to a digital speech signal. By selecting the standard deviation and frequency bandwidth, the experimenter controls the degree of jitter added to the speech signal.

Three jitter conditions were used with the SPIN-R sentences; two conditions had moderate amounts of jitter, and one had a high degree of jitter. The first moderate jitter condition resulted from assigning the white-noise distribution a standard deviation (SD) of 5 and a frequency bandwidth (BW) of 100 Hz. The second moderate jitter condition resulted from assigning the white-noise distribution a SD of 1 and a BW of 500 Hz. The third, most jittered condition came from a distribution whose SD was 5 and BW was 500 Hz. Refer to Appendices D and E for graphs and spectrograms of tones and speech sounds jittered using either high standard deviation, high bandwidth, or both high standard deviation and high bandwidth values.[9]

Note that in the first moderate condition, the standard deviation was high and the bandwidth low, while in the other moderate condition the situation was reversed. This was done to ascertain, separately, the effects of high bandwidth and high standard deviation on listener performance. (The first moderate jitter condition will hereafter be referred to as the high-SD condition. The second moderate condition will hereafter be referred to as the high-BW condition.) These particular pairs of values (SD=5, BW=100; SD=1, BW=500) were chosen because they both resulted in a moderate amount of jitter that, in the pilot study, made the performance of young, normal-hearing listeners resemble that of old, normal-hearing listeners. (Refer to section 2.3.4 for details about the pilot study procedure.)

Note, however, that the high-SD sentences sounded qualitatively different from the high-BW sentences. To the experimenter, the sentences with high-SD jitter had a muffled, warbled sound quality, while the sentences with high-BW jitter sounded as if static were interfering with the signal. In addition, the high-SD sentences sounded more distorted than the high-BW sentences. The values for the third, most-jittered condition (hereafter referred to as the high-SD/BW condition) were chosen because they generated the greatest degree of temporal distortion in which the pilot participants could still perform at the lowest signal-to-noise ratio (+4 dB). For more information, refer to section 2.3.4, in which the pilot study procedure is described in greater detail.

[9] Note, in the spectrograms in Appendix E, that the external jitter appears as a narrow band of noise surrounding the pure tone or speech signal. Further, the band of noise increases in width as the degree of jitter increases. (This band can be clearly seen, for example, in the jittered 2000-Hz pure tones.)

2.3.2.3 Calibrating the Sound Level of the Stimuli

The RMS values of the SPIN-R sentences had already been established (Kalikow et al., 1977). These values were incorporated into the in-house computer program that played SPIN-R sentences for the listener. Thus calibration occurred automatically, as the sentences were played for participants.

2.3.3 Apparatus and Physical Setting for the Pilot Study

All pilot and main experiment sessions took place with the participant seated in a double-walled, sound-attenuating IAC booth. Participants listened to SPIN-R sentences through TDH 39P 10W headphones and responded verbally. The experimenter controlled the presentation of SPIN-R sentences (form number, jitter condition, and signal-to-noise ratio) from outside the booth.
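The RMS-based calibration described in section 2.3.2.3 amounts to scaling every sentence file to a common RMS value before presentation, so that each file plays at the same level. A minimal sketch (illustrative; not the in-house program):

```python
import numpy as np

def scale_to_rms(signal, target_rms):
    """Scale a signal so its root-mean-square amplitude equals target_rms.
    Equalizing RMS across sentence files holds presentation level constant."""
    x = np.asarray(signal, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    return x * (target_rms / rms)
```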
The SPIN-R sentences were played for the participant using an in-house computer program on an IBM-compatible personal computer. These digital signals passed through the Tucker Davis Technologies DD1, FT5, PA4, SM3, and HB5 modules before reaching the participant's headphones. (Both Gain dials on the SM3 module were set to -20.) Refer to Appendix F for the purpose of each module and how these modules were connected.

2.3.4 Procedure for the Pilot Study

As mentioned earlier, the pilot study was performed to determine which S:N ratios caused the performance of young listeners with jittered stimuli to resemble that of old listeners with unjittered stimuli. The S:N ratios to be tested were 0, +4, +6, and +8 dB. First, audiograms and speech recognition thresholds were obtained for all participants. Then, for the SPIN task, participants were asked to repeat the last word of each SPIN-R sentence; refer to Appendix G for the specific instructions given to participants. The experimenter read aloud 10 practice sentences from the SPIN-R Practice Form, without jitter and without noise, to help participants practice the task.

The first pilot participant heard a different SPIN-R form in each of the three jitter conditions (high-SD, high-BW, and high-SD/BW). The two moderate jitter conditions were presented in a S:N ratio of +8 dB; the most jittered condition was presented in a S:N of 0 dB. This participant's performance revealed that: (1) the two moderate conditions in a S:N of +8 dB could, indeed, make a young, normal-hearing listener perform like an old, normal-hearing listener, as was desired; and (2) the most jittered condition (high-SD/BW) was too difficult to do in a S:N of 0 dB, thus a new, "difficult" S:N ratio had to be determined. The four remaining pilot participants heard the two moderate jitter conditions as the first participant did (in a S:N of +8 dB), and the most jittered condition in the S:N ratios of +4 and +6 dB.
Since participants could still perform reasonably well (i.e., with scores of at least 20% in the high-context condition) in the most difficult listening condition, +4 dB was selected as the "difficult" S:N ratio for the main experiment. In addition, although participants' high-context scores were near 100% in the easiest listening condition, low-context scores were not; with unjittered stimuli, such scores would be near ceiling for young - but not older - listeners (Pichora-Fuller et al., 1995). Thus a S:N of +8 dB was chosen to be the "easy" S:N ratio in the main experiment.

2.4 The Main Experiment

The following sections describe the participants, materials, setting, and procedures of the main experiment.

2.4.1 Participants in the Main Experiment

Sixteen people participated in the main experiment. Six were male and ten were female. All were native English speakers between the ages of 19 and 35, with 12 to 21 years of education. All had pure-tone air-conduction thresholds within normal limits bilaterally (see Appendix A for details).

2.4.2 Materials for the Main Experiment

As with the pilot study, the main experiment used jittered and unjittered SPIN-R sentences with multi-talker background babble. Refer to section 2.3.2, "Materials for the Pilot Study," for a description of all materials used in the main experiment.

2.4.3 Apparatus and Physical Setting for the Main Experiment

The set-up for participants in the main experiment was identical to that used in the pilot study. Refer to section 2.3.3, "Apparatus and Physical Setting for the Pilot Study," for more information.

2.4.4 Procedure for the Main Experiment

In the main experiment, audiograms and speech recognition thresholds were obtained for all participants. All participants also completed an in-house, one-page form concerning their hearing and language history, followed by the 20-item Mill Hill vocabulary test (Raven, 1938). Refer to Appendix B for vocabulary scores and other participant information.
After doing the pure-tone and SRT tests, each participant then performed the Gap Detection Test. This test is described in detail in the following section.

2.4.4.1 The Gap Detection Test

The purpose of the Gap Detection Test was to find the shortest duration of silence that the participant could detect within a pure tone presented monaurally at normal conversational volume. To administer this task, the experimenter (from outside the booth) started the program (Schneider et al., 1994) that would present the pure-tone pips. The participant listened to the pips through the TDH 39P 10W headphones while holding a three-button box. Two of the buttons on this box could be pushed as possible responses for each trial; the third button could be used to start the gap detection test or the next trial in the test.

In each trial of the test, the participant heard two very short tones. One tone had a silent interval in the middle; the other did not. The participant had to indicate which tone had the silent interval by pushing a corresponding button. (Refer to Appendix G for the specific instructions given to each participant.) After the participant pushed a button, a light above the correct button would light up. The participant started the test and went on to another trial by pushing a button when he or she was ready.

This Gap Detection program (Schneider, 1994) used an adaptive procedure over several trials to calculate a gap detection threshold (in msec) for the participant. An adaptive procedure is one in which the stimulus level for each trial is determined by how the participant responded to the previous level in the preceding trial. Several strategies can be used (singly or combined) to determine when, and by what increment, to increase or decrease the stimulus level.
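One common strategy is Levitt's (1971) two-down/one-up staircase, which converges on the 70.7%-correct point. Whether Schneider's program used exactly this rule is not stated, so the sketch below is illustrative:

```python
def two_down_one_up(start, step, n_reversals, respond):
    """Two-down/one-up adaptive track (Levitt, 1971): the gap shrinks after
    two consecutive correct responses and grows after any miss, converging
    on the 70.7%-correct point. `respond(level)` returns True when the
    listener answers correctly. Threshold = mean of the reversal levels."""
    level, correct_run, direction = start, 0, None
    reversals = []
    while len(reversals) < n_reversals:
        if respond(level):
            correct_run += 1
            if correct_run == 2:              # two in a row: make it harder
                correct_run = 0
                if direction == "up":
                    reversals.append(level)   # direction changed: a reversal
                direction = "down"
                level = max(level - step, step)
        else:                                 # any miss: make it easier
            correct_run = 0
            if direction == "down":
                reversals.append(level)
            direction = "up"
            level += step
    return sum(reversals) / len(reversals)
```

For example, a simulated listener who always detects gaps of 4 msec or longer, starting at 10 msec with a 1 msec step, yields a threshold estimate between the last detectable and first undetectable gap.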
Adaptive procedures have many advantages over conventional procedures (which decrease the gap size over a fixed number of trials); for example, they are more efficient, and can lead to reliable measures after a small number of trials. For a discussion of different types of adaptive procedures, refer to Levitt (1971).

During the gap threshold testing, the experimenter discovered that some participants had difficulty learning the task required of them. These participants also had larger gap thresholds than other subjects (e.g., one participant's threshold was 15 msec, whereas many others were below 6 msec). When these participants were allowed to do the test a second time, they became more proficient at the task and their gap thresholds decreased. The experimenter therefore allowed all participants to repeat the gap test. Subjects whose gap thresholds dropped by more than 1 msec were allowed to do the test a third time, if they so desired. Thus all participants performed at least two gap detection tests, and some performed three. One participant was asked to do two extra tests, to observe the changes in gap threshold over many trials. Refer to Appendix B for the final gap detection threshold of each participant; refer to the Results section for more discussion on gap test results.

2.4.4.2 The SPIN Task

After doing the Gap Detection Test, participants performed the SPIN Task. In this task, participants were asked to repeat the last word of each SPIN-R sentence; refer to Appendix G for the specific instructions given to participants. The experimenter then read aloud 10 practice sentences from the SPIN-R Practice Form, without jitter and without noise, to help the participant practice the task. The SPIN-R sentences were presented monaurally to the participant's right ear at 70 dB SPL; this value is equivalent to 50 dB HL (Wilber, 1994), which is a typical conversational level.
All jitter conditions (high-BW, high-SD, and high-SD/BW) were presented in both S:N ratios, in an order that went from easiest to most difficult. (This was done so that, if learning affected performance, participants would benefit most at the most difficult jitter and S:N conditions. If participants still did poorly at the most difficult combination of conditions, the experimenter could then conclude that, despite any learning, jitter affected participants' performance.) The order of SPIN-R forms was balanced across participants, jitter conditions, and S:N conditions. (This was done to control for any effects of inequivalence among forms. The SPIN-R forms have been shown to be equivalent at a S:N of +8 dB (Bilger et al., 1984), but this has not been shown for a S:N of +4 dB.) Participants' percent-correct scores in each jitter and S:N condition are listed in Appendix H. Refer to Appendix I for the order of conditions and SPIN-R forms used for each participant. Refer to the Results section for a detailed discussion of the participants' performance in all conditions.

3. RESULTS

3.1 Introduction

In this chapter, the results of the pilot and main experiments will be described. In the section which follows, the procedure used to score the SPIN-R forms in the pilot and main experiments will be described. The results of the pilot study will then be briefly discussed. The remaining sections will review the results of the main experiment; specifically, the effects of experimental variables (degree of jitter, S:N ratio, and sentence context) on participants' scores will be described in detail.

3.2 Scoring Procedure

In this experiment, participants heard each SPIN form in one of eight jitter by S:N combinations.
The four jitter conditions were: (1) no jitter; (2) the high-BW condition, which was obtained using a high bandwidth (BW=500 Hz) and low standard deviation (SD=1)[10]; (3) the high-SD condition, which was obtained using a low bandwidth (BW=100 Hz) and high standard deviation (SD=5); and (4) the high-SD/BW condition, which was obtained using a high bandwidth (BW=500 Hz) and high standard deviation (SD=5). The two S:N conditions were +4 dB and +8 dB. Participants had to repeat the last word of each sentence; their response was scored correct if it was exactly the same as the word presented. For example, if their response was a plural form of the word, it was scored as incorrect. Participants received two percent-correct word-recognition scores for each SPIN form: one for high-context sentences and one for low-context sentences.

[10] Recall that a standard deviation of 1 is equivalent to 0.05 msec, and a standard deviation of 5 is equivalent to 0.25 msec.

3.3 Results of the Pilot Study

Five people participated in the pilot study, which was conducted to determine the range of conditions to be used in the main experiment. (The goal of the main experiment was to test a span of performance from near-ceiling in the easiest condition to just below 20% correct in the most difficult condition.) The first participant heard SPIN-R forms in both moderate (high-SD and high-BW) jitter conditions in a S:N ratio of +8 dB; this person also heard forms in the most jittered condition (high-SD/BW) in S:N ratios of 0 dB and +4 dB. In the +8 dB S:N-ratio condition, this participant obtained near-ceiling scores on the high-context sentences in both high-SD and high-BW jitter conditions, and scores of 80% on the low-context sentences. In the +4 dB S:N-ratio condition, all scores in these two jitter conditions plummeted, with a large gap appearing between high- and low-context scores. In the 0 dB S:N-ratio condition, the participant could respond to only 4 of the 50 sentences.
Therefore the 0 dB S:N-ratio condition was discarded from the pilot and main experiments because it was too difficult. The four remaining participants also heard SPIN-R forms in both moderate jitter conditions in a S:N ratio of +8 dB, and the high-SD/BW jittered forms in S:N ratios of +4 dB and +6 dB. As with the first participant, their high-context scores were near ceiling in the +8 dB S:N-ratio condition; however, their low-context scores in this S:N ratio were considerably lower (i.e., 52-72% for the high-SD jitter, and 68-84% for the high-BW jitter). Note that, for low-context sentences, the scores in high-SD jitter were generally lower than those in high-BW jitter.

For the high-SD/BW jitter condition, both high- and low-context scores in the +6 dB S:N-ratio condition were better than those in the +4 dB S:N-ratio condition. In each S:N condition, high-context scores were considerably better than low-context scores. The lowest scores occurred (not surprisingly) with the low-context sentences in the +4 dB S:N-ratio condition, in which scores ranged from 4% to 32% correct. Because the participants' performance with jittered stimuli in the +4 dB S:N-ratio condition resembled that of elderly listeners with unjittered stimuli (Pichora-Fuller et al., 1995), this S:N-ratio condition was selected for inclusion in the main experiment as the "difficult" S:N condition, along with the "easier" +8 dB S:N condition.

3.3.1 Types of Errors Made by Pilot Participants

As mentioned in the previous section, participants' low-context scores in the "easier" conditions were quite low, relative to the high-context scores.
Errors participants made on low-context sentences in the +8 dB S:N-ratio condition included: (1) substituting one voiceless stop for another, such as IM for /k/; (2) substituting one back vowel for another, or one front vowel for another; (3) replacing word-final nasals with stops, such as /t/ for /nd/; (4) omitting word-final consonants; (5) adding word-final consonants (e.g., adding plural suffixes or other consonants to the root word); and (6) replacing the entire word with one that occurs more frequently in English (e.g., replacing "dart" with "dark"). Note that the incorrect sounds almost always had one or more features in common with the target sounds, in voicing, place, and/or manner. Further, in each incorrect response in the moderate jitter conditions, most participants replaced only one sound in the word.

With the high-SD/BW jitter condition, many of the errors in the high- and low-context sentences contained more than one sound substitution. For example, in one low-context sentence, the word "prize" was replaced with "brats" by one participant. Thus the responses resembled the target words less as the S:N ratio decreased. In addition, participants were less able to use context to anticipate the last word, thus the high-context responses depended more on which sounds had been perceived. The participants' responses therefore did not always make sense within the supportive context of the sentences, but had some sounds in common with the target word.

3.4 Results of the Main Experiment

The following sections review participants' scores in each experimental condition, and discuss effects of the experimental variables on these scores.

3.4.1 Effect of Jitter Condition on Sentence-Final Word-Recognition Scores

Table 1 lists the mean percent-correct score and corresponding standard deviation for each jitter condition. Each mean includes scores for both high- and low-context sentences, as well as the +4 dB and +8 dB S:N-ratio conditions.

Table 1. Mean Percent-Correct Word-Recognition Score for Each Jitter Condition.

    Jitter Condition    Mean Percent Correct    Standard Deviation
    No Jitter           87.8                    12.54
    High-BW             86.3                    14.21
    High-SD             78.7                    19.15
    High-SD/BW          45.2                    25.25

Figure 1 shows the mean percent-correct word-recognition score in each jitter condition.

Figure 1. Mean Percent-Correct Word-Recognition Score in Each Jitter Condition.

Overall, participants performed similarly in the no-jitter and high-BW conditions. Recall that the high-BW condition had a low standard deviation (SD=1), which caused delay values to deviate little from the original, and a high bandwidth (BW=500 Hz), which caused the delay values to change faster over time. Scores in the no-jitter and high-BW conditions were superior to those in the high-SD and high-SD/BW conditions. Recall that the high-SD condition had a high standard deviation (SD=5), which caused delay values to deviate greatly from the original, and a low bandwidth (BW=100 Hz), which caused delay values to change slowly over time. Participants made the most errors in the high-SD/BW condition, which had both high standard deviation and high bandwidth (SD=5; BW=500 Hz).

These observations were supported by an analysis of variance, which demonstrated a significant effect of jitter condition on percent-correct scores [F(3, 45)=309.4, p<0.01]. The results of a Student-Newman-Keuls test of multiple comparisons indicated that there were the following significant differences among the jitter conditions (p<0.01):

1. There was no significant difference between performance without jitter and performance in the high-BW condition.
2. Scores in the no-jitter and high-BW conditions were significantly better than those in the high-SD and high-SD/BW conditions.
3. Performance in the high-SD condition was significantly better than that in the high-SD/BW condition.
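The analysis above is a one-way repeated-measures ANOVA (16 listeners, four jitter conditions), which removes between-subject variance from the error term. A hand-rolled sketch on synthetic scores (not the thesis data):

```python
import numpy as np

def rm_anova_oneway(scores):
    """One-way repeated-measures ANOVA on an (n_subjects, k_conditions)
    array. Partitions total variance into subject, condition, and residual
    terms and returns (F, df_condition, df_error)."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    ss_total = ((scores - grand) ** 2).sum()
    ss_subj = k * ((scores.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cond = n * ((scores.mean(axis=0) - grand) ** 2).sum()   # between conditions
    ss_err = ss_total - ss_subj - ss_cond                      # residual term
    df_cond, df_err = k - 1, (k - 1) * (n - 1)
    return (ss_cond / df_cond) / (ss_err / df_err), df_cond, df_err
```

With 16 subjects and 4 conditions the degrees of freedom are (3, 45), matching the F(3, 45) reported above.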
3.4.2 Effect of Signal-to-Noise Ratio on Sentence-Final Word-Recognition Scores

Overall, performance in the easier S:N-ratio condition (+8 dB) was better than that in the more difficult S:N-ratio condition (+4 dB). The mean percent-correct score in the +8 dB S:N-ratio condition (including both high- and low-context conditions) was 78.3; the standard deviation was 23.4. The mean percent-correct score in the +4 dB S:N-ratio condition was 70.7; the standard deviation was 26.5. An analysis of variance confirmed that performance in the +8 dB S:N-ratio condition was significantly better than that in the +4 dB S:N-ratio condition [F(1, 15)=74.5, p<0.01]. The effect of S:N condition on performance varied depending upon the jitter condition. Refer to section 3.4.4 for details on the interaction between jitter and S:N conditions.

3.4.3 Effect of Context on Sentence-Final Word-Recognition Scores

Performance in the high-context sentences was better than that in the low-context sentences. The mean percent-correct score for high-context sentences (including both +4 dB and +8 dB S:N-ratio conditions) was 89.7; the standard deviation was 16.2. The mean percent-correct score for low-context sentences was 59.3; the standard deviation was 23.5. An analysis of variance confirmed that there was a significant main effect of context on performance [F(1, 15)=956.0, p<0.01]. This effect varied depending upon the jitter condition, but did not vary significantly across the two S:N-ratio conditions. Refer to section 3.4.5 for details on the interaction between jitter and context conditions, and to section 3.4.6 for details on the interaction between S:N-ratio and context conditions.

3.4.4 Interaction of Jitter and Signal-to-Noise-Ratio Conditions

Table 2 lists the mean percent-correct score and corresponding standard deviation for each jitter condition in each S:N-ratio condition. Each mean includes scores for both high- and low-context sentences.
Table 2. Mean Percent-Correct Word-Recognition Score for Each Jitter Condition in Each S:N-Ratio Condition.

    S:N Ratio (dB)    Jitter Condition    Mean Percent Correct    Standard Deviation
    8                 No Jitter           89.6                    11.03
    8                 High-BW             88.4                    12.95
    8                 High-SD             82.6                    16.59
    8                 High-SD/BW          52.5                    27.12
    4                 No Jitter           86.0                    13.82
    4                 High-BW             84.3                    15.31
    4                 High-SD             74.7                    20.93
    4                 High-SD/BW          37.9                    21.21

Figure 2 shows the mean percent-correct word-recognition score in each jitter and S:N-ratio condition.

Figure 2. Mean Percent-Correct Word-Recognition Score in Each Jitter and S:N Condition.

Within each S:N condition, performance was similar in the no-jitter and high-BW conditions. In addition, participants scored higher in these two jitter conditions than in the high-SD and high-SD/BW conditions. Performance in the high-SD condition was better than that in the high-SD/BW condition within each S:N-ratio condition. If one compares scores across S:N ratios, one can see that, for all jitter conditions, performance in the +4 dB S:N-ratio condition was worse than that in the +8 dB S:N-ratio condition. Scores in the no-jitter and high-BW conditions in the +4 dB S:N-ratio condition were slightly lower than those in the +8 dB S:N-ratio condition, although all these scores were quite similar. Performance in the high-SD condition was worse in the +4 dB S:N-ratio condition than in the +8 dB S:N-ratio condition. The greatest difference in performance can be seen in the high-SD/BW condition, where the mean score in the +4 dB S:N-ratio condition was considerably lower than that in the +8 dB S:N-ratio condition.

These observations were supported by an analysis of variance, which demonstrated a significant interaction effect between jitter and S:N-ratio conditions [F(3, 45)=10.1, p<0.01].
The results of a Student-Newman-Keuls test of multiple comparisons indicated that the following significant interactions occurred between jitter and S:N-ratio conditions (p<0.01):

1. Scores in the no-jitter and high-BW conditions did not differ significantly within each S:N-ratio condition.
2. There was no significant S:N-ratio effect on performance in the no-jitter and high-BW conditions.
3. Within each S:N-ratio condition, performance in the no-jitter and high-BW conditions was significantly better than that in the high-SD and high-SD/BW conditions.
4. There was a significant S:N-ratio effect on performance in the high-SD and high-SD/BW conditions.
5. Within each S:N-ratio condition, scores in the high-SD condition were significantly better than those in the high-SD/BW condition.

3.4.5 Interaction of Jitter and Context Conditions

Table 3 lists the mean percent-correct word-recognition score and corresponding standard deviation for each jitter condition in each context condition. Each mean includes scores for both S:N-ratio conditions.

Table 3. Mean Percent-Correct Word-Recognition Score for Each Jitter Condition in Each Context Condition.

    Context    Jitter Condition    Mean Percent Correct    Standard Deviation
    High       No Jitter           98.6                    2.61
    High       High-BW             98.4                    2.72
    High       High-SD             95.4                    5.09
    High       High-SD/BW          66.4                    16.78
    Low        No Jitter           77.0                    8.44
    Low        High-BW             74.2                    10.05
    Low        High-SD             62.0                    12.02
    Low        High-SD/BW          24.0                    9.37

Figure 3 shows the mean percent-correct word-recognition score in each jitter and context condition.

Figure 3. Mean Percent-Correct Word-Recognition Score in Each Jitter and Context Condition.

It can be seen in Figure 3 that, for high-context sentences, performance was near ceiling for the no-jitter, high-BW and high-SD conditions. Performance in the high-SD/BW jitter condition in high context was noticeably worse than that in the other jitter conditions.
In the low-context sentences, scores were highest (and quite similar) in the no-jitter and high-BW conditions. Performance in the high-SD condition in low context was worse than that in the no-jitter and high-BW conditions; scores dropped considerably more in the high-SD/BW condition. If one compares performance across context conditions, one can see that, for all jitter conditions, performance in low context was worse than that in high context. The low-context mean scores in the no-jitter and high-BW conditions were approximately 20% less than the high-context means. The low-context mean score in the high-SD condition was almost 30% less than the high-context mean. The greatest difference in performance can be seen in the high-SD/BW condition, where the mean score in low context was almost one-third that in high context.

These observations were supported by an analysis of variance, which demonstrated a significant interaction effect between jitter and context conditions [F(3, 45)=20.8, p<0.01]. The results of a Student-Newman-Keuls test of multiple comparisons indicated that the following significant interactions occurred between jitter and context conditions (p<0.01):

1. Scores in the no-jitter, high-BW, and high-SD conditions did not differ significantly within the high-context condition; however, they were significantly better than the high-SD/BW scores in this condition.
2. No-jitter and high-BW scores did not differ significantly within the low-context condition; however, they were significantly better than the high-SD and high-SD/BW scores in this condition.
3. In the low-context condition, high-SD performance was significantly better than high-SD/BW performance.
4. There was a significant context effect on performance in all jitter conditions; that is, performance in each jitter condition was significantly better in high context than in low context.
3.4.6 Interaction of Signal-to-Noise-Ratio and Context Conditions

Table 4 lists the mean percent-correct word-recognition score and corresponding standard deviation for each S:N-ratio condition in each context condition. Each mean includes scores for all four jitter conditions.

Table 4. Mean Percent-Correct Word-Recognition Score for Each S:N-Ratio Condition in Each Context Condition.

    S:N Ratio (dB)    Context    Mean Percent Correct    Standard Deviation
    8                 High       93.4                    11.18
    8                 Low        63.2                    22.72
    4                 High       86.0                    19.44
    4                 Low        55.4                    23.70

Figure 4 shows the mean percent-correct word-recognition score in each S:N-ratio and context condition.

Figure 4. Mean Percent-Correct Word-Recognition Score in Each S:N and Context Condition.

It can be seen in Figure 4 that, within each S:N-ratio condition, high-context performance was better than low-context performance. (Recall, from section 3.4.3, that context did indeed have a significant main effect on performance.) However, for each context condition, performance in the +8 dB S:N-ratio condition was only slightly better than that in the +4 dB S:N-ratio condition. An analysis of variance confirmed that there was no significant two-way interaction between S:N-ratio and context conditions.

3.4.7 Interaction of Jitter, Signal-to-Noise-Ratio, and Context Conditions

Table 5 lists the mean percent-correct score and corresponding standard deviation for each jitter condition in each S:N-ratio and context condition.
Table 5. Mean Percent-Correct Word-Recognition Score for Each Jitter Condition in Each S:N-Ratio and Context Condition.

    S:N Ratio (dB)    Context    Jitter Condition    Mean % Correct    Standard Deviation
    8                 H          No Jitter           99.5              2.00
    8                 H          High-BW             99.5              1.37
    8                 H          High-SD             97.2              3.49
    8                 H          High-SD/BW          77.2              11.57
    8                 L          No Jitter           79.7              6.28
    8                 L          High-BW             77.2              8.97
    8                 L          High-SD             68.0              10.01
    8                 L          High-SD/BW          27.7              8.94
    4                 H          No Jitter           97.7              2.91
    4                 H          High-BW             97.4              3.32
    4                 H          High-SD             93.5              5.82
    4                 H          High-SD/BW          55.5              14.00
    4                 L          No Jitter           74.2              9.57
    4                 L          High-BW             71.2              10.45
    4                 L          High-SD             56.0              11.03
    4                 L          High-SD/BW          20.2              8.45

Figure 5 shows the mean percent-correct word-recognition score in each jitter, S:N-ratio, and context condition.

Figure 5. Mean Percent-Correct Word-Recognition Score in Each Jitter, S:N (dB), and Context Condition.

It can be seen in Figure 5 that, within each context by S:N-ratio condition, performance in the no-jitter and high-BW conditions was similar, and performance in both was superior to performance in the high-SD and high-SD/BW conditions. Also, within each high-context condition, performance in high-SD jitter was similar to that in the no-jitter and high-BW conditions. Within each low-context condition, however, performance in the high-SD condition was noticeably worse than in the no-jitter and high-BW conditions. The scores in the high-SD/BW condition were the lowest of all within each context by S:N-ratio condition. The greatest differences between scores in the high-SD/BW condition and scores in the other conditions were observed when context was low. If one compares performance across conditions, one can see that scores in the no-jitter, high-BW, and high-SD conditions were approximately the same when context was high, regardless of the S:N ratio.
However, performance in these conditions worsened noticeably when context was low, with the lowest scores occurring in the lower S:N-ratio condition. The high-SD condition was more affected than the no-jitter and high-BW conditions by both context and S:N ratio. Of all jitter conditions, the high-SD/BW condition was most strongly affected by context and S:N ratio: scores in this condition worsened considerably as context dropped from high to low and the S:N ratio worsened. These observations were supported by an analysis of variance, which demonstrated a significant three-way interaction effect between jitter, S:N-ratio, and context conditions [F(3, 45)=7.7, p<0.01]. The results of a Student-Newman-Keuls test of multiple comparisons indicated that the following significant interactions occurred between jitter, S:N-ratio, and context conditions (p<0.01):

1. Scores in the no-jitter, high-BW, and high-SD conditions did not differ significantly within or across the high-context conditions in both the +8 dB and +4 dB S:N-ratio conditions; however, they were significantly better than the high-SD/BW scores in these conditions.

2. No-jitter and high-BW scores did not differ significantly within or across the low-context conditions in both the +8 dB and +4 dB S:N-ratio conditions. Further, they did not differ significantly from the high-SD condition in low context in the +8 dB S:N-ratio condition. All these scores, however, were significantly better than the high-SD performance in low context in the +4 dB S:N-ratio condition.

3. In the low-context sentences, performance in the high-SD/BW condition was significantly worse than in the no-jitter, high-BW, and high-SD conditions, in both the +4 dB and +8 dB S:N-ratio conditions.

4. In the low-context condition, performance in high-SD/BW jitter in the +4 dB S:N-ratio condition was significantly worse than that in the +8 dB S:N-ratio condition.

5.
Performance in each jitter condition was significantly better in high context than in low context, regardless of S:N ratio.

3.4.8 Correlation between Performance, Gap Threshold, and Other Variables

The gap thresholds ranged from 1.7 msec to 8.2 msec. The mean gap threshold was 4.4 msec; the standard deviation was 2.2 msec. Analysis of the relationship between gap threshold and performance in each jitter by S:N-ratio by context condition yielded no correlation above r = .462 (p<0.05). Likewise, analysis of the relationships between performance and other variables, such as pure-tone threshold, speech recognition threshold, age, education, and vocabulary-test scores, yielded no correlation above r = -.688 (p<0.05)[11]. Therefore no significant correlation was found between performance and gap threshold or between performance and the other variables measured.

3.4.9 The Effect of Practice on Gap Detection Thresholds

As discussed in the Methods chapter, some participants had difficulty learning the task in the gap detection threshold test. The gap thresholds of these participants were initially quite large; these thresholds decreased and participants' skill at the task improved over several trials. For the remainder of the participants, performance remained steady (i.e., it did not change by more than 1 msec across test trials). Figure 6 shows how gap thresholds changed with practice for three participants who had difficulty with the task. One participant (S10) was asked to do two extra trials, so that the change in gap threshold could be observed over many trials, and to confirm that three trials were enough to arrive at an accurate gap threshold.

[Figure 6. Gap Threshold Changes Across Test Trials for 3 Participants (S6, S10, S11); gap threshold in msec plotted against gap test trial number.]

[11] Only one correlation (r = -.688) was significant (p<0.05).
This correlation was between participants' pure-tone air-conduction thresholds at 8000 Hz and performance on unjittered, high-context sentences in a S:N ratio of +8 dB. The significance of this correlation may be spurious, however, given the small number of values used to calculate the correlation.

It can be seen in Figure 6 that participants needed at least two trials in the gap detection test to reach their true gap threshold. All three participants' gap thresholds decreased by approximately 50% from trial 1 to trial 2; for two participants, the gap threshold decreased again from trial 2 to trial 3. The performance of S10 reveals that gap threshold remains moderately steady (within a 2 msec range) after trial 3. The gap threshold used in the correlations was the final threshold that each participant obtained over 2-3 trials. (That is, if the participant did the test twice, the second value was used; if the participant did the test three times, the third value was used.)

3.4.10 The Difference Between Word-Recognition Scores in High- and Low-Context Conditions

Participants' use of sentence context, as indicated by the difference between high- and low-context scores, differed depending on the jitter and S:N-ratio conditions. The following sections discuss the effects of jitter and S:N ratio on these differences.

3.4.10.1 Effect of Jitter Condition on the Difference Between Word-Recognition Scores in High- and Low-Context Conditions

Table 6 lists the mean difference between high- and low-context scores for each jitter condition, as well as the corresponding standard deviation for each mean. Each value includes scores for both +4 dB and +8 dB S:N-ratio conditions.

Jitter Condition   Mean Difference Between High- and Low-Context Scores   Standard Deviation
No Jitter          21.6                                                   7.99
High-BW            24.2                                                   10.37
High-SD            33.4                                                   11.61
High-SD/BW         42.4                                                   14.65

Table 6. Mean Difference Between High- and Low-Context Word-Recognition Scores for Each Jitter Condition.
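The difference scores in Table 6 can be recovered, to within rounding of the published means, from the condition means in Table 5: average each jitter condition's scores over the two S:N ratios within each context, then subtract low-context from high-context. The short Python sketch below illustrates this arithmetic; the dictionary layout and the function name `context_benefit` are my own, not part of the thesis materials.

```python
# Mean percent-correct scores from Table 5, keyed by (S:N ratio, context, jitter).
table5 = {
    (8, "H", "No Jitter"): 99.5, (8, "H", "High-BW"): 99.5,
    (8, "H", "High-SD"): 97.2,   (8, "H", "High-SD/BW"): 77.2,
    (8, "L", "No Jitter"): 79.7, (8, "L", "High-BW"): 77.2,
    (8, "L", "High-SD"): 68.0,   (8, "L", "High-SD/BW"): 27.7,
    (4, "H", "No Jitter"): 97.7, (4, "H", "High-BW"): 97.4,
    (4, "H", "High-SD"): 93.5,   (4, "H", "High-SD/BW"): 55.5,
    (4, "L", "No Jitter"): 74.2, (4, "L", "High-BW"): 71.2,
    (4, "L", "High-SD"): 56.0,   (4, "L", "High-SD/BW"): 20.2,
}

def context_benefit(jitter):
    """High-context minus low-context score, averaged over both S:N ratios."""
    high = sum(table5[(sn, "H", jitter)] for sn in (8, 4)) / 2
    low = sum(table5[(sn, "L", jitter)] for sn in (8, 4)) / 2
    return high - low

for jitter in ("No Jitter", "High-BW", "High-SD", "High-SD/BW"):
    print(f"{jitter}: {context_benefit(jitter):.2f}")
```

The computed values agree with Table 6 to within about 0.05 percentage points; the small discrepancies arise because the published tables report means already rounded to one decimal place.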
Figure 7 shows the mean difference between high- and low-context word-recognition scores for each jitter condition.

[Figure 7. Mean Difference Between High- and Low-Context Word-Recognition Scores for Each Jitter Condition.]

It can be seen in Figure 7 that the greatest difference between high- and low-context performance occurred in the high-SD/BW condition. In other words, participants relied on sentence context more in this jitter condition than in any other. They used context less in the high-SD condition than in the high-SD/BW condition; however, they used context more in high-SD than in the high-BW and no-jitter conditions. Participants relied on context the least in the no-jitter condition. These observations were supported by an analysis of variance, which demonstrated a significant effect of jitter condition on use of context [F(3, 45)=20.8, p<0.01]. The results of a Student-Newman-Keuls test of multiple comparisons indicated that there were the following significant differences among the jitter conditions (p<0.01):

1. Use of context in the high-SD/BW condition was significantly greater than that in all other jitter conditions.

2. In the high-SD condition, use of context was significantly less than that in the high-SD/BW condition, and significantly greater than that in the no-jitter and high-BW conditions.

3. Use of context did not differ significantly between the no-jitter and high-BW conditions.

3.4.10.2 Interaction Between Jitter Condition and Signal-to-Noise Ratio, and its Effect on the Difference Between Word-Recognition Scores in High- and Low-Context Conditions

Table 7 lists the mean difference between high- and low-context word-recognition scores, as well as the corresponding standard deviation, for each jitter condition in each S:N-ratio condition.
S:N Ratio (dB)   Jitter Condition   Mean Difference Between High- and Low-Context Scores   Standard Deviation
8                No Jitter          19.7                                                   6.28
8                High-BW            22.2                                                   9.79
8                High-SD            29.2                                                   9.98
8                High-SD/BW         49.5                                                   10.42
4                No Jitter          23.5                                                   9.22
4                High-BW            26.1                                                   10.87
4                High-SD            37.5                                                   11.94
4                High-SD/BW         35.2                                                   15.05

Table 7. Mean Difference Between High- and Low-Context Word-Recognition Scores for Each Jitter Condition in Each S:N-Ratio Condition.

Figure 8 shows the mean difference between high- and low-context word-recognition scores in each jitter and S:N-ratio condition.

[Figure 8. Mean Difference Between High- and Low-Context Word-Recognition Scores for Each Jitter and S:N Condition; scores for the four jitter conditions plotted against S:N ratio (dB).]

It can be seen in Figure 8 that, within each S:N-ratio condition, participants used context almost equally in the no-jitter and high-BW conditions; also, participants used context less in these conditions than in the high-SD and high-SD/BW conditions. Within the +4 dB S:N-ratio condition, context was used to almost the same degree in the high-SD and high-SD/BW conditions. Within the +8 dB S:N-ratio condition, participants relied most heavily on context in the high-SD/BW condition. If one compares performance across S:N-ratio conditions, one can see that, for the no-jitter and high-BW conditions, the degree to which participants used context remained largely unchanged as the S:N ratio dropped. In the high-SD condition, use of context increased slightly as the S:N ratio dropped. In the high-SD/BW condition, however, use of context decreased greatly as the S:N ratio dropped from +8 dB to +4 dB. These observations were supported by an analysis of variance, which demonstrated a significant interaction effect between jitter and S:N-ratio conditions [F(3, 45)=7.7, p<0.01].
The results of a Student-Newman-Keuls test of multiple comparisons indicated that the following significant interactions occurred between jitter and S:N-ratio conditions (p<0.01):

1. Use of context in the no-jitter and high-BW conditions did not differ significantly within or across S:N ratios.

2. In the +4 dB S:N-ratio condition, participants used context significantly more in the high-SD and high-SD/BW conditions than in the no-jitter condition; however, there was no significant difference in use of context among the high-SD, high-BW, or high-SD/BW conditions.

3. In the +8 dB S:N-ratio condition, participants relied on context significantly more in the high-SD/BW condition than in any other.

4. There was no S:N-ratio effect on context use in the no-jitter, high-BW, and high-SD conditions.

5. In the high-SD/BW condition, participants used context significantly more in the +8 dB S:N-ratio condition than in the +4 dB S:N-ratio condition.

3.4.11 Types of Errors Made by Participants in the Main Experiment

Various types of errors were made by participants in each jitter by S:N-ratio condition. First, although young listeners typically perform above 90% on unjittered sentences in S:N ratios of +4 and +8 dB (Pichora-Fuller et al., 1995), most participants in this experiment scored 80% or below in these conditions. Their errors primarily consisted of changing one sound in the target word; as in the pilot study, the incorrect sound usually had one or more features in common with the correct sound. In addition, there was great variation across participants: not all performed poorly. Thus the difference between performance on unjittered sentences in this study and in Pichora-Fuller et al.'s (1995) study can be seen to be minimal when one examines the pattern of errors. In the moderate jitter conditions, participants' errors resembled those made by the pilot participants (refer to section 3.3.1 for a list of these types of errors).
In the high-BW condition, errors consisted primarily of one-sound substitutions within a word. In the high-SD condition, the type of error was the same, but more of these errors were made. In the high-SD/BW condition, errors were again similar to those made by pilot participants. Specifically, participants tended to change more than one sound in each word (e.g., one participant replaced "dust" with "best"). In addition, as in the pilot study, participants were less able to use context to predict the last word. Thus their incorrect responses did not always make sense within the provided context, but frequently resembled the target word phonetically.

4. DISCUSSION

4.1 Review of Hypotheses

This study was designed to determine if the presence of temporal asynchrony affects the listener's ability to recognize and repeat words from high- and low-context sentences presented in multi-talker background babble. Participants listened to jittered and unjittered high- and low-context sentences presented in varying S:N ratios, and repeated each sentence-final word. By varying the degree and nature of the jitter, it was possible to evaluate the effect of this jitter on both speech perception and word recognition. The following null hypotheses were tested in this experiment:

1) In signal-to-noise ratios that permit near-ceiling performance of young, normal-hearing listeners on monaural auditory processing tasks, performance will remain essentially unchanged when the signal is temporally jittered.

2) Performance on monaural auditory processing tasks involving temporal jitter will not be correlated with gap detection thresholds.

3) Performance on monaural auditory processing tasks involving temporal jitter will not be correlated with age, years of education, or vocabulary-test scores within a young, normal-hearing group.

4) The two moderate jitter conditions tested in this experiment will be equally easy to perceive, in spite of their qualitative differences.
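The jitter conditions named in these hypotheses differ in two statistics of the underlying delay sequence: the size of the delays (SD) and how quickly successive delays change over time (bandwidth). The stimuli themselves were generated with Schneider's (1997) Jitspin2 program, whose internals are not reproduced here; the Python fragment below is only an illustrative sketch of how the two parameters can be varied independently, with an invented function name (`delay_sequence`) and a simple moving-average smoother standing in for true bandwidth limiting.

```python
import numpy as np

def delay_sequence(n, sd_ms, smooth_win, rng):
    """Gaussian delay values (in msec) with standard deviation sd_ms.

    A moving-average smoother limits how quickly successive delays can
    change (a crude stand-in for bandwidth limiting); rescaling afterwards
    restores the requested standard deviation.
    """
    d = rng.standard_normal(n)
    if smooth_win > 1:
        d = np.convolve(d, np.ones(smooth_win) / smooth_win, mode="same")
    return d / d.std() * sd_ms

rng = np.random.default_rng(0)
# "High-SD"-like: large delays that change slowly over time (heavy smoothing).
high_sd = delay_sequence(1000, sd_ms=2.0, smooth_win=50, rng=rng)
# "High-BW"-like: small delays that change quickly over time (no smoothing).
high_bw = delay_sequence(1000, sd_ms=0.5, smooth_win=1, rng=rng)

# Smoothing makes the average sample-to-sample change in delay much smaller,
# even though the overall delay SD is four times larger.
print(np.abs(np.diff(high_sd)).mean() < np.abs(np.diff(high_bw)).mean())
```

Offsetting a waveform's time coordinates by such a delay sequence perturbs only the timing of the signal, which is consistent with the thesis's point that the jitter manipulation left spectral level cues intact.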
4.2 Summary of Results

The following sections discuss whether this experiment supported or refuted the four null hypotheses.

4.2.1 Null Hypothesis 1: Performance on Temporally Jittered Sentences

It was hypothesized that participants' performance on the SPIN-R task would be unaffected by the presence of external temporal jitter. This experiment revealed that the external jitter affected both speech perception and word recognition. Further, the greater the degree of temporal jitter, the more adversely performance was affected. The jitter used in this experiment distorted only temporal cues; that is, spectral cues, such as the perceived loudness or pitch of the speech signal, were not altered. Thus the participants' poor performance in the moderate- and high-jitter conditions suggests that temporal cues are crucial for the recognition of words spoken in background noise. The necessity of temporal cues (and hence the ability to process these cues) has been demonstrated in many studies (e.g., Shannon et al., 1995; Shannon et al., 1992; Drullman, 1995; Van Tasell et al., 1992; Van Tasell et al., 1987). It has also been supported by the success of speech-recognition models that use temporal information to process language (e.g., Patterson & Allerhand, 1995; Ghitza, 1993). Thus it should not be surprising that distortion of temporal information in speech affects word recognition, particularly in the presence of competing speech signals. The performance of young, normal-hearing listeners with jittered stimuli was similar to that of old listeners with unjittered stimuli in other studies (e.g., Pichora-Fuller et al., 1995). Specifically, in moderate levels of jitter, the young listeners relied greatly on contextual cues as the S:N ratio decreased. Also, in the highest degree of jitter, although context helped greatly, it was not enough to help listeners achieve the high scores they achieved in the moderate-jitter conditions.
This behavior is reminiscent of that of older listeners attempting to perceive normal, unjittered speech in noise (Pichora-Fuller et al., 1995). The similarity of behavior supports the notion that older listeners may be experiencing difficulty perceiving speech in noise because their auditory systems distort the temporal information in the speech signals. Such temporal distortion is likely, given that many anatomical structures and physiological processes deteriorate with age (Schneider, 1997). Further, since many neural structures in the auditory system seem designed to phase-lock to stimuli (Greenberg, 1996), deterioration of these structures would affect phase-locking, which would compromise temporal synchrony during the transmission of speech signals to the auditory cortex. Thus internal temporal jitter in the elderly is highly probable and, if present, would likely interfere with their ability to understand language spoken in noise, just as the experimental jitter did with young listeners. This external jitter, then, may have accurately simulated the internal temporal jitter hypothesized to exist in the elderly.

4.2.2 Null Hypothesis 2: Correlation of Performance with Gap Threshold

It was hypothesized that the ability of young listeners to understand jittered sentences would not be correlated with gap threshold, a measure that is thought, by some, to be correlated with the ability to identify speech in noise (e.g., Tyler, Summerfield, Wood, & Fernandes, 1982). The gap detection task in this case was one that employed very brief, Gaussian-shaped pure-tone markers (Schneider, 1994). The duration and shape of these markers helped reduce spectral splatter, which would otherwise have helped the listeners discriminate the marker containing the gap from the continuous marker (Schneider et al., 1994). In addition, this test employed an adaptive procedure, which allowed a gap threshold to be obtained in very few trials (40 or less).
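The exact tracking rule of Schneider's (1994) gap detection program is not reproduced in this chapter. As an illustration only, the sketch below implements a generic 2-down/1-up transformed up-down staircase of the kind described by Levitt (1971), which converges on the stimulus level detected 70.7% of the time; the function name, step size, and simulated "toy observer" are all invented for the example and are not the thesis procedure.

```python
import random

def gap_staircase(true_threshold_ms, trials=40, start_ms=10.0, step_ms=1.0, seed=1):
    """Generic 2-down/1-up staircase (Levitt, 1971): the gap is shortened
    after two consecutive detections and lengthened after each miss, so the
    track converges on the gap duration detected 70.7% of the time."""
    rnd = random.Random(seed)
    gap, run, direction, reversals = start_ms, 0, -1, []
    for _ in range(trials):
        # Toy observer (illustrative only): always detects gaps longer than
        # its true threshold, and guesses at chance otherwise.
        detected = gap > true_threshold_ms or rnd.random() < 0.5
        if detected:
            run += 1
            if run == 2:
                run = 0
                if direction == +1:
                    reversals.append(gap)   # track turned downward
                direction = -1
                gap = max(gap - step_ms, 0.1)
        else:
            run = 0
            if direction == -1:
                reversals.append(gap)       # track turned upward
            direction = +1
            gap += step_ms
    # Threshold estimate: mean of the last few reversal points.
    tail = reversals[-6:] if reversals else [gap]
    return sum(tail) / len(tail)

print(gap_staircase(true_threshold_ms=4.4))
```

Because each trial moves the track toward the region of uncertainty, an adaptive rule of this kind concentrates trials near threshold, which is why a usable estimate can be obtained in 40 trials or fewer.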
Performance with jittered stimuli was not correlated with the gap thresholds obtained in this study; thus the null hypothesis was supported. However, a learning effect was found for the gap detection task used in this study. This suggests that the gap thresholds obtained may not have been entirely accurate; participants may have needed several more trials than they were given to perform this task properly. In addition, some researchers (e.g., Phillips, 1995) believe that the temporal process used in the gap detection task differs from processes relying on synchronization (phase-locking). Phillips suggested that the temporal process used in gap detection involves neurons sensitive primarily to stimulus onsets (i.e., choppers and onset units in the posteroventral cochlear nucleus (Greenberg, 1996)), and that these neurons differ from those involved in phase-locking to signals. Further, one temporal process could be impaired independently of the other. Thus a listener might be able to perform a gap detection task, and still be unable to recognize words in noise due to internal temporal asynchrony. The gap detection task may not, therefore, be representative of a listener's ability to process all temporal information, or to perceive speech in noise, accurately. Future experiments attempting to demonstrate a correlation between gap threshold and performance with jittered stimuli should: (1) employ a gap detection test that includes more trials; and (2) devise a "jitter-detection" threshold task that measures a listener's sensitivity to temporal asynchrony, and determine if this value correlates with gap threshold and/or performance.

4.2.3 Null Hypothesis 3: Correlation of Performance with Age, Years of Education, and Vocabulary-test Scores

This hypothesis stated that performance of young listeners in the jittered conditions would not be correlated with variables such as age, years of education, and vocabulary size.
The present study revealed no such correlations; thus the null hypothesis was supported. Note, however, that this experiment had only 16 subjects, most of whom were close in age, years of education, and vocabulary-test scores. Thus this sample was not representative of all possible ages and levels of listeners. Future research attempting to find correlations between performance on temporal processing tasks (such as the jittered SPIN-R task in this study) and age, years of education, or vocabulary size should therefore include participants of varying ages, education, and vocabularies.

4.2.4 Null Hypothesis 4: Perceptual Equivalence of the Moderate Jitter Conditions

It was hypothesized that the high-SD and high-BW jitter conditions would yield similar scores, in spite of their qualitative differences. This was not found to be true in the present study. Performance on the high-SD sentences was significantly worse than that on the high-BW sentences. In addition, scores in the high-BW condition did not differ significantly from those in the unjittered condition. Thus the high-SD form of temporal jitter, which contained delay values that deviated greatly in size from the original time coordinates but did not change quickly over time, significantly affected comprehension. The high-BW form of jitter, which contained delay values that changed quickly over time but did not deviate greatly from the original time coordinates, did not affect comprehension. The high-SD jitter may therefore be more representative of the internal temporal jitter thought to exist in elderly listeners.

4.3 Conclusions

4.3.1 The Effect of Temporal Asynchrony on Word Recognition

Both the high-SD and the high-SD/BW jitter adversely affected participants' word-recognition and speech-perception abilities, with the latter condition causing more perceptual errors.
This suggests that both the size of the delays and the rate of change in the delays of a temporally jittered signal can interfere with recognition of words spoken in noise. Further, context was helpful in the high-SD condition, but less so in the high-SD/BW condition. This suggests that, with large amounts of temporal jitter, the listener can no longer fully benefit from context. If, as has been hypothesized, this temporal jitter occurs internally during auditory processing in elderly listeners, such listeners will have great difficulty understanding the words they hear in noisy conditions. If, in addition to this internal jitter, an elderly listener has a high-frequency hearing loss, comprehension will be further compromised. Thus temporal synchrony may be a crucial factor in understanding language, particularly when it is spoken in noise.

4.3.2 The Effect of Context on Word Recognition

The presence of supportive context had a significant effect on listeners' comprehension in all jitter conditions. In the two moderate jitter conditions, context enabled participants to achieve near-perfect scores; in the highly jittered condition, context helped the listeners recognize three times as many words as they could without context. This suggests that using supportive context is a helpful strategy in overcoming the perceptual interference caused by temporal jitter and background noise. Thus, if elderly listeners do have internal temporal jitter resembling the external jitter of this experiment, such listeners would understand much more of the spoken message if they used context to identify words they had missed because of the jitter. Note that this conclusion is consistent with those in Wingfield's (1996) review.

4.3.3 The Effect of S:N Ratio on Word Recognition

The noise levels in this experiment also affected participants' word recognition, particularly in the high-SD and high-SD/BW jitter conditions.
Not surprisingly, the lower S:N ratio affected performance more in the high-SD/BW jitter condition than in the high-SD jitter condition. This suggests that background noise and internal temporal asynchrony (if similar to the external jitter in this study) can interact to adversely affect elderly listeners' recognition of spoken words. Such listeners would therefore understand more of the words they heard if the level of background noise were decreased (e.g., by the listener moving to a quieter area to converse).

4.4 Future Research Directions

Many aspects of this experiment could be altered to reveal more about the effect of temporal asynchrony on word recognition. First (as suggested earlier), participants of varying ages, years of education, or vocabulary size could do this SPIN task, to determine if the ability to compensate for temporal asynchrony is correlated with such variables. Second, different degrees of jitter could be used to distort the SPIN-R sentences. Only three types of jitter were used in this study; others could be created that contained, for example, slowly varying delays that did not deviate greatly from the original time coordinates (i.e., low-SD/BW jitter). Third, S:N ratios other than +4 dB and +8 dB could be combined with the three jitter conditions used in this experiment. Fourth, other types of stimuli could be jittered, to examine the effect of temporal asynchrony on other auditory processing tasks. Finally, a study could be done to establish whether working memory correlates with the ability to comprehend jittered language. Thus many changes could be made to the original experiment to reveal more about temporal processing abilities in the young and elderly, and how these abilities impact the degree to which people understand language in everyday life.

REFERENCES

Bilger, R. C., Nuetzel, M. J., Rabinowitz, W. M., & Rzeczkowski, C. (1984). Standardization of a test of speech perception in noise.
Journal of Speech and Hearing Research, 27, 32-48.

Blosser, J. L., Weidner, W. E., & Dinero, T. (1976). The effect of rate-controlled speech on the auditory receptive scores of children with normal and disordered language abilities. Journal of Special Education, 10(3), 291-298.

Blumstein, S. E., Katz, B., Goodglass, H., Shrier, R., & Dworetsky, B. (1985). The effects of slowed speech on auditory comprehension in aphasia. Brain and Language, 24(2), 246-265.

CHABA (Committee on Hearing, Bioacoustics, and Biomechanics). (1988). Speech understanding and aging. Journal of the Acoustical Society of America, 83, 859-893.

Colburn, H. S., & Durlach, N. I. (1978). Models of binaural interaction. In E. C. Carterette & M. P. Friedman (Eds.), Handbook of perception: Vol. IV. Hearing (pp. 467-517). NY: Academic Press.

Cranford, J. L., Morgan, M., Scudder, R., & Moore, C. (1993). Tracking of "moving" fused auditory images by children. Journal of Speech and Hearing Research, 36, 424-430.

CSRE (4.5) (1995). Computer Speech Research Environment. London, Ont.: AVAAZ Innovations, Inc.

Culling, J. F., & Darwin, C. J. (1994). Perceptual and computational separation of simultaneous vowels: Cues arising from low-frequency beating. Journal of the Acoustical Society of America, 95(3), 1559-1569.

Culling, J. F., & Darwin, C. J. (1993). Perceptual separation of simultaneous vowels: Within and across-formant grouping by F0. Journal of the Acoustical Society of America, 93(6), 3454-3467.

Culling, J. F., & Summerfield, Q. (1995). The role of frequency modulation in the perceptual segregation of concurrent vowels. Journal of the Acoustical Society of America, 98(2), 837-846.

Darwin, C. J., & Ciocca, V. (1992). Grouping in pitch perception: Effects of onset asynchrony and ear of presentation of a mistuned component. Journal of the Acoustical Society of America, 91(6), 3381-3390.

Darwin, C. J., Hukin, R. W., & Al-Khatib, B. Y. (1995).
Grouping in pitch perception: Evidence for sequential constraints. Journal of the Acoustical Society of America, 98(2), 880-885.

Drullman, R. (1995). Temporal envelope and fine structure cues for speech intelligibility. Journal of the Acoustical Society of America, 97(1), 585-592.

Durlach, N. I. (1972). Binaural signal detection: Equalization and cancellation theory. In J. V. Tobias (Ed.), Foundations of Modern Auditory Theory, Vol. 2 (pp. 374-462). NY: Academic Press.

Fitzgibbons, P. J., & Gordon-Salant, S. (1994). Age effects on measures of auditory duration discrimination. Journal of Speech and Hearing Research, 37(3), 662-670.

Freyman, R. L., Nerbonne, G. P., & Cote, H. C. (1991). Effect of consonant-vowel ratio modification on amplitude envelope cues for consonant recognition. Journal of Speech and Hearing Research, 34, 415-426.

Ghitza, O. (1993). Adequacy of auditory models to predict human internal representation of speech sounds. Journal of the Acoustical Society of America, 93, 2160-2171.

Gordon-Salant, S., & Fitzgibbons, P. J. (1993). Temporal factors and speech recognition performance in young and elderly listeners. Journal of Speech and Hearing Research, 36, 1276-1285.

Greenberg, S. (1997). On the origins of speech intelligibility in the real world. Proceedings of the ESCA Workshop on Robust Speech Recognition for Unknown Communications Channels.

Greenberg, S. (1996). Auditory processing of speech. In N. J. Lass (Ed.), Principles of Experimental Phonetics (pp. 362-407). St. Louis: Mosby.

Griffiths, R. (1992). Speech rate and listening comprehension: Further evidence of the relationship. TESOL Quarterly, 26(2), 385-390.

Griffiths, R. (1990). Speech rate and NNS comprehension: A preliminary study in time-benefit analysis. Language Learning, 40(3), 311-336.

deHaan, H. J. (1977). A speech-rate intelligibility threshold for speeded and time-compressed connected speech. Perception and Psychophysics, 22(4), 366-372.

Hirsh, I., Davis, H., Silverman, S.
R., Reynolds, E., Eldert, E., & Benson, R. W. (1952). Development of materials for speech audiometry. Journal of Speech and Hearing Disorders, 17, 321-337.

Kalikow, D. N., Stevens, K. N., & Elliott, L. L. (1977). Development of a test of speech intelligibility in noise using sentence materials with controlled word predictability. Journal of the Acoustical Society of America, 61(5), 1337-1351.

King, P. E., & Behnke, R. R. (1989). The effect of time-compressed speech on comprehensive, interpretive, and short-term listening. Human Communication Research, 15(3), 428-443.

Ladefoged, P. (1993). A course in phonetics (3rd edition). NY: Harcourt Brace College Publishers.

Letowski, T., & Poch, N. (1996). Comprehension of time-compressed speech: Effects of age and speech complexity. Journal of the American Academy of Audiology, 7(6), 447-457.

Levitt, H. (1971). Transformed up-down methods in psychoacoustics. Journal of the Acoustical Society of America, 49, 467-477.

Moore, B. C. J. (1989). An introduction to the psychology of hearing (3rd edition). CA: Academic Press Ltd.

Moore, B. C. J., & Peters, R. W. (1992). Pitch discrimination and phase sensitivity in young and elderly subjects and its relationship to frequency selectivity. Journal of the Acoustical Society of America, 91(5), 2881-2893.

Newman, C. W. (1982). Recognition of time-compressed high- and low-predictability sentences by normal subjects. Journal of Auditory Research, 22, 259-264.

Patterson, R. D., & Allerhand, M. H. (1995). Time-domain modelling of peripheral auditory processing: A modular architecture and a software platform. Journal of the Acoustical Society of America, 98(4), 1890-1894.

Phillips, D. P. (1995). Central auditory processing: A view from auditory neuroscience. The American Journal of Otology, 16(3), 338-352.

Phillips, D. P., Hall, S. E., Harrington, I. A., & Taylor, T. L. (1998). "Central" auditory gap detection: A spatial case. Journal of the Acoustical Society of America, 103(4), 2064-2068.
Phillips, D. P., Taylor, T. L., Hall, S. E., Carr, M. M., & Mossop, J. E. (1997). Detection of silent intervals between noises activating different perceptual channels: Some properties of "central" auditory gap detection. Journal of the Acoustical Society of America, 101(6), 3694-3705.

Pichora-Fuller, M. K., & Schneider, B. A. (1992). The effect of interaural delay of the masker on masking-level differences in young and old adults. Journal of the Acoustical Society of America, 91(4), 2129-2135.

Pichora-Fuller, M. K., Schneider, B. A., & Daneman, M. (1995). How young and old adults listen to and remember speech in noise. Journal of the Acoustical Society of America, 97(1), 593-608.

Publication Manual of the American Psychological Association (4th ed.). (1994). Washington, D.C.: American Psychological Association.

Raven, J. C. (1938). The Mill Hill Vocabulary Scale. London: Lewis.

Riensche, L. L., Wohlert, A., & Porch, B. E. (1983). Aphasic comprehension and preference of rate-altered speech. British Journal of Disorders of Communication, 18(1), 39-48.

Schmitt, J. F. (1983). The effects of time compression and time expansion on passage comprehension by elderly listeners. Journal of Speech and Hearing Research, 26, 373-377.

Schmitt, J. F., & Carroll, M. R. (1985). Older listeners' ability to comprehend speaker-generated rate alteration of passages. Journal of Speech and Hearing Research, 28, 309-312.

Schmitt, J. F., & McCroskey, R. L. (1981). Sentence comprehension in elderly listeners: The factor of rate. Journal of Gerontology, 36(4), 441-445.

Schmitt, J. F., & Moore, J. R. (1989). Natural alteration of speaking rate: The effect on passage comprehension by listeners over 75 years of age. Journal of Speech and Hearing Research, 32, 445-450.

Schneider, B. A. (1997). Psychoacoustics and aging: Implications for everyday listening. Journal of Speech-Language Pathology and Audiology, 21(2), 111-124.

Schneider, B. A. (1997). Jitspin2.exe [computer program].
Toronto: University of Toronto.

Schneider, B. A. (1994). Gap detection threshold test [computer program]. Toronto: University of Toronto.

Schneider, B. A., Pichora-Fuller, M. K., Kowalchuk, D., & Lamb, M. (1994). Gap detection and the precedence effect in young and old adults. Journal of the Acoustical Society of America, 95, 980-991.

Schneider, B. A., Speranza, F., & Pichora-Fuller, M. K. (1994, July). Age-related changes in temporal resolution: Envelope and intensity effects. XXII International Congress of Audiology, Halifax, Nova Scotia.

Shannon, R. V., Zeng, F., Kamath, V., Wygonski, J., & Ekelid, M. (1995). Speech recognition with primarily temporal cues. Science, 270, 303-304.

Shannon, R. V., Zeng, F., & Wygonski, J. (1992). Speech recognition using only temporal cues. In M. E. H. Schouten (Ed.), The Auditory Processing of Speech: From Sounds to Words (pp. 263-274). Berlin, Federal Republic of Germany: Mouton de Gruyter.

Small, J., Andersen, E., & Kempler, D. (1997). Effects of working memory capacity on understanding rate-altered speech. Aging, Neurology, and Cognition, 4(2), 126-139.

Small, J., Kemper, S., & Lyons, K. (1997). Sentence comprehension in Alzheimer's Disease: Effects of grammatical complexity, speech rate, and repetition. Psychology and Aging, 12(1), 3-11.

Strouse, A., Ashmead, D. H., Ohde, R. N., & Grantham, D. W. (1998). Temporal processing in the aging auditory system. Journal of the Acoustical Society of America (in press).

Tomoeda, C. K., Bayles, K. A., Boone, D. R., Kaszniak, A. W., & Slauson, T. J. (1990). Speech rate and syntactic complexity effects on the auditory comprehension of Alzheimer patients. Journal of Communication Disorders, 23, 151-161.

Tyler, R. S., Summerfield, Q., Wood, E. J., & Fernandes, M. A. (1982). Psychoacoustic and phonetic temporal processing in normal and hearing impaired listeners. Journal of the Acoustical Society of America, 72, 740-752.

Van Tasell, D. J., Soli, S. D., Kirby, V. M., & Widin, G. P. (1987).
Speech waveform envelope cues for consonant recognition. Journal of the Acoustical Society of America, 82, 1152-1161.

Van Tasell, D. J., Greenfield, D. G., Logemann, J. J., & Nielson, D. A. (1992). Temporal cues for consonant recognition: Training, talker generalization, and use in evaluation of cochlear implants. Journal of the Acoustical Society of America, 92, 1247-1257.

Viemeister, N. F., & Plack, C. J. (1993). Time analysis. In W. A. Yost, A. N. Popper, & R. R. Fay (Eds.), Human Psychophysics (pp. 116-154). NY: Springer-Verlag New York, Inc.

Visto, J. C., Cranford, J. L., & Scudder, R. (1996). Dynamic temporal processing of nonspeech acoustic information by children with specific language impairment. Journal of Speech and Hearing Research, 39(3), 510-517.

Weismer, S. E., & Hesketh, L. J. (1996). Lexical learning by children with specific language impairment: Effects of linguistic input presented at varying speaking rates. Journal of Speech and Hearing Research, 39(1), 177-190.

Wilber, L. A. (1994). Calibration, pure tone, speech and noise signals. In J. Katz (Ed.), Handbook of Clinical Audiology (4th ed., pp. 73-94). Baltimore: Williams & Wilkins.

Wingfield, A. (1996). Cognitive factors in auditory performance: Context, speed of processing, and constraints of memory. Journal of the American Academy of Audiology, 7(3), 175-182.

Wingfield, A., Tun, P. A., Koh, C. K., & Rosen, M. J. (Submitted). Regaining lost time: Adult aging and the effect of time restoration on recall of time-compressed speech.

Zera, J., & Green, D. M. (1993). Detecting temporal onset and offset asynchrony in multicomponent complexes. Journal of the Acoustical Society of America, 93(2), 1038-1052.
APPENDIX A

Participants' Pure-Tone Thresholds (dB HL) for Right (R) and Left (L) Ears

              Test Frequency (Hz)
              250        500        1000       2000       4000       8000
Participant   R    L     R    L     R    L     R    L     R    L     R    L
1            15    5     5    0    -5    0     0    0     0    5     5    0
2            10   10    10    0     5    0     0    0     0    5     5    5
3             5    0     5    0    -5    0     0    0     5   20     5    0
4             5    0     0   -5    -5   -5    -5   10     0   -5     5   15
5             5    0     5    0     0    5     0    0     0  -10     5   15
6           -10  -10    -5  -10    -5  -10   -10   10   -10   -5    -5  -10
7            10    0    10    0    10    0     5   -5     5  -10    30   10
8             0    0     5    5     0    0     5    0    20   25    20   20
9             5    5     5    0     5    0     5   -5    -5  -10     5    0
10            5   -5    -5   -5   -10    0   -10   -5    -5   15    15    0
11            5   -5    -5   -5     0    0     5   -5     0   -5    -5  -10
12            0   -5    -5   -5     5   -5     0   10     5   -5     5    0
13            5    5     5    5     5    5     0    5     0    5    -5    5
14            0    0     0    0     0   -5     0    5    -5  -10     0    5
15           -5   -5    -5  -10     0   -5    -5   -5     0  -10    -5    0
16            0   -5     0    0     0    0     5    5     0    0     5    0

APPENDIX B

Participants' Characteristics

             Pure Tone Average
             (dB HL)            SRT          Vocab          Years of   Gap Detection       Handed-
Participant  R       L          R     L      Score a   Age  Education  Threshold (msec) b  ness
1             0       0          5    -5     15        29   21         7.563               R
2             5       0          0     5     14        24   19         7.188               R
3             0       0          5     0     15        25   15         6.188               R
4            -3.33   -6.66      -5   -10     17        25   18         5.750               R
5             1.67    1.67       0     0     16        27   20         2.875               R
6            -6.67  -10          5    10     15        25   20         1.750               R
7             8.33   -1.67       5     0     13        30   20         1.938               R
8             3.33    1.67       0     0     13        32   12         3.625               L
9             5       1.67       5     0     15        19   14         3.438               R
10           -3.33   -3.33     -10   -10     16        23   19         2.313               R
11            0      -3.33       0     0     13        24   19         3.125               R
12            0      -6.67       5     0     16        25   14         8.188               R
13            5       5         -5    -5     14        25   17         6.938               R
14            0      -1.67       0     0     14        22   17         4.875               R
15           -3.33   -6.67      -5     0     16        35   20         1.875               R
16            0      -1.67       0     0     16        29   19         3.063               R

a Mill Hill (Raven, 1938)
b Schneider et al., 1994

APPENDIX C

Forms of the Revised SPIN Test

Practice Sentences
1 She had spoken about the SCAR.
2 Camels store water in their HUMPS.
3 Mr. Smith might discuss the MILL.
4 I built the model from a KIT.
5 The loud noise made him jump with FRIGHT.
6 Ruth couldn't know about the SHRIMP.
7 We were discussing the GAS.
8 She's been considering the COMB.
9 The winning card was an ACE.
10 Tom wore a tie and a white SHIRT.
11 I'm glad Tom asked about the SPOUT.
12 Her skin broke out in a RASH.
13 Paul heard they asked about the RICE.
14 The child threw bread crumbs to the DUCKS.
15 Paul chopped down the tree with his AXE.
16 They want to speak about the TRASH.
17 He's as stubborn as a MULE.
18 Paul is considering the DRILL.
19 The woman talks about the BEACH.
20 Hospitals should be free of GERMS.
21 You should discuss the FLUTE.
22 Let's slide down the hill on a SLED.
23 He weighed the meat on a SCALE.
24 Paul wants to think about the PALM.
25 She was interested in the SOAP.
26 You lead them on a merry CHASE.
27 The swimmer was attacked by SHARKS.
28 Miss White spoke about the SAUCE.
29 Stainless steel will never RUST.
30 He has considered the ITCH.
31 She had a problem with the PLUMS.
32 The robber committed a CRIME.
33 The secret message was in CODE.
34 Broil the steak on a charcoal GRILL.
35 He's been discussing the ACHE.
36 Ruth sat on the living room COUCH.
37 Bill should have considered the HOOD.
38 A bee can give a painful STING.
39 They are discussing the CURL.
40 Nancy had known about the COUGH.
41 Jane heard he called about the BUS.
42 The soap washed away the DIRT.
43 He is discussing the GIFT.
44 Jane was dressed in a skirt and BLOUSE.
45 She is considering the SHELF.
46 Animals are kept in the ZOO.
47 Harry was discussing the MAIL.
48 His face was concealed by a MASK.
49 Half a quart is a PINT.
50 Ruth is considering the CLOWN.

SPIN Form 1
1 His plans meant taking a big RISK.
2 Stir your coffee with a SPOON.
3 Miss White won't think about the CRACK.
4 He would think about the RAG.
5 The plough was pulled by an OX.
6 The old train was powered by STEAM.
7 The old man talked about the LUNGS.
8 I was considering the CROOK.
9 Let's decide by tossing a COIN.
10 The doctor prescribed the DRUG.
11 Bill might discuss the FOAM.
12 Nancy didn't discuss the SKIRT.
13 Hold the baby on your LAP.
14 Bob has discussed the SPLASH.
15 The dog chewed on a BONE.
16 Ruth hopes he heard about the HIPS.
17 The war was fought with armoured TANKS.
18 She wants to talk about the CREW.
19 They had a problem with the CLIFF.
20 They drank a whole bottle of GIN.
21 You heard Jane called about the VAN.
22 The witness took a solemn OATH.
23 We could consider the FEAST.
24 Bill heard we asked about the HOST.
25 They tracked the lion to his DEN.
26 The cow gave birth to a CALF.
27 I had not thought about the GROWL.
28 The scarf was made of shiny SILK.
29 The super highway has six LANES.
30 He should know about the HUT.
31 For dessert he had apple PIE.
32 The beer drinkers raised their MUGS.
33 I'm glad you heard about the BEND.
34 You're talking about the POND.
35 The rude remark made her BLUSH.
36 Nancy had considered the SLEEVES.
37 We heard the ticking of the CLOCK.
38 He can't consider the CRIB.
39 He killed the dragon with his SWORD.
40 Tom discussed the HAY.
41 Mary wore her hair in BRAIDS.
42 She's glad Jane asked about the DRAIN.
43 Bill hopes Paul heard about the MIST.
44 We're lost so let's look at the MAP.
45 No one was injured in the CRASH.
46 We're speaking about the TOLL.
47 My son has a dog for a PET.
48 He was scared out of his WITS.
49 We spoke about the KNOB.
50 I've spoken about the PILE.

SPIN Form 2
1 Miss Black thought about the LAP.
2 The baby slept in his CRIB.
3 The watchdog gave a warning GROWL.
4 Miss Black would consider the BONE.
5 The natives built a wooden HUT.
6 Bob could have known about the SPOON.
7 Unlock the door and turn the KNOB.
8 He wants to talk about the RISK.
9 He heard they called about the LANES.
10 Wipe your greasy hands on the RAG.
11 She has known about the DRUG.
12 I want to speak about the CRASH.
13 The wedding banquet was a FEAST.
14 I should have considered the MAP.
15 Paul hit the water with a SPLASH.
16 The ducks swam around on the POND.
17 Ruth must have known about the PIE.
18 The man should discuss the OX.
19 Bob stood with his hands on his HIPS.
20 The cigarette smoke filled his LUNGS.
21 They heard I called about the PET.
22 The cushion was filled with FOAM.
23 Ruth poured the water down the DRAIN.
24 Bill cannot consider the DEN.
25 This nozzle sprays a fine MIST.
26 The sport shirt has short SLEEVES.
27 She hopes Jane called about the CALF.
28 Jane has a problem with the COIN.
29 She shortened the hem of her SKIRT.
30 Paul hopes she called about the TANKS.
31 The girl talked about the GIN.
32 The guests were welcomed by the HOST.
33 Mary should think about the SWORD.
34 Ruth could have discussed the WITS.
35 The ship's Captain summoned his CREW.
36 You had a problem with a BLUSH.
37 The flood took a heavy TOLL.
38 The car drove off the steep CLIFF.
39 We have discussed the STEAM.
40 The policemen captured the CROOK.
41 The door was opened just a CRACK.
42 Tom is considering the CLOCK.
43 The sand was heaped in a PILE.
44 You should not speak about the BRAIDS.
45 Peter should speak about the MUGS.
46 Household goods are moved in a VAN.
47 He has a problem with the OATH.
48 Follow this road around the BEND.
49 Tom won't consider the SILK.
50 The farmer baled the HAY.

SPIN Form 3
1 Kill the bugs with this SPRAY.
2 Mr. White discussed the CRUISE.
3 How much can I buy for a DIME?
4 Miss White thinks about the TEA.
5 We shipped the furniture by TRUCK.
6 He is thinking about the ROAR.
7 She's spoken about the BOMB.
8 My T.V. has a twelve-inch SCREEN.
9 That accident gave me a SCARE.
10 You want to talk about the DITCH.
11 The king wore a golden CROWN.
12 The girl swept the floor with a BROOM.
13 We're discussing the SHEETS.
14 The nurse gave him first AID.
15 She faced them with a foolish GRIN.
16 Betty has considered the BARK.
17 Watermelons have lots of SEEDS.
18 Use this spray to kill the BUGS.
19 Tom will discuss the SWAN.
20 The teacher sat on a sharp TACK.
21 You'd been considering the GEESE.
22 The sailor swabbed the DECK.
23 They were interested in the STRAP.
24 He could discuss the BREAD.
25 He tossed the drowning man a ROPE.
26 Jane hopes Ruth asked about the STRIPES.
27 Paul spoke about the PORK.
28 The boy gave the football a KICK.
29 The storm broke the sailboat's MAST.
30 Mr. Smith thinks about the CAP.
31 We are speaking about the PRIZE.
32 Mr. Brown carved the roast BEEF.
33 The glass had a chip on the RIM.
34 Harry had thought about the LOGS.
35 Bob could consider the POLE.
36 Her cigarette had a long ASH.
37 Ruth has a problem with the JOINTS.
38 He is considering the THROAT.
39 The soup was served in a BOWL.
40 We can't consider the WHEAT.
41 The man spoke about the CLUE.
42 The lonely bird searched for its MATE.
43 Please wipe your feet on the MAT.
44 David has discussed the DENT.
45 The pond was full of croaking FROGS.
46 He hit me with a clenched FIST.
47 Bill heard Tom called about the COACH.
48 A bicycle has two WHEELS.
49 Jane has spoken about the CHEST.
50 Mr. White spoke about the FIRM.

SPIN Form 4
1 The doctor X-rayed his CHEST.
2 Mary had considered the SPRAY.
3 The woman talked about the FROGS.
4 The workers are digging a DITCH.
5 Miss Brown will speak about the GRIN.
6 Bill can't have considered the WHEELS.
7 The duck swam with the white SWAN.
8 Your knees and your elbows are JOINTS.
9 Mr. Smith spoke about the AID.
10 He hears she asked about the DECK.
11 Raise the flag up the POLE.
12 You want to think about the DIME.
13 You've considered the SEEDS.
14 The detectives searched for a CLUE.
15 Ruth's grandmother discussed the BROOM.
16 The steamship left on a CRUISE.
17 Miss Smith considered the SCARE.
18 Peter has considered the MAT.
19 Tree trunks are covered with BARK.
20 The meat from a pig is called PORK.
21 The old man considered the KICK.
22 Ruth poured herself a cup of TEA.
23 We saw a flock of wild GEESE.
24 Paul could not consider the RIM.
25 How did your car get that DENT?
26 She made the bed with clean SHEETS.
27 I've been considering the CROWN.
28 The team was trained by their COACH.
29 I've got a cold and a sore THROAT.
30 We've spoken about the TRUCK.
31 She wore a feather in her CAP.
32 The bread was made from whole WHEAT.
33 Mary could not discuss the TACK.
34 Spread some butter on your BREAD.
35 The cabin was made of LOGS.
36 Harry might consider the BEEF.
37 We're glad Bill heard about the ASH.
38 The lion gave an angry ROAR.
39 The sandal has a broken STRAP.
40 Nancy should consider the FIST.
41 He's employed by a large FIRM.
42 They did not discuss the SCREEN.
43 Her entry should win first PRIZE.
44 The old man thinks about the MAST.
45 Paul wants to speak about the BUGS.
46 The airplane dropped a BOMB.
47 You're glad she called about the BOWL.
48 A zebra has black and white STRIPES.
49 Miss Black could have discussed the ROPE.
50 I hope Paul asked about the MATE.

SPIN Form 5
1 Betty knew about the NAP.
2 The girl should consider the FLAME.
3 It's getting dark, so light the LAMP.
4 To store his wood he built a SHED.
5 They heard I asked about the BET.
6 The mouse was caught in the TRAP.
7 Mary knows about the RUG.
8 The airplane went into a DIVE.
9 The fireman heard her frightened SCREAM.
10 He was interested in the HEDGE.
11 He wiped the sink with a SPONGE.
12 Jane did not speak about the SLICE.
13 Mr. Brown can't discuss the SLOT.
14 The papers were held by a CLIP.
15 Paul can't discuss the WAX.
16 Miss Brown shouldn't discuss the SAND.
17 The chicks followed the mother HEN.
18 David might consider the FUN.
19 She wants to speak about the ANT.
20 The fur coat was made of MINK.
21 The boy took shelter in a CAVE.
22 He hasn't considered the DART.
29 Cut a piece of meat from the ROAST. 30 Betty can't consider the GRIEF. 31 The heavy rains caused a FLOOD. 32 The swimmer dove into the POOL. 33 Harry will consider the TRAIL. 34 Let's invite the whole GANG. 35 The house was robbed by a THIEF. 36 Tom is talking about the FEE. 37 Bob wore a watch on this WRIST. 38 Tom had spoken about the PILL. 39 Tom has been discussing the BEADS. 40 The secret agent was a SPY. 41 The rancher rounded up his HERD. 42 Tom could have thought about the SPORT. 43 Mary can't consider the TIDE. 44 Ann works in the bank as a CLERK. (SPIN Form 5 continued) 45 A chimpanzee is an APE. 46 He hopes Tom asked about the BAR. 47 We could discuss the DUST. 48 The bandits escaped from JAIL. 49 Paul hopes we heard about the LOOT. 50 The landlord raised the RENT. SPIN Form 6 1 You were considering the GANG. 2 The boy considered the MINK. 3 Playing checkers can be FUN. 4 The doctor charged a low FEE. 5 He wants to know about the RIB. 6 The gambler lost the BET. 7 Get the bread and cut me a SLICE. 8 She might have discussed the APE. 9 The sleepy child took a NAP. 10 Instead of a fence, plant a HEDGE. 11 The old woman discussed the THIEF. 12 Drop the coin through the SLOT. 13 They fished in the babbling BROOK. 14 You were interested in the SCREAM. (SPIN Form 6 continued) 15 We hear they asked about the SHED. 16 The widow's sob expressed her GRIEF. 17 The candle flame melted the WAX. 18 I haven't discussed the SPONGE. 19 He was hit by a poisoned DART. 20 Ruth had a necklace of glass BEADS. 21 Ruth will consider the HERD. 22 The singer was mobbed by her FANS. 23 The old man discussed the DIVE. 24 The class should consider the FLOOD. 25 The fruit was shipped in wooden CRATES. 26 I'm talking about the BENCH. 27 Paul has discussed the LAMP. 28 The candle burned with a bright FLAME. 29 You knew about the CLIP. 30 She might consider the POOL. 31 We swam at the beach at high TIDE. 32 Bob was considering the CLERK. 33 We got drunk in the local BAR. 
34 A termite looks like an ANT. 35 The man knew about the SPY. 36 The sick child swallowed the PILL. (SPIN Form 6 continued) 37 The class is discussing the WRIST. 38 The burglar escaped with the LOOT. 39 They hope he heard about the RENT. 40 Mr. White spoke about the JAIL. 41 He rode off in a cloud of DUST. 42 Miss Brown might consider the COAST. 43 Bill didn't discuss the HEN. 44 The bloodhound followed the TRAIL. 45 The boy might consider the TRAP. 46 On the beach we play on the SAND. 47 He should consider the ROAST. 48 Miss Brown spoke about the CAVE. 49 She hated to vacuum the RUG. 50 Football is a dangerous SPORT. SPIN Form 7 1 We're considering the BROW. 2 You cut the wood against the GRAIN. 3 I am thinking about the KNIFE. 4 They've considered the SHEEP. 5 The cop wore a bullet-proof VEST. 6 He's glad we heard about the SKUNK. (SPIN Form 7 continued) 7 His pants were held up by a BELT. 8 Paul took a bath in the TUB. 9 The girl should not discuss the GOWN. 10 Maple syrup is made from SAP. 11 Mr. Smith knew about the BAY. 12 They played a game of cat and MOUSE 13 The thread was wound on a SPOOL. 14 We did not discuss the SHOCK. 15 The crook entered a guilty PLEA. 16 Mr. Black has discussed the CARDS. 17 A bear has a thick coat of FUR. 18 Mr. Black considered the FLEET. 19 To open the jar, twist the LID. 20 We are considering the CHEERS. 21 Sue was interested in the BRUISE. 22 Tighten the belt by a NOTCH. 23 The cookies were kept in a JAR. 24 Miss Smith couldn't discussed ROW. 25 I am discussing the TASK. 26 The marksman took careful AIM. 27 I ate a piece of chocolate FUDGE. 28 Paul should know about the NET. (SPIN Form 7 continued) 29 Miss Smith might consider the SHELL. 30 John's front tooth had a CHIP. 31 At breakfast he drank some JUICE. 32 You cannot have discussed the GREASE. 33 I did not know about the CHUNKS. 34 Our cat is good at catching MICE. 35 I should have known about the GUM. 36 Mary hasn't discussed the BLADE. 
37 The stale bread was covered with MOLD. 38 Ruth has discussed the PEG. 39 How long can you hold your BREATH? 40 His boss made him work like a SLAVE. 41 We have not thought about the HINT. 42 Air mail requires a special STAMP. 43 The bottle was sealed with a CORK. 44 The old man discussed the YELL. 45 They're glad we heard about the TRACK. 46 Cut the bacon into STRIPS. 47 Throw out all this useless JUNK. 48 The boy can't talk about THORNS. 49 Bill won't consider the BRAT. 50 The shipwrecked sailors built a RAFT. SPIN Form 8 1 Bob heard Paul called about the STRIPS. 2 My turtle went into its SHELL. 3 Paul has a problem with the BELT. 4 I cut my finger with a KNIFE. 5 They knew about the FUR. 6 We're glad Ann asked about the FUDGE. 7 Greet the heroes with loud CHEERS. 8 Jane was interested in the STAMP. 9 That animal stinks like a SKUNK. 10 A round hole won't take a square PEG. 11 Miss White would consider the MOLD. 12 They want to know about the AIM. 13 The Admiral commands the FLEET. 14 The bride wore a white GOWN. 15 The woman discussed the GRAIN. 16 You hope they asked about the VEST. 17 I can't guess so give me a HINT. 18 Our seats were in the second ROW. 19 We should considered the JUICE. 20 The boat sailed across the BAY. 21 The woman considered the NOTCH. 22 That job was an easy TASK. (SPIN Form 8 continued) 23 The woman knew about the LID. 24 Jane wants to speak about the CHIP. 25 The shepherd watched his flock of SHEEP 26 Bob should not consider the MICE. 27 David wiped the sweat from his BROW. 28 Ruth hopes she called about the JUNK. 29 I can't consider the PLEA. 30 The bad news came as a SHOCK. 31 A spoiled child is a BRAT. 32 Paul was interested in the SAP. 33 The drowning man let out a YELL. 34 A rose bush has prickly THORNS. 35 He's glad you called about the JAR. 36 The dealer shuffled the CARDS. 37 Miss Smith knows about the TUB. 38 The man would not discuss the MOUSE. 39 The railroad train ran off the TRACK. 40 My jaw aches when I chew GUM. 
41 Ann was interested in the BREATH.
42 You're glad they heard about the SLAVE.
43 He caught the fish in his NET.
44 Bob was cut by the jack-knife's BLADE.
45 The man could consider the SPOOL.
46 Tom fell down and got a bad BRUISE.
47 Lubricate the car with GREASE.
48 Peter knows about the RAFT.
49 Cut the meat into small CHUNKS.
50 She hears Bob asked about the CORK.

APPENDIX D

Time-Amplitude Waveforms of Jittered and Unjittered Tones and Speech Stimuli

Origins of the Stimuli

The following images were obtained using CSRE 4.5 and digital files of 100-Hz, 1000-Hz, and 2000-Hz pure tones, as well as the first sentence from SPIN-R Form 1, "His plans meant taking a big risk." All files were sampled at a rate of 20 kHz. All files were jittered with the high-BW, high-SD, and high-SD/BW jitter settings used in this experiment. The syllable depicted in the following waveforms is [tei] from the word "taking" in that sentence. The consonant is the word-final [g] in the word "big" in the same sentence. Note that the overall amplitude envelope of the sentence is not affected by any of the jitter settings (high-BW, high-SD, and high-SD/BW). This confirms that the jitter added to the SPIN-R sentences in this experiment affected only fine-structure temporal cues.

[Figure: time-amplitude waveform of the unjittered 100-Hz pure tone]
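The jitter manipulation described above can be illustrated with a minimal sketch. This is an assumption-laden illustration only: it implements temporal jitter as resampling of the waveform along a perturbed time base, using Gaussian timing offsets of a chosen standard deviation (SD) that are low-pass filtered to a chosen bandwidth (BW). The function name and default parameter values are hypothetical; the actual stimuli were generated with Jitspin2.exe (Schneider, 1997), whose exact settings are not reproduced here.

```python
import numpy as np

def jitter_waveform(signal, fs=20000, sd_us=100.0, bw_hz=500.0, seed=0):
    """Temporally jitter a waveform (hypothetical sketch, not Jitspin2.exe).

    Each output sample is read from the input at its nominal time plus a
    Gaussian offset (standard deviation sd_us microseconds). The offsets
    are low-pass filtered to bw_hz so they vary smoothly from sample to
    sample.
    """
    rng = np.random.default_rng(seed)
    n = len(signal)
    # White Gaussian time offsets, crudely band-limited by zeroing
    # FFT components above bw_hz.
    offsets = rng.normal(0.0, 1.0, n)
    spec = np.fft.rfft(offsets)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spec[freqs > bw_hz] = 0.0
    offsets = np.fft.irfft(spec, n)
    # Rescale so the filtered offsets have the requested SD, in samples.
    offsets *= (sd_us * 1e-6 * fs) / offsets.std()
    # Read each output sample from its jittered time position, with
    # linear interpolation between input samples.
    t = np.clip(np.arange(n) + offsets, 0, n - 1)
    return np.interp(t, np.arange(n), signal)
```

Because the offsets are small and band-limited, a time warp of this kind displaces the fine structure of the waveform while leaving the slowly varying amplitude envelope essentially intact, which is consistent with the observation above that the jittered sentences retained their overall envelope.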
[Figure: time-amplitude waveform of the high-BW jittered 100-Hz pure tone]

[Figure: time-amplitude waveform of the high-SD jittered 100-Hz pure tone]

[Figure: time-amplitude waveform of the high-SD/BW jittered 100-Hz pure tone]

[Figure: time-amplitude waveform of the unjittered 1000-Hz pure tone]

[Figure: time-amplitude waveform of the high-BW jittered 1000-Hz pure tone]

[Figure: time-amplitude waveform of the high-SD jittered 1000-Hz pure tone]

[Figure: time-amplitude waveform of the high-SD/BW jittered 1000-Hz pure tone]

[Figure: time-amplitude waveform of the unjittered 2000-Hz pure tone]

[Figure: time-amplitude waveform of the high-BW jittered 2000-Hz pure tone]

[Figure: time-amplitude waveform of the high-SD jittered 2000-Hz pure tone]

[Figure: time-amplitude waveform of the high-SD/BW jittered 2000-Hz pure tone]

[Figure: time-amplitude waveform of the unjittered sentence]

[Figure: time-amplitude waveform of the high-BW jittered sentence]

[Figure: time-amplitude waveform of the high-SD jittered sentence]

[Figure: time-amplitude waveform of the high-SD/BW jittered sentence]

[Figure: time-amplitude waveform of the unjittered syllable [tei]]

[Figure: time-amplitude waveform of the high-BW jittered syllable [tei]]

[Figure: time-amplitude waveform of the high-SD jittered syllable [tei]]

[Figure: time-amplitude waveform of the high-SD/BW jittered syllable [tei]]

[Figure: time-amplitude waveform of the unjittered consonant [g]]
B n P 5BB9 Ti HE (Hi) 5% • RECORD/CUT/SCALE I Size : 102.33 ms CURSOR AT Time : 3847.51 m Next Pt : -11 Next Pt : -0.0034 V F I L E PARAHETERS F u l l : 219331 pts Full : 109GB.55 ms Disp : 219331 pts Disp : 10966.55 ms SF : 20.00 kHz AScl : OFF I N S I D E HARKERS Begin : 5787.65 ms End : 5889.05 ms Dif f . : 101.40 ms RMS : 0.3944 V +Peak : 1.1377 V -Peak : -1.3959 V R A H P . PARHS Ramp : LINEAR Len. : 20 ms Fact. : 4.80 Step : 0,20 144 Hiah-BW Jittered Consonant \Q] SPINU2-Uaveform IB .B 2 5.a n p U B.B o I t -5.B x - I B . B ( D | ^ | ^ i | ^ ^ | [ i | ^ | f f l B l S E M l P H & h c j | ® | | i . . | ^ | ^ | ^ I, t*M iki" ' I f "n U3B7 5UB3 B5BB Tine (Hi) SPINU2-Uaveform IB .B 25.B n p U B.B o t-s.a i - I B . B F\ 1\?<f\ fit<f\ fil\f\ 1 — v \/V 57B5 571 B 571 5 572B 57255731 573B 57U1 5746 57 Tine (Hi) SPINlJ2-Uavefo7Hr__l £>IC=l<Ol.** IB .B JJ5.B P U B.B o 1 t-5.B I - I B . B 591 3 591 B 5923 59ZB 5933 593B 59UU 59U9 5954 59 Tine (Hi) WCSREW-Result Waveform 2 5.B n P I t -5 . - I B . B 5B32 5B41 Tine (Hi) 1 RECORD/CUT/SCALE • Size : 8S.71 ms CURSOR OT Time : 3893.59 m Next Pt : 5 Next Pt : 0.0015 V F I L E PARAHETERS Full : 219331 pts Ful l : 10966.55 ms Disp : 219331 pts Disp : 10966.55 ms SF : 20.00 kHz AScl : OFF I N S I D E HARKERS Begin : 5725.G5 ms End : 5933.45 ms Diff . : 207.80 ms RMS : 0.6399 V +Peak '. 1.5457 V -Peak : -2.2156 V R A H P . PARHS Ramp : LINEAR Len. : 20 ms Fact. : 4.80 Step : 0.20 145 High-SD Jittered Consonant fql SPINIJI-Uaveform M M HIE S n M M EI EH M E E3 E3 ^  M • • • IB .B JJ5.B F U B.B 0 1 t-5.B X - I B . B jtiJftMlWjLL 3992 UBUU LLB97 41 49 42B2 4254 43B7 4359 441 2 44 Tine <Hx) SPINlJl-Uavefo7m~B £>lcl«div l« l» IB .B 25.a n P U B.B 1 t-5.B X - I B . 
B 6B93 61 45 Bl 9B B25B B3B3 6355 64BB 646B B51 3 B5 Tine (Hi) I RECORD/CUT/SCALE | Size : 105.73 ms CURSOR AT Time : 3824.47 m Next Pt : 17 Next Pt : 0.0052 V F I L E PARAHETERS F u l l : 219331 pts Full : 10988.55 ms Disp : 219331 pts Disp : 10988.55 ms SF : 20.00 kHz AScl : OFF I N S I D E HARKERS Begin : 4201.90 ms End : 8302.85 ms Dif f . : 2100.95 ms RMS : 0.4943 V +Peak • 1.8552 V -Peak : -2.7673 V R A H P . PARKS Ramp : LINEAR Len. : 20 ms Fact. : 4.80 Step : 0.20 146 High-SD/BW Jittered Consonant fal SPINlJ3-Waveform •(Q |^ |^ lB |^ |g |^tg |IBB|ff lEMlPh&h<l |®H>" ! • !<<••• fi 5.B 1B97 2193 329H 43B7 54B3 B5BB 7B77 B773 9B7B IB 1 • Ti HE (Hi) SPINlJ3-Waveform IB .B 2 5.B n P U B.B 0 1 t-5.B x - I B . B 3992 4B44 4B97 i l l 49 42BZ 4254 43B7 4359 441 2 44 TiHE (Hi) SPINU3-Waveform IB .B 5 " p y.f, i -u u-ii BB93 Bl 45 Bl 9B B25B B3B3 B355 B4BB B4BB B51 3 B5 TiHE (Hi) UCSREtt-Result Waveform « 5 . B p •IB.B m m m 13 L_j G3 E_ E3 G_ [1 • • • 5BB9 5B3B TiHE (Hi) 5B54 I RECORD/OJT/SCALE I Size : 90.69 ms CURSOR AT Time : 3778.39 m Next Pt : -13 Next Pt : -0.0040 V F I L E PARAHETERS F u l l : 219331 pts Ful l : 10966.55 ms Disp : 219331 pts Disp : 10966.55 ms SF : 20.00 kHz AScl : OFF I N S I D E HARKERS Begin : 4201.90 ms End : 6302.85 ms Diff . : 2100.95 ms RMS : 0.4921 V +Peak : 1.8198 V -Peak : -2.7825 V R A M P . Ramp Len, Fact. Step PARHS ; LINEAR ; 20 ms : 4.80 : 0.20 APPENDIX E Spectrograms of Jittered and Unjittered Tones and Speech Stimuli Unjittered 100-Hz Pure Tone IHMMMM-_M-_MMM--_-(-3l — • — • M I _ I _ I _ I _ I _ I _ H _ B _ I _ I _ I _ H _ I _ I _ I _ I _ V T I 100HZ-n.ai JLlB B.B E D B.B u N 5 . B c ' H-B I k I H a.H Z.B 1 .B a.a -SI -55 -BB - M -BE -T3 -77 -32 -56 -9B 4BBB 5BBB Ti H E (Mi) IBB [U H E_ E 03 LSI • • • • 100HZ-Cross-Sect i on S 9 B *7 1.3 Z.B 3.B 4.B S.B B.B 7.B B.B 9.0 10, Ft-equencH (kHz) CURSOR AT Time : 3196.9 ms Freq : N/A Mag. : N/A F I L E PARAHETERS Proc. Wind. Bands Ovlp. Func. SF Pre. 
Order AC 512 pts 256 50 Z HANNING 20.00 kHz 98.0 % 15 5 PEAKS F<Hz) M(dB) 0.0 -53.80 117.0 -46.60 N/A N/A N/A N/A N/A N/A HARKER AT F(Hz) M(dB) 148 High-BW Jittered 100-Hz Pure Tone ma siii^i i i ip™^ I N IB.B -44 9.B B.B -49 -54 E q B.B u -5B -B3 -67 -72 -77 -Bl -BB N5-B c V 4.B k "a.a z 2.B 1 ,B -9B B.B 1! IBB 2BBB 3BBB 4BE B 5BBB F J B B B Tine (MsJ 7BBB BBBB 9BBB IBB 100HZJ2-Uaveform £013 03 E3 EE [3 [1 • • • 4BBB 5BBB BBBB Tine (Hi ) 100HZJ2-Data Processed H ' F 5-Bl 100HZJ2-Cross-Secti on l .B Z.B 3.B 4.B 5.B B.B 7.B B.B 9.B 1 Ft-cqucncM ( k H z ) 1 SPECTROGRAM Status: 1002 CURSOR AT Time : 4897,7 ms Freq : N/A Mag. : N/A F I L E PARAHETERS Proc. : AC Wind. : 512 pts Bands : 25G Ovlp. : 50 Z Func. : HANNING SF : 20.00 kHz Pre. : 98.0 I Order : 15 5 PEAKS F<Hz) M(dB) 78.0 -50.20 2382.0 -79.G0 3867.0 -81.80 5312.0 -79.30 7265.0 -82.GO HARKER AT F<Hz> M(dB) 149 High-SD Jittered 100-Hz Pure Tone 9.0 H.B E a B . B 1 N 5 . B C * 4 . B k " 3 . B z 2.0 1 .0 0.0 - 4 7 - 5 1 - 5 5 - 6 0 - 6 4 - 6 3 - 7 3 - 7 7 - B 2 - B 5 - 9 0 4 B B B 5 B B B TiHE ( t i l ) 2 | lOOHZJl-Waveform 7 H n P 5 . E l uB.ll ° - 5 . B t B . B 5 B B 0 T I H E ( H i ) B . B 100HZJ1-Data Processed TT M p 5 .a| U B . B ° - 5 - B T B . B 4 9 6 4 4 9 6 7 4 9 6 9 4 9 7 Z 4 9 7 4 4 9 7 7 4 9 7 9 4 9 B 2 4 9 B 5 4 9 1 Ti H E (MI) 100HZJl-Cross-Secti on l . B 2 . 0 3 . B 4 . B 5 . B 6 . B 7 . B B . B 9 . B I B ; Ft-squEncH ( k H z ) Status: 1002 CURSOR AT Time : 4961.S ms Freq : N/A Mag. : N/A F I L E Proc. Wind. Bands Ovlp. Func, SF Pre. Order PARAHETERS : AC : 512 pts : 256 : 50 I : HANNING : 20.00 kHz : 98.0 I : 15 5 PEAKS F(Hz) M(dB) 117.0 -49.90 22G5.0 -74.50 4804.0 -78.70 7382.0 -79.G0 8906.0 -82.30 HARKER AT F(Hz) M(dB) High-SD/BW Jittered 100-Hz Pure Tone 150 EZSEI ® ail ills _ 9.B B.B E q B.B U N s-a c V i l .B k H 3 . 
B z Z.B 1 -0 B.B -BB -73 -77 -Bl 5BBB T i M e (H i ) 100HZJ3-Waveform •51 4BBB 5BBB BBBB Tine (H i ) 100HZJ3-Cross-Secti on 1.0 2.0 3.0 U.0 5.0 B.B 7.B B.B 9.B 10j FrcqucncH (kHz ? Status: 1002 CURSOR AT Time : 5000.0 ms Freq : N/A Mag. : N/A FILE PARAHETERS Proc. Wind. Bands Ovlp. Func. SF Pre. Order AC 512 pts 256 50 Z HANNING 20.00 kHz 98.0 Z 15 5 PEAKS F<Hz) M(dB) 0.0 -50.20 29S8.0 -79.10 4414.0 -7G.40 5G25.0 -74.40 G796.0 -74.20 HARKER AT F<Hz) M<dB) Unjittered 1000-Hz Pure Tone £11 m & | " 3 | n A | P | « ) | « ] | | M > ! • | « | > IB.B 9.B B.B E Q B.B U NS.B c V ».B it "3.B z Z.B l .a B.B -SB! -36 1 -421 4B j -54 H -BBH -66 -72 -7B -B4 -9B 1BBB 2BBB 3BBB 4BBB 5BBB E TiHE (Mi) BBB TBBB BBBB 9BBB IBB 33B2 33B4 33B7 33B9 331 2 331 5 331 7 332B 3322 33 Ti HE (Ml) 1 .B 2.B 3.B 4.B 5.B B.B 7.B B.B 9. Ft-cqucncH (kHz) I SPECTROGRAM Status: 1002 CURSOR AT Time : 3299,2 ms Freq : N/A Ma9. : N/A F I L E PARAHETERS Proc, Wind. Bands Ovlp. Func. SF Pre. Order AC 512 pts 256 50 2 HANNING 20.00 kHz 98.0 2 15 5 PEAKS F<Hz) M(dB) 1015.0 -31.20 N/A N/A N/A N/A N/A N/A N/A N/A HARKER AT F<Hz) M(dB) 152 High-BW Jittered 1000-Hz Pure Tone SPECTROGRAM 511 B 51 ZB 51 Z3 51 Z5 51 ZB 51 3B 51 33 51 35 51 3B 511 Ti ME ( H i ) | Status: 100Z CURSOR AT Time : 5115,1 ms Freq ,* N/A Mag. : N/A F I L E PARAHETERS Proc. : AC Wind. : 512 pts Bands : 256 Ovip, : 50 t Func. HANNING SF : 20.00 kHz Pre. : 98.0 1 O der 15 5 P E A K S F(Hz) M(dB) 1015.0 -27.10 4804,0 -58. GO G79G.0 -61.70 8945.0 -59.G0 N/A N/A HARKER AT F<Hz) M(dB) 153 High-SD Jittered 1000-Hz Pure Tone lOOOHZJl-Haveform I Tine (H i ) BBBB lOOOHZJl-Cross-Secti on l . B 2.B 3LB 4 J I 5.0 B.0 7.0 B.0 9.0 i d FheguencH (kHz) Status: 100Z CURSOR AT Time : 4897.7 ms Freq : N/A Mag. : N/A F I L E PARAMETERS Proc. Hind. Bands Ovlp, Func. SF Pre. 
Order AC 512 pts 25G 50 Z HANNING 20.00 kHz 98.0 Z 15 5 PEAKS F(Hz) M<dB) 1054.0 -2G.40 3320.0 -55.90 5781.0 -57.90 7G95.0 -59.00 N/A N/A HARKER AT F<Hz) M<dB) 154 High-SD/BW Jittered 1000-Hz Pure Tone E H 9 J 1 B. B 9. B B.B E • B.B U M S . a c * H.B k " 3 . B z 2.B -38 -3B -42 -4H -54 -BB -BB -72 j -7B -BU 1 l .a -9B B.B IB 3B 2BBB 3BBB UBBB 5BBB BBBB 7BBB BBBB 9BBB IBB Ti HE (Hi) 1000HZJ3-Waveform "BTB! 1 B.B 5BBB Tine (Hi) TBE 1000HZJ3-Cross-Section | 0|a 3F. •13 •IH B 7 3 7 H n.i -90 _ 1.0 2.0 3.0 H.B 5.0 B.B 7.0 B.E FhequencH (kHz) i 9.a IE I SPECTROGRAM! Status: 100% CURSOR fiT Time : 4693.1 ms Freq : N/A Mag. : N/A F I L E PARAMETERS Proc, Wind. Bands Ovlp. Func. SF Pre. Order AC 512 pts 25G 50 2 HANNING 20,00 kHz 98.0 2 15 5 PEAKS F<Hz> M(dB) 1015.0 -36,90 1484.0 -40.80 3750.0 -G1.80 515G.0 -55.30 6601.0 -55.50 MARKER AT F<Hz) M(dB) 155 Unjittered 2000-Hz Pure Tone 9.a B.B E Q B.B U N 5 . B C V U.B k H 3 . B z 2.a l .a B.a iaaa zaaa aaaa 5BBB Ti He (H i ) BBBB TBBB 2000HZ-Uaveform BBBB 2000HZ-Cross-Section 1 .B 2.B 3.B it ._ FhequencH 5.B B.B 7.B B.B 9. — 'kHz) Status: 100Z CURSOR AT Time : 4156.0 ms Freq : N/A Ha 9. : N/A F I L E Proc. Wind. Bands Ovlp. Func, SF Pre. Order PARAHETERS : AC : 512 pts : 256 : 50 z : HANNING : 20.00 kHz : 98.0 Z : 15 5 PEAKS F(Hz) 1992.0 N/A N/A N/A N/A M(dB) -23.90 N/A N/A N/A N/A HARKER AT F(Hz> M<dB> 156 High-BW Jittered 2000-Hz Pure Tone -1 9 -26 -33 -40 -47 -54 -BZ -69 -76 -B3 -9B IBB sjHEaEiSiSiiinnn 4BBB 5BBB Tine (Mi) 2000HZJ2-Cross-Section A . l .H 2.0 3.0 4.0 5.B B.B 7.0 B.B 9.0 la, Frequency (kHz) Status: 100Z CURSOR AT Time : 4578.0 ms Freq : N/A Ma9. : N/A F I L E Proc. Wind. Bands Ovlp. Func. SF Pre. Order PARAMETERS : AC : 512 pts : 25G : 50 t : HANNING : 20.00 kHz : 98.0 I : 15 5 PEAKS F(Hz) M(dB) 2031.0 -29.20 4843.0 -50.00 G445.0 -52.90 8945.0 -53.50 N/A N/A MARKER AT F(Hz) M(dB) Hiah-SD Jittered 2000-Hz Pure Tone 9.a B.B E q B . B u N S . B c V U.B k H 3 . B z 2. 
a i .1 -55 -B2 -69 -7B ilBBB 5BBB T I H E ( H i ) ^i^EaHEaEiiinnn 5BBB Tine (Hi) 2000HZJl-Cross-Sect i on A 1.0 2.0 3.0 U.B 5.0 &.0 7.0 H.0 9.0 10] Fl~equcncH (kHz) I SPECTROGRAM! Status: 1002 CURSOR RT Time : 4936.1 ms Freq : 9765.0 Hz Mag, : -57.7 dB F I L E PARAMETERS Proc. Wind. Bands Ovlp. Func. SF Pre. Order AC 512 pts 256 50 2 HANNING 20.00 kHz 98.0 2 15 5 PEAKS F(Hz) M<dB) 1992.0 -27.10 3437.0 -50.30 49G0.0 -53.20 6328.0 -54.50 7539.0 -55.40 HARKER AT F(Hz) M(dB> 158 High-SD/BW Jittered 2000-Hz Pure Tone H H tsiiiiiiis_^Hi3a[i^^E3SiBiS]iinnn 5.H B.B E q B.B I N5.0 c 1 n .a k H 3 . B z 2.B 1 .B B.B UBBB 5BSB Ti ne (Mi) 2000HZJ3-Data Processed ia u B.B 0-5.0 1 5B92 5095 5077 51 B0 51 02 5105 51 B7 511 0 511 2 51' Tine (Mi) 2000HZJ3-Cross-Secti on 1.0 2.0 3.0 4.0 5.B B.B T.B B.B 9.BIB, FhequencH (kHz) M SPECTROGRAM Status: 100Z CURSOR AT Time : 5089,5 ms Freq : N/A Mag. : N/A F I L E PARAHETERS Proc. Wind. Bands Ovlp. Func. SF Pre. Order AC 512 pts 25G 50 Z HANNING 20.00 kHz 98.0 Z 15 5 PEAKS F<Hz> M(dB) 13G7.0 -41.40 2500.0 -36.G0 304G.O -35.10 G054.0 -49.20 7812,0 -47.50 HARKER AT F(Hz) H(dB) Unjittered Sentence 159 SPINl-i a _ HI s. a. V-E P B. U h. c «». k H i . 33 U H mia H f l B M M M H EH Gfl SI 03 E l l • • • U2U5 5877 5355 5B33 Tine (till 591 a SPINl-Waveform a. a °-S.B t B.B 5B77 5355 5B33 Ti M e (Ms) B7i SPINl-Data Processed 1 • SPINl-Cross-Section 1 BSlP, H B.B n P 5.B U B.B °-5.B 4 B.B X •I .1 •iq qii H S B 7T Rl nc -SB HfiBB HBB3 H66B 46BB UB71 U673 H676 HB7B UBB1 HE T i Me (Mi) l .B 2.B 3.B il.B 5.B B.B 7.B B.B 9.B1E FrequencH ( k H z ) | SPECTROGRAM | Size : 2764.68 ms CURSOR AT Time Freq Ma9. 4657.9 ms N/A N/A F I L E Proc. Wind. Bands Ovlp. Func. SF Pre. Order PARAHETERS : AC : 512 pts : 256 : 50 z : HANNING : 20.00 kHz : 98,0 z : 15 5 PEAKS F(Hz) M(dB) 820.0 -62.80 1445.0 -58.70 2695.0 -68.10 4101.0 -72.10 6406,0 -85.80 HARKER AT F<Hz) M(dB> 160 High-BW Jittered Sentence 9.a a.a E q B.B u „ 5 . 
a c ' n.a k H 3 .B z 2.B i .a a.a 4291 4563 4B3B 51 BB 53B1 5653 5926 519B 6471 Tins (Hi) SPINlJ2-Uaveform * B 7 B P 5.B U B.B °-5.B 4 B.B m mm "Mi" mi wimf. 53B1 Ti H E (Hi) SPIN1J2-Lata Processed fl n p 5.a| u B.B 0-5. B 4 a.a 4673 4676 4675 4651 4BB3 4BBB 4BBB 4691 4694 46' T i H E ( H i ) I SPINU2-Cross-Section l .B 2.B 3.B 4.B 5.B B.B 7.B B.B 9 . B i d FrEguEncH (kHz) I SPECTROGRAM I Size : 2718,60 ms CURSOR AT Time : 4670.6 ms Freq : 8960.0 Hz Mag. ; -78.6 dB F I L E PARAMETERS Proc. Wind. Bands Ovlp. Func. SF Pre. Order AC 512 pts 256 50 % HANNING 20.00 kHz 88.0 % 15 5 PEAKS F(Hz) M(dB) 1484.0 -60,70 2773.0 -69,10 4335.0 -74.60 5976,0 -75.50 8750.0 -78.00 MARKER AT F<Hz) M(dB) High-SD Jittered Sentence E2EHH i s i E i i i i i E z ^ H M S M ^ ^ n B H s n a n n SPINUl-tlaveform n' P 5 .H | U B.B °-5.B t B.B MRP Wd' 5374 TiHE (Hi) SPINIJI-Data Processed K P S.Bl B.B U °-5.B I t B.B B . B p ° r f * v 4B73 46TB 4B7B 46B1 46B3 4BB6 46BB 4691 4694 46' Ti HE (Hi i | SPINDl-Cross-Sect i on l . B 2.H 3.B 4.a S.B B.B 7.B B.B 9.B 1 B] Fr-squEncH (kHz > I SPECTROGRAM 1 Size : 2534.29 ms CURSOR AT Time : 4670.G ms Freq : 9531.0 Hz Mag. : -87.7 dB F I L E PARAHETERS Proc. Wind. Bands Ovlp. Func. SF Pre. Order AC 512 pts 25G 50 Z HANNING 20.00 kHz 98.0 % 15 5 PEAKS F(Hz) M(dB) 1484.0 -58.80 2695.0 -67.50 4218.0 -72.90 5625.0 -81.20 7304.0 -84.GO HARKER AT F<Hz) M(dB) High-SD/BW Jittered Sentence uamm g ) i l 0 1 i l ! B _ ^ I I S H M M ^ l E E l D 3 E l [ i n n n 9.B B.B E Q B.B U „ S . B C V 4.B k " 3 . B z a.a 1 .a B.B 5259 Tine (Hi) SPINU3-Uaveform P 5.BI rBTB °-5.B t B.B 1 B t f , l l , ! l | l T * 1 433B 5B29 5259 Tine (Mi) spir HJ3-Data Processed | | SPINU3-Cross-Section 1 •SlP, ^ B . B P 5.B y B.B 0 - 5 . B T B.B i il il •1 d K>1 Ed F.^ 3 „ " ni BR -9B 4635 4637 464B 4642 4645 464B 465B 4653 4655 46 Ti H E (Mi) l . B 2.B 3.B U.B 5.B B.B 7.0 B.B 9.0 IE Fl-equcncH (kHz) | SPECTROGRAM | Size : 2303.90 ms CURSOR AT Time : 4G32.3 ms Freq : 97G5.0 Hz Mag. 
: -75.8 dB F I L E PARAHETERS Proc. Wind. Bands Ovlp. Func, SF Pre. Order AC 512 pts 256 50 2 HANNING 20.00 kHz 98.0 2 15 5 PEAKS Rife? M(dB) 1601.0 -63.00 3203.0 -60.30 5G25.0 -70.90 7226.0 -74.30 8515.0 -73.10 HARKER AT F<Hz> M(dB> 163 Unjittered Syllable fteil 1. E B.B E a G.B u M5-B C V 4.B k H3 .B z 2. a i .a B.B 11 m^\mm^\pm<c\\>>A^m 5B99 511 B 51 3B 515? 51 76 T I M E (H I ) 5195 (jB.B P 5.B U B.B °-5.B t B.B I ^ M I P | € > | « 3 | H H < < . » w"'t/"~(/' ,"uVrtv^''"0*V^'*tA"\A"vA 5176 TiHE (H i ) 51 95 5214 5233 5253 SPINl-Data Processed JB-BT-P 5.B| U B.B °-5.B 51 45 51 4B 51 51 5153 51 56 51 58 51 61 51 63 51 66 511 T I M E ( M I ) SPINl-Cross-Section l . B 2.8 3.B 4.B 5.B 6.B 7.a B.B 9 . B i d FrequEncH (kHz) Size : 191,01 ms CURSOR AT Time : 5130.1 ms Freq : N/A Ma9. : N/A F I L E PARAMETERS Proc. Wind, Bands Ovlp. Func. SF Pre. Order AC 512 pts 256 50 X HANNING 20.00 kHz 98.0 1 15 5 PEAKS F(Hz) M(dB) 3789,0 -77.20 N/A N/A N/A N/A N/A N/A N/A N/A MARKER AT F<Hz> M(dB) 164 High-BW Jittered Syllable rteil \mssm SliIilSlls_tl|l^MMMllMHEiaiiann 3 . 1 B . B E q B.B u NS-B c V U.B k H3.B z s.a l.B B.a SBBB 5B9B 511S 5139 5163 51E T i MB ( H i ) 5212 5236 526B -49 -SH -SB -E2 -B7 -72 -77 -Bl -BB -9B "52H SPINlJ2-Uaveform f i T l n P S.B U B.B •t B.B 5BBB 51 63 T i HE ( H i ) 5236 SPIN1J2-Data Processed H ' P 5.B| 1 B.B U 0.5. B t B.B 5223 5225 522B 523B 5233 5236 523B 5241 5243 52' T I M E ( H I ) | SPINU2-Cross-Section l.B 2.B 3.B H.B 5.B 6.B 7.B B.B 9.BIB] F r e q u c n c M (kHz? I SPECTROGRAM! Size : 242.73 ms CURSOR AT Time : 5105.1 ms Freq : N/A Mag. : N/A F I L E PARAHETERS Proc. Uind, Bands Ovlp. Func. SF Pre. Order AC 512 pts 25G 50 Z HANNING 20.00 kHz 93.0 Z 15 5 PEAKS F<Hz) M(dB) 507.0 -GG.90 2070.0 -G5.40 3085.0 -71.40 5703.0 -81.90 7539.0 -81.80 HARKER AT F(Hz) M(dB) 165 High-SD Jittered Syllable rteil 1. B B.B E Q B.B II MS.a c * 4.B k H 3 . a z 2 . 
B i .a a .a 11 |i>|^|E|^D|^|PH&hc]|li-|>>|^|r> - M -49 -5U •SB -B3 -B7 -72 -77 -Bl -BB -9B 51 57 51 BB Tine (Ms) SPINIJI-Waveform MB.B p 5.a •t B.B 5157 Tine (Mi) SPINIJI-Data Processed M | P 5.B U B.B °-5.B •r a.a 51 71 51 7K 51 77 51 79 51 B2 51 B4 51 B7 51 B9 51 92 51 T i H E (Hi) | SPINUl-Cross-Section l . B 2.B 3.B H.B 5.B B.B 7.B B.B 9 . B I B . Ft-squEncH (kHz) | SPECTROGRAM | Size : 230.09 ms CURSOR AT Time : 5105.0 ms Freq : N/A Mag. : N/A F I L E PARAHETERS Proc. : AC Wind. : 512 pts Bands : 25G Ovlp. : 50 % Func. : HANNING SF : 20.00 kHz Pre. : 98.0 t Order ; 15 5 PEAKS F(Hz> M(dB) 3710.0 -70.10 4843.0 -74.50 6171.0 -75.60 N/A N/A N/A N/A HARKER AT F(Hz) M(dB) High-SD/BW Jittered Syllable fteil 9.a a.a E Q 6.B U N 5 . B C ' n.a k H 3 . a z 2„E i .a a.a 1^11 l ^ l ^ lB lM^MlPht> I^H>- | t> |^ |^ -4.9 -54 -5B -E -1 -1 51 63 Ti H E (Hi) raTB SPINlJ3-Waveform P 5 .B | u B.B °-5.B •t B.B II IS E3 Q2 Eg] DO • • • 5163 TiHE (Hi) B.B SPIN1J3-Data Processed H P 5 .B | u B.B 51 59 51 61 51 6* 51 B6 51 69 5172 51 74 51 77 51 79 51 T i HE (Hi> SPINlJ3-Cross-Section l . B 2.B 3.B 4.B 5.B B.B 7.B B.B 9.BIB] FhequencH CkHz) I SPECTROGRAM Size : 243.06 ms CURSOR AT Time : 5117.9 ms Freq : N/A Mag. : N/A F I L E PARAHETERS Proc. Wind. Bands Ovlp. Func, SF Pre. Order AC 512 pts 256 50 Z HANNING 20.00 kHz 98.0 Z 15 5 PEAKS F<Hz) M(dB) 3906.0 -69.00 5234.0 -72.60 6601.0 -74.00 7812.0 -74.90 8750.0 -75.70 HARKER AT F(Hz) M(dB) Unjittered Consonant \g] 9.B B.B E a B.B II N S.B c U k » 3 . B z Z.B 1 .B B.B - l l l l -49 -54 -5B -B3 -67 -73 -77 -HI -BB -9B 5B47 5BB3 SBBB 5B96 T I M E (H I ) EsiisiEEDa • "3 i n n P 5.B U B.B °-5.B t B.B X — — •  —1 • — _ — - ( / > » - vr^  * -v^  ,j» v"-- y/ l r ^ . 5B13 5B3B 5B47 5BB3 5BBB 5B96 5913 593B 594B 59 T I M E ( M I ) SPINl-Data Processed 1 • SPINl-Cross-Section 1 • J L 3 I H M U ' U P 5.B U B.B j>-5.B t B.B X H Q Ci! qn p. J H R7 B ?? 
Rl HP, -90 5B3B 5B4B 5B43 5B45 5B4B 5B5B 5H53 5B55 5B5B 5B T I M E ( M I ) 1 .0 2.0 3.0 4.0 5.B B.B 7.0 H.B 9 0 IE Frequency (kHz! Size : 165,98 ms CURSOR AT Time : 5835.0 ms Freq : N/A Mag, : N/A F I L E PARAHETERS Proc. Uind. Bands Ovlp. Func. SF Pre. Order AC 512 pts 256 50 Z HANNING 20.00 kHz 98.0 Z 15 5 PEAKS F(Hz) M(dB) 234.0 -75.40 N/A N/A N/A N/A N/A N/A N/A N/A HARKER AT F(Hz) M<dB) 168 High-BW Jittered Consonant fal B . B E a B .B II N 5 . B c k » 3 . B z Z.B 1 .B B.B 1^ 11 l^ l^ imr i^ D lMPI^ I^ ln l ^ l ' H l ^ -*4 -49 -5.4 -SB -B3 -B7 -72 -77 -31 -BB -9B 5BBB 5B96 T i n e ( H i ) lJ2-Waveform | HHI' Li Li LE. L3 -3 L312 P 5.B u B.B °-5.B t B.B X u — - - - I - — - l y r - v y — v / * — - v " - v"* — ^ — 5B13 SB3B 5B47 5BB3 5BBB 5B96 5913 593B 594B s4 Ti H e ( H i ) SPIN1J2-Data Processed H P 5. B.B B.B 5B3B 5S4B 5B43 5B45 5B4H SBSB 5B53 5B55 5B5B 5BI T i n e CHx) | SPINlJ2-Cross-Section l . B 2.B 3.B 4.B 5.B B.B 7.B B.B 9. F h c q u e n c H (kHz) I SPECTROGRAM | Size : 1GG.77 ms CURSOR AT Time : 5835.0 ms Freq : N/A Mag. : N/A F I L E PARAHETERS Proc. Wind. Bands Ovlp. Func. SF Pre. Order AC 512 pts 25G 50 Z HANNING 20.00 kHz 98.0 Z 15 5 PEAKS F<Hz) M(dB) 234.0 -74.20 N/A N/A N/A N/A N/A N/A N/A N/A MARKER AT F<Hz) M(dB) 169 High-SD Jittered Consonant fal •3.B a.B E g B . B u N 5.a c * n.a k "a.a z 2.a 1 .a a.a -« -SI -5B -B3 -BT -72 -77 -Bl -BE 5347 5flB3 5BBB Ti H E (Ms) SPJJ TTa HJl-Waveform | p s.a u B.B °-5.B -f B.B X 5B13 5B3B 5BU7 5BB3 SBBB 5B96 5913 S93B 59HB 59 Title (Hi) SPINUl-Data Processed | B SPINIJI-Cross-Section | BJSlA (ja.a P S.B u B.B 0-5.B t B.B I ilil •m qn ' %l B it Bl HP. -90 5B5B 5B53 5B55 5B5B 5BB1 5BB3 BBBB 5B6B 5B71 SB Tine (Mx) 1.0 2.0 3.0 a.B 5.0 &.0 7.0 B.0 9.0 IE FhEquencH (kHz) Size : 1G9.91 ms CURSOR AT Time : 5847.8 ms Freq : N/A Mag. : N/A F I L E PARAHETERS Proc, Wind. Bands Ovlp. Func. SF Pre. 
APPENDIX F

Connections on the Tucker Davis Technologies Modules

[Figure: connections for the Gap Detection Threshold Test]

[Figure: connections for the SPIN sentences]

Purpose of Each TDT Module

The Tucker Davis Technologies system consists of hardware modules which are controlled via computer. The following TDT hardware modules were used in the present study.

DDI

This module is a digital-to-analog and analog-to-digital converter. The digital speech and babble signals were converted to analog signals using this module.

FT5

This anti-aliasing filter is used to reduce the high frequencies that accompany recorded signals, and to smooth the digital signal. Thus the signal that reaches the listener has a more natural sound quality.

PA4

This module is a programmable attenuator that can adjust the level of presentation of sentences and background babble. It is used to ensure that all signals are presented to the listener at the intended level. One PA4 module was used for the speech signals; another was used for the babble.
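The level equalization that the PA4 attenuators perform can be summarized numerically: given a stimulus file's measured RMS and the common RMS at which every stimulus should reach the listener, the required attenuation is a simple decibel ratio. A minimal sketch, assuming a hypothetical 0.4-V reference level (the function names are illustrative, not from the thesis; the 0.5134-V RMS is taken from one of the waveform readouts):

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def pa4_attenuation_db(file_rms, reference_rms):
    """Attenuation (dB) that brings a file's RMS to the common reference
    RMS; a positive value means the file must be turned down."""
    return 20.0 * math.log10(file_rms / reference_rms)

# One sentence file measured at 0.5134 V RMS, equalized to a
# hypothetical 0.4-V reference level:
print(round(pa4_attenuation_db(0.5134, 0.4000), 2))  # dB of attenuation
```

With separate PA4 modules on the speech and babble channels, each channel receives its own attenuation value, so the signal-to-babble ratio can be set independently of the overall presentation level.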
As the in-house program presented each SPIN sentence, the RMS values for that sentence and accompanying babble file were sent to the PA4 modules, which attenuated accordingly. In this way, all sentences and babble files were ultimately presented at the same RMS levels to each listener.

SM3

This module removes the high-frequency background hiss generated by the TDT itself during the playing of signals. Thus the signal that reaches the listener has a clearer sound quality because it is free of excess machine noise. This module also functions as a sound mixer: two signals (such as speech and multi-talker babble) can be combined on this module, so the listener can hear both signals through one ear of the headphones (as was done in this experiment).

HB5

This is the headphone buffer module. It is used to relay the speech signal from the Tucker Davis modules to the headphones in the sound-attenuated booth.

APPENDIX G

Instructions to Participants for the SPIN and GAP Tasks

SPIN Instructions

In this task you will hear some sentences that are presented with noise in the background. This noise will sound much like the background chatter at a cocktail party. Also, some of the sentences may sound unusual. Your task is to repeat the last word of each sentence. If you are not sure what the last word was, take a guess. It is important that you give a response each time, so it's better to guess than say nothing. Let's practice a few sentences first, so you can get used to the task.

Gap Detection Instructions

The purpose of this test is to find the shortest duration of silence, or gap, that you can hear in a very short beep. The test takes about 10 minutes to complete. In each trial of this test, you will hear two short beeps. When you hear the first beep, the green light on this button box will light up. When you hear the second beep, the red light on the box will light up. One of the beeps will have a gap in the middle of it, and the other won't.
Your job is to indicate which of the two beeps had the gap in it. If you think that the first beep had the gap in it, press the button under the green light. If you think the second beep had the gap, press the button under the red light. Don't press the button until you have heard both beeps. If your answer was right, the light above the button you pushed will light up. If you were wrong, then the light above the other button (the correct one) will light up. When you are ready to go on to the next trial, press the yellow button in the middle of the box. Press this button to begin the test too.

APPENDIX H

Percent-Correct Scores of Participants for High-Context, Low-Context, and All Sentences

S:N = +8 dB. Columns give percent-correct scores for high-context (High), low-context (Low), and all sentences; each All score is the mean of the High and Low scores.

Partici-   No Jitter        High-SD Jitter a   High-BW Jitter b   High-SD/BW Jitter c
pant       High  Low  All   High  Low  All     High  Low  All     High  Low  All
 1         100   84   92    100   72   86      100   80   90       88   20   54
 2         100   84   92     88   68   80      100   76   88       80   32   56
 3         100   84   92     96   56   76      100   60   80       84   36   60
 4         100   80   90     96   48   72      100   64   82       64   16   40
 5         100   84   92    100   80   90      100   80   90       76   20   48
 6         100   64   82     92   76   84      100   80   90       76   24   50
 7          92   76   84     96   64   80      100   80   90       72   24   48
 8         100   76   88    100   64   82      100   76   88       64   28   46
 9         100   76   88    100   64   82      100   80   90       76   20   48
10         100   72   86    100   76   88      100   84   92       68   40   54
11         100   80   90     96   60   78      100   68   84       52   20   36
12         100   76   88    100   56   78       96   88   92       92   40   66
13         100   88   94    100   84   92      100   84   92       88   36   62
14         100   84   92     96   72   84       96   92   94       76   24   50
15         100   80   90    100   80   90      100   80   90       96   44   70
16         100   88   94     96   68   82      100   64   82       84   20   52

a SD = 5; BW = 100 (0.05 msec)
b SD = 1; BW = 500 (0.25 msec)
c SD = 5; BW = 500 (0.25 msec)

APPENDIX H (Continued)

S:N = +4 dB.

Partici-   No Jitter        High-SD Jitter a   High-BW Jitter b   High-SD/BW Jitter c
pant       High  Low  All   High  Low  All     High  Low  All     High  Low  All
 1         100   80   90     96   52   74      100   52   76       72   12   42
 2          96   60   78     88   72   80       92   68   80       48   16   32
 3          96   76   86     92   76   84       92   84   88       80   20   50
 4         100   76   88     84   56   70       96   52   74       40   24   32
 5         100   60   80    100   72   86       96   72   84       60    8   34
 6         100   56   78     96   56   76       96   84   90       60   28   44
 7         100   84   92     96   48   72       92   64   78       48   24   36
 8          96   80   88     84   40   62      100   64   82       44   12   28
 9          92   76   84     92   60   76      100   72   86       36   12   24
10         100   84   92    100   60   80      100   88   94       52   28   40
11          92   60   76     88   52   70       96   68   82       44    8   26
12         100   84   92    100   44   72      100   72   86       64   36   50
13         100   80   90     92   40   66      100   84   92       48   24   36
14          96   76   86     88   56   72      100   72   86       44   28   36
15         100   80   90    100   64   82      100   72   86       80   28   54
16          96   76   86    100   48   74      100   72   86       68   16   42

a SD = 5; BW = 100 (0.05 msec)
b SD = 1; BW = 500 (0.25 msec)
c SD = 5; BW = 500 (0.25 msec)

APPENDIX I

Order of Jitter Conditions and SPIN Forms for Each Participant

           First Visit (S:N = +8 dB)                 Second Visit (S:N = +4 dB)
Partici-   No      High-SD  High-BW  High-SD/BW      No      High-SD  High-BW  High-SD/BW
pant       Jitter  Jitter a Jitter b Jitter c        Jitter  Jitter   Jitter   Jitter
 1         1       3        5        7               2       4        6        8
 2         8       1        3        5               7       2        4        6
 3         6       8        1        3               5       7        2        4
 4         4       6        8        1               3       5        7        2
 5         2       4        6        8               1       3        5        7
 6         7       2        4        6               8       1        3        5
 7         5       7        2        4               6       8        1        3
 8         3       5        7        2               4       6        8        1

Partici-   No      High-BW  High-SD  High-SD/BW      No      High-BW  High-SD  High-SD/BW
pant       Jitter  Jitter   Jitter   Jitter          Jitter  Jitter   Jitter   Jitter
 9         1       3        5        7               2       4        6        8
10         8       1        3        5               7       2        4        6
11         6       8        1        3               5       7        2        4
12         4       6        8        1               3       5        7        2
13         2       4        6        8               1       3        5        7
14         7       2        4        6               8       1        3        5
15         5       7        2        4               6       8        1        3
16         3       5        7        2               4       6        8        1

Note. Numbers in columns refer to SPIN forms 1 through 8.
a SD = 5; BW = 100 (0.05 msec)
b SD = 1; BW = 500 (0.25 msec)
c SD = 5; BW = 500 (0.25 msec)

APPENDIX J

Models of Temporal Resolution

Monaural Models a

Model I:  stimulus → bandpass filter → half-wave rectifier → low-pass filter → decision device

Model II: stimulus → bandpass filter → square-law device → temporal integrator → decision device

Binaural Models

Durlach's Model of Equalization and Cancellation b

stimulus → bandpass filter (ear 1) → amplitude and time jitters (1) → equalization mechanism → cancellation mechanism → decision device
stimulus → bandpass filter (ear 2) → amplitude and time jitters (2) → (joins at the equalization mechanism)

a Models as depicted in Moore, 1989, p. 150.
b Model from figures in Moore, 1989, p. 220, and Colburn & Durlach, 1978, p. 482.
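The first monaural model can be sketched in a few lines: half-wave rectification followed by a leaky (low-pass) integrator yields a running envelope, and the decision device reports a gap whenever that envelope dips below a criterion. A minimal discrete-time sketch, in which the bandpass stage is omitted and the 8-ms time constant and half-maximum criterion are illustrative assumptions rather than values from the models cited:

```python
import math

def half_wave_rectify(x):
    """Half-wave rectifier: passes positive half-cycles, zeroes the rest."""
    return [max(s, 0.0) for s in x]

def leaky_integrator(x, sf=20000.0, tau=0.008):
    """One-pole low-pass filter (time constant tau, here 8 ms) standing in
    for the model's sliding temporal integrator."""
    a = math.exp(-1.0 / (sf * tau))
    y, out = 0.0, []
    for s in x:
        y = a * y + (1.0 - a) * s
        out.append(y)
    return out

def decision_device(env, criterion):
    """Report a gap if the smoothed envelope ever dips below criterion."""
    return min(env) < criterion

# A 100-ms, 1000-Hz tone at SF = 20 kHz, with a 10-ms silent gap inserted
# in one version (200 samples starting at sample 1000).
sf, f = 20000, 1000.0
tone = [math.sin(2 * math.pi * f * n / sf) for n in range(2000)]
gap_start, gap_len = 1000, 200
with_gap = tone[:gap_start] + [0.0] * gap_len + tone[gap_start + gap_len:]

env_intact = leaky_integrator(half_wave_rectify(tone), sf)
env_gap = leaky_integrator(half_wave_rectify(with_gap), sf)

# Skip the onset transient (first 20 ms) before applying the criterion.
crit = 0.5 * max(env_intact)
print(decision_device(env_gap[400:], crit), decision_device(env_intact[400:], crit))
```

During the 10-ms gap the envelope decays by roughly e^(-10/8), carrying it well below the criterion, so the gapped version is flagged while the intact tone is not; shortening the gap or lengthening the time constant shrinks that dip, which is how such models predict gap thresholds.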
