ON THE AUDITORY ADAPTATION AFTEREFFECTS

by

CHARLES CHANG-JIANG DONG

B.Sc., Shanghai Medical University, 1987
M.D., Shanghai Medical University, 1990

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
(Neuroscience Program)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September 2000
© Charles Chang-Jiang Dong, 2000

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

ABSTRACT

In this thesis, selective adaptation experiments are presented which investigate motion detection and signal processing in the human auditory system. The existence of a motion aftereffect is often regarded as evidence for the existence of specialized motion sensitive mechanisms in a modality. Using real moving sounds as both the adapting and test stimuli, a robust and reliable simple auditory motion aftereffect (aMAE) was observed in all seven subjects tested. After listening to a sound source moving repeatedly in one direction (right or left), a stationary sound source was perceived to move in the opposite direction. The size of the aMAE increased with adapting velocity up to the highest velocity tested (20°/sec). Its strength depended on matching both the spatial location and frequency content of the adapting and test stimuli, suggesting that the aMAE is both spatially and frequency specific.
The studies of auditory adaptation were extended to studies of auditory contingent motion aftereffects. In the visual system, many types of contingent aftereffect have been demonstrated since the discovery of the McCollough effect. Although the basis of this aftereffect remains unclear, it is believed to reflect some fundamental aspects of learning and sensory coding. By pairing the directions of sound movement in frequency and in azimuth, the presence of a contingent aftereffect was demonstrated for the first time in the auditory system. This spectral contingent spatial motion aftereffect can persist for four hours after adaptation. The existence of the contingent aftereffect in audition was further confirmed by demonstration of an intensity contingent spectral aftereffect. Since sound motion in azimuth (A) can be made contingent on sound motion in frequency (B), which in turn can be contingent on motion in intensity (C), the possibility of attribute A being made contingent on attribute C by way of attribute B was explored. No such "double" contingent aftereffect was obtained. These results imply that the neural mechanisms underlying contingent aftereffects are not specific to vision, but reflect general properties of sensory neural processing. The long time course of contingent aftereffects suggests that they may be related to cortex-based learning processes.
TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF ABBREVIATIONS
ACKNOWLEDGEMENTS

CHAPTER I GENERAL INTRODUCTION
  1.1 Sound Localization
    1.1.1 Cues for Sound Localization
    1.1.2 Neural Coding of Localization Cues
  1.2 Auditory Motion Detection
    1.2.1 Psychophysical Studies
    1.2.2 Neurophysiological Studies
  1.3 Lessons from Studies of Visual Adaptation Aftereffects

CHAPTER II THE ACOUSTICAL STIMULATION SYSTEM

CHAPTER III STUDIES OF THE AUDITORY MOTION AFTEREFFECT
  3.1 Experiment 1: The Auditory Motion Aftereffect and Its Dependence on Adapting Velocity
    3.1.1 Methods
    3.1.2 Results
  3.2 Experiment 2: The Spatial Tuning and Specificity of the Auditory Motion Aftereffect
    3.2.1 Methods
    3.2.2 Results
  3.3 Experiment 3: The Frequency Tuning and Specificity of the Auditory Motion Aftereffect
    3.3.1 Methods
    3.3.2 Results
  3.4 Discussion

CHAPTER IV STUDIES OF THE AUDITORY CONTINGENT AFTEREFFECTS
  4.1 Experiment 4: A Spectral Contingent Spatial Motion Aftereffect
    4.1.1 Methods
    4.1.2 Results
  4.2 Experiment 5: Comparison of Time Courses of the Contingent and the Simple Auditory Motion Aftereffect
    4.2.1 Methods
    4.2.2 Results
  4.3 Experiment 6: An Intensity Contingent Spectral Motion Aftereffect
    4.3.1 Methods
    4.3.2 Results
  4.4 Experiment 7: A Study of the Double Contingent Aftereffect
    4.4.1 Methods
    4.4.2 Results
  4.5 Discussion

CHAPTER V GENERAL DISCUSSION AND RECOMMENDATIONS FOR FUTURE WORK
  5.1 Auditory Motion Processing
    5.1.1 Specialized Motion Detection Mechanisms
    5.1.2 Evidence for Hierarchical Organization of Auditory Motion Processing
  5.2 Speculations about Neural Mechanisms underlying Contingent Aftereffects
  5.3 Recommendations for Future Work
  5.4 Conclusion
REFERENCES
APPENDIX 1 PURE TONE AUDIOGRAMS
APPENDIX 2 A STUDY OF THE CROSS-MODAL CONTINGENT AFTEREFFECT

LIST OF FIGURES

Fig. 1.1 Schematic diagram illustrating the calculation of the interaural time difference for a distant sound source at an angle θ to the listener
Fig. 1.2 Interaural time difference as a function of azimuth. After Wightman and Kistler (1993).
Fig. 1.3 Interaural intensity differences (IIDs) for sinusoidal stimuli plotted as a function of azimuth. Adapted from Feddersen et al. (1957)
Fig. 1.4 Value of ΔIPD (interaural phase difference change) required for a P(C) of 75% discrimination as a function of frequency. After Yost and Dye (1991)
Fig. 1.5 Value of ΔIID (interaural level difference change) required for a P(C) of 75% discrimination as a function of frequency. After Yost and Dye (1991)
Fig. 1.6 Average azimuth error as a function of the frequency of a tone pulse. Adapted from Stevens and Newman (1936)
Fig. 1.7 Minimum audible angle between successive pulses of tone as a function of the frequency of the tone and the direction of the source. Adapted from Mills (1972)
Fig. 1.8 Schematic diagram of the ascending pathway of the central auditory system. After Pickles (1988)
Fig. 1.9 Interaural time delay curves (left) and monaural period histograms (right) for a cell in the MSO. After Yin et al. (1997)
Fig. 1.10 Topographic organization of EI cell sensitivity to interaural intensity difference in deep layers of the superior colliculus. After Irvine (1992)
Fig. 1.11 Receptive field typology of cochlear nucleus units. After Rhode and Greenberg (1992)
Fig. 1.12 Auditory and visual estimates as a function of source velocity. After Waugh et al. (1979)
Fig. 2.1 The robot arm used in the present study
Fig. 2.2 Workspace of the loudspeaker
Fig. 2.3 Background noise level measured with the MLSSA system at the subject's position in the acoustically treated sound-proof chamber when the robot arm is still and in motion
Fig. 2.4 Schematic diagram of the acoustical stimulus system
Fig. 2.5 Transfer function of the modified sound-insulated room measured with the MLSSA system
Fig. 3.1 A, Time sequence of stimuli for inducing the simple auditory motion aftereffect. B, Speaker velocity as a function of time during one adapting sweep
Fig. 3.2 An example illustrating the probit analysis that was used to estimate the 50% response rate on the psychophysical function derived from the subject's responses
Fig. 3.3 The frequency spectrum of the broadband noise, measured at the position of the subject
Fig. 3.4 The magnitude of the aMAE measured as a function of adapting velocity
Fig. 3.5 Grand average of the aMAE over all four subjects tested in Experiment 1 as a function of adapting velocity
Fig. 3.6 Spatial tuning of the aMAE for three subjects, and the average over all three subjects (Experiment 2, Condition 1)
Fig. 3.7 Magnitude of the aMAE in the case when the adapting and test regions were separated (Experiment 2, Condition 2)
Fig. 3.8 Magnitude of the aMAE as a function of frequency band in the case when the adapting and test stimuli had the same spectrum (Experiment 3, Condition 1)
Fig. 3.9 Frequency specificity of the aMAE for three subjects (Experiment 3, Condition 2)
Fig. 4.1 Demonstration of the McCollough effect
Fig. 4.2 a. Time sequence of stimuli for inducing the auditory contingent aftereffect. b. Detailed time sequence of adapting stimuli
Fig. 4.3 The magnitude of the auditory spectral contingent spatial motion aftereffect, in degrees per second, as a function of time after exposure, for five different subjects
Fig. 4.4 Average of the contingent aftereffect for both a rising and a falling pitch across all five subjects
Fig. 4.5 Comparison of decays of contingent and simple auditory motion aftereffects as a function of time after adaptation
Fig. 4.6 The magnitude of the intensity contingent spectral motion aftereffect, in octaves per second, as a function of time, for three subjects
Fig. 5.1 An example of long-term potentiation (LTP) in the perforant pathway of the hippocampus recorded in vivo
Fig. 5.2 Schematic diagram of a model for contingent aftereffects
Fig. 5.3 Time course of averaged magnetic resonance imaging (MRI) signal during adaptation to expanding or contracting concentric rings and test with stationary patterns
Fig. 5.4 Firing rate of a ganglion cell in the rabbit retina as a function of time in response to prolonged motion stimulation
Fig. A2.1 Plateau's spiral used in the experiment

LIST OF ABBREVIATIONS

aMAE  Auditory Motion Aftereffect
AI  Primary Auditory Cortex
AM  Amplitude-Modulated
AP  Action Potential
AVCN  Anteroventral Cochlear Nucleus
BOLD  Blood Oxygen Level-Dependent
CF  Characteristic Frequency
CN  Cochlear Nucleus
DAS  Dorsal Acoustic Stria
DCN  Dorsal Cochlear Nucleus
EE  Excitatory (contralateral)-Excitatory (ipsilateral)
EI  Excitatory (contralateral)-Inhibitory (ipsilateral)
EPSP  Excitatory Postsynaptic Potential
fMRI  Functional Magnetic Resonance Imaging
IAS  Intermediate Acoustic Stria
IC  Inferior Colliculus
ICC  Central Nucleus of the Inferior Colliculus
IE  Inhibitory (contralateral)-Excitatory (ipsilateral)
IPSP  Inhibitory Postsynaptic Potential
ITD  Interaural Time Difference
LTP  Long-Term Potentiation
IID  Interaural Intensity Difference
IPD  Interaural Phase Difference
HRTF  Head Related Transfer Function
LL  Lateral Lemniscus
LSO  Lateral Superior Olive
MAA  Minimum Audible Angle
MAMA  Minimum Audible Movement Angle
MGB  Medial Geniculate Body
MNTB  Medial Nucleus of the Trapezoid Body
MSO  Medial Superior Olive
NLL  Nucleus of the Lateral Lemniscus
PET  Positron Emission Tomography
PSS  Point of Subjective Stationarity
PVCN  Posteroventral Cochlear Nucleus
SC  Superior Colliculus
SOC  Superior Olivary Complex
vMAE  Visual Motion Aftereffect
VAS  Ventral Acoustic Stria

ACKNOWLEDGEMENTS

It is my greatest pleasure to thank my supervisor, Dr. Max S. Cynader, for being an unfailing fountain of knowledge, invaluable inspiration, intellectual guidance and consistent financial support. I am deeply indebted to my advisor, Dr. Nicholas V. Swindale, for his wisdom, invaluable insights, advice and inspiration. Without his generous technical and intellectual support, the work presented in this thesis would not have been possible. I would like to express my sincere gratitude to my committee members, Drs. Dietrich Schwarz and Pierre Zakarauskas, for their helpful suggestions and technical support. Special thanks are also due to Dr. Vincent Hayward and his colleagues in the Dept. of Electrical Engineering at McGill University for their contribution to the design and construction of the acoustic stimulation system, to Dr. Murray Hodgson in the Dept. of Mechanical Engineering for his help in the measurement of acoustical characteristics of the sound-proof chamber, and to Dr. David Stapells in the School of Audiology and Speech Sciences for providing equipment for the measurement of the subjects' pure tone audiograms. I would like to thank all of the subjects who kindly volunteered to participate in the present studies. Thanks are also expressed to all members of the Research Laboratory in the Dept. of Ophthalmology. I would also like to take this opportunity to thank my wife and my parents for their unconditional love, encouragement and support. This thesis is dedicated to them.

1 General Introduction

When, by selective adaptation, one can affect the perception of particular stimulus attributes in a modality, then it may be assumed that there exist neural mechanisms in that modality which are responsible for the detection and processing of these stimulus attributes.
By manipulating the degree of similarity between the adapting and test stimuli, the stimulus specificity of the adaptation and thus the response selectivity of underlying neural mechanisms may be inferred. In vision, many varieties of aftereffect elicited by selective adaptation have been observed. They have provided a powerful tool for visual scientists to investigate the neural mechanisms underlying a variety of visual processes. One of the best known examples is the visual motion aftereffect (vMAE), which is commonly referred to as the waterfall illusion (e.g. Wohlgemuth 1911; Gates 1934; Wade 1994). After prolonged exposure to an adapting stimulus moving in one direction (e.g. a waterfall), a subsequently viewed stationary test stimulus (e.g. a rock beside the waterfall) appears to move in the opposite direction. The vMAE has been extensively studied for more than a century and has been taken as psychophysical evidence for the existence of specialized motion detection channels in the visual system (Wohlgemuth 1911; Holland 1965; Wade 1994). Because inferences regarding neural processes can be based on selective adaptation experiments, the vMAE, together with other visual adaptation aftereffects, has been described as the psychologist's "microelectrode" (Frisby 1979).

Despite its prominent role in visual research, the adaptation aftereffect, the psychophysical "microelectrode", has not been fully exploited as a means of studying sensory processing in the auditory system. The auditory and visual systems share an important similarity at the processing level in that in both systems, signals from paired end-organs (the ears or the eyes) are compared, and small differences in these signals are used to extract information about stimulus location and motion (Stumpf, Toronchuk and Cynader 1992; Wightman and Kistler 1993; Grantham 1995).
This thesis presents the results of studies which use this valuable tool to study auditory adaptation aftereffects, with the goal of gaining a better understanding of signal processing in the auditory system.

This chapter is intended to provide a background for the studies described in this thesis. In the first section, an overview of our current understanding of how our auditory system localizes sound sources is presented. In the second section, studies of motion detection in the auditory system are reviewed. In the final section, by making analogies to the visual system, the strategies used in the present studies of auditory motion detection and signal processing are proposed.

1.1 Sound Localization

The ability to localize sound sources is important to almost all higher animals. It allows an animal to determine the position of objects of interest and facilitates the identification of these objects by directing visual attention to the appropriate location. While some cues for sound localization are derived by comparing signals at the two ears, others result from monaural processing of the signals. In the first part of this section, studies of cues for sound source localization are reviewed. The neural processing of these spatial cues in the central auditory system is discussed in the second part.

1.1.1 Cues for Sound Localization

Interaural Difference Cues

Much of our knowledge about sound localization came first from studies of pure tones due to their simplicity. If a pure tone is presented from one side of the head, it arrives at the nearer ear before it reaches the farther ear and is more intense at the nearer ear. Thus, there are two possible cues as to the location of the sound: the interaural time difference (ITD) and the interaural intensity difference (IID). For ongoing pure tones, ITDs are equivalent to interaural phase differences (IPDs). Owing to the physical nature of the sounds, the effectiveness of these cues for localization depends on frequency.
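The equivalence between ITDs and IPDs for ongoing pure tones, and the way it depends on frequency, can be made concrete with a small calculation. The sketch below is illustrative only: the function names are hypothetical, the 500 μs delay is an arbitrary example value, and the code simply applies the relation IPD = 360° × frequency × ITD before folding the result into a single cycle.

```python
def itd_to_ipd_deg(itd_s: float, freq_hz: float) -> float:
    """Unwrapped interaural phase difference (degrees) of a pure tone."""
    return 360.0 * freq_hz * itd_s


def wrapped_ipd_deg(itd_s: float, freq_hz: float) -> float:
    """IPD folded into [0, 360), i.e. the phase difference left after
    whole cycles are discarded."""
    return itd_to_ipd_deg(itd_s, freq_hz) % 360.0


if __name__ == "__main__":
    itd = 500e-6  # hypothetical 500 microsecond delay, for illustration only
    for freq in (500.0, 1000.0, 3000.0):
        print(freq, itd_to_ipd_deg(itd, freq), wrapped_ipd_deg(itd, freq))
```

For the same delay, a 500 Hz tone gives an IPD of about 90°, whereas a 3 kHz tone gives about 540°, which cannot be told apart from 180° once whole cycles are discarded.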
For low frequency tones (< 1.5 kHz), the sound wavelength can be several times larger than the size of the head. Due to the effect of diffraction, the IIDs established at the two ears are negligible. The ITDs/IPDs provide a major cue for horizontal localization. For high frequencies (> 3 kHz), however, the wavelength is short relative to the dimensions of the head. The IPDs would be ambiguous by steps of 360° (Kuhn 1987). But, at such high frequencies, the listener's head casts an effective acoustical "shadow" and thus substantial IIDs result. Therefore, the IIDs become the primary cue for high frequencies. The notion that localization is governed by ITDs at low frequencies and by IIDs at high frequencies is often referred to as the "duplex theory" of sound localization, which was formulated by Lord Rayleigh as early as the turn of the last century (Rayleigh 1907).

Predictions from a simplified geometric model of the head and actual acoustical measurements have quantified the dependence of these potential cues on the azimuth and frequency of sinusoidal sources. By treating the head as a rigid sphere with the ears represented as the endpoints of a diameter of the sphere (Woodworth 1938), the path length difference from a sound source to the two ears can be estimated (Fig. 1.1) (Kuhn 1987). Dividing this path length difference by the speed of sound in air, which is 340 m/sec, gives the ITD:

$$\mathrm{ITD} = \frac{r\sin\theta + r\theta}{v}$$

where v is the speed of sound in air, r is the radius of the sphere representing the head, θ is the azimuth of the source in radians, and the other parameters are as shown in Fig. 1.1.

Fig. 1.1 Schematic diagram illustrating the calculation of the path length difference, D = r sin θ + r θ, from a distant sound source at an angle θ to the two ears.

Figure 1.2 plots the ITD as a function of azimuth, predicted by the rigid sphere model and obtained from actual measurements by placing miniature microphones in a subject's ear canals (Feddersen et al. 1957; Kistler and Wightman 1992).
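The rigid sphere formula above is easy to evaluate numerically. The following is a minimal sketch, not part of the original analysis: the speed of sound is the 340 m/sec quoted in the text, but the head radius of 8.75 cm is an assumed typical value that the text does not give, and the function name is hypothetical. Azimuths beyond 90° are folded back onto the front quadrant, on the assumption that the spherical model is front/back symmetric.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, the value quoted in the text
HEAD_RADIUS = 0.0875    # m, ASSUMED typical adult value (not from the text)


def itd_rigid_sphere(azimuth_deg: float,
                     radius_m: float = HEAD_RADIUS,
                     speed_m_s: float = SPEED_OF_SOUND) -> float:
    """ITD in seconds for a distant source: (r*sin(theta) + r*theta) / v.

    Azimuths between 90 and 180 degrees are folded onto 0..90 degrees,
    assuming the front/back symmetry of a spherical head.
    """
    if not 0.0 <= azimuth_deg <= 180.0:
        raise ValueError("azimuth must lie between 0 and 180 degrees")
    folded = azimuth_deg if azimuth_deg <= 90.0 else 180.0 - azimuth_deg
    theta = math.radians(folded)
    return radius_m * (math.sin(theta) + theta) / speed_m_s


if __name__ == "__main__":
    # Tabulate the predicted ITD in microseconds every 30 degrees.
    for azimuth in range(0, 181, 30):
        print(f"{azimuth:3d} deg  {itd_rigid_sphere(azimuth) * 1e6:6.1f} us")
```

With the assumed radius, the predicted ITD at 90° is roughly 660 μs; choosing a different radius scales every value proportionally.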
The ITD increases monotonically as a sound source moves from 0° (directly in front of the head) to 90° (just opposite one ear), where it reaches its maximum. It then decreases symmetrically to zero as the sound source moves from 90° to 180° (directly behind the head).

Fig. 1.2 Interaural time difference as a function of azimuth (in degrees), predicted by the rigid sphere model (solid lines) and obtained from actual measurements (dotted lines) from a single subject using a wideband cross-correlation technique. After Wightman and Kistler (1993).

The spatial dependence of the IID has been measured at the entrances of, or from inside, the ear canals with probe tubes or miniature microphones when the sounds are presented from different locations (Feddersen et al. 1957; Shaw 1974; Middlebrooks, Makous and Green 1989). Figure 1.3 shows that the IID changes with azimuth in a more irregular fashion than the ITD does. In addition, the IID changes as a function of the frequency of the sinusoidal stimulus. As the frequency increases, the IID increases. Middlebrooks and his colleagues (1989) showed that at 4 kHz, the maximum IID measured at 90° azimuth was about 20 dB, and it rose to around 35 dB at 10 kHz.

Fig. 1.3 Interaural intensity differences (IIDs) for sinusoidal stimuli plotted as a function of azimuth (angle from directly ahead); each curve is for a different frequency. Adapted from Feddersen et al. (1957).

Lateralization studies have shown that interaural time and intensity differences are indeed detectable (Zwislocki and Feldman 1956; Mills 1960; Yost 1974; Grantham 1984; Yost and Dye 1988). In these experiments, sounds were presented through headphones. Usually, a fused sound image inside the head is perceived by the subject.
By introducing an interaural intensity or time difference, the fused sound image can be moved from the midline toward one ear along the interaural axis. For example, as the IPD increases toward 180°, the sound image is located closer to the ear that receives the tone first (leading in time). As the IPD exceeds 180°, the image is located on the other side of the head and closer to the ear lagging in time, and as the IPD approaches 360°, the image moves toward the midline. Similarly, if an IID is introduced to the two ears, the image is perceived nearer to the ear receiving the louder tone.

The lateralization experiments allow independent control of the interaural cues so that the ITD/IPD and IID thresholds can be precisely determined. Figures 1.4 and 1.5 display the IPD and IID thresholds as a function of frequency, respectively. The parameter next to each curve indicates the interaural phase (Fig. 1.4) or intensity (Fig. 1.5) difference of the reference tone which marks a given position on the interaural axis. As can be seen from Figure 1.4, the IPD threshold remains approximately constant up to frequencies of about 900 Hz and then increases. Above 1200 Hz to 1500 Hz, the auditory system is insensitive to IPD cues. For the sound image around the midline (the IPD of the reference tone is between 0° and 45°), the IPD threshold can be as low as 2°, which corresponds to 10 μs of interaural delay for some frequencies. As the image is moved away from the midline, the IPD threshold increases, suggesting that the binaural system is most sensitive to the IPD cues around the midline. The data from Figure 1.5 show that the IID threshold changes in the same way as the IPD threshold when the image is moved off the midline. When the image is in the middle of the head, the IID threshold is generally between 0.5 and 1 dB (except around 1000 Hz). These results demonstrate that the binaural system is highly sensitive to interaural temporal and intensity differences.

Fig. 1.4 Value of ΔIPD (interaural phase difference change) required for a P(C) of 75% discrimination as a function of frequency. The various curves represent the values of the reference IPD. 1.4a IPD less than 180°; 1.4b IPD greater than 180°. After Yost and Dye (1991).

Fig. 1.5 Value of ΔIID (interaural level difference change) required for a P(C) of 75% discrimination as a function of frequency. The various curves represent the values of the reference IID. After Yost and Dye (1991).

Studies of sound localization and spatial resolution have produced results consistent with the duplex theory. Stevens and Newman (1936) placed a subject on the roof of a building and presented tone bursts to the subject through a loudspeaker. For each presentation, the subject had to report the apparent direction of the sound source. Generally, the localization performance changed as a function of frequency (Fig. 1.6), although the subject frequently made front-back confusions. The errors of localization for pure tones were greatest around 3 kHz and decreased for lower and higher frequencies. These findings have been taken as support for the duplex theory. For low frequencies, the better performance was attributed to the IPD cues, and for high frequencies, to the IID cues. For intermediate frequencies (1.5 to 4 kHz), stimuli would be too high in frequency to produce effective IPD cues and too low in frequency to provide adequate IID cues.

Fig. 1.6 Average azimuth error as a function of the frequency (in Hz) of a tone pulse. The different symbols represent the results of successive replications of the measurements. (Stevens and Newman 1936).

In spatial resolution studies, Mills (1958) measured the minimum audible angle (MAA), which is the minimum arc between two sound sources in the horizontal plane that the subject can discriminate.
Results show that the acuity of localization is poorest for tones in the middle frequency range (1.5 to 3 kHz) and that spatial resolution degrades as the sound source is moved from the midline to the periphery (Fig. 1.7). The results have been confirmed by many more recent MAA studies (Harris 1972; Grantham 1986; Chandler and Grantham 1992). The degradation of spatial resolution in peripheral azimuths shown in these studies can be accounted for by the reduction of the magnitude of the slopes of the ITD and IID functions (Figs. 1.2 and 1.3) and the loss of spatial sensitivity (Figs. 1.4 and 1.5) as the sound source is moved to the listener's side (Hafter and DeMaio 1975; Hafter et al. 1977; Grantham 1995).

Fig. 1.7 Minimum audible angle between successive pulses of tone as a function of the frequency (in Hz) of the tone and the direction of the source. Different symbols mark source directions of 0°, 30°, 60° and 75°. (Mills 1972).

Although the duplex theory is valid for pure tones, it does not hold for complex, high-frequency sounds, for example, amplitude-modulated (AM) high-frequency sounds and highpass transients. For these sounds, it has been shown that the envelope may contain information that can be used for lateralization. Sensitivity to envelope delays has been demonstrated for AM stimuli, beating tones, narrow bandpass noises and highpass filtered clicks (Klumpp and Eady 1956; Yost, Wightman and Green 1971; Henning 1974, 1980; McFadden and Pasanen 1976; McFadden and Moffitt 1977; Nuetzel and Hafter 1981; Trahiotis and Bernstein 1986; Middlebrooks and Green 1990). Henning (1974) found that the detectability of interaural delays in the envelope of a 3900 Hz carrier modulated at a frequency of 300 Hz was about as good as that of interaural delays in a 300 Hz pure tone. Thus, the ITDs caused by interaural envelope delays can be used for localizing high-frequency complex sounds.
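The idea that the envelope of a high-frequency sound carries usable timing information can be illustrated with a toy signal. The sketch below is not Henning's method and is not a model of auditory processing: it builds a 3900 Hz carrier modulated at 300 Hz (the frequencies quoted above), delays the whole waveform at one ear by an assumed 250 μs, extracts a crude envelope by rectification and a moving average, and recovers the delay from the peak of the normalised cross-correlation of the two envelopes. The sampling rate, delay, window length and all function names are assumptions made for the demonstration.

```python
import math

FS = 48_000           # sampling rate in Hz (arbitrary choice for the demo)
CARRIER_HZ = 3900.0   # carrier frequency quoted from Henning (1974)
MODULATOR_HZ = 300.0  # modulation frequency quoted from Henning (1974)
DELAY_SAMPLES = 12    # ASSUMED interaural delay: 12 / 48000 s = 250 us
N = 2400              # 50 ms of signal per ear


def sam_tone(n_samples):
    """Sinusoidally amplitude-modulated tone with full modulation depth."""
    out = []
    for i in range(n_samples):
        t = i / FS
        env = 1.0 + math.cos(2.0 * math.pi * MODULATOR_HZ * t)
        out.append(env * math.sin(2.0 * math.pi * CARRIER_HZ * t))
    return out


def envelope(x, window=25):
    """Crude envelope: full-wave rectification plus a moving average."""
    rect = [abs(v) for v in x]
    half = window // 2
    return [sum(rect[i - half:i + half + 1]) / window
            for i in range(half, len(rect) - half)]


def best_lag(a, b, max_lag):
    """Lag of b relative to a (samples) with the highest normalised correlation."""
    span = range(max_lag, len(a) - max_lag)
    energy_a = sum(a[i] ** 2 for i in span)
    best, best_r = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        num = sum(a[i] * b[i + lag] for i in span)
        energy_b = sum(b[i + lag] ** 2 for i in span)
        r = num / math.sqrt(energy_a * energy_b)
        if r > best_r:
            best, best_r = lag, r
    return best


def estimate_envelope_delay():
    """Return the envelope delay (samples) of the lagging ear."""
    s = sam_tone(N + DELAY_SAMPLES)
    left = s[DELAY_SAMPLES:]  # leading ear
    right = s[:N]             # same waveform arriving DELAY_SAMPLES later
    return best_lag(envelope(left), envelope(right), max_lag=40)


if __name__ == "__main__":
    lag = estimate_envelope_delay()
    print(f"estimated envelope delay: {lag} samples = {lag / FS * 1e6:.0f} us")
```

The search range of ±40 samples is kept well below the 160-sample period of the 300 Hz envelope, so the correlation has a single peak within it and the estimate is unambiguous.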
This extends the original "duplex theory", in that the ITD cues are not only useful for the localization of low-frequency sounds, but also play a role in the localization of high-frequency complex sounds.

Spectral Shape Cues

Although interaural differences vary systematically with azimuth and thus provide useful localization cues in the horizontal plane, interaural cues are relatively constant for a sound source at front/back locations which are symmetric about the interaural axis, on a "cone of confusion" (Woodworth 1938), or in the median vertical plane. Thus, interaural difference cues are spatially ambiguous in these cases.

The role of the pinnae in sound localization was first proposed by Thompson (1882) over 100 years ago, but his proposal received little attention until the 1960s when Batteau published his work on the time-domain effect of the pinnae (Batteau 1967, 1968). Batteau (1967) proposed that the incident sound waves were reflected within the convolutions of the pinna and that the resulting time delay between the direct sound path and the reflection from the pinna structure would vary according to the angle of incidence of sound. Thus those delays might provide localization cues for both elevation and azimuth. Hebrank and Wright (1974a, b) recognized that the time delay from the pinna reflection would be too short (a few microseconds) to be used by the human auditory system. They suggested that directionally selective spectral filtering might be decoded and used as spatial cues by human listeners. The spectral effects caused by pinna (and head and torso, perhaps) filtering are often referred to as spectral shape cues or pinna cues (Middlebrooks and Green 1991).

The importance of spectral shape cues to sound localization has been revealed by a large number of psychoacoustical studies. Studies show that they are crucial for localization in the vertical plane (Roffler and Butler 1968; Gardner and Gardner 1973; Hebrank and Wright 1974a, b; Oldfield and Parker 1984, 1986).
Gardner and Gardner (1973) found that occlusion of the pinna cavities by filling them with moulded rubber plugs degraded vertical localization ability. The largest effects occurred for broadband noises and for bandpass noises with center frequencies above 6 kHz. Blauert (1969/1970) showed that when narrowband (1/3 octave wide) noises were presented in the median vertical plane, the perceived elevation depended on the center frequency rather than the actual sound location. Additionally, monaural localization in the vertical dimension can be comparable to binaural localization with only slight degradation (Hebrank and Wright 1974a; Oldfield and Parker 1986; Butler, Humanski and Musicant 1990). These results indicate that the spectral shape cues are the major cues for vertical localization.

In addition to their role in vertical localization, spectral shape cues might contribute to localization in the horizontal plane. Butler and his colleagues demonstrated that when a listener was asked to localize narrowband noises (1 kHz wide) in the horizontal plane, the apparent horizontal location varied according to the center frequency of the sounds (4 to 14 kHz) (Butler and Flannery 1980; Musicant and Butler 1985). Under monaural localization conditions where spectral shape cues are presumably the only horizontal localization cues, subjects can achieve better than chance performance (Belendiuk and Butler 1975; Middlebrooks 1997). The spectral shape cues have been shown to help resolve front/back confusions where interaural differences are the same (Musicant and Butler 1984; Oldfield and Parker 1984; Kistler and Wightman 1992). They are also responsible for the externalized quality of the sound image when signals are presented over headphones (Plenge 1974; Wightman and Kistler 1989a, b; 1992; Blauert 1996).
The direction-dependent filtering effects of the pinnae, head and torso have been characterized by the so-called "head-related transfer function" (HRTF), which is the ratio of the spectral amplitude at the sound source to the amplitude of the sound at the eardrum measured with miniature microphones in the ear canals. The HRTF shows a complex pattern of peaks and notches which varies systematically with the direction of the source (Shaw 1974; Hebrank and Wright 1974b; Bloom 1977a; Watkins 1978; Butler 1987; Middlebrooks, Makous and Green 1989; Wightman and Kistler 1989a, b; Middlebrooks and Green 1990). For example, Hebrank and Wright (1974b) and Butler and Belendiuk (1977) found that for a broadband noise, a notch in the spectrum moved to higher frequencies as the elevation of the sound source changed from 30° below to 30° above the horizontal plane.

The spectral features in the HRTF can be used to judge the location of a sound source given the assumption that the power spectrum of the source is relatively smooth and flat. Butler (1971) demonstrated that for a pure tone stimulus, the perceived direction correlated well with the direction which would give a peak in the HRTF at that frequency, suggesting that peaks in the spectrum are an important cue for localization. In an experiment done with spectral notches, Bloom (1976b) asked his subjects to adjust the center frequency of a band-reject filter so that the perceived elevation of a sound closely matched that of a sound source in the median plane. Results showed that the frequencies selected by the subjects corresponded to the notches in the HRTF associated with the source located at corresponding elevations. Similar results were reported by Hebrank and Wright (1974b) and Watkins (1978).
In summary, psychophysical studies have shown that interaural temporal and intensity differences and the spectral characteristics of a sound produced by pinna filtering all provide information about the location of a sound source, and these cues can be used by our auditory system to determine the location of the sound source. Using a "virtual sound source technique", Wightman and Kistler (1992) studied the relative importance of these localization cues. In this technique, the sounds presented over headphones were processed by digital filters, derived from the subject's HRTFs measured at a large number of spatial positions, so as to contain the major naturally occurring directional cues associated with free-field sources (Wightman and Kistler 1989a, b). The listener's judgement of the apparent positions of virtual sources was shown to be indistinguishable from that of corresponding free-field sources (Wightman and Kistler 1989b). By making the ITD cues correspond to a different source direction from that signaled by the IID and spectral cues of a broadband noise, it was found that the subject's judgement of the apparent position was governed by the ITD cues. However, with low frequencies removed by high-pass filtering, responses were dominated by the IID and pinna cues. For sounds in our environment, which usually contain low frequencies, it seems likely that the ITD cues play the major role.

1.1.2 Neural Coding of Localization Cues

Ascending Pathway in the Central Auditory System

Figure 1.8 depicts a schematic diagram of the ascending pathway of the central auditory system. As shown in the diagram, the auditory nerve enters the cochlear nucleus (CN) and bifurcates into an ascending branch that innervates the anteroventral division of the CN (AVCN) and a descending branch that innervates both the posteroventral (PVCN) and the dorsal (DCN) divisions of the CN.
Projections from these separate regions within the CN spread out in highly organized ways and form three main bundles to reach other nuclei in the ascending pathway. These three bundles are the dorsal acoustic stria (DAS), the intermediate acoustic stria (IAS) and the ventral acoustic stria (VAS). The DAS is essentially a crossed pathway by which cells in the DCN project to the nuclei of the lateral lemniscus (NLL) and the central nucleus of the inferior colliculus (ICC). The IAS originates mainly in the PVCN and projects to the periolivary nuclei bilaterally, whereas the VAS, arising from cells in the AVCN and PVCN, reaches the major cell groups in the superior olivary complex (SOC) on both sides. The SOC occupies a pivotal position in the ascending auditory system, for it is the first site where the outputs from the two CNs converge and where interaural cues for sound localization are detected. It is composed of at least three nuclei: the medial superior olive (MSO), the lateral superior olive (LSO) and the medial nucleus of the trapezoid body (MNTB). Both the MSO and LSO receive bilateral inputs from the CN, form local circuits and send their axons to the IC via the lateral lemniscus (LL). The MSO projects to the ipsilateral IC, whereas the LSO projects to both the ipsilateral and contralateral ICs. Projections from the IC then lead to the medial geniculate body (MGB), and from there finally to the auditory cortex.

Fig. 1.8 Schematic diagram of the ascending pathway of the central auditory system. (After Pickles 1988).

Coding of Interaural Time Differences

Fifty years ago, Jeffress (1948) first proposed that interaural delays were detected by neurons that compare the arrival time of discharges from monaural afferents derived from each ear.
According to Jeffress' "coincidence detection" model, the output of such a neuron is a function of the degree of synchrony between discharges of its monaural afferents: maximum (minimum) discharge rate is elicited at ITDs that produce the highest (lowest) degree of temporal coincidence of the monaural afferent spike trains. The basic premises of the coincidence-detection model have been confirmed by neurophysiological recordings from the MSO, where the initial processing of ITDs takes place (Goldberg and Brown 1969; Yin and Chan 1990). Neurons in the MSO receive excitatory monaural afferents that convey temporal information from the two ears (Warr 1966; Cant and Casseday 1986; Schwartz 1992). These neurons usually have low characteristic frequencies (CF) (less than 2.5 to 3 kHz) and respond preferentially during restricted portions of the cycle of tonal stimuli applied to each ear. In response to binaural stimulation, the discharge of the neurons depends on the relative time of arrival of the inputs from the two monaural pathways (Fig. 1.9) (Goldberg and Brown 1969; Yin and Chan 1990).

Fig. 1.9 Interaural time delay curves (left) and monaural period histograms (right) for a cell in the MSO. The mean interaural phase is given by φd, and the mean monaural phases by φc and φi for contralateral and ipsilateral responses, respectively. The arrows labeled C and I indicate the monaural response rates for contralateral and ipsilateral stimulation, respectively. (After Yin et al. 1997).

In the experiments by Goldberg and Brown (1969) and by Yin and Chan (1990), coincidence was directly tested by comparing the ITD at which the maximal response is obtained under binaural stimulation with that predicted from the phase of responses to monaural stimulation of each ear. Figure 1.9 shows typical responses obtained from an MSO neuron of the cat (Yin and Chan 1990).
The neuron's discharge rate varied cyclically as a function of ITD, with a period equal to that of the stimulus tone (Fig. 1.9, left). The monaural period histograms (Fig. 1.9, right) indicate the relative timing of inputs to the MSO cell from each ear. The values φc and φi represent the mean phase angles of inputs from the contralateral and ipsilateral ears, respectively. If the responses were in accord with the coincidence model, then the maximum binaural response should occur when the two monaural phases were made equal by introducing an appropriate ITD, so that the inputs from both sides arrived at the MSO simultaneously. In the example shown in Figure 1.9, this predicted that coincidence should occur when the ipsilateral stimulus was delayed by 0.15 cycles (i.e. φc − φi = 1.02 − 0.87), or 150 µs for a 1000-Hz tone. The central peak of the delay curve (Fig. 1.9, left) occurred when the ipsilateral stimulus was delayed by 100 µs, and the mean interaural phase φd of the delay curve averaged over the four cycles was 90 µs. These were close to the values predicted from the monaural responses and thus provided direct evidence for the coincidence model. At levels above the MSO, ITD sensitivity has also been demonstrated. In the inferior colliculus (which is the direct target of the MSO projection), the ITD sensitivity of neurons has been extensively studied. In their classic study, Rose and his colleagues (1966) showed that the discharge of IC neurons with low characteristic frequencies was a cyclic function of the interaural delay, similar to that found in the MSO. Using pure tones of different frequencies, they found that the IC neurons exhibited a "characteristic delay", at which neurons responded with the same relative amplitude over different frequencies. This characteristic delay has been proposed to reflect the fixed physiological delay of the inputs between the two ears.
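The coincidence prediction worked through above for Fig. 1.9 amounts to converting a difference between two monaural phases, expressed in cycles, into a time delay at the stimulus frequency. The short sketch below reproduces that arithmetic with the values quoted in the text; the function name and its argument names are hypothetical and are introduced only for illustration.

```python
def phase_difference_to_delay_us(phase_contra_cycles, phase_ipsi_cycles, tone_hz):
    """Convert a monaural phase difference (in cycles) to a delay in microseconds."""
    delta_cycles = phase_contra_cycles - phase_ipsi_cycles
    period_us = 1e6 / tone_hz  # one stimulus period in microseconds
    return delta_cycles * period_us

# Values quoted in the text: mean monaural phases of 1.02 and 0.87 cycles, 1000-Hz tone.
predicted_delay_us = phase_difference_to_delay_us(1.02, 0.87, 1000.0)

print(round(predicted_delay_us))  # 150
```

The predicted delay of 150 µs can then be set against the observed values of 100 µs (central peak) and 90 µs (mean interaural phase) reported for the same cell.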
Subsequent investigations have confirmed the concept of a characteristic delay (Yin and Kuwada 1983, 1984; Yin, Chan and Irvine 1986). In addition to pure tones, ITD sensitivity to broadband and narrowband noises has been demonstrated for a large sample of IC neurons (Yin, Chan and Irvine 1986; Chan, Yin and Musicant 1987; Yin, Chan and Carney 1987). Yin and his colleagues found that all low-frequency neurons that are sensitive to ITDs in pure tones are also sensitive to ITDs in noise stimuli. The majority of these IC neurons have cyclic delay functions with decreased peak amplitudes at larger ITDs. The sensitivity of neurons to ITDs of broadband noise stimuli was usually studied with identical noises presented to the two ears. When two independent (i.e. uncorrelated) noise signals were used as stimuli, one to each ear, the IC neurons' responses were no longer modulated as a function of ITD (Yin, Chan and Carney 1987; Yin and Chan 1988). By varying the degree of correlation between the two input signals, Yin and his colleagues found that there was a monotonic positive relation between the degree of modulation of the noise delay response and the correlation between the input noises. Similar results were reported in a more recent study of neurons in the MSO (Yin and Chan 1990). These results suggest that encoding of ITD by auditory brainstem neurons involves a process equivalent to cross-correlation, which extends the model beyond the specific case of coincidence detection (Yin, Chan and Irvine 1986; Chan, Yin and Musicant 1987; Yin, Chan and Carney 1987; Yin and Chan 1990). Psychophysical studies indicate that ITDs in the envelope of AM high-frequency stimuli can be detected with a sensitivity comparable to that for the detection of ITDs in pure tones at the modulation frequency (e.g. Henning 1974; Nuetzel and Hafter 1976). The neural mechanisms underlying this sensitivity have been investigated (Yin et al. 1984; Batra et al. 1989).
High-frequency neurons have been found in the IC of cats and rabbits that are sensitive to ITDs in the envelopes of AM tones with carrier frequency at the neurons' CFs. Their delay functions are cyclic, with a period equal to that of the modulation frequency, indicating that these neurons are sensitive to the IPD of the modulating waveform. The ITD sensitivity at levels above the IC, including the MGB and primary auditory cortex (A1), seems to reflect processing at lower levels. IPD-sensitive neurons have been described in the MGB (Aitkin and Webster 1972; Aitkin 1973; Calford 1983; Ivarsson, de Ribaupierre and de Ribaupierre 1988) and A1 (Brugge et al. 1969; Brugge and Merzenich 1973; Benson and Teas 1976; Altman 1978; Reale and Brugge 1990). Similar to the neurons found in the MSO and IC, the response-delay functions of these MGB and A1 neurons are typically periodic, and the periods of these functions correspond to the periods of the tonal stimuli. Some of these neurons displayed "characteristic delays" (Aitkin and Webster 1972; Brugge and Merzenich 1973; Benson and Teas 1976; Reale and Brugge 1990). Their maximum responses occurred at the same delay regardless of tonal frequency. In addition, neurons in the MGB have been reported that are sensitive to differences in the time of arrival of clicks to the two ears in the cat and squirrel monkey (Altman, Syka and Shmigidina 1970; Starr and Don 1972).

Coding of Interaural Intensity Differences

Neural sensitivity to IID has been demonstrated at different levels along the auditory pathway, including the superior olivary complex (e.g. Boudreau and Tsuchitani 1968, 1970; Caird and Klinke 1983; Tsuchitani 1988), the dorsal nucleus of the lateral lemniscus (Brugge, Anderson and Aitkin 1970), the inferior colliculus (e.g. Roth et al.
1978; Semple and Aitkin 1979; Caird and Klinke 1983; Semple and Kitzes 1987; Irvine and Gago 1990), the superior colliculus (SC) (Wise and Irvine 1983; Yin, Hirsch and Chan 1985), the medial geniculate body and the primary auditory cortex (e.g. Aitkin and Webster 1972; Brugge and Merzenich 1973; Phillips and Irvine 1981). The major class of neurons showing sensitivity to IIDs comprises those receiving predominantly excitatory input from one ear and predominantly inhibitory input from the other (IE and EI neurons). At the level of the SOC, these neurons are concentrated in the LSO and receive excitatory inputs from the ipsilateral AVCN and inhibitory inputs from the contralateral AVCN via the ipsilateral MNTB (IE neurons). These IE neurons have high CFs. In contrast, the majority of such neurons at higher levels are EI neurons (see Irvine 1992 for review). In the case of the ICC, this reversal is in part due to the fact that high-frequency IE neurons in the LSO mainly project to the contralateral ICC (Glendenning and Masterton 1983). The response of these IE and EI neurons changes monotonically as a function of IID. The typical form of the function is a sigmoid, varying from the excitatory response to the stimulus from one ear to the inhibitory response to the stimulus from the other ear (Fig. 1.10). The slope of the IID function and the IID at which the response switches from excitation to inhibition vary for different neurons.

Fig. 1.10 Topographic organization of EI cell sensitivity to interaural intensity difference in deep layers of the superior colliculus (SCD). The upper figure is a tracing of a sagittal section showing the location along the rostrocaudal axis of the superior colliculus of each dorsoventrally oriented electrode penetration. The horizontal bars indicate the depths at which the neurons were isolated.
The lower figures present normalized IID-sensitivity functions for the neurons isolated in the penetrations identified by the corresponding numbers. The stimulus in each case was a broadband noise burst at 60 or 70 dB SPL. (After Irvine 1992).

[Figure 1.10 labels: caudal to rostral; contralateral azimuths, median plane, ipsilateral azimuths; abscissa: interaural intensity difference (dB), contralateral level re ipsilateral]

Coding of Spectral Shape Cues

As discussed in Section 1.1.1, certain acoustic features derived from the effect of pinna filtering, such as steep notches in the HRTF, may provide important information for sound location. But the neural substrates for encoding these spectral features remain largely unclear. Neurophysiological studies suggest that the fusiform cells in the DCN may play a role in the detection of spectral notches in the HRTF (Spirou and Young 1991; Young et al. 1992; Imig et al. 2000). In the cochlear nucleus, the response characteristics of neurons can be classified into five types (I to V) according to the frequency-intensity response pattern (or the pattern of the receptive field) (Fig. 1.11). This form of analysis describes the frequencies and intensities of single tones that elicit excitatory or inhibitory output of a neuron. The fusiform cells typically exhibit type III and type IV receptive fields (Fig. 1.11), which are characterized by inhibitory sidebands (Young and Brownell 1976; Shofner and Young 1985). Due to the inhibitory sidebands, these neurons have a high degree of frequency selectivity. Small changes in the frequency position of the notches in broadband noises may result in a large change in the output of these neurons. Sutherland (1991) showed that sectioning the dorsal acoustic stria (i.e. the output pathway of the DCN) affected the cat's ability to localize the elevation of a sound source.

Fig. 1.11 Receptive field typology of cochlear nucleus units.
A schematic illustration of the five receptive field classes, based on mapping the cell's excitatory and inhibitory response regions in a two-dimensional frequency-intensity plane. Excitation is indicated by "+" and suppression by "-". (After Rhode and Greenberg 1992).

Projections from the DCN lead to the contralateral IC. Studies have shown that some neurons in the IC, SC and A1 of the cat are sensitive to monaural spectral cues (King, Moore and Hutching 1993; Carlile and King 1993; Samson et al. 1993). Most spatially tuned neurons maintain their azimuthal selectivity for broadband noises after monaural occlusion or cochlear ablation. However, when tested with pure tones, their selectivity is lost, suggesting that their spatial tuning relies on monaural spectral cues.

1.2 Auditory Motion Detection

In the natural environment, motion is a common phenomenon. It is an essential feature of the natural setting with which all organisms must deal. Aside from the obvious utility for avoiding collisions, there is a survival advantage for an organism in detecting motion in its environment as quickly and efficiently as possible. For example, the ability to detect motion enables an animal to capture moving prey efficiently and to escape from an approaching predator successfully. It is now well established that the visual system possesses specialized motion detecting channels. Abundant psychophysical evidence indicates that the human visual system contains specialized channels for detecting three-dimensional motion and that these channels are distinct from those that process static position or depth (Sekuler and Ganz 1963; Sekuler and Pantle 1967; Pantle and Sekuler 1968; Richards and Regan 1973; Beverley and Regan 1973; Regan and Beverley 1973a; Regan, Beverley and Cynader 1979).
In addition, neurophysiological studies have shown that single neurons in the visual cortex are highly sensitive to the direction of stimulus motion in three-dimensional space (Hubel and Wiesel 1959, 1962; Regan and Beverley 1973b; Cynader and Regan 1978, 1982; Newsome et al. 1986) and have differential sensitivity to stimulus velocity (Hubel and Wiesel 1965; Cynader and Regan 1978). The direction-sensitive neurons in the primary visual cortex (V1) project either directly or via visual area V2 to area MT (middle temporal area, or V5) (Zeki 1974), where 80 to 90% of neurons exhibit motion sensitivity (Albright 1984). The crucial role of area MT in visual motion detection has been demonstrated by several lesion studies. These studies showed that lesions in area MT could result in severe impairment of visual pursuit movement, significant reduction in motion direction discrimination and even selective motion blindness (Dursteler and Wurtz 1988; Newsome and Pare 1988; Zihl, von Cramon and Mai 1983; Vaina et al. 1990). While specialized motion sensitive channels are well established in vision, how motion is determined in the auditory system is still uncertain. In principle, there are two different interpretations for the perception of sound source motion by the auditory system (Middlebrooks and Green 1991). One, the "snapshot theory", is that the auditory system simply uses the same neural mechanism that is used to localize stationary sound sources to register the positions of the sources at different times, and then compares them to extract the velocity. In this case, a specialized system for motion detection is neither needed nor employed. The second interpretation is that there exist specialized motion detection channels in the auditory system. These specialized channels are most sensitive to dynamic aspects of interaural intensity difference (IID), interaural time difference (ITD) or time-varying spectral information available to each ear.
Lappin and his colleagues (1975) were probably the first to propose that source velocity might be a "directly perceived attribute of moving stimuli" (p. 393). In this section, I first review the literature on psychophysical and neurophysiological studies pertinent to auditory motion detection. Then, following a comparison of physiological and psychophysical data from the auditory and visual systems, I propose that, like the visual system, the auditory system contains specialized mechanisms for detecting sound source motion.

1.2.1 Psychophysical Studies

In the psychophysical studies of detection of sound source motion, various paradigms have been employed. These paradigms used either real (Perrott and Musicant 1977; Perrott et al. 1979; Perrott and Tucker 1988) or simulated moving sound as the stimulus (Altman and Viskov 1977; Grantham 1986; Saberi and Perrott 1990). The simulated sound source movement was produced by systematically varying the intensity or relative phase of tones presented from two loudspeakers. Listeners were typically asked to discriminate between a stationary and a moving sound source (Perrott and Musicant 1977) or to discriminate between directions or velocities of motion (Harris and Sergeant 1971; Grantham 1986; Perrott and Tucker 1988; Perrott, Costantino and Ball 1993). In these studies, the minimum audible movement angle (MAMA), which is the minimum arc swept by a moving source that can be detected as moving (Perrott and Musicant 1977), was measured and compared with the minimum audible angle (MAA), which is the minimum arc between two static sources that can be discriminated (Mills 1958). While the MAA is the measure of acuity for stationary acoustic targets, the MAMA is the measure of auditory resolution for a sound source in motion. Some studies have shown that MAMAs obtained in the tasks of
detection and discrimination of sound source motion vary in a similar way to the MAAs for static sound sources, when source azimuth or frequency is changed (Altman and Viskov 1977; Perrott and Musicant 1977; Grantham 1986; Perrott and Tucker 1988; Saberi and Perrott 1990). For example, MAMAs in the horizontal plane are smallest (i.e. performance is best) around 0° azimuth and increase with increasing azimuth (Harris and Sergeant 1971; Grantham 1986). When the frequencies of stimulus tones are in the range from 1.3 to 2 kHz, over which MAAs are largest (i.e. the acuity for static sound location is poorest) (Mills 1958), MAMAs are largest (i.e. performance is worst) (Perrott and Tucker 1988). MAMAs are smaller for broadband noises than for pure tones (Grantham 1986; Saberi and Perrott 1990; Litvosky and Macmillan 1994). The similarities between the performances of subjects in the dynamic and static sound localization tasks suggest that the auditory system may use the same mechanisms for static sound localization and motion detection and thus provide support for the "snapshot theory". In a study of discrimination of sound source motion, Grantham (1986) provided additional evidence for the "snapshot theory". In his study, the differential velocity thresholds were determined. Grantham found that for a given extent of movement of a reference stimulus, about 4° to 10° additional extent is required for threshold discrimination of two moving stimuli, independent of stimulus duration or reference velocity. This suggests that judgements of motion may derive from comparing sound source positions at the beginning and end of its trajectory, not from velocity per se, and thus depend on the mechanisms described by the "snapshot theory". On the other hand, other psychophysical experiments have pointed to the existence of specialized motion detection channels in the auditory system.
Some studies have demonstrated that even in situations in which binaural cues are unavailable, discrimination of sound source motion can be almost as good as when binaural cues are provided. Harris and Sergeant (1971), using sound sources of low velocity (2.8°/sec), found that the monaural MAMA (2° to 4°) was comparable to the binaural MAMA (1° to 3°) for broadband stimuli, suggesting that subjects were able to detect motion without binaural cues as well as with them. Since localization of static sounds in the monaural condition is much poorer than in the binaural condition (the absolute error is 30° to 40° for monaural listening, whereas it is 5° to 10° for binaural listening) (Oldfield and Parker 1986), the result suggests that the auditory system may use specialized mechanisms rather than a simple position comparison system to detect sound source motion. The existence of specialized motion mechanisms in the auditory system has been supported by further psychophysical studies. Perrott et al. (1979) found that the difference threshold was three times smaller for discriminating the velocity of a real moving sound source than for motion simulated with trains of dichotic clicks with changing interaural time differences (Altman and Viskov 1977) (5.3 vs. 17°/sec), indicating that the auditory system has a special sensitivity to motion. Perrott and Marlborough (1989) attempted to test the "snapshot theory" directly by comparing MAMAs for detection of motion of a continually emitting sound source with those obtained when the sound source was active only at the two end-points of its trajectory. In one condition, the broadband stimulus was continuously presented from a loudspeaker moving at a velocity of 20°/sec. The duration of the presentation was varied adaptively to track the MAMA threshold.
In the other condition, two brief (10 msec) noise bursts were emitted in succession from the moving loudspeaker, which marked only the two end-points of the trajectory. The interval between the two bursts was also changed adaptively in order to obtain the MAMA. The "snapshot theory" would predict similar MAMAs measured under the two conditions. However, the results showed that the MAMAs obtained when the sound source was moving were significantly smaller than when only the end-points of the trajectory were marked. Therefore, the information arriving between the two end-points seemed to contribute to the resolution of motion. In another study by Perrott et al. (1993), it was found that subjects were able to discriminate between accelerated and decelerated movements even when the end-points of the trajectory and stimulus duration were the same. Although these findings do not necessarily indicate that there exist specialized motion detection mechanisms in the auditory system, they suggest that motion detection involves more than just a simple position comparison mechanism.

1.2.2 Neurophysiological Studies

Neurons that respond selectively to the direction and rate of sound source motion have been reported in the auditory cortex (A1) (Sovijarvi and Hyvarinen 1974; Altman 1987; Stumpf, Toronchuk and Cynader 1992; Ahissar et al. 1992; Poirier et al. 1997), in the superior colliculus (SC) and the medial geniculate body (MGB) (Altman 1968; Altman, Syka and Shmigidina 1970; Rauschecker and Harris 1989), and in the inferior colliculus (IC) (Altman 1971; Yin and Kuwada 1983; Spitzer and Semple 1991, 1993) of many animal species, including monkey, cat and gerbil. In a free-field study in which sound was presented by a hand-held loudspeaker, Sovijarvi and Hyvarinen (1974) found neurons in the cat auditory cortex that responded to moving sounds, but not to stationary sounds.
In another free-field study, Ahissar and colleagues (1992) reported that 35% of neurons in their samples from the monkey auditory cortex were sensitive to sound source movement. These neurons responded more strongly when a sound source was moving in a particular direction than when the source was stationary or moving in the opposite direction. In some studies, simulated sound movement was used as the stimulus, in which IPDs or IIDs were varied dynamically. Altman (1987) studied the responses of neurons in the cat auditory cortex to modulation of IPD and found that some A1 neurons responded selectively to the direction of phase change. In studies investigating the influences of temporal variation of IPD on neuronal response in the IC of cat and gerbil, Spitzer and Semple (1991, 1993) showed that, for the majority of IPD-sensitive IC neurons, the responses to time-varying IPD did not simply track responses to static IPD, but reflected changes in IPD. When the IPD varied in opposite directions over the same range, the response profiles of these neurons did not overlap, but shifted accordingly in opposite directions. Using amplitude-modulated (AM) tonal stimuli presented dichotically or diotically, Stumpf and colleagues (1992) observed that some A1 neurons of the cat responded differently to simulated moving sound and stationary sound. Although these neurons fired only at the onset of stationary sound, they discharged throughout AM ramps. Some neurons preferred low rates of AM ramp, while others preferred higher rates of change. More recently, Poirier and colleagues (1997) have used static and simulated moving sounds to study positional and directional selectivity of high-frequency A1 neurons of the cat. In their study, simulated movement of the sound source was generated by successively activating loudspeakers in an array spanning an arc of 156° in azimuth.
Their results showed that some A1 neurons (26%) were selectively sensitive to one direction of apparent motion. These neurons responded at least twice as strongly to one direction as to the other. In addition, 54% of neurons in their samples were tuned to apparent speed. In summary, psychophysical evidence for the existence of specialized motion detection mechanisms in the auditory system is controversial. More convincing psychophysical data are needed in order to clarify this issue. However, physiological findings, although not conclusive, point to the existence of neurons which are specifically able to signal information about sound source motion. Comparison of physiological and psychophysical data suggests that the auditory and visual systems share some similarities in motion detection. Physiologically, in both the auditory and visual cortex, direction- and speed-specific neurons have been found. The physiological parallels between the two systems are supported by psychophysical evidence indicating that the source velocity estimation capacities of the two systems are similar (Waugh, Strybel and Perrott 1979; Perrott et al. 1979). In these studies, the auditory and visual velocity estimates were analyzed as a function of source velocity (Fig. 1.12). Statistical analysis (ANOVA) showed that there was no significant difference between the visual and auditory estimates. Both estimates appeared to be power functions (i.e. the logarithm of the estimated velocity increased linearly with the logarithm of the actual source velocity) with very similar best-fit exponents (0.90 and 0.87 for the auditory and visual functions, respectively). The nearly identical velocity functions of the two modalities suggest that the auditory system can judge the velocity of moving stimuli as well as the visual system. Thus, perception of motion in the auditory and visual systems appears to depend on similar mechanisms of motion detection, i.e.
auditory motion perception might be mediated by specialized motion detection channels in the auditory system.

Fig. 1.12 Auditory and visual estimates as a function of source velocity (After Waugh et al. 1979).

1.3 Lessons from Studies of Visual Adaptation Aftereffects

In studies of the visual motion aftereffect (vMAE), the duration of the vMAE and the apparent velocity of the test stimulus have commonly been used as measures of its magnitude. It has been shown that the duration of the vMAE increases with the duration of the adapting stimulus and can last as long as 10 to 30 sec (Sekuler and Ganz 1963; Herschenson 1989). Herschenson (1993) reported that the duration of the vMAE increased in proportion to the square root of adaptation time up to 90 sec. By using a nulling method, in which the subject continuously adjusts the velocity of a test stimulus moving in the direction opposite to that of the vMAE until the test stimulus appears stationary, the apparent velocity of the vMAE over time can be recorded. It has been found that, within a limited range, the velocity of the vMAE increases with the velocity of the adapting stimulus (Wohlgemuth 1911; Taylor 1963; Brigner 1986). The vMAE has been shown to be restricted to the retinal area stimulated by the adapting motion (Wohlgemuth 1911). Its spatial-frequency specificity has been indicated by the fact that the vMAE is strongest when the spatial frequencies of the adapting and test gratings are identical. Its strength falls to about one half if the adapting and test spatial frequencies differ by one octave (Brigner 1982; Over et al. 1973; Cameron, Baker and Boulton 1992). In addition, the vMAE displays interocular transfer (Wohlgemuth 1911; Holland 1957; O'Shea and Crassini 1981).
It can be observed with one eye following adaptation of the other eye, but only at about 50% of the strength of the monocular vMAE (Moulden 1980; Anstis and Duncan 1983; Wade, Swanston and Weert 1993).

A commonly held explanation for the vMAE is based on the existence of motion detectors in the visual system which are selectively tuned to motion in a particular direction (Barlow and Hill 1963; Sekuler and Pantle 1967). During prolonged exposure to one direction of motion, neurons sensitive to that direction of motion adapt (Barlow and Hill 1963; Marlin, Hasan and Cynader 1988; Saul and Cynader 1989 a, b; Giaschi et al. 1993). Then, when a static stimulus is presented, the activities of the neurons sensitive to the opposite direction dominate, causing the stationary object to be perceived as moving in the direction opposite to that of the adapting motion. By analogy, if a motion aftereffect can be demonstrated in audition, it would suggest that there exist specialized motion detection mechanisms in the auditory system.

In the studies to be described here, a series of psychophysical experiments was conducted to determine whether a reliable and robust auditory motion aftereffect (aMAE) could be observed. After demonstrating the existence of the aMAE, I further studied its characteristics in the spatial and frequency domains, which reflect the performance characteristics of specialized motion detection channels in the auditory system.

In vision, we have long known about a special class of aftereffects called contingent aftereffects, which was first reported by McCollough (1965). In McCollough's prototypical experiment, after viewing repeated presentation of gratings of particular colors paired with particular orientations, subjects reported negative color aftereffects, dependent on the orientation of stripes in the gratings.
That is, after adaptation, when an achromatic grating of the same orientation that was previously associated with a particular color is viewed, the grating appears to be tinged with the complementary color to that which was viewed during adaptation. Many other visual contingent aftereffects have subsequently been demonstrated. These contingent aftereffects are especially interesting since two stimulus attributes (e.g. colour and orientation) which are normally perceived independently can become linked by a relatively brief adaptation. In addition, these contingent aftereffects persist for an extraordinary length of time. Various observers (e.g. Mackay and Mackay 1975; Stromeyer and Mansfield 1970) have shown that the contingent aftereffects can still be observed several weeks after a relatively short presentation of the adapting stimulus. Although the exact neural mechanisms underlying the McCollough effect are still unknown, it is believed to reflect some fundamental aspects of learning and sensory coding (Barlow 1990). Equivalent contingent aftereffects in sensory modalities other than vision, however, have not been reported before. In this thesis, experiments were performed to determine whether contingent aftereffects could be demonstrated in the auditory system.

I hope that studies presented in this thesis can provide further insight into auditory motion perception and improve our understanding of signal processing in the human auditory system.

2 The Acoustical Stimulation System

In order to carry out psychoacoustical experiments, an acoustical stimulation system was designed which allows a sound source to move with a given velocity (up to 30°/sec) and along a specified trajectory. The stimulation system consisted of a robot arm and control circuits and software. The robot arm was designed and constructed by Dr. Vincent Hayward and his colleagues in the Department of Electrical Engineering at McGill University.
Control circuits for the robot arm were designed and connected by the author with the help of Dr. Pierre Zakarauskas of Wavemakers Inc., while the control software was developed by Dr. Nicholas Swindale in the Department of Ophthalmology at the University of British Columbia.

In this system, a loudspeaker (LCS-150, Labtec) was mounted at the 'tip' (the end effector) of the robot arm, which is a servo-controlled mechanism (Fig. 2.1). The robot arm can move the loudspeaker smoothly and quietly. During movement, the trajectory of the loudspeaker is constrained to lie on the surface of a sphere with a radius of 0.8 meters. When the subject's head is located at the center of this sphere, the loudspeaker is always oriented toward the subject and the distance between the loudspeaker and the subject's head is constant.

Fig. 2.1 The robot arm used in the present studies. The arm is positioned in an acoustically treated soundproof chamber. A loudspeaker (LCS-150, Labtec) mounted on the arm can move smoothly and quietly on the surface of a sphere with a radius of 0.8 m. In the picture, a subject is seated in the chamber with his head at the center of the sphere.

The robot arm was designed according to the following requirements: maximum coverage of space, low noise, high speed and acceleration, safety, low bulk, and low complexity and cost (V. Hayward, personal communication). The resulting design used a 'five bar' closed loop spherical mechanism in which all five joint axes meet at one end point, which is the center of concentric spheres. It is a property of the mechanism that all points of its links are constrained to move on the surfaces of concentric spheres. Thus, no part of the mechanism can penetrate the space in which the subject is located. If the actuators have coinciding axes, the workspace can cover the entire sphere, save the "antipodes".
For reasons of mechanical simplicity and reduction of bulk, the two actuated joints were placed 20° apart in the system, thereby approaching the "coinciding" condition (V. Hayward, personal communication). The area of the surface that can be reached by the loudspeaker is about 63% of the total surface area of the sphere, covering almost all the subject's frontal hemifield (Fig. 2.2).

Fig. 2.2 Workspace of the loudspeaker. a. Definitions of θ and Φ in spherical coordinates. b. Two-dimensional representation of the workspace.

The base link of the system is mechanically grounded and supports two actuated and instrumented joints. The other three joints are free. The base link is supported by a rigid overhead gantry and is located behind and above the subject's head. It includes two soundproofed motion reduction boxes driven by DC motors via elastomeric belts, with a reduction of about 1:60. This maximizes acceleration capability (roughly proportional to the square root of the ratio of the mechanism's inertia to that of the motors) and greatly reduces the noise generated by movement of the mechanism (V. Hayward, personal communication). The frequency spectrum of the noise produced by the motion of the robot arm is shown in figure 2.3.

A control experiment showed that the noise produced by the movement of the robot arm did not provide any direction cues. In this experiment the loudspeaker was turned off and the arm moved randomly at one of eight velocities (1, 3, 5 and 7°/sec, either leftward or rightward) with one of eight start positions (1, 3, 5 and 7°, either on the left or right of the midline). Therefore, there was a total of 64 trials (8 velocities x 8 start positions) in the test. This test condition is identical to that used in the adaptation experiments (see Chapter 3) except for the absence of any loudspeaker-generated sound.
The results showed that for each test velocity, the subject's response was at chance level, i.e., the percentage of the left or right responses was about 50%.

Fig. 2.3 (vertical axis: noise, dB SPL).

The servo-motors were interfaced to a control computer (IBM PC Pentium-150) through a PC motion control interface card (MFIO-3A, Precision MicroDynamics Inc.) (Fig. 2.4). For each specified trajectory of the loudspeaker, a computer program developed by Dr. Swindale generated an array of theta and phi positions. The program then employed a look-up table calculated by Dr. Zakarauskas to convert this array from spherical coordinates to arm-angle coordinates (i.e. the positions of the stepping motors), which was then used to move the arm (N. Swindale, personal communication).

The loudspeaker was connected to a sound card (Sound Blaster 16, Creative Labs, Inc.) which was programmed to generate different kinds of sound, including broadband noise, bandpass noise, frequency- and amplitude-modulated noises, pure tones and clicks. The operation mode of the Sound Blaster was monaural, using 16-bit outputs at a sampling rate of 40 kHz. These sounds were synchronized with the movement of the robot arm. In this way, the sound is activated after a given latency when the loudspeaker moves into a given spatial region.

This acoustical stimulation system was positioned in an acoustically treated soundproof chamber (2.65 m x 2.40 m x 2.00 m), which was modified from an IAC sound-insulated chamber (Industrial Acoustics Company, Inc.). Figure 2.5 shows the transfer function of the chamber measured with the MLSSA Acoustical Measurement System (DRA Laboratories).

Taking advantage of this unique sound stimulator, psychoacoustical experiments were performed to study the aMAE and the auditory contingent aftereffects.

(Diagram labels: Digital Encoders; Brushed DC Motors; MFIO-3A PC Motion Control Interface Card; IBM PC Computer; Sound Blaster Card; BTA-28V-6A Linear Power Amplifier; Speaker; Robot arm.) Fig.
2.4 Schematic diagram of the acoustical stimulus system. A specially designed robot arm tracks the reference trajectories specified by a controlling PC computer through a motion control interface card via joint servo control. In order to generate a real moving sound, a loudspeaker connected to a Sound Blaster card is mounted on the 'tip' of the robot arm. Sounds generated by the sound card are synchronized with the movement of the robot arm.

Fig. 2.5 Transfer function of the modified sound-insulated room measured with MLSSA system. Horizontal axis: log frequency (Hz).

3 Studies of the Auditory Motion Aftereffect

Compared with studies of the vMAE, studies of analogous auditory motion aftereffects (aMAE) are relatively scarce (e.g. Grantham and Wightman 1979; Grantham 1989; Reinhardt-Rutland 1992; Ehrenstein 1994). In these studies, simulated moving sound was used as the stimulus. In the Reinhardt-Rutland study (1992), simulated moving sound was generated by simultaneously changing, in opposite directions, the sound levels of 1000-Hz sinusoids presented to the two ears over headphones. After two minutes of initial adaptation, a changing-loudness aftereffect was reported for the monaural test stimulus by all three subjects tested. A sound with a constant intensity was perceived to change in loudness in the direction opposite to that during adaptation. However, when tested binaurally, no subjects observed any motion aftereffect or loudness aftereffect. Ehrenstein (1994) reported that following 90 seconds of listening to a simulated moving sound produced by dynamically varying interaural intensity or time differences, a stationary test stimulus was displaced in the direction opposite to that of adaptation, but no apparent motion was observed.
However, when simulated moving sound was used as both the adapting and the test stimulus (Grantham and Wightman 1979) or just as the test stimulus (Grantham 1989), an aMAE was found, though it lasted only 1 ~ 3 seconds and was not observed in all subjects tested. In the study by Grantham and Wightman (1979) in which sound source motion was simulated by continuously changing interaural intensity and phase differences of a tonal stimulus presented through headphones, an aMAE was reported for 500-Hz sinusoids, but was virtually absent for 2000-Hz pure tones. When the velocity of an adapting stimulus (a 500-Hz low-pass noise) was below 30°/s, an aMAE was observed only for two out of four subjects tested (Grantham 1989).

Simulated moving sound may be a less-than-ideal stimulus to use because it provides the auditory system with incomplete motion cues. For example, when simulated moving stimuli are presented through headphones and the motion is achieved by dynamically varying the interaural intensity differences or the interaural time differences, the modifying effects of the head and pinna on the incoming signals are bypassed altogether. Thus, the auditory motion detection channels may not be adequately stimulated. However, a real moving sound can provide a more natural stimulus and potentially more localization cues, including a time-varying frequency spectrum, as well as time-varying intensity and phase information. In the studies presented here, a real moving sound was used as both the adapting and test stimulus, which was generated by the sound stimulation system described in Chapter 2. Using this stimulus, a robust and repeatable aMAE was demonstrated in all (7/7) subjects tested.

This chapter describes the results from three experiments. Experiment 1 confirmed the existence of the aMAE and its dependence on adapting velocity. Experiments 2 and 3 examined the tuning and specificity of the aMAE in the spatial and frequency domains, respectively.
The relationship of the present findings to previous studies of the aMAE is discussed in the final section. The main content of this chapter has been published in Perception and Psychophysics (Dong et al. 2000).

3.1 Experiment 1: The Auditory Motion Aftereffect and Its Dependence on Adapting Velocity

This experiment was designed to demonstrate the aMAE and to analyze the magnitude of the aMAE as a function of adapting velocity.

3.1.1 Methods

Subjects

Four subjects (T.S., R.C., P.Z. and C.D.) with clinically normal hearing, as assessed by pure tone audiograms, participated in the experiment. All subjects had behavioral pure tone audiometric thresholds of 15 dB HL or better at octave frequencies from 250 to 8000 Hz in both ears (Appendix 1). These subjects, aged 23 ~ 39 years, were recruited from members of the Ophthalmology Research Lab or were students at the University of British Columbia. Except for one subject (P.Z.) and the author (C.D.), all the subjects were unaware of the purpose of the experiments.

Procedure

A two-alternative forced choice (2-AFC) paradigm was used to measure the aMAE. In the initial adaptation period (160 ~ 200 seconds), a sound source repeatedly swept a specified arc in a single direction. Between two successive adapting sweeps, there was a silent interval of about 1 second, which was required to reposition the loudspeaker to the start position and to accelerate it to the desired speed (Fig. 3.1). By varying the number of adapting sweeps, the total duration of actual sound motion during the initial adaptation was kept constant (120 seconds) for all of the adapting velocities tested. During the test period, the subject's task was to indicate the direction (i.e. left or right) of a brief test sound by pressing one of two buttons. The velocity of the test stimulus was randomized from trial to trial. In order to maintain the subject in an adapted state, following each test, there was a re-adaptation period (8 ~ 10
seconds), within which the subject was exposed to the adapting sound again for a total moving sound exposure of 6 seconds (Fig. 3.1).

The range of test velocities was always centered on zero and was chosen so that the extremes of the range were almost always correctly identified by the subject. A psychometric function was generated by plotting percent "left" responses to the different test velocities presented during a block of trials. Probit analysis (Finney 1971) was used to estimate the 50% response rate on the resulting psychometric function (Fig. 3.2). This is the stimulus velocity that sounds stationary to the subjects. If there is an aMAE, the test velocity that appears to be stationary (i.e. the point of subjective stationarity or PSS) will shift in the direction of the adapting velocity. In my studies, the value of the PSS was used as a measure of the magnitude of the aMAE. For each adapting velocity, at least three values of the PSS were obtained, each from a single block of trials. The mean aMAE and the standard error of the mean were then calculated based on these values.

Fig. 3.1 A. Time sequence of stimuli. See text for details. B. Speaker velocity as a function of time during one adapting sweep. In this example, the adapting velocity is 10°/s. After 0.2 seconds of acceleration, the speaker reaches its target velocity of 10°/s. Then, the speaker moves at this speed (10°/s) for 3 seconds with the sound on (highlighted by a filled bar), followed by 0.2 seconds of deceleration. C. Speaker position as a function of time during the adaptation period. Taking an adapting velocity of 10°/s as an example, since the adapting region is an arc of 30° centered at 0° azimuth, the duration of one sweep is 3 seconds (highlighted by filled bars).
Note that between sweeps, there is a silent interval of 1 second, consisting of 0.2 seconds of deceleration of the speaker immediately following each sweep, a period of 0.6 seconds for repositioning the speaker to its start position, and 0.2 seconds of acceleration to get the speaker to the velocity of 10°/s immediately preceding the next sweep. (Panel labels: A. initial adaptation, 160 ~ 200 s; test, 1 or 0.5 s; response; re-adaptation, 8 ~ 10 s. B. acceleration 0.2 s, sound on 3 s, deceleration 0.2 s. C. speaker position between -15° and 15°.)

Close inspection of the experiments done by Grantham and his colleague (1979, 1989) reveals a potential drawback in their experimental design. In their studies, test stimuli moved symmetrically about the midline. Therefore, the direction of motion of a test stimulus could be decided based on the start or end position of the test stimulus relative to the midline. For example, if the start position of a test stimulus is on the right side of the midline, there may be a bias to report the test stimulus moving to the left. The subjects in these studies might have used the localization cues for the start or end position of a test stimulus to judge the direction of the moving test stimulus. To overcome this potential drawback in our experiments, for each start position, a test stimulus could move randomly in either direction, left or right. Thus, the start and/or end positions could no longer provide direction cues for the subjects, except in extreme cases (i.e. when the end positions were at ±15°).

All tests were conducted in the darkened soundproof chamber described in Chapter 2. Subjects were seated at the center of the sphere defined by the motion of the loudspeaker, and were instructed to keep their eyes closed and maintain a steady upright posture through the course of experimentation. A headrest was provided to prevent the subjects from tilting their heads either sideways or forward.
Before data were collected, at least two hours of training was provided to each subject until the performance of the subject was stable. The order of trial blocks with different adapting velocities was randomized in order to minimize any order effects. Each session lasted about one hour and typically included three individual trial blocks with different adapting velocities, for example, +20°/s, 0°/s (control) and -20°/s, where positive velocities indicate that the direction of the moving stimulus is to the right, while negative velocities indicate that the direction is to the left. After each block of trials, a 5-minute break was provided in order to prevent the subject from fatiguing. For each subject, the experiment was spread over three weeks, during which 12 to 15 test sessions were carried out.

Stimuli

Broadband noise (500 ~ 14,000 Hz, Fig. 3.3) was used as both the adapting and test stimulus. The average level of each sound, presented from a stationary loudspeaker at 0° azimuth at the subject's ear level and measured with a sound level meter (Quest Electronics, Model 1800) at the subject's head position (0.8 m distant), was about 75 dB SPL(A). In each block of trials, the adapting velocity was constant. During adaptation, the adapting stimulus repeatedly traversed an arc of 30° (±15° centered on the midline) at the subject's ear level with one of six velocities (±10, ±15 and ±20°/s). The duration of each sweep was 3, 2 and 1.5 seconds for adapting velocities of 10, 15 and 20°/s, respectively. In order to keep the total duration of exposure to the moving sound equal (120 seconds), different numbers of adapting sweeps (40, 60 and 80) were presented for different adapting velocities (10, 15 and 20°/s, respectively). During testing, the test stimulus (1 second of broadband noise) moved at one of eight velocities (±1, ±3, ±5 and ±7°/s) with one of eight possible start positions (±1, ±3, ±5 and ±7°).
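The stimulus schedule described above follows from simple arithmetic, which the following Python sketch makes explicit. It is an illustration only, not the control software used in the experiments: the constant and function names are hypothetical, and the 1-second gap between sweeps is the approximate silent interval given in the Procedure.

```python
from itertools import product

ARC_DEG = 30.0             # adapting arc, +/-15 deg about the midline
TOTAL_EXPOSURE_S = 120.0   # moving-sound exposure during initial adaptation
SILENT_GAP_S = 1.0         # approximate silent interval between sweeps (assumed exact here)


def sweep_plan(speed_deg_per_s):
    """Return (sweep duration in s, number of sweeps, total adaptation period in s)."""
    duration = ARC_DEG / speed_deg_per_s
    n_sweeps = round(TOTAL_EXPOSURE_S / duration)
    period = n_sweeps * (duration + SILENT_GAP_S)
    return duration, n_sweeps, period


# One plan per adapting speed used in Experiment 1.
plans = {speed: sweep_plan(speed) for speed in (10, 15, 20)}

# Test set: every signed velocity paired with every signed start position.
signed_values = [sign * magnitude for magnitude in (1, 3, 5, 7) for sign in (-1, 1)]
test_trials = list(product(signed_values, signed_values))  # (velocity deg/s, start deg)

for speed, (duration, n_sweeps, period) in plans.items():
    print(f"{speed} deg/s: {duration} s per sweep, {n_sweeps} sweeps, {period} s in total")
print(f"{len(test_trials)} test presentations per block")
```

Under these assumptions the total adaptation period comes out at 160, 180 and 200 seconds for the three speeds, which matches the 160 ~ 200 second range quoted in the Procedure.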
Therefore, a total of sixty-four test presentations were given in each block of trials (8 test velocities x 8 start positions), and the whole adapting range (-15° ~ 15° azimuth) was covered by the motion of these test stimuli. For each adapting velocity, there was a control trial block, in which the adapting stimulus was a stationary sound (i.e. with a velocity of 0°/s) presented directly in front of the subject (0.8 m distant) with the same time course as the corresponding moving adaptors.

Data Table
Stimulus       -7     -5     -3     -1     1      3      5      7
correct         0      0      0      0     3      5      7      8
# trials        8      8      8      8     8      8      8      8
percent         0      0      0      0     37.5   62.5   87.5   100
Probit Analysis: Threshold = 1.9459 ± 0.30608; Mean = 2.2794 ± 0.52564; Chi square = 1.3505

Fig. 3.2 An example illustrating probit analysis that was used to estimate the 50% response rate on the psychometric function derived from the subject's responses. Data in this example are drawn from responses of a subject who adapted to a rightward sweeping sound (adapting duration: 2 minutes; adapting velocity: 20°/s). "Mean" in the figure represents the spatial velocity that was judged moving leftward 50% of the time by the subject.

3.1.2 Results

In figure 3.4, the magnitude of the aMAE, measured as the point of subjective stationarity (PSS), from all four subjects in this experiment is plotted as a function of adapting velocity. Different panels represent results from different individual subjects. The results showed clear aMAEs for all the subjects for adapting velocities of 10, 15 and 20°/s. After adaptation, the PSS shifted in the expected direction. That is, when the adapting velocity was positive (i.e. to the right), the test velocity that appeared to be stationary changed in a positive direction. Thus, a positive velocity was judged by the subject as stationary and correspondingly a stationary sound was heard with a negative velocity, i.e. as moving to the left.

Fig.
3.4 The magnitude of the aMAE measured as a function of adapting velocity. Different panels show the mean sizes of the aMAE for different subjects. The velocity of 0°/s indicates the control condition in which the adapting sound was presented from a stationary loudspeaker directly in front of the subject (0.8 m distant). Note that there is a clear aMAE for all the subjects tested. The magnitude of the aMAE increased up to the highest velocity tested. In this and the following figures, the error bar indicates ±1 standard error in the mean of 3 ~ 4 separate blocks of trials unless otherwise specified.

In figure 3.5, the aMAE averaged over all four subjects tested in the experiment is displayed as a function of adapting velocity. In the top panel, as in figure 3.4, the magnitude of the aMAE is measured as the PSS; whereas in the bottom panel, in order to analyze the aMAE in terms of gain, the aMAE is expressed as a percentage of the adapting velocity. Although, as expected, in the top panel of figure 3.5 the magnitude of the aMAE increased with the adapting velocity, the results in the bottom panel show that the gain of the aMAE decreased from 17% to 12.8% as the adapting velocity increased from 10°/s to 20°/s.

Fig. 3.5 Grand average of the aMAE over all four subjects tested in Experiment 1 as a function of adapting velocity. The top panel shows the mean magnitude of the aMAE (the mean of the panels shown in Fig. 3.4). The bottom panel displays the size of the aMAE as a percentage of the adapting velocity, i.e. the gain of the aMAE. The error bar indicates ±1 standard error of the mean across four subjects. Note that although the magnitude of the aMAE increased with increases in adapting velocity, the gain of the aMAE tended to decrease as the adapting velocity increased.
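As a worked illustration of the two measures plotted in Fig. 3.5, the following Python sketch estimates a PSS from the response counts tabulated in Fig. 3.2 and converts it to a gain. It is a sketch under stated assumptions rather than the analysis actually used: it fits a cumulative Gaussian by maximum likelihood over a coarse grid instead of the iterative probit procedure of Finney (1971), it assumes that the row labelled "correct" in Fig. 3.2 holds the response counts that define the psychometric function, and all function and variable names are hypothetical. The fitted value should fall near, but need not equal, the Mean reported in Fig. 3.2.

```python
from math import log
from statistics import NormalDist

# Counts read from the data table of Fig. 3.2 (adaptation to a 20 deg/s rightward sweep).
# Assumption: the "correct" row holds the counts that define the psychometric function.
TEST_VELOCITIES = [-7, -5, -3, -1, 1, 3, 5, 7]   # deg/s
RESPONSE_COUNTS = [0, 0, 0, 0, 3, 5, 7, 8]
TRIALS_PER_VELOCITY = 8
ADAPTING_VELOCITY = 20.0                          # deg/s


def neg_log_likelihood(mu, sigma):
    """Binomial negative log-likelihood of a cumulative-Gaussian psychometric function."""
    curve = NormalDist(mu, sigma)
    total = 0.0
    for velocity, count in zip(TEST_VELOCITIES, RESPONSE_COUNTS):
        # Clamp probabilities away from 0 and 1 so the logarithms stay finite.
        p = min(max(curve.cdf(velocity), 1e-9), 1.0 - 1e-9)
        total -= count * log(p) + (TRIALS_PER_VELOCITY - count) * log(1.0 - p)
    return total


def fit_pss():
    """Grid-search the 50% point (mu) and spread (sigma) that best fit the counts."""
    mu_grid = [-7.0 + 0.05 * i for i in range(281)]     # -7 ... 7 deg/s
    sigma_grid = [0.2 + 0.05 * i for i in range(117)]   # 0.2 ... 6 deg/s
    return min(
        ((mu, sigma) for mu in mu_grid for sigma in sigma_grid),
        key=lambda pair: neg_log_likelihood(*pair),
    )


def gain_percent(pss, adapting_velocity):
    """Express the aftereffect as a percentage of the adapting velocity."""
    return 100.0 * pss / adapting_velocity


pss, spread = fit_pss()
print(f"PSS = {pss:.2f} deg/s, gain = {gain_percent(pss, ADAPTING_VELOCITY):.1f}%")
```

Read in the other direction, the same gain formula implies that the grand-average gains of 17% and 12.8% quoted above correspond to PSS shifts of about 1.7°/s and 2.56°/s at adapting velocities of 10°/s and 20°/s, respectively.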
3.2 Experiment 2: The Spatial Tuning and Specificity of the aMAE

When we adapt to a sensory stimulus, the neural mechanisms responsible for processing the stimulus are excited or activated. The greater the excitation or activation during adaptation, the more the underlying mechanisms are affected and the greater the aftereffect that may be induced. Visual studies have shown that the vMAE is tuned, since it is greatest when the adapting and test stimuli share the same spatial and temporal properties and presumably stimulate the same neural mechanisms (Blakemore and Campbell 1969; Cameron, Baker and Boulton 1992; Bex, Verstraten and Mareschal 1996). In this experiment, the tuning of the aMAE was studied in the spatial domain. By comparing the aMAEs obtained when the adapting and test regions overlapped with those obtained when the two loci were separate, the hypothesis that the aMAE is spatially specific was tested.

3.2.1 Methods

Three subjects (M.H., T.S. and C.D.) participated in Experiment 2, two of whom were also subjects in Experiment 1. The experimental procedure in this experiment was the same as that in Experiment 1. In this experiment, an arc of 70° (±35° centered on the midline) at the subject's ear level was divided into 7 equal sub-regions, each 10° in size (-35° to -25°, -25° to -15°, -15° to -5°, -5° to 5°, 5° to 15°, 15° to 25°, 25° to 35°). Two experimental conditions were used. In Condition 1, in order to study the dependence of the aMAE on spatial location, the adapting and test regions were the same and the magnitude of the aMAE was measured as a function of spatial region. In Condition 2, in order to determine if the aMAE is spatially specific, the adapting and test stimuli were presented in separate spatial regions, and the magnitude of the aMAE was analyzed as a function of the distance between the two separate loci.
In this condition, the region swept by the adapting stimulus was a 10° arc (±5° centered on the midline), whereas the test region was chosen from one of six separate regions (-35° to -25°, -25° to -15°, -15° to -5°, 5° to 15°, 15° to 25°, 25° to 35°).

Only one adapting velocity, 20°/s, was used in this experiment. The test stimulus was always a 0.5-second presentation of broadband noise, moving at one of six velocities (±1, ±3 and ±5°/s) with one of four start positions, two of which were 1° on either side of the center of the test region and the other two 3°. Taking the central spatial region (-5° ~ 5°) as an example, the start positions of the test stimulus were ±1° and ±3° azimuths. During each run, eight responses were collected for each test velocity, two at each start position. As in Experiment 1, a psychometric function was constructed by plotting percent "left" responses (out of 8) as a function of test velocity. For each condition, three or four runs were obtained, and the mean and standard error of the aMAE magnitude for each condition were computed from these three or four functions.

3.2.2 Results

Results obtained from Conditions 1 and 2 for all the subjects in this experiment are displayed in figures 3.6 and 3.7, respectively. In these figures, results from different subjects are shown in different panels. In figure 3.6, the magnitude of the aMAE was plotted as a function of spatial region. Results showed that within the spatial region tested (-35° ~ 35°), the magnitude of the aftereffect was relatively independent of position when the adapting and test locations were the same. That the aMAE is spatially specific is clearly shown in figure 3.7, in which the magnitude of the aMAE was plotted as a function of location of the test region. When the test region coincided with the adapting region (-5° ~ 5°), the aMAE elicited was largest.
As the distance between the test and adapting regions increased, the aMAE diminished, resulting in an inverted 'V' shape of the spatial tuning function. Since the measure of the space constant can provide some information about the size of the receptive field of a "channel", a space constant was determined from the spatial tuning curves in figure 3.7 for each subject and for the average over all three subjects tested in this experiment, by finding the best fitting Gaussian function and taking the standard deviation as a measure of the space constant. These were 15.3°, 14.9° and 19.5° for subjects T.S., M.H. and C.D., respectively. The space constant obtained from the average spatial tuning curve of these three subjects was 16.4°.

Fig. 3.6 Spatial tuning of the aMAE for three subjects, and the average over all three subjects (Experiment 2, Condition 1). In this condition, the adapting and test regions were always the same, and the magnitude of the aMAE was measured as a function of spatial region. Note that in this experimental condition, within the spatial region tested, the aMAE was largely invariant with azimuth.

Fig. 3.7 Magnitude of the aMAE in the case when the adapting and test regions were separated (Experiment 2, Condition 2). Data are shown for three individual subjects as well as the average over all three subjects. Note that when the distance between adapting and test regions increased, the magnitude of the aMAE decreased.

3.3 Experiment 3: The Frequency Tuning and Specificity of the aMAE

Sounds of different frequencies are analyzed in different frequency channels. Previous experiments used broadband noises (500 ~ 14,000 Hz) to elicit the aMAE. In this experiment, the frequency dependence of the aMAE was studied. Using different bandpass noises as the adapting and the test stimulus, the hypothesis that the aMAE is frequency-specific was tested.
3.3.1 Methods

Three subjects (M.L., J.Q. and C.D.) with clinically normal hearing were tested, one of whom participated in Experiments 1 and 2. The experimental procedure, adapting region, velocities and start positions of the test stimulus, and stimulus intensity were the same as those in Experiment 1. As in Experiment 2, only one adapting velocity (20°/s) was tested. Instead of using broadband noise, five 1-octave band-pass noises with different center frequencies (fc) (i.e. 500 Hz, 1000 Hz, 2000 Hz, 4000 Hz and 8000 Hz) were used as stimuli. There were two adaptation conditions. In the first condition, in order to study the frequency tuning of the aMAE, the adapting and test stimuli had the same spectrum. In the other condition, in order to investigate if the aMAE is frequency specific, the adapting and test stimuli occupied different frequency bands. A total of eleven combinations of the adapting and test stimuli were tested in these two conditions (i.e. 500/500 Hz, 1000/1000 Hz, 2000/2000 Hz, 4000/4000 Hz, 8000/8000 Hz, 500/2000 Hz, 500/8000 Hz, 2000/500 Hz, 2000/8000 Hz, 8000/500 Hz and 8000/2000 Hz, where the first figure indicates the center frequency of the adapting stimulus and the second indicates that of the test stimulus). Due to limited time available from each subject, each band-pass noise and each combination of adapting and test stimuli was tested only once. Thus, each point on the psychometric function was based on 8 responses.

3.3.2 Results

Results from this experiment are shown in figures 3.8 and 3.9. In figure 3.8, the magnitude of the aMAE was plotted as a function of frequency band. The results showed that the magnitude of the aMAE was slightly larger for low frequency (< 1000 Hz) sound than for middle (2000 ~ 4000 Hz) or high frequency (> 4000 Hz) sound, which is consistent with the results reported by Grantham and Wightman (1979).
Sound in the middle frequency range (2000 ~ 4000 Hz) was least efficient in inducing the aMAE, although the differences were not large. The same pattern was shown in all three subjects tested.

Figure 3.9 compares the aMAE measured with nine combinations of the adapting and test stimuli (i.e. 500/500 Hz, 500/2000 Hz, 500/8000 Hz, 2000/500 Hz, 2000/2000 Hz, 2000/8000 Hz, 8000/500 Hz, 8000/2000 Hz and 8000/8000 Hz, where the first figure indicates the spectral content of the adapting stimulus and the second one indicates that of the test stimulus). In three panels (top right, bottom left and bottom right) of figure 3.9, low (fc: 500 Hz), middle (fc: 2000 Hz) and high (fc: 8000 Hz) frequency sounds served as test stimuli, respectively. In these three panels, the adapting stimulus was a low (fc: 500 Hz), middle (fc: 2000 Hz) or high (fc: 8000 Hz) frequency sound. The results showed that the aMAE was largest when the adapting and test stimuli were identical in frequency, that is, when both of them were low, middle or high frequency sound. When the adapting and test stimuli were different in frequency, the magnitude of the aMAE decreased. The extent of the decrease depended on the difference in frequency between the adapting and test stimuli. The larger the difference in frequency, the weaker the aMAE, suggesting that the aMAE is frequency specific. These results are summarized in the top left panel of figure 3.9, in which the size of the circle represents the magnitude of the aMAE averaged over all three subjects tested in nine combinations of adapting and test stimuli. In addition to the frequency specificity of the aMAE, which is reflected by the larger sizes of the circles along the diagonal (lower-left corner to upper-right corner), the pattern in this panel shows an interesting asymmetry of the frequency distribution of the aftereffect.
The circles below the diagonal are larger than those above it, indicating that the aMAE is stronger when the test stimulus is higher in frequency than the adapting stimulus and weaker in the opposite condition.

Fig. 3.8 Magnitude of the aMAE as a function of frequency band in the case when the adapting and test stimuli had the same spectrum (Experiment 3, Condition 1). Different panels show results for three individual subjects as well as the average over all three subjects (lower right panel). Note that the aMAE was slightly larger for low frequency sound than for high and middle frequency sounds. The middle frequency stimulus was least efficient in inducing the aMAE.

Fig. 3.9 Frequency specificity of the aMAE for three subjects. In the top right, bottom left and bottom right panels, the center frequency of the test stimulus is 500 Hz, 2000 Hz and 8000 Hz, respectively. In each panel, the legend lists combinations of adapting and test stimuli, where the first figure indicates the center frequency of the adapting stimulus, and the second that of the test stimulus. Note that the aMAE was largest when the adapting and test stimuli were identical in frequency. When the adapting and test stimuli were different in frequency, the magnitude of the aMAE decreased. The larger the difference in frequency, the weaker the aMAE. In the top left panel, results for all three subjects are summarized. The size of the filled circle represents the magnitude of the aMAE averaged over all three subjects tested in nine combinations of adapting and test stimuli. The numbers next to the circles show the magnitude of the aftereffect for each combination. Note that the sizes of the circles below the diagonal (lower-left corner to upper-right corner) are larger than those above it, showing that the aMAE for the combinations of 500/2000 Hz, 500/8000 Hz and 2000/8000 Hz is stronger than for the 2000/500 Hz, 8000/500 Hz and 8000/2000 Hz combinations.
3.4 Discussion

Previous studies of the aMAE have used simulated moving sound as the stimulus. Simulated sound source motion has been generated by modulating either IID (Reinhardt-Rutland 1992; Ehrenstein 1994) or ITD (Ehrenstein 1994) or both of these cues (Grantham and Wightman 1979; Grantham 1989). When the simulated moving stimulus used only one of these dynamic cues, no aMAE was elicited (Reinhardt-Rutland 1992; Ehrenstein 1994). Instead, a monaural changing-loudness aftereffect or a displacement aftereffect was observed. When both ITD and IID cues were used (Grantham and Wightman 1979; Grantham 1989), an aMAE was found at low frequencies (500 Hz), but for only some (2/4) of the subjects. In contrast, using real moving sound as the stimulus, an aMAE was observed for all the subjects in my sample and for all conditions tested, including relatively low adapting velocities, and low and high frequencies. This suggests that a robust aMAE depends on the stimulus carrying as many motion cues as possible. A real moving stimulus provides monaural spectral cues as well as ITD and IID cues. Thus, in order to have the aMAE explicitly expressed, not only do the auditory motion detection channels need to be adequately adapted, but it may also be necessary to use a stimulus rich in motion cues as a test.

More recently, Grantham (1998) published a study in which moving sound stimuli were first recorded using two microphones placed in the ear canals of a KEMAR manikin. These recorded sounds were later played back to the subject through headphones during experiments. Thus, in his experiments, sound waves reaching the subject's tympanic membrane comprised the dynamic cues provided by the KEMAR manikin's head and pinnae, such as time-varying ITD, IID and monaural spectral shape cues. Four of the five subjects tested in his study reported that the sounds were externalized at a distance of less than 30 cm from their heads. One subject, however, perceived the stimuli to be inside her head.
In the Grantham study, the magnitude of the aMAE was defined as a percentage of the maximum possible area between the two psychometric functions obtained when the adapting stimuli had the same speed but moved in opposite directions. Using this kind of virtual moving sound as the stimulus, Grantham demonstrated an aMAE for both low (1000-Hz lowpass filtered) and high (5000 ~ 8000 Hz bandpass filtered) frequency signals within a range of velocities from 10°/s to 180°/s. No significant effect of velocity on the magnitude of the aMAE was found. He also showed that the adaptor and probe must have the same spectrum and share the same azimuthal region in order to obtain a measurable aMAE, suggesting that the aMAE was frequency and spatially specific.

As different methods were used to measure the aMAE, the strengths of the aMAE observed in Grantham's and the present study cannot be directly compared. In general, however, my results agree with Grantham's findings (1998), in that, with sounds moving in the real world as stimuli, an aMAE could be elicited by low, middle or high frequencies, and the magnitude of the effect depended on the extent of overlap in spatial region as well as in spectral content between the adapting and test stimuli. In addition, the present study extends the Grantham study by further investigating the tuning of the aMAE in the frequency and spatial domains. My results (Fig. 3.9) show that the aMAE is stronger when the adapting stimuli contain lower frequency sound relative to that contained in the test stimuli than when the frequency relation between the adapting and test stimuli is reversed. These results suggest that the frequency tuning of the motion sensitive channels may spread further to the high end than to the low end of the frequency spectrum.
The present finding that, across the spatial range tested (-35° to +35° azimuth), the aftereffects are largely invariant with eccentricity is consistent with research on the minimum audible movement angle (MAMA), which has shown that dynamic acuity is relatively constant throughout this range of azimuths (Grantham 1986; Chandler and Grantham 1992; Strybel, Manligas and Perrott 1992).

In contrast to the absence of a significant effect of adapting velocity reported by Grantham (1998), my results show a clear velocity dependence of the aMAE (Figs. 3.4 and 3.5). Grantham showed that there was considerable variability across subjects, to which his failure to find a velocity effect on the aMAE may be attributed. Taking into account the features of the virtual moving sound used by Grantham (1998), inter-subject variation in his results should not be surprising. In his experiments, sound stimuli delivered to the subject over headphones were filtered by the KEMAR manikin's head-related transfer functions (HRTFs). However, studies (Burkhard and Sachs 1975; Middlebrooks and Green 1990) have shown that the dimensions of the head and pinna vary considerably among adults. For example, Burkhard and Sachs reported that the adult concha could vary from 2.1 to 3.1 cm in length and from 1.3 to 2.3 cm in breadth. Since these anatomical dimensions are roughly comparable to the wavelength of sounds in the range of human hearing, individual differences in anatomical dimensions translate into individual differences in potential localization cues (Shaw and Teranishi 1968; Middlebrooks and Green 1990; Wenzel, Wightman and Kistler 1991). As a result, different individuals have different HRTFs. Some of the subjects tested in the Grantham study might have had HRTFs similar to KEMAR's, while others might have had quite distinct HRTFs, so that their motion detection pathways might have been inadequately stimulated. Most of these drawbacks are overcome by using real moving sound as a stimulus.
It should be noted, however, that the sounds produced by the sound stimulus system used in the present study lack any significant Doppler shift, because the sound source moves circumferentially around the head. This will not be true for all moving sounds, and the Doppler shift might be a potent cue for detecting sound source motion in directions that are not circumferential to the head. Rosenblum, Carello and Pastore (1987) showed that listeners could use the Doppler effect, as well as ITDs and amplitude change, to locate a moving sound source, although the Doppler effect was the least effective of these cues.

4 Studies of the Auditory Contingent Aftereffects

In vision, a special class of adaptation effects called contingent aftereffects was first reported by McCollough (1965). In her pioneering study, McCollough found that after a few minutes of alternately viewing an orange-black vertical grating and a blue-black horizontal grating, the white stripes in a vertical black-and-white grating appeared blue-green, while the white stripes in a horizontal grating appeared orange (Fig. 4.1).

Fig. 4.1 Demonstration of the McCollough effect. a, Adapting pattern for inducing the aftereffect. b, Test pattern for eliciting the aftereffect. Adapted from McCollough (1965).

Since then, there have been numerous demonstrations of other types of visual contingent aftereffect. For example, Held and Shattuck (1971) reported an orientation aftereffect contingent on color in which, after scanning stripes tilted clockwise off the vertical on a red background and stripes tilted counterclockwise on a green background, vertical test stripes appeared tilted counterclockwise when they had a red background and clockwise when they had a green background. Lovegrove and Over (1972) found a color aftereffect contingent on spatial frequency.
Following alternate exposure to a vertical grating with one spatial frequency in red light and a vertical grating with a higher spatial frequency in green light, an achromatic grating appeared greenish when it had the lower spatial frequency and reddish when it had the higher spatial frequency. By using appropriate adapting stimuli, a color aftereffect has been shown to be contingent on the direction of motion (Hepler 1968; Stromeyer and Mansfield 1970). After watching red stripes moving upwards alternating with green stripes moving downwards, achromatic stripes appear greenish when they move up and reddish when they move down. Conversely, a motion aftereffect has been shown to be contingent on color (Favreau, Emerson and Corballis 1972; Mayhew and Anstis 1972). After viewing repeated alternations of a red contracting spiral and a green expanding spiral for about ten minutes, a stationary spiral appears to be expanding or contracting, depending on its color: the red stationary spiral appears to be expanding and the green stationary spiral contracting.

The visual contingent aftereffect can be quite persistent. McCollough (1965) reported that the orientation-contingent color aftereffect could still be elicited at least 1 hour after the initial adaptation. The motion-contingent color aftereffect and the color-contingent motion aftereffect can persist for at least 24 hours (Hepler 1968; Favreau, Emerson and Corballis 1972). In some cases, the motion-contingent color aftereffect can last as long as 6 weeks (Stromeyer and Mansfield 1970; Mackay and Mackay 1975). There are anecdotal reports of even longer periods of persistence. In a systematic exploration of the persistence of the McCollough effect, Jones and Holding (1975) reported that the contingent aftereffect was still observed at half its initial strength 3 months after 15 minutes of adaptation.
In contrast to the rich variety of visual contingent aftereffects, there have been no reports of contingent aftereffects in other sensory modalities such as audition. Psychophysical evidence suggests that the auditory and visual systems share many similarities in signal processing. Both systems appear to operate by comparing parameters of the stimuli in the two ears or eyes. For example, the auditory system uses the time and intensity differences of a sound at the two ears to localize the sound source in the horizontal plane (see Chapter 1 for review). Similarly, the visual system employs the binocular disparities of two retinal images to determine the position of a target in depth (e.g. Wheatstone 1852; Regan, Beverley and Cynader 1979). As demonstrated in Chapter 3, auditory motion aftereffects (aMAEs), which are analogous to the waterfall illusion in vision, can be obtained. In addition, aftereffects for spectral motion have also been reported (Shu, Swindale and Cynader 1993). This suggests that, like the visual system, the auditory system contains specialized channels for motion detection, and thus that the two systems depend on similar mechanisms for motion detection. Therefore, I designed experiments to determine whether contingent aftereffects could be demonstrated in the auditory system.

In vision, stimulus attributes that have been found to produce contingent aftereffects when paired are those that give simple aftereffects when presented singly. Since simple aMAEs have been observed in frequency and in azimuth, in Experiment 4, the directions of sound movement in frequency (i.e. a sound gliding upward or downward in frequency) and in azimuth (i.e. a sound moving leftward or rightward in space) were paired to test whether the spatial motion aftereffect could be contingent on frequency. Experiment 5 compared the time courses of the contingent and the simple auditory motion aftereffect.
In Experiment 6, the direction of sound motion in frequency was made contingent on the direction of motion in intensity. Since sound motion in azimuth can be made contingent on sound motion in frequency, which in turn can be contingent on motion in intensity, the possibility that sound motion in azimuth can be made contingent on motion in intensity via an intermediate attribute, i.e. sound motion in frequency, was explored in Experiment 7. This kind of aftereffect is termed here the "double" contingent aftereffect because it would be based on the establishment of a twofold contingency. Part of the results presented in this chapter has been published in Nature Neuroscience (Dong, Swindale and Cynader 1999).

4.1 Experiment 4: A Spectral Contingent Spatial Motion Aftereffect

This experiment was designed to demonstrate a spatial MAE contingent on spectral motion.

4.1.1 Methods

Subjects

Five subjects (S.L., G.S., J.Q., T.S. and C.D.) with clinically normal hearing participated in the experiment. All subjects, aged 23 - 33 years, were recruited from members of the Ophthalmology Research Lab or were students at the University of British Columbia. Except for the author (C.D.), all subjects were unaware of the purpose of the experiments.

Experimental Procedure

A two-alternative forced choice procedure was used to measure the contingent aftereffect. In the adaptation period, the subject listened to a sound with a rising pitch presented from a loudspeaker moving to the left, alternating with a sound with a falling pitch emitted from a loudspeaker moving to the right. Following the initial adaptation, a series of slowly moving test sounds with either a rising or a falling pitch was presented. The subject had to respond by pressing one of two buttons to indicate the direction of spatial movement of the sound source (leftward or rightward).
The spatial velocity that was judged to be moving to the left (or right) 50% of the time represents the velocity that sounded stationary to the subject and was determined by probit analysis of the psychometric functions representing the subject's responses (Finney 1971). A spatial motion aftereffect contingent on the direction of spectral motion was judged to have occurred if the point of subjective stationarity of spatial motion was shifted in the same direction as that paired with the direction of spectral motion during adaptation. In order to study the decay of the contingent aftereffect, the size of the aftereffect was measured immediately, and then 1, 4 and 24 hours after the initial adaptation. Each subject was tested twice, with a separation of at least three days between tests.

Stimuli

One-octave bandpass noises composed of sixty-four sinusoidal components with randomized initial phases were used as stimuli in the present study. The sound stimuli were generated from the loudspeaker in the acoustical stimulation system described in Chapter 2. The average sound level was about 75 dB SPL(A) when the sound was presented from a stationary loudspeaker at 0° azimuth and measured at the position of the subject's head, which was located at the center of a sphere 0.8 meters in radius. In this experiment, the center frequency of the stimulus moved either upwards or downwards at a velocity of 0.7 octaves/s in a frequency range centered at 1.5 kHz. These parameters were chosen based on the fact that spectral MAEs are strongest at adapting velocities between 0.5 and 1 octaves/s across a frequency range of 1 to 2 kHz (Shu, Swindale and Cynader 1993). During about 10 minutes of adaptation, the loudspeaker alternately swept to the left and the right with a velocity of 30°/s along an arc of 30° (centered at 0° azimuth) at the subject's ear level.
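The stimulus described above (a one-octave noise built from sixty-four sinusoids with random starting phases, whose center frequency glides at 0.7 octaves/s around 1.5 kHz) can be sketched as follows. This is only an illustration under stated assumptions, not the synthesis code used in the study: the log spacing of the components, their equal amplitudes, the glide being symmetric about the center frequency, the 1-second default duration, the sampling rate and both function names are hypothetical choices made here.

```python
import math
import random

def centre_frequency_hz(t, velocity_oct_per_s=0.7, centre_hz=1500.0, duration_s=1.0):
    """Centre frequency at time t for a glide symmetric about centre_hz."""
    return centre_hz * 2.0 ** (velocity_oct_per_s * (t - duration_s / 2.0))

def gliding_bandpass_noise(duration_s=1.0, velocity_oct_per_s=0.7,
                           centre_hz=1500.0, n_components=64,
                           sample_rate=44100, seed=0):
    """Return samples of a one-octave multi-sinusoid noise with a gliding centre.

    Assumptions (not from the thesis): components are log-spaced over
    +/- 0.5 octave, have equal amplitude, and the output is scaled to [-1, 1].
    """
    rng = random.Random(seed)
    start_hz = centre_frequency_hz(0.0, velocity_oct_per_s, centre_hz, duration_s)
    ratios = [2.0 ** (-0.5 + k / (n_components - 1)) for k in range(n_components)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    rate = velocity_oct_per_s * math.log(2.0)   # glide rate in nepers per second
    samples = []
    for i in range(int(round(duration_s * sample_rate))):
        t = i / sample_rate
        # Integral of 2^(v * tau) from 0 to t, so that each component's phase
        # follows its instantaneous (exponentially gliding) frequency.
        warped_t = (math.exp(rate * t) - 1.0) / rate if rate != 0.0 else t
        base = 2.0 * math.pi * start_hz * warped_t
        total = sum(math.sin(base * r + p) for r, p in zip(ratios, phases))
        samples.append(total / n_components)
    return samples

# Centre frequency at the start and end of a 1-second upward glide.
print(round(centre_frequency_hz(0.0), 1), round(centre_frequency_hz(1.0), 1))
```

Passing a negative `velocity_oct_per_s` gives the downward glide; the phase is integrated rather than computed as frequency times time so that the glide does not introduce spurious frequency offsets.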
When the loudspeaker moved to the left, the center frequency of the stimulus sound moved upwards (0.7 octaves/s); and when the loudspeaker moved to the right, the center frequency of the sound moved down (-0.7 octaves/s). Each sweep to the right or the left lasted about 1 second (Fig. 4.2). In the control condition, the loudspeaker moved over the same trajectory with the same time course as in the adaptation condition, but the center frequency of the adapting sound was kept constant at 1.5 kHz. During testing, brief test sounds (1 second in duration) with a spectral velocity of 0.7 octaves/s (either upwards or downwards) were presented from a loudspeaker moving at one of six spatial velocities (±2, ±6 and ±10°/s) with one of five start positions (0°, ±2° and ±4°). The velocities of sound movement in frequency and in azimuth were randomly interleaved in successive trials. A total of 120 test presentations was given in each block of trials, half of which had a rising pitch and the other half a falling pitch.

Fig. 4.2 a. Time sequence of stimuli: each run began with 10 minutes of adaptation, followed by a series of brief test sounds (1 s), with either a rising (0.7 octaves/s) or a falling (-0.7 octaves/s) pitch presented by a loudspeaker moving at one of six different velocities (2, 6 and 10°/s, either to the left or the right). For each test presentation, the subject was asked to press one of two buttons to indicate the direction (leftwards or rightwards) of spatial movement. b. Detailed time sequence of adapting stimuli: while the central frequency of an adapting sound (1-octave band-pass noise) is moving upwards (0.7 octaves/s), the loudspeaker moves to the left (30°/s) for 1 second, i.e., from -15° to 15° in azimuth. Following a silent interval of 1.4 seconds, the loudspeaker moves to the right (-30°/s) for 1 second, while the central frequency of the sound moves downwards (-0.7 octaves/s). During adaptation, this sequence repeats continuously.
In the control condition, the loudspeaker moved over the same trajectory with the same time course, but the center frequency of the adapting sound was kept constant at 1.5 kHz. Note that the vertical axis in the top panel has a logarithmic scale.

4.1.2 Results

Results from Experiment 4 showed a clear contingent aftereffect in all five subjects. Figure 4.3 displays the magnitude of the auditory contingent aftereffect as a function of time for all five subjects. After about 10 minutes of adaptation, during which the subject listened to a spatially rightward moving sound with a falling pitch, alternating with a spatially leftward moving sound with a rising pitch, the perception of the direction of spatial movement of a sound was strongly influenced by the direction of spectral movement of the sound. When a sound had a rising pitch, leftward motion was judged as stationary and correspondingly a spatially stationary sound was heard as moving to the right. Similarly, when a spatially stationary sound had a falling pitch, it was perceived as moving leftwards. These effects declined over time, but could still be observed after 1 hour in all subjects and after 4 hours in two subjects following the initial adaptation (Fig. 4.3b and 4.3d).

An average of the aftereffect across all five subjects for both directions in frequency is shown in figure 4.4. Analysis of variance (ANOVA) indicated a significant difference (p < 0.01) among the sizes of the aftereffect measured at different times for both directions. Post-hoc Newman-Keuls tests showed that the aftereffect measured immediately after adaptation was significantly stronger (p < 0.05) than at other times (1, 4 or 24 hours following adaptation) for either direction in frequency, except for that measured 1 hour after adaptation for the sound with a rising pitch.
The aftereffect measured 24 hours after adaptation was weaker (p < 0.01) than those measured immediately or 1 or 4 hours after adaptation. The difference between the aftereffects measured 1 hour and 4 hours after adaptation was not statistically significant (p > 0.05).

Fig. 4.3 The magnitude of the auditory spectral contingent spatial motion aftereffect, in degrees per second, as a function of time after exposure, for five different subjects. The spatial velocity that sounded stationary to the subject was determined by probit analysis of the psychometric functions representing the subject's responses. Panels a ~ e show results for different subjects. For each subject, two sets of measurements of the aftereffect were taken immediately and then 1, 4 and 24 hours after adaptation.

Fig. 4.4 The average of the aftereffect for both a rising and a falling pitch across all five subjects. Error bars indicate ±1 standard error of the mean. The time axis is shown on a logarithmic scale.

[Fig. 4.4 legend: Rising Pitch (average), Falling Pitch (average), Rising Pitch (control), Falling Pitch (control). Horizontal axis: Time (hrs). Vertical axis: magnitude (deg/sec).]

4.2 Experiment 5: Comparison of Time Courses of the Contingent and the Simple Auditory Motion Aftereffect

Simple aMAEs are rather transient, lasting for about 1 to 3 seconds following 2 minutes of adaptation (Grantham and Wightman 1979). However, this seemingly much shorter persistence of simple aftereffects compared with that of contingent aftereffects might be attributed to the shorter adapting duration. To determine whether or not the contingent and simple auditory aftereffects had similar time courses, Experiment 5 was conducted to study the persistence of the simple aMAE elicited with the same adaptation duration as that used in the induction of the contingent aftereffect. The decay of the simple aMAE was then compared with that of the contingent aftereffect.

4.2.1 Methods

Three subjects (S.L., J.Q.
and C.D.) who had participated in Experiment 4 were tested in this experiment. The adapting and the test stimulus were a 1-octave bandpass noise with a constant center frequency of 1.5 kHz. During adaptation, the sound source repeatedly swept in a single direction (from the right to the left). The adapting duration (10 minutes), and the time sequence, velocity (30°/s) and spatial extent (-15° to 15°) of the adapting stimulus were the same as those used in Experiment 4. The same procedure as in Experiment 4 was used to measure simple aMAEs. For each subject, simple aMAEs were measured immediately and then 10 minutes and 1 hour after adaptation, each twice.

4.2.2 Results

Results show that with the same adapting duration, simple auditory aftereffects decayed much faster than did contingent aftereffects (Fig. 4.5). Although their sizes were comparable to those of contingent aftereffects when measured immediately after adaptation, simple aftereffects were absent when measured 10 minutes after adaptation.

Fig. 4.5 Comparison of decays of contingent and simple auditory motion aftereffects as a function of time after adaptation. The curve representing the grand average of the contingent aftereffects is obtained by pooling the data collected with a falling pitch and with a rising pitch, after the sign of the aftereffect measured with the falling pitch is reversed, i.e., the negative sign is changed to positive and vice versa. Error bars indicate ±1 standard error of the mean. The time axis is logarithmic.

[Fig. 4.5 legend: Contingent AE, Simple AE, Contingent AE (Control), Simple AE (Control). Horizontal axis: Time (hrs).]
4.3 Experiment 6: An Intensity Contingent Spectral Motion Aftereffect

Since the auditory spectral MAE (Shu, Swindale and Cynader 1993) and the changing-loudness aftereffect (Reinhardt-Rutland 1992, 1995) have been previously demonstrated, in this experiment, directions of sound motion in frequency and in intensity were paired to determine whether a spectral MAE could be contingent on changing intensity.

4.3.1 Methods

Three subjects (S.L., T.S. and C.D.) participated in this experiment. The experimental procedure was similar to that of Experiment 4. During adaptation, the subject listened to a sound moving upwards in frequency and in intensity, alternating with a sound moving downwards in frequency and in intensity. After about 6 minutes of adaptation, a series of test stimuli with changing intensity and frequency was presented. The subject had to press one of two buttons to indicate the direction of sound motion in frequency. The frequency velocity that appeared to be stationary was obtained from the 50% point on the psychometric function derived from the subject's responses. If there is a spectral MAE contingent on changing intensity, the point of subjective stationarity of frequency velocity would shift upwards when the sound intensity is increasing and shift downwards when the sound intensity is decreasing. This contingent aftereffect was measured immediately, and then 0.5 and 1 hour after adaptation. For each subject, three blocks of trials were employed, with at least three days between successive blocks. The mean value and the standard error of the mean of the point of subjective stationarity were then calculated based on these three blocks of measurements.

The sound stimulus was a 1-octave bandpass noise which was presented from a stationary loudspeaker directly in front of the subject (0.8 meters in distance). During the experiment, the center frequency of the stimulus sound moved in a frequency range centered at 1.5 kHz.
The width of the range depended on the frequency velocity. The sound intensity changed at a velocity of 40 dB/s within the range from 55 to 95 dB SPL. Over a period of about 6 minutes of adaptation, when the intensity of the sound was increasing (rate: 40 dB/s; duration: 1 second), the center frequency of the sound moved upwards (0.7 octaves/s); and when the sound intensity was decreasing (rate: -40 dB/s; duration: 1 second), the center frequency of the sound moved down (-0.7 octaves/s). The interstimulus interval was 0.2 seconds. In the control condition, the stimulus sound was the same as that in the adaptation condition, except that the sound intensity was kept constant at 75 dB SPL. During testing, a block of 120 test stimuli was presented, half of which had an increasing intensity (40 dB/s) and the other half a decreasing intensity (-40 dB/s). The duration of the test sound was 1 second. Each test stimulus could have one of six frequency velocities (0.1, 0.3 and 0.5 octaves/s, either moving upwards or downwards). The order of presentation of the test stimuli with different velocities in intensity and in frequency was randomized.

4.3.2 Results

Under the control condition, the normal effect of the intensity of a sound on the perception of its pitch was observed. Generally, when the intensity of a sound (constant in frequency at 1.5 kHz) was increasing, the pitch of the sound was heard as drifting downwards; when the intensity of such a sound was decreasing, the pitch of the sound was perceived as moving upwards. There was also considerable inter-subject variability in the size of the pitch shift. For two of the subjects tested (T.S. and C.D.), the pitch shifts were measurable, whereas for the third subject (S.L.), the pitch of the sound remained relatively stable regardless of change in sound level.
In order to make any existing intensity contingent spectral MAE apparent, the magnitude of the aftereffect was determined by subtracting the values obtained in the control condition from those in the adaptation condition. Figure 4.6 displays the size of the aftereffect as a function of time for the three subjects tested in this experiment. Results show that after six minutes of adaptation to paired sound motion in intensity and in frequency, the perception of sound motion in frequency was influenced by sound motion in intensity. Following adaptation, the point of subjective stationarity of sound motion in frequency shifted to the higher end when the intensity of the test sound increased, and moved to the lower end when the intensity of the test sound decreased. Accordingly, when a sound with a constant center frequency was getting louder, the sound was perceived to have a falling pitch and when such a sound was getting softer, the sound was heard to have a rising pitch. This contingent aftereffect could still be elicited half an hour after adaptation for all three subjects.

In figure 4.6d, the aftereffect was averaged across all three subjects for both directions in intensity. Analysis of variance (ANOVA) showed that there was a significant difference (p < 0.01) among the magnitudes of the aftereffect measured at different times for both directions. Results from post-hoc Newman-Keuls tests indicated that the size of the aftereffect was significantly larger when measured immediately after adaptation than when measured 0.5 (p < 0.05) or 1 hour (p < 0.01) following adaptation. The aftereffect measured 1 hour after adaptation was not statistically different (p > 0.05) from that in the control condition and was weaker (p < 0.05) than that obtained 0.5 hours after adaptation.

Fig. 4.6 The magnitude of the intensity contingent spectral motion aftereffect, in octaves per second, as a function of time, for three subjects. Panels a ~ c show results for different subjects.
For each subject, three sets of measurements of the aftereffect were taken: immediately, 0.5 hours and 1 hour after adaptation. Panel d displays the average of the aftereffect across all three subjects. Error bars indicate ±1 standard error of the mean.

0.05) between the adaptation and the control condition. When the measured effects were compared between the two directions of change in intensity after the data obtained from all three subjects were pooled together, no statistical difference was found (p > 0.05) either.

4.5 Discussion

By pairing the direction of auditory spatial movement with the direction of motion in auditory frequency space, a spectral contingent spatial MAE was demonstrated. Successful demonstration of a second auditory contingent aftereffect, i.e. an intensity contingent spectral MAE, by pairing sound motion in frequency and in intensity, confirms the existence of this kind of phenomenon in the auditory system. These findings imply that the neural mechanisms underlying contingent aftereffects are not specific to vision but reflect general properties of sensory neural processing and, possibly, of higher brain centers as well. In principle, there are two types of neural mechanism which might be proposed to account for these effects. One is that there are "built-in" double-duty units in the sensory systems that are sensitive to paired stimulus attributes and that become fatigued following repeated stimulation (McCollough 1965; Hepler 1968; Favreau, Emerson and Corballis 1972). It seems unlikely, however, that simple neuronal adaptation or fatigue could account for aftereffects that can persist for hours or days.
A second explanation is that the association between different stimulus attributes is not wired into the sensory systems, but is built up during adaptation or can be modified by some kind of cortex-based learning (Mayhew and Anstis 1972; Barlow 1990). The long time course of the auditory contingent aftereffect observed in the present study suggests that this latter explanation is more likely. No studies of double contingent aftereffects in vision were found. In the study presented in Section 4.4, no measurable auditory double contingent aftereffect was observed. The absence of such an aftereffect, however, does not necessarily indicate the absence of the association that might be established by adaptation. Generally, failure to demonstrate a double contingent aftereffect could be attributed to either or both of the following reasons. One is that the relation established by adaptation in Experiment 7 between the direction of spatial motion and the direction of change in intensity was too weak to be revealed by the aftereffect. This is possible especially when two stimulus attributes are linked not directly but through a third attribute. The other possible reason is that in Experiment 7, the intensity contingent spectral MAE was used to induce the spectral contingent spatial MAE, and the first aftereffect may not have been strong enough to induce the second one. As shown in Experiment 6, the magnitude of the intensity contingent spectral MAE was in a range from 0.1 to 0.3 octaves/s. This range of spectral velocity is much lower than the spectral velocity of the adapting stimulus (0.7 octaves/s) used for inducing the spectral contingent spatial MAE (see Experiment 4). As a result, the spatial MAE contingent on spectral motion might not be elicited and thus the double contingent aftereffect could not be observed.
5 General Discussion and Recommendations for Future Work

5.1 Auditory Motion Processing

5.1.1 Specialized Motion Detection Mechanisms

The existence of a motion aftereffect is often regarded as evidence for specialized motion-detection mechanisms in a modality. An aftereffect caused by adaptation to auditory spectral motion was earlier demonstrated in our laboratory (Shu, Swindale and Cynader 1993). After listening to a simple spectral pattern (a spectral peak or a spectral notch) moving upwards or downwards in frequency for a few minutes, the same pattern was perceived as moving in the opposite direction even though it was actually stationary. This result suggests that there exist specialized channels in the auditory system which process dynamic spectral cues. In the studies described in Chapter 3, by using real moving sounds as both adapting and test stimuli, a robust aMAE was observed even with adapting velocities as low as 10°/s. Within the central spatial region tested (±35° centered on the midline), the aMAE was relatively independent of azimuth. The aMAE was larger for low frequency sounds than for middle and high frequency sounds, but it could be observed for both low and high frequency stimuli. The aMAE observed was spatially and frequency specific. When the magnitude of the aMAE obtained in Experiment 1 is expressed as gain, i.e. as a percentage of the adapting velocity, it is comparable to that of the vMAE. By using a nulling procedure, Taylor (1963) measured the velocity of the vMAE as a function of adapting velocity. In order to compare the MAE between these two modalities, the magnitude of the vMAE measured by Taylor (1963) was recalculated in terms of gain. Consistent with my results for the aMAE, his results showed that although the velocity of the vMAE increased with the adapting velocity, the gain of the vMAE, expressed as a ratio to the adapting velocity, decreased from 23.3% to 4.8% when the adapting velocity increased from 9°/s to 108°/s.
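The relation between aftereffect velocity and gain can be made concrete with a short calculation. The sketch below takes the two gains quoted for Taylor (1963) and, as an assumed reading of the adapting velocities quoted above, 9°/s and 108°/s; the function names are hypothetical. It shows how the aftereffect velocity can rise while the gain falls.

```python
def mae_gain_percent(aftereffect_velocity, adapting_velocity):
    """Gain: aftereffect velocity as a percentage of the adapting velocity."""
    return 100.0 * aftereffect_velocity / adapting_velocity


def mae_velocity(gain_percent, adapting_velocity):
    """Inverse of mae_gain_percent: aftereffect velocity implied by a gain."""
    return gain_percent / 100.0 * adapting_velocity


# Gains quoted in the text; adapting velocities are an assumed reading (deg/s).
slow = mae_velocity(23.3, 9.0)     # roughly 2.1 deg/s
fast = mae_velocity(4.8, 108.0)    # roughly 5.2 deg/s

if __name__ == "__main__":
    print(f"slow adapter: {slow:.2f} deg/s, fast adapter: {fast:.2f} deg/s")
```

With these numbers a twelvefold increase in adapting velocity raises the implied aftereffect velocity only about 2.5 times, which is why the gain drops.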
These results suggest that, like the visual system, the auditory system contains specialized mechanisms for auditory spatial motion detection. This notion gains further support from a neurological study, which described a patient who had a specific deficit in auditory motion detection although his ability to localize stationary sounds was much less impaired (Griffiths et al. 1996). fMRI examination of this patient revealed damage in cortical areas (including the right posterior parietal cortex and the right insula) that were distinct from the primary auditory cortex, suggesting that moving sound may be processed independently from stationary sound.

5.1.2 Evidence for Hierarchical Organization of Auditory Motion Processing

Comparison of results from previous studies of the aMAE with those of the present studies (see Chapter 3) suggests that auditory motion processing channels respond optimally to a combination of several cues, and that these channels may operate at more than one stage of analysis. In vision, this kind of psychophysical model has been proposed for information processing of motion in depth, in which monocularly driven changing-size channels and stereoscopic-motion channels, driven by interactions of signals from the two eyes, feed a motion-in-depth stage (Regan, Beverley and Cynader 1979). In that study, an impression of motion in depth was generated either by changing the size of a retinal image or by adjusting the relative velocities of the left and right retinal images (stereoscopic-motion stimulus). Using a selective adaptation procedure, they demonstrated that there exist changing-size channels and stereoscopic-motion channels in the visual system. Their finding that the impression of motion in depth elicited by changing-size stimuli can be canceled by appropriate stereoscopic-motion stimuli indicates that changing-size stimulation and stereoscopic-motion stimulation generate visual signals that converge on the same motion-in-depth stage.
Studies of auditory aftereffects show that although adaptation to an auditory stimulus providing one cue (e.g. changing ITD or IID) produced only a loudness or a displacement aftereffect (Reinhardt-Rutland 1992; Ehrenstein 1994), adaptation to a moving sound containing two or more cues, including dynamic level, temporal and spectral information, did elicit a repeatable aMAE. Thus, it is reasonable to conceive of a similar hierarchical organization for auditory motion processing (Reinhardt-Rutland 1992). That is, channels responsible for perceiving changes in sound level, spectrum and temporal cues precede motion detection channels, and motion detection channels are fed by outputs from earlier analysis at these preceding channels. Simulated sound source movement generated by varying only one of these localization cues is unable to adequately stimulate motion detection channels, and thus fails to elicit an aMAE. However, more peripheral channels (e.g. changing level channels) may be sufficiently stimulated by such stimuli to produce an aftereffect (e.g. a loudness aftereffect).

5.2 Speculations about Neural Mechanisms underlying Contingent Aftereffects

Although a variety of contingent aftereffects has been observed in vision and audition, the underlying neural mechanisms remain unclear. Contingent aftereffects can persist for hours, days or even weeks. The long persistence of these effects makes the "fatigue" explanation unlikely, since the time course of recovery from adaptation does not seem to match that expected from neural fatigue. According to the fatigue explanation, neuronal fatigue might be caused by the depletion of neurotransmitters due to repetitive stimulation. During recovery, the stores of neurotransmitters would be replenished. It seems, however, that the time needed to refill the neurotransmitter stores would be much shorter than the time required for recovery from contingent aftereffects.
Thus, the notion of neuronal fatigue cannot account for long lasting contingent aftereffects. Alternatively, the long time course of contingent aftereffects seems to suggest that they might be related to some kind of learning process. Activity-dependent long-term changes in synaptic efficacy are thought to form a basis for learning and memory. In order to provide a better explanation for a model originally proposed by Barlow (1990) for contingent aftereffects, studies of use-dependent malleability of synaptic transmission are briefly reviewed as follows. About a century ago, Cajal (1911) proposed that external events are represented in the brain as spatiotemporal patterns, and that it is these patterns of activity that might result in strengthening of synapses. Hebb (1949) later refined these ideas, postulating that an increase in synaptic strength will occur when the presynaptic and postsynaptic elements of a synapse are activated simultaneously. In his original statement, known as the Hebbian learning or associative rule, Hebb wrote that "when an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased". Such synapses were first identified in the hippocampus of the rabbit (Bliss and Lomo 1973; Bliss and Gardner-Medwin 1973). Following a brief period of tetanic stimulation (trains of high-frequency pulses, e.g. 100 Hz for 1 second) to monosynaptic excitatory pathways in the hippocampus, an abrupt increase in the efficiency of synaptic transmission can be observed, which can last for many hours (Fig. 5.1). When induced in the freely moving animal, this effect can persist for days. This sustained increase in synaptic efficacy is called long-term potentiation (LTP).
Since the discovery of LTP in the hippocampus, activity-dependent enhancement of synaptic transmission has been demonstrated in a number of other brain structures including the cerebellar cortex, neocortex, striatum and nucleus accumbens, and by using many other stimulus parameters. In most cases, induction of LTP is input-specific and associative: synaptic modifications are restricted to the activated inputs, and synapses are potentiated only if they are active while the postsynaptic dendrite is sufficiently depolarized (Andersen et al. 1977; Lynch, Dunwiddie and Gribkoff 1977; Kelso, Ganong and Brown 1986; Wigstrom et al. 1986; Sastry, Goh and Auyeung 1986; Bliss and Collingridge 1993; Pike et al. 1999).

Fig. 5.1 An example of long-term potentiation (LTP) in the perforant pathway of the hippocampus recorded in vivo. The evoked response (population excitatory postsynaptic potential, or EPSP) was recorded from the cell body region in response to constant test stimuli, for 1 hour before and 3 hours after a tetanus (250 Hz, 250 ms). The arrow indicates the time when the tetanus was delivered. (Adapted from Bliss and Collingridge, 1993).

More recently, studies showed that pairing of subthreshold excitatory postsynaptic potentials (EPSPs) caused by stimulation of presynaptic neurons with back-propagating postsynaptic action potentials (APs) could induce robust LTP. When these stimuli were unpaired, or when EPSPs were paired with non-back-propagating APs, LTP was absent (Magee and Johnston 1997; Markram et al. 1997; Bi and Poo 1998). These results provide direct evidence for Hebbian learning rules and suggest that, in Hebbian synapses, back-propagating APs may serve as an associative signal which informs the synaptic input region that an output has occurred.
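The two properties of LTP induction named above, input specificity and associativity, can be illustrated with a toy update rule. The sketch below is not a biophysical model: the threshold, step size and function name are hypothetical, and potentiation is reduced to a fixed increment.

```python
def potentiate(weights, pre_active, post_depolarization,
               threshold=1.0, step=0.1):
    """Return updated synaptic weights after one stimulation episode.

    A synapse is strengthened only if its own input was active (input
    specificity) while the postsynaptic depolarization reached threshold
    (associativity). All values are arbitrary illustrative units.
    """
    if post_depolarization < threshold:
        return list(weights)                 # no synapse changes
    return [w + step if active else w
            for w, active in zip(weights, pre_active)]


weights = [0.5, 0.5]
# Input 0 active alone with weak postsynaptic depolarization: nothing changes.
unpaired = potentiate(weights, [True, False], post_depolarization=0.4)
# Input 0 active while the cell is strongly depolarized: only synapse 0 grows.
paired = potentiate(weights, [True, False], post_depolarization=1.5)

if __name__ == "__main__":
    print(unpaired, paired)
```

The inactive synapse keeps its weight in both cases, and the active one grows only when its activity coincides with sufficient postsynaptic depolarization.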
Although Hebbian learning rules were initially proposed for excitatory pathways, studies have suggested that they also apply to inhibitory pathways, i.e. correlated presynaptic and postsynaptic activities can potentiate the efficiency of inhibitory synaptic transmission (Kairiss et al. 1987; Errington, Haas and Bliss 1988; Goddard et al. 1988; Komatsu 1994; Xie et al. 1995; Perez et al. 1999; Krnjevic and Zhao 2000). This is called Hebbian modification of inhibitory synapses. For example, Komatsu (1994) studied LTP of inhibitory synaptic transmission in rat visual cortex using intracellular recording in a slice preparation. LTP of inhibitory postsynaptic potentials (IPSPs) of layer V cells was induced by delivering stimulation to presynaptic cells of layer IV. It was found that the elicited LTP of IPSPs shared the properties of LTP of EPSPs in the hippocampus, that is, input-specificity, associativity and cooperativity (existence of an intensity threshold for induction). Extensive studies have been carried out to elucidate the cellular and molecular mechanisms underlying long-term potentiation of synaptic transmission. It is now believed that LTP is induced by the activation of the N-methyl-D-aspartate (NMDA) subtype of glutamate receptor. In the resting state, the NMDA receptor channel is blocked by Mg2+ in a voltage-dependent manner. For the NMDA channel to open, postsynaptic depolarization must reach a threshold level in order to expel Mg2+ from the NMDA channel at the same time that glutamate binds the NMDA receptor. Ca2+ then flows into the postsynaptic neuron through open NMDA channels. This triggers the biochemical cascades which lead to the persistent enhancement of synaptic efficiency. Although Hebbian modification of inhibitory synapses has been previously suggested as a cause of simple aftereffects (e.g.
Blakemore, Carpenter and Georgeson 1971; Wilson 1975; Saul and Cynader 1989b), Barlow (1990) may be the first to suggest that such modification by adaptation may be involved in the induction of visual contingent aftereffects. In his theoretical model, Barlow proposed that the neurons in a network mutually inhibit each other and that the strength of inhibitory synapses between these neurons increases in proportion to the frequency with which contingencies have occurred. This model can also account for the auditory contingent aftereffects described in Chapter 4. When the features of long-term potentiation of synaptic transmission reviewed above are taken into account, a clearer and better explanation can be offered. For example, when applied to the spectral contingent spatial MAE (see Section 4.1), the model may work as follows (Fig. 5.2):

Fig. 5.2 Schematic diagram of a model for contingent aftereffects. The essence of the model is mutual inhibition between neurons sensitive to spectral and spatial sound motion via modifiable inhibitory synapses.

During adaptation, neuron A and neuron B, which are sensitive to upward spectral motion and leftward spatial motion respectively, are activated simultaneously. According to Hebbian learning rules, this simultaneous activation of presynaptic and postsynaptic elements of inhibitory synapse 1 and synapse 2 will result in a persistent increase in efficiency of synaptic transmission in these inhibitory synapses. Following adaptation, when a sound which is stationary in azimuth but has a rising pitch is presented, neuron A is activated and exerts an inhibitory influence on neuron B through inhibitory synapse 2. As a result, the activities of leftward motion-sensitive neurons are suppressed and the activities of neurons sensitive to rightward motion become dominant, causing the sound that is stationary in azimuth to be perceived as moving to the right.
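The account above can be turned into a small numerical sketch. It is an illustration of the idea only, not Barlow's formulation: the rectified-linear units, the Hebbian increment, the learning rate and the number of pairings are all assumptions, and every name is hypothetical. Neuron A signals upward spectral motion; two spatial units signal leftward and rightward motion, and a stationary sound drives them equally.

```python
def adapt(w_inhib, a_rate, b_rate, n_pairings, learning_rate=0.01):
    """Hebbian growth of an inhibitory weight while A and B fire together."""
    for _ in range(n_pairings):
        w_inhib += learning_rate * a_rate * b_rate
    return w_inhib


def perceived_motion(w_a_to_left, w_a_to_right, a_rate, baseline=1.0):
    """Opponent read-out for a sound stationary in azimuth.

    Positive values mean rightward, negative leftward, zero stationary.
    """
    left = max(0.0, baseline - w_a_to_left * a_rate)
    right = max(0.0, baseline - w_a_to_right * a_rate)
    return right - left


W0 = 0.1                                  # initial inhibition, both pathways
before = perceived_motion(W0, W0, a_rate=1.0)

# Adaptation pairs upward spectral motion (A) with leftward motion (B) only,
# so only the A-to-left inhibitory synapse is strengthened.
w_left = adapt(W0, a_rate=1.0, b_rate=1.0, n_pairings=30)
after_rising_pitch = perceived_motion(w_left, W0, a_rate=1.0)
after_steady_pitch = perceived_motion(w_left, W0, a_rate=0.0)

if __name__ == "__main__":
    print(before, after_rising_pitch, after_steady_pitch)
```

In this sketch the read-out is zero before adaptation, becomes positive (rightward) after adaptation when a rising pitch activates A, and stays zero when A is silent, which is the contingent character of the aftereffect.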
Since use-dependent synaptic potentiation can persist for hours or days, it is not surprising that the spectral contingent spatial MAE observed in Experiment 4 could last for 4 hours in some subjects. Thus, I propose that perceptual learning in the auditory system may be governed by the same kind of synaptic modification rules responsible for contingent aftereffects in the visual system.

5.3 Recommendations for Future Work

Although, with the adaptation aftereffects as the "electrode", some meaningful observations on signal processing in the auditory system have been made, I am well aware of how blunt this instrument really is. Without the help of other techniques, such as functional neuroimaging and single-unit recording, studies of adaptation aftereffects can only provide us with the crudest information about where in the brain these adaptation phenomena are generated and how the neuronal activity changes during and after adaptation. These questions need to be answered before we can fully appreciate adaptation aftereffects and their implications for the mechanisms of auditory motion detection and other related processes. Therefore, I would like to suggest two directions for future research. One direction is the localization of the sites at which the auditory adaptation aftereffects are produced using functional magnetic resonance imaging (fMRI). In vision, fMRI has been employed to locate the vMAE in the human brain (Tootell et al. 1995; He, Cohen and Hu 1998). In the fMRI study by Tootell et al. (1995), the blood oxygen level-dependent (BOLD) signal was measured by fMRI under two conditions. In the adaptation condition, the subject was adapted to concentric circles moving in a single direction, expanding or contracting. Following adaptation, when stationary concentric rings were presented, the subject experienced the vMAE.
In the control condition, during adaptation, the subject was exposed to alternately expanding and contracting patterns and no vMAE was elicited postadaptation. They found that although the amplitude of the MRI signal was equally increased during adaptation in both conditions, the falloff in signal strength during testing with stationary patterns was much slower in the adaptation condition than in the control condition (Fig. 5.3). The strongest effect was found in the cortical area MT (the middle temporal area, or V5). Smaller fMRI vMAEs were observed in areas V2 and V3a, but no vMAE-specific activation was found in V1. Further analysis showed that the time course of the prolonged MRI activation was essentially identical to that of the psychophysical vMAE (best-fit exponents: 8.3 seconds and 9.2 seconds, respectively). The result that area MT/V5 is the major site for the vMAE was confirmed by a more recent fMRI study (He, Cohen and Hu 1998).

Fig. 5.3 Time course of averaged magnetic resonance imaging (MRI) signal during adaptation to expanding or contracting concentric rings and test with stationary patterns. Prolonged elevation of the MRI signal amplitude was associated with the visual motion aftereffect induced in the subject after 40 seconds of viewing the concentric rings moving in a single direction (expanding (Exp) or contracting (Con)). MRI amplitude following adaptation to reversing-direction (Exp/Con) stimuli (no motion aftereffect was elicited) returned to the steady-state level more promptly. The result appears to reflect direction-specific adaptation. (Adapted from Tootell et al. 1995).

In the auditory system, neurophysiological and functional neuroimaging studies have shown that the midbrain structures (e.g. Spitzer and Semple 1993; Yin and Kuwada 1983), the primary auditory cortex (Stumpf, Toronchuk and Cynader 1991; Ahissar et al. 1992; Poirer et al. 1997; Baumgart et al.
1999), and the right parietal cortex (Griffiths, Rees, Witton, et al. 1996; Griffiths, Rees, Rees, et al. 1998) are involved in the processing of auditory motion information. With fMRI, Baumgart et al. (1999) found an area in the right auditory cortex which was activated more strongly by a moving sound than other fields of the auditory cortex. Activity of this area could distinguish whether a sound pattern was moving or stationary. In the study by Griffiths and his colleagues (1998), sound stimuli were used which contained identical changes in the phase and intensity at the two ears but produced different percepts of sound movement. In one condition, sound motion was generated by varying the IPD and IID in the same direction. In the other condition, a stationary sound image was produced by changing the IPD and IID in opposite directions. The activity in the brain areas was measured by fMRI BOLD responses and by positron emission tomography (PET), which reflects regional cerebral blood flow. Both fMRI and PET studies showed differential activation in the right parietal lobe by moving sound and stationary sound. These cortical areas are likely candidates for mediating the aMAE. But until data from fMRI studies are obtained, we may not be able to pinpoint the sites responsible for generating the aMAE. Similarly, neuroimaging techniques can be applied to studies of contingent aftereffects. The second direction I would like to recommend for future work is single-unit recording in the auditory system. As early as 1961, Sutherland offered a hypothesis concerning the neural basis of the vMAE (Sutherland 1961).
He proposed that "the direction in which something is seen to move might depend upon the ratios of firing in cells sensitive to movement in different directions, and after prolonged movement in one direction a stationary image would produce less firing in the cells which had just been stimulated than normally, hence apparent movement in the opposite direction would be seen to occur" (p. 227). Sutherland's prediction of adaptation effects in single visual neurons was first confirmed by a classical study of Barlow and Hill (1963), in which the firing rate of motion-sensitive ganglion cells in the rabbit retina was recorded during and following adaptation (Fig. 5.4). At the beginning of about 1 minute of adaptation, the firing rate of a ganglion cell increased abruptly and then gradually declined over the duration of adaptation. Immediately following the adaptation, the firing rate fell below its baseline level, recovering over about 30 seconds. This time course is closely related to that of the perceptual visual motion aftereffect. The ratio model proposed by Sutherland gained further support from numerous subsequent studies recording from cat and monkey cortical cells (e.g. Maffei et al. 1973; Von der Heydt et al. 1978; Petersen, Baker and Allman 1985; Marlin, Hasan and Cynader 1988; Giaschi et al. 1993). Single-unit studies of auditory motion adaptation, however, are rare. Such studies may provide information about what actually happens to the discharge rate of auditory neurons during motion adaptation, how preceding adaptation affects the postadaptation activity of neurons sensitive to the adapted and opposite directions, and how adaptation affects the spontaneous activity of the neurons under study. With this information, we will be able to evaluate Sutherland's ratio model in the auditory system and then provide a more precise account of the aMAE.
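Sutherland's ratio idea can be sketched numerically. The code below is a toy illustration, not a fit to any recording: the baseline rate, the size of the post-adaptation depression, the exponential recovery and its time constant are all assumptions, and the names are hypothetical. Two units prefer leftward and rightward motion; after leftward adaptation, the leftward unit fires below baseline for a stationary sound.

```python
import math

BASELINE = 10.0   # resting firing rate in spikes/s (hypothetical)


def depression(initial, seconds_since_adaptation, tau=10.0):
    """Assumed exponential recovery of the post-adaptation rate depression."""
    return initial * math.exp(-seconds_since_adaptation / tau)


def opponent_signal(rate_left, rate_right):
    """Normalized comparison of the two units: positive means rightward."""
    total = rate_left + rate_right
    if total == 0:
        return 0.0
    return (rate_right - rate_left) / total


def signal_for_stationary_sound(seconds_since_adaptation, initial=4.0):
    """Read-out for a stationary sound following leftward adaptation."""
    left = max(0.0, BASELINE - depression(initial, seconds_since_adaptation))
    right = BASELINE                      # unadapted unit stays at baseline
    return opponent_signal(left, right)


unadapted = signal_for_stationary_sound(0.0, initial=0.0)
just_after = signal_for_stationary_sound(0.0)
later = signal_for_stationary_sound(30.0)

if __name__ == "__main__":
    print(unadapted, just_after, later)
```

Without adaptation the read-out is zero; immediately after leftward adaptation it is biased rightward, and the bias shrinks as the depressed unit recovers.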
Fig. 5.4 Firing rate of a ganglion cell in the rabbit retina as a function of time in response to prolonged motion stimulation. Upper panel: motion adaptation in the preferred direction resulted in an exponential decline in firing rate. When motion stopped, the cell's discharge dropped to zero. Lower panel: adaptation in the null direction caused no change in the activity of the cell compared to the control. (Adapted from Barlow and Hill, 1963)

5.4 Conclusion

Inspired by studies of selective adaptation in the visual system, the auditory adaptation aftereffects have been studied in this thesis. In the first series of experiments, a repeatable and robust aMAE was demonstrated when real moving sound was used as the adapting and test stimulus. The aMAE observed was spatially and frequency specific. These results suggest that the auditory system contains specialized motion detection channels, and that the processing of motion information may proceed through a hierarchy of stages. With appropriate experimental design, two different kinds of contingent aftereffect were observed in the auditory system for the first time, i.e. the spectral contingent spatial MAE and the intensity contingent spectral MAE. The present findings demonstrate further parallels in the processing of sensory information between the auditory and the visual systems and suggest that, as in the visual system, some properties of neural circuitry are not wired into the auditory system, but are developed as a function of recent usage and by some kind of learning process. The neural mechanisms of this sort of short-term plasticity are likely to be the same across different sensory modalities, at least for audition and vision.

References

Ahissar M, Ahissar E, Bergman H and Vaddia E.
(1992) Encoding of sound-source location and movement: activity of single neurons and interactions between adjacent neurons in the monkey auditory cortex. Journal of Neurophysiology, 67(1):203-215.
Aitkin LM and Webster WR. (1972) Medial geniculate body of the cat: organization and responses to tonal stimuli of neurons in the ventral division. Journal of Neurophysiology, 35:365-380.
Albright TD. (1984) Direction and orientation selectivity of neurons in visual area MT of the macaque. Journal of Neurophysiology, 52:1106-1130.
Altman JA. (1968) Are there neurons detecting direction of sound source motion? Experimental Neurology, 22(1):13-23.
Altman JA, Syka J and Shmigidina GN. (1970) Neuronal activity in the medial geniculate body of the cat during monaural and binaural stimulation. Experimental Brain Research, 10:81-93.
Altman JA. (1971) Neurophysiological mechanisms of sound source localization. In: Gersuni (Ed.), Sensory Processes at the Neuronal and Behavioral Levels. New York: Academic Press, pp. 221-224.
Altman JA. (1978) Sound localization: neurophysiological mechanisms. In: Tonndorf J (ed), Translations of the Beltone Institute for Hearing Research, No. 30. Chicago: The Beltone Institute for Hearing Research.
Altman JA. (1987) Information processing concerning moving sound sources in the auditory centers and its utilization by brain integrative and motor structures. In: Syka and Masterton (Eds.), Auditory Pathway: Structure and Function. New York: Plenum Press.
Altman JA and Viskov OV. (1977) Discrimination of perceived movement velocity for fused auditory image in dichotic stimulation. Journal of the Acoustical Society of America, 61:816-819.
Andersen P, Sundberg SH, Sveen O and Wigstrom H. (1977) Specific long-lasting potentiation of synaptic transmission in hippocampal slices. Nature, 266:736-737.
Anstis SM and Duncan K. (1983) Separate aftereffects of motion from each eye and from both eyes. Vision Research, 23:161-169.
Barlow HB.
(1990) A theory about the functional role and synaptic mechanism of visual aftereffects. In: Blakemore (ed), Vision: Coding and Efficiency. Cambridge Univ. Press, pp. 363-375.
Barlow HB and Hill RM. (1963) Evidence for a physiological explanation of the waterfall phenomenon and figure after-effects. Nature, 200:1345-1347.
Batra R, Kuwada S and Stanford TR. (1989) Temporal coding of envelopes and their interaural delays in the inferior colliculus of the unanesthetized rabbit. Journal of Neurophysiology, 61:257-268.
Batteau DW. (1967) The role of the pinna in human localization. Proc. R. Soc. B (London), 168:158-180.
Batteau DW. (1968) Listening with the naked ear. In: Freedman (Ed.), The Neuropsychology of Spatially Oriented Behavior. Homewood, IL: Dorsey Press, pp. 109-133.
Baumgart F, Gaschler-Markefski B, Woldorff MG, Heinze H and Scheich H. (1999) A movement-sensitive area in auditory cortex. Nature, 400:724-725.
Belendiuk K and Butler RA. (1975) Monaural location of low-pass noise bands in the horizontal plane. Journal of the Acoustical Society of America, 58:701-705.
Benson DA and Teas DC. (1976) Single unit study of binaural interaction in the auditory cortex of the chinchilla. Brain Research, 103:313-338.
Beverley KI and Regan D. (1973) Evidence for the existence of neural mechanisms selectively sensitive to the direction of movement in space. Journal of Physiology, 235(1):17-29.
Bi G-Q and Poo M-M. (1998) Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type. Journal of Neuroscience, 18:10464-10472.
Blakemore C, Carpenter RHS and Georgeson MA. (1971) Lateral thinking about lateral inhibition. Nature, 234:418-419.
Blauert J. (1969) Sound localization in the median plane. Acustica, 22:205-213.
Blauert J. (1997) Spatial Hearing. London: MIT Press.
Bliss TVP and Collingridge GL. (1993) A synaptic model of memory: long-term potentiation in the hippocampus. Nature, 361:31-39.
Bliss TVP and Gardner-Medwin AR. (1973) Long-lasting potentiation of synaptic transmission in the dentate area of the unanaesthetized rabbit following stimulation of the perforant path. Journal of Physiology (London), 232(2):357-374.
Bliss TVP and Lomo T. (1973) Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path. Journal of Physiology (London), 232(2):331-356.
Bloom PJ. (1977a) Creating source elevation illusions by spectral manipulation. Journal of Audio Engineering Society, 25:560-565.
Bloom PJ. (1977b) Determination of monaural sensitivity changes due to the pinna by use of minimum-audible-field measurements in the lateral plane. Journal of the Acoustical Society of America, 61:820-828.
Boudreau JC and Tsuchitani C. (1968) Binaural interaction in the cat superior olive S segment. Journal of Neurophysiology, 31:442-454.
Brigner WL. (1982) Spatial frequency selectivity of spiral aftereffect. Perceptual and Motor Skills, 55:1129-1130.
Brigner WL. (1986) Is velocity of motion aftereffect proportional to velocity of induction? Perceptual and Motor Skills, 63:362.
Brugge JF, Anderson DJ and Aitkin LM. (1970) Responses of neurons in the dorsal nucleus of the lateral lemniscus of cat to binaural tonal stimulation. Journal of Neurophysiology, 33:441-458.
Brugge JF, Dubrovsky NA, Aitkin LM and Anderson DJ. (1969) Sensitivity of single neurons in auditory cortex of cat to binaural tonal stimulation: effects of varying interaural time and intensity. Journal of Neurophysiology, 32:1005-1024.
Brugge JF and Merzenich MM. (1973) Patterns of activity of single neurons of the auditory cortex of monkey. In: Moller AR (ed), Basic Mechanisms in Hearing. New York: Academic, pp. 745-772.
Burkhard MD and Sachs RM. (1975) Anthropometric manikin for acoustic research. Journal of the Acoustical Society of America, 58:214-222.
Butler RA. (1971) The monaural localization of tonal stimuli.
Perception and Psychophysics, 9:99-101. Butler RA. (1987) An analysis of the monaural displacement of sound in space. Perception and Psychophysics, 41:1-7. Butler RA and Belendiuk K. (1977) Spectral cues utilized in the localization of sound in the median sagittal plane. Journal of the Acoustical Society of America, 61:1264-1269. Butler RA and Flannery R. (1980) The spatial attributes of stimulus frequency and their role in monaural localization of sound in the horizontal plane. Perception and Psychophysics, 28:449-457. Butler RA, Humanski, and Musicant AD. (1990) Binaural and monaural localization of sound in two-dimensional space. Perception, 19:241-256. Caird D and Klinke R. (1983) Processing of binaural stimuli by cat superior olivary complex neurons. Experimental Brain Research, 52:385-399. Cajal SR. (1911) Histologie du Système Nerveux de l'Homme et des Vertébrés. Paris: Maloine. Vol. 2. Calford MB. (1983) The parcellation of the medial geniculate body of the cat defined by the auditory response properties of single units. Journal of Neuroscience, 3:2350-2364. Cameron EL, Baker CL and Boulton JC. (1992) Spatial frequency selective mechanisms underlying the motion aftereffect. Vision Research, 32:561-568. Cant NB and Casseday JH. (1986) Projections from the anteroventral cochlear nucleus to the lateral and medial superior olivary nuclei. Journal of Comparative Neurology, 247:457-476. Carlile S and King AJ. (1993) Monaural and binaural spectrum level cues in the ferret: Acoustics and the neural representation of auditory space. Journal of Neurophysiology, 71:785-801. Chan JCK, Yin TCT and Musicant AD. (1987) Effects of interaural time delays of noise stimuli on low-frequency cells in the cat's inferior colliculus. II. Responses to band-pass filtered noises. Journal of Neurophysiology, 58:543-561. Chandler DW and Grantham DW.
(1992) Minimum audible movement angle in the horizontal plane as a function of stimulus frequency and bandwidth, source azimuth, and velocity. Journal of the Acoustical Society of America, 91(3):1624-1636. Clarey JC, Barone P and Imig TJ. (1992) Physiology of Thalamus and Cortex. In: Popper and Fay (eds), The Mammalian Auditory Pathway: Neurophysiology. New York: Springer-Verlag, pp. 232-334. Cynader MS and Regan D. (1978) Neurons in cat parastriate cortex sensitive to the direction of motion in three-dimensional space. Journal of Physiology, 274:549-69. Cynader MS and Regan D. (1982) Neurons in cat visual cortex tuned to the direction of motion in depth: effect of positional disparity. Vision Research, 22:967-982. Dodwell PC and Humphrey GK. (1990) A functional theory of the McCollough effect. Psychological Review, 97:78-89. Dong CJ, Swindale NV and Cynader MS. (1999) A contingent aftereffect in the auditory system. Nature Neuroscience, 10:863-865. Dong CJ, Swindale NV, Zakarauskas P, Hayward V and Cynader MS. (2000) The auditory motion aftereffect: its tuning and specificity in the spatial and frequency domains. Perception and Psychophysics, 62(5):1099-1111. Dursteler MR and Wurtz RH. (1988) Pursuit and optokinetic deficits following chemical lesions of cortical areas MT and MST. Journal of Neurophysiology, 60:940-965. Ehrenstein WH. (1994) Auditory aftereffects following simulated motion produced by varying interaural intensity or time. Perception, 23:1249-1255. Errington ML, Haas HL and Bliss TVP. (1988) Long-term potentiation in the recurrent inhibitory circuit of the hippocampus. In: Synaptic Plasticity in the Hippocampus (eds Haas HL and Buzsaki G), Berlin: Springer, pp. 42-44. Favreau OE, Emerson VF and Corballis MC. (1972) Motion perception: a colour contingent after-effect. Science, 176:78-79. Feddersen WE, Sandel TT, Teas DC and Jeffress LA. (1957) Localization of high-frequency tones. Journal of the Acoustical Society of America, 29:988-991.
Finney DJ. (1971) Probit analysis (3rd edn.), Cambridge Univ. Press. Frisby JP. (1979) Seeing. Illusion, Brain and Mind. Oxford, U.K., Oxford University Press. Gardner MB and Gardner RS. (1973) Problem of localization in the median plane: effect of pinnae cavity occlusion. Journal of the Acoustical Society of America, 53:400-408. Gates LW. (1934) The after-effect of visually observed movement. American Journal of Psychology, 46:34-46. Giaschi D, Douglas R, Marlin S and Cynader MS. (1993) The time course of direction-selective adaptation in simple and complex cells in cat striate cortex. Journal of Neurophysiology, 70(5):2024-2034. Glendenning KK and Masterton RB. (1983) Acoustic chiasm: Efferent projections of the lateral superior olive. Journal of Neuroscience, 3:1521-1537. Goddard GV, Kairiss EW, Abraham WC and Bilkey DK. (1988) Long-term potentiation of feed-forward inhibition in hippocampus: extracellular evidence. In: Synaptic Plasticity in the Hippocampus (eds Haas HL and Buzsaki G), Berlin: Springer, pp. 3-5. Goldberg JM and Brown PB. (1969) Response of binaural neurons of dog superior olivary complex to dichotic tonal stimuli: Some physiological mechanisms of sound localization. Journal of Neurophysiology, 32:613-636. Grantham DW and Wightman FL. (1979) Auditory motion aftereffects. Perception and Psychophysics, 26(5):403-408. Grantham DW. (1984) Interaural intensity discrimination: insensitivity at 1000 Hz. Journal of the Acoustical Society of America, 75(4):1191-1194. Grantham DW. (1986) Detection and discrimination of simulated motion of auditory targets in the horizontal plane. Journal of the Acoustical Society of America, 79(6):1939-1949. Grantham DW. (1989) Motion aftereffects with horizontal moving sound sources in the free field. Perception and Psychophysics, 45(2):129-136. Grantham DW. (1995) Spatial hearing and related phenomena. In: Moore (Ed.), Hearing. San Diego: Academic Press, pp. 297-339. Grantham DW.
(1998) Auditory motion aftereffects in the horizontal plane: the effects of spectral region, spatial sector, and spatial richness. Acustica/Acta Acustica, 84(2):337-347. Griffiths TD, Rees G, Rees A, Green G, Witton C, Rowe D, Buchel C, Turner R and Frackowiak R. (1998) Right parietal cortex is involved in the perception of sound movement in humans. Nature Neuroscience, 1:74-79. Griffiths TD, Rees A, Witton C, Shakir RA, Henning GB and Green GG. (1996) Evidence for a sound movement area in the human cerebral cortex. Nature, 383:425-427. Hafter ER and De Maio J. (1975) Difference thresholds for interaural delay. Journal of the Acoustical Society of America, 57:181-187. Hafter ER, Dye RH, Nuetzel JM and Aronow H. (1977) Difference thresholds for interaural intensity. Journal of the Acoustical Society of America, 61:829-834. Harris JD. (1972) A florilegium of experiments on directional hearing. Acta Otolaryngologica, Suppl. 298:1-26. Harris JD and Sergeant RL. (1971) Monaural-binaural minimum audible angles for a moving sound source. Journal of Speech and Hearing Research, 14:618-629. He S, Cohen ER and Hu X. (1998) Close correlation between activity in brain area MT/V5 and the perception of a visual motion aftereffect. Current Biology, 8:1215-1218. Hebb DO. (1949) The Organization of Behavior. New York: Wiley. Hebrank J and Wright D. (1974a) Are two ears necessary for localization of sound sources on the median plane? Journal of the Acoustical Society of America, 56:935-938. Hebrank J and Wright D. (1974b) Spectral cues used in the localization of sound sources on the median plane. Journal of the Acoustical Society of America, 56:1829-1834. Held R and Shattuck SR. (1971) Colour and edge sensitive channels in the human visual system: tuning for orientation. Science, 174:314-316. Henning GB. (1974) Detectability of interaural delay in high-frequency complex waveforms. Journal of the Acoustical Society of America, 55:84-90. Henning GB.
(1980) Some observations on the lateralization of complex waveforms. Journal of the Acoustical Society of America, 68:446-454. Hepler N. (1968) Color: A motion-contingent color aftereffect. Science, 162:376-377. Hershenson M. (1989) Duration, time constant and decay of the linear motion aftereffect as a function of inspection duration. Perception and Psychophysics, 45:251-257. Hershenson M. (1993) Linear and rotation aftereffects as a function of inspection duration. Vision Research, 33:1913-1919. Holland HC. (1957) The Archimedes spiral. Nature, 179:432-433. Holland HC. (1975) The Spiral After-Effect. Oxford, U.K., Pergamon. Hubel DH and Wiesel TN. (1959) Receptive fields of single neurons in the cat's striate cortex. Journal of Physiology, 148:574-591. Hubel DH and Wiesel TN. (1962) Receptive fields, binocular interactions and functional architecture in the cat's visual cortex. Journal of Physiology, 160:106-154. Hubel DH and Wiesel TN. (1965) Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat. Journal of Physiology, 28:229-249. Imig TJ, Bibikov NG, Poirier P and Samson FK. (2000) Directionality derived from pinna-cue spectral notches in cat dorsal cochlear nucleus. Journal of Neurophysiology, 83:907-925. Irvine DRF. (1992) Physiology of the Auditory Brainstem. In: Popper and Fay (eds), The Mammalian Auditory Pathway: Neurophysiology. New York: Springer-Verlag, pp. 153-231. Irvine DRF and Gago G. (1990) Binaural interaction in high-frequency neurons in inferior colliculus of the cat: Effects of variations in sound pressure level on sensitivity to interaural intensity differences. Journal of Neurophysiology, 63:570-591. Ivarsson C, de Ribaupierre Y and de Ribaupierre F. (1988) Influence of auditory localization cues on neuronal activity in the auditory thalamus of the cat. Journal of Neurophysiology, 59:586-606. Jeffress LA. (1948) A place theory of sound localization.
Journal of Comparative and Physiological Psychology, 41:35-39. Jones PD and Holding DH. (1975) Extremely long-term persistence of the McCollough effect. Journal of Experimental Psychology: Human Perception and Performance, 4:323-327. Kairiss EW, Abraham WC, Bilkey DK and Goddard GV. (1987) Field potential evidence for long-term potentiation of feed-forward inhibition in the rat dentate gyrus. Brain Research, 401:87-94. Kelso SR, Ganong AH and Brown TH. (1986) Hebbian synapses in hippocampus. Proceedings of the National Academy of Sciences USA, 83:5326-5330. King AJ, Moore DR and Hutchings ME. (1993) Topographic representation of auditory space in the superior colliculus of adult ferrets after monaural deafening in infancy. Journal of Neurophysiology, 71:182-194. Kistler DJ and Wightman FL. (1992) A model of head-related transfer functions based on principal components analysis and minimum-phase reconstruction. Journal of the Acoustical Society of America, 91:1637-1647. Klumpp R and Eady H. (1956) Some measurements of interaural time differences thresholds. Journal of the Acoustical Society of America, 28:859-864. Komatsu Y. (1994) Age-dependent long-term potentiation of inhibitory synaptic transmission in rat visual cortex. Journal of Neuroscience, 14:6488-6499. Krnjevic K and Zhao YT. (2000) 2-Deoxyglucose-induced long-term potentiation of monosynaptic IPSPs in CA1 hippocampal neurons. Journal of Neurophysiology, 83:879-887. Kuhn GF. (1987) Physical acoustics and measurements pertaining to directional hearing. In: Yost and Gourevitch (Eds), Directional Hearing. New York: Springer-Verlag, pp. 3-25. Lappin JS, Bell HH, Harm OJ and Kottas B. (1975) On the relation between time and space in the visual discrimination of velocity. Journal of Experimental Psychology, 1(4):383-394. Litovsky RY and Macmillan NA. (1994) Sound localization precision under conditions of the precedence effect: Effects of azimuth and standard stimulus.
Journal of the Acoustical Society of America, 96:752-758. Lovegrove WJ and Over R. (1972) Color adaptation of spatial frequency detectors in the human visual system. Science, 176:541-543. Lynch GS, Dunwiddie T and Gribkoff V. (1977) Heterosynaptic depression: a postsynaptic correlate of long-term potentiation. Nature, 266:737-739. Mackay DM and Mackay V. (1975) Dichoptic induction of McCollough-type effects. Quarterly Journal of Experimental Psychology, 27:225-233. Maffei L, Fiorentini A and Bisti S. (1973) Neural correlate of perceptual adaptation to gratings. Science, 182:1036-1038. Magee JC and Johnston D. (1997) A synaptically controlled, associative signal for Hebbian plasticity in hippocampal neurons. Science, 275:209-213. Markram H, Lubke J, Frotscher M and Sakmann B. (1997) Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science, 275:213-215. Marlin SG, Hasan SJ and Cynader MS. (1988) Direction-selective adaptation in simple and complex cells in cat striate cortex. Journal of Neurophysiology, 59:1314-1330. Mayhew JW and Anstis SM. (1972) Movement aftereffects contingent on color, intensity, and pattern. Perception and Psychophysics, 12:77-85. McCollough C. (1965) Colour adaptation of edge detectors in human visual system. Science, 149:115-116. McFadden D and Moffitt CM. (1977) Acoustic integration for lateralization at high frequencies. Journal of the Acoustical Society of America, 61:1604-1608. McFadden D and Pasanen EG. (1976) Lateralization at high frequencies based on interaural time differences. Journal of the Acoustical Society of America, 59:634-639. Middlebrooks JC. (1997) Spectral shape cues for sound localization. In: Gilkey and Anderson (Eds), Binaural and Spatial Hearing in Real and Virtual Environments. Mahwah, New Jersey: Lawrence Erlbaum Associates, Publishers, pp. 77-97. Middlebrooks JC, Makous JC and Green DM. (1989) Directional sensitivity of sound pressure levels in the human ear canal. 
Journal of the Acoustical Society of America, 86:89-108. Middlebrooks JC and Green DM. (1990) Directional dependence of interaural envelope delays. Journal of the Acoustical Society of America, 87:2149-2162. Middlebrooks JC and Green DM. (1991) Sound localization by human listeners. Annual Review of Psychology, 42:135-159. Mills AW. (1958) On the minimum audible angle. Journal of the Acoustical Society of America, 30:237-246. Mills AW. (1960) Lateralization of high-frequency tones. Journal of the Acoustical Society of America, 32:132-134. Mills AW. (1972) Auditory localization. In: Tobias (Ed.), Foundations of Modern Auditory Theory. New York: Academic Press. Vol. 2. Moulden B. (1980) After-effects and the integration of patterns of neural activity within a channel. Philosophical Transactions of the Royal Society London. Series B: Biological Sciences, 290:39-55. Musicant AD and Butler RA. (1984) The influence of pinnae-based spectral cues on sound localization. Journal of the Acoustical Society of America, 75:1195-1200. Musicant AD and Butler RA. (1985) Influence of monaural spectral cues on binaural localization. Journal of the Acoustical Society of America, 77:202-208. Newsome WT, Mikami A and Wurtz RH. (1986) Motion selectivity in macaque visual cortex. III. Psychophysics and physiology of apparent motion. Journal of Neurophysiology, 55:1340-51. Newsome WT and Pare EB. (1988) A selective impairment of motion perception following lesions of the middle temporal visual area (MT). Journal of Neuroscience, 8:2201-2211. Nuetzel JM and Hafter ER. (1981) Lateralization of complex wave-forms: spectral effects. Journal of the Acoustical Society of America, 69:1112-1118. Oldfield SR and Parker SP. (1984) Acuity of sound localization: a topography of auditory space. II. Pinna cues absent. Perception, 13:601-617. Oldfield SR and Parker SP. (1986) Acuity of sound localization: a topography of auditory space. III. Monaural hearing conditions. Perception, 15:67-81.
O'Shea RP and Crassini B. (1981) Interocular transfer of the motion after-effect is not reduced by rivalry. Vision Research, 21:801-804. Over R, Broerse J, Crassini B and Lovegrove W. (1973) Spatial determinants of the aftereffect of seen motion. Vision Research, 13:1681-1690. Pantle AJ and Sekuler RW. (1968) Velocity-sensitive elements in human vision: initial psychophysical evidence. Vision Research, 8(4):445-450. Perez Y, Chapman CA, Woodhall G, Robitaille R and Lacaille JC. (1999) Differential induction of long-lasting potentiation of inhibitory postsynaptic potentials by theta patterned stimulation versus 100-Hz tetanization in hippocampal pyramidal cells in vitro. Neuroscience, 90(3):747-757. Perrott DR, Buck V, Waugh W and Strybel TZ. (1979) Dynamic auditory localization: systematic replication of the auditory velocity function. Journal of Auditory Research, 19:277-285. Perrott DR, Costantino B and Ball J. (1993) Discrimination of moving events which accelerate or decelerate over the listening interval. Journal of the Acoustical Society of America, 93:1053-1057. Perrott DR and Marlborough K. (1989) Minimum audible movement angle: marking the end points of the path traveled by a moving sound source. Journal of the Acoustical Society of America, 85:1773-1775. Perrott DR and Musicant AD. (1977) Minimum auditory movement angle: binaural localization of moving sound sources. Journal of the Acoustical Society of America, 62:1463-1466. Perrott DR and Tucker J. (1988) Minimum audible movement angle as a function of signal frequency and the velocity of the source. Journal of the Acoustical Society of America, 83:1522-1527. Petersen SE, Baker JF and Allman JM. (1985) Direction-specific adaptation in area MT of the owl monkey. Brain Research, 346:146-150. Pickles JO. (1988) An Introduction to the Physiology of Hearing. London: Academic Press. Pike FG, Meredith RM, Olding AW and Paulsen O.
(1999) Rapid report: postsynaptic bursting is essential for 'Hebbian' induction of associative long-term potentiation at excitatory synapses in rat hippocampus. Journal of Physiology (London), 518:571-576. Plenge G. (1974) On the differences between localization and lateralization. Journal of the Acoustical Society of America, 56:944-951. Poirier P, Jiang H, Lepore F and Guillemot JP. (1997) Positional, directional and speed selectivities in the primary auditory cortex of the cat. Hearing Research, 113:1-13. Rauschecker JP and Harris LR. (1989) Auditory and visual neurons in the cat's superior colliculus selective for the direction of apparent motion stimuli. Brain Research, 490(1):56-63. Reale RA and Brugge JF. (1990) Auditory cortical neurons are sensitive to static and continuously changing interaural phase cues. Journal of Neurophysiology, 64:1247-1260. Regan D and Beverley KI. (1973a) Disparity detectors in human depth perception: evidence for directional selectivity. Science, 181:877-879. Regan D and Beverley KI. (1973b) Electrophysiological evidence for existence of neurons sensitive to direction of depth movement. Nature, 246:504-506. Regan D, Beverley KI and Cynader MS. (1979) Stereoscopic subsystems for position in depth and for motion in depth. Proceedings of the Royal Society of London. (Series B: Biological Science), 204(1157):485-501. Reinhardt-Rutland AH. (1992) Changing-loudness aftereffect following simulated movement: implications for channel hypotheses concerning sound level change and movement. Journal of General Psychology, 119:113-121. Reinhardt-Rutland AH. (1995) Evidence for frequency-dependent and frequency-independent components in increasing- and decreasing-loudness aftereffects. Journal of General Psychology, 122:59-68. Rhode WS and Greenberg S. (1992) Physiology of the Cochlear Nuclei. In: Popper and Fay (eds), The Mammalian Auditory Pathway: Neurophysiology. New York: Springer-Verlag, pp. 94-152. Richards W and Regan D.
(1973) A stereo field map with implications for disparity processing. Investigative Ophthalmology, 12:904-909. Roffler SK and Butler RA. (1968) Factors that influence the localization of sound in the vertical plane. Journal of the Acoustical Society of America, 43:1255-1259. Rose JE, Gross NB, Geisler CD and Hind JE. (1966) Some neural mechanisms in the inferior colliculus of the cat which may be relevant to localization of a sound source. Journal of Neurophysiology, 29:288-314. Rosenblum LD, Carello C and Pastore RE. (1987) Relative effectiveness of three stimulus variables for locating a moving sound source. Perception, 16(2):175-186. Roth GL, Aitkin LM, Andersen RA and Merzenich MM. (1978) Some features of the spatial organization of the central nucleus of the inferior colliculus of the cat. Journal of Comparative Neurology, 182:661-680. Saberi K and Perrott DR. (1990) Minimum audible movement angles as a function of sound source trajectory. Journal of the Acoustical Society of America, 88:2639-2644. Samson FK, Clarey JC, Barone P and Imig TJ. (1993) Effects of ear plugging on single-unit azimuth sensitivity in cat primary auditory cortex. I. Evidence for monaural directional cues. Journal of Neurophysiology, 70:492-511. Sarkins AJ. (1978) Psychoacoustical aspects of synthesized vertical local cues. Journal of the Acoustical Society of America, 63:1152-1165. Sastry BR, Goh JW and Auyeung A. (1986) Associative induction of posttetanic and long-term potentiation in CA1 neurons of rat hippocampus. Science, 232:988-990. Saul AB and Cynader MS. (1989a) Adaptation in single units in visual cortex: the tuning of aftereffects in the spatial domain. Visual Neuroscience, 2:593-607. Saul AB and Cynader MS. (1989b) Adaptation in single units in visual cortex: the tuning of aftereffects in the temporal domain. Visual Neuroscience, 2:609-620. Schwartz IR. (1992) The superior olivary complex and lateral lemniscus nuclei. In: D.B. Webster, A.N. Popper and R.R.
Fay (Eds.), The Mammalian Auditory Pathway: Neuroanatomy. Springer, New York, pp. 117-167. Sekuler RW and Ganz L. (1963) Aftereffect of seen motion with a stabilized retinal image. Science, 139:419-420. Sekuler RW and Pantle AJ. (1967) A model of after-effects of seen movement. Vision Research, 88:1-11. Semple MN and Aitkin LM. (1979) Representation of sound frequency and laterality by units in central nucleus of cat inferior colliculus. Journal of Neurophysiology, 42:1626-1639. Semple MN and Kitzes LM. (1987) Binaural processing of sound pressure level in the inferior colliculus. Journal of Neurophysiology, 57:1130-1147. Shaw EAG. (1965) Ear canal pressure generated by a free sound field. Journal of the Acoustical Society of America, 39:465-470. Shaw EAG. (1974) Transformation of sound pressure level from the free field to the eardrum in the horizontal plane. Journal of the Acoustical Society of America, 56:1848-1861. Shaw EAG and Teranishi R. (1968) Sound pressure generated in an external ear replica and real human ears by a nearby sound source. Journal of the Acoustical Society of America, 44:240-249. Shofner WP and Young ED. (1985) Excitatory/inhibitory response types in the cochlear nucleus. Relationships to discharge patterns and responses to electrical stimulation of the auditory nerve. Journal of Neurophysiology, 54:917-939. Shu ZJ, Swindale NV and Cynader MS. (1993) Spectral motion produces an auditory after-effect. Nature, 364(6439):721-723. Sovijarvi AR and Hyvarinen J. (1974) Auditory cortical neurons in the cat sensitive to the direction of sound source movement. Brain Research, 7:455-71. Spirou GA and Young ED. (1991) Organization of dorsal cochlear nucleus type IV response maps and their relationship to activation by bandlimited noise. Journal of Neurophysiology, 66:1750-1768. Spitzer MW and Semple MN. (1991) Interaural phase coding in auditory midbrain: influence of dynamic stimulus features. Science, 254:721-724. Spitzer MW and Semple MN.
(1993) Responses of inferior colliculus neurons to time-varying interaural phase disparity: effects of shifting the locus of virtual motion. Journal of Neurophysiology, 69(4):1245-63. Starr A and Don M. (1972) Responses of squirrel monkey (Saimiri sciureus) medial geniculate units to binaural click stimuli. Journal of Neurophysiology, 35:501-517. Stevens SS and Newman EB. (1936) Localization of actual sources of sound. American Journal of Psychology, 48:297-306. Stromeyer CF and Mansfield RW. (1970) Color aftereffects produced with moving edges. Perception and Psychophysics, 7:108-114. Strutt JW, Lord Rayleigh. (1907) On our perception of sound direction. Philos Mag, 13:214-232. Strybel TZ, Manligas CL and Perrott DR. (1992) Minimum audible movement angle as a function of the azimuth and elevation of the source. Human Factors, 34(3):265-275. Stumpf E, Toronchuk JM and Cynader MS. (1992) Neurons in cat primary auditory cortex sensitive to correlates of auditory motion in three-dimensional space. Experimental Brain Research, 88(1):158-68. Sutherland DP. (1991) A role of the dorsal cochlear nucleus in the localization of elevated sound sources. Abstracts of the Association for Research in Otolaryngology, 14:33. Sutherland NS. (1961) Figural aftereffects and apparent size. Quarterly Journal of Experimental Psychology, 13:222-228. Taylor MM. (1963) Tracking the decay of the after-effect of seen rotary movement. Perceptual and Motor Skills, 16:119-129. Tootell R, Reppas JB, Dale AM, Look RB, Sereno MI, Malach R, Brady TJ and Rosen BR. (1995) Visual motion aftereffect in human cortical area MT revealed by functional magnetic resonance imaging. Nature, 375:139-141. Trahiotis C and Bernstein LR. (1986) Lateralization of bands of noise and sinusoidally amplitude-modulated tones: effects of spectral locus and bandwidth. Journal of the Acoustical Society of America, 79:1950-1957. Tsuchitani C.
(1988) The inhibition of cat lateral superior olive unit excitatory responses to binaural tone-bursts. II. The sustained discharges. Journal of Neurophysiology, 59:184-211. Vaina LM, Lemay M, Bienfang DC, Choi AY and Nakayama K. (1990) Intact "biological motion" and "structure from motion" perception in a patient with impaired motion mechanism: A case study. Visual Neuroscience, 5:353-369. Von der Heydt RP, Hanny, et al. (1978) Movement aftereffects in the visual cortex. Archives Italiennes de Biologie, 116:248-254. Wade NJ. (1994) A selective history of the study of visual motion aftereffects. Perception, 23:1111-1134. Wade NJ, Swanston MT and Weert CM. (1993) On interocular transfer of motion aftereffects. Perception, 22:1365-1380. Warr WB. (1966) Fiber degeneration following lesions in the anterior ventral cochlear nucleus of the cat. Experimental Neurology, 14:453-474. Waugh W, Strybel TZ and Perrott DR. (1979) Perception of moving sounds: velocity discrimination. Journal of Auditory Research, 19:103-110. Wenzel EM, Wightman FL and Kistler DJ. (1991) Localization with non-individualized virtual acoustic display cues. In: Proceedings of CHI'91, ACM Conference on Computer-Human Interaction. New York: ACM Press, pp. 351-359. Wheatstone C. (1852) Contributions to the physiology of vision. II. On some remarkable, and hitherto unobserved, phenomena of binocular vision. Philosophical Transactions of the Royal Society London. Series B: Biological Sciences, 142:1-18. Wightman FL and Kistler DJ. (1993) Sound Localization. In: Human Psychophysics, Yost, Popper & Fay (Eds.), New York: Springer-Verlag, pp. 155-192. Wightman FL and Kistler DJ. (1989a) Headphone simulation of free-field listening. I: Stimulus Synthesis. Journal of the Acoustical Society of America, 85:858-867. Wightman FL and Kistler DJ. (1989b) Headphone simulation of free-field listening. II: Psychophysical validation. Journal of the Acoustical Society of America, 85:868-878.
Wightman FL and Kistler DJ. (1992) The dominant role of low-frequency interaural time differences in sound localization. Journal of the Acoustical Society of America, 91:1648-1661. Wigstrom H, Gustafsson B, Huang YY and Abraham WC. (1986) Hippocampal long-term potentiation is induced by pairing single afferent volleys with intracellularly injected depolarizing current pulses. Acta Physiol Scand, 126(2):317-319. Wilson HR. (1975) A synaptic model for spatial frequency adaptation. Journal of Theoretical Biology, 50:327-352. Wise LZ and Irvine DRF. (1983) Auditory response properties of neurons in deep layers of cat superior colliculus. Journal of Neurophysiology, 49:674-685. Wohlgemuth A. (1911) On the after-effect of seen movement. British Journal of Psychology, Monograph Supplement 1:1-117. Woodworth RS. (1938) Experimental Psychology, New York: Holt, Rinehart and Winston. Xie Z, Yip S, Morishita W and Sastry BR. (1995) Tetanus-induced potentiation of inhibitory postsynaptic potentials in hippocampal CA1 neurons. Canadian Journal of Physiology and Pharmacology, 73:1706-1713. Yin TCT and Chan JCK. (1988) Neural mechanisms underlying interaural time sensitivity to tones and noise. In: Edelman GM, Gall WE, Cowan WM (eds) Auditory Function. New York: Wiley, pp. 380-430. Yin TCT and Chan JCK. (1990) Interaural time sensitivity in medial superior olive of cat. Journal of Neurophysiology, 64:465-488. Yin TCT, Chan JCK and Carney LH. (1987) Effects of interaural time delays of noise stimuli on low-frequency cells in the cat's inferior colliculus. III. Evidence for cross-correlation. Journal of Neurophysiology, 58:562-583. Yin TCT, Chan JCK and Irvine DRF. (1986) Effects of interaural time delays of noise stimuli on low-frequency cells in the cat's inferior colliculus. I. Responses to wideband noise. Journal of Neurophysiology, 55:280-300. Yin TCT, Hirsch JA and Chan JCK. (1985) Responses of neurons in the cat's superior colliculus to acoustic stimuli. II.
A model of interaural intensity sensitivity. Journal of Neurophysiology, 53:746-758. Yin TCT and Kuwada S. (1983) Binaural interaction in low-frequency neurons in inferior colliculus of the cat. II. Effects of changing rate and direction of interaural phase. Journal of Neurophysiology, 50:1000-1019. Yin TCT and Kuwada S. (1984) Neuronal mechanisms of binaural interaction. In: Edelman GM, Gall WE, Cowan WM (eds). Dynamic Aspects of Neocortical Function. New York: Wiley, pp. 263-313. Yin TCT, Kuwada S and Sujaku Y. (1984) Interaural time sensitivity of high-frequency neurons in the inferior colliculus. Journal of the Acoustical Society of America, 76:1401-1410. Yost WA. (1974) Discrimination of interaural phase differences. Journal of the Acoustical Society of America, 55:1299-1303. Yost WA. (1994) Fundamentals of hearing: an introduction. San Diego: Academic Press. pp. 169-180. Yost WA and Dye RH. (1988) Discrimination of interaural differences of level as a function of frequency. Journal of the Acoustical Society of America, 83:1846-1851. Yost WA and Dye RH. (1991) Properties of sound localization by humans, in Neurobiology of Hearing: The Central Nervous System. Altschuler R, Hoffman D, Bobbin R and Clopton B (eds), New York: Raven Press. Yost WA, Wightman FL and Green DM. (1971) Lateralization of filtered clicks. Journal of the Acoustical Society of America, 50:1526-1531. Young ED and Brownell WE. (1976) Responses to tones and noise of single cells in dorsal cochlear nucleus of unanesthetized cats. Journal of Neurophysiology, 39:282-300. Young ED, Spirou GA, Rice JJ and Voight HF. (1992) Neural organization and responses to complex stimuli in the dorsal cochlear nucleus. Philosophical Transactions of the Royal Society of London, Section B, 336:407-413. Zeki SM. (1974) Functional organization of a visual area in the posterior bank of the superior temporal sulcus of the rhesus monkey. Journal of Physiology, 236:549-573. Zihl J, von Cramon D and Mai N.
(1983) Selective disturbance of movement vision after bilateral brain damage. Brain, 106:311-340. Zwislocki J and Feldman RS. (1956) Just noticeable differences in dichotic phase. The Journal of the Acoustical Society of America, 28:860-864.

Appendix 2

A Study of the Cross-modal Contingent Aftereffect

Adaptation aftereffects have been observed in both the visual and the auditory systems. The spiral visual motion aftereffect (Wade 1994) and the changing-intensity auditory aftereffect (Reinhardt-Rutland 1992; 1995) are two examples. After an observer views a spiral that rotates clockwise and expands for about half a minute, a stationary spiral appears to rotate counter-clockwise and contract. In audition, after about two minutes of listening to a sound increasing in intensity, a sound of constant intensity appears to decrease in intensity. To investigate whether a visual-auditory cross-modal contingent aftereffect exists, an experiment was designed in which the direction of change in sound intensity was paired with the direction of spiral movement.

Two subjects (A.C. and C.D.) with corrected-to-normal vision and normal hearing were tested. During the experiment, the subject was instructed to gaze steadily at the center of a rotating spiral (Fig. A2.1) while listening to a sound (broadband noise, 0.5-14 kHz) of changing intensity.

Fig. A2.1 Plateau's spiral used in this experiment.

The spiral was presented on the screen of a computer monitor (17" Trinitron, Sony) 1 meter directly in front of the subject, and the sound was emitted from a loudspeaker (LSC-150, Labtec) mounted on the computer monitor.
Adaptation lasted about six minutes. When the spiral was rotating clockwise and expanding (32 rotations/min), the intensity of the sound was increasing (rate: 15 dB/sec; range: 55-100 dB SPL); when the spiral was rotating counter-clockwise and contracting (32 rotations/min), the intensity of the sound was decreasing (rate: 15 dB/sec; range: 55-100 dB SPL). The two stimulus patterns alternated every three seconds. In the control condition, the auditory adapting stimuli were the same as those in the adaptation condition, but the spiral was stationary rather than rotating.

Following adaptation, the influence of spiral motion on the perception of change in sound intensity was tested. In each test trial (1 second in duration), the spiral was rotating either clockwise or counter-clockwise (32 rotations/min), while the sound intensity was increasing or decreasing at one of three rates (5, 10 and 15 dB/sec). The subject pressed one of two buttons to indicate the direction of change in sound intensity. A total of ninety-six test trials was given during each run. In one half of the test trials the spiral was expanding, and in the other half it was contracting. Probit analysis was used to estimate the point of subjective stationarity (PSS) of sound intensity from the obtained psychometric functions. If there existed a changing-intensity auditory aftereffect contingent on the direction of spiral motion, the PSS of sound intensity would shift toward the high end when the spiral was expanding and toward the low end when the spiral was contracting. Each subject was tested twice, with a separation of three days between the two runs. No such cross-modal contingent aftereffect was observed.

In a separate experiment, the subject adapted to the same adapting stimuli as those used in the experiment described above, and the reverse effect, i.e. the effect of change in sound intensity on the visual perception of the direction of spiral motion, was tested. The results showed no measurable visual-auditory contingent aftereffect in either of the two subjects tested.
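
The PSS estimation described above can be illustrated with a short sketch. The text states only that probit analysis was applied to the psychometric functions, so the details below are assumptions: the rate of intensity change is treated as a signed variable (negative for decreasing intensity), the proportion of "increasing" responses is modeled as a cumulative Gaussian of that rate, and the fit is obtained by maximum likelihood using Fisher scoring. The function names `fit_probit` and `pss`, and the response counts, are hypothetical and are not data from this experiment.

```python
from statistics import NormalDist


def fit_probit(levels, n_trials, n_up, iterations=50):
    """Fit P("increasing") = Phi(a + b * x) by Fisher scoring; return (a, b).

    levels   : signed rates of intensity change (dB/sec)
    n_trials : number of trials presented at each rate
    n_up     : number of "increasing" responses at each rate
    """
    std = NormalDist()  # standard normal: Phi is std.cdf, density is std.pdf
    a, b = 0.0, 0.0
    for _ in range(iterations):
        s_a = s_b = 0.0            # score vector (gradient of log-likelihood)
        i_aa = i_ab = i_bb = 0.0   # Fisher information matrix entries
        for x, n, k in zip(levels, n_trials, n_up):
            eta = a + b * x
            # Clamp the probability to avoid division by zero.
            p = min(max(std.cdf(eta), 1e-9), 1.0 - 1e-9)
            phi = std.pdf(eta)
            w = phi / (p * (1.0 - p))
            r = (k - n * p) * w
            s_a += r
            s_b += r * x
            info = n * phi * w
            i_aa += info
            i_ab += info * x
            i_bb += info * x * x
        # Solve the 2x2 system (information matrix) * step = score.
        det = i_aa * i_bb - i_ab * i_ab
        da = (i_bb * s_a - i_ab * s_b) / det
        db = (i_aa * s_b - i_ab * s_a) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b


def pss(a, b):
    """Rate of change at which "increasing" is reported on half the trials."""
    return -a / b


# Hypothetical counts for one spiral direction (illustration only).
rates = [-15, -10, -5, 5, 10, 15]      # dB/sec, negative = decreasing
trials = [8, 8, 8, 8, 8, 8]
increasing = [0, 1, 2, 5, 7, 8]
a_hat, b_hat = fit_probit(rates, trials, increasing)
print("PSS estimate (dB/sec):", round(pss(a_hat, b_hat), 2))
```

Under these assumptions, a contingent aftereffect would appear as a difference between the PSS values fitted separately to the expanding-spiral and contracting-spiral trials.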