UBC Theses and Dissertations

Selective attention in object substitution masking Tata, Matthew S. 1999

SELECTIVE ATTENTION IN OBJECT SUBSTITUTION MASKING

by

MATTHEW S. TATA

B.Sc., Cornell University, 1995

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE STUDIES
(School of Medicine, Program in Neuroscience)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
June 1999
© Matthew S. Tata, 1999

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

ABSTRACT

Backward masking in visual perception occurs when the visibility of a brief target stimulus is decreased by interference from another visual object appearing at the same location. Of particular interest is a type of backward masking known as object substitution, which has been shown by Di Lollo, Enns, and Rensink (1999) to be modulated by factors that also influence visual attention. The results of three experiments are presented here that corroborate their findings. It is shown that object substitution increases with the number of distractor items in the display, decreases when the target location is validly pre-cued, and is eliminated when the target can be located rapidly in the display. A theory accounting for this phenomenon has been proposed by Di Lollo and colleagues. Implicit in the theory is the notion that the mask becomes the focus of object perception mechanisms in the brain.
Recent advances in the understanding of the neural mechanisms mediating attentional selection of visual objects are consistent with this theory. It is argued on theoretical grounds that attentional selection of the mask is likely to be involved in object substitution. Finally, the results of three experiments are presented that lend support to this claim. It is shown that object substitution is eliminated by the presence of distractors that divert attention from the mask at the target location. Furthermore, a mask that precedes the target in the display sequence is shown to be ineffective unless its contours are hidden in the background during the preview period. This is consistent with the findings of recent investigations into the attentional capture phenomenon and suggests that the mask may capture attention during visual search for the target.

TABLE OF CONTENTS

Abstract
List of Figures
Acknowledgements
Introduction and Background
    Spatial and Temporal Factors
    Visual Attention in Backward Masking
        Overview of Attention
        Selective Attention and Object Substitution
Experiments 1, 2, and 3
    Experiment 1
        Method
        Results
        Discussion
    Experiment 2
        Method
        Results
        Discussion
    Experiment 3
        Method
        Results
        Discussion
    Conclusion: Experiments 1, 2, and 3
A Theory of Object Substitution
Experiments 4, 5, and 6
    Experiment 4
        Method
        Results
        Discussion
    Experiment 5
        Method
        Results
        Discussion
    Experiment 6
        Method
        Results
        Discussion
Conclusions
References

LIST OF FIGURES

Figure 1. Typical masking function
Figure 2. Schematic of stimulus used in Expt. 1
Figure 3. Individual results of Expt. 1
Figure 4. Mean results of Expt. 1
Figure 5. Schematic of stimulus used in Expt. 2
Figure 6. Individual results of Expt. 2
Figure 7. Mean results of Expt. 2
Figure 8. Schematic of stimulus used in Expt. 3
Figure 9. Results of Expt. 3
Figure 10. Schematic of stimulus used in Expt. 4
Figure 11. Results of Expt. 4
Figure 12. Schematic of stimulus used in Expt. 5
Figure 13. Results of Expt. 5
Figure 14. Schematic of stimulus used in Expt. 6
Figure 15. Results of Expt. 6

ACKNOWLEDGEMENTS

I would like to thank Debbie Giaschi, for encouraging me to take on a project which I was passionate about; Vince Di Lollo, for prompting me to "engage your brain" and imparting to me the skills and expertise to get off the ground; Bob Dougherty, for being my intellectual big brother and personal programming guru; Catherine, and everyone else in the lab for their assistance and support; my parents for getting me here and Candace for giving me a good reason to stay; and finally my guitar and skis...they know what I mean.

INTRODUCTION AND BACKGROUND

Detection, discrimination or identification of a visual event (a target stimulus) can be hindered by another visual event (a masking stimulus) occurring in close spatial and temporal proximity. Paradoxically, visibility of a visual target may be reduced by the presence of a mask several tens to hundreds of milliseconds after the target has vanished. Furthermore, backward masking can occur despite non-overlapping contours of target and mask. This type of backward masking has been called metacontrast for many years (Breitmeyer, 1984) and is thought to involve low-level visual processes based on contour interactions. Recently, however, Enns and Di Lollo (1997) described a related form of backward masking, termed object substitution, which is critically dependent on visual attention. Subsequent studies (Di Lollo, Enns, & Rensink, 1999) have shown that, like other attentional phenomena, object substitution is modulated by factors such as the number of items in the display (set-size), pre-cueing of the target location, and the structural similarity between target and non-target items.
The main goal of this thesis is to further characterize the interaction between visual selective attention and the object substitution phenomenon. This goal is accomplished in several steps. First, the recent description of object substitution is placed in the context of previous research in backward masking. Next, the research of Di Lollo et al. (1999) is discussed and the results of three experiments are reported which corroborate and clarify their findings.

Inherent in many theories of perception is what Weisstein, Ozog, and Szoc (1975) refer to as a "linking hypothesis": a theoretical bridge between physiological and perceptual phenomena. Theories of backward masking have ranged from inhibition of slow sustained neurons by fast transient neurons early in the visual system (Breitmeyer & Ganz, 1976; Weisstein et al., 1975) to proposals that masking occurs at higher level cognitive stages of processing (Di Lollo, Enns, & Rensink, 1999; Enns & Di Lollo, 1997; Uttal, 1970). A secondary goal of this thesis is to relate the object substitution phenomenon to recent advances in the understanding of the response properties of cells in the primate visual cortex. With plausible neural mechanisms in mind, it will be proposed that object substitution occurs as a result of the attentional selection of the mask while the target remains unattended. This interpretation leads to the prediction that manipulations that are likely to decrease the salience of the mask will reduce or eliminate object substitution.
The results of three experiments that support this interpretation will be presented.

Spatial and Temporal Factors

In visual masking, three critical parameters determine the strength of the effect: the temporal relationship between target and mask, known as stimulus onset asynchrony or SOA (Breitmeyer & Ganz, 1976; Di Lollo, Bischof, & Dixon, 1993; Kahneman, 1968), the spatial relationship between target and mask contours (Breitmeyer & Ganz, 1976), and the number of non-target stimuli (Averbach & Coriel, 1961; Di Lollo, Enns, & Rensink, 1999; Enns & Di Lollo, 1997; Spencer & Shuntich, 1970; Tata, Di Lollo, & Giaschi, 1998).

The variety of masking stimuli may be categorized in terms of both spatial and temporal parameters. In terms of temporal parameters, a mask which precedes the target has a negative SOA and is known as a forward mask, while a mask which appears subsequent to the target has a positive SOA and is called a backward mask. In terms of spatial parameters, the contours of the mask may spatially overlap those of the target, or be close to but not overlapping the target. Spatial and temporal parameters of masking stimuli interact to yield various patterns of masking effects. Masking by overlapping contours (noise) is maximally effective when the mask and target are presented simultaneously (SOA = 0) and less effective with increasing positive or negative SOAs (Breitmeyer, 1984; Kahneman, 1968). Masking by non-overlapping contours may lead to either metacontrast or object substitution or both depending on attentional factors such as set-size. Both metacontrast and object substitution result in exclusively backward masking and tend to be maximally effective when the mask is presented several tens of milliseconds after the target. This pattern of masking over time yields a characteristic U-shaped function relating sensitivity or accuracy to SOA (Breitmeyer & Ganz, 1976), as depicted in Figure 1.

Figure 1. A typical U-shaped backward masking function with Stimulus Onset Asynchrony (SOA, in msec) on the x axis and ability to detect a feature of the target on the y axis. Maximal masking is evident at intermediate SOAs.

Two general mechanisms of masking, integration and interruption, have been described by Kahneman (1968). Attributes of a visual object are encoded by activity among neurons in the visual system. This activity can be thought of as a neural representation of that object. Under certain conditions, the target and mask may be integrated, effectively adding noise to the target representation, and reducing visibility. Masking by integration is predominant when stimuli overlap both spatially and temporally. Increasing SOA in either the positive or negative direction decreases the likelihood that the target and mask are integrated and thus decreases the masking effect. Indeed, masking functions for spatially overlapping target and mask show exactly this symmetric pattern (Breitmeyer & Ganz, 1976; Spencer & Shuntich, 1970).

Alternatively, the mask may interrupt visual processing of the target (Breitmeyer & Ganz, 1976; Enns & Di Lollo, 1997; Kahneman, 1968). This account of backward masking supposes that perceptual mechanisms engaged in processing of the target can be halted or disrupted and does not entail a degradation of the target representation itself. In principle, this type of masking could occur with stimuli that do not overlap. For this reason, interruption has formed the basis for theories of metacontrast and object substitution. Since target and mask do not overlap in these displays, integration does not contribute to the masking effect. Consequently, no masking would be expected at zero SOA. Furthermore, masking by interruption would be expected to yield a U-shaped masking function with peak masking occurring at an intermediate SOA, after the opportunity for integration has passed but before the target has been completely processed.
This pattern is observed with both metacontrast and object substitution stimuli (Breitmeyer & Ganz, 1976; Enns & Di Lollo, 1997; Tata, Di Lollo, & Giaschi, 1998).

The spatial and temporal factors involved in metacontrast masking have been of interest in the literature over the past several decades. Nevertheless, the importance of the complexity of the display has not gone unnoticed. Several studies have shown that the masking effect obtained for a given target-mask configuration and SOA depends on the number of items (set-size) in the display. For example, Averbach and Coriel (1961) examined the effect of set-size on subjects' ability to identify a single item in an array of letters. The target letter was singled out by the presentation of either a surrounding ring or an adjacent bar at variable SOAs. In the single item (set-size = 1) condition subjects performed at 100% accuracy. However, in the larger set-size condition (set-size = 4) they found that letter identification accuracy followed a U-shaped dependence on SOA. At shorter and longer SOAs the ring and bar probes yielded similar performance while at intermediate SOAs, the ring probe produced a pronounced decrement in performance relative to the bar probe. The authors proposed that the target letter was replaced in a short-lived sensory memory by the ring probe and they termed this phenomenon erasure. Others have interpreted this result in terms of metacontrast masking because of the spatial configuration of the target and surrounding ring (Breitmeyer & Ganz, 1976; Kahneman, 1968; Spencer & Shuntich, 1970). However, the implication of this interpretation is that metacontrast is somehow modulated by set-size, a point which was not explicitly accounted for in subsequent theoretical treatments of metacontrast masking (Breitmeyer & Ganz, 1976; Bridgeman, 1971; Weisstein, Ozog, & Szoc, 1975).

Spencer and Shuntich (1970) showed evidence for both interruption and integration views of masking.
Furthermore, it was shown that the relative contributions of the two forms of masking depended on the number of items in the visual display. Their procedure involved using a pattern mask that spatially overlapped the target. This mask was presented at several different positive or negative SOAs. The brightness of the mask relative to the target was varied in each of two set-size conditions (single item and 12 item displays). In the single item condition, the authors found that a dim mask produced little or no masking while brighter masks produced strong forward and backward masking with a maximum effect at an SOA of zero. This result is consistent with an integration theory of masking as described above. However, the dim mask, which caused no masking in the single target condition, yielded a U-shaped masking function in the 12 item condition. This result is not consistent with an integration theory of masking because visibility of the target was unaffected at zero SOA; the mask was shown to be too dim to cause significant degradation of the target in the single item condition. The addition of further items to the display caused the target to be more vulnerable to masking. The authors took this to be evidence for an interruption mechanism, and suggested that the greater complexity of the 12 item display delayed processing of the target from an iconic form of memory, in which it was susceptible to masking, to a more permanent form of memory.

It is worth noting that manipulations of set-size are a common technique in demonstrating differences between focused and divided attention (Treisman & Gelade, 1980; Wolfe, Cave, & Franzel, 1989). As distractor items are introduced into a display, locating and selectively attending to the target requires more time. However, both Averbach and Coriel (1961) and Spencer and Shuntich (1970) dealt with the influence of set-size on backward masking in terms of a limited capacity of information processing in the visual system.
Visual selective attention did not play an explicit role in their interpretations. Breitmeyer and Ganz (1976) bridged this gap by proposing that selective attention facilitates the transfer of target information from iconic storage, where it is susceptible to masking, to a more permanent categorical store. According to Breitmeyer and Ganz, in complex displays, a "sequential scanning" must occur to enable the system to "read-out or abstract" target information from iconic or sensory memory. This assertion carries with it the implication that backward masking occurs (or is more effective) when attention is divided rather than focused, and is consistent with the set-size effects observed by Averbach and Coriel (1961) and Spencer and Shuntich (1970). However, while their theory as a whole was strongly rooted in neurophysiological phenomena, they did not explicitly incorporate selective attention into the framework of their model. Much progress has been made in the understanding of attentional mechanisms, both psychological and physiological, since the earlier work on metacontrast masking. We are in a position now to re-cast the effect of set-size in the framework of visual attention.

Visual Attention in Backward Masking

Applying the concepts of visual attention to backward masking is a relatively new approach to explaining this phenomenon. This approach opens up the study of backward masking to analysis in terms of visual search, attentional capture, and the physiological correlates of selective attention. Using the concepts of attention also incorporates a range of complexities not previously considered in masking studies. A brief review of the relevant principles of visual attention follows.

Overview of Attention

The term attention has been used to describe a number of phenomena, but its use in the following discussion is somewhat specific.
In the context of object substitution, selective attention should be taken to mean a mechanism or process of selecting one or several specific visual objects for enhanced processing to the exclusion of unattended items (Chelazzi, 1995; Yantis, 1996). For example, individual features such as color, orientation, motion, or luminance may be encoded at early stages of processing (Treisman, Cavanagh, Fisher, Ramachandran, & von der Heydt, 1990) but object identification often requires integrating discontinuities in several of these dimensions across the visual field (Treisman et al., 1990). This integration aspect of object perception is thought to be among the perceptual processes that may be enhanced by selective attention (Treisman et al., 1990; Treisman & Gelade, 1980).

Attention can be focused on a particular location (Egeth & Yantis, 1997; Posner & Petersen, 1990) or object (Duncan, 1984), or it may be disengaged from one object or location and shifted across the visual field to engage another object or location. Attention can also be divided across several objects or visual locations as in a multi-element display. Scanning a room for a familiar face is a common example of the dynamic nature of attention. The location or object to which attention will be directed is determined by several factors. There may be a cue that tells an observer to shift attention to a particular location. Alternatively, the observer may be required to shift attention systematically as in searching for a face in a crowd.

Control of attention is often discussed in the framework of visual search tasks in which a subject must locate a target among distractors. This framework is of particular relevance to object substitution. The time required to focus attention on the target is indirectly measured as the reaction time to detect or identify that target. Two general patterns emerge in such experiments.
For some searches, reaction time to detect or identify a target among distractors is independent of the number of items (set-size) in the display. In other words, visual search for the target is not influenced by the presence of distractors. Generally this occurs when the features of the target (a singleton) contrast significantly with the distractors (e.g. a single red target among green distractors) (Treisman & Gelade, 1980; Wolfe et al., 1989). A variety of terms for this kind of search appear in the literature. The words parallel and pre-attentive have been used to convey the idea that the features of individual items in the display can be examined simultaneously by the visual system without requiring focused attention (Treisman et al., 1995; Treisman & Gelade, 1980). Also in common use is the term pop-out to describe the subjective impression that the target of a parallel search seems to be immediately salient and stands out from the distractors. Finally, parallel visual searches can be described in terms of efficiency. An efficient search is one in which the slope of the function relating search time to number of distractors is nearly zero (Wolfe, 1998). The efficiency with which a visual search can be accomplished depends on the stimuli and there seems to be a continuous distribution of search slopes for different stimulus configurations.

At the other extreme are searches in which the target is defined by a conjunction of several features (e.g. red and horizontal) or by its lack of a feature possessed by the distractors (e.g. a complete ring among rings with gaps). In this case, the set-size of the display is of critical importance to the efficiency of the search. Reaction times to detect the target increase with increasing set-size. This type of search has been described as requiring a serial attentional scanning of items (Treisman & Gelade, 1980).
Other accounts suggest that the term serial search may be misleading and that both serial and parallel processes are at work in locating a conjunction target (Wolfe et al., 1989). Attention may also be controlled by a cue that indicates the location or object to which attention should be directed (Posner & Petersen, 1990). In this regard, the effects of focused attention can be seen by comparing subjects' performance on a target detection, discrimination, or identification task in which the location of the target is correctly indicated by a valid cue versus trials with an incorrect or invalid cue. Performance is enhanced on validly cued trials relative to invalidly cued trials (Posner & Petersen, 1990). The focusing of attention on an object or location depends jointly on features in the stimulus (bottom-up or stimulus driven control) and the attentional state or intent of the observer (top-down or goal-driven control) (Egeth & Yantis, 1997; Yantis, 1996). Thus attention may be involuntarily captured by one object or location in space or it may be voluntarily shifted. In either case, attentional control relies on an interaction between stimulus-driven and goal-driven control mechanisms. Yantis (1996) summarized the extensive literature by outlining various degrees of attentional capture. One type of capture, referred to as strongly involuntary, elicits an orienting response regardless of the attentional set of the observer (e.g. a loud noise eliciting a startle response). At the other extreme, a pop-out target may be thought of as a weak form of capture in that attention is quickly shifted to it under some conditions. A third type of capture is based on evidence from studies by Yantis and Jonides (1984) in which subjects had to detect the presence of a particular letter among distractor letters. In each trial, the target letter could either appear abruptly at the start of the trial or be revealed from a set of block figure 8s (like the display of a digital clock). 
When the target letter appeared as an abrupt onset, detection was rapid and unaffected by the number of items in the display. When the target was revealed from an already displayed object, however, a serial search was required. Yantis (1996) suggested that this constitutes a type of capture which he called weakly involuntary. During visual search, attention is first directed to the new object and then, if necessary, to old objects. This capture can be prevented, however, if the subject actively focuses attention elsewhere in the display (Theeuwes, 1991).

Selective Attention and Object Substitution

Object substitution as a distinct type of masking was first proposed by Enns and Di Lollo (1997). They reported that a target may be masked by four small dots arranged in a notional square surrounding the target. This stimulus configuration lacks the strength of mask contour thought necessary for metacontrast masking (Breitmeyer & Ganz, 1976; Enns & Di Lollo, 1997). This is evidenced by the fact that visibility of a single target is not affected by the four-dot mask at any SOA. However, as was the case for the ineffective low-energy pattern mask used by Spencer and Shuntich (1970), the four-dot mask produces robust masking in multi-element displays. In their discussion, Enns and Di Lollo (1997) proposed a hypothesis based on selective attention. The authors suggested that masking of this sort occurs because spatial uncertainty precludes focusing of attention on the target prior to the appearance of the mask. In the absence of focused attention the target is encoded by the visual system with what they refer to as "low spatio-temporal resolution" and may be subsequently replaced in "object perception mechanisms" when attention is directed to the mask. Their attentional hypothesis of object substitution is a departure from previous accounts of masking which do not include a role for selective attention as a primary factor in the masking effect.
EXPERIMENTS 1, 2, AND 3: VISUAL ATTENTION MODULATES OBJECT SUBSTITUTION

Di Lollo et al. (1999) reported a series of experiments that substantiate an attentional hypothesis of object substitution. If object substitution only occurs at unattended locations then manipulations of the display which enable subjects to rapidly focus attention on the target location are expected to yield less masking. Di Lollo et al. (1999) tested this hypothesis using three strategies to manipulate the spatial focus of attention: set-size, spatial cueing, and target pop-out. One of the primary goals of this thesis was to corroborate and add to their findings. The following three experiments were presented at the 1998 Annual Meeting of the Association for Research in Vision and Ophthalmology (Tata, Di Lollo, & Giaschi, 1998). Each was intended to complement one of the experiments of Di Lollo et al.

Experiment 1 - Object Substitution is Modulated by Set Size

As we have seen, set-size has been used in previous masking studies to demonstrate an influence of display complexity on backward masking. Enns and Di Lollo (1997) suggested that the set-size effect resulted from an inability to rapidly focus attention on the target. Di Lollo et al. (1999) demonstrated the set-size effect in their Experiment 1. Subjects viewed a display consisting of 2 sequential frames. The first frame was 10 ms in duration and contained either 1, 2, 4, 8, or 16 rings, each having a 90° gap oriented up, down, left, or right (similar to Landolt Cs). One C (the target) was singled out by a slightly larger concentric ring without a gap. The subject's task was to report the orientation of this target C. The additional Cs served as distractors to prevent focusing of attention on the target. The second frame contained only the complete ring to act as a mask. This mask frame was presented for 0, 40, 80, 160, or 320 ms. The paradigm employed by Di Lollo et al.
differed from conventional metacontrast paradigms in that the SOA between target and mask was always zero and no blank interstimulus interval (ISI) occurred between presentation of the two frames. That is, target and mask appeared together, but the mask outlived the target by a variable duration. This paradigm is referred to as common-onset and has been shown to produce masking in previous experiments (Di Lollo et al., 1993). As expected, Di Lollo et al. (1999) found a significant interaction between mask duration and set-size. Accuracy in discriminating the orientation of the target was a function of the duration of the mask, but the decrement in performance with increasing mask duration was greater with the larger set-sizes.

Di Lollo et al. (1999) showed a strong set-size effect in the common-onset paradigm only. Although Enns and Di Lollo (1997) used an SOA paradigm, they tested only two different set-sizes. Experiment 1 was designed to demonstrate a set-size effect in the more conventional SOA paradigm over a range of different set-sizes. Experiment 1 is also related to the experiments of Averbach and Coriel (1961) described above. The significant differences are the nature of the task (letter identification vs. orientation discrimination) and the number of set-sizes examined.

Method

Four subjects with normal or corrected to normal vision viewed a two frame display similar to that used by Di Lollo et al. (1999). Stimuli were presented on a Tektronix 608 oscilloscope with a fast plotting buffer. Background illumination of the screen was provided by two fluorescent lamps and was attenuated to 10 cd/m² by neutral density filters placed over each lamp. Luminance of the background and stimuli was calibrated with a Minolta LS100 photometer. Frame one contained 1, 2, 4, or 8 Cs arranged randomly on a notional circle of 2 degrees radius around a central fixation point (see Figure 2).
The Cs were arranged such that the target appeared at every location a constant number of times within a block and could never appear at the same location on successive trials. The Cs were randomly oriented up, down, left, or right and one of the Cs was singled out by a radial line extending from fixation. The subject's task was to report the orientation of this target C. Frame two contained a ring that was concentric with the target and served as a mask. The mask lasted 10 ms and appeared at an SOA of 0, 40, 80, 120, 160, or 320 ms. Each subject was tested on each of 4 (set-sizes) x 7 (SOAs) conditions. Trials were blocked within conditions into blocks of 48 trials each. Each block was tested 4 times yielding 192 trials per subject per data point. The Cs were 47 min of visual angle in diameter and the mask was 56 min in diameter, yielding a target-mask spacing of 4.5 min. This target-mask spacing was adopted to allow for nearly perfect performance when set-size equaled one. Because metacontrast masking is known to occur in single item displays, this procedure ensured that the target-mask spacing was too large to produce metacontrast. Consequently, all masking effects could be attributed primarily to object substitution rather than metacontrast.

Figure 2. Schematic of the stimulus used in Experiment 1. Frame 1 contained 1, 2, 4, or 8 Cs, one of which is singled out as the target by a radial line. The task was to discriminate the orientation of the target C. A masking ring slightly larger than the target appeared at the target location in Frame 2 after an SOA of 0-320 ms.

Results

Masking functions (accuracy as a function of SOA) for each of the four subjects are shown in Figure 3. Consistent with the findings of Di Lollo et al. (1999) and Averbach and Coriel (1961), masking increased as set-size increased.
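
The design figures stated in the Method can be checked with a short calculation. The sketch below is written in Python with illustrative names that do not appear in the original study. It recomputes the number of trials per data point and the target-mask spacing from the stated values. It also gives the degrees of freedom that a set-size x SOA interaction would have if both factors were treated as within-subject factors in a repeated-measures analysis of variance; that choice of analysis is an assumption, since the Method does not spell it out.

```python
# Values are taken from the Method of Experiment 1; all names are illustrative.
N_SUBJECTS = 4
N_SET_SIZES = 4                  # displays of 1, 2, 4, or 8 Cs
N_SOAS = 7                       # number of SOA conditions stated in the Method
TRIALS_PER_BLOCK = 48
BLOCK_REPEATS = 4
TARGET_DIAMETER_ARCMIN = 47.0
MASK_DIAMETER_ARCMIN = 56.0


def trials_per_data_point(trials_per_block, repeats):
    """Trials contributing to one set-size x SOA cell for one subject."""
    return trials_per_block * repeats


def target_mask_spacing(target_diameter, mask_diameter):
    """Gap between concentric target and mask contours (half the diameter difference)."""
    return (mask_diameter - target_diameter) / 2.0


def interaction_df(levels_a, levels_b, n_subjects):
    """Effect and error degrees of freedom for an A x B interaction.

    Assumes a two-way repeated-measures ANOVA with both factors
    within-subject; this is an assumption, not stated in the source.
    """
    effect = (levels_a - 1) * (levels_b - 1)
    error = effect * (n_subjects - 1)
    return effect, error


if __name__ == "__main__":
    print(trials_per_data_point(TRIALS_PER_BLOCK, BLOCK_REPEATS))
    print(target_mask_spacing(TARGET_DIAMETER_ARCMIN, MASK_DIAMETER_ARCMIN))
    print(interaction_df(N_SET_SIZES, N_SOAS, N_SUBJECTS))
```

With the stated values, the calculation gives 192 trials per data point and a spacing of 4.5 min, matching the Method. Under the within-subject assumption, the interaction would have 18 and 54 degrees of freedom.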
A dependency of object substitution on the number of items in the display is evidenced by a significant interaction between SOA and set-size (F(18, 54) = 6.725, p < .001). This can be seen most easily with the data averaged across subjects, as shown in Figure 4. Note that masking was absent with a set-size of one because of the adjustments made to the target-mask spacing already described. Like all metacontrast and object substitution results, performance was near perfect and virtually identical at zero SOA for all set-sizes. As SOA increased, performance in the set-size = 2 condition suffered only slightly at intermediate SOAs between 40 and 120 ms. By contrast, with set-sizes of four and eight, a deep U-shaped masking function emerged. Maximal masking occurred with a set-size of eight at an SOA of 80 ms. Masking in the set-size = 8 condition was also effective at longer SOAs relative to the smaller set-size conditions. However, the use of the SOA paradigm rather than the common-onset paradigm resulted in an important departure from the object substitution pattern reported by Di Lollo et al. (1999). Like conventional metacontrast, accuracy returned to nearly 100% with a sufficiently long SOA.

Figure 3. Results of Experiment 1 for 4 subjects. Accuracy in discriminating the orientation of the target as a function of SOA (msec) in 4 different set-size conditions.

Figure 4. Results of Experiment 1 averaged across 4 subjects. Accuracy in discriminating the orientation of the target plotted as a function of SOA (msec) for 4 different set-sizes. Error bars represent the 95% confidence interval.

Discussion

The results of Experiment 1 complemented the findings of Di Lollo et al.
(1999) and Averbach and Coriel (1961), who found that increasing the number of distractors presented along with the target leads to more effective masking. One possible explanation for the set-size effect, which can be quickly ruled out, is that the increase in set-size leads to visual crowding or low-level lateral masking. This may cause the slight systematic reduction in accuracy with increasing set-size which is noticeable in every subject at SOA = 0 and SOA = 320. Crowding, however, does not itself explain the increased vulnerability to masking exclusively at intermediate SOAs. If the reduction in accuracy were due to crowding, it would be consistent across SOAs. Another possibility, which will be considered in more detail later, is that processing of the target is delayed because the target cannot be located rapidly among the distractors (Averbach & Coriel, 1961; Breitmeyer & Ganz, 1976; Di Lollo et al., 1999; Kahneman, 1968). Due to the nature of the display, subjects could locate the target either by finding the radial line and then tracing it to the target or by simply waiting for the mask to appear. Consequently, the representation of the target was held up at a stage in which it was unattended and vulnerable to the effects of the mask. This account leads to the prediction that manipulations of the display that either facilitate localization of the target or eliminate the need to search altogether would have the same effect as reducing the display to a single item. This is the strategy employed in the following two experiments.

Experiment 2 - Masking is Reduced by an Attentional Pre-Cue

The strategy of Experiment 1 was to prevent attention from being focused on the target by adding distractors to the display. The strategy of Experiments 2 and 3 was to essentially eliminate the effect of set-size by providing the subject with attentional cues to the target location.
In Experiment 2 this was accomplished by presenting a spatial cue to the location of the target prior to its appearance in the display. Experiment 2 was meant to confirm and expand upon the findings of Di Lollo et al. (1999). Their rationale was that a spatial pre-cue would allow subjects to rapidly focus attention at the target location, just as they can when no distractors are present in the display. Consequently, the provision of a spatial pre-cue should reduce or eliminate masking even in a large set-size display. In their Experiment 6, subjects viewed a display consisting of three sequential frames. Frame one contained a single four-dot mask similar to that used by Enns and Di Lollo (1997). This frame was presented for 0 (no cue), 45, 90, 135, or 180 ms. The location of the four-dot mask indicated the location at which the target would appear on every trial (100% valid). The second frame contained an array of 1, 8, or 16 rings. When 8 or 16 rings were present, half of them had a vertical bar across the bottom of the ring. When a single ring was present, a vertical bar could be either present or absent. In every trial, one target ring was randomly singled out by the presence of the four-dot mask. The subject's task was to report the presence or absence of a vertical bar on the singled-out ring. The third frame contained only the four-dot mask as in the first frame. The effect of this spatial pre-cue was tested only at a mask duration of 90 ms, which yielded maximal masking in previous experiments. The three frames were presented in continuous succession (no blank ISI). As predicted, masking was reduced by a pre-cue beginning 45 ms before the target appeared. Furthermore, increasing cue duration incrementally reduced masking in every set-size condition. While this result suggests that the ability to focus attention rapidly at the target location is sufficient to eliminate object substitution, two points require clarification.
First, the effects of a valid pre-cue on task performance are least ambiguous when compared with the effects of an invalid pre-cue (one which directs attention to the wrong location). The four-dot mask in the Di Lollo et al. (1999) experiment was valid on every trial and performance was compared to trials on which no cue was presented. It is possible that the onset of the cue simply prompted the subject to be alert for the pending trial and performance consequently improved with increasing cue duration. In this case, a valid cue should provide no additional performance advantage relative to an invalid cue. Furthermore, any valid spatial pre-cue should be sufficient to reduce object substitution. Notice that Di Lollo et al. used a very specific cue, namely the visual object that would subsequently act as the mask. This point will be considered further in Experiments 5 and 6. Experiment 2 of this thesis addressed these concerns by including both validly and invalidly cued trials and employing a cue that bore no structural similarity to the mask.

Method

Stimulus displays used in Experiment 2 were similar to those of Experiment 1, with the exception that a spatial pre-cue was provided 1, 50, 150 or 200 ms prior to the onset of the target and distractors. The cue consisted of a square arrangement of four small dots (not to be confused with a four-dot mask) clustered together at a point which subsequently became the center of a C in the target frame (see Figure 5). The luminance of the cue was 200 cd/m² and it was presented for 1 ms. Subjectively, the cue appeared to be a small flash of light at one of the possible target locations. The cue validly indicated the location of the target on 12% of trials and indicated the location of a distractor on all other trials. This yielded a 2 (validity) x 4 (cue lead time) experimental design carried out in blocks of 96 trials each. Each block was tested twice for a total of 192 trials per subject per data point.
Each subject was tested in each condition. Set-size and SOA were fixed at 8 and 80 ms, respectively, as this condition produced maximal masking in Experiment 1. Three of the four subjects who participated in Experiment 1 served as subjects in Experiment 2. As in Experiment 1, the subject's task was to report the orientation of the target C.

Figure 5. Schematic of the stimulus used in Experiment 2. A cluster of 4 dots appeared in Frame 1 and served as a spatial pre-cue. This cue preceded Frame 2 by a lead time ranging from 1-200 ms. Frames 2 and 3 were as in Expt. 1 with set-size and SOA fixed at 8 and 80 ms, respectively. The cue correctly indicated the target location on 12% of trials. The task was to report the orientation of the indicated C.

Results

Accuracy for the orientation discrimination task for each subject is presented in Figure 6. In agreement with Di Lollo et al. (1999), masking was found to be significantly reduced when the target location was cued in advance. This is seen more easily in Figure 7, which shows data averaged across all three subjects. As expected, the reduction in masking becomes more pronounced as cue lead time is increased. This is evidenced by a significant interaction between cue lead time and cue validity (F3,6 = 4.702; p = .051). The horizontal line plotted in each figure represents performance measured in Experiment 1 with the same set-size and SOA but without the cue. It is evident that no systematic reduction in performance resulted from an invalid cue. Performance on no-cue and invalid-cue trials was virtually identical.

Figure 6. Results of Experiment 2 for 3 subjects. Accuracy to discriminate target orientation as a function of cue lead time for valid and invalid cues. The dotted line indicates performance on the corresponding set-size and SOA condition from Expt 1 in which no cue appeared.
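The degrees of freedom attached to the interaction test are what one would expect if every subject was tested in every cell of the design. The sketch below assumes a standard two-way repeated-measures ANOVA, which the text does not state explicitly; the function name is hypothetical.

```python
def interaction_df(levels_a, levels_b, n_subjects):
    """Return (effect df, error df) for the A x B interaction.

    Assumes a fully within-subject design in which every subject
    contributes data to every A x B cell (an assumption, not stated
    explicitly in the thesis).
    """
    effect = (levels_a - 1) * (levels_b - 1)
    error = effect * (n_subjects - 1)
    return effect, error


# Experiment 2: 2 (validity) x 4 (cue lead time), 3 subjects
print(interaction_df(2, 4, 3))

# Experiment 1: 4 (set-sizes) x 7 (SOAs), 4 subjects
print(interaction_df(4, 7, 4))
```

For the 2 x 4 design with three subjects this gives 3 and 6, matching the reported F3,6. For the 4 x 7 design of Experiment 1 with four subjects, the same rule gives 18 and 54.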
Figure 7. Results of Experiment 2 averaged across 3 subjects. Accuracy to discriminate orientation of the target as a function of cue lead time for valid and invalid cues. The dotted line indicates average performance on the corresponding set-size and SOA condition from Expt 1 in which no cue appeared. Error bars represent the 95% confidence interval.

Discussion

Experiment 2 demonstrated that masking is ineffective when the subject is given sufficient time to focus attention at the target location. Because the subject can ignore all items that are not at that location, focusing attention has an effect similar to reducing the set-size to one. This finding substantiated the claim made previously by Di Lollo et al. (1999) by demonstrating a decrease in masking on validly cued trials only. It might be expected that masking would be more pronounced on a trial in which attention was directed to a distractor rather than the target. Interestingly, no such cost was associated with an invalid cue. It has been assumed here that attention is divided across the entire display at the onset of the target. It is possible, though, that subjects naturally attend to a location in the display at random in the moments preceding the onset of the target. An invalid cue simply re-directs this focusing of attention, but not to a location which helps the subject perform the task. In either case, attention is not focused at the target location and object substitution may occur. Experiment 2 also generalized the Di Lollo et al. finding by showing that a cue bearing no structural resemblance to the mask is sufficient to reduce object substitution masking.

Experiment 3 - Masking is Reduced by Target Pop-Out

Experiment 2 employed a spatial pre-cue to direct attention to the location of the target.
Another common manipulation which allows for rapid focusing of attention involves target pop-out. Recall that visual search for targets bearing a high degree of similarity to distractors requires an inefficient serial search. Conversely, targets which are very different from the distractors can be located more rapidly and are said to pop out (Treisman & Gelade, 1980; Wolfe et al., 1989). Visual search for such a target is independent of set-size. If visual search for the target in an object substitution display is rapid regardless of the number of distractors, then a target which pops out from its distractors should be invulnerable to object substitution. Experiment 5 of Di Lollo et al. (1999) tested this prediction using a display similar to that used in their cueing experiment. Subjects viewed a display with two sequential frames. Frame one contained an array of 1, 8, or 16 rings, one of which was singled out by a four-dot mask. On one half of the trials, a single vertical bar crossed the bottom of the ring, which caused the target to pop out. On the other half of trials, no vertical bar was present and all rings were indistinguishable except for the presence of the four-dot mask. The subject's task was to indicate whether the vertical bar was present or absent. The second frame contained only the four-dot mask as in their other experiments. This frame appeared immediately after Frame 1 disappeared (no ISI) and lasted for a variable duration. Performance on the detection task was compared between this experiment, in which the target popped out, and a previous experiment in which some of the distractors also had a vertical bar. In the latter case, the target was singled out only by the presence of the four dots that, according to the authors, were insufficient to rapidly draw attention to the target location. As predicted, masking was significantly reduced in the pop-out relative to the non-pop-out condition. The claim made by Di Lollo et al.
(1999) is that pop-out targets are invulnerable to masking because attention can be rapidly deployed to them. However, a further clarification is warranted. Notice that Di Lollo et al. compared two conditions. In the pop-out condition, the target could be found among the distractors without being singled out by the mask. The non-pop-out condition was a partial report task in which the target was singled out only by the presence of the mask at the same location. Rather than comparing the masking effect between efficient and inefficient search tasks, Di Lollo et al. compared a parallel search task to a partial report task. A stronger prediction, following the logic of Di Lollo et al., is that a target which is distinguishable from distractors but can be found only by an inefficient serial search should be vulnerable to masking. Experiment 3 of this thesis was intended to support the claim of Di Lollo et al. (1999) by comparing masking of a pop-out target to masking of a serial search target. This sort of experiment must be designed with care. Any difference in the spatial contours of the targets between the two conditions could lead to differences in masking irrespective of the efficiency of visual search. By using a search asymmetry in a yes/no signal detection paradigm, a task was designed which assessed masking for pop-out and non-pop-out targets without confounding the type of search with the discriminability of the target. It is known that visual search for a ring with a gap among complete rings (a C among Os) is highly efficient. Conversely, search for a complete ring among rings with gaps (an O among Cs) is inefficient (Treisman et al., 1990).

Method

Subjects viewed a display consisting of two sequential frames separated by a blank ISI as in the previous experiments (see Figure 8). There were two conditions, pop-out and non-pop-out, which differed only in the shape of the distractors and the instructions to the subject.
In the pop-out condition, the subject's task was to detect the presence of a single C among seven Os. One half of trials were catch trials in which no C was present. In the non-pop-out condition, the subject was instructed to detect the presence of an O among seven Cs. In both conditions, the Cs were identical to those used in Experiments 1 and 2 and the Os were simply the same Cs with the gap filled in. The target and distractors were positioned randomly around a notional circle as in the previous experiments. While the orientation of the Cs was randomly up, down, left, or right, this was irrelevant to the task. The search array appeared in Frame 1, which vanished after 10 ms. Frame two contained a single ring that acted as a mask as in Experiments 1 and 2. The mask appeared on every trial and always surrounded the target if present.

Figure 8. Schematic of the stimulus used in Experiment 3. Frame 1 contained a search array of a target and 7 distractors. In separate blocks, the subject's task was to detect the presence of a C among Os (pop-out condition) or an O among Cs (non-pop-out condition). On catch trials the array contained only distractors. A ring mask appeared at the target location in Frame 2 following a variable SOA.

An important benefit to this design is worth emphasizing. Notice that in both pop-out and non-pop-out conditions, the stimulus sequence at the target location was exactly the same. In both conditions, on one half of the trials the masked location contained an O and on one half of the trials the masked location contained a C. What was changed between the conditions was the shape of the distractors and which shape (C or O) was defined as the target for that block of trials. Two subjects participated in Experiment 3, one of whom had participated in Experiments 1 and 2. Both subjects viewed displays in which the SOA was 0, 20, 40, 80, 120, 160, or 320 ms.
Subject MT was also tested at SOAs of 380 and 420 ms in the non-pop-out condition to obtain a more complete masking function. This yielded a 2 (search type) x 7 (SOAs) design for the two subjects. Trials were blocked within conditions. Subject MT completed one block of 96 trials for each condition and subject TH completed three blocks of 96 trials for each condition. This yielded 96 trials per data point for subject MT and 288 trials per data point for subject TH. Simple analysis of accuracy on target present trials is not appropriate for this experiment. In the non-pop-out condition, the mask was a ring only slightly larger than the target, which was an O. If object substitution was effective, subjects might confuse the visible mask with the invisible target, thus responding "yes" when the target was present, but also responding "yes" when it was absent. Likewise, in the pop-out condition, subjects would be likely to confuse the visible mask with a distractor and respond "no" more frequently. Thus target-mask similarity is confounded with mask effectiveness. Comparison of accuracy between the two conditions would not lead to conclusions based solely on the effectiveness of the mask. This situation was rectified by analyzing performance on both target present and target absent trials using the techniques of signal detection theory. The similarity between the target and mask effectively biases the subject toward more hits and false alarms in the non-pop-out condition and fewer hits and fewer false alarms in the pop-out condition. Using d' as a measure of sensitivity allows comparison between the two conditions because the measure is insensitive to changes in response bias. d' was computed according to the procedure described by Macmillan and Creelman (1991). Hit rates and false-alarm rates for each block were obtained by dividing the number of "yes" responses by the number of target present and target absent trials respectively.
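The sensitivity computation described here can be sketched as follows. This is a minimal illustration and not the author's analysis code. The function names are hypothetical, and the handling of perfect blocks follows one reading of the thesis's correction, 0.35^(1/N), which is an assumption because the formula is damaged in the extracted text. The mirror-image cases (hit rate of 0, false-alarm rate of 1) are handled symmetrically, which is also an assumption added for robustness.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the z-transform


def corrected_rate(yes_count, n_trials, base=0.35):
    """Proportion of "yes" responses, nudged away from exactly 0 or 1.

    Assumption: a rate of 1.0 is replaced by base ** (1 / N) and a rate
    of 0.0 by 1 - base ** (1 / N), where N is the number of trials.
    """
    rate = yes_count / n_trials
    ceiling = base ** (1.0 / n_trials)
    if rate >= 1.0:
        return ceiling
    if rate <= 0.0:
        return 1.0 - ceiling
    return rate


def d_prime(hits, n_present, false_alarms, n_absent):
    """Sensitivity d' = z(hit rate) - z(false-alarm rate)."""
    hit_rate = corrected_rate(hits, n_present)
    fa_rate = corrected_rate(false_alarms, n_absent)
    return _STD_NORMAL.inv_cdf(hit_rate) - _STD_NORMAL.inv_cdf(fa_rate)


# Hypothetical counts for one block with 48 target-present
# and 48 target-absent trials (not data from the thesis).
print(round(d_prime(40, 48, 8, 48), 2))
```

Because d' is a difference of z-scores, a shift in response bias that raises both the hit rate and the false-alarm rate moves both terms in the same direction, which is why the measure suits the comparison between the two search conditions.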
These ratios were then converted into Z-scores. The difference between these two Z-scores is d'. When subjects responded with 100% accuracy in any block (hit rate = 1.0 or false-alarm rate = 0.0), hit rate was set equal to .35^(1/N) or false-alarm rate was set to 1 - .35^(1/N), respectively (where N represents the number of trials on which the target was present or absent).

Results

Masking functions for both subjects are presented in Figure 9. Performance in both pop-out and non-pop-out conditions was near maximum at zero SOA. In the pop-out condition, subject TH showed a slight dip in sensitivity at SOA = 20 and 40 ms. Overall, however, neither subject showed significant masking in the pop-out condition. On the contrary, performance tended to increase with increasing SOA. Masking in the non-pop-out condition, however, was strong, with performance falling almost to chance (d' = 0) at SOA = 80 ms for subject TH and SOA = 160 ms for subject MT. Masking in the non-pop-out condition was also extended in time, with a depression of sensitivity still evident at 320 ms. For both subjects, the mask became less effective at SOA = 320 ms. Only subject MT was tested at SOAs long enough to show a complete rebound from masking. In this case performance returned to the zero SOA level at an SOA of 420 ms.

Figure 9. Results of Experiment 3 for 2 subjects. Sensitivity (d') to detect the presence of the target is plotted as a function of SOA in pop-out and non-pop-out conditions.

Discussion

The results of Experiment 3 support the claim by Di Lollo et al. (1999) that object substitution is less effective when the target can be found by an efficient visual search. Virtually no masking occurred in the pop-out condition in Experiment 3. Rather, performance on the task improved with increasing SOA.
Exactly the opposite was found in the non-pop-out condition. The degree and extent of the masking effect observed in this case was remarkable. Masking is rarely, if ever, reported at target-mask SOAs in excess of 300 ms. Furthermore, transient-on-sustained inhibition theories (e.g. Breitmeyer & Ganz, 1976; Weisstein et al., 1975) fail to explain masking at such long SOAs. According to Breitmeyer and Ganz (1976), "optimal interchannel inhibition should be obtained when the mask pattern onset is delayed by 50-100 ms." Clearly this estimate does not hold under conditions of divided attention. It is important to note that the non-pop-out condition was not simply too difficult for subjects to perform the task. If subjects were simply unable to locate the O among Cs then performance would not have rebounded at longer SOAs. The fact that both subjects' performance rebounded from minimum at 320 ms and that subject MT showed a complete elimination of the masking effect at SOA = 420 ms indicates that the task could be performed provided that the target and mask were separated by a sufficiently long ISI. The subjective appearance of these displays is intriguing and worth contrasting with classical metacontrast masking. In metacontrast, visibility of the target is reduced, but its presence is nevertheless detectable (Fehrer & Biederman, 1962). Furthermore, metacontrast displays are usually so brief (< 100 ms) that one has the sense that the target was missed rather than actually rendered invisible. By contrast, object substitution in the non-pop-out condition was phenomenologically complete. When SOA was in the 100-300 ms range, subjects reported seeing a blank space in the array where the target should have been. An important difference between the results of Di Lollo et al. (1999) and the present experiment is worth elaboration. While Di Lollo et al.
found a reduction in masking of pop-out targets relative to non-pop-out targets, some masking was still evident. However, the results of Experiment 3 suggest that masking is virtually eliminated under certain pop-out conditions. The reason for the discrepancy between the results reported here and those reported by Di Lollo et al. is unclear. It may be that the common-onset mask used by Di Lollo et al. is a more powerful masking stimulus than the brief mask employed here. Alternatively, not all pop-out searches are perfectly efficient (Wolfe, 1998) and it is consistent with the underlying hypothesis of object substitution that more efficient searches should lead to less masking. Visual search in the Di Lollo et al. task may have been less efficient than search in the pop-out display used here.

Conclusion: Experiments 1, 2 and 3

Taken together, the results of Experiments 1, 2, and 3 strongly support the attentional hypothesis for object substitution put forward by Enns and Di Lollo (1997) and Di Lollo et al. (1999), one in which selective attention is a primary factor determining whether masking will occur in any given display. The evidence suggests that the initial representation of the target in the visual system is vulnerable to the effects of the mask. However, the representation of the target is fundamentally changed with the focusing of attention. These experiments also re-connected object substitution with previous masking studies by using a conventional SOA paradigm in which the target and mask were separated in time. Enns and Di Lollo (1997) observed object substitution in this type of paradigm, but the attentional manipulations used by Di Lollo et al. (1999) had not been applied to the SOA paradigm. Experiments 1, 2, and 3 showed that the effects of the attentional manipulations used by Di Lollo et al. (1999) are not limited to the common-onset paradigm.
Object substitution is a type of backward masking that occurs when attention is divided across multiple items in the visual display. Focusing attention on the target is sufficient to reduce or eliminate object substitution. The next step is to put together a theory explaining the underlying basis for object substitution.

A THEORY OF OBJECT SUBSTITUTION

Of particular interest are the processes that mediate perception under conditions which yield object substitution. Object substitution is more consistent with Kahneman's (1968) description of masking by interruption than masking by integration. In masking by integration, the visual representation of the target is most degraded when target and mask appear simultaneously. Consequently, masking of this type is maximal at an SOA of zero. Object substitution is more like metacontrast in that maximal masking is observed at SOAs greater than zero. The general direction taken by several authors in explaining the set-size effect in backward masking has been to propose that processing of the target is slowed or delayed because the target must first be located among the distractors (Averbach & Coriel, 1961; Di Lollo et al., 1999; Kahneman, 1968). Specifically, processing is held up at a stage in which the visual representation of the mask may interfere with that of the target. The time-course over which such visual processing occurs may be roughly divided into three epochs. At very short SOAs, the target and mask are registered as a single visual event and the stimulus is perceived as a target within a mask. At long SOAs the visual representation of the target is processed to a stage which is invulnerable to the effects of masking. Given sufficient time, even a target among many distractors can be processed before the mask appears. This explains the finding of Experiment 3 in which SOAs in the 300-400 ms range yielded little or no masking.
Between these two extremes is a window of time during which the mask may interfere with processing of the target. Larger set-sizes prevent focusing of attention on the target because a visual search must be undertaken to locate the target in the display. With multi-element displays, the target remains vulnerable for a longer period of time and is more likely to be masked on any given trial. The principle underlying accounts of the set-size effect by Kahneman (1968) and Breitmeyer and Ganz (1976) is that unattended visual representations decay with time. A delay in processing the target from sensory to more permanent memory results in a weakening representation of that target in the visual system. The mask is then more powerful relative to the target, and the target more vulnerable to masking. The reason for increased masking in multi-element displays is simply that more time is required to locate and process the target. This approach falls short of explaining masking at a neuronal level. Neural mechanisms for metacontrast masking have been proposed in the past (Breitmeyer & Ganz, 1976; Weisstein et al., 1975). Breitmeyer and Ganz (1976) proposed an interchannel inhibition theory of metacontrast masking. In this theory, metacontrast is due to inhibition of a sustained channel, which carries information about the form of the target, by activity in a transient channel, which carries temporal information about the mask. A transient neuronal response to the onset of the mask inhibits the sustained neuronal response to the target. The transient channel is thought to carry signals somewhat faster than the sustained channel. Thus, the transient response to the mask rises more rapidly than the sustained response to the target. Simultaneous presentation of target and mask results in synchrony of the transient responses to both stimuli and minimal interchannel inhibition.
However, when the mask is delayed by an SOA of between 50 and 100 ms, the transient response to the mask peaks concurrently with the peak of the sustained response to the target. Interchannel inhibition reaches a maximum and the sustained neuronal response to the target is suppressed. The past decade, however, has seen a variety of experimental results that suggest that transient-on-sustained inhibition is not a complete explanation of backward masking phenomena. Furthermore, difficulties arise when one tries to apply interchannel inhibition theories to object substitution. For example, common-onset displays (e.g. Di Lollo et al., 1993; Di Lollo et al., 1999) cannot be explained by this theory because it explicitly precludes masking when SOA = 0. Other studies, in addition to those described in the previous section, have demonstrated attentional manipulations in more classical metacontrast paradigms. Ramachandran and Cobb (1995) reported that directing attention to items other than the target leads to stronger masking than when the target is attended. Havig, Breitmeyer, and Brown (1998) showed that pre-cueing the positions of the masks decreased the masking effect. While Breitmeyer and Ganz (1976) do offer some explanation for the effect of increasing set-size on backward masking, their arguments are not tied to actual neural mechanisms and no attempt is made to explain how attention affects the neurons mediating interchannel inhibition. Di Lollo et al. (1999) have proposed a model to account for attention in object substitution masking. Their model deals with object substitution as a competitive interaction between information encoded at various stages of processing in the visual system. Incoming sensory information and information the visual system has already begun to process are put in direct conflict with each other when changes are made to the physical stimulus.
At the heart of their model is the notion of re-iterative processing, first suggested to be involved in backward masking by Bridgeman (1988). This is a departure from the classically held view that information is processed at successively more complex stages as it is passed from one area of visual cortex to the next (Treisman et al., 1990; von der Heydt, 1995). Re-iterative processing supposes that information is encoded and re-coded multiple times at several levels of the visual system (Di Lollo et al., 1999). The underlying neuronal basis for re-iterative processing is suggested to be reciprocal or re-entrant cortico-cortical loops which pass information to and from extrastriate visual areas (e.g. V4 and V5) and lower-level areas (e.g. V1) (Lamme, 1995; Lee, Mumford, Romero, & Lamme, 1998; Zipser, Lamme, & Schiller, 1996). Specifically, re-entrant projections are thought to modify the properties of low-level (V1) neurons to represent complex features of objects such as borders defined by areas of different textures or colors. Using shapes defined only by differences in the orientation of small bars within their borders, Lamme (1995) found that the responses of some neurons in striate cortex of awake behaving monkeys were not driven exclusively by the local texture elements within their receptive fields. While the responses of these neurons during the first (approximately) 80 ms seemed to encode the orientation of the texture elements within their receptive fields, the later responses were strongly modulated by extra-receptive field context. After 80 ms the neurons encoded information about object borders and surfaces defined by texture orientation regardless of the actual orientation of the elements within their receptive fields. In this way, a neuron tuned to vertical lines may respond vigorously to a non-luminance defined vertical boundary between regions of slanted bars.
These results have been expanded to include boundaries between regions of different depth, color, and motion (Lee et al., 1998; Zipser et al., 1996). This higher level of processing by V1 neurons is thought to reflect the influence of descending information sent from extra-striate cortex back to V1. Di Lollo et al.'s (1999) model expands upon the role of re-entrant connections. It is suggested that these loops allow higher-level processing stages to return information back to lower-level stages for the purpose of refining the outcome of stimulus identification and localization. Each iteration of these loops results in a more precise match between the information encoded at lower levels and higher-level interpretations of that information. According to Di Lollo et al., when descending and ascending inputs contradict, information at the higher levels is modified to correlate more precisely with that represented in lower levels. A typical common-onset stimulus sequence begins with simultaneous presentation of target and mask. The subsequent neural activity propagates from lower to higher stages of processing and a neural representation of both target and mask emerges in higher-level visual areas. This information is fed back to lower processing stages as descending input carried by re-entrant cortical loops. If the sensory input at these lower levels has not changed, that is, lower levels also contain a representation of a target surrounded by a mask, then the ascending and descending information does not contradict and the system continues to represent both target and mask at all levels. The entire process is then repeated. However, in common-onset object substitution, the target is presented only briefly while the mask remains visible for a longer duration. Once the target has vanished, the representation of the mask becomes the focus of object perception mechanisms.
The descending representation of target and mask conflicts with an ascending representation that now contains only the mask. To create a better correspondence between the physical stimulus and the internal representation, the target is not encoded in successive iterations. The target becomes less visible with each successive iteration because its neuronal representation is gradually weakened.

As we have seen, selective attention must be given a role in any successful theory of object substitution. In Di Lollo et al.'s (1999) theory, selective attention, when focused at the target location, acts to facilitate visual processing. Fewer iterations are required to identify the target. Consequently, fewer iterations occur in which the ascending and descending input is in conflict and there is less opportunity for masking to occur.

Object substitution is clearly modulated by selective attention. It follows that the responses of neurons mediating this phenomenon are also modulated by selective attention. In addition, object substitution is, in essence, a failure of the object perception mechanisms of the brain. It stands to reason that the neurons involved in object substitution should be found in brain areas involved in object perception. Indeed, evidence has emerged over the past decade that strongly suggests a critical role of selective attention in the functioning of the putative object perception pathway of the brain. Firing rates of neurons in areas V1, V2, V4, and inferior temporal cortex (IT) are modulated by attentional manipulations (Chelazzi, Miller, Duncan, & Desimone, 1993; Luck, Chelazzi, Hillyard, & Desimone, 1997; Moran & Desimone, 1985; Motter, 1993; Roelfsema, Lamme, & Spekreijse, 1998). These visual areas make up the ventral stream and are thought to be involved in object perception (Desimone, Schein, Moran, & Ungerleider, 1985).
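
The iterative account summarized above (a target representation that weakens on each iteration in which the mask alone is present, with attention reducing the number of iterations needed to identify the target) can be caricatured in a few lines of Python. This is only an illustrative toy and not the computational model of Di Lollo et al. (1999): the function name, the 40 ms iteration time, and the decay factor are all hypothetical values chosen for illustration.

```python
def target_strength(mask_duration_ms, iterations_needed,
                    iteration_ms=40.0, decay=0.7):
    """Toy estimate of the relative strength of the target representation.

    Hypothetical assumptions (none taken from the thesis):
      - each re-entrant iteration takes `iteration_ms` milliseconds;
      - every iteration completed while the mask alone is visible
        multiplies the target's strength by `decay`;
      - weakening stops once the target has been identified, which
        takes `iterations_needed` iterations (fewer when attention
        is already focused at the target location).
    """
    # Iterations that fit into the period in which only the mask is visible.
    trailing_iterations = int(mask_duration_ms // iteration_ms)
    # Only iterations that occur before identification can conflict.
    conflicting = min(iterations_needed, trailing_iterations)
    return decay ** conflicting


# Unattended target (many iterations needed) versus attended target (few).
for duration in (0, 40, 80, 160, 320):
    print(duration,
          round(target_strength(duration, iterations_needed=5), 3),
          round(target_strength(duration, iterations_needed=2), 3))
```

Under these assumptions the toy reproduces only the qualitative pattern described in the text: strength falls as the trailing mask lasts longer, levels off once the target has been identified, and falls less when fewer iterations are needed.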
Moran and Desimone (1985) addressed the question of how neurons in area V4 behave when two different stimuli are presented within a single receptive field. Neurons in V4 tend to be tuned to specific colors and orientations of stimuli such that some stimuli elicit a strong response from a given cell and others do not (Desimone et al., 1985). The critical experiment involved placing one effective stimulus and one ineffective stimulus within the receptive fields of V4 neurons. Moran and Desimone found that the response of a cell depends on which stimulus is attended by the monkey. When attention was directed to the location of the effective stimulus, the cell responded normally. When attention was directed to the ineffective stimulus, the cell responded as if the effective stimulus were absent from the receptive field. Modulation did not occur when the ineffective stimulus was placed outside the receptive field, suggesting that attentional modulation occurs only when two objects appear at roughly the same visual location. Given that the target and mask in object substitution appear at the same location, this finding is of particular interest. This tendency for the visual system to encode only one of two objects at the same location fits with the theory proposed by Di Lollo et al. (1999).

Implicit in Di Lollo et al.'s (1999) theory is the notion that the mask becomes the focus of object perception mechanisms. A neural correlate of this sort of attentional selection has been shown by Chelazzi et al. (1993). They recorded from neurons of inferior temporal cortex (IT) while a monkey was engaged in a delayed match-to-sample task. In this task the monkey was presented with a target shape and required to remember that shape for a period of several seconds. At that point, an array of several similar shapes appeared, one of which was the original target shape. The monkey was rewarded if it made an eye-movement to the location of the target.
The authors employed a strategy similar to Moran and Desimone's (1985) in that the target could be either a good or poor stimulus for eliciting activity in a particular cell. If the target was a poor stimulus, the cell's good stimulus was included as one of the distractors in the array. When the target was a good stimulus it elicited a strong increase in firing rate from the recorded neuron. However, when the target was a poor stimulus, a slight decrease in firing rate was observed. When the search array appeared, activity in the neuron began to increase because its good stimulus was present in the array. However, activity was sharply suppressed 90-120 ms before the monkey initiated an eye movement to the poor target, and presumably in conjunction with a shift of attention. By contrast, if the target was a good stimulus for the neuron, activity in that neuron continued to increase after the initiation of the eye movement. It was also found that increasing the number of items in the search array increased the latency at which response to the poor target was suppressed. The authors take this to indicate the involvement of a visual search for the target. Chelazzi and his colleagues have found similar behavior among neurons in area V4 (Chelazzi, 1995). Consistent with Moran and Desimone (1985), it was found that the suppression of a neuron encoding an unselected target occurs only when the attended target also falls within that neuron's receptive field. The conclusion to be drawn is that the initial visual response to items in a multi-element array is identical until one item becomes the focus of selective attention. At that point, neurons responding to other items in the same region of space are suppressed.

The neurophysiological evidence described by Chelazzi et al. (1993) and the theory proposed by Di Lollo et al. (1999) have an important principle in common.
Both are concerned with the effect of selecting one stimulus rather than another at roughly the same location. For Chelazzi et al. (1993) the selected stimulus was the target of the visual search and the unselected stimuli were the distractors. In Di Lollo et al.'s theory, the selected stimulus is the mask and the unselected stimulus is the target. The attentional modulation of ventral stream neurons is then a likely candidate for the neural correlate of object substitution: a process which happens at the single-cell level and parallels findings at the behavioral level. This process is a specific suppression of the neurons encoding features of one stimulus due to the attentional selection of another stimulus at roughly the same location.

The findings of Chelazzi et al. (1993) suggest that the representation of the visual scene during a visual search is precarious, with the system initially attempting to encode virtually every feature of every object in the scene. This represents an important subtlety in the theory at hand: masking occurs when the mask is attentionally selected while the target representation is in this precarious state, that is, before visual search has been completed. The attentional manipulations of pre-cueing and target pop-out prevent object substitution because they eliminate the need for such a search and reduce the span of time during which the target representation remains unattended.

There are several indications that an attentional selection of the mask might be involved in object substitution. Recall that attention can be either voluntarily shifted or involuntarily captured. In object substitution, an argument can be made for both possibilities. In the studies of Di Lollo et al. (1999) and Experiments 1, 2, and 3, the mask was 100% informative as to the location of the target. In fact, in all of the Di Lollo et al.
experiments except the pop-out study, the target could not be distinguished from distractors except by the presence of the mask. It is therefore possible that subjects adopted a strategy of briefly attending to the mask so as to locate the target.

An argument can also be made for the involuntary capture of attention by the mask. It was mentioned in Chapter 3 that attention may be captured by the onset of a new perceptual object (Yantis & Jonides, 1984). Notice that the mask used in Experiments 1, 2, and 3, a ring appearing suddenly on an otherwise blank screen, is an onset stimulus that would be expected to capture attention.

It might be argued that common-onset displays are evidence against the necessity of attentional capture by the mask. The target and mask appear at the same time; why then should attention be drawn to the mask? It is true that target and mask are both new objects at the onset of the object substitution display; however, the target is very brief and embedded among several distractors. The mask is more salient due to its extended presence on an otherwise blank screen. There is evidence to suggest that during visual search, the most salient item captures attention (Theeuwes, 1992). More general counter-arguments can be made as well. First, the Yantis and Jonides (1984) displays were also common-onset in that the search array was revealed simultaneously with the onset of the new object. The authors argued that attention was diverted first to the new object before the search could begin. Second, the Chelazzi et al. (1993) experiment on which this discussion is based was itself a common-onset paradigm. The target of the visual search appeared at the same time as the other items in the search array, yet suppression of neural responses to unattended stimuli still occurred.

Another possible argument against the attentional capture hypothesis is that it predicts forward as well as backward masking.
This argument fails to recognize an important subtlety in this hypothesis: object substitution occurs when the mask is attentionally selected after the target has been encoded but before it has been attended. In this view, forward masking could occur only under display conditions which lead to attentional selection of the mask after the search array has appeared. Experiments 5 and 6 address this issue in more detail.

A variety of predictions can be made to test the hypothesis that selective attention to the mask is responsible for object substitution masking. In each case, a manipulation that prevents the system from attentionally selecting the mask will reduce or prevent object substitution. Manipulations that restore the tendency for the system to attentionally select the mask during search for the target would be expected to restore the object substitution effect. This was the general strategy employed in the following three experiments.

EXPERIMENTS 4, 5, AND 6: OBJECT SUBSTITUTION INVOLVES AN ATTENTIONAL SELECTION OF THE MASK

The crucial link between the neural correlates of visual search and the theory of Di Lollo et al. (1999) is the idea that attention might be shifted to the mask during search for the target. However, this claim is only tenable if it can be demonstrated that object substitution indeed involves an attentional selection of the mask. The most basic prediction is that object substitution should be sensitive to manipulations of the display which make it less likely that the mask becomes the selected object rather than the target.

Experiment 4 - Object Substitution is Eliminated in Multiple Mask Displays

Distractor items were added to the displays in Experiments 1, 2, and 3 to divide attention and prevent focusing on the target. The same manipulation can be applied to the masks.
Experiment 4 compared the strength of object substitution masking across two conditions, one in which a single mask appeared at the target location and one in which masks appeared at distractor locations as well.

Method

Experiment 4 employed a common-onset paradigm much like that used by Di Lollo et al. (1999). Nine subjects viewed 2 consecutive frames of stimuli from a viewing distance of 57 cm. These were presented on a Sony Trinitron color monitor controlled by Cambridge Research Systems VSG video hardware. The background was set at 10 cd/m² and the luminance of the display was calibrated with an Opti-cal photometer. Subjects initiated each trial with a key press at which point a central fixation cross appeared. Frame 1 contained the elements of a visual search task identical to the non-pop-out condition in Experiment 3 (see Figure 10). The target of this search was a single complete ring (an O) among 7 rings with randomly oriented gaps (Cs), and the subject indicated whether or not the target had been present in the display. In Experiments 4, 5, and 6, the set-size of the search array was fixed at 8 items. Target and distractors subtended 1° and the gap in the distractors was 90°. As in Experiment 3, the target appeared on only 50% of trials and both hit and false-alarm rates were recorded (on catch trials, all 8 items were Cs). This allowed sensitivity to be expressed in d' units. The elements of the array were 100 cd/m² in luminance and were uniformly distributed around a notional circle of 3° radius.

[Figure 10 panel labels: 1 Mask; 8 Masks; Search Array = 10 ms; Mask Duration = 0-750 ms]

Figure 10. Schematic of the stimuli used in Experiment 4. Frame 1 contained a search array with each element of the search appearing inside a square (common-onset masking). The subject's task was to detect the presence of an O among Cs. On catch trials the array contained only Cs.
The search array vanished after 10 ms leaving either the single square at the target location, or all 8 squares. Frame 2 remained visible for a duration of 0-750 ms.

Also appearing in Frame 1 were eight squares, each surrounding an element of the search array. The squares were 2° in width for all but one subject who required a slightly larger target-mask spacing to obtain above threshold performance on the task. Slight variations in target-mask spacing have been shown to have minimal influence on object substitution (Di Lollo et al., 1999; Enns & Di Lollo, 1997). Frame 1 was presented for 10 ms, followed immediately by Frame 2 (no blank ISI). Frame 2 contained either the same square mask that had surrounded the target in Frame 1 or all eight square masks from Frame 1. The term set-size in this experiment will now be used to describe the number of masks in Frame 2. The duration of Frame 2, or mask duration, was either 0, 40, 80, 160, 320, 500, or 750 ms, with duration = 0 indicating that the target and masks vanished together.

In common-onset displays the target and masks appear for different durations, leading to a potential confound between mask duration and perceived brightness (Di Lollo et al., 1999). For brief (< ~100 ms) stimuli, perceived brightness varies with duration. In this case, a mask of 40 ms duration at a luminance of 100 cd/m² would appear to be somewhat brighter than a 10 ms target of equal luminance. Any masking observed as a function of duration of the mask would be confounded with perceived brightness of the mask. To eliminate this confound, subjects participated in a short brightness matching procedure in which the luminance of a square (identical to the masks), presented for the durations listed above, was adjusted to match a standard square of 10 ms duration with a luminance of 100 cd/m². This procedure was used to establish adjusted luminances for the masks in all of the remaining experiments.
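
The Method records hit and false-alarm rates so that sensitivity can be expressed in d' units. A minimal sketch of the standard signal-detection computation, d' = z(hit rate) - z(false-alarm rate), is given below. The function name and the example counts are hypothetical, and the log-linear correction (adding 0.5 to each count) is an assumption made here to keep rates of exactly 0 or 1 finite; the thesis does not state whether or how extreme rates were corrected.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity (d') from the trial counts of a yes/no detection task.

    d' = z(hit rate) - z(false-alarm rate), where z is the inverse of
    the standard normal cumulative distribution function.
    """
    # Assumed log-linear correction: keeps both rates strictly inside
    # (0, 1) so that the z-transform never becomes infinite.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one condition (15 target and 15 catch trials).
print(round(d_prime(hits=13, misses=2, false_alarms=3, correct_rejections=12), 2))
```

A d' of zero corresponds to equal hit and false-alarm rates, that is, to an observer who cannot distinguish target-present from catch trials.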
All subjects were tested on each condition of the 2 (set-sizes) x 7 (mask durations) design. Trials were blocked with 30 trials in each block and each subject was tested in each block only once.

Results

Sensitivity averaged across subjects as a function of mask duration is presented in Figure 11. Notice that sensitivity for the two set-size conditions is nearly identical at a mask duration of 0 ms. This is because Frame 2 in both conditions was blank and the stimulus was identical. However, sensitivity in the two set-size conditions differs markedly at 40 ms mask duration with the masking functions following quite different patterns. This interaction between set-size and mask duration was significant (2 x 7 repeated measures ANOVA, F(6, 48) = 17.538; p < .001). Sensitivity to detect the target in the single mask condition was maximal at the short mask duration of 40 ms and then decreased consistently with increasing mask duration (F(6, 48) = 29.468; p < .001), approaching a d' of zero at a mask duration of 750 ms. By contrast, while a small drop in sensitivity is seen between 0 and 40 ms, no significant overall effect of mask duration was observed in the multiple masks condition (F(6, 48) = 1.57; p = .17).

Figure 11. Results of Experiment 4 averaged across 9 subjects. Sensitivity (d') to detect the presence of the target as a function of mask duration (ms) for the single mask and 8 mask conditions. Error bars represent the 95% confidence interval.

Discussion

Object substitution in a common-onset paradigm can be defined as a decrease in visibility of the target with increasing duration of the mask. As expected, this is the pattern shown in the single mask condition of Experiment 4. Sensitivity went from a maximum of d' = 2.72 at a mask duration of 40 ms to a minimum of d' = 0.25 at a mask duration of 750 ms.
However, aside from the drop in sensitivity between 0 and 40 ms, the multiple-mask condition yielded no such systematic decrease in sensitivity. The conclusion to be drawn is that object substitution, as described by Enns and Di Lollo (1997) and Di Lollo et al. (1999), did not occur in this condition.

The initial drop in sensitivity between 0 and 40 ms is most likely not related to object substitution, at least as it is described by Di Lollo et al. In their framework, object substitution cannot occur with so brief a mask duration because target and mask should be temporally integrated. A more likely explanation for this drop is an inhibitory interaction between the contours of the target and mask that manifested when the mask contours outlasted the target contours by any duration. An alternative explanation or contributing factor might be an inadequacy in the brightness adjustment procedure described above which allowed the mask to appear slightly brighter than the target in all but the 0 ms mask duration condition.

An unexpected finding was the increase in performance when mask duration = 40 ms in the single-mask condition. A likely explanation is that whatever the cause for the drop at 40 ms in the multiple-mask condition, it was offset by the benefit of a spatial post-cue to the target location. Notice that in the 0 ms mask duration condition, the target location was distinguished only by the presence of the target. At all other mask durations in the single-mask condition, the target location was also distinguished by the persisting presence of the mask. This might have allowed the mask to act as a post-cue, directing attention to the target location. What is interesting about this possibility is that all benefit gained from having the mask present briefly following the target is lost if the mask is left on for an additional 40 to 80 ms. What happens during this period of time? One explanation is that attention is shifted to the mask.
A 40 ms mask duration would be too brief to allow attention to be shifted to the mask while it was still physically present in the display. However, the emergence of object substitution at mask durations longer than roughly 80 ms agrees closely with estimates of the latency of engaging exogenous or involuntary attention (Nakayama & Mackeben, 1989).

This finding strengthens the connection between the neural correlates of visual search described by Chelazzi et al. (1993) and the theory of object substitution described by Di Lollo et al. (1999). It suggests that attentional selection of the mask is indeed involved in object substitution. When a single mask outlives the target long enough to allow an attentional shift, object substitution is strong. However, when attention is divided among 8 objects in the masking frame, no object substitution occurs.

Experiment 5 - Object Substitution is Eliminated by Previewing the Mask

Experiment 4 manipulated mask set-size to show that attentional selection might be involved in object substitution. Other attentional manipulations could be expected to modulate object substitution as well. Of particular interest are the attentional capture manipulations discovered by Yantis and colleagues. Yantis and Jonides (1984) showed that attention is captured by the onset of a new object. When subjects searched for a target letter, attention was captured by the onset of a new, non-target letter, preventing attention from going immediately to the older target. In another study, Yantis and Johnson (1990) presented subjects with a visual search task in which half of the search items were revealed from existing figure 8s and the other half appeared as new objects. Subjects located the target more quickly when it was one of the new objects. This finding suggests that the visual system dealt first with the newer items and then with the older items.
Yantis and Johnson (1990) concluded that the visual system assigns priority tags to objects in a scene, with new items in a display being afforded higher priority than older items.

This paradigm can be applied to object substitution in order to test whether attentional capture plays a role in the attentional shift to the mask suggested by the results of Experiment 4. Based on the results of Yantis and colleagues, a mask which appears several hundred milliseconds before the target should be less effective because, being older than the items in the search array, it should not elicit an involuntary shift of attention once the search array has appeared.

This exact experiment was described in the Introduction to Experiment 2. Di Lollo et al. (1999) showed that a four-dot mask that was previewed at the target location produced little masking. Recall, however, that this was attributed to the spatial pre-cueing effect. The presence of the mask allowed subjects to direct their attention to the target location in advance of the onset of the search array. Because Di Lollo et al. used the mask as their cue, the effect of spatial cueing is potentially confounded with the fact that the mask as a perceptual object was previewed before onset of the target. The results of Experiment 2 resolved this by showing that a non-specific spatial pre-cue was sufficient to reduce object substitution. The hypothesis here predicts that a previewed mask that provides no location information should also be sufficient to reduce object substitution.

Method

The methods of Experiment 5 were similar to the single-mask condition of Experiment 4. Four subjects viewed a display containing three sequential frames (see Figure 12). Frame 1 was presented for 200 ms and allowed a preview of the mask prior to onset of the search array. The sequence of Frames 2 and 3 was similar to the single-mask condition in Experiment 4.
In the onset condition, Frame 1 was blank, and the masks, target, and distractors appeared simultaneously at the onset of Frame 2. In the no-onset condition, Frame 1 contained a square mask at each possible target location. Thus the mask had been previewed for 200 ms when the display switched to Frame 2. The displays used in the present experiment were designed to be used in Experiment 6 as well. This required that the eccentricities of the items in the array vary slightly. Items at the 12, 3, 6, and 9 o'clock positions were centered at 4.5° eccentricity while the remaining four items were at 4.14° eccentricity. Frame 3 contained a single square mask at the target location. The subject's task was to detect the presence of an O in the search array, and hit and false-alarm rates were recorded. Each subject was tested on four 16-trial blocks for each of the 2 (preview conditions) x 5 (mask durations) conditions for a total of 64 trials per data point.

[Figure 12 panel labels: Preview = 200 ms; Search Array = 10 ms; Mask = 0-320 ms]

Figure 12. Schematic of stimulus used in Experiment 5. Each trial began with either a blank screen (onset condition) or 8 square masks (no-onset condition) at all possible subsequent target locations. This 200 ms preview duration was followed by onset of an element of the search task within each square. The subject's task was to detect the presence of an O among Cs with catch trials having only Cs present. The search array vanished after 10 ms leaving a single square at the target location to act as a mask.

Results

Mean sensitivity for four subjects appears as a function of mask duration in Figure 13. The pattern of masking in the onset condition is similar to that seen in the previous experiment in the single mask condition. Sensitivity peaked at a mask duration of 40 ms and then began to decrease with increasing mask duration until reaching an asymptote at 160 ms. The pattern observed for the no-onset condition was quite different.
Sensitivity at mask duration = 0 was greater than in the onset condition. More importantly, this high degree of sensitivity did not drop off with increasing mask duration. This differential effect of mask duration in the two conditions was significant (2 x 5 repeated measures ANOVA, F(4, 12) = 4.39; p < .05).

Figure 13. Results of Experiment 5 averaged across 4 subjects. Sensitivity (d') to detect the presence of the target as a function of mask duration (ms) for onset and no-onset mask conditions. Error bars represent the 95% confidence interval.

Discussion

As predicted, object substitution was observed only in the onset condition. That is, object substitution was eliminated when the onset of the mask preceded the onset of the search array. This elimination was not dependent on spatial pre-cueing of the kind shown by Di Lollo et al. (1999), because every possible target location was cued in the preview frame. The greater sensitivity at mask duration = 0 may be due to a decreased contribution of contour interaction due to the absence of an onset transient associated with the contours of the mask.

One possible explanation for the results of Experiment 5 is consistent with the current hypothesis: in the no-onset condition, the mask lacked the distinction of being a new perceptual object during the critical moments of visual search for the target. In the no-onset condition, the visual system noted the presence of a square at every location in the array including the target location. This was accomplished before the search array appeared. When the search array became available, the visual system engaged immediately in visual search for the target without interference from the older mask. However, in the onset condition, the mask at the target location appeared at the onset of the visual search array.
Being a highly salient and new object in the field, it was assigned highest priority and attentionally selected by the visual system before search could begin for the target.

Another possible explanation involves the role of luminance transients in backward masking. Transient-on-sustained inhibition theories explicitly require that the onset of the mask be accompanied by an increment or decrement in luminance. It is possible that object substitution requires the onset of the mask to be accompanied by a luminance transient in a similar fashion. Because the mask appeared 200 ms before the target in the no-onset condition, the luminance transient associated with the onset of the mask might have been insufficient to cause object substitution. It is also possible that the lack of a mask onset simultaneous with target onset eliminated contour interactions to the extent that performance reached ceiling and no masking effect could be observed. Both possibilities were tested in the following experiment.

Experiment 6 - Object Substitution Depends on Onset of Objects not Luminance Contours

The mask in the no-onset condition of the previous experiment was previewed for 200 ms prior to the onset of the search array. For this reason, the mask could be considered an old object and less likely to elicit an attentional response during the visual search. This is confounded by the lack of a luminance transient associated with the onset of the mask at the moment of onset of the target. Luminance transients play a crucial role in other types of backward masking (Breitmeyer & Ganz, 1976; Weisstein et al., 1975) and may be necessary in object substitution. However, evidence from attentional capture experiments suggests that a luminance transient is not necessary for attention to be captured by a new perceptual object (Yantis & Hillstrom, 1994).
The prediction made by the present hypothesis is that a mask that emerges as a new perceptual object at the moment visual search for the target begins should be effective regardless of its association with a luminance transient. To test this prediction, a final experiment was run in which the contours of the mask in the preview frame were hidden or camouflaged in a grid which covered the entire screen. This grid had the subjective appearance of a background texture rather than an arrangement of independent squares.

Method

Three of the four subjects tested in Experiment 5 also participated in Experiment 6 as an additional condition. The procedure was identical to the previous experiment except that the square masks in Frames 1 and 2 were replaced by a grid, which covered the entire screen (see Figure 14). The contours of the grid were constructed in such a way that the contours of the mask that appeared in Frame 3 were also present in the grid. This allowed the luminance transient associated with the contours comprising the mask to occur 200 ms prior to onset of the search array. However, the mask did not appear as a distinct object until 210 ms later, with the onset of Frame 3. The prediction made by the hypothesis at hand was that this mask should behave like the onset mask in Experiment 5.

[Figure 14 panel labels: Preview = 200 ms; Search Array = 10 ms; Mask = 0-320 ms]

Figure 14. Schematic of stimulus used in Experiment 6. Preview of square masks in Expt. 5 was replaced by a full-screen grid. Lines of the grid were such that each square in the grid was identical to the square used as the mask in the previous experiments. After a 200 ms preview, a search array appeared embedded in the grid. The subject's task was to detect the presence of an O among Cs. After 10 ms the search array as well as the grid vanished leaving only the contours of a square mask at the target location.
Notice that these contours which define the mask in the final frame remain unchanged throughout the entire trial.

Results

Mean sensitivity for the three subjects is plotted as a function of mask duration in Figure 15. This curve is superimposed on the results from Experiment 5 for these same three subjects. Sensitivity at mask duration = 0 was greater in the grid condition than in either the onset or no-onset conditions. However, the pattern that emerged as mask duration increased followed the same pattern as the onset condition in Experiment 5. At 40 ms sensitivity increased slightly. As mask duration increased beyond 40 ms, a strong and consistent drop in sensitivity emerged with sensitivity falling almost to the level of the onset condition and below the level of the no-onset condition.

A 3 (preview conditions) x 5 (mask durations) repeated measures ANOVA was performed on the data obtained from the three subjects who participated in Experiments 5 and 6. The preview condition x mask duration interaction was significant (F(8, 16) = 5.814; p = .001). Sensitivity at mask durations of 160 ms and 320 ms was compared by simple main effects analysis. A significant effect of preview condition was found at a mask duration of 160 ms (F(2, 4) = 9.378; p < .05) and at a mask duration of 320 ms (F(2, 4) = 13.132; p < .05). Differences between preview conditions at these mask durations were further analyzed by the Newman-Keuls procedure to determine if the observed differences between the onset and grid conditions and the no-onset and grid conditions were significant. For a 160 ms mask, a significant difference was found between the no-onset and grid conditions (Q(2, 4) = 4.26; p < .05), but not between the onset and grid conditions (Q(2, 4) = 2.10; p > .05). In similar fashion, a mask duration of 320 ms yielded a significant difference between the no-onset and grid conditions (Q(2, 4) = 5.328; p < .05), but not between the onset and grid conditions (Q(2, 4) = 1.625; p > .05).
[Figure 15: plot of sensitivity against Mask Duration (ms).]

Figure 15. Results of Experiment 6 averaged across 3 subjects. Sensitivity (d') to detect the target as a function of mask duration. Results of the grid preview condition are superimposed on results from the onset and no-onset conditions of Experiment 5.

Discussion

The present hypothesis predicts that object substitution is critically dependent on the temporal relationship between onset of the target and onset of the mask. Furthermore, unlike previous theories of backward masking, it predicts that the time course of luminance transients in the display should be of little or no consequence. The specific prediction made in this experiment was that object substitution should occur in the grid condition just as it occurred in the onset condition. Comparison of the shape of the masking functions in Figure 15 clearly indicates that this prediction held true.

The greater sensitivity at mask duration = 0 can be explained for both the no-onset and grid conditions by the absence of a luminance transient. It has been shown that object substitution displays are not entirely free from low-level contour interactions (Di Lollo et al., 1999). Eliminating the luminance transient at the onset of the mask contours would likely reduce any low-level interference that existed between mask and target contours.

Eliminating the luminance transient associated with the onset of the mask does not eliminate object substitution masking. Strong masking was observed in the grid condition, which matched the pattern observed in previous experiments. Object substitution occurred despite the contours of the target and mask appearing in the same spatial and temporal relationship as in the no-onset condition of Experiment 5. If the results of Experiment 5 were due simply to a ceiling effect, the results of Experiment 6 should have been similar to the no-onset condition.
This is further evidence in support of an attentional selection of the mask during search for the target. During the preview period, the visual system was not able to segregate the contours of the mask from the background of the grid. The mask only appeared as a new object when the grid lines vanished at the termination of the search array. This emergence of the mask as a new object just as visual search for the target was beginning led to an attentional selection of the mask before attention could be directed to the target.

CONCLUSIONS

The backward masking phenomenon has been studied for nearly a century, yet complete understanding of the phenomenon has not been reached. This may be due, in part, to the non-unitary nature of the effect. A variety of different processes may contribute to masking, only some of which, namely metacontrast and object substitution, have been described here. It is also possible that an explanation for backward masking will not emerge until the problem is addressed in the proper framework. Previous accounts have centered on memory mechanisms (e.g., Averbach & Coriell, 1961; Spencer & Shuntich, 1970), or on low-level contour interactions (e.g., Breitmeyer & Ganz, 1976; Weisstein et al., 1975). The recent effort to characterize object substitution masking in terms of selective attention has been successful and may well be the key to understanding the underlying mechanisms.

The six experiments presented here represent a significant refinement in our thinking about backward masking. Experiment 1 generalized the findings of Di Lollo et al. (1999) to confirm that object substitution is not confined to common-onset stimuli. Conventional SOA paradigms yield backward masking that is sensitive to set-size manipulations as well. The results of Experiments 2 and 3 have clarified some points left unexplored by Di Lollo et al. (1999) and have exposed at least one unexpected result.
Experiment 2 showed that a valid spatial pre-cue bearing no structural similarity to the mask acts to reduce the object substitution effect. This point is of particular interest when comparing the results of Experiment 5 to those of Di Lollo et al. Notice that Di Lollo et al. pre-cued the target location by presenting the mask before the target onset. By pre-cueing every possible target location with a mask in a similar fashion, Experiment 5 showed that mask preview alone is sufficient to eliminate object substitution, regardless of any spatial information provided by its location. The results of Experiment 2 clarify this point. Both a valid spatial pre-cue and a preview of the mask are sufficient to reduce or eliminate object substitution. Experiment 2 also compared valid and invalid spatial cues, which had not been done by Di Lollo et al. This led to an unexpected finding: there is no cost associated with an invalid cue. Shifting attention to the wrong location before the target onset is no worse than having no cue whatsoever. This may indicate that in the no-cue condition, subjects do not divide their attention across the entire display in the moments before target onset, but rather focus on a random location. In both invalidly-cued trials and no-cue trials, subjects must still disengage, shift, and re-engage attention in order to do the task. The resulting delay allows object substitution to occur.

Experiment 3 was an important refinement of Di Lollo et al.'s (1999) pop-out versus non-pop-out experiment. They claimed to show that object substitution does not occur when the target pops out of the display. This claim was too strong given the nature of their experiment. Their paradigm compared a pop-out display to a display in which the target appeared identical to some of the distractors. The mask in this condition served as a partial report cue to single out the target.
Thus, they showed only that object substitution does not occur when the target can be differentiated from distractors based solely on its figural properties. Experiment 3 confirmed their claim by comparing object substitution in serial and parallel search displays. The findings showed that a target that can be located in the display only after an inefficient serial search is highly susceptible to object substitution. By contrast, a pop-out target is invulnerable to object substitution.

Experiments 4, 5, and 6 lend support for a different way of thinking about object substitution. In this framework, object substitution occurs as a result of attentional selection of the mask when the system should be selecting the target. Whereas emphasis has been previously placed on the time course of visual search for the target (e.g., Di Lollo et al., 1999), this view puts emphasis on the order in which the mask and target are selected by the visual system. The notion that the mask becomes selected by the visual system had been implied by Di Lollo et al. (1999) but not tested. Experiment 4 applied the basic set-size manipulation to the mask frame to test this possibility. Adding masks to the display prevented attentional selection of the particular mask at the target location, just as increasing set-size in Experiment 1 prevented selection of the target. As predicted, this manipulation eliminated object substitution, suggesting that attentional selection of the mask is an important part of object substitution. In object substitution, the single mask at the target location appears highly salient. Both common-onset and SOA paradigms produce the subjective impression that one "cannot help but look at" the mask. Such observations are consistent with an involuntary attentional selection of the mask, although a voluntary process cannot be dismissed.
Experiment 5 addressed the possibility that attention is drawn involuntarily to the mask when it outlives the target in a common-onset display. The strategy was inspired directly by attentional capture paradigms (e.g., Yantis & Jonides, 1984). By exploiting the tendency for the visual system to selectively attend to new objects in the display, Experiment 5 effectively pre-empted an attentional selection of the mask during search for the target. Simply presenting the mask 200 ms in advance of the target eliminates object substitution. This finding highlights the importance of the order in which the visual system is given new objects to deal with. When presented with a highly salient mask and a visual search simultaneously, the system seems to select the mask first. When the onset of the mask occurs during visual search for the target, as in Experiments 1, 2, and 3, the newer mask becomes selected rather than the target and object substitution results. Conversely, when the onset of the mask precedes the onset of the search array, the system may deal with the newer items in the search array without interference by the older mask.

The importance of the temporal order of object selection in object substitution displays was made more compelling by the findings of Experiment 6. By concealing the mask in the background during the 200 ms preview period, its emergence as a distinct object was prevented until the visual search for the target was underway. Because the visual system was presented with a new and highly salient mask during visual search, the mask was selectively attended rather than the target, and object substitution occurred. This shows that it is neither the temporal order of luminance transients nor the order of contour onsets that is critical in object substitution. The critical factor is the order in which the masking object and the visual search array are presented to the visual system.
The effect of object substitution on conscious visual perception has been intentionally left unexplored in the previous discussion. Visual awareness is difficult to define and no attempt was made to evaluate the subjects' phenomenological impressions. It is worth noting, however, that object substitution results in a compelling sensation of not having apprehended the target. For this reason object substitution may offer a unique tool in the scientific study of consciousness.

In connecting attentional selection of visual objects to the object substitution effect, a link has been forged between backward masking and attentional modulation of neurons in visual cortical areas V1, V2, V4, and IT. Neurons encoding the features of unattended objects are suppressed when a single object in the display is selected (Chelazzi et al., 1993). Likewise, the visual representation of the target in object substitution is suppressed when the mask becomes selected. This simple observation opens the way for investigation into the neural correlates of object substitution, selective attention, and possibly visual awareness.

REFERENCES

Averbach, E., & Coriell, A. S. (1961). Short-term memory in vision. Bell System Technical Journal, 40, 309-328.
Breitmeyer, B. G., & Ganz, L. (1976). Implications of sustained and transient channels for theories of visual pattern masking, saccadic suppression, and information processing. Psychological Review, 83, 1-36.
Breitmeyer, B. G. (1984). Visual Masking: An integrative approach. New York: Oxford University Press.
Bridgeman, B. (1988). Visual evoked potentials: Concomitants of metacontrast in late components. Perception & Psychophysics, 43 (4), 401-403.
Bridgeman, B. (1971). Metacontrast and lateral inhibition. Psychological Review, 78, 528-539.
Chelazzi, L. (1995). Neural mechanisms for stimulus selection in cortical areas of the macaque subserving object vision. Behavioural Brain Research, 71, 125-134.
Chelazzi, L., Miller, E. K., Duncan, J., & Desimone, R. (1993). A neural basis for visual search in inferior temporal cortex. Nature, 363, 345-347.
Desimone, R., Schein, S. J., Moran, J., & Ungerleider, L. G. (1985). Contour, color, and shape analysis beyond the striate cortex. Vision Research, 25 (3), 441-452.
Di Lollo, V., Enns, J. T., & Rensink, R. A. (1999). Psychophysics of reentrant visual pathways: A computational model of object substitution (CMOS). Manuscript submitted for publication.
Di Lollo, V., Bischof, W. F., & Dixon, P. (1993). Stimulus-onset asynchrony is not necessary for motion perception or metacontrast masking. Psychological Science, 4, 260-263.
Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology: General, 113 (4), 501-517.
Egeth, H. E., & Yantis, S. (1997). Visual attention: Control, representation, and time course. Annual Review of Psychology, 48, 269-297.
Enns, J. T., & Di Lollo, V. (1997). Object substitution: A new form of masking in unattended visual locations. Psychological Science, 8, 135-139.
Fehrer, E., & Biederman, I. (1962). A comparison of reaction and verbal report in the detection of masked stimuli. Journal of Experimental Psychology, 64, 126-130.
Havig, P. R., Breitmeyer, B., & Brown, V. R. (1998). The effects of pre-cueing attention on metacontrast masking. Investigative Ophthalmology and Visual Science Annual Meeting Abstract Book, 39(4), S630.
Kahneman, D. (1968). Method, findings, and theory in studies of visual masking. Psychological Bulletin, 70, 404-425.
Lamme, V. A. F. (1995). The neurophysiology of figure-ground segregation in primary visual cortex. Journal of Neuroscience, 15, 1605-1615.
Lee, T. S., Mumford, D., Romero, R., & Lamme, V. A. F. (1998). The role of the primary visual cortex in higher level vision. Vision Research, 38, 2429-2454.
Luck, S. J., Chelazzi, L., Hillyard, S. A., & Desimone, R. (1997). Neural mechanisms of spatial selective attention in areas V1, V2, and V4 of macaque visual cortex. Journal of Neurophysiology, 77, 24-42.
Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user's guide. New York: Cambridge University Press.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229, 782-784.
Motter, B. C. (1993). Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. Journal of Neurophysiology, 3, 909-919.
Nakayama, K., & Mackeben, M. (1989). Sustained and transient components of focal visual attention. Vision Research, 29, 1631-1647.
Posner, M. I., & Petersen, S. E. (1990). The attention system of the human brain. Annual Review of Neuroscience, 13, 25-42.
Ramachandran, V. S., & Cobb, S. (1995). Visual attention modulates metacontrast masking. Nature, 373, 66-68.
Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395, 376-381.
Spencer, T. J., & Shuntich, R. (1970). Evidence for an interruption theory of backward masking. Journal of Experimental Psychology, 85, 198-203.
Tata, M. S., Di Lollo, V., & Giaschi, D. E. (1998). Visual attention modulates metacontrast masking. Investigative Ophthalmology and Visual Science, 39(4), S632.
Theeuwes, J. (1991). Exogenous and endogenous control of attention: The effect of visual onsets and offsets. Perception & Psychophysics, 49 (1), 83-90.
Theeuwes, J. (1992). Perceptual selectivity for color and form. Perception & Psychophysics, 51, 599-606.
Treisman, A., Cavanagh, P., Fisher, B., Ramachandran, V. S., & von der Heydt, R. (1990). Form perception and attention. Striate cortex and beyond. In L. Spillmann, & J. S. Werner (Eds.), Visual Perception: The Neurophysiological Foundations (pp. 273-316). San Diego: Academic Press.
Treisman, A. M., & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97-136.
Uttal, W. R. (1970). On the physiological basis of masking with dotted visual noise. Perception & Psychophysics, 7(6), 321-326.
von der Heydt, R. (1995). Form analysis in visual cortex. In M. S. Gazzaniga (Ed.), The Cognitive Neurosciences (pp. 365-382). Cambridge: MIT Press.
Weisstein, N., Ozog, G., & Szoc, R. (1975). A comparison and elaboration of two models of metacontrast. Psychological Review, 82, 325-343.
Wolfe, J. M. (1998). Visual Search. In H. Pashler (Ed.), Attention (pp. 13-73). Hove: Psychology Press.
Wolfe, J. M., Cave, K. R., & Franzel, S. L. (1989). Guided search: An alternative to the feature integration model for visual search. Journal of Experimental Psychology: Human Perception and Performance, 15 (3), 414-433.
Yantis, S. (1996). Attentional capture in vision. In A. F. Kramer, M. G. H. Coles, & G. D. Logan (Eds.), Converging operations in the study of visual selective attention (pp. 45-76). Washington, DC: American Psychological Association.
Yantis, S., & Hillstrom, A. P. (1994). Stimulus-driven attentional capture: Evidence from equiluminant visual objects. Journal of Experimental Psychology: Human Perception and Performance, 20 (1), 95-107.
Yantis, S., & Johnson, D. N. (1990). Mechanisms of attentional priority. Journal of Experimental Psychology: Human Perception and Performance, 16, 812-825.
Yantis, S., & Jonides, J. (1984). Abrupt visual onsets and selective attention: Evidence from visual search. Journal of Experimental Psychology: Human Perception and Performance, 10, 601-620.
Zipser, K., Lamme, V. A. F., & Schiller, P. H. (1996). Contextual modulation in primary visual cortex. Journal of Neuroscience, 16 (22), 7376-7389.
