UBC Theses and Dissertations


The role of expectations in the feature integration process Butler, Deborah Lynne 1985


THE ROLE OF EXPECTATIONS IN THE FEATURE INTEGRATION PROCESS

by

DEBORAH LYNNE BUTLER

B.A., The University of California, San Diego, 1981

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS

in

THE FACULTY OF GRADUATE STUDIES
(Department of Psychology)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
April 1985
© Deborah Lynne Butler, 1985

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Psychology
The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3

Date: [illegible]

Abstract

According to Treisman (Treisman and Schmidt, 1982) feature detection occurs in parallel, while the correct integration of detected features into an object requires focal attention. She has proposed that in the absence of attention, subjects will perceive "illusory conjunctions", or invented objects constructed out of features actually present in a display. The present experiments were designed to examine how the presence of expectations might affect the feature integration process and the construction of illusory conjunctions. The results of these experiments suggest that expectations do affect the perception of simple objects: subjects make more illusory conjunctions in the absence of expectations, and the perception of expected objects is the most accurate.
However, the data indicate that expectations do not exert this influence by guiding the feature integration process, because subjects do not tend to construct expected objects out of features appearing in a display. As a result, it is likely that expectations are influential not by determining the construction of object files, but by speeding up the identification of the features of expected objects, so that focal attention can be applied, and object files constructed, more efficiently. Consequently, the perception of expected objects can be accurately accomplished in a shorter amount of time.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgements
Introduction
  Definitions
  Conceptually Driven Processing
    The Role of Context in Perceptual Processing
  Data Driven Processing
    Feature and Dimension Combination: The Construction of an Object
    Feature Integration Theory
    Object Files
  Conceptual Knowledge and Feature Integration
The Experiments
  Experiment 1
    Method
      Subjects
      Stimuli
      Measures
      Procedure
    Results and Discussion
    Conclusions
  Experiment 2
    Method
      Subjects
      Materials
      Procedure and Measures
    Results and Discussion
General Discussion and Conclusions
References

List of Tables

I. Mean percentage of each type of cued object reported, and of those reports, the mean percentage correct: Pilot study
II. Mean proportions of response types to unconstrained, constrained normal, and constrained abnormal cued objects by confidence: Experiment 1
III. Normal cued objects: mean number of features drawn from normal or abnormal objects, when making conjunction or feature errors
IV. Mean proportions of response types to normal and abnormal cued objects by confidence: Experiment 2

List of Figures

1. Experimental Stimuli
2. Experimental Cards: Experiment 1
3. Experimental Cards: Experiment 2

Acknowledgement

There are a number of people without whose help I could never have completed this project, and I would like to offer my thanks to them. First I would like to thank the members of my committee: Dr. Anne Treisman, my advisor, who has read many, many drafts of this rather long document, and who has patiently and meticulously offered comments on every draft, and both Dr. Jennifer Campbell and Dr. Larry Ward, who have offered helpful comments and criticisms. One person who has been exceptionally supportive during the last two years is Dr. Larry Walker, and I would like to extend a very special thanks to him, for his time, his effort, and his patience. I would also like to thank my family (Mom, Dad, Steve, and Ed) for their endless emotional and financial support. I think they believed I would finish this more than I did. And finally, my friends and fellow students have continuously boosted my morale over a long haul, and I couldn't have done it without them: Teresa and Tweed especially, Brian, Jacquie, the attention lab crowd, Cindy, Paula, Jackie, John... Thanks!

INTRODUCTION

What are the psychological processes that underlie object perception? First, it is clear that perceptual processing involves extracting information from a visual stimulus, and building a representation of an object from that information. This type of processing has been termed "data driven", or "bottom up" (Lindsay and Norman, 1976), as it involves constructing a perception out of data acquired from our visual world. Perceptual processing also involves information acquired from our prior knowledge about our world, however.
Our knowledge of possible objects or the context in which an object usually appears will often guide our interpretation of the visual information we have accumulated (Lindsay and Norman, 1976; Neisser, 1967). For example, an ambiguous figure can be perceived in either of two ways (that is, a single set of visual cues can be combined differently) depending on our interpretation of the figure. This type of contribution to perceptual processing has been termed "conceptually driven" (Lindsay and Norman, 1976), as it describes how our expectations or concepts can themselves determine how our perceptions of the world are constructed.

Thus perceptual processing involves the construction of a perception from a combination of information, some derived from the physical world, and some contributed by our prior understanding of how that world operates (Neisser, 1967). In order to understand object perception, it is crucial to understand how these two types of information interact. When is visual information most important? How much and when do we rely on context or our expectations in the course of perceiving an object? These questions have been the driving force behind much of the previous research on object perception (Minsky, 1975; Biederman, Mezzanote, and Rabinowitz, 1982; Biederman, Glass, and Stacy, 1973; Mandler and Johnson, 1976; Loftus and Mackworth, 1978; Palmer, 1975; Bobrow and Norman, 1975; Kuipers, 1975; Friedman, 1979; Chastian, 1977), and are once again the focus here.

Ultimately this thesis will describe two experiments designed to test the effect of one type of expectation on the concurrent perception of simple objects. In order to describe the relevance of the present experiments to an understanding of the perceptual process, it is first necessary to construct a theoretical framework in which to discuss them.
Therefore, this introduction will begin with a discussion of our current knowledge about the relative contributions of conceptual and visual knowledge to the construction of a perception, and an integration of this knowledge into a working perceptual theory. It will then be possible, within the context of this integrated perceptual framework, to understand the potential contribution of the present experiments.

DEFINITIONS

A precise, comprehensive definition of many of the terms associated with object perception would require a discussion much too involved for this introduction. Therefore, a set of rather simplistic definitions will be adopted here, sufficient for an understanding of the research reviewed below.

First, the term "feature" will be used in this thesis to refer to the components out of which objects are constructed. In other words, features can be parts or properties, which in combination with one another make up our perception of an object. Some examples of features would be line segments, textures, brightness, colors, or angles. Generally features are thought to be recognized by "feature detectors" simultaneously, or in parallel, and independently.

One distinction that will be blurred is that between features and dimensions, since in the context of this thesis this distinction is not really important. Treisman and Schmidt (1982) have defined a dimension to be a set of possible mutually exclusive states of a variable, while features are particular values on a dimension. So although it is possible to differentiate between features and dimensions, within this thesis the terms will be used interchangeably.

Objects, on the other hand, are collections of features which appear together at a common spatial location.
In other contexts this definition would be inadequate, because often objects overlap, or occupy large areas where there are gaps within the object itself. This thesis, however, is primarily concerned with a description of how it is that objects composed of a set of features at a common spatial location are perceived, and so for the present purposes this limited definition should suffice.

Further, a scene is a collection of objects which appear together in a larger display. For example, a kitchen scene would consist of a collection of objects appearing in related spatial locations: the refrigerator might be next to a counter, on the other side of which stands a stove. In a way, then, objects are to scenes as features are to objects, because objects are the components of scenes, while features are the components of objects.

One clear problem with these definitions, however, is that given a single scene, it is not immediately easy to pick out an "object". For example, in a kitchen scene, would we call the handle on an oven door an object? What about the oven door? Would we reserve the label for the entire stove itself? What is the relationship of these different objects to one another? It is apparent that the above candidates for objecthood are not unrelated, but that in fact each is a component of the next. The handle is a part of the oven door, which is in turn part of the stove. The question, then, is, relatively speaking, which level of components is the "object" level, and is it possible to make such a distinction?

In this thesis this problem will be solved somewhat arbitrarily, partly because any answer to this question is of necessity contextually bound. Basically, any of the above "units" could be objects, because each is composed of features which occur at a common spatial location.
The important point, however, is that the objects which appear in a scene are often nested. In essence, any scene can be viewed as a hierarchy of levels of spatially distinct components. For example, a kitchen scene is composed of a set of components, which are themselves objects, some of which are a refrigerator, a stove, and a sink. At the next level down, each of these objects is also composed of a set of components, which again, because they consist of a collection of features, are objects. Examples of these would be the elements, the door, and the dials on the stove. Similarly, the dials are composed of components, which are composed of components, etc. The lowest level in any hierarchy, according to these definitions, would be the feature level, because at that level the components of an object are perceived simultaneously and independently by the visual system.

CONCEPTUALLY DRIVEN PROCESSING

It is widely accepted, and easily demonstrated, that perception is not simply a process of constructing out of observable, physical data a "percept". In fact, even in trying to define the "units" of perception a problem arises. Some researchers (Rumelhart and Siple, 1974) have proposed that our visual system itself structures the physical information it receives, at an early point in the perceptual process. Rumelhart and Siple (1974) note that there is a difference between the set of all possible features to which our neural cells respond, and "functional features", which are those which serve some psychological function. For example, these researchers claim, although our feature detectors may recognize line segments of varying length, those which become units of analysis are those which will distinguish for us between the members of a set of possible objects.
They propose, then, that not even a description of feature detection can be complete without taking into account the way in which our visual system interprets the environment.

It has been proposed that our visual system, at an early point in the perceptual process, interprets the environment, or imposes its own order on the physical world. This may only be the beginning, however, of the influence of our own minds on the perceptions we construct. Researchers have documented many other ways in which our perceptions seem to be psychologically influenced.

First, it should be obvious that our perceptions of units at the different levels within a perceptual hierarchy are not independent. What we perceive at the feature level certainly has implications for the perception we build of an object of which those features are a part. Yet perception does not seem to consist of an orderly accrual of information as we ascend through the perceptual hierarchy to the highest possible unit of analysis (Rumelhart, 1977; Neisser, 1967; Lindsay and Norman, 1976). Instead, many researchers claim that perceptual processing may occur at all levels simultaneously. Not only does the perception of lower level information contribute to the construction of higher level percepts, but simultaneously information about the identity of higher level units influences the perception of those at lower levels.

For example, Homa, Haver, and Schwartz (1976) showed that the perception of features within a face is better when the face is organized than when it is scrambled. Here, the organization of facial features into a face, an object at a higher level, influences the identification of that object's component parts. Reicher (1969) demonstrated that letters were recognized better after having appeared in a word than if they had appeared either alone or in four-letter nonsense words.
In addition, Pillsbury's (1898) frequently cited demonstration showed that subjects will perceive "FOREVER" when in fact they were presented with "FOYEVER"; the identity of the word altered the perception of a component letter. Further, at an even higher level in a perceptual hierarchy, Lindsay and Norman (1976) provide a compelling demonstration of their own of the effect of sentence and text comprehension on the perception of a word. In one sentence they left out a word entirely, and a student reading the text is unlikely to catch the omission.

A large amount of work has also been done which demonstrates that the presence of a scene affects object perception (Mandler and Johnson, 1976; Biederman et al., 1982; Biederman et al., 1973; Palmer, 1975). For example, Biederman et al. (1973) showed that the identification of an object was more rapid in coherent scenes than in jumbled ones. These experimenters also measured subjects' reaction times to report whether a target object was present in the scene, where the target objects would either be present in the scene, not present but probable given the context involved, or not present and unlikely to have appeared. They found that subjects were slowest to respond when the object might have been there but wasn't, and were quickest when the target was absent and improbable. Thus the semantic meaning of the scene clearly influenced the perception of an object within it. Similarly, Palmer (1975) showed that subjects were better at identifying objects which appeared within a reasonable context, for example, a bread box within a kitchen, than they were at identifying that object either within no context or within an inappropriate one, such as on a city street.
And finally, again in the realm of verbal materials, Bransford and Johnson (1972) described a number of experiments which demonstrated the influence of an entire paragraph on the perception of a component sentence.

In sum, this research has shown that the organization of lower level components into objects at higher levels or into scenes may influence our perception of those lower level components. Thus our perceptions of the physical environment, or of those units at lower levels, are influenced by our knowledge of the context in which they are embedded. And paradoxically, some researchers claim that the context itself can be provided by information we accrue about higher level objects, composed themselves of that same physical information and of those lower level components.

Clearly any influence that context may have on the perception of lower perceptual units must hinge on our prior knowledge about the context involved, and about the relationships of different units within that context. In other words, it may be our prior knowledge which is responsible for the effect of context on perceptions at lower levels. Researchers have also shown that sometimes conceptual information derived from sources outside of the scene can also provide a context, and so influence our perception of that scene.

For example, another way in which perception may be influenced is by expectations we may have about a display. LaBerge (1973) led his subjects to have expectations about the stimuli they could expect to perceive, and found that expectations which were fulfilled aided subjects in their perceptions of unfamiliar letters (ones that he had invented), while unfulfilled expectations hindered that perception. On the other hand, expectations failed to affect subjects' visual processing of familiar letters.
Still, these experiments showed that, at least in the perception of stimuli which are not perceived automatically, expectations affect subjects' perceptions.

Similarly, other researchers have shown that subjects can be given enough contextual information to influence their perceptions simply by providing them with a verbal label. For example, the perception of a degraded picture can be aided by the provision of a verbal label, as in the familiar example of the degraded picture of a dalmatian (see Lindsay and Norman, 1976). When a verbal label is provided, it can serve to structure the identification of objects in a scene. In another experiment, Bransford and Johnson (1972) documented the influence of a thematic label on the comprehension of a paragraph. In both of these examples, a context provided by a verbal label outside of the stimuli being perceived influenced the perception of those stimuli.

It seems, then, that conceptual information, derived either from the higher levels in a presented scene itself, or from outside sources, such as a verbal label or expectations, will influence perception. The question then becomes, how is that influence effected? Many researchers have investigated the way in which context and conceptual information influence perceptual processing, and some of this work is reviewed below.

The Role of Context in Perceptual Processing

Somehow it seems intuitively implausible that a semantic interpretation of a scene could precede an analysis of that scene's component parts. Yet there is a great deal of evidence that a semantic meaning is derived from a scene first, and that this understanding influences further processing of the scene.

First, many researchers have shown that we can semantically process a picture very rapidly (Potter and Levy, 1969; Potter, 1975, 1976; Intraub, 1980).
For example, Potter (1976) showed that targets could be identified within a picture with 69% accuracy after as little as 113 msec, which replicated a similar result she had obtained in an earlier paper (Potter, 1975). Intraub (1980) showed that subjects could detect 35% of pictures described only by a category name at exposure durations of only 114 msec. Thus it is clear that we extract a semantic meaning from a picture very rapidly, within a single eye fixation. It is possible, then, that the semantic information we have about a scene is available early enough to influence our further processing of the details contained within the scene.

Further, we seem to be able to extract the meaning of a scene without necessarily perceiving the details of the objects within it (Mandler and Johnson, 1976; Loftus and Bell, 1975). For example, Mandler and Johnson (1976) found that during the course of perception, we gradually perceive more and more details of the presented objects; yet this accrual of information happens after our perception of the scene as a whole. Not only is the meaning of a scene extracted early, then, but also details of the components of a scene may be accumulated afterwards.

Finally, consider again the results cited above (Homa, Haver, and Schwartz, 1976; Reicher, 1969; Pillsbury, 1898; Biederman et al., 1973; Palmer, 1975) which showed that the perception of the component parts of a scene is influenced by a semantic interpretation of the scene as a whole. Again these findings support the claim that a semantic meaning may be extracted early, and subsequently affect the perception of objects within the scene.

It is a well established finding, then, that the presence of higher level components influences the perception of lower levels in a display.
One possible explanation for this influence would be that somehow "global" information about a scene is extracted first, prompting a semantic interpretation, which then influences the subsequent processing of lower levels. Navon (1977), for example, claimed that during the course of visual processing, global forms had temporal precedence over "local", or component forms. He presented subjects with stimuli where a global form (e.g. an X) was made up of a series of smaller forms, which were either the same as the global form (e.g. small x's), or different (e.g. small +'s). He found, first, that subjects were faster to identify global forms than they were to identify local forms. Further, when the global and local forms were different, the inconsistency slowed down the identification of local, but not of global forms. As a result of these findings, Navon concluded that global forms had temporal precedence over local forms.

However, other researchers have challenged Navon's claim that global forms have precedence over local forms. These researchers have demonstrated that there are many variables which determine the temporal precedence of levels in a display, including the salience of the forms at each of the levels (Martin, 1979), the size of the forms (Kinchla and Wolfe, 1979), and the level which was most recently given precedence (Ward, 1982). Ward (1983) has pointed out that further research does not support Navon's original claims. Therefore, it is probably not the case that an initial visual analysis of "global" characteristics is responsible for the reliable influence of context on the perception of lower level components. Rather, our visual system is most likely flexible, with the capacity to attend to any level in a perceptual hierarchy: global, local, or somewhere in between.
It is important, then, not to assume it is a temporal precedence of global forms which is responsible for the influence of context on the perception of lower level components. It may be an early perception of either global or local visual information which leads to an initial semantic interpretation, which then will influence the perception of either local or global forms. It is likely, then, that all levels in a perceptual display influence one another in the course of perception (Rumelhart, 1977). In sum, the presence of higher levels influences lower level perception by providing a broader context, not necessarily by being analyzed first.

How might context affect the perception of components at different perceptual levels? Many researchers have described the influence of higher levels on lower levels in terms of "frames" or "schemata" (Minsky, 1975; Biederman et al., 1973; Biederman et al., 1982; Mandler and Johnson, 1976; Friedman, 1979; Loftus and Bell, 1975; Loftus and Mackworth, 1978). A frame is a cognitive structure which presumably houses the prior knowledge we possess about scenes or objects. For example, Minsky (1975) describes "room" frames. First, these frames might contain a list of the objects we are likely to find in different types of rooms. Further, a particular frame would specify the relationships we know the objects in a room would probably have to one another. So a kitchen frame might specify that we should expect to find a refrigerator, a sink, and a stove, and that we can expect to find a garbage bag under the sink. It may be, then, that global information about a scene might be enough to "instantiate" a frame, which would affect our further processing of the scene.
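The kitchen frame described above can be pictured as a small data structure: a list of expected objects plus a set of expected spatial relations. The following is only an illustrative sketch under stated assumptions. The names `Frame`, `instantiate`, `expected_objects`, and `expected_relations` are hypothetical, and choosing the frame that shares the most objects with a brief glimpse is a simplifying assumption made for illustration, not a mechanism proposed by Minsky (1975) or by the other researchers cited here.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical container for prior knowledge about one kind of scene."""
    name: str
    # Objects we would expect to find in this kind of scene.
    expected_objects: set = field(default_factory=set)
    # Expected spatial relations, stored as (object, relation, landmark).
    expected_relations: set = field(default_factory=set)

    def is_expected(self, obj: str) -> bool:
        """Return True if the frame lists this object as expected."""
        return obj in self.expected_objects


def instantiate(frames, glimpsed_objects):
    """Assumed rule: pick the frame sharing the most objects with the glimpse."""
    glimpsed = set(glimpsed_objects)
    return max(frames, key=lambda f: len(f.expected_objects & glimpsed))


kitchen = Frame(
    name="kitchen",
    expected_objects={"refrigerator", "sink", "stove", "garbage bag"},
    expected_relations={("garbage bag", "under", "sink")},
)
street = Frame(
    name="city street",
    expected_objects={"car", "stop sign", "fire truck"},
)

# A brief glimpse of a scene: two kitchen objects and one out-of-place object.
glimpse = ["stove", "sink", "stop sign"]
active = instantiate([kitchen, street], glimpse)

# Objects the active frame does not predict; on the accounts reviewed in this
# section, these are the objects most likely to attract attention.
unexpected = [obj for obj in glimpse if not active.is_expected(obj)]
```

In this sketch the glimpse instantiates the kitchen frame, and the stop sign is the only object the frame does not predict.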
Above it was suggested that the way in which context could affect the processing of component parts was through the prior knowledge we have about the context involved. This is essentially the claim researchers are making when they talk about frames. A frame contains our knowledge about a context. The global information we perceive in a scene may instantiate an appropriate frame, so that the information we already possess will be available to us when visually processing that scene. Therefore, whether or not frames actually exist, for the purposes of this thesis they provide a shorthand with which to discuss the way in which prior knowledge influences the perception of objects.

In the terms of frame theory, then, the work of Potter (1975, 1976) and of Intraub (1980) suggests that a very brief exposure to a picture is sufficient for the instantiation of a frame. What might be the effect of the frame's instantiation? First, many researchers have proposed that we use our prior knowledge about a scene and the relationships between objects in order to direct our further perceptual processing (Loftus and Bell, 1975; Loftus and Mackworth, 1978; Friedman, 1979; Antes, 1977). In other words, after a global perception of a scene, we may preattentively parse the scene into areas deserving of further processing, and frames may influence the way in which this parsing is done. Generally, researchers agree that the areas concentrated on first are those which are most "informative".

That our attention is drawn to the informative areas in a scene has been demonstrated in a number of experiments, within which "informative" was defined in a couple of different ways. For example, Antes (1977) simply had one group of subjects subjectively rate the informativeness of sections of a picture, after which an analysis revealed that informative sections tended to contain more objects.
Friedman (1979), Loftus and Bell (1975), and Loftus and Mackworth (1978) defined areas within a scene as informative if they would serve to distinguish the picture presented best from other pictures of the same type. Thus any object that was unusual within the context of a scene would be more informative, as it would be very useful in distinguishing that scene from similar others.

These researchers showed that subjects pay more attention to informative objects (Loftus and Bell, 1975; Antes, 1977; Friedman, 1979). They direct more eye fixations to them, and they focus their attention on them longer (Loftus and Mackworth, 1978). The conclusion reached on the basis of these experiments was that subjects use a frame to define which areas of a picture are informative, and subsequently focus their attention on those areas.

The suggestion is, then, that having prior knowledge about the probable composition of a scene influences the parsing of that scene into objects. As a result, according to Friedman (1979), Loftus and Bell (1975) and Loftus and Mackworth (1978), unexpected objects are more likely to be given attention. Friedman (1979) showed that as a consequence, subjects know more about the details of improbable objects, which they attend to, than they do about the details of expected objects. Furthermore, in cases when expected objects do not appear within a scene, subjects are likely to assume they perceived those objects anyway, simply because they expected to. For example, Antes (1977) showed that in a recognition task, subjects were more likely to falsely recognize an uninformative part of a scene, as opposed to an informative segment, presumably because they did not attend to it.

If subjects are presented with unexpected objects, then, they are more likely to direct their attention to them (Friedman, 1979).
However, given a limited amount of processing time, they are also more likely to be inaccurate when reporting them (Biederman et al., 1973; Palmer, 1975). Recall that LaBerge (1973) found that subjects' perceptions of unfamiliar letters were hindered when their expectations were not fulfilled, but improved when the letters they saw matched their expectations. It is clear, then, that having contextual information or expectations does not always aid perception: in the case of unexpected or unusual objects, perception may in fact be hindered considerably.

As another example of the effect of a context on the perception of expected and unexpected objects, a similar pattern of results was found in a pilot experiment conducted prior to this thesis. In that experiment, a set of twenty-one different pictures was used. These pictures consisted of a number of objects within an average scene. For example, one picture consisted of a desert scene, and included such objects as a tent, a cactus, and mountains in the background. Four different versions of each picture were constructed, so that two versions contained only "normal" expected objects, while the other two versions contained one object which was "abnormal", in that an object appeared in an unexpected color (e.g. a blue firetruck, a red cactus, or a yellow stop sign). Objects such as these, which had an expected color, were "constrained" objects, and objects which could appear in any color (e.g. a tent) were "unconstrained". Twenty-four subjects participated in the study, and were shown one version of each picture. They were asked to verbally report any objects which they were sure they saw, and were told not to guess.

The results from this pilot study are presented in Table I.
Table I
Mean percentage of each type of cued object reported, and of those reports, the mean percentage correct: Pilot study.

                          Percentage Reported    Percentage Correct
Constrained Normal               12.5                   94.7
Constrained Abnormal             32.9                   77.6
Unconstrained                    13.7                   80.0

First, for each type of object, the percentage out of all objects appearing on all 21 cards which were perceived and reported by subjects is presented. Subjects reported a significantly higher percentage of abnormal objects than either normally occurring constrained objects (t=3.67, p<.01) or unconstrained objects (t=3.44, p<.01). Therefore, consistent with previous research, unusual objects were more likely to capture the attention of subjects. Furthermore, subjects were significantly more accurate when reporting normal objects than they were when reporting unconstrained objects (t=3.96, p<.01); and were also more accurate when reporting normal as opposed to abnormal objects, although this latter finding was not significant (t=1.89, p<.10). Expectations, then, resulted in a more accurate perception of expected objects as compared to unconstrained or abnormal objects. This last result is consistent with previous research as well (Palmer, 1975; Biederman et al, 1973).

All of the experiments cited above demonstrate that we access information rapidly about the semantic meaning of a scene, which then influences further processing, but they do not expressly address the question of what characteristics of the scene lead us to the instantiation of a frame. Is it simply the presence of a number of objects that are conceptually related that leads to frame instantiation, or is it the spatial arrangement between these objects which structures them into a coherent scene? A number of researchers have investigated this issue by studying object perception within different types of scenes.
Mandler and Johnson (1976) used organized and "unorganized" scenes. The former type consisted of a picture containing line drawings of objects in coherent spatial relations to one another. Unorganized scenes retained the same objects but destroyed those meaningful spatial relations. Mandler and Johnson then tested what sorts of changes subjects would notice in the pictures in a recognition task. They found that a change in spatial relations between objects was noticed more in organized scenes, while a change in spatial composition, defined by the gross figure/ground relations in a picture, was noticed more in the unorganized scenes. They concluded that in scenes described well by frames, we pay less attention to figure/ground relationships, which are already defined, but more to the semantic relationships between objects. On the other hand, we have to parse unorganized pictures into objects from scratch, and so notice any subsequent changes in the figure/ground relationships. Thus they demonstrated that the presence of a coherent scene results in the instantiation of a frame which aids us in parsing the scene into objects.

Similar evidence on what it is that is crucial about a scene that leads to frame instantiation is provided by Biederman (Biederman, 1981; Biederman et al, 1973; Biederman et al, 1982). In one study Biederman (1981) showed that depth gradients added to a collection of objects are not sufficient to instantiate a frame, even if they do provide the objects with spatial relationships. Therefore, it may be the presence of semantically meaningful spatial relations that is important.
In two other studies (Biederman et al, 1973; Biederman et al, 1982) Biederman investigated whether or not semantic violations, where objects appear in relationships which are semantically inconsistent with the scene as a whole, hurt object identification within jumbled scenes as compared to within coherent ones. For example, one of Biederman's violations was a "size" violation, where the relationships between the sizes of the objects within the scene would be unexpected (e.g. a man much smaller than a gas pump). The objects would maintain these abnormal size relationships within both jumbled and coherent scenes. Biederman found that the violations only affected object identification within coherent scenes. He concluded that during the perception of a coherent scene, a frame would provide information about the relations that should pertain between objects, and a violation of these expectations would interfere with processing. In a jumbled scene, however, expectations about those relations would not be established, and violations should thus not affect processing.

Although the discussion above has focused on the way in which context affects the perception of scenes and of the objects within them, most of the conclusions drawn apply to the perception of objects which appear by themselves. First, frame theorists would propose that frames exist at all perceptual levels, not just at the level of whole scenes. So there would be object frames, which would contain our accumulated knowledge about the composition of objects. Therefore, if a context were provided, an object frame would be instantiated which would affect the perception of an object.

To summarize, then, when a context is provided, either by an initial semantic processing of the scene, by a verbal label, or by expectations, perception is affected by this conceptual information.
Attention is guided to informative parts of a scene: unexpected objects stand out. The perception of expected objects is more accurate when they are present, and is often inferred even when they are not.

Recently, Biederman (Biederman, Teitelbaum, and Mezzanotte, 1983) has challenged the role of frames in perception, or at least the ability of verbal labels to instantiate frames which affect perceptions. He showed that giving a subject a verbal prime about the objects that were to appear within a picture did not significantly affect their identification. Potter (1976) provided subjects with either verbal or visual descriptions of a target object, before quickly (100 msec.) presenting a scene. She found that the detection of target objects was just as accurate in the absence of either type of prime. Biederman et al (1983) suggested that this lack of influence was due to the prime's failure to provide sufficiently detailed information to aid perception. Teitelbaum and Biederman (1979) had shown that object identification did improve with a picture prime rather than with a verbal one, and suggested that that was because a picture provides more detail. In any case, in the 1983 article, Biederman wonders whether frames can provide enough detail when instantiated by a verbal label to be of any use in perception.

This debate and these experiments really do not challenge the role of frames in perception, however. What these experiments investigate is whether or not a verbal prime can add to the influence of the context which is provided by the scene in which the target object is embedded.
A frame can be instantiated very rapidly once a scene is presented, and so it is not surprising that an additional verbal prime should not add much to processing (Potter, 1976), or that to have any effect at all it must provide very specific details, useful for distinguishing the target object from any other alternatives. Thus these findings do not imply that a frame is not already being useful in processing; a frame that was instantiated in response to the context in which the object appeared. It is clear that a verbal prime is very effective in aiding perception if that frame is absent, on the other hand, in, for example, the perception of a degraded picture. Consider again that hard-to-find dalmatian.

In the above discussion the way in which the perceptual process is affected by conceptual information has been examined. But in order to have a complete view of perception, it is also necessary to understand the bottom-up course of perceptual processing. Therefore, the next section in this introduction explores current research on the feature detection process, and on the perception of objects in the absence of context.

DATA DRIVEN PROCESSING

Feature and Dimension Combination: The Construction of an Object

How is it that the perceptual process proceeds in the absence of any prior information about a presented display? A number of different researchers have investigated the perception of a combination of different features occurring in a single location, with the goal of describing how those features are combined. These studies were basically of three types.
The first provides evidence that sometimes the presence of two or more dimensions or features simply results in the emergence of a new feature at the same level. These types of combination, then, do not further the course of the construction of an object except to provide more features that will eventually be integrated. The second type of study addresses the question of how we perceive very familiar stimuli, which are composed of a set of features whose integration we have overlearned. These studies explore the perception of objects where we perceive the combination of features automatically. And finally, the third type of study examines how we perceive objects where the features are detected separately, and must be integrated into the perception of an object. Each of these types will be discussed in turn.

First, Pomerantz, Sager, and Stoever (1977) demonstrated that it is sometimes the case that two features are distinguished from each other faster when shown within an uninformative, constant context, than when alone. For example, "(" and ")" were distinguished faster when another single ")" was added to both features, to form "()" and "))". Paradoxically, Treisman and Gelade (1980) found that parsing a visual field into groups of objects is preattentive and rapid if the different groups are defined by a single feature, while it requires serial attention to group members if their difference is defined by a conjunction of features. In other words, this later work suggests that the figures which consisted of two separate features in Pomerantz et al's studies should have been more difficult to perceive than single features, and should have been perceived more slowly. The apparent contradiction between these two sets of studies can be explained, however, if we examine the product of Pomerantz et al's feature combinations.
In all cases where adding context results in an easier visual parsing as compared to a segregation of a visual field by single features, the combination of features created by the added context produces "emergent features". Pomerantz et al (1977) propose that certain features, when presented together at a common spatial location, result in the emergence of a new feature, which is more than the sum of its parts. These "emergent features" are thought to be detected, themselves, independently of their component features, and at the same level, so that they are then equally influential in a preattentive parsing of a visual scene. They are detected in parallel along with the other features in the display. One example of an emergent feature is closure. In the experiment described above, the addition of a second parenthesis resulted in the emergence of a closure feature, which itself was probably detected in parallel with the other features present, and which then aided subjects in distinguishing between "()" and "))". This feature was not present in the comparison single feature display. Thus the increased ease of parsing may have been due to the presence of the added emergent feature.

Treisman (1983) has provided evidence for the feature-like quality of these emergent features. For example, she has shown that in a search task, targets defined by emergent features will pop out of a display, just as targets defined by a single feature do. Further, Treisman and Schmidt (1982) showed that if subjects are asked to perceive a display consisting of a number of objects, each of which is composed of a set of features, under certain circumstances they will report having seen invented objects that are in fact constructed out of features appearing in the display. This type of error is an "illusory conjunction".
Treisman (1983) found that subjects are less likely to make illusory conjunction errors if the result would include an emergent feature which was not physically present in the display, suggesting again that an emergent feature is more than the sum of its parts. Thus subjects will create a "l£" out of the features "L" and "V" more easily than they will mistakenly combine "L" and "N" in order to perceive a triangle. These experiments show that the result of the simultaneous presence of two or more features at a common spatial location can sometimes be the emergence of another feature at the same level, which itself will contribute to the construction of an object.

Garner (Garner, 1974; Garner and Felfoldy, 1970; Felfoldy and Garner, 1971) provides a similar example of a situation where a combination of dimensions results in the formation of a new dimension, which is itself perceived independently of the dimensions that compose it. Garner provides this example in his discussion of integral dimensions. Dimensions are integral if when presented at a common spatial location they are perceived together as a new dimension, and are not decomposable into the original components. An example of two integral dimensions would be the value and chroma of the color of a single chip. Separable dimensions, on the other hand, are combined into an object level percept, but remain clearly separate; the object can easily be broken back down into its component parts. The parallel here is that integral stimuli produce a new dimension at the same level as their component dimensions, just as emergent features are themselves features at the same level as those which compose them. Separable dimensions, on the other hand, combine in such a way as to compose an object at the next level up.
Note that there is one major difference between emergent features and integral dimensions: usually in the perception of emergent features, as in the perception of separable dimensions, the original features are themselves still coded (e.g. Pomerantz et al's "()" and "))"), while when two dimensions combine integrally, the component dimensions may not be perceptually coded at all (e.g. Garner's value and chroma of a color chip). The parallel that does exist, however, is that in both cases the presence of two features or two dimensions at a single location results in the emergence of a new feature or dimension, which is itself coded at the same level as the features/dimensions which compose it. In both cases this combination is automatic, and not subject to conscious control.

Garner (Garner, 1974; Gottwald and Garner, 1972; Gottwald and Garner, 1975) described a series of experimental paradigms which explore the difference between stimuli composed of separable and integral dimensions. For example, he showed that integral stimuli are sorted according to a Euclidean metric, while separable stimuli are classified according to a city block one. Similarly, sorting by one of the dimensions present while ignoring the other is possible for separable stimuli, since the object created can be broken down into its component parts, but not for integral ones, where the dimensions cannot be separated. Most combinations of features, for example the shape and the color of an object, are separable. We can easily perceptually separate the component features.

This distinction between integral and separable dimensions might not be so simple, however. Using these experimental paradigms, Garner has tested various combinations of dimensions, and has shown that sometimes we process what are normally separable dimensions as we do integral ones.
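The contrast drawn above between a Euclidean and a city block metric can be made concrete with a small numerical sketch. The code below uses only the standard textbook definitions of the two distances; the stimulus coordinates are hypothetical values chosen for illustration, and nothing here is meant to reproduce Garner's actual sorting procedures.

```python
import math


def euclidean(a, b):
    """Straight-line distance: the dimensions are combined into one overall difference."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a, b):
    """City block distance: the differences on each dimension are simply added."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical stimuli described by two dimensions (values are assumptions).
reference = (0.0, 0.0)
comparison = (3.0, 4.0)  # differs by 3 units on one dimension, 4 on the other

print(euclidean(reference, comparison))   # 5.0
print(city_block(reference, comparison))  # 7.0
```

Under the city block rule the two dimensional differences remain separately identifiable in the total (3 + 4), which fits the idea of separable dimensions; under the Euclidean rule they are merged into a single overall difference, which fits the idea of integral dimensions.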
Garner found that whether or not dimensions are processed integrally depends both on the physical nature of the features involved, and on the task we need to perform. In essence, we can treat some features or dimensions as integral if it is in our best interest to do so.

The existence of both emergent features and integral dimensions demonstrates that sometimes combining features or dimensions results in the production of a new feature or dimension that contributes in the same way as did the original features to the perception of an object.

The other two types of research that have been done on how dimensions are combined in the perception of objects have both concentrated on stimuli composed of separable dimensions. The first of these types has been concerned with describing how we perceive combinations of features that are highly familiar, such as letters or words. Studies done within this context have generally shown that perception of highly familiar combinations has become predominantly automatic, and detached from conscious control.

A first source of evidence for the proposition that the overlearning of some combinations of features occurs to the point where we automatically perceive the whole, even if we desire to attend to the component parts, is the basic Stroop effect. Typically, in Stroop experiments, interference results when subjects try to name the ink color of a word which represents a color (e.g. "blue"), but which is written in another color ink (e.g. written in red ink). Notice that it is not the perception of the color/(word meaning) combination that is automatic, but the perception of the meaning of the displayed word, derived from the combination of its letters. It would benefit a subject to be able to ignore the word meaning in reporting its color, but perception of the word's meaning is apparently not subject to strategic control.
Another example comes from proofreading texts. We are all familiar with how easy it is to miss mistakes at the letter level, because we cannot avoid attending to the words. We perceive the word as a whole because we have overlearned the combination of its component letters.

LaBerge (1973) offered evidence that the perception of highly familiar stimuli is indeed automatic. He suggested that this automaticity develops when we overlearn a combination of features through repeated exposure to the same stimulus. As a result, he set out to demonstrate that familiar letters are perceived automatically, while unfamiliar ones are not. He investigated this by manipulating whether subjects' expectations about the identity of a letter were met or not. He reasoned that if the perception of a letter were automatic, expectations which were not fulfilled would not affect processing. Therefore the perception of familiar letters should be the same regardless of a subject's expectations. Expectations about unfamiliar letters, however, should help their perception, since that perception is not automatized. Thus, if a subject had expectations which were then unfulfilled, the perception of an unfamiliar letter should suffer. LaBerge found that the perception of an unfamiliar letter was just as rapid as that of a familiar letter if subjects' expectations were realized. On the other hand, the perception of an unfamiliar letter suffered, while that of a familiar letter did not, when expectations were not fulfilled. LaBerge concluded that familiar letters are perceived automatically, while unfamiliar letters are not.

The second way, then, in which two or more features appearing at a common location may be perceived is that they may combine automatically, because their co-occurrence is overlearned.
Finally, the third way in which dimensions may combine is probably the most common: that is, the dimensions are perceived separately, and must be integrated in the construction of a perception. How is it that subjects correctly integrate separable features which are not combined automatically? Treisman, in her feature integration theory, has supplied a tentative response to this question.

The stimuli used by Treisman (Treisman and Schmidt, 1982; Treisman and Gelade, 1980) in her investigations of the way in which features are perceived are most like those used by Gottwald and Garner (1972). That is, they are composed of clearly separable features, such as color and shape, whose conjunction is not particularly familiar. Treisman actually demonstrated in one experiment that it is difficult to teach subjects to combine certain features, particularly color and form, so as to create an automatic perception of those conjunctions. She showed that even frequently presented combinations of these two features will not pop out of a display, as they might if their perception were automatic. Contrast this result, on the other hand, with LaBerge's (1973) finding that over time subjects can become more familiar with what start as unfamiliar letters, and thus become more likely to perceive their combinations of features automatically. Thus as subjects became more familiar with those letters, the failure of the combination of features presented to meet their expectations had less of an effect on their subsequent perception. The role of a temporary perceptual set decreased as the combination of features became automatic.

It is not clear whether Treisman and Gelade (1980) failed to teach their subjects to automatically conjoin features because they didn't provide them with enough exposure, or because of the nature of the dimensions being combined.
LaBerge did not provide any more experience with his stimuli than they did, and his subjects apparently were able to learn to automatically combine the features they were shown. It is likely, however, that the test Treisman and Gelade used was more demanding than that of LaBerge, and that his letters would not have popped out of a display either: automaticity does not necessarily imply that the letters are perceived as wholes preattentively. The only conclusion I care to make at this juncture is that the tasks used by LaBerge (1973) and by Treisman and Gelade (1980) were clearly very different. The most important point here is that most of the stimuli used by Treisman in her studies have been composed of features which are clearly separable, and whose combinations are not easily overlearned so as to be perceived automatically.

Feature Integration Theory

Treisman (Treisman and Schmidt, 1982; Treisman, 1982) has put forward a theory which offers a description of the way in which detected, separable features may be combined into the perception of an object. This theory is founded on a number of assumptions. First, Treisman claims that an object is composed of a set of features, some examples of which are a line at a particular orientation, a curve, an angle, a color, or a texture. Second, we detect the presence of these features in parallel, that is simultaneously, but independently. There are feature detectors within our visual system which are responsible for responding maximally to only a certain kind of feature, such as color, and which are not affected by the detectors concurrently responding to the other features present.
Further, Treisman claims that once we have detected a set of features, due to the response of these feature detectors, then in order to complete our construction of a perception, we must somehow recognize that these features all belong to the same object. The last assumption on which her theory is based is that in order to localize an object and to recombine the detected features into a single entity, we need to attend to the object's spatial location. Thus the application of focal attention is crucial to the correct integration of the features of an object.

Treisman (Treisman and Schmidt, 1982; Treisman, Sykes, and Gelade, 1977; Treisman, 1982; Treisman and Gelade, 1980) has conducted a number of experiments which offer support for this theory. First, one implication of the theory is that if attention is diverted, and if subjects are required to perceive more than one object simultaneously, the subjects will be able to detect most of the features present in a display in parallel, but they will not be able to recombine them properly. Thus subjects, in forming the perception of an object, will make "illusory recombinations", or "conjunctions", and believe that they have perceived an object which is in fact composed of a combination of the features taken from two or more of the objects presented. In one experiment, Treisman and Schmidt (1982) showed subjects cards which consisted of three colored letters equally spaced, and flanked by two digits. The subjects' task was to first report the digits, so that their attention would be diverted, and next to report the color and identity of any letter they were sure they saw. They found that subjects often did make illusory conjunctions, and would, for example, report having perceived a blue T, even if in fact they had been presented with a green T and a blue X.
Further, subjects were often convinced that they had indeed seen these types of miscombinations; that they were not simply guessing.

Treisman and Schmidt (1982) have also shown that these illusory conjunctions occur within the context of experimental tasks other than free report. For example, subjects were given a matching task where the probe to be detected either could or could not be constructed out of a combination of the features appearing on distractor items. Subjects were more likely to mistakenly report the presence of a target when the features of the target were present in the display, even if they were located on separate objects. Treisman and Schmidt argued that at least some of the time subjects were suffering from perceptual illusions which led them to make these false reports.

Treisman wanted to argue that illusory conjunctions are a perceptual phenomenon; that we construct a perception built from incorrect features, and that we really believe we have seen that illusory object. She recognized, however, that neither a search nor a free report task provides conclusive evidence that illusory conjunctions are faulty perceptions. It could be that subjects were making an intelligent guess about the identity of an object on the basis of the limited information they had acquired about the features present in a display, and it just looked as if they were making conjunction errors. For example, in a free report task, a subject might know that both blue and T were present, and guess that they were together on the basis of this limited knowledge. Even if this were the case, it is still novel to propose that a subject might know of the presence of features, without knowing where they were located or from which object they originated (Treisman and Gelade, 1980).
Yet the theory would be more interesting if it were true that the mistaken feature conjunctions subjects produced were indeed constructed perceptions, rather than just guesses.

Treisman has supported her theory with converging evidence, derived from studies which she conducted earlier (Treisman, Sykes, and Gelade, 1977), for the perceptual reality of illusory conjunctions. For example, she reasoned that individual features can be detected in parallel, but because we need to focus our attention on an object to combine the features correctly, a set of objects which are defined by conjunctions of features cannot be (Treisman et al, 1977). In order to perceive a set of objects correctly, it is necessary to attend serially to each object in turn. Therefore in a search task, if a subject were asked to detect a target which differed from a set of distractors by a single feature, the target should pop out of the display, regardless of the number of distractor items present (Treisman et al, 1977). If the target were identifiable only by a conjunction of features, on the other hand (e.g. if the target were a green T among a field of blue T's and green O's), then the subject would have to attend serially to the objects in the display in order to locate the object with the correct conjunction of features. Treisman et al conducted a number of studies which confirmed this expectation.

Similarly, Treisman (1982) showed that she could effectively hide a target from her subjects by displaying it at the boundary of two groups of distractors, if the target itself were composed of a conjunction of one feature from each of the two groups. Again, subjects would have to attend to the target object before its particular conjunction of features could be accurately detected.
A target defined by a unique feature, on the other hand, would stand out even at the boundary between two distractor groups. Further, Treisman and Gelade (1980) showed that subjects could preattentively parse a visual display into two groups of objects if the two groups were defined by a difference in a single feature. If, on the other hand, the difference between the two groups was defined by a conjunction of features, subjects could not parse the objects into groups preattentively. The definition of a boundary between the two groups requires focal attention.

Thus Treisman has conducted experiments employing a number of experimental paradigms which all lend credence to her proposed theory of object perception. To review, her theory claims that to perceive an object correctly, we must attend to it. When we do, we localize it, detect the object's features in parallel, and recognize that they all belong to a common location, and to a single object. If we are presented with an entire set of objects, we will perceive them correctly only if we apply our attention serially to each one. If our attention is diverted, we will perceive the features of all of the objects in parallel, and be unsure both as to the location of each object, and as to its particular conjunction of features. Thus we may think we have perceived objects which are in fact mistakenly constructed from the features of the objects that are present.

Recently there have been a number of studies which have contested different aspects of Treisman's feature integration theory. First, Virzi and Egeth (1984) suggest that illusory conjunctions may also occur at a later point in the perceptual process, after object identification, and that conjunction errors may be "propositional" as opposed to perceptual.
Their subjects, when presented with displays including adjectives written in different colored inks, some of which were themselves color words, made errors at a semantic level. Specifically, subjects reported perceiving a color which was in fact presented as a word, or reading a color word which was in fact presented as a color. These results suggest that the inaccurate assignment of features to objects may occur after a semantic interpretation of the features and words has been accomplished. Virzi and Egeth conclude, then, that either conjunction errors occur both at an early and at a later stage in perception, or that conjunction errors are perhaps more a result of memorial than of perceptual mistakes. Although these results are intriguing, the fact that subjects will make conjunction errors at a propositional level does not imply that such errors do not occur at the feature integration stage as well. Further, even if Virzi and Egeth's results are the result of memory failures, perceptual conjunction errors may not be. It is important to recognize, however, as Virzi and Egeth point out, that it is difficult to distinguish between perceptual and memorial effects, and that both could contribute to a final subjective perception.

Egeth, Virzi, and Garbart (1984) have offered an alternative interpretation of Treisman's (Treisman, Sykes, and Gelade, 1977) search task. Treisman et al. demonstrated that the search for a target defined by a single feature was a parallel process, while the detection of a target defined by a conjunction of features required a serial search. Egeth et al. (1984) proposed that subjects do not in fact scan the entire set of distractors, as Treisman claimed, in the search for a conjunction target.
Rather, they claim, subjects first reduce the set of objects to be searched on the basis of a featural discrimination, and then search serially the remaining potential target objects. For example, if the target were a red O among red N's and green O's, subjects might reduce the set of distractors to red items preattentively, and then serially search this reduced set of items. This alternative explanation of Treisman's results is consistent both with her data and with feature integration theory. Subjects should be able to preattentively parse the set of distractor items on the basis of a single feature, so as to weed out nontarget items. However, the subsequent search among the remaining objects must then be serial because, as feature integration theory would predict, attention must be focused on the features of the target object so that they will be conjoined correctly.

Farell (1984) also contested Treisman's claim that the search for a target defined by a single feature is accomplished preattentively and in parallel, while the search for a target defined by a conjunction of features must be serial. In his first experiment, he compared subjects' reaction times to respond to the presence or absence of a target in displays of two objects. In one condition, the "target" display was defined by a set of four features, two colors and two shapes, where all four of the features had to be present, but could appear in any combination (the "FP" condition). In the other condition, the target display was defined by two specific color/shape pairs appearing in defined spatial positions (the "FC" condition). In this latter condition, then, the definition of the target display depended both on specific conjunctions of features and on the spatial location of those conjunctions.
Farell claimed that feature integration theory would have to predict that the identification of the targets would be slower in the FC, as opposed to the FP, condition. He found the opposite: subjects were quicker to detect the presence of the target displays in the FC condition. He concluded that in certain cases the detection of features is actually slower than the correct identification of feature conjunctions appearing at specific spatial locations.

There is one problem with Farell's interpretation of his results, however. Farell concluded that the integration of features, or a holistic perception, may take place before feature detection has occurred. His conclusions were based on reaction time data, though, from which he could only conclude that the final response to stimuli was slower in the FP condition. It is difficult to determine what exactly is responsible for longer reaction times: for instance, it may be that the results of perceptual processing are more accessible to responding after feature integration has occurred, even if the features of objects are detected first.

Finally, Mozer (1983) was interested in studying a particular class of letter migration errors: those where subjects mix up the letters of presented words so as to perceive a non-presented word or phrase. For example, subjects might read the display "barn door" as "darn bore". Mozer conducted a series of experiments to determine whether feature integration theory could account for this type of error. First, Mozer hypothesized that if subjects make conjunction errors with letters of words, then they should perceive words made up of a combination of the letters of the words actually presented. He found that subjects did make these types of errors (e.g. they would report "lane" when presented with "line" and "lace").
Mozer also proposed that if these errors were truly conjunction errors, then the number of errors made should be unaffected by the similarity of the words presented. He found, however, that errors were much more common when the presented words shared two letters (e.g. cape and cone) than when they did not (e.g. cape and monk).

Mozer drew a couple of conclusions on the basis of these results. First, he concluded that letters presented within words must not act as "features" which can be recombined so as to perceive a new word. This conclusion would be interesting, if true, especially in light of previous studies which have shown that subjects do make conjunction errors with letters presented singly (Treisman and Schmidt, 1982). However, this rather strong conclusion is not supported by the data presented. Mozer looks at a very restricted set of errors in his results: those where a word is reported. He fails to examine closely any errors where letter migrations may have occurred, but where the resulting report was not a word. It is interesting, too, that the percentage of errors which fell into his "other" category was so large. The question is whether many of these other errors were actually conjunction errors which were not considered in Mozer's data. Further, Mozer corrected his estimation of conjunction errors for chance guessing. Yet again, his correction factor was based on a restricted set of errors.
In order to assess whether letters, in general, migrate freely between words, it would be necessary to consider all letter migration errors, whether a word was produced or not, and to choose a correction factor to conceptually match these errors.

Mozer's second conclusion was that feature integration theory does not describe the process behind letter migration errors. This conclusion is probably true, because Mozer's study is not concerned so much with the correct assignment of individual features to words as it is with the mistaken perception of similar words. These types of errors are probably best explained by an activation theory (such as that proposed by McClelland and Rumelhart (1981)), which characterizes word recognition as a competitive process between words which share the features being extracted from a display. The errors examined by Mozer, then, did not address the question of whether, in general, letters migrate between words. Rather, the subset of errors which were examined best provides answers about the word recognition process. The way in which feature detection/integration and word recognition interact has not been clearly defined.

Object Files

Kahneman and Treisman (1984) have recently reexamined the role of attention in perception. After reviewing the history of experimentation on selective attention, they conclude that most of that work can be understood within a single framework.
Essentially, they propose that there are two stages in perception: the first is a preattentive parsing of the visual field into "objects", and the second involves focusing attention on the chosen object and processing it in more detail. In perceiving a scene, then, after a preattentive parsing we pay attention to specific locations one at a time, and create "object files", in which we store the information we accumulate about the object at that location.

It is important to emphasize why object files might exist: if features are detected in parallel, we somehow need to group together the features that derive from the same object. Attention serves to limit the area from which we draw those features, and therefore reduces the set of features relevant to an object. One implication of this view is that attention can be applied to objects one at a time, serially, but that within the limited focus of our attention features will still be detected in parallel. A possible second implication, not explicitly claimed by Kahneman and Treisman, is that if our attention is diverted, we will detect features from all of the objects present in parallel, and our object files will not be clearly defined.

Kahneman and Treisman have cited numerous studies which confirm predictions generated by their point of view. To give one example, Kahneman and Henik (1981) presented subjects with a display which consisted of a square and a circle, within each of which was written a word. The subject was then faced with a variation of the well known Stroop task: the two words within the display could be "red" or "green", and each was written in either red or green ink. The subject's task was to read one of the words, located within one of the two shapes.
It is usually claimed that in a Stroop task subjects automatically and inadvertently encode the color of the word presented while trying to identify its meaning, and interference results. Kahneman and Henik wanted to show that this interference only occurred when the two inconsistent features appeared within the same object, that is, within the square or the circle. They found that a subject, when asked to read the word appearing inside one object, could ignore the features within the irrelevant object, but could not disregard an interfering feature which appeared in the same object. They concluded that attention was allocated to the location of the relevant object, and that the features appearing there were processed automatically and in parallel.

Among the other experiments that Kahneman and Treisman discuss in their article are the illusory conjunction experiments which were described earlier. It should be clear that there is a close relation between feature integration theory and the assumptions behind object files. Attention functions to glue features together in that it localizes them, and helps to direct them all to the same object file. Thus experiments which demonstrate that diverted attention results in a mixed up assignment of features to object files, that is, in illusory conjunctions, lend further support to Kahneman and Treisman's ideas.

Kahneman and Treisman (1984) propose, then, that after a display is preattentively parsed, attention is allocated to a spatial location, and features appearing at that location are detected in parallel. Prinzmetal (1981) has also done a series of experiments which indirectly support this view. He showed subjects displays which consisted of objects which formed two groups (defined by Gestalt principles).
His goal was to examine whether subjects made more conjunction errors with the features of objects appearing in a single group, or whether they would be equally likely to draw features from both groups in the production of an illusory conjunction. He found that subjects were more likely to use the features of objects within one group when making these errors. It is possible to conclude, then, that subjects directed their attention to one group, and that within this restricted space, features were detected in parallel, and were thus more susceptible to being miscombined.

Finally, Prinzmetal and Millis-Wright (1984) conducted a series of studies which demonstrate that the allocation of attention depends on more than just the spatial segregation of objects or the definition of groups by Gestalt principles. These researchers showed that attention can be focused on different levels in a display, depending on the characteristics of units at the different levels. Subjects in their experiments were asked to report the color of one of the letters within a letter sequence, where the sequence could form a word, a pronounceable non-word, a non-pronounceable non-word, or an abbreviation. These researchers proposed that subjects would probably direct their attention to the individual letters which made up an unfamiliar non-word, while they were more likely to spread their attention across all of the letters in a familiar word, in a pronounceable letter string (likely to be more familiar), or in an abbreviation. They hypothesized that subjects would be more likely to make letter/color conjunction errors in sequences across which their attention was spread than in sequences where attention was focused on each letter individually.
They found that subjects did make more conjunction errors when presented with words, pronounceable non-words, or abbreviations than they did when presented with non-pronounceable non-words. It seems, then, that whether attention is allocated to a low level unit in a scene (a letter) or across a sequence of lower level units (a word or non-word) depends not only on spatial location, but also on the familiarity and meaningfulness of the sequence as a whole.

CONCEPTUAL KNOWLEDGE AND FEATURE INTEGRATION

To summarize, three different ways in which features can combine into a whole have been presented above. In the first, the result of the presence of a combination of features (or of dimensions) is the emergence of a feature at that same level. This emergent feature influences perception just as the other features at the same level do. Further, its perception is automatic. Thus this type of combination sheds little light on how conceptual information influences the feature combination process.

Second, some features are combined automatically, because their co-occurrence is extremely familiar. The perception of these feature combinations is immediate and involuntary. Examples are the perception of familiar letters or words. In some cases, then, familiarity allows us to recognize an object, perhaps simply on the basis of a configural evaluation, without even resorting to an analysis of component parts. We can at least claim that for the perception of these highly learned stimuli we need very little information about the component parts of the stimulus before we have identified the whole unit. It is this sort of familiarity that leads to effects such as Reicher's (1969) word superiority effect.
In his study it was the familiarity of the combinations of letters that led subjects to perceive the letters within them better than they did letters alone. The same advantage was clearly not provided by unfamiliar four-letter quadrigrams.

Finally, many of the stimuli we perceive during our daily existence are not highly overlearned. Yet expectations may influence our perception of those objects as well, even though these objects consist of clearly separable dimensions, such as color and form. In a number of papers Treisman has claimed that the reason people do not walk through the environment misperceiving the objects around them is that they have expectations about what objects in the world will look like. For example, if someone were standing across the street in a blue shirt next to a green car, we might mix up the features of the two objects, if we weren't paying attention, and believe that we had seen a person in a green shirt. If, on the other hand, we were walking down the street and glanced over at a green tree against the blue sky, we would be quite unlikely to mistakenly perceive a blue tree. Treisman has claimed that this is because we have expectations which prevent us from making such mistakes.

There is a small amount of research which suggests that conceptual knowledge provided by expectations may indeed affect the perception of objects composed of separable features. First, consider once again LaBerge's (1973) paper.
His subjects adopted a temporary set to perceive a certain conjunction of features when they were led to expect an unfamiliar letter. The results from this study suggest that subjects could set themselves to expect a certain conjunction of features, which then aided in their perception of the unfamiliar letters.

A second example is provided by Gottwald and Garner (1972). In a "condensation" task, subjects are required to sort stimuli which vary on two dimensions into groups. Usually subjects are poorer at this task than they are at tasks which require sorting by redundant or single features. This is not surprising given Treisman's findings, since subjects in a condensation task are required to sort by a conjunction of features. Gottwald and Garner compared an asymmetric condensation task to these other types of sorting task. In an asymmetric condensation task, out of the four possible stimuli created by orthogonally varying two dimensions, one is defined to be a member of a first group, while the remaining three are all assigned to a second group. Gottwald and Garner suggested that for this task, subjects could set themselves to expect one conjunction of features, that which defined the group with only one member, and then respond to all stimuli by matching them against this one expected conjunction. They found that subjects did find an asymmetric sorting task much easier than the typical sorting tasks. In this task, then, subjects seemed to be able to avoid the more difficult strategy of detecting and correctly conjoining the features of each presented stimulus, perhaps by setting themselves to expect just one feature combination.

The results of these two studies suggest an alternative explanation of the Farell (1984) studies described earlier.
In his first experiment, Farell found that subjects were quicker to identify displays which were composed of two objects in two fixed positions. These subjects may have set themselves to perceive this combination of features, and so responded more quickly when their expectations were fulfilled. Consistent with this interpretation, reaction times to displays other than the target display were slower in this condition than they were in the "FP" condition, where the target display was defined by a set of four features. In other words, processing was slower when expectations were not fulfilled. Finally, it would follow from this interpretation of Farell's work that if a display were defined by two objects which could appear in either of two positions, subjects would not be able to set themselves to perceive a consistently defined combination of features, and so would respond more slowly. In his Experiment 3 Farell found this to be the case: subjects responded more quickly in the FC condition than they did in response to displays where the spatial location of the two expected objects was not fixed.

It appears, then, that the experiments of LaBerge (1973) and of Gottwald and Garner (1972) describe what the effect of expectations is likely to be on the perception of objects composed of separable features, whose perception is not automatized. To review, these experimenters have shown that expectations about the conjunctions of features which will appear enable us to perceive unfamiliar objects as well as we perceive familiar, automatically perceived objects (LaBerge, 1973), and further enable us to confirm more rapidly the identity of stimuli composed of an expected conjunction of features (Gottwald and Garner, 1972). Do these expectations, then, free us from the need to attend focally to object locations in order to perceive the conjunctions of features properly?
Based on this work it is possible to hypothesize that this would be the case: it may be that expectations reduce the importance of focal attention in the feature detection and integration process. The results of LaBerge (1973) and of Gottwald and Garner (1972) suggest that subjects can voluntarily set themselves to expect a given combination of features. Then, given that subjects set themselves to expect to detect these features in a particular combination, it is possible that they may need very little visual information, maybe only enough to confirm their expectation that the features are indeed present, before they can construct the perception of the object. Expectations, in other words, might lead subjects to integrate expected features rapidly, because they have essentially combined them in advance. As a result, it may be that in the presence of expectations, focal attention would not be as important in the correct construction of object files.

In more general terms, given our current knowledge both of the ways in which conceptual information affects object perception, and of the ways in which features and dimensions are combined into the perception of an object, how might we characterize object perception? First, if a frame is instantiated, whether as a result of an initial semantic interpretation of a scene, because of a provided verbal label, or because of expectations, this frame may contribute to the preattentive parsing of a visual scene into spatial locations at which separate objects appear. Conceptual information, then, might affect our allocation of attention to different segments of a scene. Further, frames might then specify the features of an object which can be expected to appear at a given spatial location, as well as a description of how those features are likely to be organized.
Notice that this is exactly the information which affected subjects' perceptions in LaBerge's (1973) and Gottwald and Garner's (1972) experiments. This information might aid in the assignment of features to object files, and in their subsequent integration. It may be, then, that conceptual information will interact with feature perception and integration in order to increase the accuracy of our perceptions, so that we avoid making perceptual errors, such as illusory conjunctions.

THE EXPERIMENTS

The experiments reported below examined the way in which subjects integrate the features of an object in the presence of expectations. The first hypothesis tested was that the presence of expectations leads subjects to perceive and conjoin the features of objects more accurately. In the first experiment, all subjects participated in two conditions: one where they were led to expect certain "constrained" color/shape conjunctions, and one where the feature combinations were "unconstrained", and where all colors and shapes were paired with equal probability. In both conditions subjects' attention was spread across a display, so as to investigate the perception of objects in the absence of focal attention. It was predicted that subjects would make more illusory conjunctions in the unconstrained condition, where perception would probably proceed more bottom up, in the absence of any conceptual information. There, because focal attention was diverted, subjects should have been able to detect features, but have been unable to localize them or to correctly assign them to object files. In contrast, in the constrained condition, subjects' expectations should have aided them in the construction of object files, so that even if they simply detected the presence of the expected features, they should have been more likely to conjoin them correctly into a perceived object.
Secondly, subjects in both of the experiments below were presented with objects which were unexpected, or "abnormal". These were objects in the constrained condition where the color/shape combination was not "correct" (e.g. a blue as opposed to a black tire). In the first experiment these objects were included primarily to reduce guessing on the part of the subjects, so that they would know some abnormal objects could appear. In the second experiment the types of errors subjects made to these objects were more closely evaluated. It was predicted that in both experiments subjects' responses to these objects would be less accurate than their responses to "normal" objects in the constrained condition.

Finally, it was suggested above that subjects might assign detected features to object files on the basis of their expectations. If it is the case that expectations are influential in the construction of object files, then subjects should have a tendency to construct normal objects out of features which appear in the display, even if those features appear on abnormal objects. In other words, expectations might replace focal attention as the "glue" which holds together features from the same object in the feature integration process. This hypothesis was tested in Experiment 2.

Experiment 1

Experiment 1 was designed to test two of the predictions presented above: first, whether the presence of expectations would reduce the likelihood of illusory conjunction errors in the absence of focal attention, and second, whether the perception of unexpected objects would be particularly inaccurate. In order to examine these issues, a variation of the experimental materials used by Treisman and Schmidt (1982) in their illusory conjunction experiments was employed. That is, cards were constructed on which three objects appeared, flanked by two digits.
The role of the digits was to spread the subjects' attention across the entire display and to prevent subjects from focusing on any one of the objects. The subjects' task was to report the two digits, and then to report the color and the shape of one post-cued object, as in Treisman and Schmidt's (1982) Experiments IV and V.

Two complete decks of cards were constructed for this experiment, where the sets of shapes and colors which potentially defined the objects appearing on the cards were identical for both decks. The objects on the cards of the first deck were formed by pairing all shapes with all colors equally often, so that subjects who saw these cards would have no expectations about what the shape/color combinations would be. This condition was very similar to those in the original studies done by Treisman and Schmidt (1982). The objects on the cards in the second deck, however, induced subjects to have expectations about the conjunction of features they would perceive. This was because the shape/color combinations which formed these objects were highly associated, in that they could potentially represent a real world object. To illustrate, one of the shapes was a triangle, and one of the colors was orange. The pairing of these two could represent a carrot, and within this context, the two were highly associated. The set of objects used to construct these cards is depicted in Figure 1.

A number of different methods could have been employed in order to induce subjects' expectations: for example, subjects could have simply been told to expect a conjunction of features (as they were in Group 2, Experiment 2), or they could have been taught an association between those features.
We chose to induce expectations with familiar object labels, however, for a couple of reasons. First, practically speaking, it was important to induce strong enough expectations so that subjects would be able to use them. Therefore the objects chosen were familiar enough to ensure that subjects would have rather well ingrained feature combination expectations, but not so familiar as to be perceived automatically. These stimuli were preferable to letters, for example, for that reason. Further, it was hoped that these labels would instantiate object frames, since we are likely to have carrot, tree, and tire frames at our disposal.

Figure 1. Experimental stimuli.
(a) Possible colors and shapes, unconstrained condition. Shapes: arrow, triangle, ellipse, equals, ring. Colors: blue, orange, green, brown, black.
(b) Possible colors and shapes, constrained condition. Tree (green), carrot (orange), lake (blue), logs (brown), tire (black).

Within the constrained condition, some of the objects on the cards did not conform to subjects' expectations. So for example, the subjects were occasionally shown objects that consisted of an unexpected conjunction of features (e.g. a black carrot, a green tire, or an orange lake). This manipulation was effected primarily to reduce the likelihood that subjects would just guess on the basis of their expectations.

Method

Subjects

The subjects were 16 students at the University of British Columbia, 8 male and 8 female, whose ages ranged from 18 to 24 years. Subjects were paid for their participation and were run in 2 one hour sessions, occurring approximately 1 week apart.

Stimuli

The stimuli consisted of 2 sets of 120 stimulus cards, plus 2 sets of 120 cue cards. On each stimulus card 3 objects appeared, one object in the center of the card, and one on either side of that object, evenly spaced. Two digits flanked these 3 objects (see Figure 2a).
The visual angles subtended by each of the five possible objects were, horizontally and vertically respectively: arrow, .86° by .72°; ellipse, 1.43° by .65°; ring, 1° by 1°; triangle, .65° by 1.22°; equals, 1° by .72°. These objects were matched subjectively in terms of apparent size. Each digit subtended a visual angle of .36° by .72°. The entire display subtended 6.8° horizontally and 2.0° vertically.

The cue cards consisted of a rectangular checkerboard pattern centered in the card, designed to mask where the objects had been, and a single line, the top of which was located 1.22° above the mask. This line pointed to one of the three locations in which objects could appear. Note that the subjects viewed the mask and this cue simultaneously (see Figure 2b). The mask subtended a visual angle of 7.15° horizontally and 1.86° vertically, while the cue alone subtended .36° vertically and .5° horizontally.

[Figure 2. Experimental cards: Experiment 1. (a) Stimulus cards. (b) Cue cards.]

The objects on the cards were drawn from the pool of 5 possible shapes and colors which are presented in Figure 1. The luminances of the colors ranged from 9.7 × 10⁻² for the black to 4.15 × 10⁻¹ for the blue. These shapes and colors were identical for the two sets of cards. However, in the unconstrained condition, the shapes were given names that would not lead to expectations concerning their color, while in the constrained condition, the names represented objects with highly associated shape/color combinations. So for example, the same shape was called an "arrow" in the unconstrained condition, and could be any color; it was called a "tree" in the constrained condition, and could be expected to be green. In both conditions a given color or shape never appeared more than once on the same card.
In the unconstrained condition each shape was paired with each color an equal number of times. Furthermore, each object and each color appeared equally often in the 3 possible positions on the card, as did each object/color combination. Each object was seen equally often with all other objects, just as each color was seen equally often with all other colors. A set of 120 cards was drawn to fit these specifications.

In the constrained condition, 75% of the objects drawn were "normal", that is, the color and shape were paired as expected (e.g. an orange carrot). In the other 25% of the cases, the shapes were combined with the other 4 possible colors an equal number of times. On any given card a maximum of one abnormal object appeared. Thus out of 120 cards, 90 had an abnormal object on them, but only 1/4 of the objects shown were abnormal. The cards were counterbalanced in the same way as those in the unconstrained condition given these requirements, resulting in a second set of 120 cards.

In both conditions, on each trial one of the 3 objects on the card was cued to be reported by the subject. Each position was cued an equal number of times, as were each object and each color. In the unconstrained condition, each color/object pair was also cued an equal number of times. In the constrained condition, all objects were cued equally often in the expected combinations, and when abnormal, each color/object combination was cued as nearly equally often as possible. Also, in the constrained condition, 75% of the cued objects were normal, while 25% were abnormal. Thus, since abnormal objects appeared on 75% of the cards, some normal objects were cued on cards which contained abnormal objects. In sum, 120 cue cards were made for each condition.

Subjects were tested individually using a Gerbrands 4-field tachistoscope. Subjects viewed both cues and stimulus cards from a distance of 80 cm.
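The card counts given above for the constrained deck follow from simple arithmetic, which can be checked with a short Python sketch. This is an illustration only, not part of the original method: the function name and dictionary keys are hypothetical, and the sketch assumes the figures stated in the text (120 cards of 3 objects each, 25% abnormal objects, at most one abnormal object per card, one cue per card, and 25% abnormal cues).

```python
def constrained_deck_counts(cards=120, objects_per_card=3, abnormal_pct=25):
    """Derive constrained-deck counts from the design figures stated in the text."""
    total_objects = cards * objects_per_card
    abnormal_objects = total_objects * abnormal_pct // 100
    # At most one abnormal object per card, so every abnormal object
    # sits on a card of its own.
    cards_with_abnormal = abnormal_objects
    # One object is cued per card; 25% of the cues fall on abnormal objects.
    abnormal_cues = cards * abnormal_pct // 100
    normal_cues = cards - abnormal_cues
    # Cards that contain an abnormal object but had a normal object cued.
    normal_cues_on_abnormal_cards = cards_with_abnormal - abnormal_cues
    # Cards that contain only normal objects.
    normal_cues_on_all_normal_cards = cards - cards_with_abnormal
    return {
        "total_objects": total_objects,
        "abnormal_objects": abnormal_objects,
        "cards_with_abnormal": cards_with_abnormal,
        "abnormal_cues": abnormal_cues,
        "normal_cues": normal_cues,
        "normal_cues_on_abnormal_cards": normal_cues_on_abnormal_cards,
        "normal_cues_on_all_normal_cards": normal_cues_on_all_normal_cards,
    }


counts = constrained_deck_counts()
```

Under these assumptions, 90 of the 120 cards (75%) carry an abnormal object, and the 90 normal cues divide into 60 on cards that also contain an abnormal object and 30 on cards containing only normal objects.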
Measures

The frequencies of two types of error, conjunction errors and feature errors, were observed. Subject responses were scored as conjunction errors if a subject reported only features which appeared on the card, but in an incorrect combination. For example, if the subject were presented with an orange carrot and a black tire, and reported an orange tire, that error would have been scored as a conjunction error. A subject would have made a feature error if either one of the two reported features was not on the card.

The cards were constructed so that the probabilities of conjunction and feature errors were equal, given that someone was guessing. Subjects saw three objects, one of which was post-cued to be reported. Thus of the 4 other shapes and 4 other colors in the experiment (other than the ones cued), 2 of each appeared on the card, while 2 of each did not. Thus if a subject randomly chose a shape or a color with which to make a mistake (other than the correct one), he or she would have been equally likely to have made a conjunction or a feature error. Furthermore, if the subjects' mistakes were due to a consistent confusion between any 2 features, this systematic confusion should have had no effect on the relative probabilities of conjunction and feature errors. This is true because all features appeared equally often with all other features on the cards.

Confidence ratings were also collected after each trial. The subjects stated that they were either sure, uncertain, or simply guessing.

Procedure

In both conditions subjects followed the same procedure. The sequence of stimuli seen by each subject on a trial was as follows: first subjects were shown a card on which appeared 2 fixation dots, which were placed at locations on a blank card which corresponded to the centers of the two outer objects on the stimulus cards.
This card encouraged subjects to spread their attention across the entire display. Next, a "beep" signal warned the subject that the trial was about to begin, after which the stimulus card was presented, followed by the cue/mask. Finally the fixation dots once more appeared.

The cue was presented consistently for 500 msec on each trial, but the stimulus duration was variable. The exposure varied according to a 7-1 staircase: for every 7 consecutive correct responses the exposure was decreased, and for every incorrect response it was increased, always by 20 msec. Only feature errors were considered to be "incorrect". This was done in order to approximate a 10% feature error rate. Subjects were started at 300 msec for practice trials, and at 250 msec for the experimental trials. A 300 msec upper limit and a 100 msec lower limit were imposed, but otherwise exposure varied with the staircase.

When subjects appeared for the experiment, the sequence of displays was described to them. They were then shown the set of possible shapes and colors that would appear on the cards, outside of the tachistoscope. In each condition the names of the objects were explicitly presented. In the constrained condition, subjects were told that most of the time the objects would appear in the correct color, but that every once in a while, to control for guessing, an object might be abnormal. The high proportion of normal objects was stressed. In the unconstrained condition, subjects were told that each shape could be any of the five possible colors, and that they could expect to see each combination equally often.

After this description subjects were each shown a series of cards on which single objects appeared, this time through the tachistoscope, in order to allow them to practice the naming of the shapes and colors. All colors and shapes were shown twice.
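The exposure staircase described earlier in this section can be sketched in Python as follows. This is a minimal illustration rather than the procedure as actually run: the function and parameter names are hypothetical, and the sketch assumes that the count of consecutive correct responses restarts after each decrease, which the description does not specify. As in the text, only feature errors count as incorrect, so a conjunction error is treated like a correct response for the purposes of the staircase.

```python
def next_exposure(exposure_ms, streak, feature_error,
                  step_ms=20, floor_ms=100, ceiling_ms=300, run_length=7):
    """Return (new_exposure_ms, new_streak) after one trial of the staircase.

    exposure_ms   -- exposure duration used on the trial just completed
    streak        -- consecutive trials without a feature error so far
    feature_error -- True if the response was scored as a feature error
    """
    if feature_error:
        # Every feature error lengthens the exposure, up to the upper limit.
        return min(exposure_ms + step_ms, ceiling_ms), 0
    streak += 1
    if streak == run_length:
        # Seven consecutive "correct" trials shorten the exposure,
        # down to the lower limit (assumed: the streak then restarts).
        return max(exposure_ms - step_ms, floor_ms), 0
    return exposure_ms, streak


# Experimental trials started at 250 msec.
exposure, streak = 250, 0
for _ in range(7):
    exposure, streak = next_exposure(exposure, streak, feature_error=False)
```

After the seven error-free trials in the usage example the exposure has dropped one step, from 250 to 230 msec; a single feature error on the next trial would raise it back to 250 msec.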
The experimenter then described the task to each subject. They were told to first report the digits, and second, to report the color and the shape of the object in the post-cued position. Subjects were also asked for a confidence rating (sure, maybe, guessing) on each object reported. Subjects were encouraged to report both a color and a shape on every trial, even if they felt they were guessing.

Subjects then did 20 practice trials. For each condition 10 practice cards had been made with the same format as the experimental cards, and these were shown twice with 2 different sets of cues, in order to create a reasonable practice session. Finally, the subjects participated in the 120 experimental trials. The cards were presented to the subjects in random order. Two different random orders were used, each seen by half of the subjects. The whole procedure lasted from 50 min. to 1 hr. per session.

Each subject was exposed to both conditions with an interval of approximately 1 week between sessions. The order of presentation was counterbalanced across subjects.

Results and Discussion

In this experiment subjects were asked to identify the color and shape of a cued object. On any trial subjects could have made a variety of responses. First, a response was scored as a conjunction error if the subject did not correctly report the cued object, but the features reported did both appear on the card. The object reported, then, consisted of a "conjunction" of the features present in the display. Subject responses were coded as "feature" errors, on the other hand, if either one of the two reported features did not appear on the card. Next, subjects could have reported the two features of the cued object correctly, reported an object in a non-cued position, or failed to respond at all.
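The five response categories just described amount to a simple scoring rule, which can be sketched in Python. The sketch is a hypothetical illustration, not the scoring procedure actually used: the function name, the representation of objects as (color, shape) tuples, and the third object on the example card are all assumptions made for the example.

```python
def score_response(reported, cued, card):
    """Classify one response into the five categories described in the text.

    reported -- (color, shape) tuple given by the subject, or None
    cued     -- (color, shape) tuple of the post-cued object
    card     -- list of the three (color, shape) objects on the card
    """
    if reported is None:
        return "no response"
    if reported == cued:
        return "correct"
    if reported in card:
        # A real object from a non-cued position, correctly reported.
        return "wrong object"
    colors_on_card = {color for color, _ in card}
    shapes_on_card = {shape for _, shape in card}
    color, shape = reported
    if color in colors_on_card and shape in shapes_on_card:
        # Both features were present, but not in this combination.
        return "conjunction error"
    # At least one reported feature was not on the card.
    return "feature error"


# Example from the Measures section; the green tree is an assumed third object.
card = [("orange", "carrot"), ("black", "tire"), ("green", "tree")]
result = score_response(("orange", "tire"), cued=("black", "tire"), card=card)
```

Here `result` is "conjunction error", matching the orange tire example given in the Measures section.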
Subjects' responses to each type of cue (constrained normal, constrained abnormal, unconstrained) were coded in terms of the above 5 response categories. Then the proportion of all of a subject's responses to each type of cued object which fell into each of the response categories was calculated. This was done, for each type of cue, by dividing the number of each type of response by the total number of objects cued. Table II presents the mean proportions of all responses to each type of cue that were scored as conjunction errors, feature errors, correct responses, a wrong object correctly reported, or no response. The data in this table are also broken down by session. For example, data from subjects' first session are presented in the first three columns in the table.

Table II
Mean proportions of response types to unconstrained, constrained normal, and constrained abnormal cued objects, by confidence: Experiment 1

                  Session 1              Session 2              Mean of Sessions
                  All    Top    Sure     All    Top    Sure     All    Top    Sure
                  resp   two    only     resp   two    only     resp   two    only

Unconstrained     Group 2                Group 1                Both Groups
  Conj.           .29    .23    .11      .25    .20    .05      .27    .21    .08
  Feat.           .13    .09    .03      .06    .04    .01      .10    .06    .02
  C-F             .16    .14    .08      .19    .16    .04      .17    .15    .06
  Correct         .50    .47    .34      .60    .57    .39      .55    .52    .37
  Wrg. Obj.       .05    .04    .02      .06    .05    .01      .06    .05    .02
  No Resp.        .03    -      -        .03    -      -        .03    -      -

Constrained       Group 1                Group 2                Both Groups
 Normal Cue
  Conj.           .09    .07    .02      .13    .09    .03      .11    .08    .02
  Feat.           .08    .05    .02      .04    .01    .00      .06    .03    .01
  C-F             .01    .02    .00      .09    .08    .03      .05    .05    .01
  Correct         .69    .67    .57      .75    .71    .47      .72    .69    .52
  Wrg. Obj.       .11    .10    .05      .07    .03    .02      .09    .07    .03
  No Resp.        .03    -      -        .01    -      -        .02    -      -
 Abnormal Cue
  Conj.           .12    .11    .03      .15    .10    .03      .14    .10    .03
  Feat.           .29    .24    .14      .12    .07    .03      .21    .16    .09
  C-F            -.17   -.13   -.11      .03    .03    .00     -.07   -.06   -.06
  Correct         .41    .41    .30      .68    .63    .40      .55    .52    .35
  Wrg. Obj.       .13    .10    .05      .05    .03    .01      .09    .06    .03
  No Resp.        .04    -      -        .01    -      -        .03    -      -
Notice that in the first session, half of the subjects (Group 1) participated in the constrained condition first, while the other half of the subjects (Group 2) initially participated in the unconstrained condition. As a result, this first set of columns presents a between-subjects comparison of the constrained and unconstrained conditions. Similarly, the second set of 3 columns presents the data from subjects' second session. Here, the data from the Group 1 subjects' unconstrained condition and the Group 2 subjects' constrained condition can be reviewed. The last set of 3 columns presents data averaged over Group 1 and 2 subjects. Finally, subjects' responses are also presented by confidence level. The data are presented separately for responses which were given a sure rating only, for responses given one of the top two (out of three) ratings, and for all responses.

The main question addressed by Experiment 1 was whether subjects' expectations about conjunctions of features would affect the perception of objects. It was hypothesized that expectations would influence perception by discouraging illusory conjunctions in the constrained condition. This hypothesis was tested by assessing the proportion of "true" conjunction errors which subjects made out of all of their responses to each type of cue.

Treisman claimed that of the errors scored as conjunction errors, only some represented true illusory recombinations of the features of objects, while others were actually feature misperceptions. Theoretically, feature errors occur when subjects confuse or misperceive features, yet errors were only scored as feature errors if the reported features did not appear on the card. As a result, in some cases subjects were probably making feature errors which were scored as conjunction errors, because the features reported happened to be in the display.
Finally, because subjects who misperceive or confuse features are equally likely to report a feature that is on the card as one that is off it, the number of feature errors mistakenly scored as conjunction errors should be about the same as the number of scored feature errors. Treisman concluded that in order to estimate the number of true illusory recombinations, it is necessary to subtract the number of feature errors made from the number of scored conjunction errors (C-F).

The proportions of true conjunction errors made to each type of cued object were estimated by calculating C-F, and dividing this difference by the total number of times each type of object was cued. True conjunction error proportions were calculated for each subject separately, and the means are presented in Table II.

The question is, do subjects make fewer true conjunction errors in the constrained condition, as opposed to the unconstrained condition? A mixed design ANOVA with the order of presentation (constrained vs unconstrained condition first) as a between-subjects factor and the type of object cued (constrained normal, unconstrained) as a within-subjects factor was performed on the mean proportions of true conjunction errors made to each type of cue. This analysis revealed a significant main effect of cue (F(1,14)=5.106, p<.002), and a significant cue X order interaction (F(1,14)=4.854, p<.045). As predicted, subjects made fewer true conjunction errors when constrained normal objects, as compared to unconstrained objects, were cued. The significant interaction suggests that the difference between the unconstrained and constrained normal cued objects is somewhat larger for Group 1 subjects, who participated in the constrained condition first. This order effect will be discussed in more depth below.
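The C-F estimate described above can be written out as a short Python sketch. It is offered as an illustration under stated assumptions, not as the analysis code for this experiment: the function names are hypothetical and the counts in the usage example are invented for illustration and do not come from the data.

```python
from statistics import mean


def true_conjunction_proportion(n_conjunction, n_feature, n_cued):
    """Estimate one subject's proportion of true conjunction errors (C-F).

    Scored conjunction errors minus scored feature errors, divided by the
    number of times this type of object was cued.
    """
    return (n_conjunction - n_feature) / n_cued


def mean_true_conjunction(subject_counts):
    """Average the per-subject C-F proportions.

    subject_counts -- iterable of (n_conjunction, n_feature, n_cued) tuples,
                      one per subject.
    """
    return mean(true_conjunction_proportion(*counts) for counts in subject_counts)


# Invented counts for two hypothetical subjects, 120 cued objects each.
example = [(30, 12, 120), (24, 18, 120)]
group_mean = mean_true_conjunction(example)
```

The estimate can be negative when a subject makes more feature errors than conjunction errors, as happened for abnormal cued objects in the first session (Table II).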
Treisman and Schmidt (1982) hypothesized that true conjunction errors did exist, because the number of conjunction errors exceeded the number of feature errors by more than a chance level. The above analysis showed that the proportions of true conjunction errors made to constrained normal and unconstrained cued objects differed significantly. However, the analysis did not explicitly test whether or not the number of true conjunction errors was significantly greater than 0 (C-F>0), and thus whether the data were meaningful in that true conjunction errors were being made. Therefore, the data from this experiment were also analyzed so as to determine whether C-F>0 for both normal and unconstrained cued objects. Two 2X2 ANOVAs comparing the proportions of conjunction and feature errors were performed, each with order and the type of error made (conjunction, feature) as factors. Both analyses revealed that the proportion of conjunction errors was significantly greater than the proportion of feature errors, in response both to cued normal objects (F(1,14)=5.877, p<.029), and to cued unconstrained objects (F(1,14)=27.673, p<.001).

Further, Treisman in her original experiments showed that subjects were more confident when making conjunction as opposed to feature errors, which suggested that the conjunction errors made by her subjects may have been perceptual phenomena, as opposed to guesses. A similar analysis was performed on the data in this experiment. The proportion of conjunction and feature errors which were given either one of the top two confidence ratings was calculated (one subject was excluded from this analysis, because of a division by 0). A mixed design ANOVA on these proportions, with order, type of cue, and type of error made as factors, revealed a significant main effect of error (F(1,13)=11.643, p<.005).
That is, the proportion of confident conjunction errors made by subjects (.76) was significantly larger than the proportion of feature errors given either of the top two confidence ratings (.61). As in Treisman's experiments, then, subjects are more confident when making perceptual feature recombinations than they are when making feature errors. No significant differences were found in an analysis that compared the proportions of sure only conjunction and feature errors, perhaps because the proportions were quite small.

It should be noted that the abnormal objects were not considered in the above analyses for a couple of reasons. First, the reason for including the abnormal objects in this study was simply to prohibit subjects in the constrained condition from guessing, and it was not planned that these errors would be carefully scrutinized. Second, the probabilities of conjunction and feature errors were not equal when these abnormal objects were cued, and therefore an analysis of C-F would not be very meaningful. A more careful examination of the cards used in this experiment makes it clear why this is the case.

Given the combinations of objects which appeared on these cards, subjects could not both make a conjunction error and report having seen a normal object. When an abnormal object was cued, the other two objects on the card were always normal. Therefore, in order to report a normal object, subjects would have to pair one of the abnormal object's features with its mate off of the card, which would be scored as a feature error. In sum, the problem is that if subjects are biased because of their expectations to report a normal object, they could only make feature errors to do so.

The data in Table II reveal that when abnormal objects were cued, subjects were biased by their expectations to report normal objects, although this is true primarily for Group 1 rather than Group 2 subjects.
This is reflected in the large number of feature errors made to abnormal cued objects, 73% of which did in fact result in the report of a normal object. Therefore these data suggest that subjects did tend to report normal objects, so that any comparison of conjunction and feature errors on those objects is not meaningful.

The data in this experiment indicate that subjects were less likely to make illusory conjunctions when responding to constrained normal as opposed to unconstrained cued objects. It is also true that subjects tended to make fewer feature errors in response to normal, as compared to unconstrained, objects. It should be noted that although the exposure duration for the stimulus cards varied according to a staircase, with the intent of maintaining approximately a 10% feature error rate in both conditions, the staircase did not have that effect. Instead, the feature error rates were significantly different among the three types of cue. However, because the staircase did result in almost identical average exposure durations within the two conditions (200 msec in the constrained condition, and 195 msec in the unconstrained condition), it is possible to meaningfully compare feature error rates: subjects had approximately the same amount of time to visually process each type of cued object.

First, a 2X2 ANOVA on the proportions of feature errors made was performed where only the errors made to normal and unconstrained cued objects were included. This analysis, which had order and type of cue as factors, revealed that subjects made significantly fewer feature errors in response to normal as opposed to unconstrained cued objects (F(1,14)=11.399, p<.005). There was also a significant order X cue interaction (F(1,14)=20.978, p<.001). The interaction in this case was probably the result of practice effects, and will be discussed further below.
It was also predicted that subjects would be less accurate when responding to abnormal as opposed to normal cued objects. A 2X3 ANOVA with order and type of cue (constrained normal, constrained abnormal, unconstrained) as factors was performed on the proportions of feature errors made. This analysis revealed a highly significant effect of cue (F(1,14)=18.829, p<.001), as well as a significant order X cue interaction (F(1,14)=11.895, p<.001). Subjects made significantly more feature errors to abnormal, as opposed to constrained normal or unconstrained, cued objects. The interaction here reflects the tendency of Group 1 subjects to be less accurate on abnormal objects than Group 2 subjects.

The prediction that overall subjects would be least accurate when responding to abnormal cues was also supported by the data on correct responses. A 2X3 ANOVA on the proportion of correct responses was also performed, again with order and the type of cue as factors. This analysis also revealed a significant main effect of cue (F(2,28)=13.686, p<.001) and a significant order X cue interaction (F(2,28)=6.380, p<.005). Subjects were the most accurate when reporting normal objects, and were the least accurate when reporting abnormal objects. The interaction here also is a result of the order effects: Group 1 subjects tended to be least accurate in response to abnormal cued objects, while Group 2 subjects did not. Again, this order effect will be discussed further below.

Finally, the data in this experiment also suggest that when subjects respond to normal cues, they are more confident about their responses. The proportion of responses given a sure confidence rating was calculated by dividing the number of responses given a sure rating by the number of trials, for each type of cue separately.
A 2X3 ANOVA on these proportions, with order and type of cued object as factors, revealed that a significantly higher proportion of the responses to normal cues (.56), as opposed to responses to either abnormal (.47) or unconstrained (.47) cues, were rated sure (F(2,28)=3.328, p<.05). This increase in confidence could be due to the fact that subjects were most accurate in reporting normal objects, and are simply more confident when they are correct. But in any case, it is the presence of expectations which seems to lead subjects to be more accurate, and therefore, perhaps, more confident. This finding suggests that subjects were not simply guessing on the basis of their expectations, but rather are more confident in the perceptions they have because of them.

In sum, the data presented above suggest that subjects are more accurate in responding to constrained normal as opposed to unconstrained cued objects: they make fewer feature and fewer conjunction errors in the former condition. It seems, then, that expectations enable subjects to both detect and conjoin features more accurately. It is also true, however, that the effect of expectations on the perception of expected objects is much stronger on conjunction as opposed to feature errors: the main effect of expectations seems to be a reduction in the rate of illusory conjunction errors. Finally, the data also reveal that when subjects have expectations which are not fulfilled, perception suffers. Subjects are much less accurate when responding to abnormal as opposed to normal or unconstrained objects.

The question at this point, then, is how do expectations exert these effects? One possibility is that subjects are reluctant to break up normal objects in order to make illusory conjunctions: that somehow normal objects are protected in the feature integration process. There is a small amount of evidence, however, that this is not the case.
There were two different types of cards on which normal objects were cued: on 2/3 (60) of the cards, besides the cued normal object, one normal and one abnormal object also appeared. On the other 1/3 (30) of the cards, only normal objects were presented. If it were the case that normal objects are protected from illusory conjunctions, one might expect that cards where only normal objects appeared would produce fewer conjunction errors. However, there is no evidence that this is the case. The overall proportion of true conjunction errors on cards where abnormal objects appeared (.04) was actually slightly (though not significantly) less than the proportion on cards containing only normal objects (.08).

Second, the errors subjects made on cards where both a normal and an abnormal object appeared along with the cued normal object were evaluated more closely. If normal objects were protected from illusory conjunctions, it should be the case that, when making conjunction errors, subjects would be less likely to draw features from normal, as opposed to abnormal, objects. Therefore, for each subject, the number of individual features which were drawn from normal and abnormal objects, when making either conjunction or feature errors, was tabulated (Table III). (Note: because a response was scored as a feature error if either reported feature was not on the card, in a small number of cases subjects reported two incorrect features, one on and one off the card. This was scored as a feature error. Therefore the type of object from which one of the reported features was drawn could be examined.) Two mixed design ANOVAs on the mean numbers of features drawn from normal and abnormal objects were performed, with order and type of object from which subjects drew features (normal, abnormal) as factors. In the first analysis, only conjunction errors were included.
This analysis revealed no significant differences: subjects were equally likely to draw features from normal and abnormal objects when making conjunction errors. The second analysis, which was performed on feature errors, revealed that subjects take significantly more features from abnormal objects (F(1,14)=18.072, p<.001). There was also a significant order X object interaction (F(1,14)=9.426, p<.008). A closer analysis of these errors suggests that subjects, even when a normal object was cued, detected one of the features of the abnormal object on the card correctly, and paired it with its associated feature, which was not on the card, in order to report a normal object. The interaction indicates that this effect, too, was stronger for Group 1 subjects.

Table III
Normal cued objects: mean number of features drawn from normal or abnormal objects, when making conjunction or feature errors.

                          Normal object    Abnormal object
                          feature          feature
Group 1 (cons first)      2.23             8.00
Group 2 (uncons first)    4.25             5.13
Mean                      3.24             6.56

Throughout the above discussion of the results of this first experiment, significant interactions involving the order of presentation of the constrained and unconstrained conditions have been described. These order effects seem to derive from two sources.
First, part of the difference between the responses of subjects in Groups 1 and 2 appears to be a result of practice. It is clear that subjects in both groups made fewer feature errors in the second session in which they participated (see Table II). So for example, the significant order X type of cue interaction on feature errors was probably a result of the reduction in feature error rates between sessions 1 and 2: Group 1 subjects made slightly more feature errors in the constrained condition, in which they participated first, than in the unconstrained condition, while subjects in Group 2 made far more feature errors in the unconstrained rather than the constrained condition.

There does seem to be another systematic difference between the subjects of Groups 1 and 2, resulting from the order of presentation of the two conditions. Subjects who participated in the constrained condition first seem to have been more influenced by their expectations than were the subjects who participated in that condition after exposure to the unconstrained condition. Subjects in Group 1 were much more inaccurate in reporting abnormal objects than were subjects in Group 2. First, they tended to make more feature errors and fewer correct responses than did Group 2 subjects. Further, Group 1 subjects had more of a tendency to report objects which conformed to their expectations. Finally, it is apparent from Table II that subjects in Group 2 made a larger proportion of true conjunction errors (.09) to normal cues than did subjects in Group 1 (.01).
So although subjects in both groups made fewer true conjunction errors to normal cues as compared to unconstrained cues, subjects in Group 1 were somewhat less likely to make these errors.

Conclusions

The results of Experiment 1 suggest that the perception of normal objects is improved by expectations. Subjects made fewer illusory recombinations, and fewer feature errors, when normal objects were cued. However, this did not seem to be because normal objects were less likely to be "broken up", at least when compared to abnormal objects.

What, then, might be responsible for the increased accuracy in reporting normal objects? There are at least two possibilities. One is that the presence of constraints somehow speeds up the detection and integration of features from normal objects, so that given the same exposure duration, subjects will be more accurate in perceiving expected objects. Further, subjects would have more of an opportunity to attend to expected objects, and because perception would be quicker, they would also have a greater chance of correctly conjoining an expected object's features. Expectations, then, may not replace attention as the glue in the feature integration process, but might simply give attention a head start on gluing features together.

It is somewhat inconsistent with this hypothesis that the exposure durations in the constrained and unconstrained conditions were the same, because presumably if perception were quicker on normal objects, then the exposure duration in the constrained condition should have been driven lower by the staircase. However, the average exposure duration in the constrained condition does not really reflect the speed with which subjects processed normal objects. First, the floor of 100 msec imposed on the exposure durations might have prevented the exposure duration in the constrained condition from dropping below that level.
Furthermore, the presence of abnormal objects in the constrained condition and the high feature error rate on those objects kept that exposure duration high.

A second possible explanation for the results from Experiment 1 would be that perhaps subjects detected features in the display, and then simply guessed on the basis of their expectations about the conjunction of those features. This would lead to more accurate responses to cued normal objects, if subjects build object files on the basis of intelligent guesses. Experiment 2 was designed to help decide between these two possible explanations.

Experiment 2

In this experiment, stimulus cards were designed so that subjects could construct normal objects out of features appearing on the cards, but in abnormal combinations. If subjects detect features correctly and then assign them to object files on the basis of expectations, they should have a tendency to build normal objects out of the features they detected on abnormal objects. On the other hand, if expectations function to speed up the perception of normal objects, then the presence of features which could be conjoined into a normal object should not influence subjects' perceptions. Subjects should simply continue to be more accurate when perceiving normal objects as compared to abnormal objects.

All subjects in this experiment had expectations about the conjunctions of features which would appear together, and on 70% of the cards all three of the objects that appeared conformed to these expectations. But on 30% of the cards, two of the objects were abnormal. An example of an abnormal card appears in Figure 3. The important experimental manipulation here was defined by the relationship between the two abnormal objects on the card. Out of the four features composing these two objects, two of them could be recombined to form a normal object, while a combination of the other two could not.
Therefore if either one of the abnormal objects was cued, subjects could construct a perception of a normal object by taking one of the features from the other abnormal object. This construction would result in an illusory conjunction. On the other hand, subjects could also make a feature error, by constructing a normal object out of the other feature of the cued object plus its associated feature, which was not on the card. Therefore given that a subject reports a normal object, this report could be due to either a conjunction or a feature error.

Figure 3. Experimental cards: Experiment 2. [Example card; legible labels: green, orange, brown; tree, tire, carrot.]

These cards were also designed so that if a subject were guessing, the probability of a conjunction or a feature error would be equal. That is, if subjects chose at random one of the features of the cued object to construct a normal object, then they would be equally likely to make a conjunction or a feature error. Furthermore, even if they systematically relied on either shape or color as the type of feature on which to base their reports, the probabilities of the two types of error would still be equal. This is because each of the two abnormal objects on a card was cued once. Therefore if subjects consistently used one type of feature, for example color, and based their reports on that, then for each card, their guess would produce a conjunction error in response to one of the objects, and a feature error in response to the other.

In sum, if subjects make more illusory conjunctions than feature errors in order to perceive normal objects when the features of these objects are present, but in abnormal combinations, it would imply that subjects are building object files out of detected features on the basis of their expectations.
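The guessing argument for the card design can be checked by enumeration. The following is a minimal sketch, not part of the original analysis. It assumes that the Figure 3 card shows a green tree, an orange tire, and a brown carrot, and that the expected pairs are green tree, orange carrot, black tire, blue lake, and brown logs (the last two pairings are inferred from the feature lists in the text rather than stated). All function and variable names are hypothetical.

```python
# Expected ("normal") color/shape pairs. Green tree, orange carrot and
# black tire appear in the text; "blue lake" and "brown logs" are inferred
# pairings (an assumption).
NORMAL = {"green": "tree", "orange": "carrot", "black": "tire",
          "blue": "lake", "brown": "logs"}
COLOR_OF = {shape: color for color, shape in NORMAL.items()}

# Assumed reading of the Figure 3 card: one normal and two abnormal objects.
CARD = [("green", "tree"), ("orange", "tire"), ("brown", "carrot")]


def guess_outcome(cued, card, keep):
    """Classify the error made by a subject who keeps one feature of the
    cued object and reports that feature's expected mate."""
    color, shape = cued
    on_card = {feature for obj in card for feature in obj}
    mate = NORMAL[color] if keep == "color" else COLOR_OF[shape]
    # If the mate is displayed elsewhere on the card, both reported features
    # were present (conjunction error); otherwise it is a feature error.
    return "conjunction" if mate in on_card else "feature"


# Abnormal objects are those whose color and shape are not an expected pair.
abnormal = [obj for obj in CARD if NORMAL[obj[0]] != obj[1]]
outcomes = {(obj, keep): guess_outcome(obj, CARD, keep)
            for obj in abnormal for keep in ("color", "shape")}
for key, value in outcomes.items():
    print(key, "->", value)
```

Under these assumptions, a subject who always relies on color produces one conjunction error and one feature error across the two cued abnormal objects, and the same holds for a subject who always relies on shape, which is the balance the design is meant to guarantee.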
But if subjects do not construct normal objects out of the features of abnormal objects, while their perceptions of normal objects continue to be more accurate, then one could argue that expectations increase accuracy by speeding up the detection and integration of the features of normal objects.

Finally, the constrained and unconstrained conditions in Experiment 1 differed not only in terms of the presence or absence of expectations, but also in terms of the labels assigned to the objects. Objects in the constrained condition had "conjunction" labels, which implied a conjunction of features, and which corresponded to common, highly familiar objects (e.g. a green tree). Objects in the unconstrained condition, on the other hand, had "unconstrained" labels, which were also familiar (e.g. an orange, black, or blue arrow), but which were not associated with a familiar object in the real world. Given the effect of expectations on subjects' perceptions, then, it is not clear whether the effect was due entirely to the expected conjunctions of the objects' features, or to the difference in labels as well. In order to test whether subjects' responses were affected by the labels provided in Experiment 1, the same two sets of labels were used again, but this time the stimuli seen by the two groups of subjects were identical. Therefore, if there were no difference between the groups in this experiment, we could conclude that the difference between the constrained and unconstrained conditions in Experiment 1 was due exclusively to the differences in the stimuli themselves; to the different probabilities of feature pairing in the two conditions.
On the other hand, if a significant effect of label were found, we could conclude that a familiar object label in some way mediates the perception of objects, and perhaps that the influence of expectations on the perceptual process is dependent upon a well learned expectation about the conjunction of features to be perceived.

Method

Subjects

Subjects were 24 UBC students who were paid $4 for their participation. Of these subjects, 7 were male and 17 were female, and the ages ranged from 19 to 29 years. Sixteen of the subjects were assigned to Group 1, the conjunction labels condition, and eight to Group 2, the unconstrained labels condition.

Materials

The cards used in this experiment were identical in format to those employed in the constrained condition of Experiment 1 with the following exceptions: 70% of the cards had only normal objects on them, while on the other 30%, 2 objects were abnormal while the other was normal. Abnormal objects were constructed from color/shape combinations so that all possible combinations appeared equally often. The abnormal objects appeared so that given the four features present, two could recombine to form a normal object, while the other two could not. The remaining normal object on the card was defined by the only two remaining features which neither appeared in the display, nor were associated with the features of the abnormal objects which appeared. The cards were counterbalanced given these constraints.

All of these cards were cued twice. For cards where two abnormal objects appeared, each of these abnormal objects was cued once. Normal objects were cued only on the remaining cards, on which no abnormal objects appeared. The cue cards were of the same form as those used for Experiment 1. The cue and mask again appeared simultaneously.
Procedure and Measures

For the conjunction labels condition in this experiment, the procedure and measures were identical to those used in the constrained condition of Experiment 1. The unconstrained labels condition differed in that the subjects were provided with the labels employed in the unconstrained condition of Experiment 1, and were simply informed that most of the time they could expect certain color/shape pairs to appear together. Again a 7-1 staircase on feature errors was employed to equalize the feature error rate at 10% across all conditions. This was intended to ensure a relatively high (90%) accuracy of feature detection.

Results and Discussion

An overview of the data from this experiment is presented in Table IV. This table displays the mean proportions of subjects' responses to normal and abnormal objects which fell into a small set of response categories. First, again subjects could have made a conjunction error, if they made a mistake where both reported features were on the card; or they could have made a feature error, if at least one of the features reported was not on the card. The data in this table are also broken down depending on whether subjects reported having seen a normal or an abnormal object, for both conjunction and feature errors separately. Finally, subjects again could have reported the cued object correctly, or have correctly reported an object in another, non-cued position.

Table IV. Mean proportions of response types to normal and abnormal cued objects by confidence: Experiment 2.

                               All Responses   Top Two Conf.   Sure Only
Normal Object Cue:
  Norm. Report     Conj.            .00             .00           .00
                   Feat.            .01             .01           .00
                   C-F             -.01            -.01           .00
  Abnorm. Report   Conj.            .10             .07           .01
                   Feat.            .04             .02           .00
                   C-F              .06             .05           .01
  Correct                           .68             .65           .43
  Wrong Object                      .17             .14           .04
Abnormal Object Cue:
  Norm. Report     Conj.            .13             .11           .04
                   Feat.            .11             .08           .02
                   C-F              .02             .03           .02
  Abnorm. Report   Conj.            .13             .09           .04
                   Feat.            .04             .03           .02
                   C-F              .09             .06           .02
  Correct                           .48             .44           .25
  Wrong Object                      .10             .08           .02

Note: It was only possible to report a normal object in response to a cued normal object by reporting two features which did not appear on the card. This type of error was infrequent, and was scored as a feature error.

First, no difference was found between the two groups in this experiment. In all of the analyses of variance described below, group (conjunction labels, unconstrained labels) was run as a between subjects factor, and there was not one significant main effect of group, nor was there a single significant interaction where group was involved. Therefore subjects who were provided with "conjunction" labels did not respond differently from those subjects given labels having no implications for the paired conjunctions of features.

The main question addressed by this experiment was whether expectations would induce subjects to construct normal objects out of abnormally paired features. The data in Table IV suggest that when subjects did report normal objects in response to cued abnormal objects, they did not tend to do so by making true conjunction errors. Instead, they made almost equal numbers of conjunction and feature errors (C-F=.02). In other words, subjects did not tend to construct normal objects by forming illusory recombinations with the features from abnormal objects.

A comparison of the proportion of true conjunction errors made (C-F) which resulted in the report of normal and abnormal objects revealed that subjects actually made more conjunction errors when reporting abnormal, as opposed to normal objects (see Table IV). This result was rather unexpected.
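As a worked check of the C-F comparison, the sketch below recomputes the difference scores from the "All Responses" column of Table IV for cued abnormal objects. It assumes, following the text, that subtracting the feature error proportion from the conjunction error proportion estimates the rate of true conjunction errors. The variable and function names are hypothetical.

```python
# Mean proportions from Table IV, "All Responses" column, abnormal object cue.
abnormal_cue = {
    "normal_report": {"conj": 0.13, "feat": 0.11},
    "abnormal_report": {"conj": 0.13, "feat": 0.04},
}


def true_conjunctions(cell):
    """C-F: conjunction errors in excess of the feature error rate."""
    return round(cell["conj"] - cell["feat"], 2)


c_minus_f = {report: true_conjunctions(cell)
             for report, cell in abnormal_cue.items()}
print(c_minus_f)
```

The two differences match the C-F rows of the table (.02 for normal reports and .09 for abnormal reports), which is the contrast the comparison rests on.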
It appears that when subjects report normal objects in response to abnormal cued objects, they are actually less likely to make conjunction errors: they make fewer than when reporting abnormal objects. It is not clear why this was the case. A closer analysis of the data suggests that subjects responded to abnormal objects by correctly perceiving one feature, and by pairing that feature with its mate, regardless of whether that mate was present in the display. Subjects were essentially making feature errors, then, which explains why the number of scored feature and conjunction errors was equal. However, it is not clear why subjects failed to make any true conjunction errors over this feature error rate. It may be that subjects made so many feature errors that they didn't have an opportunity to make any other type of error. However, future research is clearly necessary before any conclusions about the mechanism behind this finding can be drawn.

The above results suggest that expectations do not exert their effect on object perception by influencing the formation of illusory conjunctions. It is true, however, that expectations have an effect on subjects' perceptions of objects, if not on their tendency to recombine detected features. First, again subjects were less accurate when abnormal, as opposed to normal, objects were cued. A 2X2 ANOVA on the proportion of correct responses to abnormal and normal objects was performed, with group and type of cue as factors. This analysis revealed a significant main effect of cued object only (F(1,22)=57.339, p<.001). The proportion of correct responses to abnormal cues was much lower than the proportion of correct responses to normal cues.

Furthermore, subjects were more confident in their correct responses to normal objects. The proportion of correct responses which were given a "sure" confidence rating was calculated for each type of cue separately.
A mixed design ANOVA on these proportions with group and type of cue as factors indicated that the proportion of sure correct responses made to normal object cues (.61) was significantly higher than the proportion of confident correct responses to abnormal object cues (.50, F(1,22)=10.219, p<.004).

Subjects also had a clear tendency to report having seen objects which conformed to their expectations. If subjects were simply guessing, by chance 1/4 of their responses to abnormal object cues should have resulted in the report of a normal object. For example, consider the card in Figure 3. If the orange tire were cued, and a subject were to randomly choose one of the other features on the card with which to make a conjunction error, only one out of those four features could result in the report of a normal object (orange carrot). Similarly, out of the four features not on the card (blue, lake, black, logs) only one of the four could be reported so as to result in a feature error and the construction of a normal object (black tire). Although you would expect, then, that there would be 3 times as many abnormal as normal objects reported, in this experiment the proportion of responses to abnormal object cues which resulted in the report of a normal object (.25) was greater than the proportion which resulted in the report of an abnormal object (.17).

In order to test this effect statistically, first, for each subject an estimate of the number of abnormal objects they should have reported due to chance was calculated. This was done by multiplying the subject's total error rate (C+F) by 3/4, since it would be expected by chance that 3/4 of these errors would result in the report of an abnormal object. Then an observed-expected score was derived for each subject. These scores were analyzed by a Wilcoxon signed-ranks test separately for each group to determine whether the effect was consistent across subjects.
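The chance correction described above can be sketched as follows. This is an illustration only: it applies the 3/4 rule to the group mean proportions quoted in the text (.25 normal reports and .17 abnormal reports to abnormal cues) rather than to the individual subjects' scores used in the thesis, which are not reproduced here. The function name and its parameters are hypothetical.

```python
def observed_minus_expected(normal_reports, abnormal_reports, p_abnormal=0.75):
    """Observed minus chance-expected proportion of abnormal-object reports.

    Under guessing, a fraction p_abnormal (3/4) of all erroneous reports
    (C+F) to abnormal cues should describe abnormal objects.
    """
    total_errors = normal_reports + abnormal_reports
    expected_abnormal = p_abnormal * total_errors
    return abnormal_reports - expected_abnormal


# Group means quoted in the text; the thesis computed one score per subject.
score = observed_minus_expected(normal_reports=0.25, abnormal_reports=0.17)
print(round(score, 3))
```

A negative score means that fewer abnormal objects (and so more normal objects) were reported than guessing would predict. In the thesis, one such score per subject was entered into the signed-ranks test.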
This analysis revealed that subjects in both groups reported more normal objects than would be expected by chance (z=-3.1284, p<.001, for Group 1 subjects; z=-2.5205, p<.001, for Group 2 subjects, one-tailed tests).

Finally, subjects were also more confident when reporting incorrect normal objects in response to either normal or abnormal cued objects. The proportion of the responses where subjects reported normal objects which were given a sure confidence rating was calculated, and was compared to the proportion of sure abnormal reports, including responses to both normal and abnormal cued objects. (Note that subjects could only report a normal object in response to a normal cue by making a feature error where both reported features did not appear on the card. This type of feature error was infrequent (see Tables II and IV).) A 2X2 ANOVA on these proportions with group and type of object reported as factors revealed a significant main effect of the type of object reported (F(1,22)=7.217, p<.01). The proportion of normal reported objects rated sure (.18) was significantly higher than the proportion of sure abnormal reports (.09).

In sum, then, this experiment showed that expectations influence the perception of simple objects. First, subjects were more accurate and more confident when responding to normal objects. Further, they had a tendency to report normal objects when abnormal objects were cued. And finally, they were more confident when reporting normal objects.

Finally, as in Experiment 1, subjects were more confident when making conjunction errors than they were when making feature errors, suggesting that some of these errors were more than just guesses on the part of the subjects. The proportion of all conjunction errors made to both types of cue which were given a sure confidence rating was calculated and was compared to the proportion of sure-rated feature errors.
A 2X2 ANOVA with group and type of error made as factors revealed a significant main effect of type of error (F(1,22)=5.292, p<.031). The proportion of sure conjunction errors (.17) was significantly higher than the proportion of sure feature errors (.09). Again, this supports Treisman's feature integration theory: subjects make true conjunction errors which may be perceptual phenomena, rather than guesses, or simply feature errors made where the reported features happen to be on the card.

In sum, the data from these experiments suggest that subjects are more accurate when perceiving objects which conform to their expectations than they are when perceiving either unconstrained or unexpected objects. First, in comparison to the perception of unconstrained objects, subjects make fewer true conjunction errors, and fewer feature errors, when perceiving expected objects. Further, the perception of objects which defy expectations is the least accurate, and subjects tend to incorrectly report having seen objects which conform to their expectations anyway.

GENERAL DISCUSSION AND CONCLUSIONS

The top down effects produced by expectations in these two experiments are consistent with the results of previous research. First, the finding that subjects can set themselves to expect certain conjunctions of features, so as to give an advantage to expected (and a disadvantage to unexpected) objects is consistent with the findings of both LaBerge (1973), and of Gottwald and Garner (1972). Further, other researchers have also shown that unexpected objects tend to be perceived less accurately, and less quickly than expected objects (Loftus and Mackworth, 1978; Friedman, 1979; Biederman et al, 1973; Palmer, 1975). The results of these experiments, then, tie in nicely with the findings of the previous research which has examined the contribution of conceptual information to the perception of objects and scenes.
However, the present data do not support the hypothesis that expectations exert their effect by guiding the assignment of features to object files during the feature integration process. When subjects were presented with features in unexpected combinations, they did not detect those features and combine them into objects on the basis of their expectations. Instead, subjects seemed simply to choose one of the features of a cued abnormal object to match with its associated feature, regardless of whether that feature was on or off the card. As a result, the proportions of conjunction and feature errors made by subjects when reporting normal objects in response to cued abnormal objects were about equal.

It also seems unlikely that normal objects, when not receiving attention, are somehow protected against illusory recombinations of their features, although this conclusion is based on a limited amount of data from Experiment 1. When errors were examined where subjects had the opportunity to draw features from either a normal or an abnormal object, they reported features from both types of objects equally. A possible direction for future research would be to evaluate whether, when subjects are presented with unconstrained and expected objects in the same display, subjects will be less likely to form illusory conjunctions with the features of the expected objects.

The hypothesis that seems best able to account for these data, then, is that expectations may speed up the detection and integration of the features of normal objects, so that given the same exposure duration, these objects will be perceived more accurately than will either unconstrained or abnormal objects. It is possible that subjects set up a "template" or a temporary frame on the basis of their expectations, against which to compare the features of the objects they perceive.
Features which match those specified by this temporary frame may be detected more rapidly, so that attention could be applied, and feature integration accomplished, in less time. Further, templates or expectations may aid perception not only by speeding the detection of features of a single object, but also by providing more of an opportunity to apply attention to more than one expected object. Clearly the probability of a correct report would be higher in a post-cue paradigm if subjects perceived more than one object correctly.

Similarly, it may be that objects which violate expectations are perceived inaccurately because they do not match these frames. The resulting violations would lead subjects to attempt to analyze unexpected objects further. But given a short exposure duration, subjects might be reduced to correctly detecting only one feature, and then guessing on the basis of their expectations what the other feature might have been.

And finally, in the absence of expectations, the process of perception would necessarily proceed from the visual information "up". As a result, in the absence of focal attention, and given short exposure durations, subjects would be particularly vulnerable to miscombining the features of the objects appearing concurrently in a display. Most errors would thus take the form of illusory conjunctions, where subjects perceive the features of objects rather accurately, but are uncertain about the object to which those features belonged, and about the original location of the feature in the display.

The main conclusion that can be drawn on the basis of these experiments, then, is that the process behind feature integration remains unchanged in the presence of expectations.
Instead, expectations seem to improve the perception of objects by speeding the feature detection process, so that attention can be applied more efficiently. As a result, given short exposure durations, expected objects are most likely to be correctly perceived.

One alternative explanation of the results presented here would be that the presence of expectations leads subjects to have a response bias to report expected objects. It is possible that after detecting one feature correctly, subjects guess on the basis of their expectations what the other feature would be. Thus responses to normal cues would be more accurate, and subjects would tend to report normal objects to abnormal cues.

However, some aspects of the data here suggest that subjects' responses were founded on more than just a response bias or guessing. First, subjects were more accurate in detecting the features of normal as opposed to unconstrained objects, given the same exposure durations. Subjects' expectations provided no information about which of the five objects would be cued. Therefore it is more likely that expectations led to a speedier detection, rather than just more accurate guessing, of the features of normal objects.

Second, there is some indication that conjunction errors did not result from guessing. First, subjects were equally likely to report abnormal and normal objects when making conjunction errors, and were as likely to draw features from normal and abnormal objects. Further, subjects were more confident when making conjunction, as opposed to feature errors. Guesses do not seem to be involved in the production of true conjunction errors, or therefore in the finding that subjects made fewer conjunction errors in response to constrained normal as opposed to unconstrained cues.
Further, it does seem as if subjects predominantly made feature errors in response to cued abnormal objects, in order to report normal objects. Here it does seem as if subjects were guessing on the basis of one detected feature. However, it is this finding that is most interesting: first, the fact that subjects are more confident when reporting normal objects implies that they have a good deal of faith in their guesses. And second, it is the finding that subjects do not build object files on the basis of their expectations which is important: expectations do not contribute to the construction of object files. Rather, the data suggest that expectations influence perception by speeding the detection of features, so that feature integration can be accomplished within a shorter time, requiring less attention. However, further research is needed to determine conclusively whether it is at this point, or in the production of a response, where the effect of expectations actually occurs.

The data from Experiment 2 also suggest that it is relatively easy to set up expectations which are strong enough to influence perceptual processing. Although in Experiment 1 the stimuli were chosen to ensure that well learned expectations would be operative, clearly this caution was not necessary. In Experiment 2 expectations were equally influential whether they were induced by familiar object labels or by a quick explanation of the conjunctions of features to expect. This is certainly not evidence against "frames", or the idea that our prior knowledge about a context will affect our perceptions. After all, the expectations which were set up in this experiment provided subjects with the same sort of information that is thought to be provided by a frame, which could be instantiated in any of a number of different ways.
The data here simply suggest either that frames can be set up very quickly, on the basis of newly acquired knowledge, or that the perceptual process can also be affected by a temporary set to perceive certain conjunctions of features.

In conclusion, the experiments in this thesis were designed to examine the ways in which conceptual information affects the detection and integration of the features of simple objects. First, it does not seem to be the case that expectations replace attention in the feature integration process. Subjects do not detect features and then on the basis of their expectations combine those features in an object file. Further, the data suggest that normal objects are not themselves protected from illusory conjunctions, although this second finding needs further support. The data seem to point to the conclusion, rather, that expectations function to speed up the detection of features and the application of attention to the integration of the features of expected objects: focal attention may be applied more efficiently. As a result, the correct integration of the detected features of expected objects can be accomplished accurately in a shorter amount of time.

REFERENCES

Antes, J.R. (1977). Recognizing and localizing features in brief picture presentations. Memory and Cognition, 5, 155-161.

Biederman, I. (1981). Do background depth gradients facilitate object identification? Perception, 10, 573-578.

Biederman, I., Glass, A., & Stacy, E.W. (1973). Searching for objects in real world scenes. Journal of Experimental Psychology, 97(1), 22-27.

Biederman, I., Mezzanotte, R., & Rabinowitz, J. (1982).
Scene perception: Detecting and judging objects undergoing relational violations. Cognitive Psychology, 14, 143-177.

Biederman, I., Teitelbaum, R., & Mezzanotte, R. (1983). Scene perception: No benefit from expectancy or familiarity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 411-429.

Bobrow, D. & Norman, D. (1975). Some principles of memory schemata. In D. Bobrow & A. Collins (Eds.), Representation and Understanding: Studies in Cognitive Science. N.Y.: Academic Press.

Bransford, J. & Johnson, M. (1972). Contextual prerequisites for understanding: Some investigations in comprehension and recall. Journal of Verbal Learning and Verbal Behavior, 11, 717-726.

Chastain, G. (1977). Feature analysis and the growth of a percept. Journal of Experimental Psychology: Human Perception and Performance, 3(2), 291-298.

Egeth, H.E., Virzi, R.A., & Garbart, H. (1984). Searching for conjunctively defined targets. Journal of Experimental Psychology: Human Perception and Performance, 10(1), 32-39.

Farell, B. (1984). Attention in the processing of complex visual displays: Detecting features and their combinations. Journal of Experimental Psychology: Human Perception and Performance, 10(1), 40-64.

Felfoldy, G.L. & Garner, W.R. (1971). The effects on speeded classification of implicit and explicit instructions regarding redundant dimensions. Perception and Psychophysics, 9, 289-292.

Friedman, A. (1979). Framing pictures: The role of knowledge in automatized encoding and memory for gist. Journal of Experimental Psychology: General, 108, 316-355.

Garner, W.R. (1974). The Processing of Information and Structure. Potomac, Md: Erlbaum.

Garner, W.R. & Felfoldy, G. (1970).
Integrality of stimulus dimensions in various types of information processing. Cognitive Psychology, 1, 225-241.

Gottwald, R. & Garner, W.R. (1972). Effects of focusing strategy on speeded classification with grouping, filtering, and condensation tasks. Perception and Psychophysics, 11, ??????.

Gottwald, R. & Garner, W.R. (1975). Filtering and condensation tasks with integral and separable dimensions. Perception and Psychophysics, 18, 26-28.

Homa, D., Haver, B., & Schwartz, T. (1976). Perceptibility of schematic face stimuli: Evidence for a perceptual gestalt. Memory and Cognition, 4, 176-185.

Intraub, H. (1980). Presentation rate and the representation of briefly glimpsed pictures in memory. Journal of Experimental Psychology: Human Learning and Memory, 6, 1-12.

Kahneman, D. & Henik, A. (1981). Perceptual organization and attention. In M. Kubovy & J.R. Pomerantz (Eds.), Perceptual Organization. Hillsdale, NJ: Erlbaum.

Kahneman, D. & Treisman, A. (1984). Changing views of attention and automaticity. In R. Parasuraman, K. Davies, & J. Beatty (Eds.), Varieties of Attention. New York: Academic Press.

Kinchla, R.A. & Wolfe, J.M. (1979). The order of visual processing: "Top-down", "bottom-up", or "middle-out". Perception and Psychophysics, 25, 225-231.

Kuipers, B.J. (1975). A frame for frames: Representing knowledge for recognition. In D. Bobrow & A. Collins (Eds.), Representation and Understanding: Studies in Cognitive Science. New York: Academic Press.

LaBerge, D. (1973). Attention and the measurement of perceptual learning. Memory and Cognition, 1(3), 268-276.

Lindsay, P. & Norman, D. (1976). Human Information Processing (2nd edition). N.Y.: Academic Press.

Loftus, G. & Bell, S. (1975). Two types of information in picture memory. Journal of Experimental Psychology: Human Learning and Memory, 104(2), 103-113.

Loftus, G. & Mackworth, N. (1978).
Cognitive determinants of fi x a t i o n location during picture viewing. Journal of Experimental Psychology: Human Perception and Performance, 4(4), 565-572. Mandler, J . & Johnson, N. (1976). Some of the thousand words a picture i s worth. Journal of Experimental Psychology: Human  Learning and Memory, 2, 529-540. Martin, M. (1979). Local and global processing: The role of sparsity. Memory and Cognition, 1_, 476-484. McClelland, J.C. & Rumelhart, D.E. (1981). An interactive a c t i v a t i o n model of context e f f e c t s in l e t t e r perception: 119 Part I. An account of basic findings. Psychological Review, 88, 375-407. Minsky, M. (1975). A framework for representing knowledge. In P.H. Winston (Ed.) The Psychology of Computer Vi s i o n . N.Y.: McGraw-Hill. Mozer, M.C. (1983). Letter migration in word perception. Journal of Experimental Psychology: Human Perception and  Performance, 9(4), 531-546. Navon, D. (1977) Forest before trees: The precedence of global features in visua l perception. Cognitive Psychology, 9, 353-383. Neisser, U. (1967). Cognitive Psychology. N.Y.: Appleton. P i l l s b u r y . ( 1898). Cited in P. Lindsay & D,. Norman. (1976) Human Information Processing, (2nd e d i t i o n ) . N.Y.: Academic Press. Palmer, S. (1975). The effects of contextual scenes on the i d e n t i f i c a t i o n of objects. Memory and Cognition, 3, 519-526. 1 20 Pomerantz, J.R., Sager, L., & Stoever, R. (1977). Perception of wholes and of their component parts: Some configural s u p e r i o r i t y e f f e c t s . Journal of Experimental Psychology:  Human Perception and Performance, 3, 422-435. Potter, M. (1975). Meaning in visua l search. Science, 198, 965-966. Potter, M. (1976). Short-term conceptual memory for pictures. Journal of Experimental Psychology: Human Learning and  Memory, 2, 509-521. Potter, M. & Levy, E. (1969). Recognition memory . for a rapid sequence of pictures. Journal of Experimental Psychology, 8_i, 10-15. Prinzmetal, W. (1981). 
P r i n c i p l e s of feature integration in v i s u a l perception^. Perception and Psychophysics , 30 (4 ) , 330— 340. Prinzmetal, W. & Willis-Wright, M. (1984). Cognitive and l i n g u i s t i c factors a f f e c t v i s u a l feature integration. Cognitive Psychology, 16, 305-340. 121 Reicher, G. (1969). Perceptual Recognition as a Function of Meaningfulness of Stimulus Materials. Journal of Experimental Psychology, 81(2), 275-280. Rumelhart, D. (1977). An Introduction to Human Information Processing. N.Y.: Wiley. Rumelhart, D. & Siple, P. (1974) Process of recognizing tac h i s t o s c o p i c a l l y presented words. Psychological Review, 8J_, 99-118. Teitelbaum, R.C. & Biederman, I. (1979). Perceiving real-world scenes: The role of a prior glance. Proceedings of the Human  Factors Society, 23, 456-460. Treisman, A. (1980). The role of attention in object perception. Invited address to the Canadian Psychological Association, Calgary. Treisman, A. (1982). Perceptual gropuing and attention in visua l search for features and for objects. Journal of Experimental  Psychology: Human Perception and Performance, 8, 194-214. Treisman, A. & Gelade, G. (1980). A feature integration 1 22 theory of attention. Cognitive Psychology, 12, 97-136. Treisman, A. & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology, 14, 107-141. Treisman, A., Sykes, M., and Gelade, G. (1977). Selective attention and stimulus integration. In S. Dormic (Ed.) Attention and Performance VI. H i l l s d a l e , N.J.: Erlbaum. V i r z i , R.A. & Egeth, H.E. (1984). Is meaning implicated in i l l u s o r y conjunctions? Journal of Experimental Psychology:  Human Perception and Performance,•10(4), 573-580. Ward, L.M. (1982). Determinants of attention to l o c a l and global features of vi s u a l forms. Journal of Experimental  Psychology: Human Perception and Performance, 8, 562-581. Ward, L.M. (1983). On processing dominance: Comment on Pomerantz. 
Journal of Experimental Psychology: General, 112(4), 541-546. 
