UBC Theses and Dissertations

Children's use of rehearsal to remember pictures and words : do self-report, observation, and stimulus… Colozzo, Paola Elizabeth 2009

CHILDREN’S USE OF REHEARSAL TO REMEMBER PICTURES AND WORDS: DO SELF-REPORT, OBSERVATION, AND STIMULUS EFFECTS TELL THE SAME STORY

by

Paola Elizabeth Colozzo

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in The Faculty of Graduate Studies (Audiology and Speech Sciences)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

July 2009

© Paola Elizabeth Colozzo, 2009

ABSTRACT

This study looked at the use of verbal rehearsal in children from grades I to IV in two immediate serial memory tasks, one involving spoken words and the other nameable pictures. It took a new approach for studying rehearsal by integrating and building on the theoretical frameworks and accumulated data from two largely independent research traditions, memory development research and developmental applications of short-term memory models. The objective was to make both a theoretical and a methodological contribution by juxtaposing different conceptualisations of verbal rehearsal as well as by determining how best to ascertain on a child-by-child basis whether the participants were using this strategy.

Both memory tasks involved the recall of order. The serial memory for pictures task did not require any talking. It was left up to the children whether or not to resort to verbal strategies in the form of labelling or rehearsal. Three different indicators were used to tap into rehearsal. One indicator, the phonological-similarity effect, was linked to the manipulation of the characteristics of the to-be-remembered items. The two other indicators were observational and self-report data. In addition, by manipulating visual similarity, it was possible to ask whether children were relying on the pictures while they completed the serial memory for pictures task.

The combination of observational and self-report data resulted in a reliable indicator of whether or not a child was rehearsing.
However, when this data was compared to whether this same child presented with a phonological-similarity effect, it became clear that these indices were not measuring exactly the same thing. The data can nonetheless be reconciled by invoking a broader conception of verbal mediation, one that encompasses the allocation of attention for the purpose of maintenance as well as forms of complex rehearsal.

Finally, this study looked at whether any child-related variables, and in particular language abilities, were linked to the use of rehearsal. Children with better production skills were more likely to have relied on verbal rehearsal to remember lists of spoken words.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication

1. Introduction
  1.1 Placing the study of rehearsal in its historical context
    1.1.1 The study of deliberate memory in children
    1.1.2 Phonological coding and rehearsal in the context of models of short-term memory
  1.2 Why study verbal rehearsal?
  1.3 The development of rehearsal: evidence from two distinct research traditions
    1.3.1 The development of rehearsal as a mnemonic strategy
    1.3.2 The application of the working memory model to children: New questions about rehearsal
    1.3.3 Stalemate or reconciliation?
  1.4 A theoretical framework for interpreting indices of verbal mediation in immediate serial memory in children
    1.4.1 Visual encoding, verbal encoding, and rehearsal in Baddeley’s multiple component model of working memory
    1.4.2 The phonological-similarity effect as an indicator of rehearsal
      1.4.2.1 How does the phonological-similarity effect emerge?
      1.4.2.2 The phonological-similarity effect and rehearsal in children
    1.4.3 Susceptibility to visual similarity in children and adults
    1.4.4 Likely developmental course of verbal mediation in short-term memory
  1.5 Contributions from the memory development perspective
  1.6 Bridging the two perspectives to document and define rehearsal
  1.7 Individual differences in language abilities and rehearsal
  1.8 Research questions

2. Methods
  2.1 Participants
  2.2 Experimental protocol
  2.3 Immediate serial memory for words and pictures
    2.3.1 Summary
    2.3.2 Task design
      2.3.2.1 Stimuli
      2.3.2.2 Number and length of trials completed by each child
      2.3.2.3 Creating trials of three to six words and pictures
      2.3.2.4 Presentations and responses
      2.3.2.5 Pretest
  2.4 Procedure
    2.4.1 Pretest: picture recognition
    2.4.2 Auditory-Verbal Recall
    2.4.3 Visual Recognition
    2.4.4 Post-task questions
    2.4.5 Presentation order
  2.5 Data reduction and task scoring
    2.5.1 Scoring of the word and picture memory tasks
      2.5.1.1 Variables for the memory tasks
    2.5.2 Observational data
      2.5.2.1 Variables for observed strategies
    2.5.3 Self-report data
      2.5.3.1 Variables for reported strategies
    2.5.4 Test of Narrative Language
    2.5.5 Test of Nonverbal Intelligence
    2.5.6 Rapid Automatic Naming of Colours and Animals
  2.6 Reliability of scoring

3. Results
  3.1 Indicators of rehearsal in immediate serial memory for words and pictures
    3.1.1 Phonological-similarity and visual-similarity effects
      3.1.1.1 Auditory-Verbal Recall
      3.1.1.2 Visual Recognition
      3.1.1.3 Comparing tasks
      3.1.1.4 Summary
    3.1.2 Observational data
      3.1.2.1 Auditory-Verbal Recall
      3.1.2.2 Visual Recognition
      3.1.2.3 Comparing tasks
      3.1.2.4 Summary
    3.1.3 Self-report data
      3.1.3.1 Auditory-Verbal Recall
      3.1.3.2 Visual Recognition
      3.1.3.3 Comparing tasks
      3.1.3.4 Summary
    3.1.4 Combining indices of rehearsal
      3.1.4.1 Auditory-Verbal Recall
      3.1.4.2 Visual Recognition
      3.1.4.3 Comparing tasks
      3.1.4.4 Summary
  3.2 Interactions between child-related variables and the presence of rehearsal
    3.2.1 Demographic data, cognitive and language scores, by grade
    3.2.2 Phonological-similarity effects and visual-similarity effects
      3.2.2.1 Auditory-Verbal Recall
      3.2.2.2 Visual Recognition
      3.2.2.3 Summary
    3.2.3 Children who were likely rehearsing and those who likely were not
      3.2.3.1 Auditory-Verbal Recall
      3.2.3.2 Visual Recognition
      3.2.3.3 Summary
  3.3 Summary of results

4. Discussion
  4.1 To what extent did the children use strategic rehearsal?
    4.1.1 Auditory-Verbal Recall
    4.1.2 Visual Recognition
    4.1.3 Comparing tasks
  4.2 Were any within-child variables associated with the use of strategic rehearsal?
  4.3 Do the various indicators of rehearsal measure the same thing?
  4.4 Multiple approaches to help oneself remember: changes with development
  4.5 Rehearsal redefined: attempting to reconcile two research traditions
  4.6 Methodological considerations
  4.7 Conclusion

Bibliography

Appendices
  Appendix A: Data for the word-picture pairs
  Appendix B: Trials for immediate memory tasks
  Appendix C: Examples from Visual Recognition
  Appendix D: Pretest response cards and targets
  Appendix E: Detailed task instructions
  Appendix F: Detailed scoring decisions for words and pictures correct
  Appendix G: Rapid Automatic Naming of Colours and Animals
  Appendix H: Raw data
  Appendix I: Subject-wise data for the indicators of labelling and rehearsal
  Appendix J: Testing for order effects
  Appendix K: UBC Research Ethics Board certificates of approval

LIST OF TABLES

Table 1. Distribution of Children by Presentation Order for Auditory-Verbal Recall and Visual Recognition
Table 2. Demographic Data and Scores on Tests of Language and Cognition
Table 3. Mean Words Correct, Trials Correct, and Highest Span, by Condition and by Trial Length, Auditory-Verbal Recall
Table 4. Mean Pictures Correct, Trials Correct, and Highest Span, by Condition and by Trial Length, Visual Recognition
Table 5. Distribution of Participants Based on Whether or Not They Presented With Significant Phonological-Similarity Effects in Auditory-Verbal Recall and Visual Recognition
Table 6. Observed Occurrences of Overt Labelling in Auditory-Verbal Recall, by Condition
Table 7. Observed Occurrences of Overt Rehearsal in Auditory-Verbal Recall, by Condition
Table 8. Classification of Participants as Overtly Rehearsing or Not for Auditory-Verbal Recall Based on Observations, by Grade
Table 9. Observed Occurrences of Overt Labelling in Visual Recognition, by Condition
Table 10. Observed Occurrences of Overt Rehearsal in Visual Recognition, by Condition
Table 11. Classification of Participants as Overtly Rehearsing or Not for Visual Recognition Based on Observations, by Grade
Table 12. Distribution of Children by Reported Strategy, Auditory-Verbal Recall
Table 13. Classification of Participants as Reporting Rehearsing or Not for Auditory-Verbal Recall, by Grade
Table 14. Distribution of Children by Reported Strategy for Visual Recognition
Table 15. Classification of Participants as Reporting Rehearsing or Not for Visual Recognition, by Grade
Table 16. Number of Children Reporting Rehearsal as a Strategy, by Task
Table 17. Distribution of Children as Rehearsing or Not, Based on Observations and on Self-Report, Auditory-Verbal Recall
Table 18. Distribution of Participants by Likely Strategy, Based on Combined Data From Observation and Self-Report, Auditory-Verbal Recall
Table 19. Distribution of Children With or Without a Phonological-Similarity Effect (PSE) and Likely or Not to Have Been Rehearsing, Auditory-Verbal Recall
Table 20. Distribution of Children With or Without a Phonological-Similarity Effect (PSE) and Likely or Not to Have Been Labelling or Rehearsing, Auditory-Verbal Recall
Table 21. Distribution of Children as Rehearsing or Not, Based on Observations and on Self-Report, Visual Recognition
Table 22. Distribution of Participants by Likely Strategy, Based on Combined Data from Observation and Self-Report, Visual Recognition
Table 23. Distribution of Children With or Without a Phonological-Similarity Effect (PSE) and Likely or Not to Have Been Rehearsing, Visual Recognition
Table 24. Distribution of Children With or Without a Phonological-Similarity Effect (PSE) and Likely or Not to Have Been Labelling or Rehearsing, Visual Recognition
Table 25. Distribution of Participants by Likely Strategy in Auditory-Verbal Recall and Visual Recognition Based on Observation and Self-Report
Table 26. Demographic Data and Scores on Tests of Language, Cognition, and Memory, by Grade
Table 27. Number of Words Correct by Condition and by Grade, Auditory-Verbal Recall
Table 28. Number of Pictures Correct by Condition and by Grade, Visual Recognition
Table 29. Distribution of Children Based on Whether Their Performance in the VS Condition in Visual Recognition was Suggestive of Either a Visual-Similarity Effect or a Practice Effect, by Grade
Table 30. Distribution of Children Based on Whether or Not They Presented with a Significant Phonological-Similarity Effect (PSE) and a Significant Visual-Similarity Effect (VSE) in Visual Recognition, by Grade
Table 31. Children’s Ages, and Scores on Tests of Cognition, Language, and Memory, Depending on Whether or Not They Were Likely Rehearsing, Auditory-Verbal Recall
Table 32. Children’s Ages, and Scores on Tests of Cognition, Language, and Memory, Depending on Whether or Not They Were Likely Rehearsing, Visual Recognition
Table 33. Data regarding Age of Acquisition (AoA), Word Length, and Frequency for the Word-Picture Pairs, by Condition
Table 34. Demographic Data, Scores on Tests of Language and Cognition, Presentation Orders, Words and Pictures Remembered Correctly by Condition, Observed Strategies by Condition, and Self-Reported Strategies, Auditory-Verbal Recall and Visual Recognition, all Participants
Table 35. Subject-Wise Data for the Phonological-Similarity Effect, Overtly Labelling, Reporting Labelling, Overtly Rehearsing, Reporting Rehearsal, and Final Status in Terms of Likely Labelling or Likely Rehearsing, Auditory-Verbal Recall
Table 36. Subject-Wise Data for the Phonological-Similarity Effect, the Visual-Similarity Effect, Overtly Labelling, Reporting Labelling, Overtly Rehearsing, Reporting Rehearsal, and Final Status in Terms of Likely Labelling or Likely Rehearsing, Visual Recognition
Table 37. Subject-Wise Data for the Phonological-Similarity Effect, Overtly Labelling, Reporting Labelling, Overtly Rehearsing, Reporting Rehearsal, and Final Status in Terms of Likely Labelling or Likely Rehearsing, Comparing Auditory-Verbal Recall and Visual Recognition
Table 38. Distribution of Participants Based on Whether or Not They Showed a Significant Phonological-Similarity Effect in Visual Recognition, by Presentation Order
Table 39. Distribution of Participants in Terms of Whether or Not They Showed a Significant Visual-Similarity Effect in Visual Recognition, by Presentation Order
Table 40. Classification of Participants as Overtly Rehearsing or Not for Auditory-Verbal Recall Based on Observations, by Presentation Order
Table 41. Classification of Participants as Overtly Rehearsing or Not for Visual Recognition Based on Observations, by Presentation Order
Table 42. Classification of Participants as Reporting Rehearsal or Not for Auditory-Verbal Recall, by Presentation Order
Table 43. Classification of Participants as Reporting Rehearsal or Not for Visual Recognition, by Presentation Order

LIST OF FIGURES

Figure 1. Mean number of words remembered correctly (95% CI) in Auditory-Verbal Recall, by condition
Figure 2. Difference score in number of correct words, dissimilar vs. PS conditions, Auditory-Verbal Recall
Figure 3. Difference score in number of correct words, dissimilar vs. VS conditions, Auditory-Verbal Recall
Figure 4. Mean number of pictures remembered correctly (95% CI) in Visual Recognition, by condition
Figure 5. Difference score in number of correct pictures, dissimilar vs. PS conditions, Visual Recognition
Figure 6. Difference score in number of correct pictures, dissimilar vs. VS conditions, Visual Recognition
Figure 7. Percentages of correct words in Auditory-Verbal Recall and of correct pictures in Visual Recognition, all conditions combined, trials of three to five items
Figure 8. Scores on the Test of Nonverbal Intelligence (TONI-3), by age and by grade
Figure 9. Mean number of words remembered correctly, by condition and by grade, Auditory-Verbal Recall
Figure 10. Mean number of pictures remembered correctly, by condition and by grade, Visual Recognition
Figure 11. Distribution of children in terms of whether or not they presented with a visual-similarity effect and/or a phonological-similarity effect in Visual Recognition, by age
Figure 12. Distribution of children in terms of whether or not they were likely rehearsing by age and by grade, Auditory Verbal Recall
Figure 13. Distribution of children in terms of whether or not they were likely rehearsing by age and by grade, Visual Recognition
Figure 14. Mean number of words remembered correctly, by condition and by presentation order, Auditory-Verbal Recall
Figure 15. Mean number of pictures remembered correctly, by condition and by presentation order, Visual Recognition

ACKNOWLEDGEMENTS

I would like to express my most sincere gratitude to the members of my supervisory committee, Geoff Hall, Barbara Purves, Jeff Small, and in particular to my supervisor and mentor, Judith Johnston. They have guided and supported me throughout, and allowed me to go further in my learning and thinking than I thought possible. For their patience, insight, expertise, and kindness I shall always be indebted.
I would also like to thank my fellow students, who have been a constant source of peer learning, laughter, and support. Special thanks go out to Monique Charest and to Penelope Bacsfalvi, who made my doctoral experience richer, more fruitful, and more pleasant. I am also most grateful for all the wonderful teachers I have had since I first started school at the age of 3; they have continually helped to quench my thirst for knowledge and let the joy of learning become an integral part of my everyday life.

I would not have been able to reach this point without the wonderful children who agreed to participate in this study, the parents who consented, or the teachers, principals, colleagues, and friends who helped to identify them. I most enjoyed the time I spent with these funny, beautiful, perspicacious, and bright children who willingly completed the experimental protocol. It also reassured me that no matter how different my professional life was going to be from now on, I would always be most happy in the company of children and my inner child was alive and well.

I also benefitted from the financial support provided by the following sources throughout my doctoral studies: BC Medical Services Foundation, Bamford-Lahey Foundation, the University of British Columbia, Centre Hospitalier Universitaire Sainte-Justine, and the Natural Sciences and Engineering Research Council of Canada (awarded to J. Johnston).

Finally, I would like to convey my deepest appreciation to my friends and family who have put up with me, supported me, believed in me, and laughed both at and with me over the years of my doctoral studies. Whenever I felt like giving up and going back to clinical practice and ‘real life’ I thought of them and kept at it.

DEDICATION

To my parents

1. INTRODUCTION

Although many unanswered questions and much controversy exist regarding the relationship between language and thought, there appears to be a general consensus that language can serve as a cognitive scaffold or mediator. Verbal mediation is presumed to enhance the range, efficiency, and complexity of our conceptual and reasoning processes, to provide the means to reflect on our own thinking, and to facilitate complex trains of reasoning (Atran et al., 2002; Carruthers, 2002a, 2002b).

This study focuses on the development of a particular form of verbal mediation, namely verbal rehearsal. This deceptively simple yet powerful memory strategy may in fact shed light on the development and use of other cognitive strategies. At the level of the individual, it may result in payoffs in the moment and facilitate further learning, and consequently it could pave the way to discovering the usefulness of verbal mediation more generally.

This project used a within-subjects design to examine under what conditions children in grades I to IV made use of verbal rehearsal in two immediate memory tasks. Three indicators of rehearsal were combined and contrasted in both group- and subject-wise analyses: an indirect measure based on the presence of a phonological-similarity effect, direct observations of strategic behaviours, and children’s self-reported strategies. In addition, this project looked at how child-related variables, and in particular age and language ability, affected children’s use of verbal rehearsal.

This study takes a new approach to verbal rehearsal by integrating and building on the theoretical frameworks and accumulated data from two largely independent research traditions: memory development research and developmental applications of short-term memory models. As such, its contributions are both theoretical and methodological. To understand its roots, one must take a stroll down research memory lane.
1.1 Placing the study of rehearsal in its historical context

1.1.1 The study of deliberate memory in children

Memory research has a long and rich history, going back to the late 1800s, with important advances made early on in the USSR, Germany, and the USA (for reviews, see Schneider, 2000; Schneider & Pressley, 1997). This work anticipated, to different degrees, the research and thinking that followed. The seminal work of John Flavell and his colleagues (e.g., Flavell, Beach, & Chinsky, 1966; Keeney, Cannizzo, & Flavell, 1967) regarding the emergence of rehearsal in children launched the modern era of memory development research. In 1971, Flavell invited the research community to reflect on ‘what memory development is the development of?’. The basic themes introduced then continue to be considered important explanatory variables to this day.

Early work predominantly focused on describing the typical developmental course of deliberate memorisation. The goal was to document what factors were involved in changes in immediate memory ability with development, and strategies were the prime suspect. Strategies are used to solve problems. In the memory development literature, “strategies are traditionally defined as goal-directed cognitive operations used to facilitate task performance. Most researchers agree that strategies reflect operations above and beyond those that are natural consequences of carrying out a task, are intentional, and are potentially available to consciousness” (Bjorklund & Coyle, 1995, p. 161).

As the bulk of this early research was intended to document changes in strategy development, experiments were designed to give participants an opportunity to use and to ‘discover’ strategies. The typical methods involved delayed recall, free recall (i.e., in any order), multiple trials, strategy observation, strategy training, and think-aloud protocols (i.e., participants verbalise their thoughts while performing a task).
So the process was partially inferred based on how much one remembered but mostly observed, and the focus was on the individual and the possibility of children figuring out ways to better accomplish the task. Initially, the ability for deliberate memorisation was assumed to emerge in the early school years and to correspond to “a transition from passive to active approaches to memory tasks” (R. P. Ferguson & Bray, 1976, p. 490). The original question of interest was whether young children were unable to use a strategy at all, or whether they could improve their memory performance when shown a strategy even though they failed to use one spontaneously (Flavell et al., 1966; Flavell, Friedrichs, & Hoyt, 1970). Results were generally consistent: Preschoolers did not spontaneously resort to expected memory strategies such as verbal rehearsal or semantic categorisation. On the other hand, young school-aged children who did not demonstrate strategy use in unconstrained situations could nonetheless do so and improve their memory performance when instructed. Although preschoolers could also be taught to use a strategy, they typically required more explicit directions; making a strategy available or suggesting its use was not always enough for them to adopt it (Corsini, Pick, & Flavell, 1968; Hagen & Kail, 1973). Research indicated that this tendency not to use a strategy that could prove helpful extended to nonverbal strategies as well (e.g., copying, Corsini et al., 1968; Selective Recall task, P. H. Miller, Haynes, DeMarie-Dreblow, & Woody-Ramsey, 1986). For many years, children's increasing tendency to use strategies was seen as the primary impetus for age changes in how well children remembered in deliberate memory tasks. More sophisticated and effective strategies were assumed to replace less sophisticated ones as children progressively became mature problem solvers. However, new data and interpretations soon challenged the prevailing view.
In particular, even young children were found to demonstrate deliberate, intentional behaviours aimed at remembering (e.g., naming, staring, and pointing) when performing simplified tasks in familiar contexts, although these strategies tended to be simple and sometimes ineffective (e.g., DeLoache & Brown, 1983; DeLoache, Cassidy, & Brown, 1985; Wellman, Ritter, & Flavell, 1975). Moreover, manipulations of task parameters led to very different impressions regarding whether children of a given age qualified as strategic (e.g., Baker-Ward, Ornstein, & Holden, 1984; Ritter, Kaprove, Fitch, & Flavell, 1973). Such new results led Ornstein, Baker-Ward, and Naus (1988) to propose a progression of mnemonic effectiveness which took into consideration the various supports for information processing. They argued that focusing on what children were able to do in favourable conditions could provide insight into factors supporting emerging strategies. They predicted that one would find evidence of strategy deployment initially in highly salient and supportive contexts, and later observe transfer to more demanding situations. Familiarity with routines and reliance on prior knowledge were presumed to reduce the processing demands and thus make the appearance of strategies more likely. Strategy effectiveness was also assumed to progress, with strategy implementation becoming “increasingly effective, reflecting the routinization and automatization that comes from both practice and the development of certain underlying information handling skills (such as retrieval and the ability to make interconnections in the knowledge base)” (Ornstein et al., 1988, p. 38). Hence, by the early 1980s, it was generally accepted that memory strategies developed over a protracted period.
Baker-Ward, Ornstein, and Holden (1984) summed up the dominant view as follows:
The ontogeny of memorization is characterized by age-related increases in the degree to which the use of mnemonic mediators characterizes performance in memory tasks, in the manner in which these mediators are applied to the task, and in the efficacy of the use of nascent memory strategies…. The implication of these findings is the view that the acquisition of memory strategies represents a gradual and lengthy process. (p. 574)
Much of the early work on memorisation which had led to this conclusion was based on the development of a few strategies, including the use of external cues (Ritter, 1978; Ritter et al., 1973), organisational strategies based on meaning (e.g., Carr, Kurtz, Schneider, Turner, & Borkowski, 1989; Corsale & Ornstein, 1980; Moely, Olson, Halwes, & Flavell, 1969; Schneider, 1986), and most of all verbal rehearsal, which is the focus of the current study. In its simplest definition, rehearsal “involves the repetition of target information” (Bjorklund & Douglas, 1997, p. 208), whereas cumulative rehearsal “can be defined as the conscious, deliberate, additive repetition, either covertly or overtly, of the information to be learned” (Allik & Siegel, 1976, p. 317).¹ A further distinction can be made between cumulative and chunked rehearsal: “In cumulative rehearsal, the subject attempts to rehearse the entire list, updating it as each digit is presented. In chunked rehearsal, the subject breaks the list into chunks, or groups of digits (typically, groups of two, three, or four digits); each of these chunks is rehearsed separately” (Gupta & MacWhinney, 1997, p. 299). These advanced forms of rehearsal have been observed or reported in older children and adults (Allik & Siegel, 1976; Henry, Turner, Smith, & Leather, 2000; Logie, Della Sala, Laiacona, Chalmers, & Wynn, 1996; Ornstein, Naus, & Liberty, 1975; Turner, Henry, & Smith, 2000).
As a result of the work of Baker-Ward, Naus, Ornstein and their colleagues, the general consensus was that use of verbal rehearsal developed both quantitatively and qualitatively: It was expected to become more frequent, more sophisticated, more effective, and also progressively internalised as children got older. In free recall tasks, spontaneous rehearsal was generally considered to appear around 7 years of age, whereas the age of 10 was thought to coincide with the emergence of sophisticated cumulative rehearsal strategies (Cowan, 1997; Gathercole, 1998), although it was recognised that task parameters could influence these milestones significantly. Of crucial importance, children at younger ages were considered to be nonrehearsers. On the other hand, young school-aged children who did not demonstrate the use of rehearsal in unconstrained situations could nonetheless do so when instructed, and consequently saw their memory performance improve. In parallel, a few researchers coming from an altogether different tradition began applying models and methods developed to describe the functioning of short-term memory in adults to children. Before reviewing this new line of research in more detail, it seems necessary to trace how those who came to this problem space via the memory models route ended up drawing conclusions about the development of rehearsal.
¹ Rote rehearsal (going over the information exactly) can be distinguished from elaborative rehearsal, which involves forming new, meaningful connections between items, which is essentially a semantic strategy (Cowan, 1997). The research reviewed here focuses on rote rehearsal.
1.1.2 Phonological coding and rehearsal in the context of models of short-term memory
In the early 1960s, Conrad published a series of studies which highlighted a group of interesting and related short-term memory phenomena.
When adults were presented with lists of written letters to remember in order, they tended to make nonrandom errors, being more likely to substitute a similar-sounding letter name for the correct response (e.g., p for b) rather than a dissimilar-sounding one (e.g., m for b); furthermore, these recall confusions followed the same patterns as those observed in an auditory identification task in which participants were asked to identify letters under difficult listening conditions (Conrad, 1964). For Conrad, these findings pointed to “a remarkable similarity between memory errors and listening errors” (1964, p. 79), and led him to predict that how much one could remember would be affected by the confusability of the to-be-remembered items. He tested this hypothesis, and indeed found that adults made significantly more recall errors when trying to remember series of letters with higher compared to lower probabilities of acoustic confusion (Conrad & Hull, 1964). Conrad (1963) had also observed a similar pattern of errors for words that sounded alike, as accuracy of serial list recall was significantly poorer when the target words were drawn from pairs of visually-distinct homophones (e.g., sum, some, ruff, rough, rays, raise, …) rather than from words that were both phonologically and visually dissimilar (e.g., best, deep, fall, glue, mine, nose, …). Other researchers also studied this phenomenon and found strong effects of acoustic similarity for letters (Wickelgren, 1965a), consonant-vowel digrams (Wickelgren, 1965a, 1965b), and words (Baddeley, 1966, 1968). For instance, Baddeley (1966) asked adults to recall lists of words by writing them in the same order as they had been presented. The participants correctly recalled significantly fewer lists that consisted of words which sounded alike (e.g., mad, man, mat, cat, cap) compared to control lists built from words that sounded dissimilar (e.g., cow, day, bar, few, hot).
Baddeley acknowledged that the set of similar-sounding words were also more alike in terms of their spelling, and hence formal (or visual) similarity and acoustic similarity were confounded. However, additional results (Experiment 2) suggested that, at least in the case of written words presented to literate adults, any impact of formal similarity was much smaller than that of acoustic similarity. This result, and the fact that the acoustic confusions occurred regardless of whether adults were presented spoken (Experiments 1 and 2) or written words (Experiment 3), suggested that participants were relying on a verbal code regardless of presentation modality. This research was part of the ambitious endeavour of attempting to unravel the mysteries of verbal short-term memory in adults. The identification of phenomena such as the acoustic (or phonological) similarity effect² allowed researchers to describe the functioning of verbal short-term memory, whereas explaining these effects was deemed crucial to developing a theoretical model. Baddeley and his many collaborators remained at the forefront of this research, and used various task manipulations in their attempt to unlock the secrets of verbal short-term memory. In most cases, the tasks involved immediate serial (i.e., in order) verbal recall, and the materials to be remembered were spoken or written words, letters, or digits. The focus of interest was on how the manipulation of task parameters or stimuli impacted memory span, with span measured as the longest list length that the participant could recall at some criterion of performance (e.g., correct recall on 50% of trials at that list length). In 1974, Baddeley and Hitch proposed the first instantiation of a multiple component model of short-term memory which they called the working memory model. This model, in a somewhat modified version (e.g., Baddeley, 1986, 2000, 2003; Baddeley & Logie, 1999), continues to be highly influential to this day.
Baddeley has had numerous collaborators, most notably Graham Hitch and Robert Logie. He has been the constant, however, and for the sake of simplicity I will refer to Baddeley’s model of working memory. Much of the early focus was on the articulatory or phonological loop³, the component of the system presumed to be responsible for the storage and recall of ‘speech-based material’ (Baddeley, 1986). According to the model, the phonological loop consists of both a passive store and an active rehearsal process. The store is of limited temporal capacity, and its content must be maintained and refreshed by the process of articulatory rehearsal or it will be lost. The presence of a phonological-similarity effect (PSE) was assumed “to be due to the confusability of the traces of these stimuli in the passive phonological store” (Halliday & Hitch, 1988, p. 200). As such, this effect was deemed to reflect the functioning of the phonological store which, in neurologically-intact adults, was “accessible either through auditory presentation or by the articulatory coding of visually presented material” (Baddeley, 1986, p. 84). Another phenomenon, the word-length effect, was thought to reflect the process of rehearsal.
² This phenomenon continues to be studied to this day, and has gone by different terms over the years, including acoustical-, phonemic-, or phonological-similarity effect. I shall return to the nomenclature issue later on.
³ At some point after 1986, Baddeley renamed this system from the articulatory loop to the phonological loop, which is the term he continues to use in the most recent versions of the model. Baddeley apparently changed the name “on the grounds that the capacity for storage was the central feature of the system, which can operate without articulation, provided material is presented auditorily” (Baddeley & Larsen, 2007a, p. 497).
Studies consistently showed that adults were less successful when asked to recall lists of words that took longer to say (e.g., three- to five-syllable words) compared to those that could be pronounced more quickly (e.g., one-syllable words) (e.g., Baddeley, Thomson, & Buchanan, 1975; Schiano & Watkins, 1981). Baddeley suggested that the word-length effect “is based on the simple fact that longer words take longer to say, hence reducing the rate at which an item can be rehearsed” (1986, p. 84). Finally, if adults were prevented from rehearsing by being required to utter an irrelevant sound (e.g., the, the, the, …) throughout the task, they could not recall as much information, and both the word-length effect and the phonological-similarity effect were considerably reduced for visually-presented stimuli (e.g., Baddeley et al., 1975; Cowan, Cartwright, Winterowd, & Sherk, 1987). These findings were taken to highlight the dual role of the rehearsal mechanism, which was presumed to recode visually-presented information (written words or nameable pictures) so that it could access the phonological store, and to maintain the contents of the store so that they continued to be accessible. With articulatory suppression, this translation function of rehearsal was presumably blocked. This research has a common denominator: it aimed at developing a model of short-term memory in adults by using task manipulations to infer whether groups of individuals were relying on a speech-like code and on a rehearsal process. These inferences were based on overall recall performance, or on how item recall varied as a function of an item's position within a study list (i.e., the serial position curve). To summarise, if a group showed a phonological-similarity effect, this was interpreted as an indication of access to the phonological store. Rehearsal was indexed by the word-length effect and better recall for the earliest items in the list.
This primacy effect was attributed to the fact that the participants had more time and opportunity to rehearse the earlier list items in a serial recall task. In addition, suppression was presumed to disrupt recoding of visually-presented stimuli into words as well as rehearsal; as a result, the word-length effect was abolished with suppression for visually presented materials. These task manipulations were applied to children as early as the 1970s (e.g., Allik & Siegel, 1976; R. M. Brown, 1977; Conrad, 1971; Hayes & Rosner, 1975; Hayes & Schulze, 1977) in order to see at what developmental point children showed the same effects as adults. However, it was not until the 1980s that they resulted in any broad claims about rehearsal, most notably in the work of Charles Hulme, Graham Hitch, and M. Sebastian Halliday. As had been the case with adults, the presence of subvocal rehearsal was inferred exclusively on the basis of indirect evidence stemming from task manipulations (e.g., word length, phonological similarity). Interestingly, these studies are never mentioned in any of the extensive reviews from the memory development literature. To summarise succinctly, results suggested that under some circumstances groups of children as young as 3 or 4 years of age were sensitive to the same task manipulations as adults (e.g., Hitch, Halliday, Dodd, & Littler, 1989; Hitch, Halliday, Schaafstal, & Heffernan, 1991; Hulme, 1987; Hulme, Silvester, Smith, & Muir, 1986; Hulme, Thomson, Muir, & Lawrence, 1984; Hulme & Tordoff, 1989). Hence, it was hypothesised that the children must have been doing something similar to what adults were presumed to be doing, namely using a verbal code to remember nameable pictures and rehearsing subvocally. What did this new line of research mean regarding the generally accepted developmental course for verbal rehearsal?
Could one truly deduce that children as young as 3 or 4 years of age were resorting to subvocal rehearsal when years of research using other methods and measures (e.g., observational data, think-aloud protocols, or training) had concluded that they were not using sophisticated forms of overt or covert rehearsal? Also, what exactly was meant by the term rehearsal in this context? Was it more akin to a semi-automatic process related to phonological coding or activation? If so, this is very different from the concept of rehearsal as defined in the memory development literature as a deliberate strategy potentially accessible to consciousness. In the first case, the process seems in charge; in the latter, some measure of control clearly lies within the intentional individual. Finally, were changes in rehearsal primarily quantitative (i.e., more frequent, faster) or qualitative (i.e., changing rehearsal styles, progressive internalisation, etc.)? As will become clear below, in spite of over forty years of accumulated research, these issues remain partly unresolved to this day. That being said, I would argue that it is premature to declare them unresolvable given that very little research has tackled these questions head on. Before reviewing in more detail the existing research regarding the development of rehearsal from these two distinct traditions (memory strategies and short-term memory models) and exploring how to build on them in order to address some lingering questions, it seems important to begin by justifying why rehearsal is an important cognitive strategy worthy of additional research.
1.2 Why study verbal rehearsal?
Exploring the use of rehearsal by children is motivated on many levels.
Firstly, verbal strategies (including verbal encoding and rehearsal) figure prominently both in influential models of short-term memory (e.g., Baddeley & Logie, 1999; see below) and in the study of memory and cognitive development (for reviews, see Bjorklund, Dukes, & Brown, 2009; Schneider, 2000; Schneider & Pressley, 1997). The work of Flavell and his colleagues on the issue of verbal mediation actually spawned the memory development field in its modern incarnation. Also, the developmental progression of other mnemonics (e.g., semantic strategies such as organisation or elaboration) and other cognitive strategies (e.g., strategies to solve arithmetic, spatial, social, or other problems) parallels that of verbal rehearsal in many ways (Kuhn, 1995, 2000a, 2000b; McGilly & Siegler, 1989; Schneider & Bjorklund, 1998; Siegler, 1994, 1995). This has led Deanna Kuhn to predict that “there is likely to be significant similarity between these strategy/metastrategy relations across different kinds of cognition” (Kuhn, 2000a, p. 24), including memory, problem solving, reasoning, and inference. Nonetheless, rehearsal is accorded a somewhat special status because it is applicable in diverse situations and tasks, often in combination with other strategies (e.g., Cox, Ornstein, Naus, Maxfield, & Zimler, 1989; Coyle & Bjorklund, 1997; Hock, Park, & Bjorklund, 1998; Schlagmüller & Schneider, 2002; Schneider, 1986). Secondly, although verbal rehearsal is a process best adapted to situations where sequential verbal information must be stored and later recalled, there is no denying that it is a useful strategy for succeeding in school-based environments, which tend to require substantial amounts of rote learning.
Whether we like it or not, “the lives of many children, especially those growing up in the information-age societies such as ours, require the deliberate acquisition, retention, and retrieval of information, and memory strategies play an important role in such information processing” (Bjorklund et al., 2009, p. 146). The mnemonic processes underlying a seemingly simple experimental task such as immediate verbal recall of sequential information may actually be important in a much broader variety of everyday problem-solving activities (Cowan & Kail, 1996). In particular, rehearsal may be an important mechanism for acquiring information that can later be retrieved with little effort, thus leaving more resources available for new learning. Hence efficient use of verbal rehearsal could lead to payoffs in the moment as well as over the long term in multiple areas, including developing a large and rich vocabulary (Gupta & MacWhinney, 1997), acquiring new conceptual and procedural knowledge in mathematics (Fazio, 1994, 1996, 1999), developing good reading comprehension abilities (McNeil & Johnston, 2004; Palmer, 2000a; Steinbrink & Klatte, 2008), and following complex directions (Gill, Klecan-Aker, Roberts, & Fredenburg, 2003). Finally, rehearsal may be one of many ways in which verbal mediation can support thinking. As such, it could be a stepping stone for children to eventually realise the usefulness and power of verbal strategies and of self-directed talk. These various forms of private speech could then be applied in a wide range of tasks and for multiple goals (memorising, following and executing a plan, self-monitoring, weighing different options or solutions, etc.). This final point is only speculation, as little research has looked at verbal strategy use over multiple tasks at the individual level.
A recent study by Al-Namlah, Fernyhough, and Meins (2006) did look at the use of self-regulatory private speech during a problem-solving task as well as verbal recoding of visually presented material in a short-term memory task in 4- to 8-year-olds. However, the analyses did not directly tackle the question of whether there was a relationship between the likelihood of children using self-directed talk and a verbal memory strategy. Al-Namlah and his collaborators did find that the use of self-regulating private speech was related to recall of dissimilar-sounding spoken words. However, one would have to make the inference that those children who recalled more were using verbal rehearsal. Although this is possible, it is not necessarily the case. There is now considerable accumulated evidence indicating that children do not always experience an immediate performance benefit from using a strategy, a phenomenon that Patricia Miller dubbed a utilization deficiency (e.g., Bjorklund & Coyle, 1995; P. H. Miller, 1990; P. H. Miller & Seier, 1994). Hence the conclusion that private speech and phonological recoding “are instances of a domain-general shift to verbal mediation in the preschool years” (Al-Namlah et al., 2006, p. 128), although interesting, requires additional empirical support. There is nonetheless evidence that rehearsal and private speech follow similar developmental courses, with increased use and awareness, similar relationships to task difficulty, and a progressive shift towards internalisation (e.g., Fernyhough & Fradley, 2005; Flavell, Green, Flavell, & Grossman, 1997; Winsler & Naglieri, 2003). Hence, there are many reasons to continue to focus on the development, use, and efficiency of verbal rehearsal in children. This work would however plainly benefit from additional clarity in terms of the definition of rehearsal and how one can confidently measure it at the level of individual participants.
1.3 The development of rehearsal: evidence from two distinct research traditions
As mentioned above, there have been two different approaches to studying rehearsal in children. Each will be reviewed below.
1.3.1 The development of rehearsal as a mnemonic strategy
Flavell and his colleagues are credited with launching the study of verbal rehearsal in children. In an ingenious pair of studies, they observed whether children would show signs of rehearsing in a memory task using familiar nameable pictures and a nonverbal response. In the first study (Flavell et al., 1966), kindergarteners, second-graders, and fifth-graders performed a serial recognition task. The experimenter pointed to a subset of seven pictures (two or more), and the children then had to point to the same pictures on another array (with a different layout) in the correct order. The children wore a space helmet with a translucent visor so that they could not see the pictures during the delay between presentation and recall. An adult trained at lip-reading meticulously observed any signs of overt or semicovert verbalisation by the children. Results supported the presumed increase in strategic behaviour with age, as kindergarteners were less likely than older children to rehearse picture names. Following the task, the children were also asked about what they had done to remember; there was a very strong correspondence between observed and reported behaviours, which were consistent for 45 of the 60 children. Of the 15 for whom the observed and reported behaviours were inconsistent, 8 were observed to rehearse rather infrequently, but did not unambiguously report they had. The other 7 children, who were not observed to rehearse yet reported having done so, provided retrospections that were very detailed and that the researchers found “very convincing” (Flavell et al., 1966, p. 293).
The follow-up study by Keeney, Cannizzo, and Flavell (1967) used a very similar experimental design to look at the relationship between strategy use and how much children could remember by contrasting two groups of first-graders who were either rehearsal producers or nonproducers. Results indicated that the children who spontaneously resorted to a rehearsal strategy initially outperformed age-mates who did not show evidence of using verbal rehearsal. The nonproducing children were nonetheless easily trained to rehearse, and this had a positive impact on their memory performance, which increased to levels comparable to that of the group of spontaneous rehearsers. Nevertheless, on subsequent free trials, most nonproducers stopped rehearsing whilst none of the spontaneous rehearsers did so. These studies provided support for the presence of a production deficiency in kindergarteners and many first-graders, as these children could improve their memory performance when shown a strategy although they failed to use one spontaneously (Flavell et al., 1970). Moreover, they suggested a quantitative increase in strategic behaviour with age. Ornstein, Naus and a generation of their students pursued the study of rehearsal. Their original contribution was to look at the content of rehearsal sets, or the difference in rehearsal styles. This work was to produce a shift in how the development of rehearsal was construed, with an increased emphasis on qualitative changes (Ornstein et al., 1988; Ornstein et al., 1975). To be fair, Flavell had envisioned that producing a strategy “is not an all or nothing affair” and “that there can be a whole gamut of intermediaries between a smooth and flawless execution of some mediational response pattern and no attempt to execute it at all” (1970, p. 199). However, it was this new line of work that would confirm Flavell’s intuition.
In the first of a long series of experiments, Ornstein, Naus, and Liberty (1975) used free verbal recall (i.e., in any order) of unrelated words (Exp. 1) and instructed children from grades 3, 6, and 8 to rehearse aloud (overt rehearsal procedure). Printed words were presented one at a time for 5 seconds, which provided the children with time to rehearse between item presentations. As expected, recall increased over trials. Collapsed across trials, the primacy effect increased with age and proved significant only for the eldest groups. A significant age effect attributable to the primacy section of the serial position curve distinguished the three age groups. Most important, rehearsal frequency did not differ between grades. In addition, the number of times an item was rehearsed was apparently not the critical determinant of recall performance. Rather, the developmental differences were manifest in terms of rehearsal set contents. Whereas grade 8 children showed an active pattern of rehearsal, the typical grade 3 participant adopted a more limited rehearsal strategy. Older children usually combined several items into each rehearsal set, whereas younger participants tended to rehearse the current word either alone or in combination with only one other item. As a result, the number of different items rehearsed together per rehearsal increased with age. These results suggested that developmental progress was not only about how much one rehearses, but also how one goes about doing so. Similar conclusions were drawn from a study by Cuvo (1975), who found that adults and eighth-graders differed from fifth-graders in terms of both the size of rehearsal sets and the number of words which appeared in different rehearsal sets. In addition, Kunzinger (1985) confirmed these cross-sectional patterns in a short-term longitudinal study. A more complex active rehearsal pattern seemed to contribute to increased recall with age.
Consistent with this relationship, training studies demonstrated that whether or not children saw improvements in their memory performance depended on the strategy they were trained or instructed to use and their age, which fit with the results of earlier research that had trained children to use either labelling or rehearsal strategies (e.g., Hagen & Kingsley, 1968; Kingsley & Hagen, 1969). For instance, Ornstein, Naus, and Miller (1977) found that recall performance varied depending on how sixth-graders were instructed to rehearse aloud. Children who rehearsed each item in isolation (i.e., simulating the behaviour of younger children) recalled significantly less than those who were told to use active cumulative rehearsal. In parallel, Naus, Ornstein, and Aivano (1977) found that training 8-year-olds to use active rehearsal improved their recall, although subtle differences in rehearsal style and performance remained in comparison to spontaneously strategic 12-year-olds. These results were replicated using modified task parameters (e.g., oral vs. written presentation, articulatory suppression) and different age groups (Cox et al., 1989; Henry, 1991a; Naus et al., 1977; Ornstein, Naus, & Miller, 1977; Ornstein, Naus, & Stone, 1977). Finally, although it is possible that the methods used underestimated the occurrence of rehearsal, in all likelihood exclusive reliance on overt behaviours disproportionately disadvantaged the older children, which would not compromise the observed age trends. There is in fact ample evidence that older children tend to decrease their overt use of verbal mediation, including rehearsal (Hulme et al., 1986; Justice, Baker-Ward, Gupta, & Jannings, 1997; Winsler & Naglieri, 2003) in favour of covert strategies, and this observed progression is substantiated by self-reports (McGilly & Siegler, 1989, 1990; Winsler & Naglieri, 2003).
As a result of this accumulated research, it was generally accepted within the memory development research community that the use of verbal rehearsal developed both quantitatively and qualitatively from preschool to middle school. With development, rehearsal strategies evolved from simple labelling, to rehearsal of one or two items at a time, to multiple-item repetitions and, finally, to true cumulative rehearsal. Other developmental changes expected to occur with age included the tendency to use rehearsal spontaneously and consistently as well as a progressive shift from overt to covert behaviours. In addition, rehearsal of a given type was deemed to become gradually more efficient as a result of practice and of related developments in lexical processing and production. Children younger than about 7 years of age were considered to be nonrehearsers, whereas sophisticated rehearsal strategies such as chunking or cumulative rehearsal were not expected to emerge before the age of 10 or so. These changes were presumed to be related to developments in basic capacities (e.g., processing speed, attention), content and procedural knowledge, and metamemory (i.e., knowledge about tasks, strategies, and oneself). This developmental progression documented for rehearsal also matched that of other memory strategies (e.g., semantic organisation).

Researchers coming from the adult memory models tradition then began to apply the working memory model to children of various ages. Their data led to a very different account regarding the development of rehearsal.

1.3.2 The application of the working memory model to children: New questions about rehearsal

Throughout the 1980s a succession of research reports appeared that consisted of ‘developmental applications’ of the working memory model using the same methods that had proven informative with adults. 
Some results indicated that children as young as 3 or 4 years presented with either word-length or phonological-similarity effects, at least in some experimental conditions. To briefly reiterate, for those operating according to the premises of the working memory model, the presence of a phonological-similarity effect was deemed to reflect the operation of the phonological loop and the process of phonological recoding of visually-presented stimuli (i.e., printed words, letters or nameable pictures). Also, the word-length effect was considered to be an indicator of rehearsal. Charles Hulme set things in motion with a series of studies that looked at the relationship between speech rate and recall, and that also explored verbal encoding of nameable picture stimuli (Hulme, 1984; Hulme et al., 1986; Hulme et al., 1984). For example, using an immediate serial recall task where participants had to remember lists of words in the order they had heard them, Hulme et al. (1984) found that speech rate and recall were related and that all age groups tested (4-, 7-, 10-year-olds, and adults) remembered fewer long words compared to short words. Increases in rehearsal were assumed to mediate the relationship between speech rate and recall. These results led the authors to favour a causal interpretation: “speech rate is seen as a measure of rehearsal rate, and it is increases in speech rate which led to the observed increase in memory span during development” (Hulme et al., 1984, p. 251). Thus, participants in all age groups were assumed to be rehearsing covertly, with older groups doing so progressively more efficiently. However, rehearsal was never measured in any way. In addition, there was no mention that rehearsal in four-year-olds was a surprising finding, nor any reference to prior studies which had studied rehearsal via other means. In a subsequent article, Hulme et al. (1986) addressed the rehearsal question head on. 
They specified that they did not claim that young children used active rehearsal, but rather that even 4- to 5-year-olds showed a tendency to convert nameable pictures to speech code and to name aloud, which they substantiated by observing their spoken and subvocal activity while they were performing a serial memory for pictures task (Experiment 2). Hence, this position did not truly conflict with the memory strategy literature, which was now acknowledged explicitly. At about the same time, another group of collaborators also tackled the issues of verbal encoding and rehearsal by young children in short-term memory tasks. To summarise succinctly, Hitch, Halliday, and their collaborators went from a stronger position stating that 4-year-olds could use subvocal rehearsal (Hitch, Halliday, Dodd et al., 1989) to a much more tempered opinion (e.g., Hitch et al., 1991). Again, conclusions were based on indirect evidence stemming from manipulating either the length or the similarity of the to-be-remembered words or by attempting to make rehearsal more difficult by having participants repeat an irrelevant word while they completed the task (i.e., perform articulatory suppression). To be fair, the authors did acknowledge from the outset that the younger children could have been resorting to a less complex form of rehearsal. In fact, in various writings around this period, Halliday and Hitch expressed a measure of doubt that their findings could be interpreted as reflecting sophisticated forms of subvocal rehearsal in very young children (Halliday & Hitch, 1988; Hitch & Halliday, 1983; Hitch, Halliday, Dodd et al., 1989). Finally, in a later article, Hitch and his collaborators no longer used the term rehearsal, preferring ‘inner speech’ which they defined as “activation of phonological representations in an internal articulatory loop (Baddeley & Hitch, 1974; Baddeley, 1986)” (Hitch et al., 1991, p. 231). 
This was distinguished from repetitive strategic rehearsal, which was presumed to develop somewhat later. Much of this work focused on the presence of a word-length effect, which, at the time, was presumed to be the best indicator of subvocal rehearsal. The key findings used to support the presence of rehearsal in children of different ages were better memory performance for short words compared to long words and a reliable linear relationship between speech rate and recall using group averages for different ages and word-lengths (e.g., Hitch, Halliday, Dodd et al., 1989; Hulme et al., 1984). In addition, other studies focused on the presence of phonological-similarity effects. Sensitivity to phonological similarity was taken as an indicator that the storage component of the phonological loop was involved. In addition, developments in rehearsal were presumed to contribute to the emergence of the phonological-similarity effect. Many studies found that both the phonological-similarity and word-length effects emerged very early (around the age of 4 years) with the presentation of spoken words and verbal recall, but later on (somewhere between the ages of 5 and 8) for visual presentation of nameable pictures. These results were interpreted as evidence that younger children did not tend to use the rehearsal process to recode the pictures into phonological form (i.e., to label them), but rather relied on the visual stimuli to remember. In contrast, with auditory presentation, the spoken words gained direct access to the phonological loop and led to subvocal rehearsal in children as young as 4 years, the youngest age tested (e.g., Hulme, 1987; Hulme et al., 1984; Hulme & Tordoff, 1989). Finally, manipulation of the visual similarity of stimuli was used as complementary evidence. Whereas the memory of young children was negatively impacted when visually-presented stimuli looked alike, this was much less so for older children (e.g., R. M. 
Brown, 1977; Hayes & Schulze, 1977; Hitch, Halliday, Schaafstal, & Schraagen, 1988; Hitch, Woodin, & Baker, 1989). Taken together, these facts were used to support the view that young children were more tied to the perceptual characteristics of the to-be-remembered items and that they did not tend to recode nameable stimuli into words.  Writing at that time, Henry (1991a) summarised the state of the situation and the apparently conflicting data regarding the development of rehearsal in children: One major approach, the working memory model, assumes that even young children use rehearsal with auditory presentation, and the developmental increase in span results from faster rehearsal with age. However, this model had difficulty in accounting for the different developmental patterns in the use of rehearsal with auditory as opposed to visual presentation. Another, more traditional, approach views rehearsal as a strategy which develops later and which gradually improves memory span with age. However, most of this evidence is based on studies of visual and not auditory span, and it is auditory span around which the debate over the use of rehearsal centres. (p. 494) Hence, as Henry underscored, the inconsistencies in terms of the presumed developmental course of rehearsal centred around auditory-verbal tasks, whereas there was more harmony between the two research traditions when it came to memory for visually-presented nameable stimuli. Some researchers had apparently been initially unaware of the accumulated evidence and theory from the extensive body of research regarding memory strategy development that proposed a protracted developmental course for strategic rehearsal accompanied by a progressive internalisation. Consequently, they did not seem to find their results suggestive of subvocal rehearsal in preschoolers extremely surprising. 
On the other hand, others were clearly aware of the influential work of Flavell and of other researchers in the memory development tradition and, like Henry, acknowledged the conundrum. Although they were confident in the data emerging from their own research, they were not comfortable in claiming that such young children could be resorting to adult-like subvocal verbal rehearsal. In addition, they wondered how there could be such an important developmental lag in terms of learning to use this strategy depending on whether the stimuli were spoken or presented in visual form. This tension between the data and their interpretation apparently bothered Hitch and Halliday as early as 1983. They elaborated on their concerns a few years later: “Inevitably some doubts must remain... First, while our results... strongly suggest that, with spoken presentation, children as young as 4 or 5 are using the articulatory loop to rehearse, we have no direct evidence ... to show that they do. Secondly, and more seriously, it is important to note that rehearsal, of the sort used by adults, is a fairly complex procedure involving as it does serial processing and grouping items during their presentation... It is one thing to suggest that young children may verbally rehearse spoken words but not pictures, it is quite another to claim that the form of this rehearsal process is effectively mature at the age of 4... It passes belief that the dissociation between the systems involved in spoken and pictorial presentation is so complete that rehearsal strategies, which are available in something like the adult form when the input is spoken, need to be relearnt over a period of years when the input is in another modality” (Halliday & Hitch, 1988, p. 213). This led these authors to suggest that the rehearsal system associated with the phonological loop and conscious rehearsal could be distinct. 
The rehearsal loop was seen as being “a more primitive and essentially automatic system” whereas “the conscious strategic process of subvocal rehearsal would then be seen as being superimposed on this more primitive system and might be thought to use different resources” (Halliday & Hitch, 1988, p. 214). Hence, Halliday and Hitch were proposing that rehearsal as conceptualised in the working memory model perhaps corresponded to a different phenomenon from the well-studied mnemonic strategy. This work came full circle in a 1994 article in which Gathercole, Adams, and Hitch concluded that 4-year-olds did not use strategic subvocal rehearsal in short-term memory tasks. Their study used an individual differences approach, looking at correlations between speed of articulation and memory span in 4-year-olds using the data from individual children. They reasoned that “if young children do rehearse in auditory short-term memory tasks, those with faster rates of explicit articulation should have greater memory spans, because their rapid rehearsal will result in less memory representation loss from the phonological store due to decay” (Gathercole, Adams, & Hitch, 1994, p. 202). No relationship between these two variables was found for this age group regardless of whether digits or early-acquired concrete nouns were used as stimuli. When the same materials were used with college students, the expected positive relationship between articulation rate and recall emerged. Results similar to those for the 4-year-olds were obtained in another study of slightly older children (aged 5 years), who also showed no relationships between any of the word spans and speech rate, although there was a significant word-length effect with poorer recall for lists of long (three-syllable) compared to short (one-syllable) words (Gathercole & Adams, 1994). 
These studies led to growing scepticism regarding whether correlations between speech rates and memory span could be used as reliable indicators of rehearsal, particularly in young children. Firstly, the relationship was not as obvious when the correlations were based on individual data rather than group means (e.g., Gathercole et al., 1994; Henry, 1994). Secondly, because in most cases the speech rate tasks actually involved a memory component, measures of speech rate using multiple syllables or words may have been “contaminated by memory load, especially in young children” (A. N. Ferguson, Bowey, & Tilley, 2002, p. 151), which could at least partly explain why memory span and speech rate were correlated. Thirdly, the attribution of the word-length effect exclusively to speed of articulation and hence rehearsal rate differences was soon questioned both in the adult and the developmental literatures (Cowan et al., 1992; Gathercole et al., 1994; Henry, 1991b; Jarrold, Baddeley, & Hewes, 2000; Lovatt & Avons, 2001; Lovatt, Avons, & Masterson, 2000; Yuzawa, 2001). Other candidates presumed to contribute to a word-length effect included verbal output effects, such as decay during output and output planning requirements. It is now generally accepted that output effects alone can produce a significant word-length effect in nonrehearsing participants (for review of the evidence and other methodological considerations, see Jarrold, Cocksey, & Dockerill, 2008). There were in fact signs in these applications of the working memory model to young children that could lead one to question whether they were resorting to an active complex rehearsal strategy. Indeed, many studies have found that it was difficult to train young children to use overt forms of more advanced rehearsal (or something similar), particularly in immediate memory tasks with fast presentation rates. 
For example, Cowan and colleagues (Cowan, Saults, Winterowd, & Sherk, 1991, footnote 1) reported that they had been only ‘moderately successful’ in training 4- and 5-year-olds to cumulatively rehearse a spoken list, and that this seemed extremely demanding for these young children. As a result, they opted for single-item naming, multiple-item naming, or cumulative list repetitions in their study, thereby removing the memory component from the strategy implementation. Similarly, Johnston and Conning (1990) had difficulty training 5-year-olds to use overt or covert cumulative list repetitions, even with pictures visible throughout, and instead had to settle for only partial cumulative repetitions or multiple-item repetitions. Henry (1991a) also reported that 5-year-olds had difficulty learning to use a grouped rehearsal strategy, and that they were not able to learn to do so covertly. Even silently mouthing proved difficult for many young children. For instance, Hitch et al. (1991) reported that many 6-year-olds had difficulty following instructions when they were asked to silently mouth item names, which leaves one doubting whether they were spontaneously able to do so covertly. Finally, some studies have reported that young school-aged children (kindergarteners in particular) had difficulty rapidly repeating word triples in tasks used to assess their speech rate (A. N. Ferguson et al., 2002), which raises doubts regarding their ability to resort to cumulative rehearsal.  These reports fit with much accumulated evidence from studies in both the memory strategies and the developmental applications of short-term memory models traditions that have attempted to train or support young children to use various forms of rehearsal or serial repetition during item presentation. Although these studies vary on many dimensions (including mode of presentation, response type, immediate vs. 
delayed recall, type of rehearsal), together they suggest that even with support and practice, it is challenging for young children (aged 4 to 7) to carry out even simpler forms of overt rehearsal (Cowan et al., 1991; R. P. Ferguson & Bray, 1976; Hayes & Rosner, 1975; Henry, 1991a; Johnston & Conning, 1990; Johnston, Johnson, & Gray, 1987; Kingsley & Hagen, 1969).

1.3.3 Stalemate or reconciliation?

One could argue that, to some extent, it is possible to reconcile the data and the interpretations from both traditions, namely the developmental applications of short-term memory models and memory strategy development research. At the most basic level, apparently contradictory results were due to different theoretical issues and research questions. On the one hand, the short-term memory theorists were using the presumed adult end-point as an all-or-nothing basis of comparison to infer what groups of children were doing. On the other hand, the memory strategy researchers came from a clear developmental perspective, seeking to document both quantitative and qualitative changes. They were also relatively more focused on the individual than the group, although still interested in tracking broad developmental trends. The apparent contradictions were also in part attributable to methodological differences, with one approach relying on task manipulations to get at memory processes and the other putting more stock in observation and self-report to document task approaches and strategies. Finally, the term rehearsal has apparently been used to describe very different phenomena, from automatic activation of phonological representations to strategic and deliberate behaviour ranging from simple forms of rehearsal such as single-item labelling, to complex rehearsal strategies such as chunking or cumulative rehearsal. The two traditions have essentially continued their research somewhat independently, with only a rare few attempting to bridge the gap between them. 
In particular, the work of Lucy Henry stands out, as she has acknowledged the accumulated evidence from both perspectives, and incorporated and combined methods from both areas into her work on memory development (e.g., Henry, 1991a, 1991b; Henry et al., 2000; see section 1.5 for additional details). However, this is the exception to the rule. Given how much the traditions differ in terms of their premises and their theoretical commitments, this is hardly surprising. On the other hand, it has left the inconsistencies regarding the data and the conceptualisation of rehearsal largely unresolved. Generally, those working in the short-term memory models tradition mostly abandoned the use of the word rehearsal. Word-length and phonological-similarity effects in children have continued to be studied, but researchers have generally been more careful in their interpretations, often limiting them to the phenomenon of ‘phonological encoding’ of spoken words or ‘phonological recoding’ of nameable pictures. The main focus became documenting the impact of presentation modality, and in particular the developmental course of phonological recoding, that is, the tendency to use labels to encode and recall visually presented nameable stimuli. Developmental studies of working memory using immediate serial memory tasks have documented an evolving propensity for children to give precedence to verbal encoding for nameable stimuli regardless of presentation mode (visual or verbal; e.g., Henry et al., 2000; Hitch et al., 1991; Palmer, 2000c). Also, the importance of verbal rehearsal as a determining factor in the increase in memory capacity was progressively questioned, with greater emphasis placed on changes in speed of memory search processes (Cowan & Kail, 1996; Cowan et al., 1998), in item identification speed (Henry & Millar, 1991) and in the knowledge base. 
In particular, developmental changes in long-term memory were presumed to influence memory access and the process of reconstructing the memory traces at the time of response (i.e., redintegration; G. D. A. Brown & Hulme, 1995; Hasselhorn & Grube, 2003; Henry & Millar, 1991; Roodenrys, Hulme, & Brown, 1993). It is interesting to note that there has recently been greater emphasis on the importance of considering deliberate strategies and individual differences in the adult literature within the working memory model tradition (Campoy & Baddeley, 2008; Hanley & Bakopoulou, 2003; Logie et al., 1996; Logie, Venneri, Della Sala, Redpath, & Marshall, 2003), with Baddeley himself insisting that more attention needs to be focused on these issues (Baddeley, 2003; Baddeley & Larsen, 2007a, 2007b). In particular, Baddeley and Larsen (2007b) have recently insisted on the need “to develop adequate methods to determine strategy use that are independent of the effects being studied” (p. 514). In parallel, the memory development theorists continued to ignore the work of the other group and pursued their own evolution. In particular, they have progressively moved towards a complex model that can accommodate the interactions, both in development and in the moment, between capacity, knowledge, strategies, and metacognition, and their impact on memory and problem solving abilities more generally (Kuhn, 2000a; Pressley, 1995; Schneider & Bjorklund, 1998; Schneider & Pressley, 1997). They have also found evidence supporting a more complicated picture of strategy use and development than was previously hypothesised. This last point will be elaborated below (see section 1.5).  Despite this ongoing research in both areas, many fundamental questions remain unanswered. 
In fact, there is still no consensus regarding how one might determine whether individual children are resorting to rehearsal in immediate memory tasks, or whether rehearsal is an important determinant of developmental change in short-term memory (see, e.g., Hasselhorn & Grube, 2003; Jarrold et al., 2008). In part, the current state of affairs stems from the fact that very little research has explored within a single study whether the evidence stemming from multiple measures of rehearsal converges or not. In addition, conclusions about what groups are doing may not hold at the level of individual participants (children or adults), although studies that have attempted to verify this are rare (some notable exceptions: Campoy & Baddeley, 2008; Conrad, 1972; Logie et al., 1996; Palmer, 2000a, 2000c; Yuzawa, 2001). Finally, in many cases, the concept of rehearsal is only loosely defined, if at all. The present study aims precisely to combine and contrast different measures of rehearsal in order to establish whether individual children are rehearsing. In the process, it also addresses the definitional or conceptual issue head on. A few recent publications illustrate the continued disjunction between the two research traditions. In a review chapter written from the perspective of the short-term memory models, Ben-Yehudah and Fiez (2007) presented on equal footing two alternative hypotheses: either i) only older children engage in active rehearsal, with this change perhaps occurring as early as 4 years of age; or ii) rehearsal is always present but becomes more effective with increasing age. Of course, these two hypotheses are not mutually exclusive, and many would object to the age of 4 as a lower limit for the emergence of rehearsal. 
Based on the same body of evidence, Hitch (2006) wrote: Children ages 4 and upward are sensitive to word length and phonemic similarity of the items in just the same way [italics added] as are adults, provided stimuli are presented orally (Hulme, Thomson, Muir, & Lawrence, 1984; Hitch, Halliday, & Littler, 1993)…. Spans for words of different lengths are a linear function of speech rate in children in just the same way [italics added] as in adults (Hulme et al., 1984)…. These quantitative observations suggest that development is associated with faster subvocal rehearsal [italics added]. (p. 116)  These two examples contrast with the claim made by Bjorklund, Dukes, and Brown (2009) that the accumulated research suggests that most children spontaneously display single-item rehearsal by the age of 6 to 7 years, and cumulative rehearsal by 11 to 14 years. Of course, here the authors are referring to the data emanating from the memory strategies research tradition. Where does one go from here? One possibility, of course, would be to accept the status quo by continuing to treat these two sets of facts independently and ignoring any potential conflicts. An alternative, which has not been much explored, would be to attempt to reconcile the data and the interpretations from the memory models and the memory strategies literatures in the service of gaining further understanding of rehearsal and its development. This is precisely what this project set out to do by acknowledging the contributions of both of these traditions and building on them. Specifically, the present study combined methods from both traditions and relied on multiple indicators of rehearsal and of other strategies. The impacts of visual and phonological similarity in immediate serial memory were explored by manipulating the characteristics of the to-be-remembered items. 
This method, used to tap visual encoding, verbal encoding, and rehearsal in the memory models tradition, was augmented with proven methods from the memory strategies research, namely direct observation of strategic behaviours and children’s self-reported strategies. The next section provides an interpretive framework for exploring indices of verbal rehearsal using such converging evidence.

1.4 A theoretical framework for interpreting indices of verbal mediation in immediate serial memory in children

This section serves to highlight how phonological coding and rehearsal have been conceptualised in studies that have applied models of short-term memory to study these phenomena in a developmental perspective. In addition, theoretical accounts and experimental work from the adult literature will serve as a backdrop. Alan Baddeley’s model of working memory has been used as a framework in most research that has explored the issues of visual encoding, verbal encoding, and rehearsal by children in immediate memory tasks within a short-term memory models perspective. As such, this study is essentially contrasting rehearsal as defined in the Baddeley model with deliberate rehearsal as conceptualised in memory strategy development research. For this reason, more detail regarding the Baddeley model is presented next.

1.4.1 Visual encoding, verbal encoding, and rehearsal in Baddeley’s multiple component model of working memory

The three-component model proposed by Baddeley and Hitch (1974) consisted of the central executive and two subsidiary storage systems, the phonological loop and the visuospatial sketchpad. This model has undergone many refinements and some changes over the years (e.g., Baddeley, 1986, 2003; Baddeley & Logie, 1999), most notably with the recent addition of a fourth component, the episodic buffer (Baddeley, 2000). As we have previously discussed, the phonological loop is based on sound and language. 
Specifically: the model of the phonological loop comprises a phonological store, which can hold memory traces for a few seconds before they fade, and an articulatory rehearsal process that is analogous to subvocal speech. Memory traces can be refreshed by being retrieved and re-articulated (Baddeley, 2003, p. 830). Notice that Baddeley used the term articulatory rehearsal. Subvocal articulation is presumed to take place in real time, which imposes a limit on rehearsal, and hence on memory span. Items can only be recalled correctly if there is sufficient time for rehearsal before early items have begun to fade. The memory traces themselves provide a means to recall a minimum amount of information without invoking the articulatory rehearsal process. This has been demonstrated in adults by using techniques to interfere with the rehearsal process, most notably articulatory suppression (e.g., Cowan et al., 1987; Logie, Della Sala, Wynn, & Baddeley, 2000). Access to the phonological loop depends on the modality of presentation, with auditory stimuli gaining automatic and direct access whereas visual stimuli (i.e., nameable pictures, written letters, written words) must first undergo recoding via the process of (sub)vocalisation. Hence, the rehearsal mechanism is considered to play a role of both recoding and maintenance. Participants are assumed to resort to verbal recoding as this form is deemed best suited for the encoding and retrieval of serial order (Baddeley, 2000; Penney, 1989). Also, verbal recoding is a much-practiced activity among readers. In parallel, visual stimuli gain direct access to the visuospatial sketchpad, which is presumed to handle the retention and manipulation of visual patterns (as well as spatial movements). Logie (Baddeley & Logie, 1999; Logie, 1995) proposed that the sketchpad may be further fractionated into a storage component (the visual cache) and a retrieval and rehearsal process (the inner scribe). 
Very little is yet known about this nonverbal maintenance process (Baddeley, 1996; Pickering, 2001). Finally, via visual imagery, auditorily presented stimuli can gain access to the sketchpad. Visual memory is limited in capacity and subject to the detrimental effects of visual similarity among items, particularly (but not exclusively) for stimuli or conditions that make it difficult to rely on a verbal code (Hitch et al., 1988; Hitch, Woodin et al., 1989; Logie et al., 2000; Poirier, Saint-Aubin, Musselwhite, Mohanadas, & Mahammed, 2007). The central executive is a control system of limited attentional capacity which serves “to focus, to divide, and to switch attention” (Baddeley, 2003, p. 835). It is also presumed to control encoding and retrieval strategies (Baddeley & Logie, 1999). The episodic buffer was added specifically to perform the function of linking with long-term memory as well as between the components of working memory (Baddeley, 2000). It is “assumed to be a limited capacity store that binds together information to form integrated episodes… to be attentionally controlled by the executive and to be accessible to conscious awareness” (Baddeley, 2003, p. 836). Baddeley (2003) adds that it can be conceived of either as a fourth component of the model, or as the storage component of the executive. Most work done within the framework of this multi-component working memory model has used some form of immediate serial recall. Verbal tasks aimed at exploring the functioning of the phonological loop have generally used digits, letters, or unrelated words as the to-be-remembered stimuli, presented either auditorily or visually (i.e., pictures of familiar nameable objects or written letters). 
Crucially, the nature of the code and the use of a verbal rehearsal mechanism are inferred based on task manipulations, with “the characteristics of the material remembered being used to give an indication of the nature of the code on which the recall is based” (Baddeley, 2003, p. 830). This model continues to be extremely influential, although not universally accepted (for other views see e.g., Jones, Hughes, & Macken, 2006; individual chapters in Miyake & Shah, 1999; Nairne, 2002). Nonetheless, most of the research exploring phonological (re)coding and rehearsal in children has been done within this theoretical framework. Some aspects of the Baddeley model which are particularly important for my purposes remain unclear. First, it is uncertain whether the articulatory rehearsal process is assumed to be deliberate and accessible to consciousness, although Baddeley did recently write that “within the phonological loop, at least, rehearsal is a covert activity under strategic control [italics added]” (Baddeley, 2007, p. 36). In the latest instantiation of the model, the control of encoding and retrieval strategies is the purview of the central executive. Is the subvocal rehearsal process as conceived of within the phonological component of the model distinct from these strategies? Additionally, the content of the episodic buffer is purportedly accessible to conscious awareness. What does this mean with regard to the process of articulatory rehearsal? Baddeley has recently proposed that there are likely multiple forms of attentional rehearsal, including visual and semantic forms: “we suspect that subvocal articulation is a somewhat atypical form of rehearsal, possible only because the phonological material retained in most studies can be mapped directly onto a familiar spoken response such as a digit. It seems likely that rehearsal in visual, semantic, and other systems may reflect a more general process of attentional activation and reactivation. 
There is good evidence to suggest that such attentionally based rehearsal is available for verbal material as well as for visual and semantic information.” (Baddeley & Larsen, 2007b, p. 514). The existence of these additional forms of rehearsal could in part explain how participants are able to perform with some success on tasks that make articulatory rehearsal particularly challenging. On the other hand, it remains underspecified how articulatory rehearsal is distinct from these forms of attentional rehearsal. Logie et al. (1996) completed one of the few studies with cognitively intact adults which looked at word-length and phonological-similarity effects at the individual level. They also used self-reported strategies to interpret their results. As this study was clearly realised within the framework of the Baddeley model, this would suggest that the researchers considered cumulative rehearsal and chunking as equivalent to (or indicative of) the process of articulatory rehearsal as conceived of within the phonological loop. Then again, Gupta and MacWhinney (1997) proposed a model that builds on Baddeley’s and that integrates vocabulary acquisition and verbal short-term memory. In their writings, Gupta and MacWhinney distinguished cumulative rehearsal from chunking, describing them as optional strategies, which were presumably available to consciousness. Whether or not this is equivalent to articulatory rehearsal remains unclear. Baddeley’s model, which refers explicitly to subvocal articulatory rehearsal, was developed for adults and then applied later on to children. As a result, another area of uncertainty surrounds whether this process must necessarily be subvocal, or whether this characterisation is simply a function of the tendency for adults to internalise their strategies. 
Baddeley has, on occasion, suggested that the articulatory rehearsal process in this model could be overt: “The rehearsal system is assumed to involve some form of subvocal articulation, which revives the memory trace…. The process of rehearsal does not need to be overt [italics added], since even patients who have lost the capacity to articulate as a result of a peripheral lesion may still show all the signs of subvocal rehearsal, including the word length effect.” (Baddeley, 1996, p. 13469). In some very recent writings referring to his model he wrote: “Rehearsal was assumed to depend on either overt or covert vocalization” (Baddeley, 2007, p. 7). For all these reasons, it is unclear to what extent Baddeley’s conception of subvocal rehearsal corresponds to strategic and intentional verbal rehearsal (overt, semi-covert, or covert) as it has been defined in the memory development literature. There is also some debate as to whether the phonological-similarity effect can be used as an indicator of rehearsal.

1.4.2 The phonological-similarity effect as an indicator of rehearsal

Despite over 40 years of intense study, there is apparently no consensus on how immediate serial memory works in adults, and there are numerous competing theoretical accounts that continue to be researched and discussed. The focus here will be on interpretations of the phonological-similarity effect and its link to rehearsal. To briefly reiterate, the phonological-similarity effect corresponds to the phenomenon that sequences of stimuli (i.e., words, picture labels, letters) that sound alike (e.g., cat, rat, mat) are generally more difficult to remember in order than sequences where the stimuli are more distinct-sounding (e.g., girl, fish, spoon). Conrad (1963, 1964; Conrad & Hull, 1964), who first documented this phenomenon, initially used the term acoustic similarity. 
When Baddeley and his colleagues went on to use the phonological-similarity effect paradigm to test their model, they opted for the term phonological similarity to reflect the language-based nature of the effect. As Baddeley recently recounted: “We initially chose phonemic, but given that this had a rather precise linguistic meaning, opted instead for phonological, which we hoped would be relatively neutral with respect to the exact nature of the code, apart from indicating that it was capable of encoding the spoken features of language.” (Baddeley & Larsen, 2007a, p. 498). All three terms have been used in the literature over the years to refer to the same phenomenon. Some choose a term on theoretical grounds (see e.g., Jones et al., 2006), whereas many seem to opt for the most common term at the time, which presently is ‘phonological-similarity effect’. This current appellation was preferred in the present study because it fits with the general framework of exploring how language (not simply any sounds) may support thinking. Clearly, although participants are vocalising what to them are meaningful strings of sounds (at least one can presume that is what they are doing), this does not in itself tell us anything about the exact nature of the code, which even Baddeley concedes “remains an important but open question” (Baddeley & Larsen, 2007a, p. 498).

1.4.2.1 How does the phonological-similarity effect emerge?

Current models of short-term memory account for the phonological-similarity effect in terms of inter-item confusions. Where they differ is in whether they consider these confusions to take place at encoding or during maintenance (e.g., Baddeley, 1986; Gupta & MacWhinney, 1997; Lewandowsky & Farrell, 2008), or at the time of recall of the individual items (see e.g., Burgess & Hitch, 2006; Henson, 1998; Saint-Aubin & Poirier, 2000). Also, the extent to which rehearsal is presumed to play a role in the phonological-similarity effect is not always clear. 
Generally, the phonological-similarity effect in immediate serial recall has a particular impact on order memory rather than on item memory. This remains true even when large word pools are used (‘open sets’) which would presumably make item guessing more difficult as item repetitions across lists are rare (or absent) compared to when small ‘closed sets’ are used (Coltheart, 1993; Gupta, Lipinski, & Aktunc, 2005; Nimmo & Roodenrys, 2004). There are multiple interpretations for why order is particularly impacted (e.g., Burgess & Hitch, 2006; Saint-Aubin & Poirier, 2000). In fact, competing hypotheses have been proposed for understanding the phonological-similarity effect going back to the 1960s, as this phenomenon has been considered key for developing a framework for understanding working memory (e.g., Baddeley, 1968; Baddeley & Larsen, 2007a; Fallon, Groves, & Tehan, 1999; Gupta et al., 2005; Jones et al., 2006; Lian & Karlsen, 2004; Nairne & Kelley, 1999). Despite this long-standing interest, no consensus has yet emerged regarding exactly what lies behind this much-studied phenomenon. Nonetheless, thanks to some valiant attempts at reconciling the existing data (see in particular Gupta et al., 2005) it is possible to draw a portrait of how a phonological-similarity effect is likely to emerge. Part of the effect may take place at encoding or storage. Words are presumed to be encoded phonologically (i.e., in terms of a speech-based code), although other representations (i.e., visual or semantic) are also activated. Phonological similarity impacts the extent to which these phonological traces can be easily discriminated. This corresponds to what Conrad (1964) referred to as additional noise in the system resulting from similarity. 
According to retrieval-based models, “manipulations that increase similarity among phonological traces make it more difficult to match them to long-term memory representations during redintegration” (Fallon, Mak, Tehan, & Daly, 2005, p. 350). Hence, at time of recall, poorer traces of phonologically-similar items make it more difficult to successfully reconstruct them compared to dissimilar items, as similar codes have less distinct features. This particularly impairs position accuracy or recall of order for the phonologically-similar lists (see Saint-Aubin & Poirier, 2000). Rehearsal can be conceptualised as a series of retrievals and reencodings (Jonides et al., 2008) or equated with partial or total preresponse cycles. In fact, in the latest instantiation of their network model of the phonological loop, Burgess and Hitch simulated rehearsal as “multiple output cycles” (2006, p. 631). If one rehearses, this can maintain the quality of the trace for dissimilar lists (i.e., fight decay) and also will translate visual stimuli into a phonological form. However, this may compound the confusion for phonologically-similar words because of “increased cross-talk in phonemic feedback” (Burgess & Hitch, 2006, p. 631) when retrieving the phonemic composition of words. This in turn increases substitution errors during recall. Because errors are more likely during rehearsal of similar lists, “when an older subject rehearses cumulatively, reactivation of the correct sequence in memory ends if and when the subject introduces an error into the rehearsed sequence or cannot keep up with the list, and this should provide more repetitions for dissimilar lists” (Cowan et al., 1991, p. 41). Jones et al. (2006) suggest that rehearsal produces recall errors “akin to Spoonerisms in which elements of items in the to-be-recalled sequence are transposed” (p. 277). 
So these opposite effects (improved recall for dissimilar lists, worse recall for similar lists) should increase the size of the phonological-similarity effect. For non-rehearsing individuals who are relying on auditory perceptual traces, the initial difference in terms of the distinctiveness of the traces would still be present between phonologically-similar and dissimilar lists, and would impact redintegration. This could explain why a small phonological-similarity effect appears in young non-rehearsing children (Cowan et al., 1991; Hulme & Tordoff, 1989) or persists when adults are prevented from rehearsing (Baddeley & Larsen, 2007a; Jones et al., 2006). However, if participants are resorting to some other strategy (i.e., a semantic or a visual strategy), it is possible that both types of lists will be recalled to a similar extent. Multiple forms of encoding or activation allow for different strategies to be used, in particular when the phonological route is made more difficult or less efficient (see, e.g., Baddeley & Larsen, 2007b; Campoy & Baddeley, 2008; Hanley & Bakopoulou, 2003; Logie et al., 1996). Hence, what is clear is that there are likely multiple compounding effects or additive sources that lead to a phonological-similarity effect. Ben-Yehudah and Fiez (2007) summarised three different possible sources: i) phonological confusions within the passive buffer; ii) confusions between articulatory plans during active rehearsal; and, iii) incorrect reconstruction of (confusable) decaying representations from long-term memory (i.e., the redintegration process). All three sources obviously have to do with how information is stored and represented. Nonetheless, given the focus of the current project on using the phonological-similarity effect as an indicator of rehearsal, it is crucial to note that these three sources are not mutually exclusive, and that their relative importance likely depends on whether someone is resorting to verbal rehearsal. 
When rehearsal is made more difficult by having adults perform articulatory suppression or listen to irrelevant speech, the phonological-similarity effect is considerably reduced at the group level with auditory presentation (Baddeley & Larsen, 2007a; Cowan et al., 1987; Jones et al., 2006), and is no longer present with visual presentation (i.e., written words or letters; Baddeley & Larsen, 2007a; Coltheart, 1993; Hanley & Bakopoulou, 2003). In fact, some have argued on both empirical and theoretical grounds that rehearsal is the main contributor to the phonological-similarity effect in adults. For instance, Jones et al. (2006) claimed that: “…the PSE observed when participants are free to rehearse (with both visual and auditory input) is located in the articulatory rehearsal process itself, not within a separable, passive, store that is fed by that rehearsal process” (pp. 267-268). They went on to add that “when rehearsal is permitted, the chief way by which the PSE arises is through the very act of rehearsal itself” (p. 277). Cowan, Cartwright, Winterowd, and Sherk (1987) have proposed that there could be two additive sources for the phonological-similarity effect: "there may be two separate components of the similarity effect, only one of which depends upon rehearsal. The other component would be acoustic or phonemic confusion within a passive store that persists for a time even without rehearsal" (p. 512). They tested this hypothesis by combining the phonological-similarity paradigm and various forms of articulatory suppression in a study with college students. The authors hypothesised that, by blocking rehearsal, articulatory suppression would reduce but not necessarily eliminate the phonological-similarity effect. Under full suppression conditions (Experiments 1 and 3), both suppression and phonological similarity seemed to exert independent effects. 
Suppression interfered (i.e., reduced recall) more for the dissimilar lists, and this reduced the size of the phonological-similarity effect. Cowan and his colleagues interpreted these findings as evidence that rehearsal is particularly helpful for the recall of dissimilar lists. In fact, under suppression, the performance of the college students resembled that of 5-year-olds in an earlier study by Hulme (1984) which had used the exact same method. In another experiment (Experiment 2), the participants were instructed to use covert single-item repetition, which can be conceived of as a more 'primitive' form of rehearsal. This resulted in relatively less decrement in performance and more comparable effects for both similar and dissimilar lists. Hence, single-item labelling may be sufficient to cause a phonological-similarity effect, at least in adults. When the various experiments were taken together and based on prior research with children, Cowan et al. (1987) concluded that: "the results suggest that articulatory suppression reduces the magnitude of the phonetic similarity effect by blocking a covert, speech-related process that requires at most a limited amount of effort in adults and can be successfully carried out in the intermittent intervals between list-item presentations. Covert, cumulative rehearsal may be only one of several candidate processes fitting this description; however, it is a candidate that is made more appealing by the successful induction, using rehearsal training, of the similarity effect in young children" (p. 516). The data also supported the hypothesis regarding two additive sources for the phonological-similarity effect. Cowan and his collaborators went on to propose that changes in how the rehearsal strategy is implemented, as well as enhanced representations which "could lead to an increase in confusions in memory stemming from phoneme identities among words" (Cowan et al., 1987, p. 516) could play a role in development. 
Hence, their study provides support for the hypothesis that there are two bases for the phonological-similarity effect: rehearsal, and acoustic confusion when the representations are sufficiently strong. The second factor could explain the influence of phonological similarity in groups of young non-rehearsing children in experiments using auditory presentation. Both of these factors could exert more influence with development as the rehearsal process becomes more stable, and as children's representations become ‘enhanced’. Additional support for the contribution of rehearsal to the phonological-similarity effect comes from reciprocal evidence in terms of what transpires when adults either are instructed not to rehearse or choose not to do so. A few studies suggest that when adults are instructed to use a semantic strategy (e.g., make up a sentence with the list of words), the phonological-similarity effect is extremely reduced to the point of being nonsignificant at the level of aggregate group data (Campoy & Baddeley, 2008; Hanley & Bakopoulou, 2003). In addition, when self-report was combined with the phonological-similarity paradigm and adults were left free to do what they wanted, the phonological-similarity effect was reduced or absent for those who said they did not use a simple rehearsal strategy (Hanley & Bakopoulou, 2003; Logie et al., 1996). Logie et al. (1996) completed a large-scale study with adults that looked at both group-aggregate and subject-wise data for the presence of a phonological-similarity effect, and also collected self-report data. Their results indicated that despite a strong and significant phonological-similarity effect overall in an immediate serial verbal recall task, for a substantial proportion of participants (2% with presentation of spoken words and 10% with written words) recall was not negatively impacted by phonological similarity. 
The adults reported using many different strategies which included advanced forms of rehearsal (e.g., chunking), semantic elaboration, and visual imagery. Interestingly, 32% of those who were interviewed reported more than one strategy. Although less than half (41%) reported using a rehearsal strategy, if in all cases the responses of those who reported a mixed strategy are assumed to have included rehearsal (although this was unfortunately not reported in the study), the proportion could have been as high as 73%. Interestingly, “several subjects reported switching from one strategy to another within trials and between trials as well as between types of list and presentation modalities” (p. 316). Hence, task characteristics apparently influenced strategy choices. Finally, as a group, those who reported a rehearsal strategy presented with a significantly larger mean phonological-similarity effect than those who did not do so, even when controlling for span. The authors concluded that the magnitude of the phonological-similarity effect was “heavily reliant on the consistent use of a verbal rehearsal strategy” (p. 315). In a more recent study, Hanley and Bakopoulou (2003) convincingly demonstrated that whether or not a phonological-similarity effect is present with adult participants depends on whether or not they are instructed to rehearse. Adults were presented fairly long lists (i.e., seven items) of uppercase letters and asked to recall them in order by writing their responses. Lists consisted of either phonologically-similar or dissimilar letters. For half the trials, the participants also had to ignore irrelevant speech (i.e., a passage read aloud) playing in the background. Three groups of participants received different instructions. One group was instructed to rehearse, as participants were told “to remember the letters by repeating them subvocally until they had to be recalled” (p. 440). 
Another group was told to use a semantic strategy; in this case, the participants were told to turn each letter into a word, then create a sentence from the words generated, and to use the sentence to recall the letters in the lists in the correct order. Finally, a control group was given no instructions. The results were extremely revealing, as the patterns of results were very different for the three groups. The ‘rehearsal’ group showed a strong and significant phonological-similarity effect both in quiet and in the irrelevant speech condition. In contrast, the ‘semantic’ group showed no effect of phonological similarity in either condition, whereas the ‘control’ group did so only in quiet. At the end of the experiment, the participants were asked to describe the strategies they had used. All of those in the rehearsal group reported using that strategy, whereas most (69%) of the participants in the semantic group reported consistently using the strategy they had been instructed to employ. Finally, there was much variability in terms of the strategies reported in the control group, including rehearsal of the letter names, elaborative rehearsal, rehearsal of words substituted for the letters, converting the letters into real or nonsense words, or a combination of rehearsal and semantic strategies. Remarkably, among the participants in the control condition, the subgroup who reported rehearsing the letter names presented with a significant phonological-similarity effect, whereas those who reported using a semantic strategy did not. This study gives credence to a hypothesis put forward by Salamé and Baddeley (1986) according to which some adults tend to abandon simple rote rehearsal in favour of visual or semantic strategies when a task becomes too difficult (e.g., long lists, irrelevant speech to ignore, articulatory suppression) and rehearsal is no longer an effective choice. It also confirms the findings of Logie et al. 
(1996) indicating a high degree of variability in task approach among adults when the instructions are neutral. Finally, a study by Campoy and Baddeley (2008) replicated many of the findings from the two studies reported above. They contrasted different strategy instructions and their impact on the phonological-similarity effect in a group of Spanish-speaking adults. To summarise briefly, the authors found that the phonological-similarity effect was not significant at the group-aggregate level when participants were told to use a semantic strategy. In contrast, the groups who either were instructed to rehearse or received no specific instructions did show a significant effect of similarity. At the level of individual participants, more adults in the rehearsal condition than in the other two conditions exhibited a decrement in performance for lists of similar-sounding written words compared to dissimilar words. Interestingly, the beneficial effect of the semantic strategy was apparently restricted to the similar lists. Based on effect sizes, the phonological-similarity effect was strongest when participants were told to rehearse combined with fast presentation rates (1s vs. 2s). Taken together, these results suggest that semantic strategies may require more time to execute and that verbal rehearsal may be best adapted to short-term memory tasks involving immediate serial recall of unrelated items with fast presentation rates. These three studies offer strong support that task parameters are likely to influence how individuals are going about a task. They also suggest that the phonological-similarity effect and the process of rehearsal are related in adults. Finally, they confirm that self-report can add precious information for interpreting the results of experimental manipulations at the level of individual participants. The next section will summarise the research that has used the phonological-similarity paradigm in children.
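The subject-wise logic shared by the three adult studies above can be illustrated with a minimal sketch. It assumes that each participant has a percent-correct score for dissimilar and for similar lists plus a self-reported strategy; the scores, the field names, and the function names are hypothetical and are not taken from any of the cited studies. The sketch shows how an aggregate effect can coexist with individuals whose recall is not hurt by similarity.

```python
from statistics import mean

# Hypothetical percent-correct scores; NOT data from any cited study.
participants = [
    {"id": "P1", "reported_rehearsal": True, "dissimilar": 80, "similar": 55},
    {"id": "P2", "reported_rehearsal": True, "dissimilar": 75, "similar": 50},
    {"id": "P3", "reported_rehearsal": False, "dissimilar": 60, "similar": 60},
    {"id": "P4", "reported_rehearsal": False, "dissimilar": 65, "similar": 70},
]


def similarity_effect(p):
    """Dissimilar minus similar recall; positive means similarity hurt recall."""
    return p["dissimilar"] - p["similar"]


def mean_effect(group):
    """Mean phonological-similarity effect for a group of participants."""
    return mean(similarity_effect(p) for p in group)


rehearsers = [p for p in participants if p["reported_rehearsal"]]
others = [p for p in participants if not p["reported_rehearsal"]]

# Group-aggregate effect across everyone.
overall = mean_effect(participants)

# The same effect split by self-reported strategy.
by_report = {"rehearsal": mean_effect(rehearsers), "other": mean_effect(others)}

# Participants whose recall was not negatively impacted by similarity.
unaffected = [p["id"] for p in participants if similarity_effect(p) <= 0]

print(overall, by_report, unaffected)
```

With these invented scores the aggregate effect is positive even though half of the participants show no decrement, which is the masking problem that subject-wise analysis combined with self-report is meant to expose.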
1.4.2.2 The phonological-similarity effect and rehearsal in children

There exists a long tradition of using the phonological-similarity effect as a means to tap whether participants are resorting to verbal encoding of pictures, and also to support the presumed use of verbal rehearsal in immediate serial recall tasks in children. From the pioneering work of Conrad (1971, 1972) to current research, the focus has been on identifying the developmental progression from participants being tied to the presentation modality to one where verbal strategies predominate. As such, the visual-similarity effect has been used as complementary evidence that young children were more tied to the characteristics of the stimuli and less likely to recode to-be-remembered items into verbal form. The vast majority of studies using the phonological-similarity paradigm in English have followed in the footsteps of Conrad (1971), and used monosyllabic concrete nouns containing the vowel sound /æ/ (as in cat, bat, cap) as the similar stimuli (either spoken words or nameable pictures with these words as labels), with variations in terms of the number of stimuli in the set, and the exact words used (Halliday, Hitch, Lennon, & Pettipher, 1990; Henry, 1991a, 1991b; Hitch et al., 1991; Hitch, Woodin et al., 1989; Hulme, 1984, 1987; Hulme & Tordoff, 1989; Jarrold et al., 2000; Jarrold et al., 2008; Johnston & Conning, 1990; Palmer, 2000a, 2000c). A few studies have used words that rhymed (e.g., snail, pail, mail, etc.; Hayes & Rosner, 1975; Hayes & Schulze, 1977) or lists of words with the same initial consonant (e.g., car, cat, clown, etc.; Al-Namlah et al., 2006; Ford & Silber, 1994) and have found similar developmental patterns. 
The same trends have also emerged in studies that have been conducted with children in other languages (including Arabic, French, Italian, and German; Al-Namlah et al., 2006; Alegria & Pignot, 1979; Carlesimo, Galloni, Bonanni, & Sabbadini, 2006; Hasselhorn & Grube, 2003; Longoni & Scalisi, 1994; Steinbrink & Klatte, 2008). Together, these facts suggest that the phonological-similarity effect is a robust phenomenon, and not list- or language-specific. Unfortunately, not all studies have controlled for other characteristics of the words which could have impacted their ease of recall, such as word length, concreteness, familiarity, age of acquisition, or imageability; this is generally less of an issue, however, in more recent studies. In terms of presentation, most studies have used either spoken words or nameable pictures (e.g., Conrad, 1971; Halliday et al., 1990; Hayes & Rosner, 1975; Hayes & Schulze, 1977; Hitch et al., 1991; Hitch, Woodin et al., 1989; Hulme, 1984, 1987; Hulme & Tordoff, 1989; Longoni & Scalisi, 1994; Palmer, 2000c). In a few cases, visual presentation has involved letters (R. M. Brown, 1977; Johnston, Rugg, & Scott, 1987) or written words (McNeil & Johnston, 2004). The response has most often required immediate serial verbal recall (e.g., Hitch, Woodin et al., 1989; Hulme, 1984; Hulme & Tordoff, 1989; Johnston, Rugg et al., 1987; Longoni & Scalisi, 1994; Palmer, 2000c) or a form of serial picture recognition (e.g., picture matching; Conrad, 1971, 1972; Hayes & Rosner, 1975; Hayes & Schulze, 1977; Hulme, 1987). 
A few studies have used same-different judgements that required only memory for order (i.e., two lists of words or of pictures, with two items transposed for some trials; Carlesimo et al., 2006; Gathercole, Pickering, Hall, & Peaker, 2001; Steinbrink & Klatte, 2008) or a form of probed recall of only one item from the sequence (i.e., the experimenter points to a position in the sequence and the child must name the corresponding item; e.g., Henry, 1991b; Jarrold et al., 2000; Jarrold et al., 2008). Clearly, studies that have used the phonological-similarity paradigm vary considerably in their design, in terms of presentation and response modes, the specific stimuli used, the number and the length of trials, whether a span procedure or lists of fixed length were used, etc. This makes comparisons across studies most difficult, as there are very rarely two studies that are truly identical in terms of the methods. Nonetheless, some consistent findings do emerge. First, both the mode of presentation (spoken words vs. nameable pictures) and the type of response (verbal or not) strongly impact at what age a significant phonological-similarity effect is found at the group-aggregate level. The bulk of the evidence suggests that the effect emerges earlier or is larger with spoken words than with nameable pictures (Conrad, 1971; Halliday et al., 1990; Hitch, Woodin et al., 1989; Hulme, 1984, 1987; Hulme & Tordoff, 1989; Longoni & Scalisi, 1994), and that it increases with age (Hitch et al., 1991; Hulme & Tordoff, 1989; Palmer, 2000c), although there is some controversy about this final point (e.g., Hasselhorn & Grube, 2003). The nature of the response also seems to exert a powerful impact, with the requirement of full verbal serial recall most likely to induce a significant phonological-similarity effect at the group level even in the youngest children (Cowan et al., 1991). 
One major problem with the accumulated data is that very few studies have left the children to their own devices when it comes to remembering nameable pictures. In most cases, one of three things occurred: i) the experimenter named the pictures as they were presented; ii) the participants were instructed to label the pictures either aloud or subvocally when they appeared; iii) the participants were told to remain silent during the presentation of the pictures. Hence, there exists little data indicating what children at different ages prefer to do. Also, very few studies were nonvocal even when nameable pictures and silent presentation were used, as in many cases the required response was verbal serial recall. For these reasons, it is very difficult to infer at what age the phonological-similarity effect emerges in a nonvocal task requiring the serial memory of nameable pictures, although it does seem to be later than when spoken words are presented, particularly if spoken words are combined with verbal recall. There are, however, three studies where pictures were presented in silence and children’s behaviours were unconstrained. McNeil and Johnston (2004) found that a group of 7- to 8-year-olds (M age 7;7) presented with significant aggregate phonological-similarity effects with immediate verbal recall regardless of whether they had to remember sequences of nameable pictures, spoken words, or written words. Similarly, Steinbrink and Klatte (2008) found significant effects of phonological similarity with immediate verbal recall in a group of 7- to 9-year-old German children (M age 8;4) who had to remember sequences of either nameable pictures or of spoken words. 
Interestingly, no phonological-similarity effect appeared for these same participants with picture presentation and a nonvocal response requiring recognition memory for order (i.e., same-different judgement). [Footnote 4: The study reported here included both a group of poor readers and spellers and a group of above-average readers and spellers. Unfortunately, analyses were not always performed for each group separately. Nonetheless, the graphic representation of the results provided in the article shows almost identical memory performance for phonologically-similar and dissimilar sequences of pictures by the group of good readers and spellers when the required response was ordered recognition (see Steinbrink & Klatte, 2008, Figure 1).] This suggests that a nonvocal task may leave more room for individual variability and for developmental trends in task approaches to appear. Unfortunately, neither of these two studies performed any subject-wise analyses or used any additional indicators of rehearsal. Perhaps the most reliable data come from a longitudinal study by Palmer (2000c; Experiment 2) which used silent presentation of pictures, immediate verbal recall, and also looked at subject-wise data. Her results suggest that by the age of 7 a majority of children were susceptible to phonological similarity. They also point to much individual variability in task approach based on the presence of either phonological-similarity or visual-similarity effects in children aged 5 through 8 (see below). Hence, most children may begin to show a decrement in their memory performance for sequences of pictures that have similar-sounding labels compared to sequences with dissimilar labels around the ages of 7 or 8 years. This effect may emerge even later, however, with a completely nonvocal task, although it is difficult to draw strong conclusions based on a single study. 
Additionally, the rare studies that have looked at subject-wise data in children or adults speak to high levels of variability which can be masked by group results when they are generalised to all individuals (Campoy & Baddeley, 2008; Conrad, 1972; Hanley & Bakopoulou, 2003; Logie et al., 1996; Palmer, 2000a, 2000c). This makes it very likely that "studies that fail to take account of individual differences in the strategies adopted may be presenting a misleading picture of the nature of the cognitive architecture responsible for task performance" (Logie et al., 1996, p. 319). When the results from the numerous studies are combined, on the one hand, it seems possible to find a significant phonological-similarity effect in groups of children who, given their ages, are likely not using an active rehearsal strategy such as multiple list repetitions or cumulative rehearsal, particularly if spoken words are presented and verbal serial recall is required. On the other hand, there may be better convergence between the presence of a phonological-similarity effect and other indicators of rehearsal (such as observation and self-report) in a nonvocal task, although this possibility remains mostly speculative given the scarcity of the evidence. Nonetheless, complementary evidence supporting the hypothesis that the development of rehearsal contributes to the emergence of the phonological-similarity effect does exist. When young children are trained and instructed to use something akin to a rehearsal strategy (e.g., partial list repetitions, chunking, cumulative rehearsal) in immediate memory tasks, their memory performance generally improves, in particular for dissimilar lists, which is taken as indirect evidence that they were not resorting to such a strategy on their own. In parallel, the phonological-similarity effect generally emerges or becomes stronger as the result of training (Cowan et al., 1991; Hayes & Rosner, 1975; Johnston & Conning, 1990). 
For instance, Hayes and Rosner (1975) found that training 5-year-olds to use cumulative rehearsal in a serial matching task using nameable pictures improved item identification, but also provoked a phonological-similarity effect as a result of serial order errors which were more frequent for similar lists. The authors concluded that the phonological-similarity effect was attributable to "a differential loss of order information for the two list types" (p. 396). As another example, Cowan et al. (1991) found that training 4-year-olds to produce cumulative list repetitions increased verbal recall particularly for dissimilar lists and produced a larger phonological-similarity effect than when the children were told to remain silent during presentation. Another set of interesting findings came from an early study by Conrad (1972), who found that if a group of 5-year-olds were told to label pictures at presentation, those children who were sensitive to phonological similarity showed, on the one hand, better memory for dissimilar lists and, on the other hand, worse memory for similar lists than those who did not show a sensitivity to phonological similarity. This fits with the proposal put forward by Cowan et al. (1987; see above) that the phonological-similarity effect emerges in children in large part as a result of developments in rehearsal which lead to improvements in memory limited essentially to dissimilar lists. Additionally, this hypothesis is supported by the data from many other studies (e.g., Hulme, 1984; Hulme & Tordoff, 1989; Palmer, 2000c).

Hence, although the phonological-similarity effect may not be a perfect indicator of rehearsal, particularly in memory tasks that use spoken words and verbal serial recall, most of the developmental work using this paradigm has assumed that rehearsal is at least partly responsible for the emergence of this effect. 
Even Hasselhorn and Grube (2003), who have insisted on the importance of redintegration processes in the phonological-similarity effect, wrote at the end of a recent article that "the diminished PSE in 4-year-old children reported by Hulme (1984, 1987) might be accounted for by the fact that these children do not use rehearsal processes. This could explain the age-related increases in PSE that were found in these studies" (p. 151). The phonological-similarity effect was preferred over the word-length effect, which is now generally considered to be a poor marker of rehearsal, particularly as it is very susceptible to output effects (see above). The phonological-similarity effect also tends to be more stable both within and between participants (Logie et al., 1996). In addition, it presents the added advantage over other methods (e.g., articulatory suppression) of not imposing an additional load on the children which could potentially render the memory task very demanding. Finally, it offers the opportunity to let the children do what comes naturally to them, particularly in a nonvocal task which may provide a more sensitive context for the phonological-similarity effect to function as a good indicator of rehearsal. Before turning to the other indicators of rehearsal that were used in this project (namely observation and self-report), the next section will present a brief summary regarding the contribution of data regarding the effects of visual similarity to the interpretation of what participants (particularly those who are not rehearsing) may be doing to remember sequences of visual stimuli.

1.4.3 Susceptibility to visual similarity in children and adults

As previously mentioned, the impact of the visual similarity of to-be-remembered items has been used as a means to explore whether groups or individuals tend to rely on visual encoding. 
Within a specific study, the visual-similarity paradigm has been used either on its own or in combination with the manipulation of phonological similarity. Most studies that have explored visual-similarity effects in children have used black and white line drawings of objects with elongated shapes and dissimilar labels (e.g., knife, pen, rake), and emphasised the visual similarity by presenting the objects drawn at the same angle (Hayes & Schulze, 1977; Hitch, Halliday, Dodd et al., 1989; Hitch et al., 1988; Hitch, Woodin et al., 1989; Longoni & Scalisi, 1994; Palmer, 2000a, 2000c). The visually-similar sets tend to contain many objects that could fall into the broad category of tools and, as such, are likely to be more semantically similar than the control sets. However, a few studies have used objects with round shapes (e.g., ball, face, plate) and found the same developmental trends (Longoni & Scalisi, 1994; Palmer, 2000b). In addition, it is difficult to predict what effect semantic similarity could have, either increasing confusion or providing participants with another strategy to enhance retrieval. A few studies with children have also used visually-presented letters to look at the visual-similarity effect (R. M. Brown, 1977; Walker, Hitch, Doyle, & Porter, 1994). In addition to the problems of controlling for letter-reading proficiency and letter knowledge for children of different ages, these stimuli are inherently verbal in nature for any literate participant as they are learned specifically within the context of a phonological recoding task. In fact, naming letters is an overpracticed and highly automatic skill for children and adults who can read. Overall, the results were generally consistent. Young preschool children aged 3 to 6 years tended to be sensitive to the manipulation of visual similarity (R. M. 
Brown, 1977; Hayes & Schulze, 1977; Hitch et al., 1988; Hitch, Woodin et al., 1989; Longoni & Scalisi, 1994; Palmer, 2000c; Walker et al., 1994) and a few studies have confirmed that these group-aggregate data held true for a majority of the individual children (R. M. Brown, 1977; Hayes & Schulze, 1977; Palmer, 2000c). In contrast, when younger and older children were included in the same study, the older groups (from about 10 years of age) did not generally show the impacts of visual similarity (Hitch et al., 1988; Hitch, Woodin et al., 1989; Longoni & Scalisi, 1994) unless they were required to perform articulatory suppression (Hitch, Woodin et al., 1989). Few studies have looked at the intermediate ages. However, the results from a recent project by Palmer (2000c) which combined a cross-sectional and a longitudinal study and also looked at subject-wise data suggest that about half of the 6- and 7-year-olds and a quarter of the 8-year-olds continued to show diminished recall for visually-similar sequences of nameable pictures compared to sequences of dissimilar pictures. Few studies have explored the impacts of visual similarity in adults. Baddeley (1966) tested how both visual similarity and phonological similarity influenced the recall of written words, and found only the latter to have a significant impact. However, this may have resulted at least in part from methodological issues, and Baddeley himself concluded that the effect of acoustic similarity was at minimum much stronger than that of formal similarity. More recently, Logie et al. (2000) set out to investigate whether there were any signs of visual encoding by adults in serial written recall tasks by manipulating both articulatory suppression and the visual similarity of visually-presented words and letters. The adult participants were presented with written words that were all phonologically similar, but either visually similar or not (e.g., FLY, SHY vs. 
PI, RYE) and with letters which were either similar or not in uppercase and lowercase (e.g., Cc, Ww vs. Dd, Qq). Altogether, the four experiments found independent effects of suppression and of visual similarity. The data hence converged to suggest that both visual and phonological encoding (or rehearsal?) were taking place and that it was possible to use visual encoding to recall serial information. The authors speculated that this storage system may have developed as part of the process of learning to read, and in fact may only apply to lexical material or letters. In addition, even when the participants were free to use a phonological approach to the task (i.e., in the ‘no suppression’ condition), some visual coding was apparently nonetheless occurring. The individual-level data patterns also largely followed the group trends. For older children and adults, such dual coding may occur only in situations where the task characteristics push participants to resort to visual encoding as well. For instance, in Experiments 1 and 2 of the Logie et al. (2000) study, the stimuli were printed words and the participants were required to produce written recall; this may have increased the reliance on visual traces, particularly given that the words were all phonologically similar. Also, in Experiments 3 and 4, the participants had to recall not only letter identity but also case; it could be that letter identity was stored phonologically whereas case was stored visually. Hence, these tasks likely involved multiple levels of coding: phonological, semantic, and visual. This may have been true within a participant across trials or truly in simultaneous fashion but, as the authors point out, it is hard to know which of these interpretations is correct. 
Nonetheless, both possibilities are supported by the self-report data obtained in an earlier study (Logie et al., 1996), where many participants claimed to have used more than one task approach, and many others reported changing strategies between trials, depending on types of lists and presentation modalities. Finally, a recent study by Poirier, Saint-Aubin, Musselwhite, Monahadas, and Mahammed (2007) supports the results arising from Logie et al. (2000), as the adult participants clearly showed the impacts of visual similarity even when they were also obviously resorting to verbal encoding. Although younger children are most sensitive to visual similarity, at least under some circumstances, even older children and adults do show diminished memory for sequences of nameable items that look alike. In fact, evidence of multiple coding can be found in numerous studies with both children and adults (e.g., Hitch et al., 1988; Hitch, Woodin et al., 1989; Hulme, 1987; Palmer, 2000c; Schiano & Watkins, 1981). What remains unclear is whether this propensity to rely on multiple codes is always present but masked under some experimental conditions or by the predominant verbal strategy, or rather whether individuals tend to adapt their task approach dependent on task characteristics and demands. Of course, both these hypotheses could be correct. In development, there may indeed be a progressive change leading to the supremacy of verbal mediation. This issue is explored in the next section.

1.4.4 Likely developmental course of verbal mediation in short-term memory

Palmer (2000c) used both a cross-sectional and a longitudinal design to investigate phonological- and visual-similarity effects in immediate verbal recall of pictures. In the cross-sectional study (Experiment 1), the children were told to either remain silent or to name each picture as it was presented. 
The 3-year-olds showed neither a phonological-similarity effect nor a visual-similarity effect at the group level, and this was confirmed with subject-wise analyses. The group-aggregate data indicated that the 6- and 7-year-olds were sensitive to visual similarity. The 7-year-olds also showed a phonological-similarity effect, whereas there was a trend for phonologically-similar lists to be recalled slightly less well than dissimilar lists for the 6-year-olds. When the two older groups were combined, the individual-level data showed considerable individual variability, with 16 of the 54 children showing only a visual-similarity effect, 12 only a phonological-similarity effect, 13 both effects, and 13 neither effect. There was also a surprisingly high level of correspondence between the encoding strategy inferred on the basis of recall differences between stimulus types (i.e., presence of visual-similarity and/or phonological-similarity effects), and what children reported they were doing when they were asked how they had tried to remember the pictures at the end of the final recall task. Hence, Palmer reported that of the 16 participants who presented with only a visual-similarity effect, 11 said they had tried to remember the pictures, and 4 said they had relied on both pictures and words. Of the 12 children who showed only a phonological-similarity effect, 11 said they had tried to remember the words and 1 reported remembering both pictures and words. Among the 13 participants who showed impacts of both types of similarity, 12 said they had tried to remember pictures and words, and 1 reported remembering the words. Finally, for the 13 children who showed neither effect, 9 said they didn't know what they had done to remember, and 4 reported remembering the pictures. Unfortunately no additional information was provided regarding the wording or the procedures used for this post-task questioning. 
This experiment is one of the few that looked at individual-level data for the phonological-similarity effect and the visual-similarity effect. These results indicate that group means may obscure much individual variability in terms of task approach, and that in this age range many children may still be relying on both visual and verbal strategies when asked to verbally recall a series of nameable pictures in order. Surprisingly, whether or not the children were told to label overtly or to remain silent did not impact the results. Additionally, this experiment also strongly suggests that self-report is a viable approach with young school-aged children, and that it can provide informative complementary data.

Palmer (2000c, Experiment 2) also completed a cross-sequential study where she followed two overlapping cohorts (one year apart in age at the start) over a 3-year period. This provided data over four ages in a 3-year span. In this case the children were left to label or not at presentation based on their own preference. Unfortunately, no post-task questions were used. At the group-aggregate level, children at age 5 showed a visual-similarity effect only, compared to both a visual-similarity effect and a phonological-similarity effect at ages 6, 7, and 8. Encoding strategies were determined at the individual level based on the presence or absence of visual-similarity and phonological-similarity effects. These results indicated a developmental trend from no active strategy, to a visual strategy, followed by a mixed strategy, and finally preference for a verbal strategy (at age 5, 50% of children neither effect, 29% visual-similarity effect; at age 6, 33% neither, 22% visual-similarity effect, 30% both effects; at age 7, 44% both effects, 34% phonological-similarity effect; and at age 8, 28% both effects, 64% phonological-similarity effect). In fact, Palmer reported that 82% of the children in this study followed this developmental pattern. 
The percentages of children who showed neither effect decreased from 50% to 33%, 4% and 2% from ages 5 through 8 years, while the percentages of participants who presented with a phonological-similarity effect increased from 6% to 15%, 34%, and 64%. When Palmer’s results from both experiments are combined, they suggest that the developmental progression seems to be from relying on automatic activation, to more intentional reliance on visual, mixed, and then verbal strategies. Individual-level analyses and self-report data strongly suggest that children are not resorting to covert rehearsal at the earliest ages, contrary to inferences based on results from studies based on group-level data, particularly with auditory presentation. However, given the way that the questions were asked or that the data were coded, it is not possible to differentiate between types of verbal strategies (i.e., labelling vs. rehearsal). The fact that there were no significant differences between the two labelling conditions (child told to remain silent vs. child told to label pictures) suggests that imposed labelling is not sufficient to cause a phonological-similarity effect, at least in this age range and with verbal recall. It may be, as Cowan et al. (1991) have proposed, that response mode is more powerful than presentation mode. Interestingly, labelling at presentation also did not seem to be related to the children's reported strategies, at least in a context where verbal recall was required.

Palmer (2000a, 2000c) proposed an interpretation for the developmental course of visual/verbal encoding of nameable picture stimuli in short-term memory tasks that is largely inspired by Baddeley’s working memory model. 
She argued that the key lies in the development of the central executive which allocates attention and controls inhibition, thus enabling children to move from no strategy (i.e., simply relying on automatic activation), to a visual strategy (i.e., allocating attention based on the mode of presentation), to a dual strategy with attention distributed between labels and pictures, to an adult-like verbal strategy where the label is the focus of attention and the visual representation is successfully inhibited. This developmental course is more gradual and comprises more stages than what had been previously suggested in the literature, with a hypothesised period of dual coding as a likely intermediate developmental stage for many children. Unfortunately, Palmer did not observe or report how many children in each age group labelled aloud during presentation for the cross-sequential study where the children were left to do what they preferred. This is particularly unfortunate because she argued that the phonological-similarity effect depends on the child paying attention to the label rather than the picture, which makes the auditory representation the one that is more strongly activated. Having data on children's tendency to label could give support to this interpretation. The idea that the child must devote sufficient attention to the verbal representation or label for a phonological-similarity effect to emerge suggests that it depends on strategic processes. The question remains whether self-initiated labelling is sufficient or whether some form of rehearsal is necessary.

To summarise, there exists much evidence that the phonological-similarity effect is linked to verbal rehearsal in adults and children. 
In particular, the bulk of the data suggests that when pictures are used as stimuli in memory experiments, contrary to older children and adults, younger children are less affected by task manipulations which are purported to impact memory when the participant is encoding verbally. The absence of a phonological-similarity effect in young children is suggestive of a developmental change in the representation and maintenance of verbal information. Additionally, the presence of a strong visual-similarity effect only in younger children lends further support to the idea that they are less likely than older children or adults to verbally recode nameable pictures and to use rehearsal. Young children are assumed to be more ‘modality dependent’ and to rely on representations that are automatically activated. In contrast, older children and adults are assumed to “remember things in terms of verbal descriptions when they are able to” (Longoni & Scalisi, 1994, p. 69). These conclusions result from many different task manipulations meant to either favour or prevent verbal strategies, and from studies that have trained children to use more or less complex forms of rehearsal. One other possibility would be to observe directly what participants are doing. In addition, it would be most informative to look not only at the group level, but also to include subject-wise analyses.

1.5 Contributions from the memory development perspective

As mentioned at the outset, observation and self-report have a long history coupled with an excellent track record in the memory development tradition. These methods have provided much insight into the development of rehearsal as a mnemonic strategy. As the work of Palmer reported above has shown, these methods have also started to make their way into studies that clearly fit within the memory models tradition. 
This section will focus first on a few additional developmental studies that have used immediate memory tasks combined with either observation or self-report. Then, more recent trends in the memory development literature will be highlighted. Very few studies that have applied short-term memory models to children have relied on direct observation. As briefly mentioned above, Hulme et al. (1986) observed whether children aged 4, 7, and 10 years named the pictures or showed any signs of subvocal activity while completing a memory for pictures task that required no talking on the part of the child (i.e., serial picture matching). They found that all three age groups showed a significant word-length effect. In addition, there was a strong trend for observed task-related speech (overt and subvocal) to decrease with age, occurring 5.25, 4.75, and 2.70 times over eight trials for 4-, 7-, and 10-year-olds, respectively. This decrease encompassed two developmental trends, with children progressively verbalising less, but also increasing their subvocal activity (i.e., mouthing). The authors did not provide data regarding how many of the children in each age group were verbalising, although they wrote that “many [italics added] of the 4-year-olds were observed to name the pictures overtly and none of them showed signs of subvocal activity” (p. 68). Finally, they did not distinguish between types of speech activity, and in particular between labelling of single pictures and rehearsal. A few years later, Henry (1991a) observed how frequently children aged 5, 7 and 9 years (n = 30 in each group) resorted to rehearsal in an auditory verbal immediate serial memory task where the experimenter named the objects at presentation. None of the 5-year-olds, only three of the 7-year-olds, and only ten of the 9-year-olds showed signs of using grouped rehearsal when they were provided no instructions on how to proceed. 
The criterion used for rehearsal was "spontaneously rehearsing either out loud, using fingers or whispering" (p. 501). On the one hand, this definition is imprecise and may actually include behaviours that were not rehearsal (e.g., ‘using fingers’ could simply reflect counting or place marking). On the other hand, some of the children, particularly the 9-year-olds, could have been rehearsing covertly. Nonetheless, these data do suggest that, at least under some conditions, many children in this age range will apparently not resort to cumulative rehearsal on their own, and that individual variability should be expected. Taken together, these two studies attest to the usefulness of observational data, and also reveal that it may be most informative to distinguish between labelling and rehearsal. A few studies have successfully used self-report. Allik and Siegel (1976) may have been the first to use the word-length paradigm with children. Five groups from prekindergarten to grade 5 had to remember long sequences of pictures with either one- or two-syllable labels. The presentation rate was slow, with each picture shown for 2 seconds followed by a 4-second delay before the next picture appeared. Once the task was complete, the children were asked how they had remembered the pictures. They were classified as having used cumulative rehearsal if they "convincingly described or demonstrated aloud a cumulative rehearsal strategy" (p. 324). The word-length effect was significant only for the two eldest age groups. Correspondingly, the percentage of children who reported or demonstrated cumulative rehearsal increased progressively with grade, reaching 50% and 82% for grades 3 and 5 respectively. Self-reports also indicated much variability in terms of strategy use, even for the older children, and that for many of them, cumulative rehearsal was difficult to implement. 
Of course, among the younger children, many could have been resorting to simpler forms of verbal strategies such as single-item labelling or multiple repetitions of item labels, but this cannot be confirmed. Finally, we have no information regarding how consistently cumulative rehearsal was used. More recently, Turner et al. (2000) completed a study looking at the effects of lexicality and familiarity among 7- and 10-year-olds. In response to an open-ended question regarding what they had done inside their head to help themselves remember the words, the children reported using more cumulative rehearsal (vs. naming of single items once or repeatedly, or any other strategy) with words than with nonwords. For real words, 69% of 7-year-olds reported having used cumulative rehearsal compared to only 22% of those presented with nonwords. The trend was in the same direction for 10-year-olds, but the levels of cumulative rehearsal were higher (84% for words, 53% for nonwords), and the age difference was statistically significant. This suggests that children tend to change their approach depending on task demands. Henry et al. (2000) also used self-report as converging evidence in a study using the word-length paradigm. They distinguished cumulative rehearsal (i.e., recitation of the memory list up to and including the currently presented item as each new item was presented) from naming (i.e., labelling each item as it was presented either once or multiple times, generally silently), and also documented other (visual or semantic) strategies. Among the 7-year-olds, 14% reported using cumulative rehearsal compared to 45% who reported naming, whereas the percentages for reported rehearsal and naming among the 10-year-olds were 68% and 25%, respectively. The authors also stated that the older children were more likely to report cumulative rehearsal with auditory rather than visual presentation, and that no such difference was found for the 7-year-olds. 
On many levels (i.e., convergence with evidence from the word-length effect, serial position curves, and response time data), this study strongly suggests that few 7-year-olds but most 10-year-olds were using rehearsal in this task. It also demonstrates that labelling can be distinguished from rehearsal in children in this age range with careful questioning and coding of responses. Observable behaviours (mouth movements and audible rehearsal) were only considered to verify whether they contradicted the self-reports, and "on no occasion was the video information inconsistent with a child's reported memory strategy" (p. 7). This clearly lends credibility to self-report in children of this age range; yet, it does not allow for a more fine-grained analysis of strategy use taking into account consistency or variability across trials. Finally, when the studies of Turner et al. (2000) and Henry et al. (2000) are considered in tandem, the levels of reported cumulative rehearsal for children of the same ages are very different. This was the case despite many parallels between the studies in terms of the methodology (i.e., tasks, post-task questions, coding of self-reports), which were, after all, done by the same group of researchers. Hence, subtle differences between tasks (e.g., lists of words and nonwords vs. lists of short and long words) could result in children of the same ages resorting to complex rehearsal strategies to a greater or lesser extent. Overall, there is reason to believe that observational and self-report data could prove informative and reliable in an immediate memory task. These methods were preferred to think-aloud protocols or overt rehearsal in the present study. Asking participants to use overt rehearsal presents the potential disadvantages of having older participants do something unnatural or of rendering relatively automatic behaviours conscious. 
This is supported by data from at least one study, as a group of participants in grades 3 to 8 who were simply told to remember as many words as possible did better than an overt rehearsal group of the same age (Ornstein et al., 1975). Combining observation and self-report should provide complementary data given, on the one hand, the likely tendency of some children to internalise their strategies and, on the other hand, the possibility that some children will provide incomplete or uninformative responses. In addition, detailed observational data on a trial-by-trial basis would make it possible to document consistency in strategy use or any changes depending on task or stimulus characteristics. In particular, it remains an open question whether children may be more prone to rely on rehearsal in a verbal or a nonvocal task. Also, we do not know if they tend to switch their task approaches depending on whether they are attempting to remember dissimilar, phonologically-similar, or visually-similar sequences. That being said, there is ample evidence that children and adults use multiple approaches in reasoning and problem solving both between and within individuals and tasks (Coyle, 2001; Kuhn, 1995; Siegler, 1994, 1995). Robert Siegler and Deanna Kuhn have been at the forefront in rejecting the overly simplistic view that children switch from ineffective to more effective strategies (Kuhn, 1995; Siegler, 1994, 1995). Instead, they propose that developmental changes in strategy use involve continued coexistence of more primitive and more sophisticated strategies “with the frequency of use of each strategy ebbing and flowing with increasing age and expertise” (Siegler, 1995, p. 410). To some degree, this within-subject variability for a given task can be quite easily explained. Some trials within a task may be easier than others. 
Moreover, strategy variability may correspond to fluctuations in attentional control, changes in motivation, fatigue effects, or perceived difficulty, all of which could result in inconsistent deployment of a more effortful strategy over time. On the other hand, practice effects could allow a child to resort more successfully to more complex strategies over trials. However, as Kuhn (1995) has argued, variability may be more intrinsically related to cognitive development, residing within the participant rather than being simply attributable to task or content effects. Similarly, according to Siegler (1994, 1995), variability in children’s thinking is much more pervasive than past depictions have suggested and corresponds to more than improvements attributable to practice effects or learning. Variability has been observed within an individual child solving related problems in many areas including arithmetic, conceptual development, number conservation, and memory. Cognitive development is hypothesised to involve changing distributions of approaches or strategies, with more advanced approaches progressively becoming more frequent, although less sophisticated tactics continue to predominate for a substantial period. In many cases, resorting to different strategies over trials appears to be adaptive, with task or trial characteristics making some strategies more likely than others to produce success, or at least requiring less effort for equal degrees of success. Strategy choice is apparently more often than not adaptive, if not always optimal. Research in various cognitive domains has found that new strategies are rarely flawed in the sense of violating the basic goal of the task. In the field of memory development, a few studies have explored the use of multiple strategies within a single task and experiment.
For example, McGilly and Siegler (1989) investigated the use of rehearsal in a serial recall task by 5- to 8-year-olds by combining data on recall, observable behaviours, and self-reports. As had been the case in other domains, they found that multiple strategies coexisted, with change occurring in terms of how frequently each strategy was used. The trend was toward increasing use of more complex strategies, for instance from no rehearsal to single rehearsal of list items, or from single to repeated rehearsal of multiple items. These findings fit with Siegler’s hypothesis and suggest that earlier-developing memory strategies coexist and compete for use with more mature strategies over a prolonged period of time. Taken together, these studies offer strong support for the feasibility and advantage of using both observational and self-report data to document the use of rehearsal in individual participants with a specific focus on both within-child variability and possible changes with development.

1.6 Bridging the two perspectives to document and define rehearsal

This project acknowledges the contributions of two rich and complementary research traditions and builds on them. I entered this problem space via the strategy lens, with a specific goal of documenting the use of rehearsal in individual children. It soon became clear that I was facing two hurdles. The first was conceptual or theoretical. What exactly was rehearsal? Was this term referring to the same phenomena when used within the working memory model and the memory strategy development traditions? The second was methodological. How best could I determine whether an individual child was rehearsing? How could I get the most reliable and valid data? Finally, did the measurement and conceptual issues interact in that indices of rehearsal were tied to a definition? I decided to face this challenge by exploiting the power of converging evidence.
Specifically, this study combined multiple indicators of verbal rehearsal: the presence of a phonological-similarity effect, observations of behaviours indicative of rehearsal, and children’s reports of having used rehearsal. By doing so, it offers the possibility of addressing the theoretical implications of the convergence or the divergence of results, as well as considering the methodological issues that arise. First, these multiple indicators should provide data that can be juxtaposed with the various ways that rehearsal has been defined, from activation of phonological representations to cumulative rehearsal. In addition, this should make it possible to study more directly how rehearsal and the phonological-similarity effect are related in developing children. No prior study has systematically verified whether the results obtained using the phonological-similarity paradigm correspond to conclusions based on observation of strategy use by children while the task was on-going. It is not clear that the three indicators of rehearsal are measuring the same thing, or correspond to the same construct. Under what task conditions, if any, is the presence of a phonological-similarity effect in a child an index of rehearsal? One possibility is that children who are apparently rehearsing present with a phonological-similarity effect, and those who are not rehearsing do not show this effect. Such convergence in the evidence would strongly suggest that in both cases the same phenomenon is indexed, and that at least in this developmental window the phonological-similarity effect is a good indicator of verbal rehearsal. Another prospect is that it could be possible for children to be sensitive to phonological similarity although there is no indication that they are rehearsing.
Here, the phonological-similarity effect could reflect other mechanisms such as the level of automatic activation, which could partly be task dependent (i.e., influenced by presentation or response modes). Alternatively, the effect could be linked to other strategies such as labelling, which could be sufficient to produce an effect of similarity. Under these circumstances, whether or not the data would be seen as converging would depend on how broadly one would define rehearsal. A final but unlikely possibility is that children who are clearly rehearsing are not sensitive to phonological similarity. If this pattern of data were to emerge, it would indicate that, at least in some circumstances, the phonological-similarity effect and strategic rehearsal are unrelated, or at best not strongly so. Relying solely on task manipulations such as item similarity does not provide a means to validate inferences regarding task approach, leaving much room for speculation about what an individual is actually doing, consciously or not. That being said, there has been considerable scepticism, and even occasional derision, regarding the usefulness or validity of observational data (e.g., Hulme et al., 1986). There is no denying that observation can provide at best an incomplete picture of strategy use, and that observable behaviours do not necessarily reflect mediation. Part of this criticism can be addressed by using conservative and systematic coding. More importantly, greater confidence can emerge by combining observational and self-report data. Self-report has proven informative and veridical in numerous studies (Allik & Siegel, 1976; Bray, Huffman, & Fletcher, 1999; Campoy & Baddeley, 2008; Hanley & Bakopoulou, 2003; Henry et al., 2000; Logie et al., 1996; McGilly & Siegler, 1989; Palmer, 2000c; Turner et al., 2000; Winsler & Naglieri, 2003), but one can only draw conclusions from what participants say, not what they omit or cannot verbalise.
Also, one must be careful not to lead children to provide the responses we hope for or expect. Nonetheless, as Logie and his collaborators concluded, "a record of reported strategies might add greatly to the confidence placed in the interpretation of the data obtained" (Logie et al., 1996, p. 320). The various measures obviously present with strengths and weaknesses, but there is every reason to think that the contribution of the sum will be greater than that of the parts. The present study aims to tackle these methodological and theoretical questions head on. Much remains to be learned regarding how children actually go about many memory and problem solving tasks. There is nonetheless considerable evidence that both task-related and within-child variables will influence how each child approaches a task. The child’s decisions or default solutions are likely based on a range of possible reasons, including: how the task may bias children towards a particular approach and the opportunity it provides for them to adjust over trials; what a given child habitually does to solve similar problems, what he thinks works best, what is least effortful; what a child’s particular aptitudes are and what options are available to him, etc. Also, the stability or consistency within and across participants has been much exaggerated, which points to the importance of looking at subject-wise data and relating it to characteristics of individual participants as possible explanations. The next section addresses the possible link between the developments of language and rehearsal.

1.7 Individual differences in language abilities and rehearsal

Many within-child variables have been considered in prior research looking at verbal mediation in memory tasks.
These include age, global processes (i.e., processing speed, working memory, or executive functioning; Kail, 2004), task-related knowledge (both in terms of content and procedure), and general intelligence, which have all been found to be positively related to strategy development and use (for reviews, see Bjorklund & Douglas, 1997; Cowan, 1997). Interestingly, language ability has been afforded very little attention. Given that verbal encoding and verbal rehearsal are undoubtedly language-based processes or strategies, one might expect language ability to be an important predictor variable in terms of an individual’s proclivity to rely on verbal rehearsal both in a developmental perspective and in the moment. Language knowledge and accessibility could play a role at encoding and output (i.e., redintegration), as well as in terms of the efficiency of rehearsal, as all of these steps require language knowledge or production (Tehan & Lalor, 2000). There is only limited evidence that children with better language abilities may be more likely either to use verbal strategies at earlier ages or to use more advanced strategies than their peers. In particular, very few studies have considered whether language ability could influence the timing of the development of rehearsal, although there are intimations in the literature that this may be the case. For example, Henry et al. (2000) found that the 7- and 10-year-olds who reported having relied on cumulative rehearsal in an immediate recall task achieved significantly higher vocabulary scores than those who reported having used naming. Another study which hinted at such a relationship is that of Palmer (2000a), who found that performance on a standardised test requiring children to produce oral definitions (the Mill Hill Vocabulary Scale) was positively and significantly correlated with the size of the phonological-similarity effect in a picture memory task with serial verbal recall for children aged 5 to 8 years.
This suggests that children with stronger language skills were also more likely to resort to verbal labelling of the pictures, and perhaps to use rehearsal. Interestingly, the nonverbal IQ measure (Raven’s Coloured Progressive Matrices) also correlated significantly with the size of the phonological-similarity effect. In fact, scores on the nonverbal IQ measure and on the language measure were themselves highly correlated, which leaves open the question of whether these correlations reflect a more general relationship with intelligence or if there is something specific about language ability. Also, a recent study with college students found tentative evidence (given the small sample) that more sophisticated strategy use may be related to language proficiency in this population (verbal SAT scores; McNamara & Scott, 2001). Finally, in a study by Al-Namlah et al. (2006) with children aged 4 to 7, the frequency of use of self-regulating private speech during a problem solving task was positively correlated, on the one hand, with receptive vocabulary levels and, on the other hand, with serial verbal recall of pictures with dissimilar-sounding labels. The details regarding the possible influence of language ability on the development and use of verbal rehearsal have not been mapped out. However, there are a few possible explanations. For instance, Brooks and MacWhinney (2000) argued that phonological representations underlying word generation change qualitatively with development from holistic retrieval to increased emphasis on retrieval from the word's onset. The combination of this change in structure and a developmental increase in processing speed (i.e., a change in process) is presumed to result in speeded lexical access, allowing children to speak more rapidly. This would seem to have direct implications for the efficiency of rehearsal.
Alternatively, Palmer (2000c) proposes that individual differences in young children’s propensity to focus on either visual or verbal means for maintenance of nameable pictures could depend on the relative strength of visual or verbal activation in long-term memory, and the capacity to attend to one representation while inhibiting another. Both of these factors could be related to a child’s language ability. They could also influence whether a child would be more or less likely to rely on labelling or rehearsal as active strategies. Finally, lexical knowledge is assumed to impact the memory process in two ways: either directly by activating specific items and associations among items in semantic memory and thereby facilitating retrieval; or indirectly by providing a means to impose structure on the materials via strategies (Bjorklund, 1987; Bjorklund, Muir-Broaddus, & Schneider, 1990; Bjorklund & Schneider, 1996; Schneider, 1993). These impacts could vary between participants depending on the relative strength of their language abilities, and consequently impact the degree to which they resort to verbal mediation. To some degree this has been explored in studies that manipulated the familiarity of the to-be-remembered stimuli (i.e., word frequency and lexicality effects; e.g., Majerus & Van der Linden, 2003; Turner et al., 2000) and found significant impacts on recall. However, some theorists downplay the relationship with language, rather emphasising the articulatory or motor speech aspect of rehearsal:

It just so happens that when a short-term ‘memory’ task involves verbal materials, there exists a particularly rich and multifarious set of skills and habits involved in language-use that lends itself very well to the job of reproducing the order of the stimuli (Macken & Jones, 2003).
Thus, the use of language skills (e.g., speech) merely constitutes a restricted example of a general strategy of coopting motor skills to meet the demands of a short-term ‘memory’ task. (Jones et al., 2006, p. 279)

Both views would, in fact, be compatible with the accumulated evidence suggesting that speech rate may play a role in the efficiency of rehearsal. In fact, as the work of Cowan et al. (1998) highlights, measures of speech rate may encompass age-related changes in speed not only of word production but also of lexical access. Similarly, Tehan and Lalor (2000) suggest that lexical memory access may play a major role in memory span. It could also play a role in whether rehearsal is easy or effective. This hypothesis, however, needs to be tested more directly. As things stand, most of the existing evidence corresponds to a positive linear relationship between speech rate and the number of words recalled, which has been demonstrated in numerous studies involving both children and adults, particularly when correlations were based on group means (e.g., Gathercole & Adams, 1994; Gathercole et al., 1994; Henry, 1994; Hitch, Halliday, Dodd et al., 1989; Hitch, Halliday, & Littler, 1989; Hitch, Halliday, & Littler, 1993; Hulme et al., 1984; Hulme & Tordoff, 1989; Johnston & Anderson, 1998). From these data, it has been presumed that rates of subvocal rehearsal mediated the relationship between speech rate and span. However, there is most often no independent evidence that the participants (particularly in the case of young children) were actually resorting to a rehearsal strategy. Finally, if a relationship is demonstrated between speech rate and rehearsal, it likely reflects contributions from word production and lexical access, which brings us back into the realm of language.
Hence, despite limited empirical data to date, there is reason to speculate that either general language abilities or more specific abilities related to lexical retrieval and production could play a role in individual differences in the development and use of rehearsal. The current project fits within a complex model that acknowledges that multiple interacting factors influence performance on any cognitive task. These include, at minimum, processing resources, knowledge, prior experience with the task, task approach or strategies, and metacognitive knowledge (Colozzo, 2005). It also represents a shift in primary focus from the performance to the process at the level of individual participants, by documenting in detail how children go about remembering sequences of words and nameable pictures. This will be accomplished by collecting a combination of performance, observational, and interview data, by using a modified microgenetic approach that looks at strategic behaviour over multiple trials, and by manipulating task and stimulus characteristics in order to obtain both direct and indirect evidence of verbal mediation. The overarching goals are twofold: first, to establish how best one can determine whether a given child is using verbal rehearsal; second, to contrast the multiple levels of evidence and revisit what is meant by the term rehearsal. To serve these larger goals, this study attempts to answer the following specific questions.

1.8 Research questions

Question 1: To what extent do children in grades I to IV use strategic rehearsal in two immediate serial memory tasks, memory for spoken words and memory for nameable pictures?
•  According to what measure?
   o  Does the answer depend on the indicator: presence of a phonological-similarity effect (and/or a visual-similarity effect), observation of labelling and rehearsal, reported use of labelling and rehearsal?
•  Under what conditions (i.e., influence of task and conditions)?
   o  Presentation and response modes: Auditory-Verbal Recall vs. Visual Recognition
   o  Characteristics of the stimuli: word-picture pairs with either phonologically-similar labels and dissimilar pictures, visually-similar pictures and dissimilar labels, or dissimilar labels and pictures.
•  With what degree of consistency within child from task to task?

Question 2: What within-child characteristics are associated with the use of strategic rehearsal in the immediate serial memory tasks? Variables considered:
   o  Age/grade
   o  Nonverbal intelligence
   o  General language ability
   o  Language production/fluency

2. METHODS

2.1 Participants

Third parties who had an existing relationship with the families contacted parents of children in grades I through IV to invite them to participate in the study. This developmental window was expected to allow for both individual variability and changes with increasing grade. Recruitment covered a broad geographic area around Vancouver (including the South Delta and Fraser Valley regions). The targeted children were those deemed by their teachers and their parents to have no developmental, behavioural, or learning difficulties. In addition, every attempt was made to find children who could be characterised as ‘mostly monolingual’. This was operationalised as children whose first language was English and who primarily spoke English both at home and at school. This restriction to mostly monolingual children had two motivations. First, children who are growing up bilingually may present with a strategic or a metacognitive advantage that could manifest itself in their performance on memory tasks (Colozzo, Marcoux, & Johnston, 2005; Marcoux, Colozzo, & Johnston, 2005). Second, assessment of language skills in English may not reflect the true language abilities of such children, particularly if they have been learning English as a second language for a relatively short time.
Hence, the relationship between language abilities and strategy development and use could be different between groups of monolingual and bilingual children. In total, the parents of 46 children consented to have them participate in the project. One boy in grade II declined to continue after the first session, and a boy in grade I was withdrawn from the project by his parent. Another two girls completed all the tasks but were not included in the final analyses because of documented or suspected learning difficulties. The other 42 children participated willingly and had no history of developmental or learning difficulties. All offered good collaboration and happily returned for the succeeding sessions. Most of the participants expressed that they had enjoyed taking part in the study. Although 5 children were exposed to an additional language either at home or at school, all spoke English as a first and dominant language. The children were assigned to the highest grade level where they had completed at least half a year of schooling. The vast majority of them were tested in the second half of the school year. The four who participated in the first six weeks of class were placed in the group for the previous grade. Altogether this study includes data for 42 children from grades I to IV, with 10 children in each of grades I, III, and IV, and 12 children in grade II. Approximately equal numbers of girls and boys participated in each grade, 19 boys and 23 girls in total. The children came from varied socio-economic backgrounds, with 23 mothers (55%) reporting high school education, 6 (14%) having received some additional schooling, and 13 (31%) being university-educated.

2.2 Experimental protocol

Testing took place in a quiet room either at the child’s home or at the school. A single examiner (the author) saw each participant individually for three sessions of approximately 45 minutes each, generally spread over a 2- to 3-week period.
The children completed a variety of memory and problem-solving activities, as well as standardised tests of nonverbal cognition and language. This report focuses on a subset of these tasks, specifically on two immediate serial memory tasks, the standardised language and cognitive testing, and a rapid automatic naming task. The immediate serial memory tasks required the children to remember lists of words or pictures in the same order as they had been presented. In order to tap into the strategies that the children were using to remember words and pictures, the experimenter videorecorded them while they were completing the memory activities and also asked them to explain what they had done to remember. In addition, the data regarding children’s cognitive and language abilities were meant to document individual differences which could be related to the development and use of rehearsal. The study included two measures of language ability, a measure of general language ability as well as a more specific measure of language production or fluency.

2.3 Immediate serial memory for words and pictures

2.3.1 Summary

•  Two serial memory tasks made up the core of the experiment. The to-be-remembered items consisted of either words or pictures. In Auditory-Verbal Recall, the children heard lists of familiar one-syllable words which they had to verbally recall in order. In Visual Recognition, they saw a series of familiar pictures presented silently, one at a time, and then had to select the correct pictures from a larger set and mark them in the order that they had been presented. Hence both tasks required memory for order.
•  Although they were never presented together, the words and the pictures correspond to pairs of stimuli sharing the same referent (i.e., the word cat, and a picture of a cat).
•  Each memory task included three conditions based on the characteristics of the word-picture pairs.
The phonologically-similar (PS) condition consisted of spoken words that sounded alike or of dissimilar pictures with labels that sounded alike (e.g., can, cat, hat). The visually-similar (VS) condition consisted of dissimilar-sounding words or of similar-looking pictures with dissimilar-sounding labels (e.g., match, nail, pen). The dissimilar condition consisted of dissimilar-sounding words or of dissimilar-looking pictures with dissimilar-sounding labels (e.g., cup, door, drum).
•  Each condition consisted of trials that increased in length from three to five words or pictures. Those children who were near ceiling (see below for details) completed additional trials of six words or pictures in length.
•  The children completed Auditory-Verbal Recall and Visual Recognition in counterbalanced order. Within each task, the dissimilar condition was always presented first, followed by either the PS or the VS condition, in counterbalanced order.
•  The participants were left to do what came naturally to them. In particular, precautions were taken both in the design and the wording of instructions in order to avoid biasing them towards using verbal strategies.
•  A receptive pretest preceded the immediate memory tasks. Its purpose was to familiarise the children with the spoken words as well as with the target labels that went with the pictures they would see. It did not require the children to label the pictures.
•  A brief post-task questionnaire followed each serial memory task. The children reported what they had done to remember the words and the pictures.

2.3.2 Task design

The design of the memory for spoken words and the memory for pictures tasks was based on prior research, the specific goals of the study, as well as results from piloting.
The major issues involved choosing the specific stimuli (i.e., word-picture pairs), determining the number, length, and composition of the trials, choosing between a span or a fixed-list-length procedure, deciding on presentation and response modes, designing a pretest, and ensuring that neither presentation order nor instructions biased children towards verbal mediation. As will soon become evident, although there are no standard procedures for looking at either phonological-similarity or visual-similarity effects in children, there are ‘modal’ ways of doing so.

2.3.2.1 Stimuli

Four sets of word-picture pairs were developed, the first for practice trials, and three experimental sets, one each for the dissimilar, PS, and VS conditions. The word-picture pairs all consisted of one-syllable words that were likely to be familiar to all children and easily identifiable black and white line drawings. Prior studies exploring the effects of phonological similarity and visual similarity have used many of the same words. Nonetheless, a few modifications were made for this project. In particular, the three experimental sets of word-picture pairs were matched on age of acquisition (AoA), frequency, and mean length of words (in ms). There is a large body of evidence suggesting that AoA and frequency exert important effects on naming latencies in both children and adults (e.g., Barry, Morrison, & Ellis, 1997; Bates et al., 2003; D'Amico, Devescovi, & Bates, 2001; Ellis & Morrison, 1998; Meschyan & Hernandez, 2002; C. M. Morrison & Ellis, 2000; C. M. Morrison, Ellis, & Quinlan, 1992; Snodgrass & Yuditsky, 1996; Székely et al., 2003).
In addition, as was reviewed previously, word length (e.g., Baddeley et al., 1975; Henry et al., 2000; Hulme et al., 1986) and word knowledge can both have a major impact on how much one remembers and on strategy use (e.g., Bjorklund, 1987; Bjorklund et al., 1990; Bjorklund & Schneider, 1996; Majerus & Van der Linden, 2003; Turner et al., 2000). Although all forms of similarity impact memory, the most detrimental effect occurs when the vowel is shared between words (Nimmo & Roodenrys, 2004, 2005). This is predicted by psycholinguistic models of short-term memory (e.g., Gupta & MacWhinney, 1997) that integrate concepts such as sonority (i.e., the energy of a speech trace) or that take into account not only effects of serial order within a list but also within a word (i.e., constituents within a word; onset, the initial consonant or consonant cluster, vs. rime, the vowel and any following consonants) to predict errors. The sonority principle is “the idea that the energy of a speech trace increases to a peak at the vowel and then decreases” (Nimmo & Roodenrys, 2005, p. 774). Hence the shared vowel has the strongest impact on memory for order “as this is the most strongly represented phoneme in the speech trace” (p. 774). On the other hand, words that rhyme tend to show a relative benefit for item recall but not for order recall, as this specific form of phonological similarity seems to help retrieval by providing a category cue (i.e., commonalities between the items that can be extracted and used as a recall cue; Gupta et al., 2005; Nimmo & Roodenrys, 2004). This category cue can boost item recall to a degree, and is most beneficial when the lists are drawn from an open set as, in such cases, different cues characterise different lists (Fallon et al., 1999; Fallon et al., 2005; Gupta et al., 2005; Nimmo & Roodenrys, 2004).
Many of the studies which have looked at phonological-similarity effects both in children (e.g., Conrad, 1971; Halliday et al., 1990; Henry, 1991b; Hitch et al., 1991; Hulme, 1984; Jarrold et al., 2000; Jarrold et al., 2008; Johnston & Conning, 1990; Palmer, 2000c) and in adults (e.g., Baddeley, 1966; Cowan et al., 1987) have used word lists that share the same vowel but are otherwise less consistent in their phonological overlap, with some word pairs being alliterative and others rhyming (e.g., cat, map, can, man). Gupta et al. (2005) used the term canonically similar to describe such lists “as an acknowledgment that this is the type of similarity utilized in some of the seminal studies that originally demonstrated the classic PSE” (p. 1003). The current project followed the dominant trend, as it proved easiest to build a fairly large set of phonologically-similar word-picture pairs that were imageable, concrete, likely to be familiar to young children, and had one-syllable labels by choosing words with the vowel /æ/ in the middle. Also, although these canonically-similar lists are less consistent in terms of their phonological overlap (vs. CV_, _VC, C_C), they do present with the important feature of sharing the vowel, which seems to have the most detrimental effect on memory (Nimmo & Roodenrys, 2004, 2005). Regarding visual similarity, most studies with children have used black and white line drawings of objects with elongated shapes and dissimilar labels (e.g., knife, pen, rake), and emphasised the visual similarity by presenting the objects drawn at the same angle (Hayes & Schulze, 1977; Hitch, Halliday, Dodd et al., 1989; Hitch et al., 1988; Hitch, Woodin et al., 1989; Longoni & Scalisi, 1994; Palmer, 2000a, 2000c). Visual similarity was manipulated in the same way for the present study.
The visually-similar sets tend to contain many objects that could fall into the broad category of tools and, as such, are likely to be more semantically similar than the control sets. However, a few studies have used sets of round shapes (e.g., ball, face, plate, etc.) and found the same developmental trends (Longoni & Scalisi, 1994; Palmer, 2000b). In addition, it is difficult to predict what effect semantic similarity could have in a serial memory task: it could either increase confusion or provide participants with another strategy to enhance retrieval. Given that Visual Recognition involved the silent presentation of pictures, it was most important to choose pictures for which children were likely to already have a label and to strive for a high level of picture-name agreement both within and across participants. Hence, pictures that were likely to evoke more than one label were excluded (e.g., rat, because of mouse; mat, because of rug and carpet). Piloting suggested that some of our initial choices were not meeting these two criteria. For instance, many children were saying nail when shown the picture of a screw. Hence, this item was replaced by nail although this word has a familiar homophone. Additionally, groups of words with high semantic similarity (e.g., spoon, fork, knife; comb and brush) or pairs with high degrees of association (cat and dog) were avoided, as these characteristics could enhance the memory of such items (e.g., Frankel & Rollins, 1985; Schneider, 1986). In the end, given the primary concerns of familiarity and picture-name agreement, it proved impossible to exclude words with homophones, particularly considering that some of the words can be verbs as well as nouns (e.g., can, match, or tap). All but one of the word-picture pairs (i.e., arm) correspond to concrete whole objects or animals.
Every effort was made to choose words from the picture-naming literature, which provides data on a large number of variables that have been found to affect naming latency (e.g., age of acquisition, concreteness, familiarity, imageability, name agreement). Whenever possible, the picture stimuli were taken from the International Picture Naming Project database (IPNP; Székely et al., 2004) available as freeware at http://crl.ucsd.edu/~aszekely/ipnp/. The IPNP database consists of 520 black-and-white drawings of common objects, including 174 pictures from the original Snodgrass and Vanderwart (1980) set, and additional items from various sources. It also provides data on all these words, including the dominant response and the number of alternate names obtained in timed picture-naming by adults (Székely et al., 2005; Székely et al., 2003). Black and white line drawings (rather than coloured pictures) were chosen, as a much larger selection of pictures was available. Also, most of the existing research involving children of similar ages has used black and white line drawings (for some notable exceptions, see Conrad, 1971, 1972; Ford & Silber, 1994).

The pool for each condition was increased up to 16 items from the common size of 6 to 9. Given the high number of trials and the fact that all children completed two memory tasks with the same word-picture pairs as stimuli, larger sets were preferred in order to minimise the repetition of items, and thus make the task less tedious for the children. This decision also made it possible to construct different nine-picture response cards for each trial in the Visual Recognition task (see section 2.3.2.4 below). For each task, 36 stimuli were required to build the nine core trials of 3 to 5 words/pictures for a given condition (three trials of 3 words/pictures, three of 4 words/pictures, three of 5 words/pictures), and another 18 were needed to build three additional trials of 6 words/pictures.
With the word/picture pool for each condition increased to 16, each word or picture appeared as a target two or three times in total over the core 27 trials of three to five items presented in either the Auditory-Verbal or the Visual task. Research with both adults and children suggests that increasing the number of items per condition should nonetheless result in a phonological-similarity effect (i.e., a decrease in words or pictures remembered for phonologically-similar compared to dissimilar conditions; Coltheart, 1993; Gathercole et al., 2001; Gupta et al., 2005; Nimmo & Roodenrys, 2004), particularly in individuals who are resorting to verbal rehearsal (Campoy & Baddeley, 2008; Hanley & Bakopoulou, 2003; Logie et al., 1996). These decisions resulted in the following word-picture sets:

• Practice set: 10 early-acquired⁵ word-picture pairs with dissimilar-sounding words and dissimilar pictures: ball, bed, car, cheese, dog, eye, moon, shoe, tree, and watch.

• Dissimilar set: 16 pairs of dissimilar-sounding words and dissimilar pictures: ant, bench, bridge, cup, door, drum, heart, house, kite, leaf, rope, scarf, shell, tent, train, and wheel.

• Phonologically-similar set: 16 pairs of similar-sounding words containing the /æ/ vowel and dissimilar pictures: bag, bat (animal), can, cat, fan, hand, hat, lamb, lamp, man, map, mask, pan, tag, tap, and van.

• Visually-similar set: 16 pairs of dissimilar-sounding words and similar-looking pictures that correspond to objects with approximate elongated rectangular shapes. Visual similarity was emphasised by illustrating all objects on the diagonal: arm, bone, brush, cane, corn, flag, flute, key, match, nail, pen, rake, saw, spoon, sword, and worm.

⁵ According to the MacArthur-Bates CDI American English norms (Dale & Fenson, 1996), 75% of children spoke the word prior to the age of 30 months. Norms available online at http://www.sci.sdsu.edu/cdi/lexical.
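The stimulus counts given for each condition can be checked with a short sketch. The Python below is illustrative only and is not the procedure used to build the lists in Appendix B: the function name `build_trials`, the fixed seed, and the rule for avoiding a repeated item within a single trial are all assumptions. It draws items without replacement from a 16-item pool, refills the pool only once it is empty, and builds three trials at each length.

```python
import random
from collections import Counter

# The dissimilar set listed above, used here as an example pool.
DISSIMILAR = ["ant", "bench", "bridge", "cup", "door", "drum", "heart", "house",
              "kite", "leaf", "rope", "scarf", "shell", "tent", "train", "wheel"]


def build_trials(pool, lengths, seed=0):
    """Build one trial per entry in `lengths` by cycling through `pool`.

    Items are drawn without replacement; the pool is refilled (in a new
    random order) only when every item has been used once.
    """
    rng = random.Random(seed)
    remaining = []
    trials = []
    for n in lengths:
        trial = []
        while len(trial) < n:
            if not remaining:
                remaining = list(pool)
                rng.shuffle(remaining)
            # Assumption: when a trial straddles a refill, skip any item
            # already in the trial so that no item repeats within a trial.
            idx = next(i for i, item in enumerate(remaining) if item not in trial)
            trial.append(remaining.pop(idx))
        trials.append(trial)
    return trials


core = build_trials(DISSIMILAR, [3, 3, 3, 4, 4, 4, 5, 5, 5])
counts = Counter(item for trial in core for item in trial)
print(sum(len(t) for t in core))      # 36 target slots in the nine core trials
print(sorted(set(counts.values())))   # [2, 3]: each item appears two or three times
```

Because 36 = 2 × 16 + 4, cycling through the pool in this way necessarily gives four items three appearances and the remaining twelve items two, which matches the "two or three times" stated above.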
In practice, in Visual Recognition, the pictures from the phonologically-similar set would ‘sound alike’ only if a child labelled them. On the other hand, in Auditory-Verbal Recall, the words from the visually-similar condition should be equivalent to those in the dissimilar condition given that only spoken words (i.e., no pictures) were presented. This prediction was based on the fact that the visual similarity of the items had been accentuated in the pictures, and was unlikely to be as strong in the Auditory-Verbal task even if a participant were using a memory strategy based on visual imagery.⁶

⁶ The VS trials are in fact neutral with auditory presentation, as there are no pictures and the labels do not sound alike. The only way that a visual-similarity effect could emerge would be if a participant systematically and predominantly relied upon a visual imagery strategy. This seems particularly unlikely given that the task requires verbal serial recall, and that the auditory presentation provides at minimum an acoustic trace of the to-be-remembered list. In addition, the visual similarity of the items is debatable when it comes to imagery produced by the participants. On the other hand, in Visual Recognition, the VS condition cannot be considered equivalent to a control condition as visual similarity (which has been exaggerated by the manipulation of the pictures) is present regardless of strategy at least at the recognition stage.

The three experimental sets were matched in terms of age of acquisition (AoA), frequency, and mean word length (in ms). For AoA, production data from the MacArthur-Bates Communicative Development Inventories (CDI) lexical development norms in American English (Dale & Fenson, 1996) were converted to a four-point scale based on the youngest age at which 75% of the children were reported to produce the word: before 24 months; between 24 and 30 months; after 30 months; not in the database (i.e., presumably spoken beyond infancy). Objective AoA data from a picture-naming study by Morrison, Chappell, and Ellis (1997) were used as complementary evidence to support the ratings from the CDI, in particular when the words did not appear in the inventories. Estimated frequency was obtained from The Educator’s Word Frequency Guide (WFG; Zeno, Ivens, Millard, & Duvvuri, 1995). The WFG is based on a large sample and a broad spectrum of reading materials used in schools and colleges throughout the USA, and provides frequency estimates per grade. For this study, data for grades I to IV were combined to obtain an estimate (U) of the type per million tokens weighted by an index of dispersion (D). Unfortunately, as this is an untagged frequency count, it is not possible to distinguish homographs (e.g., can, the object vs. the modal auxiliary). Despite this limitation, the WFG was the most recent and comprehensive source available for frequency estimates of materials in English written for young children.

Statistical testing using one-way ANOVAs and planned contrasts confirmed that the three word-picture sets did not differ in terms of AoA, word length, or frequency (all omnibus ps > .29). In addition, piloting confirmed that all the words were likely to be part of the vocabularies of children in grades I to IV, and that participants should have no difficulty recognising the pictures. See Appendix A for detailed descriptive statistics about the word-picture pairs.

2.3.2.2 Number and length of trials completed by each child

Prior research has found that groups of participants exhibit weaker phonological-similarity effects in span compared to fixed-list-length procedures (Hitch, Halliday, Dodd et al., 1989). In a traditional span procedure, list length increases until the participant makes a predetermined number of errors at a given length (e.g., two out of three trials not remembered correctly).
On the other hand, most studies using fixed-list lengths have used supra-span lengths resulting in approximately 50% of the items or the trials remembered successfully (e.g., Hayes & Rosner, 1975; Hitch et al., 1991; Hitch, Woodin et al., 1989; Hulme, 1987; Johnston, Rugg et al., 1987), and with all trials completed by all participants. The objective has been to achieve “intermediate levels of performance capable of showing sensitivity to experimental manipulations” (Walker et al., 1994, p. 76).

Other research has found that, at an individual level, participants with lower spans are less likely to exhibit a phonological-similarity effect (Logie et al., 1996). This could be partly related to strategy use (i.e., rehearsing or not), but also to the fact that in a traditional span procedure, individuals with lower spans complete fewer trials overall. Given that there is evidence that inter-item interference can take place over multiple trials (Cowan et al., 1987; Hayes & Rosner, 1975), this could reduce the interference effect (but see also Coltheart, 1993) or simply result in too little data for any differences between conditions to emerge. Hence, it seemed preferable for all children to complete equal numbers of trials.

Given that children of different ages participated in this study, and also allowing for individual differences in memory ability within an age range, a modified fixed-list-length procedure was used. This procedure has occasionally been used in prior research (e.g., Logie et al., 2000). Each condition consisted of nine core trials that increased in length from three to five words or pictures, three trials at each length. For each task, the children completed 27 trials in total, divided into three 9-trial blocks, one block for each condition (dissimilar, PS, and VS). This procedure resulted in equal exposure to the words and pictures between participants, and to the word-picture pairs between Auditory-Verbal Recall and Visual Recognition.
It was also meant to allow for variability in performance between children and between conditions, as well as to ensure that all children could complete some of the trials at a moderate level of difficulty. This second condition was deemed important, as individuals tend to rely on a strategy most when a task is sufficiently challenging to warrant its use, yet not difficult to a point that could make using a strategy potentially futile (e.g., Guttentag, 1995; Schneider, 1999; Winsler & Naglieri, 2003). It may also be necessary for children to experience sufficiently high levels of success for a phonological-similarity effect to emerge (Johnston, Rugg et al., 1987). Hence it is important to come in at an optimal level of difficulty, but this level will depend not only on the demands of the task, but also on an individual’s level of ability on aspects relevant to the specific task (e.g., Newton & Roberts, 2004).

Piloting with children as young as 5 and 6 years of age did not indicate that participants would get discouraged because of the increase in difficulty; on the contrary, many seemed stimulated by the challenge. In addition, floor effects were not expected. Although ceiling effects were deemed unlikely, particularly for the phonologically-similar condition, piloting did reveal that trials of six items in length could be necessary for some children. Consequently, provision was made for children who may have been at ceiling by having them complete additional trials of six words or pictures in length at the end of the experimental protocol.

2.3.2.3 Creating trials of three to six words and pictures

Each condition (dissimilar, PS, and VS) required 12 trials of varying length: 9 core trials of three to five words or pictures, and an additional 3 trials of six words or pictures for children who may have been at ceiling. These 12 trials were created by randomly sampling without replacement until all the word-picture pairs had been used once.
The procedure was then repeated until all trials for one condition had been constructed. Two lists of 12 trials were created for each condition, one list for each task, counterbalanced across participants (see Appendix B). Two lists of four practice trials (two each of two and three words/pictures) were constructed in the same way using word-picture pairs from the practice set.

In order to create the trials for Auditory-Verbal Recall, three exemplars of each word were recorded by an adult female native speaker of Canadian English who spoke using list intonation. Cool Edit software (Syntrillium Software Corporation, 2000) was used to record the words in 16-bit format at a sampling rate of 44100 Hz. These recordings were then edited using a low-pass filter set at 10000 Hz. The best exemplar for each word was chosen in terms of intelligibility and prosody (i.e., naturalness), and also to ensure that the words were of similar loudness overall. Loudness was evaluated using both an objective criterion (RMS, the root mean square in dB, a measure of the mean energy or loudness over time) and perceptual judgements. The individually-spliced words were then strung together to produce trials of 3, 4, 5, or 6 words. Altogether, six lists of 12 trials (three at each length from three to six words) were constructed, one list per task (Auditory-Verbal Recall and Visual Recognition) for each condition (dissimilar, PS, and VS). Each trial began with an alerting signal (i.e., a referee’s whistle), followed 2 seconds later by the first word. Individual words in a trial were interspersed by periods of quiet in order to create a constant 2-second onset-to-onset interval. This presentation rate was chosen as it has often been used in studies using similar paradigms with children (e.g., Al-Namlah et al., 2006; Johnston, Johnson et al., 1987; McNeil & Johnston, 2004; Palmer, 2000c; Steinbrink & Klatte, 2008).
This also ensured consistency between tasks, as a shorter interval did not seem feasible for the visual presentation. Finally, this presentation rate allowed children some time to rehearse if they were inclined to do so. Each trial ended with a recall cue (ding) placed 2 seconds following the onset of the last word. Again, this was done to ensure a parallel time course with the visual presentation as well as to indicate to the children that the trial was finished.

For Visual Recognition, digital black and white line drawings were resized to 5 cm by 4 cm, and delimited by a box marking the borders of the picture. Using the same procedure as described above, the individual pictures were then combined to make up the 12 trials (three at each length of three to six pictures) for each of the six lists (i.e., two lists each for dissimilar, PS, and VS conditions). The task was designed as a slideshow using Microsoft PowerPoint 2003 (Microsoft Corporation, 1987-2003) presented on a tablet computer with a 14.1-inch (36 cm) widescreen. The line drawings appeared on a white background with the timing of the presentation within a trial controlled automatically. Each trial began with an alerting signal (+) followed 1 second later by the first picture. Each successive picture was displayed for 2 seconds before disappearing and being replaced by the next or, in the case of the final picture, by a response screen. The pictures making up a trial were presented in a spatial left to right sequence. This presentation has been used most often in the literature. It also has the advantage of offering a spatial cue which parallels the temporal sequence of auditory presentation and serves as an important cue for order (Cowan et al., 1991). Spacing between the pictures varied depending on the trial length in order to occupy most of the computer screen width.
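The time course of the two presentation formats can be laid out explicitly. The sketch below is one reading of the timing described in this section, not the scripts or slideshow settings used in the study; the function names, parameter names, and event labels are invented for illustration. Times are in seconds from the onset of the alerting signal.

```python
def auditory_timeline(n_words, lead_in=2.0, soa=2.0, cue_delay=2.0):
    """Event onsets for one auditory trial.

    lead_in:   delay from the whistle to the first word (2 s in the text)
    soa:       onset-to-onset interval between words (2 s in the text)
    cue_delay: delay from the last word onset to the recall cue (2 s in the text)
    """
    events = [(0.0, "alert (whistle)")]
    for i in range(n_words):
        events.append((lead_in + i * soa, f"word {i + 1} onset"))
    last_onset = lead_in + (n_words - 1) * soa
    events.append((last_onset + cue_delay, "recall cue"))
    return events


def visual_timeline(n_pictures, lead_in=1.0, display=2.0):
    """Event onsets for one visual trial.

    lead_in: delay from the '+' signal to the first picture (1 s in the text)
    display: time each picture stays on screen (2 s in the text)
    """
    events = [(0.0, "alert (+)")]
    for i in range(n_pictures):
        events.append((lead_in + i * display, f"picture {i + 1} onset"))
    events.append((lead_in + n_pictures * display, "response screen"))
    return events


for t, label in auditory_timeline(3):
    print(f"{t:4.1f} s  {label}")
for t, label in visual_timeline(3):
    print(f"{t:4.1f} s  {label}")
```

Under these assumptions, the interval from the first item onset to the end-of-trial event is the same in both formats (2 seconds per item), and only the lead-in after the alerting signal differs.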
The first picture was superimposed upon the alerting signal, but none of the pictures themselves spatially overlapped (see Appendix C for an example).

2.3.2.4 Presentations and responses

The memory for words and the memory for pictures tasks were designed based on a few critical considerations. First, both tasks had to require memory of all items within a trial (words or pictures) and of order, as these requirements were most likely to induce verbal rehearsal. There is in fact some support for the view that the necessity to remember all items in the correct serial order is most likely to push participants to use a verbal rehearsal strategy (Henry, 1991b; Henson, Hartley, Burgess, Hitch, & Flude, 2003). Second, given the particular focus of this project on verbal rehearsal and on possible individual differences, there had to be some opportunity for subject-wise variability in terms of task approach. As such, one of the memory tasks had to be potentially nonvocal (i.e., not require any talking). In addition, the overall design and the instructions had to avoid pushing the participants towards using verbal mediation if this was not their natural inclination; this was particularly important for the nonvocal task. These considerations were based on the fact that a greater reliance on verbal strategies is assumed to develop with age and experience. As such, any overt talking or emphasis on verbal coding could impact children’s strategy choices.

When considering how to meet these criteria, it became clear that developing a perfectly parallel situation between the verbal and visual tasks was not optimal. In terms of presentation, based on prior research, it was quickly decided that one task would use spoken words whereas the other would use nameable pictures. Letters, digits, and written words were deemed non-optimal given the children’s ages and because they are inherently verbal.
For the responses, verbal recall was chosen for the auditory task, as this has been most widely used in the literature. Requiring verbal recall with auditory presentation also avoided crossing over to another modality. Determining a response for the visual task was more difficult, and in the end recognition was judged to be preferable to recall. Although nonvocal recall has been used in past studies with adults by having them write out their responses (e.g., Logie et al., 2000), this method is not appropriate with young children who may have very different written language abilities given their emerging literacy skills. Also, it would have increased the verbal nature of the task. Resorting to drawing also seemed unrealistic, as these skills may also vary widely among children. In addition, young children are not likely to understand that the quality of the production is irrelevant, and could get caught up in trying to draw well instead of being focused on recalling the items presented.

In the end, serial recognition (i.e., pointing to the correct pictures in the order of presentation) was chosen as the response mode for the Visual task. This procedure actually involves both recognition and recall, as the correct pictures are selected among a larger set, but their order must be reconstituted. Quite a few studies have successfully used some version of serial picture recognition with both children and adults (e.g., Conrad, 1971; Hitch, Halliday, Dodd et al., 1989; Hulme, 1987; Schiano & Watkins, 1981). Requiring visual recognition with visual presentation ensured that the participants could remain in the visual modality if they were predisposed to do so. If some children continued to prefer nonverbal task approaches (i.e., relying essentially on the pictures), this was more likely to be observed with a nonverbal response.
On the other hand, children primarily relying on verbal encoding would be required either to resort to dual encoding or to reconvert their responses visually in order to provide correct responses.

Although recall is generally more difficult than recognition, the response screen in Visual Recognition could result in interference at the time of response, thereby increasing the level of difficulty somewhat. In addition, participants who spontaneously resort to labelling will have to convert their response back to its visual form. Although it is not possible to claim that the two tasks are equally difficult in terms of their processing demands, there is a good likelihood that they are similarly challenging, although for different reasons. This claim can to some extent be tested empirically by looking at the total numbers of words or pictures remembered correctly in each task. However, it is not of serious concern given that the aim of this project is not to contrast recall and recognition per se, but rather to examine how strategy use may be affected by task characteristics.

For Visual Recognition, the response screen consisted of nine pictures separated by a black border and arranged in three rows of three pictures. The pictures were identical to those used in the presentation, except that they were resized to 4.4 cm x 3.5 cm in order to fit within the grid given the dimension of the computer screen (see Appendix C for examples of response screens). Participants marked their responses on the screen of a tablet computer using a stylus pen. In accordance with prior research (Henry et al., 2000), piloting indicated that even the youngest children in the study would be able to do the recognition task using the 3 x 3 layout, at least when the trial length was within their span range. As piloting suggested that some of the older children would successfully remember trials of five and six pictures long, this justified using a response card of nine pictures.
Finally, the 3 x 3 layout seemed to favour picture scanning compared to a 2 x 4 configuration. A constant number of response pictures to choose from maintained consistency across different trial lengths, which seemed beneficial both for the children and for the analyses. Admittedly, this results in a different probability of responding correctly by chance depending on trial length. However, given that correct serial recognition is required, longer lists result in increased probability of order errors, hence likely counterbalancing the possibility of increased correct item selection by chance.

A different response card of nine pictures was created for each trial. Each card contained the targets as well as the number of foils needed to get to 9 pictures (e.g., for a 4-picture trial, 4 targets and 5 foils). The foils were randomly selected from the nontarget pictures in the 16-picture set for the relevant condition (e.g., within the PS set for PS trials). The targets and foils were placed randomly on the card. Subsequently, for each of the two lists for each condition (dissimilar, PS, and VS), all response cards were inspected to ensure that the targets were approximately evenly distributed across all locations over trials, and that the pictures were used a similar number of times (as targets and foils combined). Changes were made when needed. See Appendix B for all trials for the six lists, and the corresponding response cards for Visual Recognition.

2.3.2.5 Pretest

A receptive pretest preceded the memory tasks. The decision to use a receptive task (rather than a naming task) evolved following pilot testing. This change stemmed from a concern that an expressive task could lead children to think that they should name the pictures during Visual Recognition, and hence push them towards using verbal mediation. During piloting, the pretest had consisted of a timed picture-naming task. Using a timed procedure seemed to go against the main objectives of the pretest.
As one would expect, operating in a timed mode pushed children to say the first response that came to mind, without necessarily taking the time to process the picture in detail. This led to many responses which did not quite correspond to the picture or to the expected label (e.g., sink vs. tap when there is no sink illustrated; car vs. van). Numerous changes and modifications of the picture stimuli were made during the piloting process in order to increase correct labelling (i.e., production of the target) and to ensure a high level of name agreement. In spite of these efforts, a few labels continued to be more difficult to illustrate in a way that ensured very high levels of name agreement (e.g., tag, van); nonetheless, most of the children did arrive at the right label with additional prompting or after being provided with semantic clues. These observations also supported the decision to resort to recognition rather than recall, as in this format the children heard only the target labels.

In its final version, the pretest consisted of a picture recognition activity presented on a tablet computer. Its aims were fourfold: first, to verify that the children possessed the relevant receptive vocabulary; second, to familiarise the children with the spoken words and the pictures in the same format as they would encounter them later in each of the memory tasks; third, as a consequence of this familiarisation, to increase the likelihood of obtaining high levels of picture-name agreement, which is an essential condition for the phonological-similarity effect to occur in Visual Recognition (e.g., money, car, and sink would not induce a phonological-similarity effect, whereas tag, van, and tap may); and fourth, to diminish the novelty effect of the tablet computer and give the children the opportunity to practise using the stylus in a response format very similar to that used in Visual Recognition.
The words and the pictures were identical to those used to construct the experimental trials for Auditory-Verbal Recall and Visual Recognition. The response screen had the same general layout as that used for Visual Recognition, except that only five of the nine cells contained pictures (the two cells on each end of the first and last rows, and the middle cell in the array). Twelve response cards were created by randomly selecting among the 58 word-picture pairs from the four sets combined (16 word-picture pairs each from the dissimilar, PS, and VS sets, and 10 from the practice set). In order to round out to a multiple of 5, two pictures from the practice set each appeared on two response cards. The 12 response cards were presented in order from 1 through 12 (with a few exceptions for the first trials which included the demonstration and practice trials), and this sequence was repeated five times until all 58 word-picture pairs had been probed. The target word-picture pair for each successive card was determined randomly. See Appendix D for details regarding the composition and picture layout for the 12 response cards, as well as the presentation order for the word probes.

2.4 Procedure

The following section provides detailed descriptions of the procedures for the serial memory tasks. Throughout the tasks, strategy choice was left up to each child. Although some studies have attempted to control this variable (e.g., by telling children either to remain silent or to label the pictures during presentation), this project aimed at tapping into what individual children were most likely to do given task constraints and individual differences. Hence, whether or not a child chose to repeat words or to label the pictures at presentation was crucial data corresponding to the research questions. The script of the specific instructions provided to participants appears in Appendix E.
2.4.1 Pretest: picture recognition

Pictures and words were presented using a tablet computer and headphones, once again in the form of a PowerPoint (Microsoft Corporation, 1987-2003) slideshow. The children heard one word at a time, inserted in the carrier phrase “point to X”. Although they actually had to do more than point to the correct picture (i.e., make a mark on it with the stylus pen), this carrier phrase seemed most natural, and the children had no problem understanding what to do. The participants were instructed to select the “best picture” to go with the word from a choice of five pictures. They did so by marking their answer on the computer screen using a stylus pen set to make broad strokes that were clearly visible (i.e., like a highlighter pen).

The task began with a demonstration trial followed by two practice trials, which the examiner could repeat until it was clear that the participant understood what was expected. For the experimental trials (i.e., Trial 4 on), if the child was unsure or not responding, the examiner offered to replay the word. If he continued to hesitate, the examiner encouraged him to guess. In the case of an incorrect response, the examiner told the child to try again. If he again chose an incorrect picture, the examiner showed him the correct picture. To summarise, the examiner replayed the auditory recording of a trial (i.e., “point to X”) if the child requested it, if he was hesitant, or if he made an incorrect choice. However, no trial was replayed more than once. In practice, the participants had no difficulty understanding the task, and they very rarely hesitated or requested to have a word repeated. The examiner scored the task online. In addition, each child’s responses were saved.

As expected, the pretest proved to be very easy, with all the participants able to correctly match all of the words they heard to the corresponding pictures.
The very rare errors usually involved children choosing car instead of van, and in all cases they were able to self-correct.

2.4.2 Auditory-Verbal Recall

In Auditory-Verbal Recall, the children listened to word lists presented over headphones via a laptop computer using Windows Media Player (Microsoft Corporation, 2006) at a loud yet comfortable listening level and a 2-second rate (onset to onset). The examiner controlled the presentation of successive trials, ensuring that the participant was ready and attentive before continuing. Trials were presented blocked by condition (dissimilar, PS, and VS), with trial length increasing from three to five words. All children first completed the dissimilar condition, followed either by the PS or the VS condition in counterbalanced order across participants.

The examiner interleaved the instructions within the four demonstration and practice trials of two and three words in length. She was careful not to provide any clues to strategy use that could be detected by the children (e.g., mouthing words), but did complete the task at a slow, but not unnatural, pace. The instructions also avoided any wording suggestive of a verbal strategy. The children were instructed to recall as many words as possible in the order they had heard them. For the second demonstration, the examiner illustrated that they should mark the place of forgotten words by saying “something”. This instruction was intended to help the participants correctly mark the order of as many words as possible within a trial, given that strict scoring requiring that words be recalled in the correct serial position would be used. For the practice trials, the examiner provided feedback regarding the accuracy of responses. She also repeated the practice trials as needed until she judged that the participant understood the task and, in particular, the requirement to recall the words in the correct order.
Before beginning the experimental trials, the examiner notified the children that the memory for words task would become progressively more difficult. She instructed them to simply do their best, and told them that it was all right to guess if they were unsure. She also reminded the children that they had heard all the words before during the pretest. This was done to reassure them and hopefully to help them accurately recognise the words that were now presented in an unnatural list format. The examiner no longer provided any corrective feedback (only general encouragement), and presented each experimental trial only once, although self-corrections were allowed. She also informed the participants at the outset that there would be three parts to the task. In addition, she indicated when the number of words in a trial was about to increase (“Now there will be more”), but did not specifically refer to the trial length. Finally, she marked the transitions to the next condition, indicating that the next part would be just like the preceding one, except with a new set of words. The examiner scored the task online, and also videorecorded it for offline verification of scoring accuracy as well as scoring of observed strategies.

Based on prior research and piloting, a range of trial lengths varying from three to five words was expected to fall within the maximum span level of most children. Nonetheless, nine additional trials of six words (three per condition) were completed by those children who successfully recalled all the words (regardless of order) for at least two of the three trials of five words for at least one condition. Hence, the decision to go on to trials of six words was based on lenient scoring (i.e., words correct in any order).
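The difference between strict and lenient scoring, and the rule for moving on to six-word trials, can be made concrete with a small sketch. It is a hedged illustration rather than the scoring protocol actually applied: the function names and the data layout are invented, and the treatment of responses that are longer or shorter than the list is an assumption. The placeholder "something" is the one children were taught to say for a forgotten word.

```python
def strict_score(presented, response):
    """Number of words recalled in their correct serial position.

    Assumption: positions are compared pairwise; any extra words at the
    end of a response are ignored and missing positions score zero.
    """
    return sum(1 for target, said in zip(presented, response) if target == said)


def lenient_score(presented, response):
    """Number of presented words that appear anywhere in the response."""
    return len(set(presented) & set(response))


def qualifies_for_six(five_word_trials):
    """Apply the ceiling rule to the five-word trials of one task.

    five_word_trials maps a condition name to its three
    (presented, response) pairs. A child qualifies if, for at least one
    condition, all five words were recalled (in any order) on at least
    two of the three trials.
    """
    for trials in five_word_trials.values():
        perfect = sum(1 for p, r in trials if lenient_score(p, r) == len(p))
        if perfect >= 2:
            return True
    return False


presented = ["cat", "map", "fan", "hat"]
response = ["cat", "something", "hat", "fan"]
print(strict_score(presented, response))   # 1: only "cat" is in its position
print(lenient_score(presented, response))  # 3: "cat", "hat" and "fan" were recalled
```

Keeping the two scores separate shows why the placeholder matters: without "something" in second position, the response would shift left and the positional comparison would change.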
Given that the memory tasks were specifically designed to foster strategy ‘discovery’ by using multiple trials of increasing difficulty, it was expected that some children would change their strategies with experience, and possibly improve their memory performance. However, given that some participants would complete earlier what would be for them the more difficult condition (e.g., the PS condition before the VS condition for a child relying on verbal rehearsal), it seemed best to give more rather than fewer children the opportunity to complete the additional longer trials. Hence, all children completed a core set of 27 trials of three to five words for Auditory-Verbal Recall, and those who may have been near ceiling were given an additional 9 trials of six words in length at the very end of the experimental protocol (actually on a different day). This ensured an identical time-course for all children for the entire protocol. It was also meant to avoid children becoming overly fatigued or bored with the task.

2.4.3 Visual Recognition

In all major respects, the procedure for Visual Recognition paralleled that used for Auditory-Verbal Recall. As such, this description focuses on areas of difference. When giving instructions, the examiner was careful to never refer to words, only to pictures. In the hopes of increasing the likelihood that children who labelled the pictures would use the target words, the examiner underscored that the pictures were identical to those seen in the pretest. This decision to make an explicit link to the pretest emerged following piloting, as very few children mentioned that the pictures or the words were the same as those they had seen and heard in the earlier task. The children watched as pictures appeared silently on the screen of a tablet computer. The examiner instructed them to remember the pictures in the same order as they had been presented, and to mark their responses on the screen using a stylus pen.
The stylus was set to function as a highlighter writing in the same colour as the picture background (i.e., white). This left no visible mark on the pictures while the children completed the task, and hence avoided providing them with clues regarding prior responses within a trial. This design choice was based on the desire to make the two tasks as similar as possible, given that in Auditory-Verbal Recall the children were not provided any external cues to check what words they had already said in their response for a given trial. The experimenter explained to the participants that they would be writing in ‘invisible ink’, but that she would make their answers reappear later by changing the colour of their responses. When the response card of nine pictures appeared on the screen, the children marked in sequence each picture they remembered. Self-corrections for either picture choice or for order were permitted.

The examiner again scored the task online and videorecorded the children while they were completing the activity. In addition, each child’s individual file provided a record of his responses. The video later served for the offline verification of scoring accuracy of serial recognition and to score the observed strategies. Just as for Auditory-Verbal Recall, all children completed a core set of 27 trials of three to five pictures for Visual Recognition. Those who were near ceiling (i.e., all pictures correct regardless of order on two or more trials of five pictures for at least one condition) were given an additional nine trials of six pictures in length at the end of the entire experimental protocol. In practice, a child could go on to complete lists of six items for neither memory task, for only one, or for both tasks.

2.4.4 Post-task questions

At the end of each serial memory task, the children were presented with a few post-task questions about their strategy use. The first question simply asked: “What did you do to remember?
Did you have a special trick?” A four-item trial from the dissimilar condition was presented again, as this gave the children the opportunity to demonstrate and describe what they had been doing. In accordance with prior research (e.g., Bray et al., 1999; Winsler & Naglieri, 2003), piloting had indicated that providing an example to accompany the questions increased the likelihood that the participants, particularly the younger ones, would either provide a response or be more precise in their answers. Additional questions focused on awareness of task characteristics and their interaction with task difficulty (e.g., trial length, characteristics of the words or the pictures). Also, following the second memory task, the examiner asked the child whether he had used the same approach for both tasks. The inclusion of these follow-up questions also provided the children with additional time and the opportunity to complete their responses regarding their strategy choice if they wanted to. See Appendix E for the complete post-task questionnaire.

2.4.5 Presentation order

Participants completed either Auditory-Verbal Recall or Visual Recognition first, and each task on a different day. In addition, the children first received the dissimilar block, followed by either the phonologically-similar (PS) or the visually-similar (VS) block, counterbalanced across participants (i.e., presented either second or third). In order to avoid fatigue effects, an intervening task (either part 1 or 2 of the Test of Narrative Language, see below) separated the PS and the VS conditions. All participants received the tasks in one of four possible presentation orders: Auditory-Verbal Recall (AV) then Visual Recognition (Vis), phonologically-similar (PS) condition then visually-similar (VS) condition (AV/PS); Auditory-Verbal then Visual, VS then PS (AV/VS); Visual then Auditory-Verbal, PS then VS (Vis/PS); and Visual then Auditory-Verbal, VS then PS (Vis/VS).

Table 1.
Distribution of Children by Presentation Order for Auditory-Verbal Recall and Visual Recognition

                      Auditory-Verbal Recall     Visual Recognition
Presentation order    Frequency    Percent       Frequency    Percent
AV/PS                    11          26.2            9          21.4
AV/VS                    10          23.8           12          28.6
Vis/PS                   10          23.8           10          23.8
Vis/VS                   11          26.2           11          26.2
Total                    42         100.0           42         100.0

Note: AV/PS, Auditory-Verbal task first, PS block prior to VS block; AV/VS, Auditory-Verbal task first, VS block prior to PS block; Vis/PS, Visual task first, PS block prior to VS block; Vis/VS, Visual task first, VS block prior to PS block.

Half the participants (n = 21) received Auditory-Verbal Recall first, whereas the other half (n = 21) first completed Visual Recognition. Although according to the experimental design each child should have received the PS and the VS conditions in the same order for both tasks, in two cases this did not happen due to experimenter error. As a result, the totals are not exactly the same for the four presentation orders across the two tasks (see Table 1), although in both cases the children are fairly evenly distributed across them.

2.5 Data reduction and task scoring

Carefully-trained research assistants helped to complete the data reduction under the supervision of the author. Each child had been videorecorded while completing the experimental protocol. The recordings for each task were spliced, identified, and reorganised (e.g., subparts from the same task were merged) for later viewing using Ulead VideoStudio 11 (Intervideo Digital Technology Corporation, 2007). Research assistants orthographically transcribed the following tasks using the Systematic Analysis of Language Transcripts program (SALT; J. F. Miller & Iglesias, 2006): responses to the post-task questions for Auditory-Verbal Recall and Visual Recognition, and the three story productions from the Test of Narrative Language (see below). The author reviewed all transcripts for accuracy.
2.5.1 Scoring of the word and picture memory tasks

• Auditory-Verbal Recall. The task was scored online on a word-by-word basis. A second coder reviewed the scoring from the video for all trials and all participants. A third judge resolved any discrepancies.

• Visual Recognition. First, each child’s saved file was modified by changing the stylus marks to a visible colour. A coder then compared the online scoring to the stylus marks and noted any discrepancies. The stylus marks did not always match online scoring, either because the child self-corrected (i.e., without being able to ‘erase’ prior responses), or for some reason the responses did not register well on the tablet computer (i.e., child holding stylus at an angle or not putting enough pressure on stylus). The longer trials were also more challenging to score online, particularly for order. The video recordings were used to verify any trials where: i) the marks on the pictures did not match online scoring; ii) the examiner had indicated being unsure at the time of initial scoring; iii) the examiner had indicated that the child self-corrected (changing either the pictures selected or their sequence). As an additional precaution, the second coder verified all trials for a given child if 3 or more of the 27 trials of three to five pictures in length were flagged as needing to be checked, as well as all trials of six pictures in length (completed by only 23 participants). Because the child marked his responses on the screen of the tablet computer which lay flat in front of him, it was not possible to actually see the pictures on the video. However, the coder was able to watch the child and use a hardcopy of each trial’s response card to follow along and judge which pictures were being marked and in what order based on their relative positions on the screen.
In addition, many children talked while they were responding, either labelling the pictures or explicitly telling the examiner how they wished to change their responses. A third judge decided how to score any discrepancies. These verifications actually supported the online scoring in the vast majority of cases. The item-by-item accuracy of online scoring based on strict scoring for order was estimated at 99% for trials of three to five pictures, and 97% for the longer six-picture trials.

• Items correct, trials correct, and span. For both Auditory-Verbal Recall and Visual Recognition, the numbers of items (words or pictures) correct and the numbers of trials correct were calculated by condition (dissimilar, PS, and VS) and overall. For each task, each child’s highest span was then determined by condition (dissimilar, PS, and VS lists) and overall based on a criterion of at least two of three trials correct (i.e., all words or pictures correct) at the longest trial length. For all measures, both lenient scoring (based on items only) and strict scoring (based on items in the correct serial position) were used. The author later verified the accuracy of all scoring. Additional details regarding specific scoring decisions are presented in Appendix F.

2.5.1.1 Variables for the memory tasks

Items or trials correct were based on a strict scoring criterion that also required correct order. Lenient scoring was used only to establish the children’s highest lenient span level for trials of three to five words or pictures in length in order to determine whether they would go on to complete trials of six words or pictures. Each variable was compiled separately for Auditory-Verbal Recall and Visual Recognition:

• Items correct by condition (trials of three to five items): Number of words or pictures remembered in the correct serial position, by condition (dissimilar, PS, and VS).
For each condition, the maximum number of words or pictures correct is 36 (three trials at each length of 3, 4, and 5 items).

• Total items correct (trials of three to five items): Total number of words or pictures remembered in the correct serial position for all three conditions combined (max. 108).

• Items correct by condition (trials of three to six items): Number of words or pictures remembered in the correct serial position, by condition (dissimilar, PS, and VS). For each condition, the maximum number of words or pictures correct is 54 (three trials at each length of 3, 4, 5, and 6 items). This score was available only for children who completed lists of six items in length.

• Size of the phonological-similarity effect: Score corresponding to the difference in total words or pictures remembered between the dissimilar condition and the phonologically-similar (PS) condition. A positive difference score indicates that there is an advantage for dissimilar trials, which is the expected direction if a phonological-similarity effect is present.

• Size of the visual-similarity effect: Score corresponding to the difference in total words or pictures remembered between the dissimilar condition and the visually-similar (VS) condition. A positive difference score indicates that there is an advantage for dissimilar trials, which is the expected direction if a visual-similarity effect is present.

• Highest lenient span level (trials of three to five items): Highest trial length where the child remembered all the words or pictures regardless of order for at least two of the three trials across all three conditions. Hence, it was sufficient for the highest span to have been achieved for only one condition.

• Highest strict span (trials of three to five items): Highest trial length where the child remembered all the words or pictures in the correct order for at least two of the three trials across all three conditions.
Hence, it was sufficient for the highest span to have been achieved for only one condition.

2.5.2 Observational data

In order to achieve high levels of scoring reliability, the author and a second coder viewed the videos of each child and together judged trial-by-trial whether there was sufficient evidence to conclude that one or more strategies were used during either the presentation or the response phase. The general rule was to err on the side of caution. Hence, a behaviour was attributed only if both coders were convinced there was sufficient evidence to do so based on what they observed. Overt behaviours included talking aloud, whispering, and mouthing.

Coding distinguished between single-item labelling (henceforth labelling) and rehearsal. The coders attributed single-item labelling if they agreed that they observed a child overtly naming one or more items. On the other hand, they attributed rehearsal if they observed either item repetition (e.g., cup, cup), naming of items in a block (e.g., cup, shell, bench, after the third item was presented), repetition of multiple items (e.g., cup, cup, cup; shell, shell, shell; bench, bench, bench…), or cumulative rehearsal (e.g., cup; cup, shell; cup, shell, bench…). As it did not seem realistic to reliably distinguish between these different types of rehearsal, they were coded into a single category. Generally, both coders had to be convinced that at least one of the words produced was part of the stimulus pool. This decision avoided crediting children with labelling or rehearsal if they were simply place-marking (e.g., da, da, da, da) or using another strategy such as counting. One coder then calculated the total number of trials where labelling and rehearsal were observed for each condition (dissimilar, PS, and VS), and the other coder verified all totals.

Behaviours during the presentation or the response phases were considered.
During presentation of the words or the pictures, the children could opt to say the words or name the pictures one at a time, or to use a rehearsal strategy. In addition, children occasionally rehearsed while they responded, usually to find additional items in a series or to self-correct for order. This manifested itself as stopping and restarting or repeating a section of the response (e.g., cup, shell {PAUSE} cup, shell, bench). For Visual Recognition only, it was possible to observe a labelling strategy in the response as well, as children often named the pictures as they marked their answers on the screen. On the other hand, given that Auditory-Verbal Recall required a full verbal response, it was not possible for a labelling strategy to operate in the response phase for this task.

The coders judged each portion of the trial (i.e., presentation and response) separately for evidence of either labelling or rehearsal. A behaviour was credited only once per trial. The coders credited a strategy observed during the response only if they had not already done so for the encoding portion of the trial, but both labelling and rehearsal could be credited for the same trial. Altogether, for each task, a child could be credited once for rehearsal and once for labelling for each of the 27 core trials. In practice, although Visual Recognition was by design a task that could be done in silence, it actually offered the opportunity to use either rehearsal or labelling strategies both at the time of presentation and of response. As a result, the children were much more likely to be credited with both strategies in Visual Recognition than in Auditory-Verbal Recall.

Any child who was observed to rehearse three or more times within a condition (dissimilar, PS, or VS) was placed into the rehearsal category.
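The classification rule just stated can be expressed as a small Python sketch. The function name, the dictionary layout, and the example counts are hypothetical; the only element taken from the text is the threshold of three observed trials within a single condition.

```python
from typing import Mapping


def classified_as_rehearsing(observed_trials: Mapping[str, int], threshold: int = 3) -> bool:
    """Return True if rehearsal was observed on at least `threshold` trials
    within any single condition (each condition has nine core trials)."""
    return any(count >= threshold for count in observed_trials.values())


# Hypothetical children. Both were observed to rehearse on three trials in
# total, but only the first concentrates them within one condition.
child_a = {"dissimilar": 0, "PS": 3, "VS": 0}
child_b = {"dissimilar": 1, "PS": 1, "VS": 1}

a_rehearses = classified_as_rehearsing(child_a)
b_rehearses = classified_as_rehearsing(child_b)
```

Counting within a condition rather than over all 27 trials is what separates the two hypothetical children: the same total yields a different classification depending on how the observations are distributed.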
The specific criterion of three out of nine trials per condition was based on the logic that, for each child, at least one trial length was assumed to fall within a moderate level of difficulty, and hence to be conducive to strategy use. Some children may not have begun using a strategy until the trial lengths increased, whereas others may have stopped doing so as the trials became longer. Hence, it did not seem wise to use a stricter criterion. Then again, it seemed imprudent to accept only one or two instances as providing a reliable indicator that rehearsal was taking place. The coding scheme required both consistency and density, as it was not sufficient for a child to have been observed to rehearse three times over all 27 trials (all conditions combined). The decision to include density in the criterion was consistent with all scoring decisions for observed and reported strategies, which erred on the side of conservatism, acknowledging both measurement error and practical significance. In the end, either coding scheme would have led to very similar results, with only two children having been observed to rehearse three times or more in total (over 27 trials) for a given task but not meeting the stricter criterion, and neither of these children being observed to rehearse more than four times in total.

To reiterate, any child who was observed to rehearse for at least three trials (of nine) within a condition was deemed to have been rehearsing. Meeting this criterion in one condition (dissimilar, PS, or VS) was sufficient for a child to meet the minimum requirement to be considered to have rehearsed during the memory task. For those children who completed trials of six words or pictures, if they were observed to rehearse three or more times over the nine longer trials (dissimilar, PS, and VS combined) they were placed into the rehearsal category for this trial length.
Hence, these children could have been observed to rehearse consistently either only for trials of three to five items in length, only for trials of six items in length, for both cases, or for neither. The same procedure was used to determine whether children were overtly labelling.

2.5.2.1 Variables for observed strategies

Using the observational data, each of the following variables was compiled separately for Auditory-Verbal Recall and Visual Recognition:

• Observed occurrences of overt labelling, by condition (trials of three to five items): Number of trials (max. 9) where coders observed a child to use single-item labelling, for each condition (dissimilar, PS, VS).

• Observed occurrences of overt rehearsal, by condition (trials of three to five items): Number of trials (max. 9) where coders observed a child to use a form of rehearsal for each condition (dissimilar, PS, VS).

• Observed occurrences of overt labelling (trials of six items only): Number of trials of six items in length (max. 9) where coders observed a child to use single-item labelling for all conditions combined (dissimilar, PS, and VS). This data was available only for children who completed trials of six words or pictures in length.

• Observed occurrences of overt rehearsal (trials of six items only): Number of trials of six items in length (max. 9) where coders observed a child to use a form of rehearsal for all conditions combined (dissimilar, PS, and VS). This data was available only for children who completed trials of six words or pictures in length.

• Classification of participants as overtly labelling or not (trials of three to five items): Any child who was observed to label three or more times within one condition (dissimilar, PS, or VS) for trials of three to five items in length was placed into the overtly labelling category for trials of this length. One condition was sufficient to meet the criterion for the task in question.
• Classification of participants as overtly rehearsing or not (trials of three to five items): Any child who was observed to rehearse three or more times within one condition (dissimilar, PS, or VS) for trials of three to five items in length was placed into the overtly rehearsing category for trials of this length. One condition was sufficient to meet the criterion for the task in question.

• Classification of participants as overtly labelling or not (trials of six items only): Any child who was observed to label three or more times across all conditions (dissimilar, PS, and VS) for the longer trials of six items in length was placed into the overtly labelling category for this trial length. This data was available only for some children.

• Classification of participants as overtly rehearsing or not (trials of six items only): Any child who was observed to rehearse three or more times across all conditions (dissimilar, PS, and VS) for the longer trials of six items in length was placed into the overtly rehearsing category for this trial length. This data was available only for some children.

2.5.3 Self-report data

A coder used the transcripts of responses to post-task questions to code for mentions of single-item labelling, rehearsal (i.e., item repetition, chunking, cumulative rehearsal), and other strategies (e.g., counting, visual imagery). The author later verified the scoring. Discrepancies were very rare, with only 2 disagreements out of 50 times a strategy was coded as being reported for Auditory-Verbal Recall, and only 1 disagreement out of 51 times a strategy was coded as being reported for Visual Recognition, which corresponds to 96% and 98% agreement respectively.

The children’s responses to the post-task questions were coded in order to reflect all the strategies they reported using. Precautions were taken not to overinterpret, and only explicit references to strategies or unambiguous demonstrations were counted.
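The agreement figures quoted above for the self-report coding follow directly from the disagreement counts. The short Python check below is a sketch with a hypothetical helper name; the counts (2 of 50 and 1 of 51) are the ones reported in the text.

```python
def percent_agreement(coded_instances: int, disagreements: int) -> float:
    """Percentage of coded instances on which the two coders agreed."""
    return 100 * (coded_instances - disagreements) / coded_instances


# Counts reported for the self-report coding of each task.
auditory_verbal = percent_agreement(coded_instances=50, disagreements=2)
visual_recognition = percent_agreement(coded_instances=51, disagreements=1)

# Rounded to whole percentages, as in the text.
rounded = (round(auditory_verbal), round(visual_recognition))
```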
A given child could receive credit for more than one strategy. For example, a child who responded after Visual Recognition “I used my fingers to go like ‘one, two, three, four’. Then I also said words out loud in a little whisper” was credited for both counting and single-item labelling. Initial coding distinguished between the following forms of rehearsal: repetition of items (e.g., “cup, cup” or “cup, cup, cup; shell, shell,…”), chunking (e.g., “cup, shell, bench; cup, shell, bench; house, house”), or cumulative rehearsal (e.g., cup; cup, shell; cup, shell, bench…). In practice, very few children reported using either item repetition or chunking, and all these responses were recoded into a single category. Also, the few responses (three in total) that were suggestive of semantic elaboration (e.g., “I made a funny sentence… ‘a bench by a train with an ant by a house’”) were added to labelling as these strategies were judged to be more similar than different.

The children’s responses were then recoded in order to place each child into a single reported strategy category. In order to be able to compare across the various indicators of rehearsal, precedence was granted to verbal strategies over others, and to rehearsal over labelling. Hence, all the children who reported having used a rehearsal strategy were placed in the rehearsal category. Among the remaining children, those who reported having used single-item labelling were placed in the labelling category. Finally, any of the remaining children who reported a strategy other than rehearsal or labelling were placed in the other category.

2.5.3.1 Variables for reported strategies

Using the self-report data, each of the following variables was compiled separately for Auditory-Verbal Recall and Visual Recognition:

• Reporting rehearsal: Number of children who reported having used a rehearsal strategy either alone or in combination with labelling or other strategies.
• Reporting labelling: Number of children who did not report rehearsal, but did report having used single-item labelling either alone or in combination with other strategies.

• Reporting other strategies: Number of children who did not report either rehearsal or labelling, but did report having used another strategy (e.g., counting, imagery).

2.5.4 Test of Narrative Language

The Test of Narrative Language (TNL) is a standardised test that assesses both narrative comprehension and narrative production. It was chosen as an ecologically-valid measure of language for children in this age range and because of its strong psychometric properties (Spaulding, Plante, & Farinella, 2006). The three receptive subtests combined produce the Narrative Comprehension score whereas the three expressive subtests combined produce the Oral Narration score; both have a mean of 10, and a standard deviation of 3. The composite Narrative Language Ability Index (NLAI) has a mean of 100, and a standard deviation of 15. The children were videorecorded while completing the TNL in order to allow for verification of online scoring of the comprehension subtests as well as scoring of the production subtests. The examiner followed standardised procedures for administering and scoring. A second coder independently scored the receptive subtests from the videos according to the standardised procedure. Any disagreements in scoring were resolved by discussion. For the retell, a SALT (J. F. Miller & Iglesias, 2006) word list program was built and used to automatically identify all the target words. The other expressive subtests were scored and verified by the author from the transcripts.

2.5.5 Test of Nonverbal Intelligence

The Test of Nonverbal Intelligence (TONI-3) is designed to provide a measure of abstract reasoning and problem solving in a format that requires no language in the administration or the responses. The examiner followed standardised procedures for administering and scoring.
Raw scores were converted to standardised quotients (M = 100, SD = 15).

2.5.6 Rapid Automatic Naming of Colours and Animals

The Rapid Automatic Naming of Colours and Animals (RAN) activity, adapted from Catts (1993), was used to provide a measure of language production and fluency. It was built using the names of familiar animals and basic colour terms. Although on their own, these targets should not involve complex planning and execution processes, the fact that the children tried to name them as fast as they could increased the planning and production demands. Like verbal rehearsal, rapid automatic naming requires word retrieval and production. It does not, however, involve a memory load, as the stimuli are accessible at all times.

The children completed three rapid naming tasks, the first (RAN 1) with four colours (green, blue, red, yellow), the second (RAN 2) with four animals (cow, dog, horse, pig), and the last (RAN 3) with both colours and animals randomly changing. For each task, following a six-item demonstration, the participants completed a six-item untimed practice trial (Demo/practice 1 to 3). They then completed all three timed 24-item experimental tasks (RAN 1, 2, and 3) in sequence. The picture stimuli were presented on standard letter-size pages (21.6 cm x 27.9 cm), one for each demonstration/practice and experimental trial. The experimenter measured the total time taken to name all 24 items (either colours, animals, or colours and animals) using a stopwatch and noted any uncorrected errors. In addition, the videorecording served for verification of the online scoring. A second coder used the video to independently time all three RAN tasks and to count uncorrected errors. Any discrepancies in timing were checked a third time. The variable used was time (in seconds) to complete RAN 3, the subtest where both colours and animals randomly change. The RAN plates are presented in Appendix G.
2.6 Reliability of scoring

Additional measures of scoring reliability were completed for the following tasks:

• TNL: Point-by-point intercoder agreement based on a randomly selected sample of 8 of the 42 children reached 99% for the receptive subtests and 96% for the expressive subtests.

• Serial memory for words: A second coder independently re-scored Auditory-Verbal Recall from the video for 8 of the 42 children. Mean point-by-point intercoder agreement was high at 98%.

• Observational data: One of the two coders independently re-scored the observed strategies for 8 of the 42 children. Based on a trial-by-trial comparison, agreement with the two-coder scoring reached 86% for labelling and 89% for rehearsal for Auditory-Verbal Recall, and 93% for labelling and 95% for rehearsal for Visual Recognition. Given that these comparisons were between single-coder and two-coder scoring, they are most likely an underestimation of the reliability of the two-judge scoring.

3. RESULTS

This study includes data for 42 children from grades I to IV. Table 2 presents demographic data as well as the children’s scores on the language and cognitive tests. The participants obtained scores in the normal to high range on the Test of Narrative Language (TNL). As expected, given the broad age range of participants, times to complete the Rapid Automatic Naming of Colours and Animals (RAN) varied considerably. However, all children completed the RAN without too much difficulty: uncorrected errors were very rare (12 in total), and no child made more than two errors without self-correcting. Scores also varied considerably on the Test of Nonverbal Intelligence (TONI-3), with some children (n = 7) obtaining standard scores of at least one standard deviation below the mean (lowest score of 81, −1.27 SD), and others achieving scores of at least one standard deviation above the mean (n = 12; highest score of 138, +2.53 SD).
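The standard-deviation offsets quoted for the lowest and highest TONI-3 quotients can be checked against the normative mean of 100 and standard deviation of 15. The snippet below is a minimal sketch with a hypothetical helper name; it uses the normative values rather than the sample mean and standard deviation.

```python
def sd_offset(quotient: float, norm_mean: float = 100.0, norm_sd: float = 15.0) -> float:
    """Distance of a standardised quotient from the normative mean, in SD units."""
    return (quotient - norm_mean) / norm_sd


# Lowest and highest TONI-3 quotients reported for the sample.
lowest = round(sd_offset(81), 2)
highest = round(sd_offset(138), 2)
```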
However, none of these children had any history of developmental or learning difficulties according to both teacher and parent reports. The low scores on the measure of nonverbal cognition may in part be attributable to the fact that some of the children came from families with lower socio-economic status; in particular, some of the younger children appeared to be unfamiliar with this type of task or testing situation. Hence, although their levels of ability may have varied, there is every reason to think that this sample consists of typical children.

Table 2. Demographic Data and Scores on Tests of Language and Cognition

                                   M       SD     Min    Max
Age (mos)                        102.0    14.4     80    128
Maternal education (yrs)          14.3     2.2     12     20
TONI-3 quotient                  102.3    14.8     81    138
TNL, Narrative Comprehension      12.8     2.2      9     18
TNL, Oral Narration               12.1     2.3      7     16
TNL, NLAI                        114.6    11.1     91    136
RAN (s)                           39.9    11.3     25     75

Note. Maternal education corresponds to number of years of schooling. TONI-3 = Test of Nonverbal Intelligence 3 (L. Brown, Sherbenou, & Johnsen, 1997); mean quotient 100, standard deviation 15. TNL = Test of Narrative Language (Gillam & Pearson, 2004); Oral Narration and Narrative Comprehension, mean standard score 10, standard deviation 3; NLAI = Narrative Language Ability Index, mean standard score 100, standard deviation 15. RAN, Rapid Automatic Naming of Colours and Animals; time to complete in seconds. N = 42.

This project aims to answer two questions, and consequently this chapter consists of two parts. The first question asks under what conditions or task constraints a given child is using verbal rehearsal to remember words and pictures. The answer may depend on the way rehearsal is measured; therefore, multiple indicators were used. As the focus here is on task-related variables, analyses included the entire sample of children.
The second question asks whether individual differences predict whether a child will be more or less likely to use verbal rehearsal while completing these memory tasks. Here, the focus is on child-related variables, more specifically grade (or age), and measures of language and cognitive abilities. On occasion, in the interest of efficiency and clarity, some analyses exploring possible age effects were included in the first part of the chapter. Detailed raw data can be found in Appendix H, whereas subject-wise data for the variables used as indicators of labelling and rehearsal appear in Appendix I.

3.1 Indicators of rehearsal in immediate serial memory for words and pictures

This section describes the results for the various indicators of rehearsal, namely the presence of phonological-similarity and visual-similarity effects at the group and individual levels, observational data regarding labelling and rehearsal, and self-reported strategies. Each of the memory tasks is considered separately.

3.1.1 Phonological-similarity and visual-similarity effects

3.1.1.1 Auditory-Verbal Recall

In Auditory-Verbal Recall, children heard lists of words which they had to repeat in the same order. All children completed 27 trials varying in length between three and five words, three blocks of 9 trials each in the dissimilar, phonologically-similar (PS), and visually-similar (VS) conditions. For each condition, the maximum number of words correct was 36 (three trials at each length of 3, 4, and 5 words), and the highest possible span was 5 words. As a group, using strict scoring (i.e., words recalled in the correct serial position), the children recalled on average between 19.4 and 26.7 words depending on the condition, and 71.2 of 108 words (66%) in total (see Table 3).
This translated into a mean highest strict span of 4.14 words: 1 child with a highest span of 2 words,⁷ 8 children with spans of 3 words, 17 children with spans of 4 words, and 16 children with spans of 5 words.

⁷ One child (Case #15) did not meet the two out of three trials correct criterion for any condition, and was thus assigned a highest span of 2 words.

Table 3. Mean Words Correct, Trials Correct, and Highest Span, by Condition and by Trial Length, Auditory-Verbal Recall

                                   Min    Max    Mean     SD
Trials of three to five words
  Words correct, by condition
    Dissimilar                     9      36     25.02    7.03
    PS                             4      32     19.45    6.38
    VS                             7      36     26.74    7.76
  Highest span                     2      5      4.14     0.81
Trials of three to six words
  Trials correct, by condition
    Dissimilar                     1      11     5.57     2.45
    PS                             0      7      3.26     1.67
    VS                             1      11     5.79     2.53
  Highest span                     2      6      4.21     0.92

Note: Words, trials, and span are all based on strict scoring which credited words only when they were remembered in the correct serial position. The highest span was obtained in at least one condition.

Sixteen participants who performed at or near ceiling (i.e., those who obtained highest lenient spans of 5 words based on words recalled correctly in any order) completed nine additional trials of six words in length, three each for the dissimilar, PS, and VS conditions. Only 3 of these 16 children improved their highest strict spans to 6 words. When trials of three to six words in length were considered, children succeeded in correctly recalling on average between 3.3 and 5.8 trials depending on the condition (max. 12),⁸ and mean span increased to 4.21 words.

⁸ Comparisons which take into account trials of three to six words in length are based on trials correct rather than words correct. Only some children went on to complete the longer trials of six words; this decision was based on whether or not they were able to recall all words correctly (irrespective of order) for two of three trials of five words in at least one condition.
Hence, if they did not go on to trials of six words, it is a safe assumption that they would not have succeeded on any trials of the longer length based on strict scoring, where they would have had to recall not only all the words but also their correct order. The same assumption cannot be made for words correct.

This preliminary examination of the data indicated variability in terms of performance in Auditory-Verbal Recall, and most importantly that the task was at a reasonable level of difficulty to allow both a degree of success and some errors in all children. Unless otherwise specified, the analyses included results for trials of three to five words only, as this afforded maximal comparability across children. In all cases, strict scoring was applied, with words scored as correct only if they were recalled in the correct serial position within the trial. Finally, an alpha-level of .05 was used for all statistical tests, with adjustments made for multiple comparisons where appropriate.

Phonological-similarity effect

The primary analysis focuses on whether children showed a significant phonological-similarity effect when they were asked to recall lists of words in the order they had heard them. This would be reflected by better recall in the dissimilar compared to the PS condition.

Group-level analyses

In Auditory-Verbal Recall, as a group, the children recalled slightly more words in the correct serial position in the VS condition (M = 26.7) compared to the dissimilar (M = 25.0) condition, but fewer words in the PS condition (M = 19.5) compared to either of the other conditions (see Table 3 and Figure 1).

Figure 1. Mean number of words remembered correctly (95% CI) in Auditory-Verbal Recall, by condition (x-axis: Condition; y-axis: Words correct)

Group differences in the number of words recalled between conditions were explored with a repeated-measures ANOVA using two planned contrasts (dissimilar vs.
PS, and dissimilar vs. VS).⁹ Taken as a group, the participants showed a large and significant phonological-similarity effect, with a mean of 5.6 (SD = 4.4) additional words recalled in the correct position in the dissimilar condition compared to the PS condition, F(1, 41) = 68.9, p < .001, d = 1.28. Examination of individual patterns revealed that most children had superior recall (ranging from 1 to 14 words) for the dissimilar compared to the PS condition (i.e., positive difference scores; see Figure 2). Six children did not follow this pattern and instead recalled more words for the PS block, as is indicated by the negative difference scores (between −5 and −1; Cases #4, 17, 18, 26, 29, and 42).

⁹ The focus of this analysis is not the omnibus ANOVA but rather the results of the planned contrasts. Some statisticians (e.g., Karpinski, 2006) argue that it is not necessary to adjust the p-value when the focus is on a few (a − 1, where a is the number of groups) planned contrasts based on hypotheses. Here, two contrasts were planned, one for the phonological-similarity effect (i.e., comparing the dissimilar and PS conditions), and one for the visual-similarity effect (i.e., comparing the dissimilar and VS conditions). This meets the a − 1 criterion, as there are 3 groups; therefore the critical p-values were set at .05 for each contrast. For repeated measures contrasts, the crucial assumption is symmetry of the difference scores. This assumption was met for both contrasts in preliminary analyses. For all statistical analyses, the data were examined for symmetry, outliers, and homogeneity of variances in order to verify that assumptions were met. Any data point with a standardised residual value greater than |2.58| was considered an outlier, which corresponds to a 99% confidence interval and is a generally accepted cut-off point for small samples (Field, 2005; Tabachnick & Fidell, 2001). When necessary, follow-up sensitivity analyses were performed.
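For a planned contrast with one degree of freedom on repeated measures, the F statistic equals the square of the paired t statistic computed on the difference scores. The minimal Python sketch below recomputes the dissimilar vs. PS contrast from the rounded summary values reported above (mean difference of 5.6 words, SD = 4.4, N = 42). It assumes that d was obtained as the mean difference divided by the standard deviation of the difference scores; the function name is illustrative and is not part of the original analysis. Because the inputs are rounded, the output only approximates the reported F = 68.9 and d = 1.28.

```python
import math

def paired_contrast(mean_diff, sd_diff, n):
    """Summarise a one-df within-subject contrast from difference-score statistics."""
    se = sd_diff / math.sqrt(n)    # standard error of the mean difference
    t = mean_diff / se             # paired t statistic with n - 1 degrees of freedom
    return {
        "se": se,
        "t": t,
        "F": t ** 2,               # F(1, n - 1) equals t squared for a one-df contrast
        "df": (1, n - 1),
        "d": mean_diff / sd_diff,  # assumed definition: mean difference / SD of differences
    }

# Rounded values reported for the dissimilar vs. PS contrast in Auditory-Verbal Recall
result = paired_contrast(mean_diff=5.6, sd_diff=4.4, n=42)
print(result)
```

The same function can be applied to the dissimilar vs. VS contrast by passing the corresponding mean difference and standard deviation.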
Figure 2. Difference score in number of correct words, dissimilar vs. PS conditions, Auditory-Verbal Recall (x-axis: Difference in words, Dissimilar − PS; y-axis: Number of children)
Note: Max. 36 for each condition.

Similar word recall for the dissimilar and VS conditions was predicted for this task, which involves auditory presentation and requires verbal recall. Under these circumstances, one would anticipate dissimilar and VS conditions to be equivalent as there were no pictures involved. Unexpectedly, recall was significantly lower in the dissimilar condition than the VS condition (see Table 3 and Figure 1), with a mean difference of −1.7 words (SD = 4.5), F(1, 41) = 6.0, p = .019, d = 0.38.¹⁰ This suggests that a small to moderate practice effect was occurring, as the dissimilar block was always presented first, followed by the PS and the VS conditions (the latter two presented in counterbalanced order across participants). Most of the children showed this improvement for the VS condition, with 29 of 42 participants obtaining negative difference scores (i.e., recalling between 1 and 11 additional words in the VS block compared to the dissimilar block; see Figure 3).¹¹ There were nonetheless 9 children who remembered fewer correct words (between 1 and 11) in the VS compared to the dissimilar condition. These children (Cases #9, 11, 14, 15, 22, 25, 28, 32, and 40) were not the same ones who stood out in the prior analysis.

¹⁰ The difference was calculated this way (dissimilar − VS) to parallel the analyses in the Visual Recognition task, where visual similarity may have impaired performance. Here, this resulted in a negative mean difference score because the effect was in the opposite direction.
¹¹ There was 1 outlier in the distribution of difference scores, with that child showing a large drop in performance for the VS condition compared to the dissimilar condition (difference score of +11), which may simply correspond to a fatigue effect. This child (Case #25) also completed the additional trials of six words in length, and the difference between the dissimilar and VS conditions was largely attenuated. A sensitivity analysis resulted in the same conclusion that a moderate practice effect was advantaging word recall in the VS compared to the dissimilar condition, F(1, 40) = 9.8, p = .003, d = 0.49.

Figure 3. Difference score in number of correct words, dissimilar vs. VS conditions, Auditory-Verbal Recall (x-axis: Difference in words, Dissimilar − VS; y-axis: Number of children)
Note: Max. 36 for each condition.

Given this apparent order effect resulting from the dissimilar condition always coming first and leading to better recall for the VS condition, it seemed prudent to verify whether task or condition order may have otherwise impacted children’s memory for words. Each participant completed the memory tasks according to one of four presentation orders, resulting from the crossing of task order (Auditory-Verbal or Visual first) and condition order (dissimilar followed by either PS then VS, or by VS then PS conditions). The same trend in terms of words recalled in the three conditions (dissimilar, PS, and VS) appeared for each presentation order, with slightly higher recall for the VS compared to the dissimilar condition, and many fewer words recalled in the PS condition compared to the two other conditions (see Figure 14, Appendix J).
This was confirmed by statistical testing using a one-way ANOVA and planned contrasts, as the size of the phonological-similarity effect (i.e., the difference score in words recalled between the dissimilar and PS conditions) did not vary by presentation order (see Appendix J for detailed analyses). These results indicate that presentation order did not impact the likelihood of finding a phonological-similarity effect in Auditory-Verbal Recall at the group level.

Subject-wise analyses

The main focus of this study is to determine the strategies of individual children. As mentioned above, most of the children recalled more words in the dissimilar compared to the PS condition. However, it did not seem reasonable to consider that any difference between conditions was meaningful at the individual level without some further justification based on the distribution of the data. In order to make a decision at the level of each child regarding the presence of a significant phonological-similarity effect, expected variability was estimated based on the standard error of the mean difference (SEMdiff) in words recalled between dissimilar and PS conditions at the group level.¹² This value was used to construct a confidence interval around the mean difference in words recalled between conditions for individual children.¹³ Any child for whom the confidence interval around this difference score had a lower boundary greater than 0 was judged to present with a significant phonological-similarity effect. Proceeding in this manner, the error estimate for the phonological-similarity effect for Auditory-Verbal Recall is 1.36 words. Hence, any child who recalled at least 2 additional words for the dissimilar block compared to the PS block met the standard for exhibiting a significant phonological-similarity effect. According to this criterion, 35 children presented with a phonological-similarity effect in Auditory-Verbal Recall, whereas 7 did not.
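The individual-level criterion reduces to simple arithmetic: the group-level standard error of the mean difference is multiplied by the critical t value, and the threshold is the smallest whole-number difference that exceeds this product. A minimal Python sketch follows, using the values reported for this contrast in the accompanying footnote (SEMdiff = 0.671, critical t = 2.021). The function names are illustrative and are not part of the original analysis.

```python
import math

def ci_half_width(sem_diff, t_crit):
    """Half-width of the confidence interval placed around a difference score."""
    return sem_diff * t_crit

def min_whole_difference(half_width):
    """Smallest whole-number difference strictly greater than the half-width,
    so that the lower bound (difference - half_width) lies above 0."""
    return math.floor(half_width) + 1

def shows_effect(diff, threshold):
    """True when a child's difference score (dissimilar minus PS) meets the threshold."""
    return diff >= threshold

half_width = ci_half_width(sem_diff=0.671, t_crit=2.021)
threshold = min_whole_difference(half_width)
print(round(half_width, 2), threshold)  # 1.36 2
```

Scores are whole numbers of words, which is why a half-width of 1.36 translates into a criterion of at least 2 additional words.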
Ceiling effects were a potential concern, however, as 4 children recalled 35 or 36 words in the correct order (max. 36) for the dissimilar condition. Bearing in mind that those children who were considered to be at or near ceiling completed additional trials of six words in length, it was possible to use this supplementary data to test whether ceiling effects may have impacted the likelihood of finding a significant phonological-similarity effect in some cases.¹⁴ Consequently, the results of children who did not show a phonological-similarity effect at the individual level were examined more closely. Only one of these children was near ceiling, and this participant continued to show no difference between the dissimilar and PS conditions for words recalled correctly on trials of three to six words in length.¹⁵ Hence, ceiling effects were apparently not a cause of concern in this analysis.

¹² This is in fact the standard error of the phonological-similarity effect (dissimilar − PS) for the entire group.
¹³ Using the SEMdiff and the critical t-value (tcrit) corresponding to the degrees of freedom (here df = N − 1 = 41), it is possible to construct a 95% confidence interval around the mean phonological-similarity effect. With SEMDiss-PS = 0.671, and tcrit(.05, 41) = 2.021, this results in an error estimate of (0.671 × 2.021) = 1.36 words.
¹⁴ The reader is reminded that during the data collection phase, being at or near ceiling for trials of three to five words was operationalised as having a lenient span of 5 words in at least one condition (dissimilar, PS, or VS). Children who recalled all the words correctly regardless of order in two out of three trials of five words in at least one condition went on to complete trials of six words in length for all conditions.

One could argue that given the possible benefit of practice for PS and VS conditions, performance in the VS condition makes for a better baseline here.
When the same approach as described above was applied to the difference between the VS and PS conditions,¹⁶ 4 of the children (Cases #17, 18, 29, and 38) who did not initially show a phonological-similarity effect did present with a significant advantage for recall of words from the VS compared to the PS conditions. None of the other 3 children (Cases #4, 26, and 42) were at or near ceiling, which again indicates that ceiling effects were likely not an issue here.

Overall, for Auditory-Verbal Recall, the group- and individual-level analyses agree very well, with 39 of the 42 participants showing a significant drop in their performance in words recalled in the correct order for the PS condition compared to either of the control conditions. Given the large effect size at the group level and the fact that almost all the children presented with a phonological-similarity effect for Auditory-Verbal Recall, the task clearly exerted a sufficient pull on participants in this age range to produce a phonological-similarity effect. We turn now to the Visual Recognition task.

¹⁵ This child (Case #38) recalled 26, 25, and 35 words in the correct order for dissimilar, PS, and VS conditions respectively when only trials of three to five words were considered, and 36, 37, and 44 words correct for trials of three to six words.
¹⁶ SEMVS-PS = 0.585; confidence interval of 1.18 words.

3.1.1.2 Visual Recognition

In Visual Recognition, the children saw a series of pictures presented one at a time in a spatial sequence, and then had to select from a response set of nine pictures those that matched and mark them in the same order as they had been presented. All children again completed 9 trials (3 trials at each length of three, four, and five pictures) for each condition (dissimilar, PS, and VS), for a total of 27 trials. Memory performance was based on a strict scoring scheme which required both picture recognition and memory for order.
Hence, a picture was credited as correct only if the child recalled it in the exact serial position within a trial. For each condition, the maximum number of pictures correct was 36, and the maximum possible span was 5 pictures. As a group, the children correctly remembered on average between 23.7 and 26.3 pictures depending on the condition, and 76.1 of 108 pictures (70.5%) in total (see Table 4). This translated into a mean highest strict span of 4.26 pictures: 7 children with highest spans of 3 pictures, 17 with spans of 4 pictures, and 18 with spans of 5 pictures.

Those children who had performed near ceiling (i.e., obtained highest lenient spans of 5 pictures) completed nine additional trials of six pictures in length (three each in the dissimilar, PS, and VS conditions). Of these 23 children, 10 improved their highest strict spans to 6 pictures. When trials of three to six pictures in length were considered for the entire sample, the children succeeded in remembering on average between 4.8 and 5.9 trials depending on the condition (max. 12),¹⁷ and mean span increased to 4.52 pictures.

Hence, most children were in a comfortable range of task difficulty for the Visual Recognition task. As indicated by their highest strict spans, many children actually did very well, but very few of them were at or near ceiling for all three conditions, which is crucial for our purposes given that many analyses contrast performance in different conditions. Unless otherwise indicated, analyses included only results for trial lengths of three to five pictures in order to maximise comparability across children. In all cases strict scoring based on picture recognition and order recall was used.

¹⁷ For trials of three to six pictures in length, the number of trials (rather than pictures) correct was used as a memory performance measure when making comparisons using the entire sample.

Table 4.
Mean Pictures Correct, Trials Correct, and Highest Span, by Condition and by Trial Length, Visual Recognition

                                   Min    Max    Mean     SD
Trials of three to five pictures
  Pictures correct, by condition
    Dissimilar                     10     35     26.19    6.92
    PS                             10     36     23.67    6.87
    VS                             8      36     26.26    7.78
  Highest span                     3      5      4.26     0.73
Trials of three to six pictures
  Trials correct, by condition
    Dissimilar                     1      11     5.88     2.48
    PS                             1      10     4.81     2.31
    VS                             1      11     5.79     2.81
  Highest span                     3      6      4.52     1.04

Note: Pictures, trials, and span are all based on strict scoring which credited pictures only when they were remembered in the correct serial position. The highest span was obtained in at least one condition.

Phonological-similarity effect

Group-level analyses

Once again, the main analysis involved determining whether the children showed a significant phonological-similarity effect when asked to remember lists of pictures in the order they saw them. This would be reflected by better picture recognition in the dissimilar compared to the PS condition. As a group, the children remembered similar numbers of pictures for the dissimilar (M = 26.2) and the VS conditions (M = 26.3), but slightly fewer pictures for the PS condition (M = 23.7) compared to the other two conditions (see Table 4 and Figure 4).

Figure 4. Mean number of pictures remembered correctly (95% CI) in Visual Recognition, by condition (x-axis: Condition; y-axis: Pictures correct)

Group differences in the numbers of pictures correctly recognised were explored using a repeated measures ANOVA and planned contrasts between conditions (dissimilar vs. PS, and dissimilar vs. VS).
Taken as a group, the participants showed a significant phonological-similarity effect with a mean of 2.5 (SD = 4.6) additional pictures recognised correctly for the dissimilar compared to the PS condition, which corresponds to a small effect size, F(1, 41) = 12.8, p = .001, d = 0.37.¹⁸

Subject-wise analyses

Examination of individual patterns indicated superior picture recognition for the dissimilar compared to the PS condition for most children (see Figure 5). Once again, the minimum difference to judge whether an effect was present at the individual level was based on the distribution of scores in the sample as a whole. Based on the standard error of the mean difference in pictures correct between conditions,¹⁹ any child who remembered at least 2 additional pictures for the dissimilar condition compared to the PS condition was judged to be exhibiting a phonological-similarity effect. According to this criterion, 25 children presented with a significant phonological-similarity effect, and 17 did not. However, lack of sensitivity was apparently not an issue, as the size of the difference in pictures correct for the dissimilar and PS blocks varied greatly (from −10 to +12), although 11 children did show small differences clustering around 0 (i.e., difference scores of −1, 0, or 1; see Figure 5). Practice effects may have reduced the size of the difference between conditions somewhat, as once again the dissimilar block of trials was always completed first. However, only 2 (Cases #2 and 7) of the 6 children (Cases #2, 7, 13, 15, 27, and 39) who had negative difference scores (i.e., superior memory of at least two pictures in the PS condition compared to the dissimilar condition) presented with a pattern suggestive of a practice effect benefitting both the PS and the VS conditions. Additionally, this would entail that these participants were either not particularly sensitive to phonological similarity, or that they were somehow overcoming this difficulty.

¹⁸ There were 2 outliers in the distribution of difference scores, with those children showing a substantial increase in performance of 8 and 10 pictures for the PS compared to the dissimilar blocks. A sensitivity analysis performed with these children excluded confirmed that these outliers did not change the overall results, although the effect size became large, F(1, 39) = 26.0, p < .001, d = 0.81.
¹⁹ Based on the SEMdiff of 0.705 between dissimilar and PS conditions, the value to build a 95% confidence interval is 1.43 pictures. Hence, any phonological-similarity effect of 2 or more pictures was considered to be a significant difference, as the lower bound of the confidence interval would be above 0.

Some children were at or near ceiling for each condition, with between 2 and 6 children recognising 35 or 36 pictures correctly (max. 36). As an additional precaution, the scores of children who did not show a phonological-similarity effect at the individual level for trials of three to five pictures were examined more closely. Among those children, 11 were near ceiling, and thus completed trials of six pictures. When the numbers of pictures recognised in the correct order for trials of three to six pictures in length were examined, 4 of these children (Cases #31, 32, 34, and 36) now presented with a significant phonological-similarity effect (i.e., a difference of between +4 and +7 pictures) based on a criterion of an advantage of 3 or more pictures²⁰ for the dissimilar compared to the PS condition. Among the 7 participants who did not show a phonological-similarity effect even when the longer trials of six pictures were considered, only 2 continued to be close to ceiling, as they remembered 50 or more pictures in the correct order (max.
54) for either the PS or the VS condition (dissimilar, PS, and VS blocks respectively, Case #11: 40, 45, 50; Case #39: 44, 51, 52), but not for the dissimilar condition, which again indicates that ceiling effects were not a cause for concern. In addition, all 7 of the children who continued not to show a significant decrement in performance in the PS condition were actually exhibiting a reverse effect (with differences varying between −1 and −7 pictures). Altogether, 29 (69%) of the 42 children exhibited a significant phonological-similarity effect in Visual Recognition. The 13 who did not were quite evenly distributed in terms of span, as 5 had highest spans of 3 pictures, 3 had spans of 4, 3 had spans of 5, and 2 had spans of 6.

²⁰ For trials of three to six pictures in length, SEMdiff = 1.142, and tcrit(.05, 22) = 2.074 (i.e., only 23 children went on to complete these longer trials). This results in a value of 2.37 pictures to build the 95% confidence interval.

At first glance, based on the group means for the various conditions in the four presentation orders, it appeared that order could have impacted the likelihood of individual participants presenting with a phonological-similarity effect in Visual Recognition (see Figure 15, Appendix J). However, further analyses revealed that there was no obvious effect of order of presentation of tasks (i.e., Auditory-Verbal and Visual) or of conditions (i.e., PS and VS) on the distribution of children in terms of whether or not they showed a significant phonological-similarity effect in the memory for pictures task (see Appendix J for detailed analyses). Hence, despite initial concerns, individual children were just as likely to show a significant phonological-similarity effect for serial picture recognition regardless of the order in which they completed the tasks or the conditions.
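The two-stage decision rule used to arrive at the total of 29 children can be summarised in a short Python sketch. It assumes the thresholds stated in the text (an advantage of at least 2 pictures on trials of three to five pictures, or of at least 3 pictures on trials of three to six pictures for children who completed the longer trials). The function name and the example difference scores are hypothetical and are not taken from the data.

```python
from typing import Optional

def shows_similarity_effect(
    diff_3_to_5: int,
    diff_3_to_6: Optional[int] = None,
    short_threshold: int = 2,
    long_threshold: int = 3,
) -> bool:
    """Classify one child from difference scores (dissimilar minus PS).

    diff_3_to_5: difference in pictures correct on trials of three to five pictures.
    diff_3_to_6: difference on trials of three to six pictures, or None when the
                 child did not go on to complete the six-picture trials.
    """
    if diff_3_to_5 >= short_threshold:
        return True
    # Second stage: only children near ceiling completed the longer trials.
    if diff_3_to_6 is not None and diff_3_to_6 >= long_threshold:
        return True
    return False

# Hypothetical children, for illustration only
examples = {
    "clear effect on short trials": shows_similarity_effect(5),
    "effect emerges on longer trials": shows_similarity_effect(0, 4),
    "small difference, no longer trials": shows_similarity_effect(1),
    "reverse effect on both": shows_similarity_effect(-3, -2),
}
print(examples)
```

Treating the second stage as optional mirrors the design in which only children with a lenient span of 5 went on to the six-picture trials.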
Figure 5. Difference score in number of correct pictures, dissimilar vs. PS conditions, Visual Recognition (x-axis: Difference in pictures, Dissimilar − PS; y-axis: Number of children)
Note: Max. 36 for each condition.

Visual-similarity effect

Group-level analyses

Particularly given their ages, some of the participants were expected to be negatively affected by the visual similarity of the pictures in the VS condition. However, comparison of dissimilar and VS conditions revealed almost identical mean numbers of correct pictures (see Table 4 and Figure 4), which indicates that there was no visual-similarity effect at the group level, Mdiff = −0.07, SD = 4.11, F(1, 41) = 0.013, p = .91, d = 0.02.

Subject-wise analyses

As will become clear, the apparent lack of difference at the group level in terms of pictures remembered in the correct order between the dissimilar and VS conditions in fact masks two patterns at the level of individual children (see Figure 6). Based on the standard error of the mean difference between the dissimilar and VS conditions,²¹ only differences of 2 or more pictures were considered to be significant. According to this criterion, 12 children presented with a significant visual-similarity effect, as they obtained positive difference scores of 2 or more pictures (between 2 and 7), indicating better performance for the dissimilar compared to the VS condition.

²¹ For trials of three to five pictures in length, SEMDiss-VS = 0.634 and 95% CI = 1.28 pictures.

Figure 6. Difference score in number of correct pictures, dissimilar vs. VS conditions, Visual Recognition (x-axis: Difference in pictures, Dissimilar − VS; y-axis: Number of children)
Note: Max. 36 for each condition.
On the other hand, as was observed with Auditory-Verbal Recall, some children apparently experienced the benefits of practice and were able to improve their performance between the dissimilar and the VS conditions, as once again the dissimilar block always came first. Here 16 participants actually recognised significantly fewer dissimilar compared to VS pictures, as indicated by their negative difference scores ranging from −10 to −2. Given that a substantial portion of children actually improved their performance between the dissimilar and VS blocks of trials, this benefit of practice may have obscured a small visual-similarity effect in some children. On the other hand, for practice to have been beneficial, these children could not have been particularly sensitive to visual similarity to begin with or they must have been doing something to overcome this difficulty. Also, it does not seem to be the case that the manipulation was insensitive overall, although 14 children did show absolute differences around 0 (between −1 and 1).

Once again, potential ceiling effects were explored. Altogether, 19 of the 30 children who did not show a visual-similarity effect were near ceiling for trials of three to five pictures and thus completed trials of six pictures in length. However, only 3 of these participants (Cases #26, 36, and 37) recognised significantly more pictures (between 4 and 8) in the dissimilar condition compared to the VS condition when trials of three to six pictures were considered, which is the expected pattern if visual similarity were influencing serial recognition accuracy.²² Consequently, with these 3 children added to the total, 15 (36%) of the 42 participants presented with a significant visual-similarity effect.

Regarding possible order effects, there was little difference between the numbers of correct pictures in the dissimilar and VS conditions for any presentation order (see Figure 15, Appendix J).
Statistical testing confirmed that presentation order did not influence the likelihood of individual participants being influenced by visual similarity (see Appendix J for detailed analyses).

²² For trials of three to six pictures in length completed by only 23 children: SEMDiss-VS = 1.065, tcrit = 2.074; 95% CI = 2.21 pictures. Therefore, any positive difference of 3 pictures or more was considered to be indicative of a significant visual-similarity effect.

Using the combined information for both phonological-similarity and visual-similarity effects, the following distribution arose: 10 children showed neither effect, 3 only a visual-similarity effect, 12 both effects, and 17 only a phonological-similarity effect.

3.1.1.3 Comparing tasks

Overall, the children performed similarly for the two memory tasks, remembering 66% of the words and 70% of the pictures in the correct order. In addition, there was a strong significant positive association between word recall and picture recognition, r(42) = .80, p < .001, adjusted R² = .62 (see Figure 7).²³ Only 4 children (1 child in grade I, and 3 in grade II) remembered fewer than 50% of the items for both tasks; only 1 of these children failed to obtain a highest strict span of at least 3 items, and did so only for Auditory-Verbal Recall. Hence, most children were quite successful at remembering both words and pictures in the correct order. Also, those who were at or near ceiling completed additional trials of six words or pictures in length, which is not illustrated in Figure 7.²⁴

²³ All correlational analyses were preceded by visual inspection of the data. In addition, precautions were taken to identify bivariate outliers (using standardised residuals) and high leverage values. As for all analyses, any data point with a standardised residual value greater than |2.58| was considered an outlier. Any points with leverage values above .20 were considered potentially problematic (Zumbo, 2002).
Here, this method led to the identification of 1 bivariate outlier. A sensitivity analysis revealed that the results were essentially unchanged, as performance on the two memory tasks continued to be strongly related r(41) = .85, p < .001, adjusted R2 = .72. 24 The result was identical when the correlational analysis was performed using total trials correct for trials of three to six words and pictures, r(42) = .80, p = <.001, adjusted R2 = .62. . A sensitivity analysis after removal of 1 bivariate outlier once again confirmed that performance on both tasks was very highly related, r(41) = .84, p < .001, adjusted R2 = .71.  136  C  90%  Pictures corre ct  80% 70%  G G C AG  60% 50% 40%  C C CG  A S  S SG G C A G CSCC S S S S S A G A A CA S G A A A  S  G A  30%  S  20%  G C  10%  Grade I Grade II Grade III Grade IV  10% 20% 30% 40% 50% 60% 70% 80% 90%  Words correct Figure 7. Percentages of correct words in Auditory-Verbal Recall and of correct pictures in Visual Recognition, all conditions combined, trials of three to five items  137  Most children fell into one of two groups: phonological-similarity effects in both tasks (n = 28), or a phonological-similarity effect only with Auditory-Verbal Recall (n = 11). This left 3 children (see Table 5), 1 who showed a phonological-similarity effect only with Visual Recognition (Case #42), and 2 who did not present with a phonologicalsimilarity effect in either task (Cases #4 and 26).  Table 5. Distribution of Participants Based on Whether or Not They Presented With Significant Phonological-Similarity Effects in Auditory-Verbal Recall and Visual Recognition. PSE in Visual Recognition PSE in Auditory-Verbal Recall  Yes No  Total  Yes 28 1  No 11 2  Total  29  13  42  39 3  Note: PSE, phonological-similarity effect.  3.1.1.4 Summary The influence of phonological similarity was very strong in Auditory-Verbal Recall. 
The overall effect was large for the group of children as a whole, and this was confirmed at the level of individual participants, with 39 of the 42 children remembering fewer words in the PS condition compared to the control conditions. The effect was also significant for Visual Recognition for the group as a whole, although it was not as large. Correspondingly, only 29 of the 42 children showed an impact of phonological similarity in the memory for pictures task. Finally, despite no apparent effect of visual similarity at the group level, 15 children were negatively impacted by the visual similarity of pictures, as they remembered significantly fewer pictures for the VS compared to the dissimilar conditions; 12 of these participants were also sensitive to phonological similarity.

In terms of potential order effects, the overall conclusion was that presentation order was not a cause for concern in terms of the appearance of phonological-similarity effects in either the Auditory-Verbal Recall or the Visual Recognition memory tasks, or of visual-similarity effects in the Visual Recognition task. Most important, there were no systematic influences of order impacting the likelihood of finding these effects to be significant at the level of individual participants. The only order effect that did seem to have been ‘real’ is that the dissimilar condition was relatively disadvantaged as it always came first. This led to an apparent practice effect in some children for the VS condition. Also, when phonological-similarity or visual-similarity effects were not found at the level of individual participants, the data from the longer trials made it possible to rule out ceiling effects as a likely explanation.

The next section reports on the findings for another index of rehearsal, specifically children’s overt behaviours observed while they were completing the serial memory for words and pictures tasks.
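The child-level criterion behind the visual-similarity counts summarised above (see footnote 22) can be illustrated with a short calculation. This is only a sketch: it assumes that the half-width of the 95% confidence interval is the product of the standard error of the difference and the critical t value, and that difference scores are whole numbers of pictures. The function names are hypothetical and do not come from the study.

```python
import math


def ci_half_width(sem_diff: float, t_crit: float) -> float:
    """Half-width of the confidence interval around a difference score."""
    return sem_diff * t_crit


def smallest_significant_difference(sem_diff: float, t_crit: float) -> int:
    """Smallest whole-item difference lying strictly outside the interval."""
    return math.floor(ci_half_width(sem_diff, t_crit)) + 1


def shows_similarity_effect(dissimilar_correct: int, similar_correct: int,
                            sem_diff: float, t_crit: float) -> bool:
    """True when the dissimilar-minus-similar difference exceeds the interval."""
    difference = dissimilar_correct - similar_correct
    return difference > ci_half_width(sem_diff, t_crit)


# Values reported in footnote 22 (trials of three to six pictures).
half_width = ci_half_width(1.065, 2.074)
threshold = smallest_significant_difference(1.065, 2.074)
print(round(half_width, 2), threshold)
```

With the values from footnote 22, the half-width rounds to 2.21 pictures, so a positive difference of 3 pictures or more falls outside the interval, in line with the rule stated there.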
3.1.2 Observational data

The children were videorecorded while completing both memory tasks. This made it possible to later code for observable behaviours reflecting verbal mediation on a trial-by-trial basis. To briefly reiterate, for purposes of coding, single-item labelling corresponded to the child saying a word or naming a picture a single time; although more than one item may have been labelled this way, the child did not repeat any items. Rehearsal included the following behaviours: naming items in a block (or chunking), repetition of multiple items, and cumulative rehearsal. Overt behaviours included talking aloud, whispering, and mouthing. Individual children were deemed to be either labelling or rehearsing based on a criterion of a minimum of three observed instances of the target behaviour within a condition (dissimilar, PS, or VS). Results are once again presented for each memory task separately, followed by comparisons between tasks. Unless otherwise indicated, analyses were restricted to trials of three to five words or pictures in length in order to maximise comparability across children.

3.1.2.1 Auditory-Verbal Recall

Because I hypothesised that condition could influence strategic behaviour, instances of observed verbal strategy use were tabulated separately for each condition. Data for both overt labelling and overt rehearsal were considered as indicators of verbal mediation. Most children either labelled or rehearsed aloud during Auditory-Verbal Recall. Given that a verbal response was required, strategic labelling could only occur at the time of presentation. On the other hand, it was possible for children to rehearse either while they were hearing the words, or when they were responding. Rehearsal at time of response was rarer, and usually took place when children were searching for additional words, self-monitoring, or self-correcting.
The coders observed overt single-item labelling 245 times in total, or a mean of 5.8 times per child for the 27 trials of three to five words in length (see Table 6). Only 3 children were never observed to label overtly.25 Overall, the participants labelled quite consistently across conditions (between 1.8 and 2.1 times); given the small differences, no statistical tests were performed. Based on this data, 23 children were classified as overtly labelling using the criterion of three or more observed occurrences for at least one condition.

25 None of these 3 children went on to complete trials of six words.

Table 6. Observed Occurrences of Overt Labelling in Auditory-Verbal Recall, by Condition

Condition     Minimum   Maximum   Sum    Mean   SD
Dissimilar    0         9         76     1.81   2.25
PS            0         8         82     1.95   2.20
VS            0         8         87     2.07   2.27
Total         0         25        245    5.83   5.74

Turning now to rehearsal, the children were observed to rehearse 276 times, or a mean of 6.6 times per child over 27 trials.26 In total, 33 of the 42 children (79%) were observed to rehearse overtly at least once for all three conditions (dissimilar, PS, and VS) combined. Only 2 children were never observed to either label or rehearse overtly (Cases #24 and 40). Among those who did rehearse overtly, 42% (14 of 33) did so quite consistently (i.e., between 9 and 26 times for the 27 trials). The mean numbers of occurrences of observed overt rehearsal were almost identical for the three conditions, with rehearsal observed twice on average over the nine trials in each condition (see Table 7); once again, the small differences did not warrant any statistical testing. Using the criterion of three or more observed occurrences per condition, 14, 13, and 16 children were observed to rehearse consistently in the dissimilar, PS, and VS conditions respectively.
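The classification criterion used above (three or more observed occurrences within a condition, with a child classified when the criterion is met for at least one condition) can be written out as a small rule. The sketch below is illustrative only; the function names and the example counts are hypothetical and are not taken from the study's data.

```python
from typing import List, Mapping

CRITERION = 3  # minimum observed occurrences within a single condition


def conditions_meeting_criterion(counts: Mapping[str, int],
                                 minimum: int = CRITERION) -> List[str]:
    """Names of the conditions in which the behaviour was seen often enough."""
    return [condition for condition, n in counts.items() if n >= minimum]


def is_classified(counts: Mapping[str, int], minimum: int = CRITERION) -> bool:
    """A child is classified when at least one condition meets the criterion."""
    return len(conditions_meeting_criterion(counts, minimum)) > 0


# Hypothetical counts of observed rehearsal over the nine trials per condition.
consistent_child = {"dissimilar": 0, "PS": 3, "VS": 1}
scattered_child = {"dissimilar": 2, "PS": 1, "VS": 2}

print(is_classified(consistent_child))   # criterion met in the PS condition
print(is_classified(scattered_child))    # five occurrences, but never three in one condition
```

The second example shows why the rule is one of density rather than of total frequency: occurrences spread thinly across conditions do not lead to classification.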
26 Most of these occurrences were observed at time of presentation (i.e., while the children were listening to the words), with only four additional non-redundant observations coming from overt rehearsal during recall. Instances of rehearsal at time of response were included only if the child had not already been credited with rehearsal on that specific trial.

Table 7. Observed Occurrences of Overt Rehearsal in Auditory-Verbal Recall, by Condition

Condition     Minimum   Maximum   Sum    Mean   SD
Dissimilar    0         9         93     2.21   2.84
PS            0         9         92     2.19   2.72
VS            0         9         91     2.17   2.65
Total         0         27        276    6.57   7.70

When information from the three conditions was combined, 19 children were classified as overtly rehearsing, as they were observed to rehearse three or more times in at least one condition (12 of whom were also overtly labelling). Among these 19 participants, 9 of them met the criteria for all three conditions, 6 for two, and 4 only for one. Hence, based on trials of three to five words in length, 19 children were overtly rehearsing (with or without also overtly labelling), another 11 were overtly labelling, and 12 were not observed to use either strategy with any regularity.

The data for children who were near ceiling for trials of three to five words and who were not observed to label or were not observed to rehearse were scrutinised further. As the tendency to label or to rehearse (overtly or at all) may be linked to the level of participant-specific task difficulty, it is important to observe all children at a point where at least some of the trials were challenging. An additional 4 children who were at or near ceiling were observed to label consistently for the longer trials (i.e., for at least three of nine trials of six words). In total, 27 children were judged to have been overtly labelling while completing Auditory-Verbal Recall, and many of these children were also observed to rehearse overtly (see below).
Furthermore, 6 children who were not classified as overtly rehearsing based on observations for trials of three to five words were near ceiling and thus completed trials of six words. When this additional data was considered, 5 of the 6 children (Cases #21, 23, 25, 33, and 41) were observed to rehearse at least 3 times (between 3 and 6 times) for these nine additional longer trials.27 When these children were added, altogether 24 children were classified as overtly rehearsing for Auditory-Verbal Recall (and 17 of these children also qualified as overtly labelling). Another 10 were classified as overtly labelling, which left 8 children who were not observed to use either labelling or rehearsal with any consistency while they completed Auditory-Verbal Recall.

Based on prior research, one would expect children to be more likely to rehearse in the higher grades; however, older children could also be internalising their behaviours to a greater extent, which would make it more difficult to observe any strategies. In fact, more children were observed to rehearse consistently in Auditory-Verbal Recall with increasing grade level (see Table 8). This was particularly striking when the children in the two lower grades (9 of 22 children, 41%) were compared to those in the two higher grades (15 of 20 children, 75%). Correspondingly, those who rehearsed overtly were, as a group, older by 9.1 months (M = 105.9, SD = 13.4) than those who were not observed to do so (M = 96.8, SD = 14.4). The results of a one-way ANOVA indicated that this was a significant difference, F(1, 40) = 4.4, p = .04, d = 0.67. Although the age ranges for the groups of participants who were and were not observed to rehearse were very similar (80 to 126 months, and 80 to 128 months respectively), among those who rehearsed overtly, 12 of 24 were aged 9;0 or older, compared to only 3 of 18 among those not observed to rehearse.
What this may mean in terms of covert rehearsal will be addressed below, particularly in combination with the data from the children’s self-reported strategies.

27 For the longer trials of six words in length, children qualified as likely rehearsing if the coders observed them to use rehearsal for at least three of the nine trials (all conditions combined).

Table 8. Classification of Participants as Overtly Rehearsing or Not for Auditory-Verbal Recall Based on Observations, by Grade

                                 Grade
Overtly rehearsing               I       II      III     IV      Total
Yes        Count                 3       6       8       7       24
           %                     30%     50%     80%     70%     57%
No         Count                 7       6       2       3       18
           %                     70%     50%     20%     30%     43%
Total      Count                 10      12      10      10      42
           %                     100%    100%    100%    100%    100%

The possible influence of presentation order of both tasks and conditions on the likelihood of observing a child rehearsing was investigated next. Once again, presentation order was not a cause for concern (see Appendix J for detailed analyses). Overall, 24 children were deemed to have been overtly rehearsing during one or more conditions while completing Auditory-Verbal Recall. These children tended to be older than those who were not observed to rehearse. The next section reports on observed strategies during Visual Recognition.

3.1.2.2 Visual Recognition

The children did quite a bit of talking while they were completing the potentially nonvocal memory for pictures task. Visual Recognition requires an extra step for verbal mediation to take place, as the pictures must first be recoded into labels. Most children verbalised aloud during Visual Recognition, either while the pictures were being presented or when they were responding by marking pictures in the correct order on the screen of the tablet computer. The coders observed overt single-item labelling of pictures 680 times in total (430 times during presentations and 250 during responses28), or a mean of 16.2 times per child over 27 trials (see Table 9).
Only 2 children were never observed to label overtly.29 The participants labelled quite consistently across conditions (between 5.1 and 5.8 times); there was nonetheless a trend for more overt labelling during the PS trials, especially when compared to the dissimilar trials. However, neither the repeated measures ANOVA nor any of the pairwise planned contrasts were significant using adjusted p-values for three comparisons based on the Holm method: overall, F(2, 82) = 2.24, p = .113; dissimilar vs. PS, F(1, 41) = 3.72, p = .061, d = 0.297; dissimilar vs. VS, F(1, 41) = 0.946, p = .336, d = 0.152; PS vs. VS, F(1, 41) = 1.52, p = .225, d = 0.188.30 In the end, 35 children were classified as overtly labelling based on the criterion of three or more observed occurrences for at least one condition for trials of three to five pictures. Another 2 children (Cases #5 and 27) who were near ceiling were observed to label at least three times over the nine longer trials of six pictures, and hence were also deemed to have been labelling consistently. Altogether, 37 of 42 children (88%) qualified as having been overtly labelling, and many of these also rehearsed overtly (see below).

28 Instances of labelling at time of response were included only if the child had not already been credited with labelling for that specific trial.

29 One of these children completed trials of six pictures, but was observed to label overtly only once for these longer trials.

30 Throughout the study, the Holm correction was chosen to control for Type I error when appropriate. This method is more powerful yet never rejects fewer comparisons than the Bonferroni procedure (Aickin & Gensler, 1996). As such, it proved to be a better choice to also guard against Type II error given the relatively small sample size. First, the p-values for all the comparisons are placed in increasing order. Then, each p-value is compared with α/(Nc − i + 1) for rejection of the null hypothesis, where Nc corresponds to the number of comparisons, and i to the rank of each specific comparison. Hence, α1 = .05/(Nc − 1 + 1), α2 = .05/(Nc − 2 + 1), α3 = .05/(Nc − 3 + 1) and so on, until αNc = .05/(Nc − Nc + 1) = .05. No further tests are done beyond the first non-rejection.

Table 9. Observed Occurrences of Overt Labelling in Visual Recognition, by Condition

Condition     Minimum   Maximum   Sum    Mean    SD
Dissimilar    0         9         213    5.07    3.07
PS            0         9         242    5.76    3.12
VS            0         9         225    5.36    2.93
Total         0         27        680    16.19   8.35

The coders also observed the children to be rehearsing, although less frequently than was the case for labelling. As a group, the children rehearsed overtly 223 times over the 27 core trials, or a mean of 5.31 trials per child (see Table 10).31 In total, 32 of the 42 children (76%) rehearsed overtly for at least one trial over the three conditions combined. Only 2 children were never observed to either label or rehearse overtly (Cases #34 and 40). Among those who rehearsed overtly, 31% (10 of 32) did so quite consistently, as they were observed to rehearse for 9 or more of the 27 trials. Instances of overt rehearsal were considered separately for each condition (dissimilar, PS, and VS). Once again, the mean numbers of occurrences of observed overt rehearsal were very consistent across conditions, averaging between 1.67 and 1.83 times over the nine trials within a condition (see Table 10), and neither the repeated measures ANOVA nor any of the pairwise planned contrasts were significant using adjusted p-values for three comparisons (all ps > .53). Using the criterion of three or more observed occurrences per condition, the coders observed 12, 13, and 14 children to be rehearsing during dissimilar, PS, and VS trials respectively.
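The Holm procedure described in footnote 30 can be sketched in a few lines. This is a minimal illustration, not the software used in the study; the function name is hypothetical, and the sketch assumes that a null hypothesis is rejected when its p-value is less than or equal to its step-specific threshold.

```python
from typing import List, Sequence


def holm_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Holm step-down decisions, returned in the original order of p_values."""
    n_comparisons = len(p_values)
    # Indices of the comparisons, ordered from smallest to largest p-value.
    order = sorted(range(n_comparisons), key=lambda k: p_values[k])
    reject = [False] * n_comparisons
    for rank, index in enumerate(order, start=1):
        threshold = alpha / (n_comparisons - rank + 1)
        if p_values[index] <= threshold:
            reject[index] = True
        else:
            break  # no further tests beyond the first non-rejection
    return reject


# Pairwise contrasts for overt labelling in Visual Recognition:
# dissimilar vs. PS, dissimilar vs. VS, PS vs. VS.
labelling_contrasts = [0.061, 0.336, 0.225]
print(holm_reject(labelling_contrasts))
```

For the three labelling contrasts reported above, the smallest p-value (.061) is compared with .05/3 and is not rejected, so testing stops and none of the contrasts is declared significant.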
31 Most of these occurrences were observed while the pictures were presented, with only 11 additional observations coming from overt rehearsal during the response stage. Once again, instances of rehearsal at time of response were included only if the child had not already been credited with rehearsal on that specific trial.

Table 10. Observed Occurrences of Overt Rehearsal in Visual Recognition, by Condition

Condition     Minimum   Maximum   Sum    Mean   SD
Dissimilar    0         9         70     1.67   2.39
PS            0         8         76     1.81   2.14
VS            0         9         77     1.83   2.43
Total         0         26        223    5.31   6.35

When information from the three conditions was combined, 19 children were observed to rehearse three or more times for at least one condition and, as such, were classified as overtly rehearsing (and all of them also met the criteria for overtly labelling).32 Among these 19 participants, 7 of them met the criteria for all three conditions, 6 for two, and 6 for only one. Hence, based on data for trials of three to five pictures in length only, 19 participants were overtly rehearsing, 16 were overtly labelling, and 7 were not observed to do either with any consistency.

As usual, possible ceiling effects were considered. Eleven children who were not classified as overtly rehearsing based on observations for trials of three to five pictures were near ceiling and thus completed trials of six pictures. Five of these 11 (Cases #5, 19, 23, 30, and 33) were observed to rehearse three times or more (between three and eight times) over these nine additional longer trials and also qualified as consistently rehearsing. Hence, all told, 24 children were classified as overtly rehearsing for Visual Recognition (all of whom also qualified as overtly labelling). Another 13 were deemed to have been overtly labelling, which left 5 children who were not observed to use either labelling or rehearsal while they completed Visual Recognition.
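The two-step decision described above (core trials first, then the longer trials for children near ceiling) and the resulting three-way grouping can be summarised as follows. This is a sketch under stated assumptions: the function names are hypothetical, a value of None stands for a child who did not complete trials of six items, and rehearsal is given precedence over labelling when assigning a child to a single group, as in the counts reported above.

```python
from typing import Mapping, Optional

CRITERION = 3  # minimum number of observed occurrences


def meets_criterion(core_counts: Mapping[str, int],
                    long_trial_count: Optional[int] = None) -> bool:
    """Criterion met within one condition of the core trials, or over the
    nine longer trials (all conditions combined) when these were completed."""
    if any(n >= CRITERION for n in core_counts.values()):
        return True
    return long_trial_count is not None and long_trial_count >= CRITERION


def classify_child(rehearsal_core: Mapping[str, int],
                   labelling_core: Mapping[str, int],
                   rehearsal_long: Optional[int] = None,
                   labelling_long: Optional[int] = None) -> str:
    """Single group per child, with precedence given to rehearsal."""
    if meets_criterion(rehearsal_core, rehearsal_long):
        return "rehearsing"
    if meets_criterion(labelling_core, labelling_long):
        return "labelling"
    return "neither"


# Hypothetical child near ceiling: little rehearsal on the core trials,
# but rehearsal observed on four of the nine six-picture trials.
print(classify_child(
    rehearsal_core={"dissimilar": 1, "PS": 0, "VS": 1},
    labelling_core={"dissimilar": 5, "PS": 6, "VS": 4},
    rehearsal_long=4,
))
```

In this hypothetical case the child is grouped as rehearsing on the strength of the longer trials, even though the same child would also meet the labelling criterion.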
32 The requirement of density or consistency (i.e., having been observed to rehearse at least three times within a single condition) resulted in two children who were observed to rehearse more than three times across the three conditions nonetheless being excluded from the overtly rehearsing group.

Possible differences in terms of grade or age were investigated. In contrast to Auditory-Verbal Recall, there was no obvious influence of grade level on the likelihood of a child being observed to overtly rehearse in Visual Recognition, with similar proportions of children (50% to 60%) observed to rehearse consistently in each grade (see Table 11). Correspondingly, a one-way ANOVA revealed that, as a group, those children who rehearsed overtly were of similar mean age in months (M = 102.9, SD = 14.1) as those who did not (M = 100.8, SD = 15.1), F(1, 40) = 0.212, p = .65, d = 0.15. Once again the ranges were almost identical (80 to 125 or 128 months), but here the children were quite spread out for both subgroups. Whether this result may reflect in part the influence of children having rehearsed covertly will be addressed below, in particular once self-reported strategies have been considered.

Table 11. Classification of Participants as Overtly Rehearsing or Not for Visual Recognition Based on Observations, by Grade

                                 Grade
Overtly rehearsing               I       II      III     IV      Total
Yes        Count                 5       7       6       6       24
           %                     50%     58%     60%     60%     57%
No         Count                 5       5       4       4       18
           %                     50%     42%     40%     40%     43%
Total      Count                 10      12      10      10      42
           %                     100%    100%    100%    100%    100%

Finally, the possible influence of presentation order of both tasks and conditions was explored. Presentation order did not have a significant impact on the likelihood of a child being classified as overtly rehearsing in Visual Recognition (see Appendix J for detailed analyses).

Altogether, 24 children fell into the overtly rehearsing category for Visual Recognition.
As a group, these children were of similar ages compared to those who were not observed to rehearse, which differs from the pattern that emerged for Auditory-Verbal Recall. The next section will compare results for the observed strategies between the two tasks in more detail.

3.1.2.3 Comparing tasks

Visual Recognition produced more overt talking than did Auditory-Verbal Recall, particularly because many children tended to label the pictures either as they appeared on the computer screen or while marking their responses. As a result, more children fell into the overtly labelling category for Visual Recognition (n = 37) than for Auditory-Verbal Recall (n = 27). When results for the two tasks were combined, 25 participants were observed to label while they completed both tasks, 12 only for Visual Recognition, and 2 only for Auditory-Verbal Recall. Only 3 children (Cases #13, 34, and 40) were not observed to do so during either task.

As for rehearsal, over the 27 core trials for each task, as a group the children were observed to rehearse at approximately the same frequency during Auditory-Verbal Recall (24% of trials; M = 6.57, SD = 7.70) and Visual Recognition (20% of trials; M = 5.31, SD = 6.35). This difference was not statistically significant when tested using a repeated measures ANOVA, F(1, 41) = 1.52, p = .22, d = 0.19.33 This result is not surprising when one looks at the distribution of difference scores, as 14 children rehearsed overtly during more trials for Auditory-Verbal Recall, 16 did so more frequently for Visual Recognition, and 12 rehearsed overtly on the same number of trials for both tasks.

33 This is based only on the trials of three to five pictures and words, i.e., those completed by all children and on which classification decisions were based for most children except those suspected to be near ceiling. There were 3 outliers in the distribution of difference scores. The conclusion was identical when the ANOVA was repeated without these participants, F(1, 38) = 1.31, p = .26, d = 0.18.

Identical numbers of children were classified as overtly rehearsing in each task, with 24 children meeting the criterion in each case. Although there was much overlap, these were not exactly the same children. The coders observed 20 participants to be rehearsing during both tasks, and 14 during neither. In addition, 4 children rehearsed overtly only during Auditory-Verbal Recall, and 4 did so only during Visual Recognition.

Among the 4 participants who were classified as overtly rehearsing for Auditory-Verbal Recall only, for 2 of them (Cases #25 and 41) this was based on observations during trials of six words. Not only did these children not meet the criterion of three observations of rehearsal within a condition for Visual Recognition, but in all cases there were very few observations of rehearsal (Case #27, 1 in 36 trials in total; Case #10, 2 in 27; Case #41, 4 in 36; and Case #25, 0 in 27). Anecdotally, 3 of the 4 children who were observed to rehearse only during Auditory-Verbal Recall completed this task second (i.e., after Visual Recognition).

Another subgroup of 4 children was observed to rehearse only in the memory for pictures task. One of these participants (Case #5) completed trials of six pictures in length and was classified as overtly rehearsing based on observations for these longer trials. All of these children were very rarely observed to rehearse during the memory for words task (between 0 and 3 of 27 trials in total; Cases #4, 5, 9, and 22). Anecdotally, 3 of the 4 participants who were observed to rehearse only during Visual Recognition completed this task second (i.e., after Auditory-Verbal Recall).

3.1.2.4 Summary

On the whole, approximately half the children (20 of 42) were classified as overtly rehearsing based on observations for both the memory for words and the memory for pictures tasks.
Another 8 were considered to be rehearsing overtly during only one of the tasks, either Auditory-Verbal Recall or Visual Recognition; 6 of these 8 participants did so for the second task. Age seems to have been a factor only for Auditory-Verbal Recall, where those who rehearsed overtly tended to be older than those who did not. The following section reports findings for the final index of rehearsal, specifically children’s self-reported strategies based on their responses to post-task questions for each of the two serial memory tasks.

3.1.3 Self-report data

Once the children had completed all trials of three to five items in length for a given memory task, the examiner asked them what they had done to remember.34 Results are once again presented for each memory task separately, followed by comparisons between tasks.

34 The children reported on their strategies only once per task following all trials of three to five words or pictures in length regardless of whether or not they went on to complete trials of six items.

3.1.3.1 Auditory-Verbal Recall

The majority of children reported using at least one strategy during Auditory-Verbal Recall, with only 10 of the 42 children (24%) classified as not reporting any strategy. These participants either denied having done anything to remember or offered vague responses that referred essentially to having made an effort to memorise the words or pictures (e.g., Case #1: “I was just trying to (um) remember the words on my brain”). They were more likely to be in the lower grades, as five were in grade I, three were in grade II, and one was in grade III.

Reported strategies were classified as either rehearsal, labelling, or other, and scores were initially tabulated for each strategy type, allowing for more than one strategy per child. Just as for the scoring of observed strategies, rehearsal included responses referring to item repetitions, chunking, and cumulative rehearsal, whereas labelling was reserved for instances where the child mentioned saying words or naming pictures one at a time. Altogether, the responses of 11 children included strategies other than rehearsal or single-item labelling. These were counting (n = 7), visual imagery (n = 2), eye closing (n = 1), and simplification (i.e., deciding to focus only on some of the items; n = 1). In total, 7 of the 42 children (17%) stated having used another strategy in addition to rehearsal or labelling, and this included one child whose response was coded as labelling, rehearsal, and other (imagery). No other child reported having used both labelling and rehearsal to remember the lists of words.

Responses were recoded in order to place each child into a single category. Precedence was given to verbal strategies and to rehearsal in particular. Given that the main objective of this study is to compare various indicators of rehearsal, this coding scheme made it possible to distinguish those children who reported rehearsal (as a single strategy or in combination with other strategies) from those who did not. Hence, any child who stated having used some form of rehearsal was placed in that category. Among those who did not report having used rehearsal, those who mentioned having named individual items were placed in the labelling category, and finally those who reported having used only another strategy were placed in the other group (see Table 12). According to this distribution, 13 children (31%) fell into the rehearsal group, and 15 (36%) into the labelling group. Only 4 of the 42 (10%) children stated having used another strategy in isolation, and were thus placed in the other category. When participants who fell into the rehearsal and the labelling categories were combined, 28 of 42 children (67%) declared having actively used a verbal strategy to recall the lists of words in serial order.
Table 12. Distribution of Children by Reported Strategy, Auditory-Verbal Recall

Strategy      Frequency   %       Cumulative %
Rehearsal     13          31.0    31.0
Labelling     15          35.7    66.7
Other         4           9.5     76.2
None          10          23.8    100.0
Total         42          100.0

Note: This classification places each child into a single category and gives precedence to verbal strategies, and to rehearsal in particular.

The 13 children who reported having used rehearsal for Auditory-Verbal Recall were more likely to be in the higher grades, with no child from grade I and only 3 children from grade II stating that they had used the strategy (see Table 13). Correspondingly, a one-way ANOVA revealed that those who reported having used a rehearsal strategy were significantly older by almost 12 months on average (M = 110.2, SD = 12.1) than those who did not (M = 98.4, SD = 14.0), F(1, 40) = 6.9, p = .01, d = 0.90.

Table 13. Classification of Participants as Reporting Rehearsing or Not for Auditory-Verbal Recall, by Grade

                                 Grade
Reporting rehearsal              I       II      III     IV      Total
Yes        Count                 0       3       4       6       13
           %                     0%      25%     40%     60%     31%
No         Count                 10      9       6       4       29
           %                     100%    75%     60%     40%     69%
Total      Count                 10      12      10      10      42
           %                     100%    100%    100%    100%    100%

Finally, the proportions of children (varying between 27% and 40%) who declared rehearsal as a strategy were quite evenly distributed across the four presentation orders, which suggests that presentation order was not exerting an important influence on the likelihood of participants reporting verbal rehearsal as a strategy (see Appendix J for details). Hence, in total, only 13 children stated having used verbal rehearsal while completing Auditory-Verbal Recall and these children were more likely to be in the later grades. The next section will consider these same analyses for Visual Recognition.

3.1.3.2 Visual Recognition

Most children also declared having used at least one strategy while completing Visual Recognition, with only 6 of 42 (14%) classified as not reporting any strategy.
Once again, the few who did not were more likely to come from the lower grades, with two in grade I, three in grade II, and one in grade III. The responses of 8 children corresponded to strategies other than rehearsal or labelling (counting, n = 5; visual imagery, n = 1; gestures, n = 1; simplification, n = 1), whereas the responses of 7 others (17%) included another strategy in addition to rehearsal or labelling. Finally, 2 participants claimed to have used both labelling and rehearsal, including one child who also mentioned having occasionally relied on gestures corresponding to object functions (e.g., pretending to pound a nail or to use a saw).

Table 14. Distribution of Children by Reported Strategy for Visual Recognition

Strategy      Frequency   %       Cumulative %
Rehearsal     17          40.5    40.5
Labelling     18          42.9    83.3
Other         1           2.4     85.7
None          6           14.3    100.0
Total         42          100.0

Note: This classification places each child into a single category and gives precedence to verbal strategies, and to rehearsal in particular.

As shown in Table 14, the distribution of children into grouped categories giving precedence to verbal strategies resulted in 17 children (40%) placed into the rehearsal category, and 18 children (43%) into the labelling category. Only one child (Case #28) stated having used another strategy on its own. Altogether, 35 of the 42 participants (83%) declared that they had actively resorted to a verbal strategy to recognise the pictures in the correct serial order. Possible effects of presentation order on the likelihood of children reporting rehearsal as a strategy were explored and found not to be an issue for concern (see Appendix J for detailed analyses).

The next analysis looked at the effect of grade. Children in grade IV (7 of 10) were more likely than those in the three younger grades (10 of 32) to have declared using rehearsal while completing Visual Recognition (see Table 15).
Although participants who reported having rehearsed in the Visual task once again tended to be older (M age 105.3 months, SD 15.9) than those who did not (M age 99.8, SD 13.1) by about 5.5 months, this difference was not significant, F(1, 40) = 1.5, p = .23, d = 0.40. The participants were very likely to report having used a verbal strategy to complete Visual Recognition, as 35 of 42 did so. However, only 17 of these children explicitly referred to rehearsal as their strategy, and as a group they were not significantly older than those who did not, which differs from the pattern that emerged for Auditory-Verbal Recall. The next section will compare results for the self-reported strategies between the two tasks in more detail.

Table 15. Classification of Participants as Reporting Rehearsing or Not for Visual Recognition, by Grade

                                  Grade
Reporting rehearsal               I        II       III      IV       Total
Yes          Count                3        5        2        7        17
             %                    30%      42%      20%      70%      40%
No           Count                7        7        8        3        25
             %                    70%      58%      80%      30%      60%
Total        Count                10       12       10       10       42
             %                    100%     100%     100%     100%     100%

3.1.3.3 Comparing tasks

The participants were somewhat more likely to report having used a verbal strategy (labelling or rehearsal) for Visual Recognition (n = 35, or 83%) compared to Auditory-Verbal Recall (n = 28, or 67%). This same pattern emerged when rehearsal was considered on its own, with respectively 17 and 13 children reporting a rehearsal strategy for the memory for pictures and words tasks. When strategy reports for the two tasks were combined, only 10 children declared rehearsal as their strategy for both tasks, whereas 22 children did not report having used rehearsal for either task (see Table 16). Three children declared having used rehearsal to complete only Auditory-Verbal Recall; anecdotally, they all completed this task second (i.e., after Visual Recognition).
Seven children reported rehearsal as their strategy only for Visual Recognition; in this case, task order was inconsistent, as four of the seven completed this task first.

3.1.3.4 Summary

Most children reported using at least one strategy for both the memory for words and the memory for pictures task, and in most of these cases these strategies included either labelling or rehearsal. However, only 13 children included rehearsal in their response when asked how they had remembered the words, and 17 did so when they explained how they had remembered the pictures. Age may have played a role, but was a significant factor only for Auditory-Verbal Recall, where those who reported having used rehearsal tended to be older than those who did not. Finally, task order did not exert any obvious influence on self-report. The next section will integrate the findings regarding the various indicators of rehearsal.

Table 16. Number of Children Reporting Rehearsal as a Strategy, by Task

                              Visual Recognition
                              Reporting rehearsal
Auditory-Verbal Recall
Reporting rehearsal           Yes       No        Total
Yes                           10        3         13
No                            7         22        29
Total                         17        25        42

3.1.4 Combining indices of rehearsal

Comparisons across the three indicators of rehearsal (i.e., phonological-similarity effect, overtly rehearsing, reporting rehearsal) are based only on trials of three to five words or pictures, as children responded to post-task questions after having completed these trial lengths. This presents the additional advantage of placing all children on an equal footing, as they all completed the same number of trials and did so at the same point in the experiment. Serial memory scores (for words or pictures) or observed strategies for longer trial lengths were considered only if they may have helped in the interpretation.

3.1.4.1 Auditory-Verbal Recall

The first comparison looked at the classifications of children as rehearsing or not based on: i) observations and ii) self-report data (see Table 17).
For 30 of the 42 children (71%), the two classifications matched, as 10 children both rehearsed overtly and declared having used a rehearsal strategy, and another 20 children were not observed to rehearse consistently and also did not report using this strategy. The distributions of children into reporting rehearsal or not categories were significantly different depending on whether the children were classified as overtly rehearsing, χ²(1, N = 42) = 7.63, p = .006.

Table 17. Distribution of Children as Rehearsing or Not, Based on Observations and on Self-Report, Auditory-Verbal Recall

                          Reporting rehearsal
Overtly rehearsing        Yes       No        Total
Yes                       10        9         19
No                        3         20        23
Total                     13        29        42

Note: This classification is based only on data from trials of three to five words in length.

An additional 9 children who consistently rehearsed overtly did not report having done so. It is interesting to note that most of them were observed to rehearse quite frequently, between 4 and 22 times over 27 trials (M 11.2, SD 6.5). Three of these children reported having used single-item labelling, and were observed to use both rehearsal and labelling consistently (Cases #2, 16, and 37). The coders conservatively judged one child’s (Case #6, grade I, 7 years 1 month) response as imagery: “I just remembered them. Like I was like looking at something and then I said ‘that was my hand or that was my palm tree or that was something’”. It is worth noting that the coders observed this child to rehearse for 4 trials, and to label for 17 trials.
Another child (Case #28, grade III, 9 years 0 months) who was observed to rehearse 22 times reported using simplification: “If there was too many then I would just go to the three or four first ones… I just remembered them.” The other 4 participants (Cases #10, 17, 19, and 31) did not state having resorted to any strategy, although the coders observed them to rehearse while completing between 6 and 24 trials, and to label single items between 1 and 13 times across trials. Hence, the responses of these 9 children are clearly not incompatible with rehearsal; they appear to be incomplete rather than incorrect.

An additional 3 children reported using a rehearsal strategy but did not meet the criterion of three observations within a condition (dissimilar, PS, or VS), although each was observed to rehearse once or twice over the 27 trials. One child (Case #12, grade II, 7 years 7 months) illustrated that she had used a chunking strategy following presentation of the second item: “I say it over and over in my head. Brush, flag. Brush, flag, ten, arm. Flag, ten, arm. {beep plays} Brush, flag, ten, arm {as a response, after the beep}…But I wouldn't do it like the first letter[35] because I only do it in doubles”. Another participant (Case #33, grade IV, 9 years 8 months) clearly described using cumulative rehearsal: “Every time it said a word, I'd say it over and over. And then, when it came to the next word, I'd say them both over and over until all the words were there.” This child also qualified as overtly rehearsing based on trials of six words, as the coders observed rehearsal on three of the nine trials at the longer length. Another child (Case #34, grade IV, 9 years 8 months) also expressed using cumulative rehearsal: “Like as soon as I heard them…I would go tent. And then I would go tent, kite. And then I would go tent, kite, heart”.

[35] It is perhaps surprising, but many children had trouble referring to words, and often called them letters.
The latter 2 children also responded positively when the examiner asked whether they had done this silently. All told, it is highly plausible that these 3 children were using covert rehearsal, which would account for the mismatch between observational and self-report data. This was, however, a fairly rare phenomenon, as only 3 children can be confidently assumed to have been rehearsing covertly. When observations and self-report were combined, 22 children fell into the likely rehearsing category.

Returning to the other 20 children (i.e., those who were likely not rehearsing), 11 (Cases #1, 7, 8, 9, 14, 15, 20, 22, 23, 26, and 42) were observed to label at encoding on at least three trials within a condition; 7 of these 11 participants also reported labelling as their strategy, 1 child indicated counting (Case #22), and 3 did not declare any strategy (Cases #1, 7, and 8). Among the other 9 children who were not observed to label consistently, 5 nonetheless reported this as their strategy, and the coders observed all but one of them to label at least once. One child (Case #21, grade II, 8 years 2 months) who was observed to label three times and to also rehearse three times over the 27 trials, reported using a simple word chaining strategy which was coded as labelling but actually could be considered a simple form of semantic elaboration[36]: “I made up a sentence or tried as hard as I could to remember. ‘Bench in a train with an ant in a house’”. The coders observed 2 other children to each label only twice, but their responses clearly corresponded to labelling: Case #25, grade III, 8 years 10 months, “I said it over once in my head. I would say it at the same time. {Child repeats words with the recording, one at a time} Kite. House. Wheel.”; Case #41, grade IV, 10 years 6 months, “Well, I found it effective if you close your eyes to keep your attention away from anything but the noise… Sometimes if you mouth the words it helps a bit. Heart. Shell. Bridge. Drum.” {said while example plays, one word at a time}. The only participant who was never observed to either label or rehearse (Case #40, grade IV, 10 years 6 months) unequivocally indicated labelling as his strategy: “Just saying them in my head. Those words… Just one at a time.” Only 1 child (Case #3, grade I, 6 years 10 months), who was very young, provided a rather vague response: “I just put it in my brain like while I was saying all the other ones in my brain and then saying them out I remembers the other ones”; although both the coders judged this response as labelling, some doubt could remain about how to code it.

[36] This is one of the three responses that were suggestive of simple semantic elaboration strategies. They were added to labelling as they were judged to be more similar than different.

Table 18. Distribution of Participants by Likely Strategy, Based on Combined Data From Observation and Self-Report, Auditory-Verbal Recall

Strategy                  Frequency    Percent
Likely rehearsing         22           52.4
Likely labelling          15           35.7
Likely doing neither      5            11.9
Total                     42           100.0

Note: This classification is based only on data from trials of three to five words in length.

Altogether, children’s self-reports proved to be highly accurate if incomplete, with significant doubt remaining about the report of only 1 child (Case #3). Combining self-reports and observational data resulted in 22 children classified as likely rehearsing and 15 as likely labelling (i.e., using single-item labelling; see Table 18).[37]

[37] If a given child met either the observation or the self-report criteria for more than one strategy, the final classification gave precedence to verbal strategies over others, and to rehearsal over labelling.

The other 5 were judged to likely not be using active verbal mediation (either labelling or rehearsal) in any consistent way.
This included the child (Case #3) who reported something vague that could only tentatively be classified as one strategy or another (see above). He was very young, observed to label only once, and never observed to rehearse. Another 3 did not report any strategy: one was very young (Case #4, grade I, 6 years 10 months), was never observed to label and observed to rehearse only once; another (Case #13, grade II, 7 years 8 months) was observed to label 4 times over all 27 trials, never observed to rehearse, and obtained a highest span of only 3 words; the other (Case #24, grade III, 8 years 6 months) was older, yet never observed to label or rehearse and had a highest span of only 3 words. Finally, 1 child (Case #5, grade I, 7 years 1 month) reported using another strategy (i.e., counting), and was observed to label twice and to rehearse three times; given her age, it is plausible that she was using verbal mediation only sporadically.

Table 19. Distribution of Children With or Without a Phonological-Similarity Effect (PSE) and Likely or Not to Have Been Rehearsing, Auditory-Verbal Recall

             Likely rehearsing
PSE          Yes       No        Total
Yes          22        17        39
No           0         3         3
Total        22        20        42

Note: This classification is based only on data from trials of three to five words in length.

The next analysis contrasted the distributions of children in terms of whether or not they exhibited a significant phonological-similarity effect in Auditory-Verbal Recall, and whether or not they were likely rehearsing (based on observation and self-report combined) while completing this task (see Table 19). The 3 children who did not present with a phonological-similarity effect (Cases #4, 26, and 42) also did not fall into the likely rehearsing category. Among the children who presented with a phonological-similarity effect, only 22 of the 39 (56%) were judged to be likely rehearsing (19 observed, 3 reported), whereas 17 (44%) were not.
There were too few cases in the group without a significant phonological-similarity effect to reliably test whether these distributions were significantly different. When both likely rehearsing and likely labelling were combined (see Table 20), among the children who presented with a phonological-similarity effect another 13 (33%) fell in the likely labelling category (9 observed, 4 reported), which brought the total of children who were likely rehearsing or labelling to 35 of 39 (90%). This left only 4 children (10%; Cases #3, 5, 13, and 24) who were most likely not using either labelling or rehearsal (either overtly or covertly) with any consistency, yet did nonetheless show a phonological-similarity effect. It is interesting to note that 3 of these children were observed to label or to rehearse sporadically, including for PS trials. However, they did not do so sufficiently to meet the minimum criterion of three of nine trials for any condition. Finally, of the 3 children who were not sensitive to phonological similarity, two fell in the likely labelling category (Cases #26 and 42), and the third was likely not labelling or rehearsing (Case #4). One of these children (Case #42) had a very low span for his age; he was also observed to label only three times and to rehearse once, all four times in the VS condition.

Table 20. Distribution of Children With or Without a Phonological-Similarity Effect (PSE) and Likely or Not to Have Been Labelling or Rehearsing, Auditory-Verbal Recall

             Likely labelling or rehearsing
PSE          Yes       No        Total
Yes          35        4         39
No           2         1         3
Total        37        5         42

Note: This classification is based only on data from trials of three to five words in length.

Overall, there was very good convergence (i.e., for 35 of the 42 participants) between the presence of a phonological-similarity effect and evidence of active verbal mediation in the form of labelling or rehearsal in Auditory-Verbal Recall, a task that required the immediate serial recall of words.
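As a check on the arithmetic, the agreement rate and the chi-square statistic reported above for Table 17 can be recomputed from its four cell counts. The sketch below is an illustration only and not the analysis procedure used in the study: it assumes a Pearson chi-square test without continuity correction (the text does not state which variant was used), and the function name chi_square_2x2 and the variable names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2 x 2 table
    [[a, b], [c, d]]. Returns (statistic, p_value) with df = 1."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For df = 1, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Cell counts from Table 17
# (rows: overtly rehearsing yes/no; columns: reporting rehearsal yes/no).
a, b, c, d = 10, 9, 3, 20

chi2, p = chi_square_2x2(a, b, c, d)
# Agreement = children for whom observation and self-report matched.
agreement = (a + d) / (a + b + c + d)

print(f"agreement = {agreement:.0%}")
print(f"chi-square(1, N = {a + b + c + d}) = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the output agrees with the values given in the text for Table 17 (71% agreement, χ² = 7.63, p = .006). The same function can be applied to the other 2 × 2 classifications in this chapter, bearing in mind that the test is unreliable when expected cell counts are very small, as noted above for Tables 19 and 20.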
3.1.4.2 Visual Recognition

The first analysis compared the coders’ classifications of children as rehearsing or not based on observational and self-report data (see Table 21). Here, the classification matched for 26 of the 42 children (62%), as 10 participants rehearsed overtly and reported having used a rehearsal strategy, and another 16 were not observed to rehearse consistently and did not report this strategy. However, the distributions of children into reporting rehearsal or not categories were not significantly different depending on whether the children were classified as overtly rehearsing, χ²(1, N = 42) = 2.1, p = .14.

Table 21. Distribution of Children as Rehearsing or Not, Based on Observations and on Self-Report, Visual Recognition

                          Reporting rehearsal
Overtly rehearsing        Yes       No        Total
Yes                       10        9         19
No                        7         16        23
Total                     17        25        42

Note: This classification is based only on data from trials of three to five pictures in length.

An additional 9 participants (Cases #4, 6, 9, 17, 21, 28, 29, 31, and 38) were observed to rehearse quite consistently (between 4 and 16 times over the 27 trials, M 7.8, SD 3.8) but did not report using this strategy. However, all of these children were also frequently observed to label (between 15 and 27 times, M 21.8, SD 4.5), and 7 of them reported having used a labelling strategy. One child (Case #28, grade III, 9 years 0 months) reported using simplification, yet was observed to label on 18 trials and to rehearse on 4 trials: “If it was too high, then I would just go to the three or four other ones…I was concentrating”. The only participant (Case #17, grade II, 8 years 1 month) who was not attributed any strategy based on self-report actually labelled overtly for 18 trials and rehearsed aloud for 8 trials. Her response to the post-task questions was, however, very vague: “Try to think. Try to remember what they are… Try to remember in my head”.
Hence, it is highly plausible that all these children were in fact rehearsing at least on some trials.

A final group of 7 children (Cases #3, 10, 12, 27, 33, 34, and 42) claimed to have used a rehearsal strategy but were not observed to do so consistently (i.e., did not meet the criterion of three observations within a condition). Four of these children were nonetheless observed to rehearse at least once across all trials of three to five pictures in length. The coders observed one participant (Case #12, grade II, 7 years 7 months) to rehearse four times over 27 trials, and this child’s response clearly indicated using covert grouped rehearsal: “I said it over in my head like cat, bat, map. Cat, bat, map. Cat, bat, map”. Another child (Case #27, grade III, 8 years 11 months) was observed to rehearse overtly only once, yet clearly demonstrated cumulative rehearsal when asked what she had done to remember: “I was repeating the things over and over. Rope. Rope, bridge. Rope, bridge, ant. Rope, bridge, ant, shell. Rope, bridge, ant, shell”. A similar situation occurred with another child (Case #33, grade IV, 9 years 8 months) who rehearsed overtly only once: “I said them over and over until the next picture appeared and then I added that to ‘em”. This child was also classified as overtly rehearsing based on trials of six pictures. For the last child (Case #10, grade I, 7 years 6 months), the situation was not so clear cut. The coders did observe her to rehearse on two occasions and judged her response as chunking, although it was somewhat vague: “I just said the words twice”. However, when shown an example, she did illustrate grouped rehearsal at encoding (“Bench, train, bug. Bench, train, bug. House, house”) followed by single-item naming while marking responses on the computer screen (“Bench. Train. Bug. House”).
On the other hand, the other 3 children in this subgroup were never judged to have used overt rehearsal based on observation although they declared having resorted to this strategy. Two of these children were older, gave explicit responses illustrating cumulative rehearsal, and stated that they were silently verbalising. One of the two (Case #42, grade IV, 10 years 8 months) also reported using single-item labelling on some trials and was actually observed to do so four times: “I would say it in my head or said it a little bit out loud. Rope. Bridge. Ant. Or I'd go rope, bridge. Rope, bridge, ant. Rope, bridge, ant, shell.” The other child (Case #34, grade IV, 9 years 8 months) was never observed to verbalise at all during Visual Recognition, but did claim to be using the same strategy as for Auditory-Verbal Recall, where the coders observed her to use rehearsal on one trial: “I kind of did the same thing as yesterday. Like say it was cat, pan, mask, can. So as soon as they showed up, I would say in my head cat. And then as soon as pan showed up, I would say cat, pan. And then I would say cat, pan, mask. And then cat, pan, mask, can.” The last child who was never observed to rehearse was quite young (Case #3, grade I, 6 years 9 months). His response was not as clear: “I kinda put ‘em in my brain... I kinda like said ‘em twice or five times or something”. Although both coders judged this response as multiple-item repetition, the child could have actually been referring to saying two words or five words (i.e., single-item labelling). The coders did observe him to frequently use single-item labelling of the pictures (for 19 of 27 trials). He was never observed to rehearse for Auditory-Verbal Recall either, and observed to label only once. Hence, excluding this last child (Case #3), 6 children were apparently using covert rehearsal while completing Visual Recognition.
Although this is twice as many as for Auditory-Verbal Recall, it remains an exceptional situation (14% of the participants). These covert rehearsers had a mean age of 108 months (SD 15.1, Range 91 to 128), and thus were more likely to be in the older range of the sample. Children’s self-reports again proved to be reliable, with significant doubt remaining about the report of only one child (Case #3) who may have been labelling rather than rehearsing. Combining those children who were observed to rehearse and/or who convincingly reported using a rehearsal strategy resulted in 25 children being classified as likely rehearsing.

Among the other 17 who were not likely rehearsing during Visual Recognition, 13 were observed to label quite consistently (between 4 and 26 times over all conditions, M 20.2, SD 6.5; Cases #1, 3, 7, 8, 15, 19, 20, 23, 24, 25, 26, 30, and 41) and thus met the criterion for overtly labelling. Nine of these participants also reported using a single-item labelling strategy, either in isolation or in combination with another strategy.[38]

[38] This included the child (Case #3) who gave a vague response which, to be prudent, was reclassified as labelling.

Finally, 3 of the 4 who were not observed to label consistently and were also never observed to rehearse, nonetheless reported using a labelling strategy. One child (Case #5, grade I, 7 years 1 month) labelled overtly only once, yet her response clearly corresponded to labelling: “As they were on the screen I would say them in my head… Rope. Bridge. Ant, Shell.” Another participant (Case #14, grade II, 7 years 10 months) who the coders observed to label overtly only once, nonetheless provided a response suggesting that he was using both labelling and imagery: “Well I was remembering them in my mind so I would find them out on to the computer.
And I tried pretty much my best but some of them actually escaped out, like went out of my mind and I couldn't remember it…And some of them come back… I was saying the words in my mind then remembering the pictures and what they looked like”. The third child (Case #40, grade IV, 10 years 6 months) responded only: “I was saying them in my head”, but never gave any further indication to suggest that this was more than single-item labelling. Only one child (Case #13, grade II, 7 years 8 months) did not provide enough evidence to be credited with labelling. The coders observed him to name the pictures only twice and his response was imprecise: “I was looking at the pictures”. In total, this resulted in 16 children classified as likely labelling, in addition to the 25 who were likely rehearsing, and 1 child who was probably not actively resorting to either labelling or rehearsal as a strategy (see Table 22).

Table 22. Distribution of Participants by Likely Strategy, Based on Combined Data from Observation and Self-Report, Visual Recognition

Strategy                  Frequency    Percent
Likely rehearsing         25           59.5
Likely labelling          16           38.1
Likely doing neither      1            2.4
Total                     42           100.0

Note: This classification is based only on data from trials of three to five pictures in length.

The next analysis contrasted classifications of children as being likely rehearsing or not in Visual Recognition (based on observation and self-report combined) to data regarding the phonological-similarity effect in this task (see Table 23). There was good correspondence between these two measures for only 20 participants, 14 of whom showed a significant phonological-similarity effect and were also likely rehearsing, and another 6 who did not present with a phonological-similarity effect and also fell in the non-rehearsing category. On the other hand, 11 of the 17 (65%) children who were likely not rehearsing consistently during this task nonetheless presented with a phonological-similarity effect.
Not surprisingly, the distributions into the likely rehearsing or not categories were not significantly different depending on whether the children presented with a phonological-similarity effect, χ²(1, N = 42) = 0.32, p = .57.

Table 23. Distribution of Children With or Without a Phonological-Similarity Effect (PSE) and Likely or Not to Have Been Rehearsing, Visual Recognition

             Likely rehearsing
PSE          Yes       No        Total
Yes          14        11        25
No           11        6         17
Total        25        17        42

Note: This classification is based only on data from trials of three to five pictures in length.

All 25 children who presented with a significant phonological-similarity effect in Visual Recognition were, however, either likely rehearsing (n = 14) or likely labelling (n = 11; see Table 24). Nonetheless, among the 17 children who did not show a phonological-similarity effect for this task, 5 (29%) were judged to have been likely labelling. Perhaps even more unexpectedly, another 11 (65%) were deemed to have been likely rehearsing. The distributions depending on whether or not children were likely using labelling and/or rehearsal did not appear to differ depending on whether they presented with a phonological-similarity effect; however, there were too few cases of children who neither labelled nor rehearsed to reliably test for a statistical difference.

Table 24. Distribution of Children With or Without a Phonological-Similarity Effect (PSE) and Likely or Not to Have Been Labelling or Rehearsing, Visual Recognition

             Likely labelling or rehearsing
PSE          Yes       No        Total
Yes          25        0         25
No           16        1         17
Total        41        1         42

Note: This classification is based only on data from trials of three to five pictures in length.

Among those who were likely labelling and did not show the effects of phonological similarity, 2 probably relied on visual strategies as well: one presented with a visual-similarity effect (Case #15) and another reported having relied on the pictures to remember (Case #14).
Interestingly, the absence of a phonological-similarity effect among children who rehearsed cannot be explained by a failure to rehearse in the PS condition, as this pattern was observed for only 1 child (Case #31). In fact, 8 of the 11 who did not present with a significant phonological-similarity effect (Cases #2, 4, 11, 21, 29, 32, 34, and 39) were observed to rehearse between two and five times for PS trials, whereas the other 2 (Cases #27 and 36) were never observed to rehearse consistently but rather were judged to have rehearsed covertly based on self-report. Among the 11 children who did not show an effect of phonological similarity yet fell in the likely rehearsing category, 1 participant (Case #29, grade III, 9 years 0 months) may have been using multiple strategies: he was observed to rehearse 10 times in total, presented with a visual-similarity effect, and reported using a strategy that may have been semantic elaboration: “My trick was like making up a story and then I copy it like it”. He also had a highest span of only 3 pictures, which is low for his age. Another child (Case #4, grade I, 6 years 10 months) who reported using item labelling was observed to rehearse a total of seven times, and actually did well for his age, obtaining a highest strict span of 4 pictures. He also exhibited a pattern suggestive of a practice effect, recognising correctly 29 pictures in the VS condition compared to 21 and 20 for dissimilar and PS conditions respectively. All 9 of the remaining children were near ceiling and completed additional trials of six pictures in length for Visual Recognition. On the basis of trials of three to six pictures, 4 of them (Cases #31, 32, 34, and 36) did exhibit a decrement in performance for the PS conditions resulting in a significant phonological-similarity effect; this effect may have been obscured in the shorter trials because these children were near ceiling.
One child (Case #21) did equally well across all conditions (pictures correct, dissimilar 44, PS 45, VS 44) and provided a response that may have reflected a simple form of semantic elaboration (“I made a funny sentence to remember…. A bench by a train with an ant by a house”). The other 4 (Cases #2, 11, 27, and 39) actually had less accurate serial recognition for the dissimilar than for the PS condition, with differences ranging between 5 and 7 pictures. Three of these 4 participants (Cases #2, 11, and 39) also showed improvement between dissimilar and VS conditions. Taken together, these facts suggest practice effects benefitting both the PS and the VS conditions for 4 of the children, a phonological-similarity effect that appeared only for the longer trials for another 4 participants (including one who selectively rehearsed overtly only for the VS and the dissimilar trials), and reliance on multiple strategies for 2 children. This left 1 child who was not hindered by phonological similarity regardless of trial length, but rather performed best with the PS trials (Case #27).

3.1.4.3 Comparing tasks

Overall, there was correspondence across tasks for 32 (76%) of the 42 participants, with 20 who were likely rehearsing in Auditory-Verbal Recall and Visual Recognition, another 11 who were likely labelling in both tasks, and 1 child (Case #13) who probably did not ever actively resort to either strategy (see Table 25). Among the other 10 children, 8 showed a more active or complex strategy with Visual Recognition: 4 children labelled in the Verbal task and rehearsed in the Visual task (Cases #9, 21, 22, and 42); 4 children did not seem to actively use any verbal mediation in Auditory-Verbal Recall yet in Visual Recognition they either labelled (Cases #3, 5, and 24) or rehearsed (Case #4). The other 2 children (Cases #19 and 30) rehearsed in Auditory-Verbal Recall and labelled in Visual Recognition.
In addition, for 7 of the 10 children, the more elaborate strategy was used for the memory task that they completed second, including the 2 children who likely rehearsed for Auditory-Verbal Recall and likely labelled for Visual Recognition.

Table 25. Distribution of Participants by Likely Strategy in Auditory-Verbal Recall and Visual Recognition Based on Observation and Self-Report

                              Visual Recognition
Auditory-Verbal Recall        Rehearsing    Labelling    Neither    Total
Rehearsing                    20            2            0          22
Labelling                     4             11           0          15
Neither                       1             3            1          5
Total                         25            16           1          42

3.1.4.4 Summary

On the whole, there is much evidence that many children used verbal mediation while they were completing both the memory for words and the memory for pictures tasks. The data are also indicative of variability between tasks and among the participants. The combination of observational and self-report data overlapped to a considerable degree with the information provided from the presence of a phonological-similarity effect, if and only if both single-item labelling and rehearsal were used as indices of active verbal mediation. The second part of this chapter will focus more closely on some individual differences between children that may have related to the likelihood of them resorting or not to rehearsal in the memory for words and pictures tasks.

3.2 Interactions between child-related variables and the presence of rehearsal

The second purpose of this study is to explore whether individual differences predicted whether a child would be more or less likely to use verbal rehearsal in the Auditory-Verbal and Visual memory tasks. These individual factors include grade (or age), as well as measures of language and cognitive abilities.
The objectives are to determine: i) whether the overall results regarding phonological similarity and visual similarity were consistent across grades; and ii) whether rehearsal behaviour (based on observational data and self-reported strategies) varied systematically depending on age, nonverbal cognitive skills, and language abilities.

3.2.1 Demographic data, cognitive and language scores, by grade

Grade is a variable of major interest, as there is some evidence that the changing instructional styles of teachers with advancing grades may support the development of memory strategies (F. J. Morrison, Smith, & Dow-Ehrensberger, 1995). Consequently, as a first step, it is important to verify whether the children were otherwise similar across the grades for variables that may have impacted the likelihood of participants having rehearsed.³⁹ Additionally, age will sometimes be used as a proxy, as it has the advantage of being a continuous variable.

³⁹ In these analyses, the primary concern is to guard against Type II error. As such, although four correlational analyses were performed, the critical p-value was maintained at .05.

Table 26 presents demographic data, as well as children’s scores on the language and cognitive tests and the two memory tasks, by grade. Age ranges for successive grades overlapped somewhat but, predictably, age increased steadily with grade. On the other hand, in terms of socio-economic status, a one-way ANOVA using the Brown-Forsythe correction for unequal variances and Tukey HSD posthoc testing confirmed that the participants were well matched across grades regarding the level of maternal education, F*(3, 29.6) = 1.0, p = .40, and all ps > .36 for pairwise differences between grades. Similarly, children appeared to be well matched across the grades in terms of their general expressive language and comprehension abilities based on standard scores (i.e., when controlling for age differences).
In fact, the mean Narrative Language Ability Index (NLAI) from the Test of Narrative Language (TNL) did not differ between the grades, F(3, 38) = 0.86, p = .47; all pairwise comparisons ps > .54. Regarding language production or fluency, as expected, children showed faster completion times with increasing grade for the Rapid Automatic Naming of Colours and Animals (RAN); correspondingly, times decreased significantly with increasing age, r(42) = −.427, p = .002 one-tailed, Adjusted R² = .162.⁴⁰ It was also anticipated that the total number of items remembered correctly would increase with grade. In Auditory-Verbal Recall, the total number of words recalled correctly for trials of three to five words in length did increase somewhat with grade, but there was considerable variability within each grade, and much overlap between them (see Table 26). A one-way ANOVA indicated that the total number of words correct did not differ between the grades, F(3, 38) = 1.2, p = .33. Nonetheless, there was a small yet significant positive correlation between age and total words recalled correctly, r(42) = .29, p = .032 one-tailed, Adjusted R² = .06. On the other hand, the increase in pictures correct with grade was more consistent for Visual Recognition, as revealed by the one-way ANOVA, F(3, 38) = 3.0, p = .04. In this case, the correlation between the number of pictures correct and age was larger, and again positive and significant, r(42) = .407, p = .004 one-tailed, Adjusted R² = .145.⁴¹

⁴⁰ The results were essentially identical when the correlation was performed with 2 bivariate outliers removed, r(39) = −.437, p = .002 one-tailed, Adjusted R² = .169, suggesting that RAN times did in fact tend to decrease with age.

Table 26.
Demographic Data and Scores on Tests of Language, Cognition, and Memory, by Grade

Grade         Age (mos)   Maternal education (yrs)   TONI-3   TNL NLAI   RAN (s)   AV, words correct   Vis, pictures correct
I      M      84.8        14.6                       107.8    112.90     48.5      65.8                65.4
       SD     3.9         2.6                        16.2     12.09      14.5      18.7                15.4
       Min    80          12                         83       91         25        23                  47
       Max    90          20                         129      133        75        92                  92
II     M      95.3        14.2                       96.7     117.25     39.4      66.7                74.5
       SD     3.8         2.0                        12.0     11.16      7.8       24.4                22.5
       Min    87          12                         81       103        30        25                  37
       Max    101         18                         118      136        54        93                  98
III    M      107.30      15.0                       96.2     110.80     38.7      73.0                74.8
       SD     4.6         2.5                        12.4     9.92       12.1      18.3                20.2
       Min    99          12                         81       94         27        51                  42
       Max    115         18                         115      121        64        104                 103
IV     M      122.10      13.4                       109.8    117.10     33.0      80.3                90.1
       SD     4.4         1.1                        15.2     11.23      4.4       15.5                14.7
       Min    116         12                         84       94         25        53                  61
       Max    128         16                         138      133        41        103                 106

Note. Maternal education corresponds to number of years of schooling. TONI-3 = Test of Nonverbal Intelligence 3 (L. Brown et al., 1997); mean quotient 100, standard deviation 15. TNL = Test of Narrative Language (Gillam & Pearson, 2004), NLAI = Narrative Language Ability Index, mean standard score 100, standard deviation 15. RAN, Rapid Automatic Naming of Colours and Animals. Total words or pictures correct based on trials of three to five items in length, all conditions combined, strict scoring for both items and serial position, Maximum 108. AV, Auditory-Verbal Recall. Vis, Visual Recognition. N = 42: grade I, n = 10; grade II, n = 12; grade III, n = 10; grade IV, n = 10.

⁴¹ Words or pictures correct for trials of three to five items in length was used as the memory performance measure here in order to maximise comparability across children. The results were, however, essentially equivalent when the analyses were based on total trials correct for trials of three to six items, which took into account the performance of children who went on to complete the longer trials: Auditory-Verbal Recall, r(42) = .268, p = .043 one-tailed, Adjusted R² = .048; Visual Recognition, r(42) = .442, p = .002 one-tailed, Adjusted R² = .175.

[Figure 8 not reproduced: scatterplot of TONI-3 quotient (y-axis, 70 to 140) against age in months (x-axis, 80 to 130), with grades I to IV marked separately.]
Figure 8. Scores on the Test of Nonverbal Intelligence (TONI-3), by age and by grade

In terms of nonverbal cognitive abilities, scores varied considerably for all grades on the Test of Nonverbal Intelligence (TONI-3). Unexpectedly, the children in grades I and IV obtained somewhat higher mean standard scores than those in the middle grades. A possible difference in nonverbal intelligence for children of different grades could have been problematic. However, the scatterplot of ages by TONI-3 scores did not reveal any systematic relationship (linear or otherwise) between these two variables (see Figure 8). To be prudent, possible differences between grades in TONI-3 quotient scores were further investigated. Although the one-way ANOVA omnibus test just missed the significance criterion, F(3, 38) = 2.77, p = .055, Tukey’s HSD posthoc testing found no pairwise significant differences between any of the grades, with all ps > .14; nonetheless, four of the six pairwise comparisons produced large effect sizes (grade I vs. II, d = 0.83; grade I vs. III, d = 0.84; grade II vs. IV, d = 1.02; grade III vs. IV, d = 1.03). This would be a concern when making any comparisons across grades if our major variables of interest were related to performance on the TONI-3.
Follow-up one-way ANOVAs indicated that children did not differ significantly in terms of TONI-3 quotient depending on whether or not: i) they presented with a phonological-similarity effect in either Auditory-Verbal Recall or Visual Recognition; ii) they presented with a visual-similarity effect in Visual Recognition; or, iii) they were likely rehearsing for either Auditory-Verbal Recall or Visual Recognition, all ps > .25.

These preliminary analyses have identified that the children were reasonably well-matched across the grades. In addition, some expected trends appeared, as age was significantly related to language production/fluency (as revealed by a negative correlation with RAN completion times) as well as to total items correct in both Auditory-Verbal Recall and Visual Recognition. These results will be taken into consideration in later analyses.

3.2.2 Phonological-similarity effects and visual-similarity effects

3.2.2.1 Auditory-Verbal Recall

The first analysis explored whether the general conclusions regarding the phonological-similarity effect were consistent across the grades. To reiterate the results for the entire sample, the group showed a large and significant phonological-similarity effect (d = 1.3), with a mean of 5.6 additional words recalled in the correct order in the dissimilar condition compared to the PS condition, and 39 of the 42 participants presented with a significant effect. Some children also apparently benefitted from practice, which resulted in significantly fewer words recalled in the correct order in the dissimilar compared to the VS condition (Mdiff = −1.7, d = 0.38). Given the possibility that the size of the phonological-similarity effect could vary with development, additional comparisons were done at each grade level. As is evident from Figure 9 and Table 27, the pattern of results for the three conditions (dissimilar, PS, and VS) was very similar across the grades.
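The grade-level contrasts in this section are judged against critical p-values adjusted with the Holm method (footnote 42). As a minimal sketch of how such step-down thresholds are obtained, the following uses the rounded p-values reported for the four grade contrasts (.001, .002, .006, .002); the function name and data layout are assumptions introduced here, not the procedure as actually run for the thesis.

```python
def holm_critical_values(p_values, alpha=0.05):
    """Apply Holm's step-down rule to a {label: p} mapping.

    Returns {label: (p, critical_p, rejected)}. The smallest p-value is
    compared with alpha / m, the next with alpha / (m - 1), and so on;
    once one comparison fails, all later ones are retained as well.
    """
    m = len(p_values)
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    results = {}
    still_rejecting = True
    for rank, (label, p) in enumerate(ordered):
        critical_p = alpha / (m - rank)
        still_rejecting = still_rejecting and p <= critical_p
        results[label] = (p, critical_p, still_rejecting)
    return results


# Rounded p-values for the dissimilar vs. PS contrast at each grade.
p_by_grade = {"Grade I": 0.001, "Grade II": 0.002, "Grade III": 0.006, "Grade IV": 0.002}

holm = holm_critical_values(p_by_grade)
for grade, (p, critical_p, rejected) in holm.items():
    print(f"{grade}: p = {p:.3f}, pcrit = {critical_p:.4f}, significant = {rejected}")
```

With four contrasts the thresholds are .0125, .0167, .025, and .05, the same set listed in footnote 42. Grades II and IV are tied at the rounded p = .002, so which of the two receives .0167 and which .025 depends on the unrounded values; either way each contrast falls below its threshold.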
A repeated measures ANOVA and planned contrasts with critical p-values adjusted for multiple comparisons⁴² revealed a significant phonological-similarity effect for each grade level, with more words recalled for the dissimilar compared to phonologically-similar blocks, and large effect sizes in all cases: grade I, Mdiff = 6.70, F(1, 9) = 26.9, p = .001, d = 1.64; grade II, Mdiff = 5.92, F(1, 11) = 17.2, p = .002, d = 1.20; grade III, Mdiff = 5.90, F(1, 9) = 13.1, p = .006, d = 1.15; grade IV, Mdiff = 3.70, F(1, 9) = 18.1, p = .002, d = 1.35.

⁴² For this analysis, the Holm method resulted in the following critical p-value for the four contrasts: Grade I, pcrit = .0125; Grade IV, pcrit = .0167; Grade II, pcrit = .025; Grade III, pcrit = .05.

[Figure 9 not reproduced: plot of mean words correct (y-axis, 4 to 32) for the dissimilar, PS, and VS conditions at grades I to IV.]
Figure 9. Mean number of words remembered correctly, by condition and by grade, Auditory-Verbal Recall

Table 27. Number of Words Correct by Condition and by Grade, Auditory-Verbal Recall

Grade   Condition       Min   Max   Mean   SD
I       Dissimilar      9     35    23.8   7.0
        PS              4     22    17.1   5.5
        VS              10    36    24.9   7.0
        Dissimilar-PS   −1    14    6.7    4.1
II      Dissimilar      9     33    23.8   7.9
        PS              4     25    17.8   7.5
        VS              7     35    25.1   10.2
        Dissimilar-PS   −5    12    5.9    4.9
III     Dissimilar      14    36    26.1   7.9
        PS              13    32    20.2   6.0
        VS              19    36    26.7   5.9
        Dissimilar-PS   −2    14    5.9    5.2
IV      Dissimilar      17    35    26.7   5.5
        PS              18    32    23.0   5.1
        VS              18    36    30.6   6.2
        Dissimilar-PS   −1    8     3.7    2.8

Note. Grade I, n = 10; grade II, n = 12; grade III, n = 10; grade IV, n = 10.

Another issue of interest was whether the factors leading to a phonological-similarity effect may have changed as children got older. With increasing grade, patterns of improvement in recall ma