UBC Theses and Dissertations


Validity of the Miller assessment for preschoolers in predicting later cognitive performance in children… Fulks, Mary-Ann Lesley 1996


VALIDITY OF THE MILLER ASSESSMENT FOR PRESCHOOLERS IN PREDICTING LATER COGNITIVE PERFORMANCE IN CHILDREN PRENATALLY EXPOSED TO DRUGS

by

MARY-ANN LESLEY FULKS

B.P.E., University of Alberta, 1984
B.Sc.O.T., University of Alberta, 1987

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
(School of Rehabilitation Sciences)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
January 1996
© Mary-Ann Lesley Fulks, 1996

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

Abstract

This study investigated the validity of the Miller Assessment for Preschoolers (MAP) in predicting later cognitive outcome in children with prenatal drug exposure, using the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R), the Test of Early Reading Ability (TERA-2), the Peabody Picture Vocabulary Test-Revised (PPVT-R), and the Developmental Test of Visual-Motor Integration (VMI) as the outcome measures. Criticisms of previous studies of the predictive validity of the MAP included the predominant use of correlational and t-test analysis rather than clinical epidemiological techniques, and the questionability of the predictive accuracy of the MAP's recommended 5th and 25th percentile cutpoints without investigation of the full possible range of cutpoints.
The primary purpose of this study was, therefore, to investigate the predictive validity of the MAP over a range of cutpoints, using clinical epidemiological techniques. The secondary purpose of the study was to compare the subjects' test performance to each test's norms to investigate how this sample of children with prenatal drug exposure performed in comparison to each test's normative sample. The subjects were 37 children with prenatal drug exposure who were 35 - 49 months old at the time of administration of the predictor measure, and 47 - 68 months old at the time of administration of the outcome measures. The MAP demonstrated the highest level of predictive accuracy when the WPPSI-R was used as the outcome measure, suggesting it was the most successful in identifying later intelligence in this sample of children with prenatal drug exposure. It was less successful at predicting later outcomes for more specific cognitive areas such as early reading behavior, receptive vocabulary and visual-motor integration, as assessed by the other three outcome measures. The 14th percentile MAP cutpoint was found to demonstrate higher overall levels of predictive accuracy than the test's recommended 5th and 25th percentile cutpoints. Compared to each test's norms, this sample of children approximated the normal distribution on the WPPSI-R and PPVT-R, but performed closer to the low average range on the MAP, TERA-2 and VMI. 
TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgment
Dedication
Chapter 1 Introduction
    Statement of the Problem
    Purposes of the Study
    Definition of Terms
Chapter 2 Literature Review
    Definition of Learning Disabilities
    Theories/Concepts of Learning Disabilities
    Issues in Early Detection of Learning Disabilities
    Learning Difficulties of Children with Prenatal Drug Exposure
    Previous Reviews/Studies of the MAP
    Underlying Theory in the Development of the MAP
    Proposed Relationship Between the MAP and the Outcome Measures in this Study
    Research Questions
Chapter 3 Methods
    Participants
    Predictor Measure
    Criterion Measures
    Procedure
    Data Analysis
Chapter 4 Results
    Summary of Participant Test Performance
    Research Question #1
    Research Question #2
    Research Question #3
    Research Question #4
    Research Question #5
    Research Question #6
    Research Question #7
    Research Question #8
    Summary
Chapter 5 Discussion
    WPPSI-R
    TERA-2
    PPVT-R
    VMI
    Clinical Profiles of Tests Scores
    Limitations
    Summary and Conclusions
References
Appendix

LIST OF TABLES

Table 1 Sensitivity of the MAP Total Score for each of the Four Outcome Measures
Table 2 Specificity of the MAP Total Score for each of the Four Outcome Measures
Table 3 Positive Predictive Value of the MAP Total Score for each of the Four Outcome Measures
Table 4 Negative Predictive Value of the MAP Total Score for each of the Four Outcome Measures
Table 5 Overreferral Rate of the MAP Total Score for each of the Four Outcome Measures
Table 6 Underreferral Rate of the MAP Total Score for each of the Four Outcome Measures
Table 7 Means and Standard Deviations for the Four Outcome Measures
Table 8 Predictive Accuracy Values for Reading Outcomes in Miller's (1986) Study and the Present Study (using the 5th Percentile Cutpoint)
Table 9 Predictive Accuracy Values for Reading Outcomes in Miller's (1986) Study
(using the 20th Percentile Cutpoint) and the Present Study (using the 26th Percentile Cutpoint)
Table 10 Predictive Accuracy Values for Language Outcomes in Miller's (1986) Study and the Present Study (using the 5th Percentile Cutpoint)
Table 11 Predictive Accuracy Values for Language Outcomes in Miller's (1986) Study (using the 20th Percentile Cutpoint) and the Present Study (using the 26th Percentile Cutpoint)

LIST OF FIGURES

Figure 1. ROC curve using the WPPSI-R for each of the MAP's four cutpoints.
Figure 2. ROC curve using the TERA-2 for each of the MAP's four cutpoints.
Figure 3. ROC curve using the PPVT-R for each of the MAP's four cutpoints.
Figure 4. ROC curve using the VMI for each of the MAP's four cutpoints.

Acknowledgment

I would like to express my appreciation to the members of my committee for sharing their time and expertise in the completion of this thesis. To Dr. Susan Harris, I express my admiration for her great research ideas and skills, and her dedication to teaching. My thanks to her for chairing the committee, her countless hours of editing and mentoring, for molding my research skills and improving the quality of my writing. To Dr. Lyn Jongbloed, I express my esteem for her insightful comments and kind approach to teaching. Her encouraging remarks, thoughtful questions and "speedy turnarounds" were most appreciated. To Ms. Linda Daniels, I express my respect for her accomplishments and knowledge in the field of pediatric occupational therapy and for her ability to see the "gestalt" of this study. Her willingness to share her knowledge, ideas and resources, and her dedication as a committee member on her own time, is deeply appreciated.

I would like to acknowledge and thank the Kiwanis Clubs of Kelowna for funding the original research project from which the data were obtained for this study. Appreciation is expressed to Sunny Hill Health Centre for Children and to Lori Roxborough for their support.
I would like to express my gratitude to the children and their families for their time and effort in participating in the original research project. A special thanks is extended to Ms. Kathy Hurley and Ms. Carmen Hurley for their countless hours of child minding. Chapters 2, 3 and 5 were made possible with their help. To my parents I extend my gratitude for teaching me to value education and for their assistance in my undergraduate years. Finally, deep appreciation and gratitude is extended to my husband Jay, for doing full-time at home as well as at work so that I could complete my thesis; to Alexis for her patience and for all those hours that she played on her own while Mommy worked on her "papers"; and to Jori for sharing her room with the computer and for sleeping through all those nights while I "clicked away".

Dedication

I dedicate this thesis to Jay, Alexis and Jori, for accompanying me on this journey ...

Validity of 1

CHAPTER 1

Introduction

The early detection of children with preacademic problems is important in order to identify and treat young children who are at risk for later learning difficulties. One test that purports to detect children with preacademic problems is the Miller Assessment for Preschoolers (MAP) (Miller, 1982). The MAP is a broad domained test for which the stated purpose is to " ... identify children who exhibit moderate 'preacademic problems' which may affect one or more areas of development, but do not have obvious or severe problems" (Miller, 1982, p. 1). More specifically, the goals of the test are: (1) to identify preschool-aged children with developmental delays who need further evaluation; and (2) to provide a clinical framework for identifying a child's strengths and weaknesses, and thereby indicate possible avenues of remediation (Miller, 1982, 1988c).
In developing the MAP, specific focus was given to factors believed to be associated with learning disorders and adjustment difficulties (Miller, 1982). The MAP is a norm-referenced test, standardized for preschool-aged children (2 years 9 months to 5 years 8 months) (Miller, 1982, 1988c). It comprises 27 items which are scored on a three-level nominal scale (score bands); percentile scores are then obtained for the Total score and each of the five performance indices. The Foundations Index measures neurological and neuromotor aspects of development. The Coordination Index includes gross, fine and oral motor tasks. The Verbal Index evaluates cognitive language abilities, while the Non-Verbal Index examines cognitive abilities that do not require spoken language. The Complex Tasks Index measures the interaction of sensory, motor and cognitive abilities, and the interpretation of visual-spatial information.

The definite problem range ("At-Risk") is defined as scores between the 1st and 5th percentiles (Miller, 1988c, p. 3.2). The possible problem range ("Needs Observation") is defined as scores between the 6th and 25th percentiles (Miller, 1988c, p. 3.2). The normal range is defined as scores between the 26th and the 99th percentile. The MAP is not designed to measure above average abilities so the highest scores represent only average performance. The cutpoints of the MAP, the critical score levels below which children are classified as at-risk or possibly at-risk, are therefore the 5th and 25th percentiles, respectively. During test development, these cutpoints were set based on arbitrary decision-making (Miller, 1982).

Statement of the Problem

The purpose of the study was to evaluate the predictive validity of the MAP for children with prenatal drug exposure.
Since publication of the MAP in 1982, three major predictive validity studies have been conducted (Cohn, 1986; Lemerand, 1985; Miller, 1986) which have been published in several articles or reviews (e.g., Miller, 1987a, 1987b, 1988a, 1988b, 1990; Miller, Lemerand & Cohn, 1987). The results of these studies have generally been positive, with prediction rates reported to be as good as or better than those of any other standardized instrument evaluated in similar predictive studies (Miller, 1987b). However, the MAP's predictive studies have come under recent criticism, primarily for their predominant use of correlational and t-test analyses rather than analysis of predictive accuracy using clinical epidemiological techniques (Schouten & Kirkpatrick, 1993). The former types of analyses provide information that relies solely on group data, which can obscure high proportions of misidentified individual children (Keogh & Sears, 1991; Meisels, 1988). Analysis using clinical epidemiological techniques is necessary to obtain specific information about the level of predictive accuracy for individual children. Secondly, results reported from the clinical epidemiological analyses that have been completed on the MAP have not been fully inclusive of all three predictive studies (Schouten & Kirkpatrick, 1993). For example, a range of overreferral rates (children with normal development who are classified as at-risk) was reported based on two of the three studies, while underreferral rates (children with developmental problems who are classified as not at-risk) were omitted. Thirdly, the predictive accuracy of the MAP's recommended 5th and 25th percentile cutpoints has been questioned; a full range of possible cutpoints was not investigated as part of the development of the MAP's scoring system (Schouten & Kirkpatrick, 1993).
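The criticisms above turn on how individual children are classified at a given cutpoint rather than on group statistics. As a minimal sketch, assuming that a child is classified as at-risk when the MAP Total score percentile falls at or below the cutpoint, the following Python fragment tabulates the four possible classifications at the 5th, 14th and 25th percentile cutpoints. The scores and outcomes are invented for illustration, and the function names (`classify`, `tabulate`) are hypothetical; none of this is taken from the thesis data or from the MAP manual.

```python
from collections import Counter

CELLS = ("true_positive", "false_positive", "false_negative", "true_negative")


def classify(map_percentile, poor_outcome, cutpoint):
    """Place one child in a cell of the predictor-by-outcome table.

    Assumption: a percentile at or below the cutpoint counts as at-risk.
    """
    flagged = map_percentile <= cutpoint
    if flagged and poor_outcome:
        return "true_positive"
    if flagged:
        return "false_positive"   # overreferral: normal outcome, classified at-risk
    if poor_outcome:
        return "false_negative"   # underreferral: poor outcome, classified not at-risk
    return "true_negative"


def tabulate(children, cutpoint):
    """Count how many children fall in each cell at the given cutpoint."""
    counts = Counter(classify(p, poor, cutpoint) for p, poor in children)
    return {name: counts[name] for name in CELLS}


# Hypothetical (MAP percentile, poor outcome?) pairs, not real study data.
children = [(3, True), (10, True), (12, False), (20, True),
            (30, False), (45, False), (60, False), (80, False)]

for cutpoint in (5, 14, 25):
    print(cutpoint, tabulate(children, cutpoint))
```

In this invented example, raising the cutpoint from the 5th to the 25th percentile removes the two underreferrals but introduces one overreferral, which is the kind of trade-off that examining a range of cutpoints is meant to expose.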
With regard specifically to children with prenatal drug exposure, there has been an increase in drug usage in recent years by women of child-bearing age (Hans, 1989; Kronstadt, 1989; McCance-Katz, 1991; Schneider, Griffith & Chasnoff, 1989; Young, Vosper & Phillips, 1992). Day and Richardson (1994) report that the highest rates of use of alcohol, marijuana and cocaine are among women of childbearing age. In 1989, it was estimated that approximately 11% of newborns in the United States had been prenatally exposed to drugs (Chasnoff as cited by Schneider et al., 1989). In an inner city study in Vancouver, BC, 12.5% of infants born in 1989 and 1990 were determined to be drug-exposed, while another 10.7% were placed in a risk factor group (e.g., signs and symptoms of withdrawal, dysmorphic facial features, etc.) (Loock et al., 1993). Consequently, the outcomes for children who have been exposed to drugs in utero are of increasing concern (Bauer, 1991; Greer, 1990; Lewis, Schmeder & Bennett, 1992; McCance-Katz, 1991; Schneider et al., 1989; Schutter & Brinker, 1992; van Baar, 1990; Van Dyke & Fox, 1990). With the surge of crack-cocaine use in 1985, the educational system has particularly been affected as the first large group of children who were prenatally exposed to this drug began school in 1990 (Rist as cited by Bauer, 1991). The impact of this has led some writers in the educational field to call for increased support services and a new educational approach to deal with the behavioral and learning needs of this population (Bauer, 1991; Van Dyke & Fox, 1990). Children with prenatal drug exposure are often difficult to follow longitudinally because parents who abuse drugs tend to have unstable lifestyles characterized by frequent moves, lack of telephones and failure to keep appointments (Howard, Beckwith, Rodning, & Kropenske, 1989).
Therefore, there are few studies beyond the infant/toddler stage (Kaltenbach & Finnegan, 1989; Williams & Howard, 1993). Research to date has indicated that these children typically perform in the low normal range on cognitive or developmental milestone testing (Hans, 1989; Howard et al., 1989; Wilson, McCreary, Kean, & Baxter, 1979), and exhibit learning and behavioral difficulties (Davis & Templer, 1988; Greer, 1990; Householder, Hatcher, Burns, & Chasnoff, 1982; Howard et al., 1989; Schneider et al., 1989; Van Dyke & Fox, 1990; Wilson, 1989; Wilson, Desmond, & Wait, 1981). There is little research on the outcomes of preschool and school-aged children in this population (Davis & Templer, 1988; Kronstadt, 1989; McCance-Katz, 1991; Rodning, Beckwith & Howard, 1989).

Purposes of the Study

The primary purpose of the study was to evaluate the predictive validity of the MAP Total score over a range of cutpoints using clinical epidemiological techniques, for children who were prenatally exposed to drugs. The Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R) (Wechsler, 1989), the Test of Early Reading Ability-2 (TERA-2) (Reid, Hresko & Hammill, 1989), the Peabody Picture Vocabulary Test-Revised (PPVT-R) (Dunn & Dunn, 1981), and the Developmental Test of Visual-Motor Integration (VMI) (Beery, 1989) were the outcome measures. The secondary purpose was to contribute to the body of knowledge on outcomes of preschool-aged children prenatally exposed to drugs, by comparing the test performance of the children in this study to each test's normative sample.

Definition of Terms

The following section defines cognition, as well as terms regarding clinical epidemiology. The clinical epidemiological definitions are relevant to understanding many of the predictive validity studies which were discussed, as well as understanding the type of data analysis used for this study.

Cognition is defined as "...
that operation of the mind process by which we become aware of objects of thought and perception, including all aspects of perceiving, thinking, and remembering" (Dorland's Pocket Medical Dictionary, 1982). In this study, the term cognition was used inclusively to refer to all four outcome measures which assessed intelligence, early reading behavior, receptive vocabulary and visual-motor integration.

Clinical epidemiology refers to the study of groups of people to obtain the necessary background information for making clinical decisions about the direct care of patients (Feinstein, 1985). Clinical epidemiological statistical techniques are based on a 2 x 2 table (predictor criterion table) with four separate cells: (1) true positives (failed screening test and poor outcome); (2) false positives (failed screening test but good outcome); (3) false negatives (passed screening test but poor outcome); and (4) true negatives (passed screening test and good outcome) (Carran & Scott, 1992). The following are definitions of specific clinical epidemiological statistical techniques:

Sensitivity refers to the proportion of participants correctly identified as having a poor outcome (Carran & Scott, 1992; Domholdt, 1993; Miller, Lemerand & Schouten, 1990).

Specificity refers to the proportion of participants correctly identified as having a good outcome (Carran & Scott, 1992; Domholdt, 1993; Miller et al., 1990).

Positive predictive value is the proportion of participants correctly identified as positive by the test (identified as at-risk), compared to all the participants who tested positive (Carran & Scott, 1992; Domholdt, 1993).

Negative predictive value is the proportion of participants correctly identified as negative by the test (identified as not at-risk), compared to all participants who tested negative (Domholdt, 1993).

Overreferral rate refers to the number of false positives (Miller et al., 1990).
Underreferral rate refers to the number of false negatives (Miller et al., 1990).

Receiver operator characteristic (ROC) curve expresses the relationship between the sensitivity and specificity by a line graph (curve). It is used to illustrate a test's predictive accuracy over a range of cutpoints. Tests or cutpoints with high levels of predictive accuracy crowd the upper left corner of the curve, while less discriminative tests or cutpoints have curves that fall closer to a diagonal line. The larger the area under the curve, the more accurate the test (Fletcher, Fletcher & Wagner, 1988).

CHAPTER 2

Literature Review

The primary purpose of this study was to evaluate the predictive validity of the MAP with regard to later cognitive function for children with prenatal drug exposure. The MAP's purpose is to identify young children who show evidence of preacademic problems, and thereby identify those who may be at risk for later learning difficulties or learning disabilities (LD). This chapter presents a comprehensive definition of LD, including some of the controversy regarding defining and diagnosing LD. It is followed by theories of LD and issues surrounding the early identification of LD. Some background information is subsequently provided regarding what is currently known about the learning difficulties of children with prenatal drug exposure. More detailed information is then provided on previous predictive validity studies of the MAP, followed by the theories which were relevant to the development of the MAP. The proposed relationships between the MAP and the four outcome measures selected for this study are then presented. The chapter concludes with the research questions of the study.

Definition of Learning Disabilities

The term LD was adopted in 1963 (Haring, Lovett, Haney, Algozzine, Smith & Clarke, 1992; Lerner, 1993).
However, given the diverse nature of LD, developing a uniform definition which is acceptable to all has been a process plagued with difficulties since the inception of the term (Haring et al., 1992; Lerner, 1993; Reynolds, 1990a). There are several definitions of LD but the most widely used one in the U.S. is that incorporated in Public Law (P.L.) 101-476, the Individuals with Disabilities Education Act (IDEA) (1990), which is a revision of the earlier version of this legislation, P.L. 94-142, Education for All Handicapped Children Act (1975) (Lerner, 1993). The first part of the definition stated in P.L. 101-476 is:

    The term "children with specific learning disabilities" means those children who have a disorder in one or more of the basic psychological processes involved in understanding or in using language, spoken or written, which disorder may manifest itself in imperfect ability to listen, think, speak, read, write, spell, or to do mathematical calculations. Such disorders include such conditions as perceptual handicaps, brain injury, minimal brain dysfunction, dyslexia, and developmental aphasia. Such term does not include children who have learning problems which are primarily the result of visual, hearing, or motor handicaps, of mental retardation, of emotional disturbance, or of environmental, cultural, or economic disadvantage. (cited in Lerner, 1993, p. 9)

Part two of the definition is operational and states that: ...
a student has a specific learning disability if (1) the student does not achieve at the proper age and ability levels in one or more of several specific areas when provided with appropriate learning experiences, and (2) the student has a severe discrepancy between achievement and intellectual ability in one or more of these seven areas: (a) oral expression, (b) listening comprehension, (c) written expression, (d) basic reading skill, (e) reading comprehension, (f) mathematics calculation, and (g) mathematics reasoning. (cited in Lerner, 1993, p. 9)

Other organizations and other countries have developed their own definitions of LD (Lerner, 1993). In Canada, most provinces have developed programs for individuals with LD. This plethora of definitions, both within and between countries, indicates that the diagnosis of LD has become widely recognized and accepted. However, it also reflects the diverse nature of LD and the lack of feasibility in developing a single definition. Keogh (1987) supports a multidefinitional approach to LD with the development of a taxonomy of LD. There is a need for several definitions in order to encompass the different types of LD, as well as to satisfy the various professionals involved in the field. Several definitions are also needed to address the various populations, age levels and degrees of severity of LD (Lerner, 1993).

Despite the existence of various definitions of LD, there are some common elements (Lerner, 1993).
These include the view (1) that LD is caused by dysfunction in the central nervous system; (2) that there is an uneven growth pattern in mental abilities in that while some mental components are developing at age level, other components are lagging behind; (3) that there are various ways that difficulties in academic and learning tasks may manifest themselves (e.g., one individual with LD may have difficulty with oral language, while another individual with LD may have a problem with reading or written expression); (4) that the individual is experiencing a severe discrepancy between achievement and intellectual ability; and (5) that the LD is not primarily caused by other disabilities, such as mental retardation, emotional disturbance, visual impairment, etc.

Kirk and Chalfant (1984) have proposed classifying LD into two groups. One group would be developmental LD and would include prerequisite skills such as attention, memory, perception, thinking and oral language. The other group would be academic LD and would refer to the learning of school subjects such as reading, arithmetic, handwriting, spelling and written expression.

Although the federal definition in P.L. 101-476 provides some guidelines, it has been viewed as "... vague, subjective, and resulting in diagnosis by exclusion in many cases" (Reynolds, 1990a, p. 572). The lack of consensus regarding a readily operationalized definition leads to difficulties in identifying who is eligible for service (Reynolds, 1990a). Operationalization of the definition has varied significantly, which has resulted in great variation as to who is identified as having LD. One of the most difficult elements of the definition to operationalize is that of the severe discrepancy between achievement and potential (Reynolds, 1990a). This element is particularly important because at the hearings held during the development of P.L.
94-142, it was the only one that enjoyed full consensus regarding inclusion in the definition of LD. However, attempts to operationalize this element have resulted in the development of several different formulae. Some formulae provide an expected grade equivalent while others provide cutoffs for a severe discrepancy. Mathematical inadequacy is one of the problems with the formulae (e.g., ordinal data such as grade equivalents were treated as interval or ratio data). If cutoff scores are used, a mathematical limit is set as to the number of children who can be identified as having LD. For example, a cutoff of one standard deviation below the mean presupposes that 16% of the population has LD.

Therefore, it is not surprising that the prevalence rates for LD vary from 1% to 30% of the school population (Lerner, 1993). Agencies or organizations with stringent criteria report lower prevalence rates while the opposite is true for those organizations with lenient criteria. However, there has been a steady increase in the number of children identified with LD between the inception of P.L. 94-142 in 1977 and the enactment of P.L. 101-476 in 1990. This may be due to: (1) an increased awareness of LD; (2) improved procedures for identifying and assessing individuals with LD; (3) higher levels of social acceptance and preference for a diagnosis of LD compared to other diagnoses (e.g., mental retardation); (4) cutbacks in other programs or a lack of other educational alternatives for children who have difficulties learning in the regular classroom setting; and (5) court orders to reevaluate some minority children who were initially classified as mentally retarded (Lerner, 1993). Individuals with LD now represent close to half of all children receiving special services.

One of the difficulties in diagnosing LD is in distinguishing it from low achievement (Haring et al., 1992).
In one study, students who were identified as having LD and those who were identified as low achieving, were evaluated on various psychoeducational measures, including cognition, academic achievement, perceptual-motor abilities, behavior and self-concept. A median of 96% of the scores were within a common score range between these two groups (Ysseldyke, Algozzine, Shinn & McGue, 1982). In another study, psychologists and special education teachers were required to review the profiles of psychometric scores between students who were low achieving and those with LD, and then use their clinical judgment to identify which students were labeled as having LD, and which were low achieving (Epps, Ysseldyke & McGue cited in Ysseldyke & Algozzine, 1983). They were correct only half of the time, which demonstrates no better prediction than chance alone. Both these studies indicate that it is difficult for school personnel to differentiate, either psychometrically or clinically, between students who are low achieving and those identified as having LD (Haring et al., 1992).

Theories/Concepts of Learning Disabilities

Developmental psychology and cognitive psychology both have theories or concepts regarding the underlying causes of LD (Lerner, 1993). Maturation theory is grounded in developmental psychology where the ability to learn is based on a child's level of maturation. Maturation evolves in a sequential manner, cannot be accelerated, and stages cannot be bypassed. Maturational lag is the term used to describe a slowness in the development of certain neurological functions. Because abilities may mature at different rates, children may be delayed in only some aspects of their development. These delays are viewed as temporary, a matter of timing or maturation, and are not viewed as real differences between children with maturational lags or LD and those without them. A major cause of school failure is therefore viewed as immaturity.
A few studies which support this theory are summarized below. In a longitudinal study over several years, many of the students with LD who were described as immature and poorly integrated, were able to complete their schooling when provided with some help and extra time " ... to mature and to compensate for neurological malfunction" (Koppitz, 1973, p. 136). In evaluating which of 37 tests administered in kindergarten were most predictive of reading and spelling achievement in grade two, tests that reflected differences in maturation were found to be the most successful predictors (de Hirsch, Jansky & Langford, 1966). In the early grades (kindergarten to grade 3), younger children within each grade level were more likely to be referred for psychoeducational testing for academic problems than older children (Di Pasquale, Moule & Flewelling, 1980). This so-called birthdate effect was found to be specific to boys.

Cognitive psychology deals broadly with the mental processes of learning, thinking and knowing (Lerner, 1993). It includes abilities such as awareness, conceptualization, abstract reasoning, and critical and creative thought. There are three main concepts regarding cognitive psychology and learning (Lerner, 1993). The oldest concept is that of a disorder in psychological processing, which is one of the elements in the federal definition of LD. The term psychological processing includes functions such as perception, motor skills, linguistics and memory. It refers to the intrinsic function necessary for development or preacademic learning. It goes beyond the theory of a maturational lag by proposing an underlying deficit which precludes certain skill development. The concept of disorders in psychological processing laid the foundation for the emergence of the definition of LD. It continues to prevail in the assessment and teaching of students with LD (Kavale, 1990).
Using this approach, teaching students with LD is primarily focused on auditory and visual perception (Lerner, 1993). More recently, the information processing model of learning has been developed (Lerner, 1993). This model proposes a three-part flow of information in the learning process. The first is the input phase, which involves the initial reception of information such as auditory stimuli. This is followed by the processing phase which includes cognitive functions such as associations, thinking, memory and decision making. The last part is the output phase which refers to actions and behaviors. Disorders in psychological processing are directly relevant to the processing phase of this model. More contemporary theories of cognitive learning have encompassed and built upon many of the earlier concepts of learning and thinking (Lerner, 1993). There are key assumptions of these theories which guide instruction: (1) learning is a goal-oriented process (students strive to construct meaning and to learn independently); (2) learning links new information to prior knowledge; (3) learning requires the organization of knowledge; (4) learning is strategic (specific skills such as predicting or summarizing); (5) learning occurs in phases (preparation/anticipation, processing and consolidation), but is also recursive (requires verification with prior knowledge); and (6) learning has developmental influences (acquisition of prior knowledge and strategies, and the automatization of basic skills delineates between proficient and less proficient learners) (Jones, Palincsar, Ogle, & Carr, 1989). With regard to young children, specifically those with developmental LD, underlying theoretical concepts link learning with motor development and with perception (Lerner, 1993). Developmental LD manifests itself in many ways that can negatively influence later academic development (Lerner, 1993). 
Difficulty in learning to print or write may have underlying causes such as deficient motor skills, eye-hand coordination, or memory. Difficulty in reading ability may reflect underlying causes like deficiencies in language, auditory perception or visual perception.

Issues in Early Detection of Learning Disabilities

Although it has now been accepted that LD is not just a unique phenomenon of the school age child and that LD exists in early childhood (and persists into adulthood) (Lerner, 1993), there are a number of issues which lead to difficulty in accurately identifying the preschooler with LD. One issue is the ongoing controversy over the definition of LD (Haring et al., 1992; Humphry & King-Thomas, 1993; Keogh & Sears, 1991). Following many years of conflict and confusion, this issue is still not resolved in the school-age group, for which LD was initially identified (Haring et al., 1992). Therefore, one cannot expect that transferring these constructs for use in the preschool-age group will improve the ability to identify the young child with LD. In fact, it will likely be more difficult in that some of the constructs applicable to the school-age group (e.g., failure to achieve in school) may not be applicable to the preschool-age group. Specific definitional criteria for the identification of LD in the preschooler have not yet been developed (Haring et al., 1992). In a survey of 49 states and the District of Columbia of eligibility determination systems for preschoolers under IDEA, tremendous variability was found across categorical, noncategorical and combination approaches (Snyder, Bailey & Auer, 1994). Of particular variability and controversy was the use of the category of LD. A second issue is the instability of early human development. Children do not develop or mature at the same rate (Lerner, 1993; Paget, 1990). 
Although children with more severe developmental delays tend to show relatively stable development over time (Bernheimer & Keogh, 1988; Humphry & King-Thomas, 1993), this is less true of typically developing children or those with milder forms of developmental delays (Bernheimer & Keogh, 1988; Lerner, 1993; Paget, 1990). Furthermore, specific learning problems may not be evident in the early years but may become apparent as tasks and demands change and become more complex (Humphry & King-Thomas, 1993; Keogh & Sears, 1991). Therefore, the developmental course between prediction and outcome for children with LD may be variable (Keogh & Sears, 1991). In some children, this leads to their risk status changing over time, making accurate early identification difficult (Keogh & Sears, 1991).

Most instruments used to detect LD in young children do not take into direct account family/social issues that impact on the child. And yet, a child's home environment has been shown to be one of the most powerful predictors of future outcome in children who were identified as being at risk (Keogh & Sears, 1991). While medical or biological factors are significant predictors of outcome in very young children, environmental factors become more significant as children grow older, particularly in the area of cognition (Aylward, Gustafson, Verhulst & Colliver, 1987). In a longitudinal study of almost 2000 children conducted over an 18 year period, social and environmental factors were shown to become increasingly more influential as the children grew older (Werner, 1986). At 20 months of age, pre- and perinatal factors were found to be related to risk status but much more so if social factors such as poverty or family disorganization were also taken into account. The influence of social and environmental factors increased with the children's age. 
The three key factors that were found to be strongly associated with negative outcomes included: significant biological factors such as severe perinatal stress or congenital defects; environmental factors such as family instability or low levels of maternal education; and specific child characteristics such as high activity levels in infancy or developmental delays. Children whose learning and/or behavioral difficulties persisted throughout the study tended to have histories of significant biological factors such as moderate to severe perinatal stress, low birth weight and central nervous system dysfunction in infancy, in combination with environmental factors such as poverty or parental psychopathology. There is little or no support for mild neurological signs being the cause of LD, however (Keogh & Sears, 1991). In another longitudinal study, 114 children with LD were followed over a 10-year period (Hartzell & Compton, 1984). Effective family functioning and high full scale IQ were found to be the most significant predictors of academic success, while effective psychosocial functioning in childhood and high full scale IQ were predictive of later social outcome. Therefore, while family/social issues are not the only factors that affect a child's future outcome, they are significant issues which are often not considered when screening children to determine risk status (Keogh & Sears, 1991).

Another factor in the issue of early identification is that of specific age-related behavioral characteristics. It is imperative to consider the effect on test performance of a preschool-aged child's ability to cooperate, attend, sit still and respond to the assessment setting (Scarr, 1981). "Young children may 'test' behavioral limits, refuse items, or be easily distracted by items of more intrinsic interest" (Paget, 1990, p. 753). 
Furthermore, preschoolers in particular display a wide variation not only in behaviors but also in learning experiences (e.g., preschool attendance, exposure to testing situations) (Paget, 1990). Differences in learning opportunities and experiences may be considerable influences on a child's test scores. Furthermore, involvement in an intervention program may conceal underlying learning difficulties (Humphry & King-Thomas, 1993). These behavioral and experiential factors are generally not formally factored into a child's test scores when determining potential eligibility for early intervention.

Finally, there have been major methodological difficulties with studies investigating the early detection of learning disabilities. Developmental screening instruments have been criticized for their low levels of accuracy and validity (Haring et al., 1992; Meisels & Wasik, 1990; Satz & Fletcher, 1988). According to Satz and Fletcher (1988), the temporal interval between the initial assessment (predictor variables) and the follow-up status (outcome variables) is often too short in that it should be at least three years. A longitudinal prospective design should be used. Many assessments of validity have been based on simple correlations. These do not address the issue of predictive accuracy, which requires the use of clinical epidemiological statistical analysis where the participants' placement into predicted and outcome groups is analyzed (Satz & Fletcher, 1988). Research findings that report on the relationship between two variables should be replicated on an independent sample (cross-validation) (Satz & Fletcher, 1988). Other problems reported with longitudinal research of this type include difficulty in controlling the confounding effects of history, maturation, the unpredictability of early childhood development, and the attrition of participants (Miller, 1987a). 
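The difference between a simple correlation and the analysis of placement into predicted and outcome groups can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: the class name ScreeningTable and the counts in the example are hypothetical and are not taken from any of the studies cited here. It cross-classifies screening decisions against later outcome status and derives the usual indices of predictive accuracy from the four cells.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningTable:
    """Hypothetical 2x2 cross-classification of screening decision by later outcome."""
    true_pos: int   # referred at screening, later showed problems
    false_pos: int  # referred at screening, no later problems
    false_neg: int  # passed at screening, later showed problems
    true_neg: int   # passed at screening, no later problems

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    def sensitivity(self) -> float:
        # Share of children with later problems whom the screen referred.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Share of children without later problems whom the screen passed.
        return self.true_neg / (self.true_neg + self.false_pos)

    def positive_predictive_value(self) -> float:
        # Share of referred children who went on to show problems.
        return self.true_pos / (self.true_pos + self.false_pos)

    def negative_predictive_value(self) -> float:
        # Share of passed children who did not go on to show problems.
        return self.true_neg / (self.true_neg + self.false_neg)

    def overall_agreement(self) -> float:
        # Share of all children whose screening decision matched the outcome.
        return (self.true_pos + self.true_neg) / self.total


# Invented counts for 100 children, for illustration only.
example = ScreeningTable(true_pos=12, false_pos=18, false_neg=8, true_neg=62)
print(f"sensitivity = {example.sensitivity():.2f}")
print(f"specificity = {example.specificity():.2f}")
print(f"PPV = {example.positive_predictive_value():.2f}")
print(f"NPV = {example.negative_predictive_value():.2f}")
print(f"agreement = {example.overall_agreement():.2f}")
```

A correlation coefficient computed on the same children would summarize the overall association between scores but would not show how many individual children were misclassified in each direction, which is the information these indices provide.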
Learning Difficulties of Children with Prenatal Drug Exposure

The incidence and extent of learning difficulties in children prenatally exposed to drugs are not well documented in the literature. There are few studies following these children up to school age, when learning difficulties may emerge (De Cubas & Field, 1993). Most published reports involve children exposed primarily to alcohol, a group for whom higher rates of learning and behavioral difficulties are clearly documented (Osborn, Harris, & Weinberg, 1993; Shriver & Piersel, 1994). In a recent literature review and analysis of studies of children with prenatal drug exposure, only 46 out of more than 1200 articles that were identified were accepted (Carta et al., 1994). The acceptance criteria included: (1) publication in a refereed journal between 1972 and 1992; (2) prenatal drug exposure to at least one illegal substance; (3) original research (i.e., not a review article) with an experimental or quasi-experimental design; (4) participants who were between 0 and 60 months of age; (5) English or Spanish documentation; (6) information about the methods for determining drug exposure; and (7) study outcome(s) that described either behavioral characteristics or developmental status. Most of the 1200 articles were either nonempirical or nonexperimental in design. The largest proportion of the 46 accepted studies investigated the effects of prenatal drug exposure involving cocaine, heroin/methadone or marijuana. There were a total of 460 behavioral or developmental outcomes identified from these 46 articles. Most of the outcomes were neurodevelopmental in nature (60.2%), followed by cognitive (21.3%), motor (13.3%), social (3.5%) and language (1.7%). The largest proportion of studies investigated outcomes in infants from birth to 7 days (41.7%). 
The proportions of studies of outcomes in other age groups were: 8 to 30 days (17.8%), 1 to 6 months (8.9%), 7 to 24 months (16.7%) and 25 to 72 months (14.8%). Cognitive outcomes represented 50% of the studies in the oldest age group. The youngest and the oldest age groups demonstrated the highest number of statistically significant adverse outcomes at 58.3% and 52.9% respectively, as opposed to nonsignificant outcomes or those that favored no effects.

One of the few studies on LD in children prenatally exposed to heroin was a follow-up study on school performance of children from birth to age 5 who were part of a previous longitudinal study (Wilson, 1989). The original study was comprised of 29 women with untreated heroin dependency and their infants; 39 women enrolled in methadone maintenance programs and their infants; and a drug-free comparison group of 57 women and their infants. In the follow-up study on school performance, data were obtained via contact by mail or telephone. While 68% (n = 20) of the group with heroin exposure responded, only 30% of the group with methadone exposure and 36% of the drug-free group responded. Therefore, the author reported that the data were interpreted cautiously with only the school performance of the children with heroin exposure described. Twenty more participants with prenatal heroin exposure from earlier studies were combined with the 20 participants from this study for a total of 40 participants aged 6-11 years, for whom reports of school performance were obtained. Study results indicated a 65% incidence of grade repetition or special education services; a 40% incidence of participants who performed greater than 1 standard deviation (SD) below the mean on intelligence testing; and a 13% incidence of a language learning disability (Wilson, 1989). Sixty-five percent of the sample were thought to have behavioral difficulties (e.g., inattention, low self-confidence, poor peer relations). 
Difficulties were also reported in the areas of motor coordination and "visual-motor-perceptual function" (Wilson, 1989, p. 192). However, it was concluded that "... the overwhelming number of confounding prenatal and environmental influences and the small number of drug-exposed subjects who have been carefully evaluated in a prospective fashion preclude the statistically meaningful analysis of long-term effects of intrauterine narcotic exposure" (p. 193).

A more recent longitudinal study of cognitive development in the Netherlands investigated 35 infants of mothers who were drug-dependent, and compared them to 35 infants whose mothers were not dependent on "hard" drugs (van Baar & de Graaf, 1994, p. 1064). By the end of the study when the children were 5 1/2 years old, there were 25 children in the drug-exposed group and 32 children in the non-exposed comparison group. Variable amounts of data were available at each evaluation phase because of participant attrition, erratic attendance and uncooperative behavior on the part of some of the participants. Cognitive development was assessed using standard intelligence and language tests at 3 1/2 years (IQ testing), 4 years (language testing), 4 1/2 years (IQ testing) and 5 1/2 years (IQ testing) (van Baar & de Graaf, 1994). Study findings indicated that the two groups differed significantly on the IQ tests at all three age levels, with the group with prenatal drug exposure performing more poorly overall. At 3 1/2 years of age, only 1 child with prenatal drug exposure performed more than 1 SD below the mean, while none of the children in the non-exposed group performed below average range. However, at 4 1/2 years of age, 14 out of 23 children with prenatal drug exposure for whom IQ testing was completed, performed more than 1 SD below the mean, compared to 5 out of 31 children in the non-exposed group. 
At 5 1/2 years of age, 9 out of 22 children in the group with prenatal drug exposure performed more than 1 SD below the mean, compared to 4 out of 30 children in the non-exposed group. Language testing at 4 years of age revealed significant group differences in comprehension and expression, with the group with prenatal drug exposure performing lower. When children with prenatal drug exposure who were born preterm were excluded from the analysis, the study findings remained the same as for the total group (van Baar & de Graaf, 1994). When fostered children with prenatal drug exposure were excluded from the analysis, study results were again essentially the same. However, when comparing only the fostered children with prenatal drug exposure to the non-exposed group, significant differences were found only at the 4 year language assessment and the 4 1/2 year IQ assessment. On the other hand, there was an 8-point spread in the IQ scores at the 5 1/2 year assessment (i.e., 102 vs. 93), even though the difference was not significant. Behaviorally, the children with prenatal drug exposure were noted to differ significantly from the non-exposed group (van Baar & de Graaf, 1994). Generally, at 3 1/2 years they were described as more active and perhaps more immature, and at 4 1/2 and 5 1/2 years as having more difficulty adjusting to the demands of the testing situation in terms of endurance and cooperation. When the data were reanalyzed with a correction for behavior, the two groups differed significantly on IQ testing at 3 1/2 and 4 1/2 years, but not at 5 1/2 years. The authors concluded that overall their sample of preschool-aged children with prenatal drug exposure functioned at a lower cognitive level than the non-exposed group, and that most of the children with prenatal drug exposure began school with considerable cognitive delay (van Baar & de Graaf, 1994). 
Following a review of the literature showing that children with prenatal drug exposure appeared to have difficulties similar to attention deficit disorder (e.g., difficulty with attention, impulsivity, self-control, motor coordination, and cognitive tests requiring focused attention), Davis and Templer (1988) reported results of a study of children aged 6-15 years with prenatal drug exposure. The children were born to parents enrolled in a methadone maintenance program. A drug environment control group of children was selected from parents also enrolled in this program, whose mothers did not use narcotics during pregnancy but who lived with partners who were narcotic-addicted. Each group was comprised of 14 boys and 14 girls, primarily of Caucasian and Mexican-American descent. The children were assessed using the Wechsler Intelligence Scale for Children (WISC-R) (Wechsler, 1974), the Quick Neurological Screening Test (QNST) (Mutti, Sterling & Spalding, 1978), the Bender Visual Motor Gestalt Test for Children (Bender, 1938) and the Burks Behavior Rating Scales (Burks, 1975) (Davis & Templer, 1988). The children with prenatal drug exposure demonstrated significantly lower Full Scale and Performance IQ scores on the WISC-R, with the Digit Span, Picture Completion, Object Assembly and Coding subtests also significantly lower than the control group. The neurological indicators of the Bender-Gestalt and scores on two subtests of the QNST were also significantly lower. The study findings seemed to reflect possible impairment in the perceptual, motor and attention realms. Because most of the subtest scores of the QNST were not significantly different between the two groups, the authors suggested that it is unlikely that a child with prenatal drug exposure would typically have neurological deficits that are readily recognizable. 
On all but one of the scales on the Burks Behavior Rating Scales, the children with prenatal drug exposure performed significantly lower than the control group, reflecting difficulties with impulsivity, socialization, attention and interpersonal relationships (Davis & Templer, 1988). In comparing the children reported to be exposed only to heroin to the children reported to be exposed only to methadone, the children with methadone exposure performed more poorly, especially in the behavioral domain. It was concluded that the children with prenatal drug exposure demonstrated various deficits in the areas of cognition, perceptual-motor skills and behavior, particularly children exposed only to methadone. The particular behavioral difficulty demonstrated "... is consistent with the attention deficit disorder type syndrome suggested by the clinical literature" (Davis & Templer, 1988, p. 281).

In another study of prenatal exposure to heroin, a sample of children aged 3 years 1 month to 6 years 4 months was studied (Wilson et al., 1979). There were approximately the same number of boys and girls with 30 children of Latin American descent, 30 of African American descent, and 17 of Anglo-American descent. These 77 participants were comprised of a group of children with heroin exposure and three non-exposed comparison groups: a drug environment comparison group, a high-risk comparison group (based on medical factors such as dysmaturity, intrauterine growth retardation, fetal distress, etc.), and a comparison group of similar socioeconomic status. Various measures of growth and development were taken, including neurological, psychometric, perceptual, speech and behavioral assessments. There were no major neurological differences among the four groups, with the exception of children with heroin exposure performing significantly more poorly on rapid alternating hand movements (Wilson et al., 1979). 
Psychometrically, children with heroin exposure performed within normal limits. However, their performance was consistently lower than that of the comparison groups, with the group with heroin exposure often obtaining significantly lower scores on various measures. More specifically, particular areas of weakness were noted to include measures of memory and perception, and one measure of general cognitive ability. Subtests measuring organizational processes, which require attention, concentration, short-term memory and symbolic manipulation were particularly problematic. On other perceptual tests, children with heroin exposure performed significantly more poorly than the comparison groups on measures of visual, tactile and auditory perception. There were no significant differences among the groups on measures of speech and language function. Children with heroin exposure differed significantly on behavioral measures, reflecting difficulties with uncontrollable temper, impulsivity, self-confidence, aggression and social relationships. Physically, the group of children with heroin exposure were significantly smaller than the other groups on measures of height, weight and head circumference. Overall, the performance of children with heroin exposure generally fell within normal limits; however, this group consistently performed lower than the other groups on physical, intellectual, perceptual and behavioral measures (Wilson et al., 1979). The authors concluded that the functional deficits that differentiated the group with heroin exposure from the other groups did not impair cognitive development; however, "... because of differences in behavior, perceptual and organizational abilities, these children must be considered more vulnerable to suboptimal social and environmental conditions" (Wilson et al., 1979, p. 141). 
In evaluating the development of young children with prenatal drug exposure in the first 2 1/2 years of life, the Bayley Scales of Infant Development were used to study a group of infants born to women who were predominantly multiple drug users (van Baar, 1990). The initial study group was comprised of 35 children with prenatal drug exposure and a non-exposed comparison group of 37 children. By the time the children were 2 1/2 years old, 6 children from the study group and 2 children in the comparison group were no longer participating in the study. Furthermore, due to erratic attendance of many of the children with prenatal drug exposure, variable amounts of data were collected (n = 22-27). The Dutch version of the Bayley Scales of Infant Development (van der Meulen & Smrkovsky, 1983) was administered at 6, 12, 18, 24 and 30 months of age (van Baar, 1990). On the Mental Scale of the Bayley, there were no significant differences between the two groups at 6, 12, or 18 months of age. However, the two groups differed significantly at 24 and 30 months of age with the children with prenatal drug exposure performing more poorly. Using a non-verbal scale of the Bayley (van der Meulen & Smrkovsky, 1987), no significant differences were found at any age group, suggesting that the children with prenatal drug exposure had specific difficulties with the language items of the Bayley. Separate analyses were conducted using both the Bayley Mental Scale and the non-verbal scale where results from prematurely born infants, infants of foreign parents and infants in foster care were excluded from the analysis in turn, to investigate whether these subgroups specifically contributed to the results. The findings were the same as for the total group of children with prenatal drug exposure, indicating that these subgroups did not specifically affect the overall findings of the study. 
There were no significant differences between the children with prenatal drug exposure and the comparison group on the Bayley Motor Scale or on any of the behavior measures. The author concluded that her sample of children with prenatal drug exposure showed language difficulties in their second year of life. Particular attention to the early language development of children with prenatal drug exposure thus appears warranted. In studying the intellectual function and quality of play of children with prenatal drug exposure, eighteen 18-month old children with prenatal drug exposure were compared to a group of high risk non-exposed children who were born prematurely (Howard et al., 1989). The group of children with prenatal drug exposure performed in the low average range on standardized developmental testing, and had significantly lower developmental scores than the preterm comparison group. When placed in an unstructured free play situation that required self-organization, self-initiation and follow-through without adult assistance, the children with prenatal drug exposure demonstrated significantly less representational play than the comparison group. The group with prenatal drug exposure, instead, demonstrated a pattern of scattering, batting, and picking up and putting down of the toys. One conclusion was that because of the link between representational play and language acquisition, difficulties in language development may be anticipated for children with prenatal drug exposure. Hans (1989) investigated the effects of prenatal methadone exposure on the neurobehavioral development of 2 year old children. Most of the mothers of the study children were reported to occasionally use other drugs in addition to methadone (i.e., alcohol, marijuana, heroin, cocaine, Valium or Talwin). The original study group consisted of 42 infants, and it was compared to a non-exposed group of 47 infants. 
By the 2 year old assessment, 30 children remained in the study group while 44 remained in the comparison group. Both groups continued to be well matched in sociodemographic characteristics and maternal intelligence. However, the women who used methadone rated lower in psychiatric functioning, particularly with regard to psychosocial stressors and adaptive functioning. The children were evaluated using the Bayley Scales of Infant Development (Bayley, 1969), including the Mental Scale, the Motor Scale and the Infant Behavior Record (IBR) (attention span, activity level, tension, gross and fine motor coordination) (Hans, 1989). In addition, measures of height, weight and head circumference were taken. Although both groups' means on all the outcome measures were within the average range, the study group performed more poorly on all of the measures. The two groups differed significantly on measures of height, head circumference, the Motor Scale, and tension and both fine and gross motor coordination from the IBR. The total sample was then analyzed further in three different ways by dichotomizing it first by higher and lower socioeconomic status (SES), then by maternal IQ, and lastly by pregnancy/birth complications. The children with methadone exposure consistently performed lower on some or all of the outcome measures. Generally, the children with methadone exposure lagged behind in physical growth and motoric functioning. In the lower SES and IQ dichotomies, children with methadone exposure also lagged behind in cognitive development. It was concluded that prenatal methadone exposure appears to have a small, direct effect on physical growth and motoric function of 2 year old children. When prenatal exposure to methadone is combined with very low SES, the impact on cognitive development is particularly devastating. 
Research involving the MAP with a group of children with prenatal drug exposure, who were followed at Sunny Hill Health Centre for Children (SHHCC) in Vancouver, B.C., was conducted by Fulks and Harris (1995). Using a retrospective design, the profiles of MAP scores of 54 children were investigated. There were 23 boys and 31 girls aged 35-68 months (mean = 48 months). The reported drugs used were varied and most of the children were prenatally exposed to at least two drugs. There was no comparison group; scores were evaluated against the test norms. The children in this study received the lowest overall median score on the Verbal Index (measuring language-based cognitive items) compared to the other four performance indices (Fulks & Harris, 1995). However, the largest number of children (15%) performed in the lowest category (1st - 5th percentile) on the Foundations Index, which measures sensory and motor abilities. The median Total Score, at the 23rd percentile, was in the possibly at-risk range. Measures of non-verbal cognition were within average range. It was concluded that although a distinctive profile of scores did not emerge, there was a general trend of test performance in the lower end of the average range, with particular difficulties presented on sensory processing and language items.

In summary, the findings across these studies are not consistent enough to depict a typical presentation of a child with prenatal drug exposure. However, performance in the low average range on cognitive testing appeared to be a consistent finding. Tendencies toward language impairments and behavioral difficulties were commonly reported. Some studies also reported poor perceptual and motor function in children with prenatal drug exposure. Even when the data were reanalyzed to control for potential confounding factors such as low SES or pregnancy/birth complications, children with prenatal drug exposure still tended to lag behind their non-exposed peers. 
Previous Reviews/Studies of the MAP

The MAP is widely used in preschool screening and assessment of children at risk for learning problems (Miller, 1993). The test has received some positive reviews, particularly regarding its rigorous test development (Banus, 1983; Slaton, 1985) and its usefulness as a clinical tool (Slaton, 1985). Humphry and King-Thomas (1993) complimented the test on its depth of item development. The MAP's ability to produce differential score patterns for children with different diagnoses, and for differentiating between children with specific versus more generalized delays has been documented (Daniels, 1990; Daniels & Bressler, 1990). Slaton (1985) concluded that the MAP "provides a wealth of opportunities to observe clinically relevant information about the child" (p. 69), and that "many children who perform poorly on the MAP do well on more traditional developmental assessments which, in the past, have failed to identify many of the mild or moderately disabled children at an early age" (p. 70).

Three doctoral dissertations which reported on the ability of the MAP to predict later school-related problems have been completed (Cohn, 1986; Lemerand, 1985; Miller, 1986). Using t-test analysis, Cohn (1986) reported significantly different mean MAP scores between children who later showed school problems versus children without school problems 1.25 to 2.5 years after the MAP evaluation. The sample was comprised of 134 children, the majority of whom were at least 4 years 3 months of age at the time of the MAP assessment. Using the MAP's 25th percentile cutpoint, Lemerand (1985) reported sensitivity and specificity rates of at least 70% in predicting which children showed difficulties 1 year later in kindergarten. The sample was comprised of 273 children and the MAP was administered when they were 4 or 5 years old. Miller (1986) used a subsample of 338 children from the MAP standardization sample of over 1200 children. 
The children were followed 4 years after the initial MAP screening. Correlations of .47 to .50 between the MAP total score and the Wechsler Intelligence Scale for Children-Revised (Wechsler, 1974), and correlations of .35 to .38 between the MAP total score and the Woodcock-Johnson Psychoeducational Battery Part II (Woodcock & Johnson, 1977) were reported. These relationships were described as strong. The main findings from these three studies were summarized in an article by the three investigators (Miller et al., 1987). The over- and underreferral rates at both the 5th and 25th percentile cutpoints were also summarized from Lemerand's (1985) and Miller's (1986) studies (Miller et al., 1987). The rates of overreferral ranged from "moderately high (22%) to extremely low (less than 1%)" (p. 380). No specific information was provided in this summary article specifying the cutpoints at which these rates were obtained. The authors further reported that the 5th and 25th percentile cutpoints were supported in all three studies, depending on the purpose of the screening. In other words, the 5th percentile cutpoint was effective in identifying potential special education students, while the 25th percentile cutpoint effectively identified the full range of severe to mild problems. As part of her dissertation (Miller, 1986), Miller also examined school outcome criteria (teacher observations, report card grades, grade retention, receipt of special services, and placement in special classrooms) (Miller, 1988a, 1990). Using t-test analyses, the conclusion was that the MAP Total Score effectively differentiated children with and without school-related problems in all school categories four years after the administration of the MAP in the preschool years (Miller, 1988a).

All of these studies have recently received some critical reviews (Kirkpatrick & Schouten, 1993; Schouten & Kirkpatrick, 1991, 1993). 
One criticism has been the predominant use of correlational and t-test analyses rather than clinical epidemiological techniques (Schouten & Kirkpatrick, 1993). These techniques are recommended because they classify children into appropriate membership groups so that predicted risk status can be compared with ultimate outcome status, thereby determining the amount of agreement between screening decisions and future outcomes (Keogh & Sears, 1991; Meisels, 1988; Schouten & Kirkpatrick, 1993). Another criticism is that the correlations achieved have reflected weak to moderate relationships (Schouten & Kirkpatrick, 1993) rather than the strong ones reported by Miller et al. (1987). Any reported results of predictive accuracy have seemed "unbalanced" in that only some types of screening errors have been reported (Schouten & Kirkpatrick, 1993, p. 12). Specifically, in the summary by Miller et al. (1987), reports of screening errors were limited to overreferral rates. The high rates of underreferral were omitted; Schouten and Kirkpatrick (1993) contend that underreferral is a more serious error than overreferral in that many children who need early intervention are missed, only to be identified later when early intervention is no longer an option.

The predictive accuracy of the test's 5th and 25th percentile cutpoints was also questioned. In all three of the predictive validity studies, the MAP's sensitivity at the 5th percentile cutpoint was very low, with the test missing 69% to 90% of children who went on to have school problems (Schouten & Kirkpatrick, 1993). The MAP's sensitivity improves when the 25th percentile cutpoint is used, particularly if the outcome criterion is the teacher's general rating of the child's ability (Humphry & King-Thomas, 1993). However, even when the 25th percentile cutpoint is used, the MAP still fails to detect more than half of the children who later develop problems in school (Kirkpatrick & Schouten, 1993).
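The clinical epidemiological indices referred to above all derive from a 2 x 2 cross-classification of screening decision against later outcome. The following is a minimal sketch of that computation, not the procedure used in any of the cited studies. The function name and the counts are hypothetical, and the over- and underreferral rates are assumed here to be expressed as proportions of the whole sample; other authors may define them differently.

```python
def screening_indices(tp, fp, fn, tn):
    """Compute screening indices from a 2x2 table of counts.

    tp: screened at risk, later showed problems
    fp: screened at risk, no later problems (overreferred)
    fn: screened not at risk, later showed problems (underreferred)
    tn: screened not at risk, no later problems
    Assumes every row and column total is non-zero.
    """
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_value": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
        # Assumption: referral errors are taken over the whole sample.
        "overreferral_rate": fp / n,
        "underreferral_rate": fn / n,
    }


# Hypothetical counts for 100 children (not data from any cited study).
example = screening_indices(tp=12, fp=8, fn=18, tn=62)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

In this made-up table the test has high specificity but misses 18 of the 30 children with later problems, which is the pattern of error that the critics argue goes unreported when only overreferral rates are published.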
In Lemerand's (1985) study, 35% of the children were referred when the 25th percentile cutpoint was used. This suggests that it is a very inefficient cutpoint for identifying children with potential school problems because such a large proportion of the children would be referred for further service (Kirkpatrick & Schouten, 1993).

Further studies of the predictive validity of the MAP are needed to evaluate a full range of cutpoints in order to determine the test's maximal predictive accuracy relative to school-related outcomes (Miller, 1988b; Schouten & Kirkpatrick, 1993). One of the difficulties in evaluating the predictive accuracy of a screening tool is that there are few guidelines to determine what are acceptable results (Humphry & King-Thomas, 1993). When considering an ideal underreferral rate, the incidence of the problem in the general population must be taken into account. If the incidence of learning problems is considered to be 15% of the population, and if the MAP cutpoint used is the 5th percentile, then children with learning difficulties amounting to 10% of the population may potentially be missed (Humphry & King-Thomas, 1993).

Another area of concern is the validity of the MAP across the different age groups. In terms of IQ and achievement testing, the MAP does not discriminate well before age three because of unstable and unpredictable early childhood development (Miller & Schouten, 1988). The test was reported to demonstrate the strongest predictive accuracy between the ages of 3 and 4 years. However, with the oldest age group (4 to 5 years), it fared no better than the younger age group (Miller & Schouten, 1988). Based on personal communication with Lucy Miller, Daniels (1990) reported that all ages had been averaged for the final scoring tables; therefore, younger children tend to receive higher scores than older children.
Schouten and Kirkpatrick (1991) reported that the test performs most poorly at the lowest age levels and inconsistently across the other age groups.

In Miller's predictive validity study the MAP Total Score, compared to each of the five performance indices, showed the strongest correlation overall with later measures of intelligence (Miller, 1987a). Further, in Daniels and Bressler's (1990) study, the MAP Total Score outperformed any of the five performance indices in demonstrating the greatest degree of discrimination among six different diagnostic groups.

Underlying Theory in the Development of the MAP

It is critical in early screening to have a theory or set of hypotheses about reading or learning disabilities and their developmental antecedents, because without theory there would not be guidelines for selecting a test battery that will identify the at-risk child (Satz & Fletcher, 1988). The theoretical considerations that suggest precursors of learning difficulties can be evaluated by investigating whether these risk factors predict later learning status.

In development of the MAP, emphasis was placed on obtaining a broad domain of items that were sensitive to subtle, moderate developmental delays; many of the variables that were suggested to be predictive of future school performance were included in the final version (Miller, 1982). According to the MAP manual (Miller, 1982), normal development in the preschool years and factors associated with difficulties with school performance were the two main areas that had been explored in the literature. The MAP is comprised of 27 items which cluster into five performance indices.
The Foundations Index includes items reflecting neurological and neuromotor aspects of development such as sense of position and movement (vestibular, proprioceptive, and cerebellar integrity), sense of touch (tactile intactness), and components of normal movement patterns (flexion, extension, weight shifting, and rotation) (Miller, 1982, 1988c). These items are suggested as being indicative of higher-level cognitive abilities, nervous system maturity, and general maturational status (Miller, 1988c). Research by Provost, Harris, Ross and Michnal (1988) provided support for the notion that intact tactile processing and an appropriate balance between trunk flexors and extensors may be required for development of normal fine and gross motor skills, respectively.

The Coordination Index includes gross, fine and oral motor items (Miller, 1982, 1988c). According to Miller (1988c), there is extensive evidence to support a relationship between motor abilities and academic achievement. It was not specified whether this relationship is concurrent or predictive. The primary difference between the Foundations and Coordination Indices is that motor milestones are the focus of the Coordination Index, while the Foundations Index reflects underlying neurological integrity needed to perform these tasks (Miller, 1988c).

The Verbal Index measures language abilities associated with memory, sequencing, comprehension, association, following directions and expression (Miller, 1982, 1988c). There is evidence for an association between poor academic functioning and speech or language problems (Miller, 1988c). Again, it was not specified whether the association pertains to a concurrent or predictive relationship. Difficulty with language may indicate underlying developmental disabilities which may present as poor reading skills or difficulty with verbal language comprehension.
The Non-Verbal Index measures cognitive abilities such as memory, sequencing, visualization, and mental manipulations not requiring verbal language abilities (Miller, 1982, 1988c). Difficulty with any of these processes may interfere with the acquisition of reading, spelling, and writing skills (Miller, 1988c).

The Complex Tasks Index requires an integration of sensory, motor and cognitive abilities, as well as the interpretation of visual-spatial information (Miller, 1982, 1988c). There is research and clinical support that difficulty with integrative tasks may underlie academic disability (Miller, 1988c). Evaluation of visual motor abilities examines the processes through which the central nervous system organizes visual information and produces a motor response. Motor planning is evaluated because the child is dependent on this ability to organize motor activities.

Proposed Relationship Between the MAP and the Outcome Measures in this Study

The four outcome measures used in this study were chosen because they represent a broad spectrum of cognitive function in children aged 47-68 months. The WPPSI-R provides a measure of general intelligence, while the other three tests measure more specific areas of cognitive function. The TERA-2 examines early reading behavior; the PPVT-R examines receptive vocabulary; and the VMI examines visual-motor integration. The VMI is considered to be a cognitive (perceptual) as well as a motor measure, in that it was reported to measure a unique integration factor of higher levels of thinking and behavior that goes beyond basic visual-motor skill (Beery, 1989).

Wechsler Preschool and Primary Scale of Intelligence - Revised (WPPSI-R). The WPPSI-R is an intelligence test for young children (Wechsler, 1989).
Given that intelligence is related to an individual's ability to learn and that the original WPPSI (Wechsler, 1967) was shown to correlate with both later IQ scores and achievement measures, and given that the stated purpose of the MAP is to identify children at risk for later learning problems, one would expect to find a positive relationship between these two measures. The original WPPSI was also one of the tests that influenced the development of the format and content of the MAP (Miller, 1982).

As part of establishing concurrent validity during the MAP's standardization, the MAP was compared to the original WPPSI. Only the WPPSI Full Scale Intelligence Quotient (FSIQ) and the Complex Tasks Index of the MAP were significantly correlated (r = .367). Later, Miller's (1986) predictive validity study of the MAP revealed significant correlations between the three IQ scales of the Wechsler Intelligence Scale for Children - Revised (WISC-R) (Wechsler, 1974) and the MAP Total Score and five performance indices (r = .19 to .50). The MAP was administered during the participants' preschool years and the WISC-R was administered four years later. The MAP's Total Score demonstrated the highest degree of correlation with the three WISC-R IQ scales (r = .45 to .50). Miller (1986) subsequently stated "... it can be concluded that the administration of the MAP at a preschool level can predict intelligence and achievement four years later at a level typically observed in similar research" (p. 240).
Using clinical epidemiology techniques in Miller's (1986) study, the following predictive values were obtained between the MAP's Total Score and the WISC-R: (1) using the 5th percentile as a cutpoint, the sensitivity was 21%, the specificity was 97%, the overreferral rate was 2.4%, and the underreferral rate was 7.9%; and (2) using the 25th percentile as a cutpoint, the sensitivity was 59%, the specificity was 80%, the overreferral rate was 18%, and the underreferral rate was 4.1%. Miller (1986) concluded that "... the WISC-R appears to have the highest levels of sensitivity and specificity of all the dependent measures that were examined in this study" (p. 339). Therefore, with regard to the present study, it seems reasonable to expect that there would be a relationship between the WPPSI-R FSIQ and the MAP Total Score.

Test of Early Reading Ability (TERA-2). The TERA-2 is a standardized measure of early reading behaviors such as book handling/processing (e.g., orienting a book right-side up, beginning at the front of a book, etc.), letter/number recognition, and familiarity with the structure and formality of written language. It is asserted in the manual that these early reading behaviors are more directly related to later reading ability than the more traditional risk indicators, such as motor development, communication or perception (Reid, Hresko & Hamill, 1989).

In Miller's (1986) predictive validity study of the MAP, the closest association to early reading ability was the Woodcock-Johnson Psychoeducational Battery (W-J) - Reading subtest (Woodcock & Johnson, 1977). The MAP Total Score and the five performance indices significantly correlated with this subtest (r = .18 to .36, with .36 representing the relationship with the Total Score), which was administered four years after the MAP.
Using the 5th percentile cutpoint, the MAP demonstrated a sensitivity of 20%, specificity of 97%, overreferral rate of 2.4%, and underreferral rate of 8.3% with the W-J - Reading subtest. At the 25th percentile cutpoint, the sensitivity was 54%, the specificity was 80%, the overreferral rate was 18% and the underreferral rate was 4.7%.

Miller's (1986) predictive validity study also examined report card grades in reading and compared them to MAP scores obtained four years earlier. In comparing passed versus failed report card groups in reading, the group means were significantly different on the MAP Total Score and the five performance indices. Using the 5th percentile cutpoint, the MAP Total Score demonstrated a sensitivity of 25%, specificity of 97%, overreferral rate of 2.9% and underreferral rate of 4.4% in predicting report card grades in reading. At the 25th percentile cutpoint, the sensitivity was 60%, the specificity was 78%, the overreferral rate was 20.4% and the underreferral rate was 2.37%.

The MAP items which were most discriminative of later reading outcome covered a broad range of domains including language (specifically, the ability to repeat verbal information), neurological measures, and visual perceptual functions (Miller, 1986). Specifically, the most discriminative MAP items were Digit Repetition, Figure Ground, Stamp, Sentence Repetition, Cage, Kneel-stand, Tongue Movements and Block Tap.

While the MAP demonstrated the fewest number of false negatives in predicting later report card grades in Physical Education, Miller (1986) noted that only 6 children failed Physical Education. Therefore, the problem category for Physical Education was small and the data were considered to be suspect. The MAP's ability to predict later report card grades in reading demonstrated the second fewest number of false negatives.
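The four indices reported for reading report card grades are arithmetically linked, which allows a rough consistency check. The sketch below assumes (this is not stated in the source) that the over- and underreferral rates are proportions of the whole sample, so that the underreferral rate equals (1 - sensitivity) x base rate and the overreferral rate equals (1 - specificity) x (1 - base rate). The function names are hypothetical.

```python
def implied_base_rate(sensitivity, underreferral_rate):
    """Base rate of later problems implied by a sensitivity and a
    whole-sample underreferral rate (assumed definition)."""
    return underreferral_rate / (1 - sensitivity)


def implied_referral_errors(sensitivity, specificity, base_rate):
    """Return (overreferral, underreferral) as proportions of the
    whole sample, under the same assumed definition."""
    overreferral = (1 - specificity) * (1 - base_rate)
    underreferral = (1 - sensitivity) * base_rate
    return overreferral, underreferral


# Reported values for reading report card grades (Miller, 1986).
base_5th = implied_base_rate(0.25, 0.044)     # 5th percentile cutpoint
base_25th = implied_base_rate(0.60, 0.0237)   # 25th percentile cutpoint
print(f"implied base rate, 5th percentile:  {base_5th:.3f}")
print(f"implied base rate, 25th percentile: {base_25th:.3f}")

# Rebuild the 25th percentile error rates from the implied base rate.
over, under = implied_referral_errors(0.60, 0.78, base_25th)
print(f"overreferral:  {over:.3f}  (reported: 0.204)")
print(f"underreferral: {under:.3f}  (reported: 0.0237)")
```

Both cutpoints imply a base rate of roughly 6%, and the rebuilt overreferral rate agrees with the reported 20.4% to within rounding of the specificity. If the 20 children reported as failing reading are taken out of the subsample of 338 (an assumption about the denominator), the observed base rate would be about 5.9%, which is in line with the implied value.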
Given that more children failed reading (n = 20) than failed Physical Education, which resulted in a larger problem category for reading, Miller (1986) concluded that "... reading appears to be the criterion on which the MAP is least likely to 'miss' children" (p. 335).

Other studies, however, have demonstrated support for different discriminators of later reading ability than were found in Miller's (1986) study. Felton (1992) investigated the value of a variety of different scores/tasks in kindergarten in predicting reading outcome in grade 3, particularly measures of phonological processing. Significant predictors were general ability (IQ), rapid naming of letters, and beginning sound discrimination of words. Overall, children with higher IQs in kindergarten had better reading outcomes than children with lower IQs. When IQ was removed from the analysis, rapid naming of letters, beginning sound discrimination of words, and a measure of auditory conceptualization were found to be the best predictors. However, using these significant predictors, the study found a sensitivity of 31%, a specificity of 97%, and a false positive rate of 69% in predicting reading outcome. The author therefore recommended that to identify children who are at risk for reading difficulty, standardized measures of language processing and reading readiness should be included in addition to measures of phonological processing.

In contrast, Satz, Taylor, Friel and Fletcher (1978) found that sensorimotor-perceptual deficits measured in kindergarten best predicted reading problems in grade 2. Fletcher and Satz (1980) also found that sensorimotor-perceptual skills measured in kindergarten were most predictive of reading performance in grades 2 and 5. However, Badian (1988) found no support for the value of visual-motor tasks measured at age 4 in predicting later reading achievement in grade 3 or 8.
Conversely, at both grade levels, early language abilities were among the best predictors.

These findings indicate that there is a great deal of controversy over what factors contribute to later reading difficulties. A great deal of the confusion appears to lie in the definition of a reading problem. One view of reading emphasizes decoding while another view emphasizes comprehension (Keogh & Sears, 1991). While reading might be more precisely understood as a combination of the two, decoding appears to be more relevant to early reading and comprehension to later reading.

Given the MAP's broad range of items and its moderate accuracy in Miller's (1986) study in predicting reading outcomes in the early elementary school years, it seems reasonable that there may be a relationship between the MAP Total Score and a later measure of early reading ability.

Peabody Picture Vocabulary Test - Revised (PPVT-R). The PPVT-R provides a standardized assessment of receptive vocabulary (Dunn & Dunn, 1981). Vocabulary was reported in 1957 to be a very good indicator of school success (Dale & Reichert as cited in Dunn & Dunn, 1981), and historically has been highly associated with measures of IQ used early in this century. However, a more recent study by Altepeter and Handal (1985), which compared the PPVT-R to the WISC-R (Wechsler, 1974) and Wide Range Achievement subtests (Jastak & Jastak, 1976), indicated that the PPVT-R loaded significantly on a verbal comprehension factor and did not load on perceptual organization or achievement factors.

According to the PPVT-R manual (Dunn & Dunn, 1981), most of the validity studies conducted to date have been on the earlier version of the PPVT (Dunn, 1959). However, the tests are similar in structure and a statistical link between the two tests has been demonstrated.
The validity studies of the PPVT found that it correlates moderately well with measures of IQ (median correlations in the .60's), and to a reasonable degree with other measures of school achievement administered concurrently (i.e., reading comprehension; r = .29 to .68). However, it does less well in predicting later school outcome (median correlations in the .40's). A predictive and concurrent study of the PPVT-R conducted since the publication of the PPVT-R manual, using a sample of 29 children with mental handicaps, found a correlation of .43 between the PPVT-R and achievement tests which were administered seven months later (Naglieri & Pfeiffer, 1983). The concurrent correlation was higher at .71.

In Miller's (1986) predictive validity study, the MAP Total Score and the five performance indices correlated significantly with the W-J - Language subtest (Woodcock & Johnson, 1977), although the correlation values were low (r = .21 to .35, with .35 being the correlation with the Total Score). Using the 5th percentile cutpoint, the sensitivity of the MAP was 23%, the specificity was 97%, the overreferral rate was 2.4%, and the underreferral rate was 6.8%, with the W-J - Language subtest. At the 25th percentile cutpoint, the sensitivity was 53%, the specificity was 79%, the overreferral rate was 19.2%, and the underreferral rate was 4.1%.

In comparing passed vs. failed language report card groups, t-tests demonstrated significant differences between the group means on the MAP Total Score and all performance indices except for the Complex Tasks Index (Miller, 1986). The MAP language items did not discriminate as well as its non-verbal cognitive items and fine motor items. Using the 5th percentile cutpoint, the sensitivity of the MAP was 20%, the specificity was 97%, the overreferral rate was 2.9%, and the underreferral rate was 5.9% in predicting language report card grades.
At the 25th percentile cutpoint, the sensitivity was 52%, the specificity was 78%, the overreferral rate was 20% and the underreferral rate was 3.6%.

There is some evidence to suggest that language development is a valid predictor of developmental outcome (Capute & Accardo, 1978). Felton (1992) suggested the assessment of phonological processing and language (as well as measures of reading readiness) to identify children at risk for reading difficulties. However, the findings of Satz et al. (1978) are more consistent with Miller's (1986) findings that young children with reading problems are more likely to have deficits in sensory-motor ability and perception than in language. Meisels and Wasik (1990) contend that there is no evidence to support a singular relationship between an early language delay and subsequent developmental delay in childhood.

There is no hard evidence to support the existence of a relationship between the assessment of a single aspect of language (receptive vocabulary) and a child's performance on a broad-domain developmental test like the MAP. However, given Miller's (1986) significant (albeit low) correlations between language and the MAP Total Score, and the inconsistencies in the literature (some of which support a relationship between language acquisition and later developmental outcome and some of which do not), it seems reasonable to explore the possibility of an association between the MAP Total Score and the PPVT-R.

Developmental Test of Visual-Motor Integration (VMI). The VMI is a standardized paper and pencil copying test of geometric forms (Beery, 1989). The MAP and the VMI have similar stated purposes in that both propose to screen for learning problems. The VMI was also reported to be one of the tests that influenced the format and the content of the MAP (Miller, 1982); however, no further information was provided as to how it specifically influenced the development of the test.
The VMI was one of the outcome measures used in Miller's (1986) predictive validity study. Correlations between the VMI and the MAP Total Score and the five performance indices were significant, except for the MAP Verbal Index. However, the coefficients were quite low (r = .10 to .21). The VMI was not subjected to further predictive accuracy evaluation using clinical epidemiology techniques in Miller's (1986) study.

Some investigators have argued that there is a strong relationship between deficits in visual function and school performance (Frostig, Maslow, Lefever & Whittlesey, 1964), and that LD is based in perceptual dysfunction (Frostig & Maslow cited in Reynolds, 1990b). Satz et al. (1978) provide more recent support for this theory with their findings that younger children with reading difficulties were more likely to present with difficulties in sensory-motor and perceptual function. Reynolds' (1990b) summary of numerous studies showed that children who had difficulty reproducing geometric forms on the Bender-Gestalt (another standardized measure of form copying) (Bender, 1938) also had difficulty with basic academic skills. However, with the possible exception of children in early elementary school, the predictive power of the Bender-Gestalt is limited if the effect of intelligence is controlled (Tolor & Brannigan, 1980). Lesiak (1984) found that the Bender-Gestalt added little or nothing to the predictive accuracy of standardized reading readiness tests in predicting reading ability.

In the VMI manual (Beery, 1989), support was provided for the VMI's predictive accuracy in earlier rather than later grade levels when used in combination with other measures. Two of the studies reported in the manual investigated the value of the VMI and other measures in predicting school achievement from kindergarten to grade 1, while another study investigated reading achievement seven years after kindergarten testing which included the VMI.
Correlations between the VMI and readiness tests were moderate (r = .50). In terms of achievement testing, the VMI correlated more strongly with arithmetic than with reading (no coefficients were provided). Handwriting correlated at .42. However, Reynolds (1990b) pointed out that in recent years, the relationship between visual-motor function and LD has been questioned and is uncertain.

There is little evidence to support a predictive relationship between the MAP Total Score and the VMI. Given the young age group of the children in this current study, however, there is more likely to be a relationship than if the children were older.

Research Questions

Because the MAP Total Score consistently demonstrated the highest level of predictive validity in Miller's (1986) study, the predictive accuracy of the MAP's Total Score, relative to the four outcome measures (WPPSI-R, TERA-2, PPVT-R and VMI), was investigated based on the following research questions:

Research Question #1. At which cutpoint does the MAP's Total Score demonstrate the highest level of sensitivity using the following outcome measures? (a) WPPSI-R (b) TERA-2 (c) PPVT-R (d) VMI

Research Question #2. At which cutpoint does the MAP's Total Score demonstrate the highest level of specificity using the following outcome measures? (a) WPPSI-R (b) TERA-2 (c) PPVT-R (d) VMI

Research Question #3. At which cutpoint does the MAP's Total Score demonstrate the highest positive predictive value using the following outcome measures? (a) WPPSI-R (b) TERA-2 (c) PPVT-R (d) VMI

Research Question #4. At which cutpoint does the MAP's Total Score demonstrate the highest negative predictive value using the following outcome measures? (a) WPPSI-R (b) TERA-2 (c) PPVT-R (d) VMI

Research Question #5. At which cutpoint does the MAP's Total Score demonstrate the lowest overreferral rate using the following outcome measures? (a) WPPSI-R (b) TERA-2 (c) PPVT-R (d) VMI

Research Question #6.
At which cutpoint does the MAP's Total Score demonstrate the lowest underreferral rate using the following outcome measures? (a) WPPSI-R (b) TERA-2 (c) PPVT-R (d) VMI

Research Question #7. At which cutpoint does the MAP's Total Score demonstrate the highest level of predictive accuracy when using a ROC curve, relative to the four outcome measures? (a) WPPSI-R (b) TERA-2 (c) PPVT-R (d) VMI

The final research question is relevant to the secondary purpose of this study, namely to contribute to the growing body of knowledge regarding developmental outcomes of preschool-aged children prenatally exposed to drugs.

Research Question #8. How did this particular sample of children perform on each of the tests, compared to the sample on which each test was normed? (a) MAP (b) WPPSI-R (c) TERA-2 (d) PPVT-R (e) VMI

CHAPTER 3

Methods

This chapter initially presents a detailed description of the participants, including how they were selected; their birth mothers' drug(s) of choice during pregnancy; some details with regard to their neonatal history; and their types of home and school environments. The predictor measure, the MAP, is then described, followed by descriptions of the four outcome measures: the WPPSI-R, TERA-2, PPVT-R and VMI. The procedure of the study is then presented which includes statements with regard to the study design and ethical approval; how informed consent was obtained; a description of the backgrounds of the psychologist and occupational therapist evaluators; a report of the time interval between the predictor and outcome measures; and a report on the interrater reliability with regard to the WPPSI-R, MAP and VMI.
Finally, the data analysis is discussed, which includes the clinical epidemiological techniques of sensitivity, specificity, positive predictive value, negative predictive value, overreferral rate, underreferral rate, and ROC curves for the MAP with regard to each outcome measure, and the use of descriptive statistics to compare the study's sample to each test's normative sample. A rationale is presented for the four chosen MAP cutpoints: the 5th, 14th, 17th and 26th percentile scores.

Participants

Participants for this study were children who were followed longitudinally from 1988 to 1994 in the Infants and Children At-Risk (ICAR) Program at Sunny Hill Health Centre for Children (SHHCC) because of prenatal drug exposure. These children were initially referred to SHHCC for management of drug withdrawal as neonates or were referred somewhat later by their social workers or physicians because of prenatal drug exposure. The purposes of the ICAR Program are to provide developmental follow-up at regular intervals up to age 6 years, and to make recommendations for referrals to early intervention programs or remedial activities, as appropriate.

The initial list of 154 potential participants was the total number of children seen by the ICAR Program during 1988 to 1994 who were also in the appropriate age range for the study. Of this potential sample, only 37 children (24%) participated in the study.
Of the children who did not participate in the study, 72 (47%) did not participate because of attrition from the program; 15 (10%) did not have a MAP evaluation in the required age range for the study; 14 (9%) were unable to complete a MAP evaluation in the required age range; 8 (5%) did not wish to participate; 5 (3%) did not fit into the time frame of the study (e.g., unable to schedule early and later assessments far enough apart and still meet the study's deadline for completion); and 3 (2%) were unable to participate for various reasons (e.g., difficulty with travel arrangements).

The birth mother's reported drug(s) of choice was obtained from each child's medical file for the 37 children in the final sample. As is typical in this population, the majority of mothers (81%) were recorded as being polydrug users. Alcohol was the most common drug recorded at 49%, with others being cocaine (38%), Ritalin (38%), Talwin (32%), and Ritalin and Talwin combined (27%). Less commonly recorded were methadone (19%) and heroin (16%). Over half of recorded drugs (57%) were prescription, over-the-counter or other street drugs (e.g., marijuana). Although alcohol was the most commonly reported drug, given the prevalence of polydrug abuse, it is thought to be underrecorded at only 49%, as is nicotine at only 8%.

Information on the neonatal history was also obtained from the medical file. Gestational age was available for 68% of the sample and ranged from 33 to 40 weeks (mean = 37.9 weeks). Birth weight was available for 62% of the sample and ranged from 1680 to 3700 grams (mean = 2775 grams). Withdrawal symptoms were recorded for 70% of the sample (n = 26), with two infants not demonstrating any signs of withdrawal; this information was unavailable for the remaining nine participants.

The majority of children (n = 22) were living with adoptive parents, while 8 were in foster care, 2 were living with other relatives, and 5 were living with their biological parents.
Most of the children (70%) were also in school or daycare. Special-needs preschool accounted for 22%, regular preschool 19%, kindergarten 19% and daycare 5%. Two children (5%) were provided with home schooling.

All participants received a MAP assessment between the ages of 35 and 49 months (mean = 40.3 months). They all subsequently received a psychological assessment between the ages of 47 and 68 months (mean = 56.9 months).

The participants represent a sample of convenience because the number of eligible participants was not large enough to allow for random sampling in this pool of children. The participants were previously involved in a prospective study conducted at SHHCC and funded by the Kiwanis Clubs of Kelowna, to examine the concurrent and predictive validity of the MAP in comparison to IQ and other tests administered by a psychologist (Harris, Eaves & Fulks, 1994). They represent the total number of children who were eligible for this study and for whom consent by the legal guardian was obtained.

Predictor Measure

Miller Assessment for Preschoolers

As stated earlier, the MAP is a standardized, preschool-aged test (2 years 9 months to 5 years 8 months) which was designed to evaluate children who are exhibiting moderate "preacademic problems" (Miller, 1982, p. 1). The test was developed over a 10-year period and involved over 4000 children with an initial pool of over 800 items. Following multiple field-tests, the final pool of 27 items was standardized on a randomly selected, stratified sample of 1200 preschoolers in the nine Census Bureau Regions of the U.S. The sample was a close approximation of the 1970-77 U.S. census reports (Miller, 1982), although it was skewed toward groups with higher income and education (Miller, 1986). The sample was stratified on the basis of age, sex, race, community size and socioeconomic variables. A sample of 90 children with predetermined preacademic problems was also tested.
As part of the initial standardization of the MAP, interrater reliability was assessed. It ranged from .84 to .99 across the Total Score and the five performance indices, with all but the Coordination Index ranging from .97 to .99 (Miller, 1982; Miller, 1988c). With regard to the test-retest reliability of the MAP, 81% of the MAP Total Scores remained stable over two testings using intervals of 1 to 4 weeks (Miller, 1982; Miller, 1988c). The internal consistency of the MAP Total Score based on the participants' responses to all test items was .79 using a Spearman-Brown correlation coefficient.

To establish content validity, four procedures were completed (Miller, 1982; Miller, 1988c): (1) a MAP specification table was developed which illustrated the representativeness of each MAP item to the behavioral domain assessed; (2) it was determined that each MAP item represented a developmental trend, with older children performing at a higher level on each item than younger children; (3) the computation of a varimax rotated factor matrix indicated that the Verbal, Coordination and Non-Verbal indices formed distinctive clusters, while the items from the Complex Tasks Index grouped with verbal, non-verbal and coordination factors (as might be expected given the complex nature of the activities), and the items from the Foundations Index did not cluster at all (reflecting measurement of a broad domain of behaviors); and (4) each MAP item correlated significantly with the Total Score, as did all five performance indices.

To assess concurrent validity, the MAP was compared to four other standardized tests with which the MAP was purported to have some commonality. The MAP demonstrated the strongest relationship with the Illinois Test of Psycholinguistic Ability (Kirk, McCarthy & Kirk, 1968).
The MAP did not correlate highly with the WPPSI (Wechsler, 1967), achieving a statistically significant correlation only between the WPPSI's FSIQ and the Complex Tasks Index (r = .367); nor did the MAP correlate well with the Southern California Sensory Integration Test (SCSIT) (Ayres, 1972), demonstrating significance only between the MAP's Foundation Index and the SCSIT grouping titled Other (r = .357). In comparing the MAP to the Denver Developmental Screening Test (DDST) (Frankenburg & Dodds, 1973), the MAP's Red category (0 - 5th percentile), Yellow category (6th - 25th percentile), and Green category (26th - 99th percentile) were compared respectively to the DDST's Abnormal, Questionable and Normal categories. Seventy-two percent of the sample fell into the same groupings. However, the MAP identified 24% more children as at risk than did the DDST.

To measure construct validity, a sample of 90 children with identified preacademic problems was evaluated on the MAP. A total of 75% of the sample obtained scores that placed them in the Red or Yellow category (46% in the Red category and 29% in the Yellow category).

Criterion Measures

Wechsler Preschool and Primary Scale of Intelligence-Revised. The WPPSI-R is a standardized assessment of intelligence for children from 3 years to 7 years 3 months (Wechsler, 1989). It measures abilities considered to reflect different aspects of intelligence. Development of the test was guided by the belief that intelligence is a global, multi-dimensional entity rather than a single-faceted, uniquely defined trait. The purpose of the WPPSI-R is described as follows: "The WPPSI-R is intended for use as a measure of intellectual ability in a wide range of educational, clinical, and research settings. The primary use of this scale is in the diagnosis of exceptionality in schools and private practice settings." (Wechsler, 1989, p. 8).
The test is divided into two sections - one section is comprised primarily of perceptual-motor subtests and its scores yield the Performance IQ (PIQ). The other section is comprised of verbal subtests and its scores yield the Verbal IQ (VIQ). Both scores are combined to obtain the Full Scale IQ (FSIQ). Motor responses such as pointing, placing or drawing are required by the Performance subtests. Spoken responses are required by the Verbal subtests. Intercorrelational studies and factor analyses consistently supported the clustering of two distinct subscales. Each subtest has a mean of 10 and a standard deviation of 3. Each IQ scale has a mean of 100 and a standard deviation of 15.

The split-half reliability coefficients ranged from .54 to .93 for the subtests, and .85 to .97 for the IQ scales. Interscorer agreement on subjectively scored tests, such as Geometric Design, ranged from .88 to .96. Objectively scored tests were not evaluated using interscorer agreement. The test-retest stability was also better for the IQ scales than the subtests, at .88 to .91 versus .52 to .82. Therefore, it was concluded that the WPPSI-R is highly reliable, especially for the IQ scales.

The construct validity of the WPPSI-R was supported in comparison studies with the WPPSI (coefficients for the IQ scales ranged from .82 to .87) (Wechsler, 1967), the WISC-R (.75 to .85 for the IQ scales) (Wechsler, 1974), the Stanford-Binet Intelligence Scale (.54 to .74 between each IQ scale, each area score and the composite score) (Thorndike, Hagen & Sattler, 1986), and the McCarthy Scales of Children's Abilities (.71 to .81 between the IQ scales and the McCarthy's indices) (McCarthy, 1972), indicating that the WPPSI-R is a valid measure of intelligence.

Test of Early Reading Ability-2. The TERA-2 is a standardized assessment of early reading for children from 3 to 9 years of age (Reid, Hresko & Hammill, 1989).
However, its uniqueness is in its evaluation of spontaneously emerging reading behaviors during the preschool years. It assesses three different areas: (1) construction of meaning from print, which refers to the knowledge of word meanings, syntax, and the child's knowledge of general background information (e.g., making sense of print via cues from logos, pictures or shapes of boxes); (2) knowledge of the alphabet and its functions, which refers to letter and word recognition, and how letters are related to words, syllables and individual sounds; and (3) discovery of print conventions, which refers to understanding how written language works, such as how a book is oriented, and that print is read from top to bottom and left to right. The purposes of the TERA-2 are: (1) to identify children who are performing significantly differently from their peers in early reading behaviors; (2) to monitor the progress of children who are learning to read; (3) to act as a measure for research studies; and (4) to indicate appropriate areas for further evaluation or remediation. The TERA-2 yields a raw score, percentile rank, and two types of standard scores (i.e., a quotient and a normal curve equivalent). For the purposes of this project, only the quotient, which has a mean of 100 and a standard deviation of 15, was used. This permitted consistency of measurement across the four outcome measures. The test was standardized using a sample of 1454 children across 15 states. The characteristics of the participants (e.g., sex, race, ethnicity) were representative of the population in the United States (U.S.), according to the 1985 Statistical Abstract of the U.S. Two types of reliability were established - content sampling and time sampling. Content sampling refers to the level of homogeneity among the test as a whole and its component items. Coefficients ranged from .78 to .98, with 86% of the items exceeding .80 and 64% exceeding .90. 
Time sampling refers to the degree of temporal stability of a test. Using the two equivalent forms of the TERA-2 (Form A and Form B), 49 students were tested using each form within a 2-week period. The resulting coefficient was .79. This method measured error not only with regard to time sampling but also form equivalency (a type of content sampling).

To establish content validity, the test authors reported that particular care was taken to "select items that were representative of the subject matter being assessed and that required a variety of responses" (Reid et al., 1989, p. 27). Criterion-related validity was evaluated via correlation with the Basic School Skills Inventory - Diagnostic (BSSI-D), Reading Subtest (Form A, r = .61; Form B, r = .52; p < .01) (Hammill & Leigh, 1983), and the Test of Reading Comprehension (Paragraph Reading subtests) (Form A, r = .36; Form B, r = .34; p < .05) (Brown, Hammill & Wiederholt, 1986). With regard to construct validity, the following evidence was provided: (1) test scores increased appropriately with increasing age and school experience; (2) the test differentiated between children with learning disabilities and those considered to be normal; (3) both forms of the test correlated with the BSSI-D Writing subtest and total score at the p < .05 level (r values ranged from .37 to .59), providing evidence of correlation with other academic measures; and (4) each test item related to the total score (item validity).

Peabody Picture Vocabulary Test - Revised. The PPVT-R is a standardized evaluation of hearing vocabulary where, for each stimulus word, the individual selects one picture out of four that best represents the stimulus word (Dunn & Dunn, 1981). It therefore does not require an oral or written response. The test was designed for use with individuals from 2 1/2 to 40 years of age. The test has two equivalent test forms - Form L and Form M.
The purpose of the PPVT-R is to assess an individual's receptive vocabulary. It provides a quick estimate of only one major aspect of language ability - receptive vocabulary. However, vocabulary has historically been reported to be a very good indicator of school success (Dale & Reichert as cited in Dunn & Dunn, 1981), although this is not supported in a more recent study (Altepeter & Handal, 1985).

The test is useful in several settings. For example, in school it may be used to help teachers know what vocabulary level to use when teaching (particularly with bilingual students), to assess preschoolers' vocabulary as a measure of child development, and to screen for children who may need remedial help. Because there is no need for reading, writing or oral responses, the test may be used clinically with individuals with a variety of special needs (e.g., cerebral palsy, autism, severe language impairment). The test may also be useful for research purposes. For example, it has two alternate forms which may be useful for pre- and posttesting, and it covers a wide age range which makes it useful in longitudinal studies.

The PPVT-R yields a raw score, standard score, percentile rank, stanine and age equivalent. For the purposes of this study, only the standard score was used (mean of 100 and standard deviation of 15). The test was standardized using a sample of 4200 children and adolescents. The sample was representative of the population based on the 1970 U.S. Census. The test was also standardized on 828 adults.

Two types of reliability were established - internal consistency using the split-half procedure, and alternate forms where a sample is tested on two different test forms. The split-half reliabilities for children and youth for Form L ranged from .67 to .88 (median .80), and from .61 to .86 on Form M (median .81).
The immediate retest alternate-forms reliability coefficients (both tests administered in < 1 week) for standard scores for children and youth ranged from .71 to .89 (median .79). The delayed retest alternate-forms reliability coefficients (both tests administered between 9 and 31 days apart) for children and youth for standard scores ranged from .54 to .90 (median .77).

To establish content validity, the authors reported undertaking a thorough search of the dictionary to identify all words whose meaning could be illustrated. Nineteen different content categories were employed (e.g., actions, animals, clothing). Evidence of construct validity was provided via various quotes and studies from developers of intelligence tests, which espoused the value of vocabulary testing and its high correlation with the overall IQ score. However, this was not supported in a more recent study (Altepeter & Handal, 1985). Although significant correlations were achieved between the three WISC-R IQ scales (r = .68 to .79) and the WRAT subtests (r = .50 to .59), on factor analysis the PPVT-R loaded significantly only on a verbal comprehension factor. The PPVT-R authors also reported that because the internal consistency of each test item was established as the stimulus words were selected, this also offers evidence of construct validity.

At the time of manual publication, no predictive validity studies of the PPVT-R were completed. The only study of concurrent validity reported in the PPVT-R manual was the comparison between the original and the revised editions of the PPVT, where the correlations for standard scores ranged from .50 to .85 (median .68). A concurrent and predictive validity study of the PPVT-R completed since the manual's publication revealed a predictive correlation between the PPVT-R and an achievement test of .43 (Naglieri & Pfeiffer, 1983). The tests were administered an average of seven months apart.
The concurrent validity correlation between these two measures was .71.

Developmental Test of Visual Motor Integration. The VMI is a standardized paper and pencil copying test, comprised of a developmental sequence of 24 geometric forms (Beery, 1989). It was designed for use with children aged 3 to 18 years. The primary purpose of the VMI is "to help prevent learning and behavioral problems through early screening identification" (Beery, 1989, p. 8).

The VMI yields a raw score, standard score, percentile rank, age equivalent, normal curve equivalent, t-score and scaled score. As with the other outcome measures, only the standard score (mean of 100 and standard deviation of 15) was used in this study.

The test was standardized initially in 1964 using a sample of 1030 children from Illinois. The initial test norms were cross-validated in 1981 using 2060 children from California. These norms were found to be virtually identical to the initial norms. The two sets of norms were subsequently combined and published in the 1982 revision. Further cross-validation studies were undertaken in 1988 using 2734 children from various states from the East, North and South regions. Despite incorporating an expanded scoring system (increased from 24 to 50 points to allow for finer discrimination in the older age levels), there were no significant differences between these norms and the earlier ones. Therefore, the norms from all three samples were combined and published in the 1989 revision. This combined norming sample is similar to the U.S. population as described in the 1980 census.

Beery (1989) reported that "As the 1989 expanded scoring results correlate almost perfectly with the original scoring results, earlier reliability, validity, and other studies of the VMI are relevant to the current edition" (p. 12). Therefore, the following reported studies of reliability and validity are primarily based on earlier editions.
Interrater reliability coefficients ranged from .58 to .99 (median .93). However, if 3 to 5 hours of training were provided, interrater reliability coefficients consistently exceeded .90. Test-retest reliability coefficients for normal samples ranged from .63 (over a 7-month period) to .92 (over a 2-week period) (median .81). With institutionalized, disturbed children, a coefficient of .59 was obtained over a 2-week period. Using the 1988 norming studies, split-half reliability coefficients ranged from .76 to .91 (median .85).

Several measures of concurrent validity were undertaken. Changes in eye-hand coordination with increasing chronological age correlated at .89. Correlations between the VMI and various IQ tests ranged from .38 to .59. Significant test performance differences between males and females or among rural, urban and suburban children were generally not found. Although statistically significant differences have been found between some ethnic groups (e.g., black vs. Caucasian children in Headstart programs), there was little practical significance because only approximately 1% of the variance could be attributed to ethnicity. Some studies have supported findings of differences in test performance based on socioeconomic differences, with lower socioeconomic groups performing more poorly.

In comparing the VMI and readiness tests, correlation coefficients have averaged about .50. In regards to achievement tests, the VMI correlated more strongly with measures of arithmetic than of reading. However, not all studies supported strong relationships. Correlations between the VMI and various age groups on handwriting averaged .42, stronger than other correlation measures including general IQ, finger dexterity and visual perception. With the exception of children with language delays, children with various other disabilities (e.g., brain injury, educable mental handicaps) have performed less well on the VMI than children without disabilities.
The VMI correlated highly with measures of non-motor visual perception (r = .72 and .80), although it was reported that because the children could visually discriminate more forms than they could copy, the VMI additionally measured a unique integration factor. Likewise, despite a correlation of .76 between test performance on the VMI and tracing the VMI forms, the children were able to trace approximately five more forms than they could copy. This again points to a unique integration factor on the VMI. It was suggested that the VMI measures an integration component that goes beyond visual-motor behavior because "The VMI correlated highly with automatic-sequential subtests of the Illinois Test of Psycholinguistic Abilities and only moderately with the subtest designed to measure less integrative skills" (Beery, 1989, p. 16). Correlations between the VMI and the Bender-Gestalt (Bender, 1938) have ranged from .29 to .93 (median .56).

In general, studies of predictive validity have found the VMI to be a useful predictor when used in combination with other measures (Beery, 1989). For example, in comparing a battery of pre-kindergarten tests, the VMI in combination with a test of auditory-vocal association best predicted achievement between the end of kindergarten and the end of grade one. Predictive correlations decrease in higher grade levels. The reason suggested was that children learn to compensate for poor visual-motor skills as they grow older.

Procedure

The study design was a nonexperimental, prospective analysis of the predictive accuracy of the MAP Total Score compared to four cognitive outcome measures, using clinical epidemiological techniques. Approval from the University of British Columbia (UBC) Ethical Review, Behavioural Sciences Screening Committee was received for the original prospective study by Harris, Eaves and Fulks (1994).
Subsequent approval was also received from the committee for the present study to reanalyze the existing data set and to extend the term deadline.

The participants were initially identified for eligibility in the study via medical file review regarding completion of MAP evaluations in the stated age range. The legal guardians of eligible participants were then contacted by a letter that explained the purpose of the study. They were then called regarding whether they wished to have their child participate. Children for whom written informed consent was provided by their legal guardians were then booked for a psychology evaluation, which included administration of the WPPSI-R, TERA-2 and PPVT-R. The occupational therapist administered the VMI.

The psychologist for the study received her Master of Arts degree in Psychology in 1964. She has 30 years of experience in working with children, including 10 years of experience in administering standardized psychological assessments to young children both at SHHCC and at British Columbia Children's Hospital. The occupational therapist had worked in the ICAR Program at SHHCC for five years and had been trained in administration and interpretation of the MAP. She also had eight years of experience in administering a variety of standardized motor, sensory and perceptual assessments, including the VMI.

The time interval between the administration of the MAP (predictor measure) and the administration of the four outcome measures was 6 to 32 months (mean = 16.6 months). Only two participants had a time interval of less than 11 months. This time interval is reflective of the availability of participants, and results in an average interval of almost 18 months.

As part of the original Kiwanis MAP Project, the interrater reliability on the MAP between this graduate student and another occupational therapist with 20 years of pediatric experience was assessed for 10 of the participants.
The intraclass correlation coefficient (ICC) for the MAP was .985, and for the VMI was .962. Interrater reliability for the WPPSI-R, between the Kiwanis Project psychologist and two other psychologists employed at SHHCC, was assessed for 5 of the participants. The ICC was .998 for the FSIQ, .964 for the VIQ, and .987 for the PIQ. Both of the psychologists who participated in the interrater reliability assessment were registered psychologists with Ph.D.s. One was a clinical child psychologist, while the other was a clinical developmental psychologist.

Data Analysis

To examine the predictive accuracy of the MAP, data were analyzed using clinical epidemiological techniques, including the sensitivity, specificity, positive predictive values, negative predictive values, overreferral rates and underreferral rates. The sensitivity and specificity rates were used to develop the ROC curves for each outcome measure, to help determine the best cutpoint for the correct classification of children into poor or good outcome groups. Given that the MAP Total Score consistently demonstrated the highest level of predictive accuracy in Miller's (1986) predictive validity research, only the Total Score was investigated in this study.

A range of MAP Total Score cutpoints from the 1st to the 25th percentiles was investigated. Because of the structure of the MAP's scoring system, not every percentile point between the 1st and the 25th percentile is available; only the following percentiles are attainable: 1 through 9, 11, 14, 17 and 20. It also did not seem reasonable to examine every attainable cutpoint within this range because, logically, there would not be as much of a difference between the 1st and 2nd percentiles as there might be between the 14th and 17th percentiles. Therefore, the following cutpoints were included for investigation: 5th, 14th and 17th percentiles.
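For reference, the accuracy measures named in this section follow the standard definitions for a 2 x 2 screening table. The Python sketch below is illustrative only and is not taken from the thesis: the function names and the example counts are hypothetical. Overreferral and underreferral rates are omitted because their exact formulas are not stated in this section.

```python
from typing import NamedTuple, Optional, Tuple


class Accuracy(NamedTuple):
    sensitivity: Optional[float]  # poor outcomes that the screen flagged
    specificity: Optional[float]  # good outcomes that the screen passed
    ppv: Optional[float]          # flagged children who had a poor outcome
    npv: Optional[float]          # passed children who had a good outcome


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def accuracy(tp: int, fp: int, fn: int, tn: int) -> Accuracy:
    """Standard 2 x 2 screening measures.

    tp: screen positive, poor outcome    fp: screen positive, good outcome
    fn: screen negative, poor outcome    tn: screen negative, good outcome
    """
    return Accuracy(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
    )


def roc_point(acc: Accuracy) -> Tuple[float, float]:
    """One ROC point for one cutpoint: (1 - specificity, sensitivity)."""
    if acc.sensitivity is None or acc.specificity is None:
        raise ValueError("ROC point needs both poor and good outcomes")
    return (1 - acc.specificity, acc.sensitivity)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = accuracy(tp=8, fp=5, fn=2, tn=35)
print(example)
print(roc_point(example))
```

Repeating the calculation at each candidate cutpoint gives one ROC point per cutpoint, which is the form in which sensitivity and specificity are traded off against each other.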
The 5th percentile cutpoint was chosen because it is one of Miller's (1982, 1988c) recommended cutpoints. The 14th and 17th cutpoints were chosen because they were the percentile scores which were closest to the 16th percentile, which would be 1 SD below the mean. Although the 25th percentile cutpoint is one of the recommended cutpoints of the MAP and is reported as a cutpoint in all of the previous validity research on the MAP, it actually is not an attainable percentile point. Therefore, the 26th percentile was also investigated as a cutpoint.

With regard to the four outcome measures, the cutpoint for all four tests was 1 SD below the mean (< 85). Only the FSIQ score was investigated for the WPPSI-R. One SD below the mean was selected as the cutpoint because of its statistical relevancy in that scores below this cutpoint are below the average range, and its clinical relevancy in that children performing 1 SD below the mean are more likely to be at-risk for learning problems than those above this cutpoint. Furthermore, it allowed for a consistent cutpoint across the four outcome measures.

To investigate how the sample of children in this study performed on the MAP and on each of the four outcome measures in comparison to each test's normative sample, descriptive statistics (e.g., means, SDs) for the present sample were calculated and compared to each test's norms.

CHAPTER 4

Results

This chapter presents the results of the data analysis. Each research question will be presented and answered in turn. The study design was a nonexperimental, prospective analysis of the predictive accuracy of the MAP Total Score relative to four cognitive outcome measures. The data were analyzed using clinical epidemiological techniques. The first seven of eight research questions pertained to the primary purpose of the study, which was to evaluate the predictive validity of the MAP Total Score.
The first four questions pertained specifically to the sensitivity, specificity, positive predictive value and negative predictive value of the MAP Total Score. An acceptable level of predictive accuracy is considered to be a value of > 80% (Carran & Scott, 1992). Research questions #5 and #6 examined the overreferral and underreferral rates. Given that an acceptable predictive accuracy value was reported as being > 80% (Carran & Scott, 1992), it may be assumed that an acceptable overreferral or underreferral rate is < 20%. Research question #7 examined the trade-off between the sensitivity and the specificity using a ROC curve. Research question #8 investigated how this sample of children performed on the MAP and on each of the four outcome measures, compared to each test's normative sample. This question pertained to the secondary purpose of the study, which was to contribute to the body of knowledge on outcomes of preschool-aged children with prenatal drug exposure.

Four cutpoints of the MAP Total Score were evaluated: the 5th, 14th, 17th and 26th percentiles. The 5th percentile was chosen because it is one of the MAP's recommended cutpoints. The 14th and 17th percentiles were chosen because they were the percentile scores which were closest to the 16th percentile, which would be 1 SD below the mean. Although the 25th percentile cutpoint is one of the recommended cutpoints of the MAP and is reported as a cutpoint in all of the previous validity research on the MAP, it actually is not an attainable percentile point. Therefore, the 26th percentile was also investigated as a cutpoint. With regard to the four outcome measures, the cutpoint for each of the four tests was set at 1 SD below the mean.

Summary of Participant Test Performance

On the MAP, none of the 37 children performed at or below the 5th percentile cutpoint, while 6 of the children performed at or below the 14th percentile cutpoint.
Eight of the 37 children received a MAP score at the 17th percentile cutpoint or below, while a total of 12 children received a MAP percentile score of 26 or lower. On the outcome measures, scores of less than 85 were received by two children on the WPPSI-R, four children on the TERA-2, three children on the PPVT-R, and six children on the VMI. A complete set of test scores for each participant is presented in the appendix.

Research Question #1

The sensitivity of a test refers to the proportion of participants correctly identified as having a poor outcome (Carran & Scott, 1992; Domholdt, 1993; Miller et al., 1990). Knowing the sensitivity of a test allows an examiner to assess how predictive a poor test result is of a poor outcome. A test will yield different levels of sensitivity depending on the cutpoint used. At which cutpoint does the MAP's Total Score demonstrate the highest level of sensitivity using the following outcome measures?

(a) WPPSI-R. At the 5th percentile cutpoint, the sensitivity of the MAP to later IQ, as assessed by the WPPSI-R FSIQ, was 0%. Specifically, none of the children performed at or below the 5th percentile on the MAP, but two of them scored more than 1 SD below the mean on the WPPSI-R FSIQ. In contrast, the sensitivity of the MAP at the 14th, 17th and 26th percentile cutpoints was 100%, indicating that the test was sensitive enough as low as the 14th percentile cutpoint to detect the two children who performed poorly on the WPPSI-R FSIQ.

(b) TERA-2. At the 5th percentile cutpoint, the sensitivity of the MAP to early reading ability, as assessed by the TERA-2, was 0%. None of the children performed at or below the 5th percentile on the MAP, but four of them scored more than 1 SD below the mean on the TERA-2. At the 14th and the 17th percentile cutpoints, the MAP attained a sensitivity of 50%. At the 26th percentile cutpoint, the MAP's sensitivity was 75%; i.e., three of the four children who performed poorly on the TERA-2 (more than 1 SD below the mean) were also identified as poor performers on the MAP.

(c) PPVT-R. At both the 5th and 14th percentile cutpoints, the sensitivity of the MAP to later receptive vocabulary, as assessed by the PPVT-R, was 0%. None of the three children who scored more than 1 SD below the mean on the PPVT-R performed at or below the 14th percentile on the MAP. The MAP attained a sensitivity of 33% at both the 17th and the 26th percentile cutpoints, correctly identifying only one out of the three children who performed poorly on the PPVT-R.

(d) VMI. At the 5th percentile cutpoint, the sensitivity of the MAP to later visual motor integration, as assessed by the VMI, was 0%. None of the children performed at or below the 5th percentile on the MAP, yet six of them scored more than 1 SD below the mean on the VMI. The MAP achieved a sensitivity of 50% at both the 14th and the 17th percentile cutpoints. However, at the 26th percentile cutpoint the MAP's sensitivity was 83%, with five of the six children who performed poorly on the VMI correctly identified by the MAP.

The sensitivity of the MAP Total Score for each of the four outcome measures is shown in Table 1.

Table 1
Sensitivity of the MAP Total Score for each of the Four Outcome Measures

                  Outcome Measures
MAP's cutpoints   WPPSI-R^a   TERA-2^b   PPVT-R^c   VMI^d
5th               0%          0%         0%         0%
14th              100%        50%        0%         50%
17th              100%        50%        33%        50%
26th              100%        75%        33%        83%

^a n = 37. ^b n = 33. ^c n = 36. ^d n = 37.

Research Question #2

The specificity of a test refers to the proportion of participants correctly identified as having a good outcome (Carran & Scott, 1992; Domholdt, 1993; Miller et al., 1990). Knowing the specificity of a test allows an examiner to assess how predictive a good test result is of a good outcome. A test will yield different levels of specificity depending on the cutpoint used.
At which cutpoint does the MAP's Total Score demonstrate the highest level of specificity using the following outcome measures?

(a) WPPSI-R. At the 5th percentile cutpoint, the specificity of the MAP to later IQ, as assessed by the WPPSI-R FSIQ, was 100%. At this cutpoint, the test correctly identified all 35 of the children with scores of 85 or higher on the WPPSI-R FSIQ. Acceptable specificity levels were attained using either the 14th or the 17th percentile cutpoints, at 89% and 83%, respectively. The MAP incorrectly classified 4 children as at-risk at the 14th percentile cutpoint, and 6 children as at-risk at the 17th percentile cutpoint. Using the 26th percentile cutpoint, the MAP's specificity was 71%.

(b) TERA-2. At the 5th percentile cutpoint, the specificity of the MAP to early reading behavior, as assessed by the TERA-2, was 100%. At this cutpoint, the test correctly identified all 29 children with scores of 85 or higher on the TERA-2. At the 14th and the 17th percentile cutpoints, the MAP's specificity was 86% and 79%, respectively. Four children were incorrectly classified as at-risk at the 14th percentile MAP cutpoint, and 6 children at the 17th percentile MAP cutpoint. Using the 26th percentile cutpoint, the MAP's specificity was 69%.

(c) PPVT-R. At the 5th percentile cutpoint, the specificity of the MAP to receptive vocabulary, as assessed using the PPVT-R, was 100%. At this cutpoint, the test correctly identified all 33 children with scores of 85 or higher on the PPVT-R. At the 14th and the 17th percentile cutpoints, the MAP's specificity was 82% and 76%, respectively. Six children were incorrectly classified as at-risk at the 14th percentile MAP cutpoint, and 8 children at the 17th percentile MAP cutpoint. Using the 26th percentile cutpoint, the MAP's specificity was 64%.

(d) VMI. At the 5th percentile cutpoint, the specificity of the MAP to visual motor integration, as assessed by the VMI, was 100%. At this cutpoint, the test correctly identified all
At this cutpoint, the test correctly identified all 31 children with scores of > 85 on the VMI. At the 14th and 17th percentile cutpoints, the MAP's specificity was 90% and 84%, respectively. Three children were incorrectly classified as at-risk at the 14th percentile MAP cutpoint, and 5 children at the 17th percentile MAP cutpoint. Using the 26th percentile cutpoint, the specificity of the MAP was 77%.

The specificity of the MAP Total Score for each of the four outcome measures is shown in Table 2.

Table 2
Specificity of the MAP Total Score for each of the Four Outcome Measures

                              Outcome Measures
MAP's cutpoints   WPPSI-R^a   TERA-2^b   PPVT-R^c   VMI^d
5th               100%        100%       100%       100%
14th              89%         86%        82%        90%
17th              83%         79%        76%        84%
26th              71%         69%        64%        77%

^a n = 37. ^b n = 33. ^c n = 36. ^d n = 37.

Research Question #3

The positive predictive value of a test refers to the proportion of participants correctly identified as positive by the test (identified as at-risk), compared to all of the participants who received a positive test score (Carran & Scott, 1992; Domholdt, 1993). Knowing the positive predictive value of a test allows an examiner to assess the likelihood that a participant with a poor test result is a true positive. A test will yield different positive predictive values depending on the cutpoint used. At which cutpoint does the MAP's Total Score demonstrate the highest positive predictive value using the following outcome measures?

(a) WPPSI-R. Using the WPPSI-R FSIQ as the outcome measure, the MAP demonstrated very low positive predictive values. This indicates that the proportion of children with poor outcomes (scores of < 85) on the WPPSI-R FSIQ who were correctly identified as such by the MAP, compared to all the children who scored poorly on the MAP, was small. Using the MAP's 5th percentile cutpoint, the positive predictive value for later IQ was 0%.
The highest positive predictive value was attained at 33% when using the MAP's 14th percentile cutpoint. In other words, of the six children who did poorly on the MAP when using the 14th percentile as a cutpoint, only two had poor outcomes on the WPPSI-R FSIQ. The 17th and 26th percentile MAP cutpoints attained positive predictive values of 25% and 17%, respectively.

(b) TERA-2. Using the TERA-2 as the outcome measure, the MAP demonstrated very low positive predictive values. At the 5th percentile cutpoint, the positive predictive value of the MAP was 0%. The highest positive predictive value was again at the 14th percentile cutpoint at 33%, with correct identification of two out of six children who scored poorly on the MAP. The 17th and 26th percentile MAP cutpoints each demonstrated positive predictive values of 25%.

(c) PPVT-R. The MAP demonstrated the lowest positive predictive values when using the PPVT-R as the outcome measure. Positive predictive values of 0% were demonstrated at both the 5th and 14th percentile MAP cutpoints. The 17th percentile MAP cutpoint attained the highest positive predictive value at only 11%, with correct identification of one out of nine children who scored poorly on the MAP. The 26th percentile cutpoint demonstrated a positive predictive value of only 8%.

(d) VMI. The MAP attained the highest positive predictive values when using the VMI as the outcome measure. The positive predictive value at the 5th percentile of the MAP was 0%. It was highest at the 14th percentile at 50%, with correct identification of three out of six children who performed poorly on the MAP. The 17th and 26th percentile MAP cutpoints demonstrated positive predictive values of 38% and 42%, respectively.

The positive predictive value of the MAP Total Score for each of the four outcome measures is shown in Table 3.
Table 3
Positive Predictive Value of the MAP Total Score for each of the Four Outcome Measures

                              Outcome Measures
MAP's cutpoints   WPPSI-R^a   TERA-2^b   PPVT-R^c   VMI^d
5th               0%          0%         0%         0%
14th              33%         33%        0%         50%
17th              25%         25%        11%        38%
26th              17%         25%        8%         42%

^a n = 37. ^b n = 33. ^c n = 36. ^d n = 37.

Research Question #4

The negative predictive value of a test refers to the proportion of participants correctly identified as negative by the test (identified as not at-risk), compared to all the participants who received a negative test score (Domholdt, 1993). Knowing the negative predictive value of a test allows an examiner to assess the likelihood that a participant with a good test result is a true negative. A test will yield different negative predictive values depending on the cutpoint used. At which cutpoint does the MAP's Total Score demonstrate the highest negative predictive value using the following outcome measures?

(a) WPPSI-R. When using the WPPSI-R FSIQ as the outcome measure, the MAP demonstrated excellent negative predictive values. In other words, the proportion of children with good outcomes on the WPPSI-R FSIQ (scores of > 85) who were correctly identified as such by the MAP, compared to all the children who scored well on the MAP, was large. A negative predictive value of 95% was demonstrated at the 5th percentile cutpoint of the MAP, indicating that out of the 37 children who scored well on the MAP using this cutpoint, 35 also had good outcomes on the WPPSI-R FSIQ. A negative predictive value of 100% was attained at the 14th, 17th and 26th percentile MAP cutpoints.

(b) TERA-2. When using the TERA-2 as the outcome measure, the MAP attained acceptable negative predictive values. At the 5th percentile cutpoint of the MAP, the negative predictive value was the lowest at 87% (correct identification of 29 out of 33 children). A negative predictive value of 93% was attained at the 14th percentile MAP cutpoint, and 92% at the 17th percentile MAP cutpoint.
At the 26th percentile MAP cutpoint, a negative predictive value of 95% was demonstrated.

(c) PPVT-R. When using the PPVT-R as the outcome measure, the MAP again attained acceptable negative predictive values. A negative predictive value of 92% was demonstrated at the 5th percentile cutpoint of the MAP (correct identification of 33 out of 36 children). The 14th, 17th and 26th percentile MAP cutpoints demonstrated negative predictive values of 90%, 93% and 91%, respectively.

(d) VMI. When using the VMI as the outcome measure, acceptable negative predictive values for the MAP were also demonstrated. A negative predictive value of 84% was demonstrated at the 5th percentile MAP cutpoint (correct identification of 31 out of 37 children). Both the 14th and 17th percentile cutpoints of the MAP attained negative predictive values of 90%, while 96% was demonstrated at the 26th percentile cutpoint.

The negative predictive value for the MAP Total Score for each of the four outcome measures is shown in Table 4.

Table 4
Negative Predictive Value of the MAP Total Score for each of the Four Outcome Measures

                              Outcome Measures
MAP's cutpoints   WPPSI-R^a   TERA-2^b   PPVT-R^c   VMI^d
5th               95%         87%        92%        84%
14th              100%        93%        90%        90%
17th              100%        92%        93%        90%
26th              100%        95%        91%        96%

^a n = 37. ^b n = 33. ^c n = 36. ^d n = 37.

Research Question #5

The overreferral rate is the proportion of false positives in relation to the total sample (Miller et al., 1990; Schouten & Kirkpatrick, 1993). This value provides an examiner with information regarding the extent to which a test may overidentify participants with the problem in question. A test will yield different overreferral rates depending on the cutpoint used. At which cutpoint does the MAP's Total Score demonstrate the lowest overreferral rate using the following outcome measures?

(a) WPPSI-R. Using the 5th percentile cutpoint, the overreferral rate of the MAP to later IQ, as assessed by the WPPSI-R FSIQ, was 0%.
There were no children who scored < 5th percentile on the MAP; therefore, no children could have been falsely identified by the MAP as at-risk for later below average IQ scores. Using the 14th and 17th percentile cutpoints, the MAP attained acceptable overreferral rates of 11% and 16%, respectively. However, at the 26th percentile cutpoint of the MAP, 10 out of 37 children were incorrectly classified for an overreferral rate of 27%.

(b) TERA-2. Using the 5th percentile cutpoint, the overreferral rate of the MAP to early reading behavior, as assessed by the TERA-2, was 0%. The MAP attained acceptable overreferral rates of 12% and 18% at the 14th and the 17th percentile cutpoints, respectively. However, at the 26th percentile cutpoint of the MAP, 9 out of 33 children were incorrectly classified, rendering an overreferral rate of 27%.

(c) PPVT-R. Using the 5th percentile cutpoint, the overreferral rate of the MAP to later receptive vocabulary, as assessed by the PPVT-R, was 0%. The MAP attained an acceptable overreferral rate of 17% at the 14th percentile cutpoint, but had an overreferral rate of 22% at the 17th percentile cutpoint. At the 26th percentile cutpoint of the MAP, 12 of 36 children were incorrectly classified for an overreferral rate of 33%.

(d) VMI. Using the 5th percentile cutpoint, the overreferral rate of the MAP to later visual motor integration, as measured by the VMI, was 0%. The MAP attained acceptable overreferral rates of 8% and 14% at the 14th and 17th percentile cutpoints, respectively. At the 26th percentile cutpoint of the MAP, 7 of 37 children were incorrectly classified for an acceptable overreferral rate of 19%.

The overreferral rate of the MAP Total Score for each of the four outcome measures is shown in Table 5.
Table 5
Overreferral Rate of the MAP Total Score for each of the Four Outcome Measures

                              Outcome Measures
MAP's cutpoints   WPPSI-R^a   TERA-2^b   PPVT-R^c   VMI^d
5th               0%          0%         0%         0%
14th              11%         12%        17%        8%
17th              16%         18%        22%        14%
26th              27%         27%        33%        19%

^a n = 37. ^b n = 33. ^c n = 36. ^d n = 37.

Research Question #6

The underreferral rate is the proportion of false negatives in relation to the total sample (Miller et al., 1990; Schouten & Kirkpatrick, 1993). This value provides an examiner with information regarding the extent to which a test may underidentify participants with the problem in question. A test will yield different underreferral rates depending on the cutpoint used. At which cutpoint does the MAP's Total Score demonstrate the lowest underreferral rate using the following outcome measures?

(a) WPPSI-R. When using the WPPSI-R as the outcome measure, the MAP attained excellent underreferral rates. Using the 5th percentile cutpoint, the underreferral rate of the MAP to later IQ, as assessed by the WPPSI-R FSIQ, was 5%. In other words, of the 37 children who did well on the MAP using the 5th percentile cutpoint, 2 scored greater than 1 SD below the mean on the WPPSI-R FSIQ. However, at the 14th, 17th and 26th percentile cutpoints of the MAP, the underreferral rate was 0%, indicating that both of the children who had difficulty on the WPPSI-R FSIQ were classified correctly by the MAP at these three cutpoints.

(b) TERA-2. When using the TERA-2 as the outcome measure, the MAP attained acceptable underreferral rates. Using the 5th percentile cutpoint, the underreferral rate of the MAP to early reading behavior, as assessed later by the TERA-2, was 12%. At this cutpoint, there were 33 children who scored well on the MAP but 4 who scored greater than 1 SD below the mean on the TERA-2. The MAP attained an underreferral rate of 6% at both the 14th and the 17th percentile cutpoints.
Using the 26th percentile cutpoint of the MAP, 1 of 33 children was incorrectly classified, for an underreferral rate of 3%.

(c) PPVT-R. When using the PPVT-R as the outcome measure, the MAP again attained acceptable underreferral rates. Using the 5th and 14th percentile cutpoints, the underreferral rate of the MAP to later receptive vocabulary, as assessed by the PPVT-R, was 8%. At these cutpoints, there were 36 children who scored well on the MAP but 3 of them scored greater than 1 SD below the mean on the PPVT-R. At both the 17th and the 26th percentile cutpoints of the MAP, 2 of 36 children were incorrectly classified for an underreferral rate of 6%.

(d) VMI. When using the VMI as the outcome measure, the MAP also attained acceptable underreferral rates. Using the 5th percentile cutpoint, the underreferral rate of the MAP to later visual motor integration, as assessed by the VMI, was 16%. At this cutpoint, there were 37 children who scored well on the MAP but 6 of them scored greater than 1 SD below the mean on the VMI. The underreferral rate of the MAP at both the 14th and the 17th percentile cutpoints was 8%. Using the 26th percentile cutpoint of the MAP, 1 child was incorrectly classified out of 37 for an underreferral rate of 3%.

The underreferral rate of the MAP Total Score for each of the four outcome measures is shown in Table 6.

Table 6
Underreferral Rate of the MAP Total Score for each of the Four Outcome Measures

                              Outcome Measures
MAP's cutpoints   WPPSI-R^a   TERA-2^b   PPVT-R^c   VMI^d
5th               5%          12%        8%         16%
14th              0%          6%         8%         8%
17th              0%          6%         6%         8%
26th              0%          3%         6%         3%

^a n = 37. ^b n = 33. ^c n = 36. ^d n = 37.

Research Question #7

The ROC curve illustrates the trade-off between the sensitivity and specificity of a test over a range of cutpoints by a curve on a graph (Fletcher, Fletcher & Wagner, 1988).
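The trade-off that a ROC curve displays can also be summarized numerically. The sketch below converts the sensitivity and specificity values from Tables 1 and 2 into ROC points (1-Specificity, Sensitivity) and ranks the cutpoints by Youden's J (sensitivity + specificity - 100, in percentage points). Youden's J is not a statistic reported in this study; it is an assumption introduced here purely for illustration, and the function names are hypothetical.

```python
CUTPOINTS = ["5th", "14th", "17th", "26th"]

# Percent values copied from Table 1 (sensitivity) and Table 2 (specificity).
SENSITIVITY = {
    "WPPSI-R": [0, 100, 100, 100],
    "TERA-2": [0, 50, 50, 75],
    "PPVT-R": [0, 0, 33, 33],
    "VMI": [0, 50, 50, 83],
}
SPECIFICITY = {
    "WPPSI-R": [100, 89, 83, 71],
    "TERA-2": [100, 86, 79, 69],
    "PPVT-R": [100, 82, 76, 64],
    "VMI": [100, 90, 84, 77],
}


def roc_points(measure):
    """Return (1 - specificity, sensitivity) pairs, one per MAP cutpoint."""
    return [
        (100 - spec, sens)
        for sens, spec in zip(SENSITIVITY[measure], SPECIFICITY[measure])
    ]


def best_cutpoint(measure):
    """Return the cutpoint with the largest Youden's J (assumed criterion)."""
    j_values = [
        sens + spec - 100
        for sens, spec in zip(SENSITIVITY[measure], SPECIFICITY[measure])
    ]
    return CUTPOINTS[j_values.index(max(j_values))]


if __name__ == "__main__":
    for measure in SENSITIVITY:
        print(measure, roc_points(measure), "best:", best_cutpoint(measure))
```

With the tabled values, the index is highest at the 14th, 26th, 17th and 26th percentile cutpoints for the WPPSI-R, TERA-2, PPVT-R and VMI, respectively, which are the same cutpoints identified from the ROC curves under this research question.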
Tests or cutpoints with high levels of predictive accuracy crowd the upper left corner of the curve, while less discriminative tests or cutpoints have curves that fall closer to a diagonal line. The larger the area under the curve, the more accurate the test or cutpoint. At which cutpoint does the MAP's Total Score demonstrate the highest level of predictive accuracy when using a ROC curve, relative to the four outcome measures?

(a) WPPSI-R. Using a ROC curve, the best trade-off between the sensitivity and the specificity for the MAP Total Score in predicting later IQ, as assessed by the WPPSI-R, appears to be at the 14th percentile cutpoint. At this cutpoint, the sensitivity is 100% and the specificity is 89%. The ROC curve which represents each of these MAP cutpoints in relation to the WPPSI-R is shown in Figure 1.

Figure 1. ROC curve using the WPPSI-R for each of the MAP's four cutpoints. [Plot of Sensitivity (vertical axis) against 1-Specificity (horizontal axis).]

(b) TERA-2. Using a ROC curve, the best trade-off between the sensitivity and the specificity for the MAP Total Score in predicting early reading behavior, as assessed by the TERA-2, appears to be at the 26th percentile cutpoint. At this cutpoint, the sensitivity is 75% and the specificity is 69%. The ROC curve which represents each of these MAP cutpoints in relation to the TERA-2 is shown in Figure 2.

Figure 2. ROC curve using the TERA-2 for each of the MAP's four cutpoints. [Plot of Sensitivity (vertical axis) against 1-Specificity (horizontal axis).]

(c) PPVT-R. Using a ROC curve, the best trade-off between the sensitivity and the specificity for the MAP Total Score in predicting later receptive vocabulary, as assessed by the PPVT-R, appears to be at the 17th percentile cutpoint. At this cutpoint, the sensitivity is 33% and the specificity is 76%. The ROC curve which represents each of these MAP cutpoints in relation to the PPVT-R is shown in Figure 3.
Figure 3. ROC curve using the PPVT-R for each of the MAP's four cutpoints. [Plot of Sensitivity (vertical axis) against 1-Specificity (horizontal axis).]

(d) VMI. Using a ROC curve, the best trade-off between the sensitivity and the specificity for the MAP Total Score in predicting later visual motor integration, as assessed by the VMI, appears to be at the 26th percentile. At this cutpoint, the sensitivity is 83% and the specificity is 77%. The ROC curve which represents each of these MAP cutpoints in relation to the VMI is shown in Figure 4.

Figure 4. ROC curve using the VMI for each of the MAP's four cutpoints. [Plot of Sensitivity (vertical axis) against 1-Specificity (horizontal axis).]

Research Question #8

Each of the tests used in this study has a normative sample with which to compare an individual child's test performance. Information regarding the clinical profile of this group of children with prenatal drug exposure may be obtained by comparing this sample's test performance with the sample upon which each test was normed. How did this particular sample of children perform on each of the tests, compared to the sample on which each test was normed?

(a) MAP. The MAP's percentile scale ranges from 1 to 99 (although not every percentile within this range is attainable) with 50 representing the median. Based on 37 percentile scores, the children in this study attained a median percentile score of 40 with a range from 7 to 92.

(b) WPPSI-R. The WPPSI-R has standard scores with a mean of 100 and a SD of 15. Based on 37 standard scores, the children in this sample attained a mean score of 100.16 and a SD of 11.67. The range was 80 - 138.

(c) TERA-2. The TERA-2 has standard scores with a mean of 100 and a SD of 15. Based on 33 standard scores, the children in this sample attained a mean score of 94.94 and a SD of 10.85. The range was 75 - 119.

(d) PPVT-R. The PPVT-R has standard scores with a mean of 100 and a SD of 15.
Based on 36 standard scores, the children in this sample attained a mean score of 98.33 and a SD of 9.42. The range was 78 - 122.

(e) VMI. The VMI has standard scores with a mean of 100 and a SD of 15. Based on 37 standard scores, the children in this sample attained a mean of 94.84 and a SD of 10.01. The range was 72 - 112.

The means and SDs of the four outcome measures are shown in Table 7.

Table 7
Means and Standard Deviations for the Four Outcome Measures

                              Outcome Measures
Standard scores   WPPSI-R^a   TERA-2^b   PPVT-R^c   VMI^d
Mean              100.16      94.94      98.33      94.84
SD                11.67       10.85      9.42       10.01

^a n = 37. ^b n = 33. ^c n = 36. ^d n = 37.

Summary

In this chapter, the results from the data analysis were presented. Using the WPPSI-R as the outcome measure, the MAP was successful in predicting later intelligence in this sample of preschool-aged children with prenatal drug exposure, particularly when using the MAP's 14th percentile cutpoint. The MAP performed the next best when predicting outcome of early reading behavior and visual-motor integration as assessed by the TERA-2 and the VMI, respectively. The best balance of predictive accuracy values, when using these two outcome measures, was at the MAP's 14th percentile cutpoint; however, the sensitivity and positive predictive values were unacceptably low. Acceptable sensitivity values were attained at the MAP's 26th percentile cutpoint when using these two outcome measures, but at the expense of lower specificity values and higher overreferral rates. As assessed using the PPVT-R as the outcome measure, the MAP demonstrated overall poor ability to predict later receptive vocabulary at any of the four cutpoints.

The test performances of this sample of children with prenatal drug exposure approximated a normal distribution on the WPPSI-R and PPVT-R. Their test performance on the TERA-2 and VMI more closely approximated literature findings for children with prenatal drug exposure, i.e.
performance in the low normal range on cognitive or developmental testing.

CHAPTER 5

Discussion

This chapter first discusses the study's results in more detail and relates them to other studies, particularly Miller's (1986) study on the predictive validity of the MAP. Limitations of research involving children with prenatal drug exposure, in general, are then presented, followed by the limitations specific to this study. The final section presents the summary and conclusions.

This study investigated the predictive validity of the MAP relative to later cognitive performance in preschool-aged children with prenatal drug exposure as measured by four tests: the WPPSI-R, TERA-2, PPVT-R and VMI. Previous criticisms of studies of the MAP's predictive validity have been: (1) the predominant use of correlational and t-test analysis, rather than clinical epidemiological techniques which classify individual children into appropriate membership groups for comparison of predicted risk status with ultimate outcome status, thereby determining the amount of agreement between screening decisions and future outcomes; (2) the reporting of only some types of screening errors (i.e., overreferral rates) while omitting other types of screening errors (i.e., underreferral rates); and (3) the questionability of the predictive accuracy of the MAP's recommended 5th and 25th percentile cutpoints without investigation of the full range of possible cutpoints (Schouten & Kirkpatrick, 1993).

This study attempted to address each of these criticisms. Clinical epidemiological techniques were used in order to precisely determine how accurately the MAP classifies children into the appropriate outcome groups, and all types of screening errors were reported. In order to investigate a fuller range of cutpoints, the 14th and 17th percentile cutpoints were examined in addition to the recommended 5th and 25th percentile cutpoints.
The 14th and 17th percentile cutpoints were chosen because they most closely approximated what would be 1 SD below the mean. As a secondary purpose, the study also compared the participants' test performance to each of the five tests' norms to investigate how this sample of children with prenatal drug exposure performed in comparison to the sample upon which each test was normed.

WPPSI-R

Overall, the highest level of predictive accuracy of the MAP was attained when using the WPPSI-R as the outcome measure and the MAP's 14th percentile as the cutpoint. At this cutpoint, acceptable levels of predictive accuracy were obtained when using all of the clinical epidemiological measures, with the exception of the positive predictive value (33%). Overall, this indicates that the MAP was quite successful in predicting later intelligence in this sample of preschool-aged children with prenatal drug exposure, as assessed by the WPPSI-R using the MAP's 14th percentile cutpoint. This outcome is not surprising given that the WPPSI-R was one of the tests that influenced the development of the format and the content of the MAP (Miller, 1982). Furthermore, given the MAP's verbal and perceptual-based items, it appears to resemble the WPPSI-R more so than the other three outcome measures.

In Miller's (1986) study, the MAP demonstrated the highest levels of sensitivity and specificity when using the WISC-R as the outcome measure, compared to all of the other outcome measures that were included in the study. This finding was the same for the current study using the WPPSI-R as the outcome measure. In Miller's (1986) study, the predictive validity of the MAP's Total Score was investigated relative to the WISC-R (Wechsler, 1974) using the clinical epidemiological techniques of sensitivity, specificity, overreferral rates and underreferral rates.
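For reference, the predictive values and referral rates discussed in this chapter all derive from the same 2 x 2 classification table as sensitivity and specificity; they differ only in the denominator. The sketch below, offered as a minimal illustration rather than the study's actual procedure, uses the WPPSI-R counts at the MAP's 14th percentile cutpoint reported in Chapter 4 (2 true positives, 4 false positives, 0 false negatives and, by subtraction from the 35 children with good outcomes, 31 true negatives). The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Cells of a 2 x 2 screening table (hypothetical helper type)."""
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int

    @property
    def total(self):
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    def positive_predictive_value(self):
        flagged = self.true_pos + self.false_pos
        # Undefined when no child is flagged; 0.0 is returned here as an
        # assumption, mirroring the 0% reported at the 5th percentile cutpoint.
        return self.true_pos / flagged if flagged else 0.0

    def negative_predictive_value(self):
        return self.true_neg / (self.true_neg + self.false_neg)

    def overreferral_rate(self):
        # False positives relative to the whole sample.
        return self.false_pos / self.total

    def underreferral_rate(self):
        # False negatives relative to the whole sample.
        return self.false_neg / self.total


def as_percent(value):
    """Round a proportion to a whole percent."""
    return round(100 * value)


# WPPSI-R outcome at the MAP's 14th percentile cutpoint (n = 37).
wppsi_14th = ScreeningCounts(true_pos=2, false_pos=4, false_neg=0, true_neg=31)

if __name__ == "__main__":
    print("PPV:", as_percent(wppsi_14th.positive_predictive_value()))
    print("NPV:", as_percent(wppsi_14th.negative_predictive_value()))
    print("Overreferral:", as_percent(wppsi_14th.overreferral_rate()))
    print("Underreferral:", as_percent(wppsi_14th.underreferral_rate()))
```

Because the referral rates are divided by the total sample rather than by an outcome group, they depend on how common poor outcomes are in the sample, which is one reason values from samples of different composition are not directly comparable.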
Compared to the present study, which had a sample of 37 children and a mean follow-up interval of 16.6 months, Miller (1986) had a much larger sample (338 children) and followed them over a much longer time interval (4 years).

The 5th percentile is one of the MAP's recommended cutpoints. At the 5th percentile cutpoint, the findings of Miller's (1986) study and this study appeared to follow the same trend with low sensitivity levels (21% vs. 0%, respectively), but acceptable levels of accuracy for specificity (97% vs. 100%, respectively), overreferral rates (2.4% vs. 0%, respectively) and underreferral rates (7.9% vs. 5%, respectively).

The 25th percentile is the MAP's second recommended cutpoint. However, the 25th percentile is not an attainable score on the MAP Total Score. The next attainable percentile score below the 25th percentile cutpoint is the 20th percentile. Therefore, Miller's (1986) study used an actual cutpoint of the 20th percentile while the present study used a cutpoint of the 26th percentile (which was chosen because the 26th percentile is closer to the 25th percentile cutpoint). At the 20th and 26th percentiles, the findings of the two studies demonstrated wider variation than at the 5th percentile cutpoint. The sensitivity levels, in particular, differed from Miller's (1986) study, which reported a sensitivity of 59%, whereas the present study attained a sensitivity of 100%. The other findings between Miller's (1986) study and the present study vary less, with Miller's study reporting an acceptable level of specificity at 80% vs. 71% for this study, as well as an acceptable overreferral rate at 18% vs. 27% for this study. The underreferral rates were similar between the two studies with Miller's (1986) study attaining 4.1% vs. 0% for the present study.

There are several likely reasons for the wider variation in predictive accuracy values between the two studies at the two higher cutpoints.
First of all, more variation would be expected given that the two studies used two different cutpoints, with Miller's (1986) study actually using the 20th percentile and the present study using the 26th percentile. Secondly, Miller's (1986) study had a much longer follow-up interval at 4 years vs. 16.6 months for the current study. Given potentially confounding factors such as history and maturation, higher sensitivity values would be expected for a study with a shorter follow-up interval. Thirdly, the difference between the sample sizes would have an effect with Miller's (1986) study having a much larger sample size of 338 and the present study having a sample size of only 37. The predictive accuracy values, therefore, would have a much finer gradation in Miller's (1986) study and less vulnerability to extreme scores. For example, in the present study, only 2 of 37 participants had a poor outcome on the WPPSI-R; therefore, the sensitivity values of the MAP, when using the WPPSI-R as the outcome measure, could only be 0%, 50% or 100%.

The difference between the two studies' samples may also explain some of the variation in the findings. Of Miller's (1986) sample of 338 participants, 309 participants were reported to come from the "normal" population and 29 from an "at-risk" population (p. 175), compared to the current study of 37 participants with prenatal drug exposure. Therefore, the present study had a sample which would be considered to be more at-risk for later learning difficulties whereas Miller's (1986) sample would more likely have a normal distribution. Given the differences between the two samples, the likelihood of differences between findings of the two studies is increased.

TERA-2

One of the other highest levels of predictive accuracy of the MAP overall was attained when using the TERA-2 as the outcome measure and the MAP's 14th percentile as the cutpoint.
At this cutpoint, acceptable levels of predictive accuracy were attained for the specificity, negative predictive value, overreferral rate and underreferral rate. However, acceptable levels of accuracy were not attained for the sensitivity (50%) and the positive predictive value (33%), indicating that the MAP was not very effective in correctly identifying children who would have later difficulty with early reading behavior. The MAP approached an acceptable level of sensitivity at 75% when using the 26th percentile as the cutpoint. However, at this cutpoint, 26% of the population would have to be identified as at-risk in order to screen for the relative few who would later have difficulty in early reading behavior. Furthermore, at this cutpoint the positive predictive value is still low at 25% and the overreferral rate was 27%, indicating that there was a high number of false positives. This suggests that the 26th percentile cutpoint does not demonstrate overall good predictive accuracy, despite its improvement in sensitivity. In Miller's (1986) study, the predictive validity of the MAP Total Score relative to reading ability was examined using the W-J - Reading subtest (Woodcock & Johnson, 1977) and report card grades in reading as outcome measures. At the 5th percentile cutpoint, the findings between the two outcomes in Miller's (1986) study and the present study again appeared to follow the same trend with the sensitivity levels being low at 20% (W-J) and 25% (report card grade) vs. 0% for this study. Acceptable levels were attained for the specificity at 97% (W-J and report card grade) vs. 100% for this study; the overreferral rate at 2.4% (W-J) and 2.96% (report card grade) vs. 0% for this study; and the underreferral rate at 8.3% (W-J) and 4.4% (report card grade) vs. 12% for this study. 
At the higher cutpoints (20th and 26th percentiles), only the underreferral rates were similar between the two outcomes in Miller's (1986) study and the present study with acceptable levels of accuracy reported at 4.7% (W-J) and 2.37% (report card grade) vs. 3% for this study. Otherwise, the two outcomes in Miller's (1986) study also approached or attained acceptable levels of accuracy for the specificity at 80% (W-J) and 78% (report card grade) vs. 69% for this study, and attained acceptable overreferral rates at 18% (W-J) and 20.4% (report card grade) vs. 27% for this study. However, the sensitivity was lower for the two outcomes in Miller's (1986) study at 54% (W-J) and 60% (report card grade) vs. 75% for this study. Therefore, despite Miller's (1986) report that reading, as measured by report card grades, appears to be the best outcome criterion on which the MAP is least likely to miss children, the findings overall between the two studies seem to suggest that the MAP is better at identifying children who will not have difficulty later with reading than identifying those who will have difficulty. Furthermore, the 14th percentile MAP cutpoint appears to demonstrate higher levels of predictive accuracy when using the TERA-2 as the outcome for assessing reading abilities. The predictive accuracy values for the two reading outcome measures in Miller's (1986) study and the reading outcome measure in the present study are shown in Tables 8 and 9.
Table 8
Predictive Accuracy Values for Reading Outcomes in Miller's (1986) Study and the Present Study (using the 5th Percentile Cutpoint)

                             Miller's (1986) reading outcomes   Present study reading outcome
Predictive accuracy values   W-J      Report card grade         TERA-2
Sensitivity                  20%      25%                       0%
Specificity                  97%      97%                       100%
Overreferral rate            2.4%     2.96%                     0%
Underreferral rate           8.3%     4.4%                      12%

Table 9
Predictive Accuracy Values for Reading Outcomes in Miller's (1986) Study (using the 20th Percentile Cutpoint) and the Present Study (using the 26th Percentile Cutpoint)

                             Miller's (1986) reading outcomes   Present study reading outcome
Predictive accuracy values   W-J      Report card grade         TERA-2
Sensitivity                  54%      60%                       75%
Specificity                  80%      78%                       69%
Overreferral rate            18%      20.4%                     27%
Underreferral rate           4.7%     2.37%                     3%

PPVT-R

Using the PPVT-R as the outcome measure, the MAP was not very successful at predicting later receptive vocabulary. The 17th percentile MAP cutpoint demonstrated the highest level of predictive accuracy overall with acceptable levels of accuracy attained for the negative predictive value and the underreferral rate. The specificity and the overreferral rate approached acceptable levels of accuracy at 76% and 22%, respectively. However, the sensitivity and positive predictive value were unacceptably low at 33% and 11%, respectively. These findings suggest that once again, the MAP was much more successful at identifying children who would not have later difficulty with receptive vocabulary, than identifying those who would have difficulty.

In Miller's (1986) study, the predictive validity of the MAP Total Score relative to language ability was assessed using the W-J - Language subtest (Woodcock & Johnson, 1977) and report card grades in language. Once again, the reported sensitivity, specificity, overreferral rates and underreferral rates were extremely similar across both outcome measures used in Miller's (1986) study.
At the 5th percentile cutpoint, the findings of Miller's (1986) study and the present study again follow the same trend, with the sensitivity being low at 23% (W-J) and 20% (report card grade) vs. 0% for this study. Acceptable levels of predictive accuracy were attained for the specificity at 97% (W-J and report card grade) vs. 100% for this study; the overreferral rate at 2.4% (W-J) and 2.9% (report card grade) vs. 0% for this study; and the underreferral rate at 6.8% (W-J) and 5.9% (report card grade) vs. 8% for this study.

At the higher cutpoints (20th and 26th percentiles), only the underreferral rates were very similar between Miller's (1986) study and the present study at 4.1% (W-J) and 3.6% (report card grade) vs. 6% for this study. Otherwise, Miller's (1986) study approached acceptable specificity levels at 79% (W-J) and 78% (report card grade) vs. 64% for this study, and attained acceptable overreferral rates at 19.2% (W-J) and 20% (report card grade) vs. 33% for this study. Both studies demonstrated low sensitivity levels at 53% (W-J) and 52% (report card grade) for Miller's (1986) study and 33% for this study.

Overall, in neither study was the MAP successful at identifying children who would later develop language difficulties, but in both it was more successful at identifying those who would not have difficulties. However, the more broadly based language outcome measures in Miller's (1986) study, the W-J - Language subtest and the language report card grades, allowed for higher levels of MAP predictive accuracy than did the PPVT-R measure of receptive vocabulary used as the outcome criterion in the present study. The predictive accuracy values for the two language outcome measures in Miller's (1986) study and the language outcome measure in the present study are shown in Tables 10 and 11.
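One way to see why a screen can rule out later difficulty more convincingly than it can rule it in is to look at how positive and negative predictive values depend on the proportion of children who actually have a poor outcome. The sketch below applies the standard relationship (Bayes' rule) to the sensitivity and specificity reported for the PPVT-R at the 26th percentile cutpoint (33% and 64%). The function name and the base rates are illustrative assumptions, not values reported in the study.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, base_rate: float
) -> Tuple[float, float]:
    """Return (positive predictive value, negative predictive value).

    base_rate is the assumed proportion of children with a poor outcome.
    """
    true_pos = sensitivity * base_rate
    false_neg = (1 - sensitivity) * base_rate
    true_neg = specificity * (1 - base_rate)
    false_pos = (1 - specificity) * (1 - base_rate)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported for the PPVT-R at the 26th percentile
# cutpoint; the base rates are hypothetical and chosen only for illustration.
for base_rate in (0.05, 0.10, 0.25):
    ppv, npv = predictive_values(0.33, 0.64, base_rate)
    print(f"base rate {base_rate:.0%}: PPV {ppv:.0%}, NPV {npv:.0%}")
```

With a low base rate, most children flagged as at-risk are false positives even when specificity is moderate, so the negative predictive value stays high while the positive predictive value stays low.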
Table 10
Predictive Accuracy Values for Language Outcomes in Miller's (1986) Study and the Present Study (using the 5th Percentile Cutpoint)

                              Miller's (1986) language outcomes   Present study language outcome
Predictive accuracy values    W-J       Report card grade         PPVT-R
Sensitivity                   23%       20%                       0%
Specificity                   97%       97%                       100%
Overreferral rate             2.4%      2.9%                      0%
Underreferral rate            6.8%      5.9%                      8%

Table 11
Predictive Accuracy Values for Language Outcomes in Miller's (1986) Study (using the 20th Percentile Cutpoint) and the Present Study (using the 26th Percentile Cutpoint)

                              Miller's (1986) language outcomes   Present study language outcome
Predictive accuracy values    W-J       Report card grade         PPVT-R
Sensitivity                   53%       52%                       33%
Specificity                   79%       78%                       64%
Overreferral rate             19.2%     20%                       33%
Underreferral rate            4.1%      3.6%                      6%

VMI

Using the VMI as the outcome measure, the MAP also demonstrated relatively high overall levels of predictive accuracy, particularly when the 26th percentile cutpoint was used. At this cutpoint, the MAP demonstrated acceptable levels of accuracy for sensitivity, negative predictive value, overreferral rate and underreferral rate. The specificity approached an acceptable level at 77%, while the positive predictive value was low at 42%. However, the 26th percentile cutpoint is not a particularly efficient cutpoint to use in general, in that 26% of the population would be identified as at-risk in order to correctly identify the relative few who would later develop difficulty with visual-motor integration. Furthermore, the 14th and 17th percentile cutpoints demonstrated the same trend as the 26th percentile cutpoint. The sensitivity was only 50% at these two cutpoints, but an acceptable level of specificity was attained.

In Miller's (1986) study, evaluation using clinical epidemiological techniques was not conducted with the VMI, but predictive relationships were examined using correlation coefficients.
The correlations between the MAP Total Score and the VMI were significant but quite low (r = .10 to .21), indicating that the predictive relationship between the MAP and the VMI was very weak (Miller, 1986).

Clinical Profiles of Test Scores

This sample of high-risk children performed better, in general, on the measure of intelligence (WPPSI-R) than is typically reported in the literature for children with prenatal drug exposure. In the present study, the children's performance on the WPPSI-R approximated a normal distribution with a mean of 100.16 and SD of 11.67, whereas it is commonly reported that children with prenatal drug exposure tend to perform in the lower end of the normal range on developmental or intelligence measures (Hans, 1989; Howard et al., 1989; van Baar & de Graaff, 1994; Wilson et al., 1979). The children's performance on the PPVT-R also approximated a normal distribution with a mean of 98.33 and SD of 9.42, a finding which also conflicts with literature reports of tendencies toward language impairments in children with prenatal drug exposure (Fulks & Harris, 1995; van Baar, 1990; van Baar & de Graaff, 1994; Wilson, 1989).

The children's performance in early reading, as measured by the TERA-2, and in visual-motor integration, as measured by the VMI, is more consistent with the published literature, which reports performance in the low normal range on cognitive or developmental tests. This sample had a mean of 94.94 on the TERA-2 with a SD of 10.85, while on the VMI they had a mean of 94.84 and a SD of 10.01. These results may provide some support for the possibility of specific learning difficulties for children with prenatal drug exposure (Davis & Templer, 1988; Wilson, 1989; Wilson et al., 1979).

The children in this sample attained a median score of 40 on the MAP, which does not fall as close to a normal distribution as did their performance on the WPPSI-R, the test for which the MAP most accurately predicted outcome.
However, given that the MAP is a more broadly based test than the WPPSI-R, in that it includes a large number of motor items in addition to cognitive items, a median Total Score of 40 may reflect overall average cognitive development while indicating the possibility of more specific weaknesses, such as that reflected by the children's performance on the VMI. Fulks and Harris (1995) found that their sample of children with prenatal drug exposure who were tested on the MAP had the most difficulty with the MAP's motor and language items, while performing within the average range on the non-verbal cognitive items.

Limitations

The limitations in studying children with prenatal drug exposure as a whole are overwhelming. It is very difficult to determine the exact drugs used by the mother, and virtually impossible to determine the exact timing, exposure and frequency of drug use (Householder et al., 1982; Lindenberg, Alexander, Gendrop, Nencioli & Williams, 1991; Rodning et al., 1989; Schutter & Blinker, 1992). Furthermore, there is widespread incidence of polydrug abuse (Lindenberg et al., 1991). Mothers who abuse drugs tend to receive little or no prenatal care, and their health and nutritional status are often poor (Lindenberg et al., 1991; Schutter & Blinker, 1992). The pre- and perinatal periods are often complicated by medical problems such as premature labor and delivery, dysmaturity and low birth weight (Chasnoff, Griffith, MacGregor, Dirkes & Burns, 1989; Lindenberg et al., 1991; Schutter & Blinker, 1992; Weston, Ivins, Zuckerman, Jones & Lopez, 1989). These issues present confounding factors which make it difficult to be certain that the outcomes of children with prenatal drug exposure are due solely to the abuse of a specific drug or drugs in pregnancy.

Issues regarding the home environment may also confound research findings.
Families with parent(s) who abuse substances often have a transient lifestyle, which makes it difficult to conduct longitudinal studies of their children (Howard et al., 1989; Lindenberg et al., 1991). Low socioeconomic status, which itself is a powerful predictor of poor future outcome (Werner, 1986), is common in families with substance abuse (Lindenberg et al., 1991). Furthermore, difficulties with parenting may be present and may adversely influence the developmental outcome of children with prenatal drug exposure (Weston et al., 1989).

Lastly, because of the difficulties in conducting longitudinal research of children with prenatal drug exposure, the availability of participants is generally limited, precluding random selection (Householder et al., 1982; Lindenberg et al., 1991). Therefore, studies are generally limited in their population scope (Lindenberg et al., 1991) and in their generalizability (Domholdt, 1993).

The limitations specific to this study are many. First, this sample of 37 children is small and likely not very representative of its parent population: a larger sample tends to be more representative of its population than a smaller sample (Domholdt, 1993), and these participants were a sample of convenience and thus are not representative of the population of children with prenatal drug exposure as a whole. Specifically, the children in this study represented only 37 (24%) of 154 potential participants. The original list of 154 participants was compiled from review of medical records of children with prenatal drug exposure, who were followed longitudinally from 1988 to 1994 by the ICAR Program at SHHCC in Vancouver, BC. The largest group of potential participants (47%) did not participate because of ICAR Program attrition.
Furthermore, 9% of the potential participants were eliminated because they were unable to complete the MAP, thus eliminating a group of children who were more likely to be lower functioning than the rest of the potential participants. Most of the children (81%) who participated in the study were living with adoptive and foster parents, and therefore were not likely to be living in an environment with drug abuse or to be dealing with issues of poverty, which are two commonly reported living conditions of children with prenatal drug exposure (Howard et al., 1989; Lindenberg et al., 1991). Most of the children (70%) also attended school or daycare, so they likely had some experience with the types of tasks requested and with a structured setting. Lastly, because the study required several hours of testing over one full day or two half days, this sample of children may have been biased toward those with motivated parents who were willing to take the time to have this testing done.

Therefore, this sample of children with prenatal drug exposure is likely exhibiting higher cognitive function than its parent population. This has ramifications for the study's findings in that the results may be skewed and the generalizability limited. Specifically, examination of the predictive validity of the MAP, with regard to this population, is limited to those who were higher functioning (i.e., able to complete the MAP), and who experienced similar living environments. Furthermore, because this sample is likely higher functioning than its parent population, it is also more apt to be comparable to the test norms of the outcome measures and the MAP than to the parent population.

Because there were only a few children with poor outcomes on most of the outcome measures, another limitation may be that those children who performed poorly were among the 30% who did not attend school or daycare, or perhaps were relatively younger than the other participants.
However, this generally does not appear to be the case in this sample of children with prenatal drug exposure. For example, both of the children who had difficulty on the WPPSI-R were among the oldest in the sample at 65 and 66 months. Both were currently enrolled in kindergarten and both had previously attended preschool (one had been in a special needs placement). Both of the children were also adopted. These two particular participants also had difficulty on the VMI, and one had further difficulty on the TERA-2.

Of the nine other participants who had difficulty on the other three outcome measures, eight had difficulty on only one of these tests. Of these eight children, six had attended preschool and one had attended daycare. Four of the children were currently enrolled in kindergarten. Five of the children were adopted while two were in foster care. One child lived with his father and stepmother. One final participant had difficulty on both the TERA-2 and the VMI. This child did not attend preschool and was two months away from kindergarten enrollment. However, in lieu of preschool enrollment, her foster mother had chosen to place this child in various community activities, such as library storytelling, skating and swimming. The family was planning to adopt this child.

Another issue is the validity of the MAP across the different age groups, about which there is some disagreement. Miller and Schouten (1988) reported that the MAP demonstrated the strongest predictive accuracy between the ages of 3 and 4 years. If this is the case, this study used the best possible age group in that the MAP was administered between 35 and 49 months. However, this limits the generalizability of the findings to this particular age group, and the findings possibly represent the best the MAP has to offer in terms of its predictive accuracy.
It also limits the comparability of the findings of this study with those of Miller's (1986) study, in which the MAP was administered across the full age range of 33 to 68 months. On the other hand, Schouten and Kirkpatrick (1991) have recently reported that the test performs most poorly at the lowest age levels and inconsistently across the other age groups. Therefore, the issue of the predictive accuracy of the MAP across the different age groups has not been resolved.

Other limitations of the study are the relatively short follow-up interval between the predictor and outcome measures, and the broad age ranges over which the predictor and outcome measures were taken. The mean follow-up interval between the predictor and outcome measures in this study was 16.6 months, with the predictor measure (MAP) administered to the children between the ages of 35 and 49 months and the outcome measures administered between the ages of 47 and 68 months. This clearly does not meet Satz and Fletcher's (1988) criterion that there should be at least a 3 year follow-up interval between the predictor and outcome measures. Higher predictive accuracy values are more likely to be attained with shorter follow-up intervals because confounding factors such as history, maturation and participant attrition would be less influential. Therefore, those children in this sample who had shorter follow-up intervals would more likely be placed in the correct outcome category than those with longer follow-up intervals. Furthermore, the broad age ranges over which the tests were administered make it difficult to investigate specific age-related issues regarding the performance of the MAP, and to make any definitive statement about later outcome at a specific age.

Because neither the psychologist nor the occupational therapist was blinded to the children's prenatal drug history or to results of assessments at previous follow-up visits, i.e.
on the Bayley Scales of Infant Development (Bayley, 1969) or the MAP, there is a risk of expectation bias. There also was no control group of children not exposed to drugs prenatally with which to compare this sample's test performance. Lastly, because the WPPSI-R was the first of the three tests administered by the psychologist (the others being the TERA-2 and PPVT-R), the children's relatively better performance on the WPPSI-R may reflect that it was completed first and therefore was not as subject to factors such as fatigue as the other tests may have been (particularly the TERA-2, on which the children performed the most poorly). However, this did not hold true for the MAP and the VMI, which were administered by the occupational therapist: the MAP was prioritized, but the children attained scores on both tests which were lower than those of the normative samples.

Summary and Conclusions

This study investigated the predictive validity of the MAP relative to later cognitive performance in preschool-aged children with prenatal drug exposure, using four psychological outcome measures. The MAP demonstrated the highest level of predictive accuracy when using the WPPSI-R as the outcome criterion measure, suggesting that it was more successful in identifying later intelligence in children with prenatal drug exposure than in identifying later reading behavior, receptive vocabulary and visual-motor integration. Given that the MAP is a screening test and therefore takes less time to administer than the WPPSI-R, and given that OT personnel are less costly than psychologists, the MAP may be an effective and more cost-efficient means of screening for later intelligence problems in children with prenatal drug exposure.

The MAP was less successful in identifying children with more specific cognitive problems in areas such as early reading behavior, receptive vocabulary and visual-motor integration.
It was much better at identifying children who did not have poor outcomes than at identifying those who did have poor outcomes. This finding does not reflect particularly well on the MAP, given that the aim of the screening process is to identify children who are at-risk for poor developmental outcomes, particularly those at-risk for later learning difficulties (Schouten & Kirkpatrick, 1993). From a clinician's perspective, it is important to be aware that the MAP did better at predicting which children will not have specific learning difficulties than it did at predicting which children will have specific learning difficulties, and that it appeared to best predict general cognitive function.

Clinically, it may be more practical and accurate to consider using the 14th percentile as the cutpoint for determining at-risk status as opposed to the recommended 5th and 25th percentile cutpoints, because the 14th percentile cutpoint was found to generally demonstrate higher levels of predictive accuracy. However, future research is necessary to determine if this finding is the same across other population groups. Although using the 26th percentile cutpoint resulted in more acceptable levels of sensitivity, particularly for the outcome measures of the TERA-2 and VMI, it was at the expense of the specificity and the overreferral rate. Furthermore, it is not practical from a fiscal or resource perspective to identify 26% of the population as at-risk for later learning problems. There was no support for the 5th percentile cutpoint in this study in that the predictive accuracy values were very unbalanced at this cutpoint (e.g., 0% sensitivity but 100% specificity).

This study also compared the sample children's performance to each test's norms. The children in this study performed in the average range and their scores approximated a normal distribution on measures of intelligence and receptive vocabulary.
While measures of early reading ability and visual-motor integration were also within the average range, the mean performance of the children in this study was lower than the mean of the normative samples. In general, children with prenatal drug exposure tend to perform within the average cognitive range. Therefore, a clinician must be vigilant in looking for subtle cognitive weaknesses.

With regard to future research, this study should be replicated with a much larger sample, randomly selected if possible, and should include a nonexposed control group, in order to strengthen the generalizability of the findings to the population of children with prenatal drug exposure. Blinding of examiners to previous test results would also help to control for expectation biases. Use of Satz and Fletcher's (1988) recommended 3 year follow-up interval would strengthen the value of the predictive accuracy. Use of narrower age groupings for the predictor and the outcome measures would not only provide more specific age-related information about the performance of the MAP but would also allow for additional evaluation with regard to specific age groups (e.g., whether evaluation at age 3 accurately predicts outcome at age 6). Furthermore, the predictive validity of the MAP should be compared to the predictive validity of other preschool assessment tools. This would provide information about the value of the MAP's predictive validity relative to other instruments.

Given the controversy over the definition and early identification of LD, particularly with regard to the preschool-aged child, it seems next to impossible for a test to accurately predict which child will later be labeled as LD.
Furthermore, given the complexity of factors that contribute to the developmental outcome of a child, including the home environment (Hartzell & Compton, 1984; Keogh & Sears, 1991; Werner, 1986) and behavioral and experiential factors (Paget, 1990), it seems unrealistic to expect one test to account for all potential risk factors. Clinicians must be cognizant of all potentially influential factors and attempt to focus on the whole child rather than on a single test score. In addition to using a standardized test such as the MAP, an evaluator needs to use sound clinical judgment and to consider all relevant factors in determining the at-risk status of a child.

References

Altepeter, T., & Handal, P. J. (1985). A factor analytic investigation of the use of the PPVT-R as a measure of general achievement. Journal of Clinical Psychology, 41(4), 540-543.
Aylward, G. P., Gustafson, N., Verhulst, S. J., & Colliver, J. A. (1987). Consistency in the diagnosis of cognitive, motor, and neurologic function over the first three years. Journal of Pediatric Psychology, 12(1), 77-98.
Ayres, A. J. (1972). Southern California Sensory Integration Tests. Los Angeles, CA: Western Psychological Corporation.
Badian, N. (1988). The prediction of good and poor reading before kindergarten entry: A nine-year follow-up. Journal of Learning Disabilities, 21(2), 98-103.
Banus, B. (1983). The Miller Assessment for Preschoolers (MAP): An introduction and review. American Journal of Occupational Therapy, 37(5), 333-340.
Bauer, A. M. (1991). Drug and alcohol exposed children: Implications for special education for students identified as behaviorally disordered. Behavioral Disorders, 17(1), 72-79.
Bayley, N. (1969). Bayley Scales of Infant Development. New York, NY: Psychological Corporation.
Beery, K. E. (1989). Developmental Test of Visual-Motor Integration (3rd Revision): Administration, Scoring, and Teaching Manual. Cleveland, OH: Modern Curriculum Press.
Bender, L. (1938).
Bender Visual Motor Gestalt Test for Children. New York, NY: American Orthopsychiatric Association.
Bernheimer, L. T., & Keogh, B. K. (1988). The stability of cognitive performance of developmentally delayed children. American Journal of Mental Deficiency, 92(6), 539-542.
Brown, V. L., Hammill, D. D., & Wiederholt, J. L. (1986). Test of Reading Comprehension. Austin, TX: PRO-ED.
Burks, H. (1975). Burks Behavior Rating Scales. Los Angeles, CA: Western Psychological Services.
Capute, A. J., & Accardo, P. J. (1978). Linguistic and auditory milestones during the first two years of life. Clinical Pediatrics, 17, 847-853.
Carran, D. T., & Scott, K. G. (1992). Risk assessment in preschool children: Research implications for the early detection of educational handicaps. Topics in Early Childhood Special Education, 12(2), 196-211.
Carta, J. J., Sideridis, G., Rinkel, P., Guimaraes, S., Greenwood, C., Baggett, K., Peterson, P., Atwater, J., McEvoy, M., & McConnell, S. (1994). Behavioral outcomes of young children prenatally exposed to illicit drugs: Review and analysis of experimental literature. Topics in Early Childhood Special Education, 14(2), 184-216.
Chasnoff, I. J., Griffith, D. R., MacGregor, S., Dirkes, K., & Burns, K. A. (1989). Temporal patterns of cocaine use in pregnancy. Journal of the American Medical Association, 261(12), 1741-1744.
Cohn, S. H. (1986). An analysis of the predictive validity of the Miller Assessment for Preschoolers in a suburban public school district. Unpublished doctoral dissertation, University of Denver, Denver, CO.
Daniels, L. (1990). The Miller Assessment for Preschoolers: Analysis of score patterns for children with developmental delays. Canadian Journal of Occupational Therapy, 57(4), 205-210.
Daniels, L., & Bressler, S. (1990). The Miller Assessment for Preschoolers: Clinical use with children with developmental delays. American Journal of Occupational Therapy, 44(1), 48-53.
Davis, D., & Templer, D. (1988).
Neurobehavioral functioning in children exposed to narcotics in utero. Addictive Behaviors, 13, 275-283.
Day, N. L., & Richardson, G. A. (1994). Comparative teratogenicity of alcohol and other drugs. Alcohol Health and Research World, 18(1), 42-48.
de Cubas, M. M., & Field, T. (1993). Children of methadone-dependent women: Developmental outcomes. American Journal of Orthopsychiatry, 63(2), 266-276.
de Hirsch, I., Jansky, J., & Langford, W. (1966). Predicting reading failure. New York, NY: Harper & Row.
Di Pasquale, G., Moule, A., & Flewelling, R. (1980). The birthdate effect. Journal of Learning Disabilities, 13, 234-238.
Domholdt, E. (1993). Physical Therapy Research: Principles and Applications. Philadelphia, PA: W. B. Saunders Co.
Dorland's pocket medical dictionary (23rd ed.). Philadelphia, PA: W. B. Saunders Company.
Dunn, L. M. (1959). Peabody Picture Vocabulary Test. Circle Pines, MN: American Guidance Service.
Dunn, L. M., & Dunn, L. M. (1981). Peabody Picture Vocabulary Test - Revised Manual. Circle Pines, MN: American Guidance Service.
Feinstein, A. R. (1985). Clinical epidemiology: The architecture of clinical research. Philadelphia, PA: W. B. Saunders Co.
Felton, R. H. (1992). Early identification of children at risk for reading disabilities. Topics in Early Childhood and Special Education, 12(2), 212-229.
Fletcher, R. H., Fletcher, S. W., & Wagner, E. H. (1988). Clinical epidemiology: The essentials. Baltimore, MD: Williams & Wilkins.
Fletcher, J., & Satz, P. (1980). Development changes in the neuropsychological correlates of reading achievement: A six-year longitudinal follow-up. Journal of Clinical Neuropsychology, 2(1), 23-37.
Frankenburg, W. K., & Dodds, J. B. (1973). Revised Denver Developmental Screening Test. Denver, CO: Ladoca Project and Publishing Foundation.
Frostig, M., Maslow, P., Lefever, D., & Whittlesey, J. (1964). The Marianne Frostig Developmental Test of Visual Perception. Perceptual and Motor Skills, 19,
463-499.
Fulks, M. L., & Harris, S. R. (1995). Children exposed to drugs in utero: Their scores on the Miller Assessment for Preschoolers. Canadian Journal of Occupational Therapy, 62(1), 7-15.
Greer, J. V. (1990). The drug babies. Exceptional Children, 56, 382-384.
Hammill, D. D., & Leigh, J. E. (1983). Basic School Skills Inventory: Diagnostic. Austin, TX: PRO-ED.
Hans, S. L. (1989). Developmental consequences of prenatal exposure to methadone. Annals of the New York Academy of Sciences, 562, 195-207.
Haring, K. A., Lovett, D. L., Haney, K. F., Algozzine, B., Smith, D. D., & Clarke, J. (1992). Labeling preschoolers as learning disabled: A cautionary position. Topics in Early Childhood and Special Education, 12(2), 151-173.
Harris, S. R., Eaves, L., & Fulks, M. (1994). Validity of the Miller Assessment for Preschoolers in detecting learning disorders in preschool-aged children with prenatal drug and/or alcohol exposure. Final report.
Hartzell, H. E., & Compton, C. (1984). Learning disability: A ten-year follow-up. Pediatrics, 74(6), 1058-1064.
Householder, J., Hatcher, R., Burns, W., & Chasnoff, I. (1982). Infants born to narcotic-addicted mothers. Psychological Bulletin, 92, 453-468.
Howard, J., Beckwith, L., Rodning, C., & Kropenske, V. (1989). The development of young children of substance abusing parents: Insights from seven years of intervention and research. Zero to Three, 9(5), 8-12.
Humphry, R., & King-Thomas, L. (1993). A response and some facts about the Miller Assessment for Preschoolers (Commentary). Occupational Therapy Journal of Research, 13(1), 34-49.
Jastak, J. F., & Jastak, S. (1976). Wide Range Achievement Test (Manual). Wilmington, DE: Jastak Associates, Inc.
Jones, B. F., Palincsar, A., Ogle, D., & Carr, E. (1987). Strategic teaching and learning: Cognitive instruction in the content areas. Alexandria, VA: Association for Supervision and Curriculum Development in cooperation with the North Central Regional Educational Laboratory.
Kaltenbach, K. A., & Finnegan, L. P. (1989). Prenatal narcotic exposure: Perinatal and developmental effects. NeuroToxicology, 10, 597-604.
Kavale, K. (1990). Variances and verities in learning disability interventions. In T. Scruggs & B. Wong (Eds.), Intervention in learning disabilities (pp. 466-485). San Diego, CA: Academic Press.
Keogh, B. (1987). A shared attribute model of learning disabilities. In S. Vaughn & C. Bos (Eds.), Research in learning disabilities (pp. 3-18). Boston: College Hill Press.
Keogh, B. K., & Sears, S. (1991). Learning disabilities from a developmental perspective: Early identification and prediction. In B. Y. Wong (Ed.), Learning about learning disabilities (pp. 485-503). San Diego, CA: Academic Press, Inc.
Kirk, S., & Chalfant, J. (1984). The learning disabled child. Developmental and academic learning disabilities. Denver, CO: Love Publishing Co.
Kirk, S. A., McCarthy, J. J., & Kirk, W. D. (1968). Illinois Test of Psycholinguistic Abilities: Revised Edition. Urbana, IL: University of Illinois Press.
Kirkpatrick, L. A., & Schouten, P. G. (1993). Authors reply to commentaries. Occupational Therapy Journal of Research, 13(1), 50-61.
Koppitz, E. (1973). Special class pupils with learning disabilities: A five-year follow-up study. Academic Therapy, 8, 133-140.
Kronstadt, D. (1989). Pregnancy and cocaine addiction: An overview of impact and treatment. (The Center For Child and Family Studies: A Report From the Drug-Free Pregnancy Project). San Francisco, CA: Far West Laboratory For Educational Research and Development.
Lemerand, P. (1985). Predictive validity of the Miller Assessment for Preschoolers. Unpublished doctoral dissertation, University of Michigan, Ann Arbor.
Lerner, J. W. (1993). Learning disabilities: Theories, diagnosis, and teaching strategies (6th ed.). Boston, MA: Houghton Mifflin Co.
Lesiak, J. (1984). The Bender Visual Motor Gestalt Test: Implications for the diagnosis and prediction of reading achievement.
Journal of School Psychology, 22, 391-405.
Lewis, K. D., Schmeder, N. H., & Bennett, B. (1992). Maternal drug abuse and its effects on young children. American Journal of Maternal Child Nursing, 17(4), 198-203.
Lindenberg, C. S., Alexander, E. M., Gendrop, S. C., Nencioli, M., & Williams, D. G. (1991). A review of the literature on cocaine abuse in pregnancy. Nursing Research, 40(2), 69-75.
Loock, C. A., Kinnis, C., Selwood, B., Robinson, G. C., Segal, S., Blatherwick, F. J., & Armstrong, R. W. (1993). Targeting high risk families: Prenatal alcohol/drug abuse and infant outcomes. Unpublished manuscript, University of British Columbia, Department of Paediatrics, Faculty of Medicine, Vancouver, BC.
McCance-Katz, E. F. (1991). The consequences of maternal substance abuse for the child exposed in utero. Psychosomatics, 32, 268-274.
McCarthy, D. (1972). McCarthy Scales of Children's Abilities. New York, NY: Psychological Corporation.
Meisels, S. (1988). Developmental screening in early childhood. Annual Reviews in Public Health, 9, 527-550.
Meisels, S. J., & Wasik, B. A. (1990). Who should be served? Identifying children in need of early intervention. In S. J. Meisels & J. P. Shonkoff (Eds.), Handbook of early childhood intervention. Cambridge, MA: Cambridge University Press.
Miller, L. J. (1982). Miller Assessment for Preschoolers manual. Littleton, CO: KID Foundation.
Miller, L. J. (1986). The predictive validity of the Miller Assessment for Preschoolers: A four year study. Unpublished doctoral dissertation, University of Denver, Denver, CO.
Miller, L. J. (1987a). Longitudinal validity of the Miller Assessment for Preschoolers: Study I. Perceptual and Motor Skills, 65, 211-217.
Miller, L. J. (1987b). Response to 'A critique of the standardization of the Miller Assessment for Preschoolers' (Letter to the editor). American Journal of Occupational Therapy, 41, 537-538.
Miller, L. J. (1988a).
Differentiating children with school-related problems after four years using the Miller Assessment for Preschoolers. Psychology in the Schools, 25, 10-15.
Miller, L. J. (1988b). Longitudinal validity of the Miller Assessment for Preschoolers: Study II. Perceptual and Motor Skills, 66, 811-814.
Miller, L. J. (1988c). Miller Assessment for Preschoolers: Manual (rev. ed.). San Antonio, TX: Psychological Corp.
Miller, L. J. (1990). An overview of the predictive validity of the Miller Assessment for Preschoolers: A four year study (Dissertation abstract). Physical and Occupational Therapy, 10(1), 101-102.
Miller, L. J. (1993). Response to 'Questions and concerns about the Miller Assessment for Preschoolers'. Occupational Therapy Journal of Research, 13(1), 29-33.
Miller, L. J., Lemerand, P. A., & Cohn, S. H. (1987). A summary of three predictive studies with the MAP (Brief report). Occupational Therapy Journal of Research, 7(6), 378-381.
Miller, L. J., Lemerand, P. A., & Schouten, P. G. (1990). Interpreting evidence of predictive validity for developmental screening tests. Occupational Therapy Journal of Research, 10(2), 74-86.
Miller, L. J., & Schouten, P. G. (1988). Age-related effects on the predictive validity of the Miller Assessment for Preschoolers. Journal of Psychoeducational Assessment, 6, 99-106.
Mutti, M., Sterling, H., & Spalding, N. (1978). Quick Neurological Screening Test. Novato, CA: Academic Therapy Publications.
Naglieri, J. A., & Pfeiffer, S. I. (1983). Stability, concurrent and predictive validity of the PPVT-R. Journal of Clinical Psychology, 39(6), 965-967.
Osborn, J. A., Harris, S. R., & Weinberg, J. (1993). Fetal alcohol syndrome: Review of the literature with implications for physical therapists. Physical Therapy, 73, 599-607.
Paget, K. D. (1990). Assessment of intellectual competence in preschool-age children: Conceptual issues and challenges. In C. R. Reynolds & R. W.
Kamphaus (Eds.), Handbook of psychological and educational assessment of children: Intelligence and achievement. New York, NY: The Guilford Press.
Provost, B., Harris, M., Ross, K., & Michnal, D. (1988). A comparison of scores on two preschool assessment tools: Implications for theory and practice. Physical and Occupational Therapy in Pediatrics, 8(4), 35-51.
Reid, D. K., Hresko, W. P., & Hammill, D. D. (1989). Test of Early Reading Ability-2. Austin, TX: PRO-ED.
Reynolds, C. R. (1990a). Conceptual and technical problems in learning disability diagnosis. In C. R. Reynolds & R. W. Kamphaus (Eds.), Handbook of psychological and educational assessment of children: Intelligence and achievement. New York, NY: The Guilford Press.
Reynolds, C. R. (1990b). Visual-motor assessment. In C. R. Reynolds & R. W. Kamphaus (Eds.), Handbook of psychological and educational assessment of children: Intelligence and achievement. New York, NY: The Guilford Press.
Rodning, C., Beckwith, L., & Howard, J. (1989). Prenatal exposure to drugs: Behavioral distortions reflecting CNS impairment? NeuroToxicology, 10, 629-634.
Satz, P., & Fletcher, J. M. (1988). Early identification of learning disabled children: An old problem revisited. Journal of Consulting and Clinical Psychology, 56(6), 824-829.
Satz, P., Taylor, G., Friel, J., & Fletcher, J. (1978). Some developmental predictive precursors of reading disabilities: A six-year follow-up. In A. L. Benton & D. Pearl (Eds.), Dyslexia: An appraisal of current knowledge. New York, NY: Oxford University Press.
Scarr, S. (1981). Testing for children: Assessment and the many determinants of intellectual competence. American Psychologist, 36, 1159-1168.
Schneider, J., Griffith, D., & Chasnoff, I. (1989). Infants exposed to cocaine in utero: Implications for developmental assessment and intervention. Infants and Young Children, 2(1), 25-36.
Schouten, P. G., & Kirkpatrick, L. A. (1991). Test review: Miller Assessment for Preschoolers.
Journal of Psychoeducational Assessment, 9, 179-185.
Schouten, P. G., & Kirkpatrick, L. A. (1993). Questions and concerns about the Miller Assessment for Preschoolers. Occupational Therapy Journal of Research, 13, 7-28.
Schutter, L. S., & Brinker, R. P. (1992). Conjuring a new category of disability from prenatal cocaine exposure: Are the infants unique biological or caretaking casualties? Topics in Early Childhood Special Education, 11(4), 84-111.
Shriver, M. D., & Piersel, W. (1994). The long-term effects of intrauterine drug exposure: Review of recent research and implications for early childhood special education. Topics in Early Childhood Special Education, 14(2), 161-183.
Slaton, D. (1985). The Miller Assessment for Preschoolers: A clinician's perspective. Physical and Occupational Therapy in Pediatrics, 5(1), 65-70.
Snyder, P., Bailey, D. B., & Auer, C. (1994). Preschool eligibility determination for children with known or suspected learning disabilities under IDEA. Journal of Early Intervention, 18(4), 380-390.
Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). The Stanford-Binet Intelligence Scale: Fourth Edition. Chicago, IL: Riverside Publishing Company.
Tolor, A., & Brannigan, G. C. (1980). Research and clinical applications of the Bender-Gestalt Test. Springfield, IL: Charles C. Thomas.
van Baar, A. (1990). Development of infants of drug dependent mothers. Journal of Child Psychology and Psychiatry, 31, 911-920.
van Baar, A., & de Graaff, B. (1994). Cognitive development at preschool-age of infants of drug-dependent mothers. Developmental Medicine and Child Neurology, 36, 1063-1075.
Van Dyke, D. C., & Fox, A. A. (1990). Fetal drug exposure and its possible implications for learning in the preschool and school-age population. Journal of Learning Disabilities, 23(3), 160-163.
van der Meulen, B. F., & Smrkovsky, M. (1983). De Bayley Ontwikkelings-Schalen (BOS 2-30). Lisse: Swets & Zeitlinger.
van der Meulen, B. F., & Smrkovsky, M. (1987).
De Bayley Ontwikkelings-Schalen (BOS 2-30). Niet-Verbale Versie. Lisse: Swets & Zeitlinger.
Wechsler, D. (1974). Wechsler Intelligence Scale for Children - Revised. New York, NY: Psychological Corporation.
Wechsler, D. (1989). Wechsler Preschool and Primary Scale of Intelligence-Revised. New York, NY: Psychological Corporation.
Wechsler, D. (1967). Wechsler Preschool and Primary Scale of Intelligence. New York, NY: Psychological Corporation.
Werner, E. E. (1986). A longitudinal study of perinatal risk. In D. C. Farran & J. D. McKinney (Eds.), Risk in intellectual and psychosocial development. Orlando, FL: Academic Press, Inc.
Weston, D., Ivins, B., Zuckerman, B., Jones, C., & Lopez, R. (1989). Drug-exposed babies: Research and clinical issues. Zero to Three, 9(5), 1-7.
Williams, B. F., & Howard, V. F. (1993). Children exposed to cocaine: Characteristics and implications for research and intervention. Journal of Early Intervention, 17(1), 61-72.
Wilson, G. (1989). Clinical studies of infants and children exposed prenatally to heroin. In D. E. Hutchings (Ed.), Prenatal abuse of licit and illicit drugs. Annals of the New York Academy of Sciences, 562, 183-194.
Wilson, G., Desmond, M., & Wait, R. (1981). Follow-up of methadone-treated and untreated narcotic-dependent women and their infants: Health, developmental, and social implications. Journal of Pediatrics, 98, 716-722.
Wilson, G. S., McCreary, R., Kean, J., & Baxter, J. C. (1979). The development of preschool children of heroin-addicted mothers: A controlled study. Pediatrics, 63, 135-141.
Woodcock, R. W., & Johnson, M. B. (1977). Woodcock-Johnson Psycho-Educational Battery. Hingham, MA: Teaching Resources Corporation.
Young, S. L., Vosper, H. J., & Phillips, S. A. (1992). Cocaine: Its effects on maternal and child health. Pharmacotherapy, 12(1), 2-17.
Ysseldyke, J. E., & Algozzine, B. (1983). LD or not LD: That's not the question. Journal of Learning Disabilities, 16, 29-31.
Ysseldyke, J.
E., Algozzine, B., Shinn, M. R., & McGue, M. (1982). Similarities and differences between low achievers and students classified as learning disabled. Journal of Special Education, 16, 73-85.

Appendix

Participants' Test Scores

Participant^a  MAP   WPPSI-R   TERA-2   PPVT-R   VMI
1              7     93        85       93       107
2              8     80        81       90       80
3              8     115       97       122      97
4              9     88        --       94       79
5              11    84        99       89       72
6              14    103       80       97       93
7              17    89        91       84       97
8              17    111       --       --       107
9              26    111       107      100      82
10             26    92        85       92       101
11             26    106       92       103      106
12             26    94        75       92       84
13             32    95        85       105      100
14             32    116       103      109      88
15             32    90        88       90       101
16             40    104       106      108      106
17             40    91        92       92       77
18             40    94        92       94       86
19             40    90        96       100      100
20             47    101       80       99       90
21             47    111       119      108      107
22             47    92        85       93       85
23             55    95        99       103      88
24             55    99        95       95       87
25             55    90        92       78       106
26             64    89        85       82       102
27             64    103       85       109      86
28             64    118       106      102      112
29             64    97        103      90       98
30             64    92        92       94       93
31             74    138       113      116      107
32             74    100       95       105      98
33             83    118       --       112      96
34             83    113       111      99       93
35             92    98        --       92       102
36             92    98        111      108      106
37             92    108       108      101      90

Note. Italicized numbers highlight the test scores which were < 85 on the outcome measures. Dashes indicate that the test was not completed.
^a Participants were ordered from poor to good performance on the MAP Total score.
