PRODUCING EQUIVALENT EXAMINATION FORMS:
AN ASSESSMENT OF THE BRITISH COLUMBIA MINISTRY OF EDUCATION
EXAMINATION CONSTRUCTION PROCEDURE

by

PETER DENTON MACMILLAN
B.Sc., University of British Columbia, 1972

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS in THE FACULTY OF GRADUATE STUDIES, Department of Educational Psychology

We accept this thesis as conforming to the required standard

© THE UNIVERSITY OF BRITISH COLUMBIA, August 1991

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

ABSTRACT

Questions have been raised concerning the equivalency of the January, June, and August forms of the British Columbia provincial Grade 12 examinations for a given subject. The procedure for constructing these examinations was changed as of the 1990/91 school year. The purpose of this study was to duplicate this new procedure and assess the equivalency of the resulting forms.

An examination construction team, all of whom had previous experience with the British Columbia Ministry of Education's Student Assessment Branch, simultaneously constructed two forms of a Biology 12 examination from a common table of specifications using a pool of multiple choice items from previous examinations. A sample of students was obtained in the Okanagan, Thompson, and North Thompson areas of British Columbia. Both forms were administered to each student, as required by the test equating design chosen (Design II; Angoff, 1971). The data sample consisted of responses from 286 students.

The data were analyzed using a classical item analysis (LERTAP; Nelson, 1974) followed by a 2x2 order-by-form fixed effects ANOVA with repeated measures on the second factor. The item analysis revealed that all items on both forms performed satisfactorily, ruling out flawed items as an alternate explanation for the lack of equivalence found. Results showed a significant (p<.05) difference in the means of the two forms, no significant (p>.25) order effect, and a significant (p<.25) order-by-form interaction.

Linear and equipercentile equatings were carried out, and the two yielded very similar results. Equating error, judged using the conditional root mean square error of equating, was 4.86 points (9.35%) for both equatings. Equivalency was also judged employing a graphical procedure in which the deviation of the equating function from the identity function was plotted with error bands produced using the standard error of equating. The procedure showed the two forms to be nonequivalent, particularly for the lower scoring students.

The source of the nonequivalency was investigated by separating the forms into three subtests based on whether the pairs of items possessed or lacked item statistics at the time of test construction. The linear equating followed by the graphical analysis was repeated for the pairs of subtests.
The pairs of subtests comprised of item pairs for which difficulty (p) values were available at the time of construction for one or both of the items in a pair were found to be equivalent. In contrast, the pair of subtests comprised of items for which p values were unavailable for either item in a pair at the time of construction was found to be not equivalent.

It was concluded that the examination construction procedure in its present state cannot be relied on to produce equivalent forms. An experienced examination construction team was unable to accurately match items on difficulty when the items lacked prior item statistics. As such, a necessary requirement for the construction of equivalent forms is that item statistics be available at the time of construction.

Research Supervisor: Dr. D. J. Bateson

TABLE OF CONTENTS

ABSTRACT
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENT

Chapter 1: INTRODUCTION
  The Grade 12 Examination Program
    Purpose of the Examinations
    Nonequivalency
    Changes in Examination Assembly Procedures
  Statement of the Problem
  Definition of Terms
    Equating
    Parallel Forms
    Equivalent Tests
    Equivalent Scores
    Comparable Scores
  Hypotheses
  Outline of Thesis

Chapter 2: REVIEW OF THE LITERATURE
  Equivalence versus Comparability
  Parallelism
    Psychological Parallelism
    Statistical Parallelism
  Equating Designs
  Equating Methods
    Linear Equating
      Design I: Equivalent Groups Design
      Design II: Test-Retest Design
      Design III: Anchor Test Design
    Equipercentile Equating
  Equating Error
    Conditional Root Mean Square
    The Average Absolute Difference
    Standard Error of Equating
      Linear Equating: Designs I, II, and III
      Equipercentile Equating: Designs I, II, and III
    Use of Standard Error of Equating
    Magnitude of the Standard Error of Equating
  Selection of Data Collection Design
  Selection of Equating Model

Chapter 3: EXAMINATION CONSTRUCTION
  Content, Materials and Assembly Team
    Choice of Subject
    Number of Forms
    Type of Item
    Content Tested
    Item Bank
    Examination Assembly Team
    Table of Specifications
  Examination Construction Process
    Examination Construction
    Allocation of Items to Forms
    Assessment of Comparability
    Production of Examination Booklets

Chapter 4: METHODOLOGY AND PRELIMINARY ANALYSIS
  Sample Selection
    Sample Size
    School Selection
  Examination Administration
  Scoring and Data Preparation
  Preliminary Analysis
    Response Rate
    Preliminary Classical Item Analysis
    Order of Administration

Chapter 5: ANALYSES AND RESULTS
  Pre-Equating Analysis
    Statistical Properties of the Forms
  Equating
    Linear Equating: Angoff's Design II
    Equipercentile Equating
    Comparison of the Two Equatings
    Differential Performance at Lower Achievement Levels
  Assessment of Equivalence of Forms
    CRMS and AAD
    Standard Error of Equating
  Nonequivalence
    Source of Nonequivalence
    Equating of Subtests

Chapter 6: CONCLUSIONS
  Summary
    Purpose and Problem
    Procedure
    Analysis and Results
  Limitations
  Conclusions
  Implications
    Implications for Practice
    Implications for Future Research
REFERENCES

APPENDIX A: Ministry Examination Construction Process
APPENDIX B: Materials for Construction Team
APPENDIX C: Examination Forms
APPENDIX D: Request and Permission Forms
APPENDIX E: Teacher Administration Instructions
APPENDIX F: Equipercentile Equating Tables

LIST OF TABLES

Table 1: 1989 Examination Means and Standard Deviations
Table 2: Three Random Group Equating Designs
Table 3: 1990 Biology 12 Table of Specifications
Table 4: Form A and Form B Table of Specifications
Table 5: Test Administration Design and Estimated Enrollment
Table 6: Summary ANOVA
Table 7: Cell Means
Table 8: Psychometric Properties of the Two Forms
Table 9: Weighted and Pooled Means and Variances
Table 10: Results of the Linear Equating
Table 11: Results of Equipercentile Equating

LIST OF FIGURES

Figure 1: Linear Equating
Figure 2: Equipercentile Equating
Figure 3: Standard Error of Equating, Total Examination
Figure 4: Standard Error of Equating, P Values Known for Both Items
Figure 5: Standard Error of Equating, P Value Known for One Item Only
Figure 6: Standard Error of Equating, P Values Unknown for Both Items

ACKNOWLEDGEMENT

I would like to express my appreciation and gratitude to the many individuals and organizations that made completion of this study possible. All of my thesis committee members have offered insightful comments and given me valuable support beyond what is normally required; indeed, all have given me much more. Special thanks are offered to: Dr. David J. Bateson, for first stirring my interest in educational measurement; Dr. Harold Ratzlaff, for the encouragement that led to my entrance into the MERM area; and Dr. W. Todd Rogers, not only for his advice and support during my course work both at the University of British Columbia and later the University of Alberta, but most especially during the writing of this thesis.

This study could not have been carried out without a grant from the B.C. Ministry of Education. Personal financial assistance in the form of course fee reimbursement was given by the Kamloops District Teachers' Association. I appreciate the time and effort put forth by the students and teachers who took part in this study. I am especially indebted to the examination assembly team for the use of their time and expertise.

Finally, this thesis could not have been completed without the invaluable support of my family, and my friends and colleagues at UBC, the U of A, and in the Kamloops School District.

CHAPTER 1
INTRODUCTION

In 1983, the British Columbia Ministry of Education reintroduced provincial examinations for thirteen Grade 12 subjects. While there have been, and still are, many issues surrounding British Columbia governmental examinations (Anderson, Muir, Bateson, Blackmore, & Rogers, 1990; Rogers, 1990; Ministry of Education, 1985; Sullivan, 1988), this study is concerned with only one of these issues. Seven years after the reinstatement, there are still concerns about the ability of the test makers to create the equivalent forms that are required for use at the three testing times (January, June, and August) throughout the school year. In an attempt to address this concern, the Ministry has changed its procedures for examination construction. The purpose of this study was to assess the success of these new procedures.
The Grade 12 Examination Program

All Grade 12 students in British Columbia are required to write school leaving examinations in each examinable course in which they are enrolled. All Grade 12 students must write at least one examination in order to graduate: either English 12 or Communications 12. Presently, examinations are administered in fifteen courses: Algebra (Mathematics as of September 1990), Biology, Chemistry, Communications, English, English Literature, Francais-langue, French, Geography, Geology, German, History, Latin, Physics, and Spanish. The scores from each examination are combined with the school awarded mark so that the examination counts for 40% of the final mark for each course.

The examinations are offered on three occasions during the school year: January, June, and August. The January sittings allow students in semestered schools to write examinations for their first term courses at the appropriate time. The June sittings are the times at which students in semestered schools write their second term examinations. Students in nonsemestered schools write all of their examinations in June. Some students are granted the opportunity to write one or more examinations in August.

Purpose of the Examinations

The expressed purposes of the Grade 12 examinations are

    ... to insure that grade 12 students meet consistent provincial standards of achievement, in academic subjects. The examination program will also ensure that graduating students from all schools will be treated equitably when applying for admission to universities and other post-secondary institutes. An additional purpose is to respond to strong public concerns for improved standards in education. (Ministry of Education, 1983, p. 6)

The stated reasons for the examinations center around a desire for consistent standards, whether the consumers of this information be the Ministry, universities, teachers, parents, or the students themselves. It is this consistency that is the issue of this study.

Nonequivalency

The examinations are prepared using a common table of specifications, the assumption being that the forms produced will be equivalent. Preliminary inspection of the examinations using statistical criteria indicates there may be differences between forms in some subject areas (Ministry of Education, 1989). Table 1 displays the means and standard deviations of the multiple choice sections for the 1989 examinations.

Table 1
1989 Examination Means and Standard Deviations

                 January                  June
Subject        n      M     S.D.        n      M     S.D.     t-test
Alg          2857   34.0   10.19      9795   33.5    9.79     -2.33*
Bio          2007   32.3    8.28      7741   31.9    8.96     -1.89
Chem         1234   32.2    7.06      5881   32.5    7.43      1.35
Comm         1094   35.5    4.01      4618   31.8    4.47    -26.88*
Eng          6420   18.3    3.74     21338   18.4    3.71      1.88
Lit           577   19.0    5.20      3354   20.0    5.07      4.28*
Fren         1136   45.4    7.74      4838   42.9    8.42     -9.63*
Geog         1599   35.9    7.55      6644   35.5    7.80     -1.88
Hist         1562   31.9    8.12      6205   34.4    7.65     11.00*
Phys          432   39.4   10.02      3343   45.0    9.37     11.01*

* p<.05

Examination of Algebra 12 (Alg), Biology 12 (Bio), Chemistry 12 (Chem), English 12 (Eng), and Geography 12 (Geog) shows little difference, less than one point, between the means of the respective January and June examinations. While the Algebra means are significantly different (t=-2.33, p<.05), the difference of 0.5 points might be considered trivial.
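To make the comparison in Table 1 concrete, the sketch below shows how a two-sample t statistic can be computed from the summary statistics reported for each sitting. This is a minimal illustration using a pooled-variance t test on the Algebra 12 figures; the Ministry's exact computation is not described in the source, so the result is close to, but need not reproduce, the tabled value.

```python
import math

def t_from_summary(n1, m1, sd1, n2, m2, sd2):
    """Pooled-variance t statistic for the difference between two
    independent means, computed from summary statistics alone."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (m2 - m1) / se_diff

# Algebra 12, June relative to January (summary values from Table 1)
t = t_from_summary(2857, 34.0, 10.19, 9795, 33.5, 9.79)
print(round(t, 2))  # about -2.4; Table 1 reports -2.33
```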
Communications 12 (Comm), English Literature 12 (Lit), French 12 (Fren), History 12 (Hist), and Physics 12 (Phys) all show significant (p<.05) and nontrivial differences. The equal means of the January and June forms for the first group of subjects (Algebra, Biology, Chemistry, English, and Geography) suggest that the January and June forms for a given subject are of equal difficulty and that the January and June examinee populations are of equal ability within the various subject areas. If, then, the examinee populations are of equal ability, the differences in means for the second group of subjects must be due to January and June forms that vary in difficulty.

The Ministry of Education is aware of the possible lack of equivalency between forms, as reflected by the following statement, which appears in its Biology 12 Report to Schools (1988):

    Without field test information it is not possible to ensure that the January, June, and August examinations will be of equal difficulty. (Biology 12 Report to Schools, Appendix C, p. 2)

The Ministry has attempted to address this problem through the use of a standard setting process (Ministry of Education, 1989, p. 75). Students do not receive raw score percentages as their examination marks. Instead, the province-wide distribution of examination results for a given subject is compared with the province-wide distribution of school-awarded marks for the given subject. If a difference between the two distributions occurs, the examination distribution is altered to more closely fit the school-awarded marks distribution. This is done in spite of a Ministry caveat that the two marks do not measure the same skills or knowledge.

Footnote 1: The August results are not discussed for two reasons. First, the Ministry of Education does not routinely release information pertaining to the August examinations as it does for the January and June sittings. Second, students writing one or more examinations in August do so either because of a previous examination failure that school year or due to some other reason acceptable to the Ministry of Education. The August examinees are clearly neither a random nor a representative sample of the population of students writing examinations in any given school year. Knowledge of the August results would yield no information by which the equivalency of the three forms could be assessed.

Changes in Examination Assembly Procedures

In an attempt to address the apparent lack of equivalency of examinations within a given school year, the Ministry implemented a new test construction procedure (see Appendix A) in 1990 (Ministry of Education, 1990). Prior to 1990, the forms for a given school year were produced separately at different times of the year, although by the same team. Beginning with the 1990 examinations, the three forms were assembled simultaneously using items that had already been developed and, in some cases, field tested.

Statement of the Problem

The purpose of this study was to empirically test the assumption that the examination forms produced by the new procedure are equivalent. The specific question addressed was: For a given subject, will two forms of the government examination constructed at the same time by the same team from a common table of specifications be equivalent?

It is in the interests of all the stakeholders that the exams be made as fair and consistent as possible.
Not only must the results of the examinations be as consistent as possible, but it must also be demonstrated that this consistency exists. There is an "...obligation to review whether a practice has appropriate consequences for individuals and institutions, and especially to guard against adverse consequences" (Messick, 1989, p. 20). The intent of this study was to determine whether or not the procedure of simultaneous construction of the examination forms would produce the equivalent forms of an examination that would ensure consistent treatment of all examinees.

Definition of Terms

Equating

For the purposes of this study, the definition of equating proposed by Angoff (1971) was adopted:

    equating, or the derivation of equivalent scores, concerns itself with the problem of unique conversions which may be derived only across test forms that are parallel, that is forms that measure, within acceptable limits, the same psychological function. (p. 563)

Parallel Forms

Equivalence is dependent upon parallelism, in fact is synonymous with it (Holmes, 1981, p. 7). Parallel tests can be defined using both statistical and psychological criteria. Gulliksen (1950) defined parallelism as follows:

    In addition to equal means, variances, and reliabilities, parallel tests should have approximately equal validities for any criterion... The tests should contain items dealing with the same subject matter, items of the same format, etc. In other words, the tests should be parallel as far as psychological judgement is concerned. (pp. 173-174)

Equivalent Tests

Dorans and Lawrence (1990) describe tests as being equivalent if the equating function, the function that maps scores on one test to scores on another, is the identity function, y=x (p. 253).

Equivalent Scores

Test forms which meet the criteria for parallelism are referred to as parallel forms or equivalent forms. The scores obtained from these forms are interchangeable; they can be interpreted in the same way.

Comparable Scores

If the tests are not parallel, then comparable scores may be calculated, but these scores do not function in the same fashion. For example, scores on a spelling quiz could be mapped to comparable scores on a multiplication quiz, yet no one would claim these scores measure the same construct. Such scores, while comparable, are not equivalent. Rogers and Holmes (1987) state that computing comparable scores "can be done irrespective of test content and carries no implication of the interchangeability of tests" (p. 4). If two scores are merely comparable, they correspond to equal standard scores or the same rank position if the score distributions are similar. However, the scores do not carry any connotation as to the equality of the amount of the trait(s) being measured (Rogers & Holmes, 1987, p. 5). The score conversions are nonunique and are not generalizable beyond the group for which they were established (Millman & Linoff, 1964).

Hypotheses

Given the Gulliksen (1950) definition of parallel forms, the following three hypotheses for the study were formulated:

1.  H0: μa = μb
    H1: μa ≠ μb,

where μa and μb are the means on Form A and Form B, respectively.

2.  H0: σ²a/σ²b = 1
    H1: σ²a/σ²b ≠ 1,

where σ²a and σ²b are the variances of Form A and Form B, respectively.

3.  H0: ρaa = ρbb
    H1: ρaa ≠ ρbb,

where ρaa and ρbb are the internal reliabilities of Form A and Form B, respectively.
A fourth hypothesis, based on work by Loret, Seder, Bianchini, and Vale (1974, p. 8) and Holmes (1981, p. 80), was:

4.  H0: cρab = 1
    H1: cρab < 1,

where cρab is the intertest correlation coefficient between Form A and Form B corrected for attenuation.

A fifth hypothesis, based on Dorans' and Lawrence's work, was:

5.  H0: F(x) = I(x)
    H1: F(x) ≠ I(x),

where F(x) is the equating function and I(x) is the identity function y=x.

Outline of the Thesis

Chapter Two begins with an elaboration of equivalent and comparable scores discussed in relation to psychological and statistical parallelism. Data collection designs and equating methods, along with associated equating error estimates, are discussed. The relative merits and weaknesses of each model as applied to this study are used to select the most feasible data collection design and equating process. In Chapter Three the construction of the examination forms used in this study is described. Chapter Four includes a description of both the methodology and the preliminary data analysis, while the main analyses and the corresponding results are presented in Chapter Five. A summary of the study together with conclusions, implications for practice, and implications for future research are presented and discussed in Chapter Six.

CHAPTER 2
REVIEW OF THE LITERATURE

As described in Chapter 1, questions have been raised regarding the equivalency of the three forms of a Grade 12 examination that are administered during a given school year. Described in this chapter are various procedures that can be employed to test the equivalency of test forms. First, equivalent and comparable scores are discussed in relation to psychological and statistical parallelism, and the differences between them noted. This is followed by an outline of standard data collection designs and corresponding analytical procedures for assessing the equivalence of different test forms. Various error estimates of equating are then described. The chapter concludes with the rationale for the design selected and used in the present study.

Equivalence versus Comparability

The distinction between equating and comparing is somewhat blurred due to inconsistent use in the literature (Rogers & Holmes, p. 5). As described in Chapter One, comparable scores can be produced irrespective of test content, but these scores are not interchangeable and do not carry any connotation about the amount of trait measured. Equivalent scores are interchangeable and represent equal amounts of a trait.

Likewise, the terms describing the mathematical process for obtaining these equivalent and comparable scores are also blurred. Angoff (1971) uses the term calibrating as the more inclusive term for the process that maps a score on one test to a score on another test. Thus the calibration process produces comparable scores. In the more restrictive case of parallel tests, the process of calibrating is properly described as equating. The result of an equating is equivalent scores on forms that are said to be equivalent forms. The production of equivalent scores is dependent on the requirement of parallel forms being met.

Parallelism

Gulliksen (1950) addressed the interchangeability of forms when he suggested that "two tests are parallel when it makes no difference which test you use" (p. 11). Later, Lord and Novick (1968) clarified the definition of parallel when they stated:

    ...parallel measurements measure exactly the same thing in the same scale and, in a sense, measure it equally well for all persons. (p. 48)
Parallelism can be described in both psychological and statistical terms. Both psychological and statistical parallelism must be satisfied if the forms are to be considered parallel.

Psychological parallelism. Psychological parallelism requires that the test forms to be equated measure the same psychological trait or function (Rogers & Holmes, p. 6). Earlier, Wesman (1958) suggested that "the degree to which the conditions of parallelism are met are most closely achieved when forms are designed to be parallel" (p. 8). Since the two forms of the Biology 12 examination, the examination selected for study in this thesis, were produced simultaneously from the same table of specifications, they met the criteria suggested by Wesman. Therefore they were judged to be psychologically equivalent.

Statistical parallelism. In addition to Gulliksen's (1950) statistical criteria of equal means, variances, and reliabilities, Loret et al. (1974, p. 8) and Holmes (1981, p. 80) employed the intertest correlation coefficient corrected for attenuation, cρab, as a test for parallel forms. The basic formula for this coefficient is:

    cρab = ρab/√(ρaa·ρbb),                                    (1)

where cρab is the intertest correlation corrected for attenuation, ρab is the intertest correlation, and ρaa and ρbb are the internal consistency reliabilities of Form A and Form B (Cronbach, 1971, p. 489). A value of .95 is a commonly accepted cutoff point for equating studies (Linn, 1975, p. 206).
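As a small illustration of equation 1, the sketch below computes the disattenuated correlation and compares it against the .95 cutoff. The numerical values are invented for illustration and are not results from this study.

```python
import math

def corrected_for_attenuation(r_ab, rel_a, rel_b):
    """Intertest correlation corrected for attenuation (equation 1)."""
    return r_ab / math.sqrt(rel_a * rel_b)

# Illustrative (not observed) values: r = .85, reliabilities .88 and .90
c_r = corrected_for_attenuation(0.85, 0.88, 0.90)
print(round(c_r, 3), c_r >= 0.95)  # 0.955 True: above the .95 cutoff
```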
There are three  methods for determining the transformation: linear equating, equipercentile equating, and item characteristic curve (ICC) equating (Petersen, Marco, & Stewart, 1982, p. 73).  Linear and  equipercentile equating may be carried out using observed scores (Braun & Holland, 1982, pp. 9-49) within the classical test model; ICC equating is carried out using item response  models.  2  Observed score equating can be carried out if the forms are equally reliable (Angoff, 1982, pp. 58-59)J  Examination of the  internal consistency! reliability estimates for the examinations  revealed that eight of the ten  1989  grade  12  examinations'  reliabilities differ by .02 or less (Ministry of Education, 1989, pp.  Classical test-model theory has been usefully applied to examinee achievement and aptitude test performance for many years. Item Characteristic Curve (ICC) equating, which uses item response theory, is an alternative to classical observed score equating (Lord, 1980). The reader is referred to Hambleton (1989) for a more complete introduction to item response theory. 2  16  66-67).  As the forms in this study are designed following Ministry  procedures, it is likely that equally reliable forms will result.  The  following discussion of equating deals with the case of equally reliable forms  The reader is referred to Angoff (1971) for the  case of unequally reliable forms. Linear Equating In linear equating it is assumed that two scores are equal if they correspond to equal standard score deviates (Angoff  1971,  p. 564):  (Y-My)/Sy = ( X - M ) / s , x  where M and M and s y  x  y  and s  x  x  (2)  are, respectively, the means and  standard deviations of Form Y and Form X. The linear nature is more apparent if the equation is rewritten to give:  Y = AX+B  (3)  with coefficients A and B defined according to the equating design. Design I Equivalent Groups Design The equivalent groups design consists of one administration of Form X to one group and one administration of Form Y to another group assumed to be equivalent to the first group.  The values of  the coefficients A and B can be calculated according to the following  equations:  A = Sy/Sx  (4)  and B = My-Alvl (Angoff,  1971,  (5)  x  pp. 564-575).  Design II Test-Retest  Design  In the case of the test-retest design each examination form is administered to both group 1 and group'2.  Therefore there are  two means and two standard deviations for each group. for A and B can be calculated according to the following  The values equations  provided the sample sizes are equal:  A = V [ ( s 2 + S y2)/(S x1+s2 )]  (6)  B = 0.5(M +My2)-0.5A(M i+M )  (7)  2  2  y1  x2  and y1  (Angoff,  1971,  x  x2  pp. 564-577).  However as pointed out in Glass & Hopkins (1984), if the sample sizes are not equal, then this must be taken into account when computing values for A and B.  A = V( 2 / 2 ) S  and  y  S  x  The equations then used are:  (8)  B = [(ny2My2+nyiMyi)/(ny2+ny )]1  [(A)(n 2Mx2+nxiM i)/(nx2+n i)], X  x  (9)  x  where s  2  s  2  y  is the weighted average variance on Form Y,  x  is the weighted average variance on Form X,  M i is the mean on Form Y of the group of examinees who wrote y  Form X first, My2 is the mean on Form Y of the group of examinees who wrote Form X second, Mxi is the mean on Form X of the group of examinees who wrote Form X first, M 2 is the mean on Form X of the group of examinees who wrote X  Form X second, and nxi, n 2, yi> n  X  1971,  a n  d y2 are the respective sample sizes (Angoff, n  pp. 574-575).  
The weighted average variances are calculated with the equations:  S y = [(ny1-1)S y1+(ny2-1)S y2 + ny (My -My) +ny2(My2-My) ]/(n -1 ) 2  2  2  2  n  2  l  y  (10) and  s x = [(n i-1)s i+(n 2-1)s 2+n (M -M ) +n 2(M -M ) ]/(n -1), 2  2  2  x  x  X  2  X  x1  x1  x  2  X  x2  x  x  (11)  where s  i , s 2. s 2  2 x  X  i , and s 2 , are the variances of the 2  2 y  y  respective subsamples,  and with other variables defined as  before  (Glass & Hopkins, 1984, p. 53).  Angoff (1971) describes a variation of these procedures (p. 575).  The data on Form X for the two half groups are combined and  likewise the data on Form Y for the two half groups.  The equating  is then carried out using equation 4 and equation 5. Design III Anchor Test Design The  anchor test design consists of a set of common items  that can be described as forming a test U. Tests X and Y are then regressed on U to form estimates for the mean (u, and |i ) and x  standard deviation (c? and a ). y  x  The values for A and B can be  calculated according to the following  A = c» /a y  y  equations:  (12)  x  and (13)  B = |Iy-A|I , X  where the estimated means are:  |i = M i+b i(Li -M i) x  \i = y  x  x u  u  u  M +b 2(|iu-Mu2), y2  yU  (14)  (15)  ! \  20  and the estimated standard deviations are:  o- = V [ a x i + b x u i ( a - < J u i ) ]  (  o-y = V [ c x 2 + b y u 2 ( a u - a u 2 ) L  (17)  2  2  2  where b  2  2  u  x  x u  2  2  2  1 6  )  i and b 2 are the regression coefficient X on U and Y on y U  U for the respective subsamples.  (Angoffi 1971, pp. 564-577).  Equipercentile  Equating  Equipercentile equating is based on a definition of equivalent scores (Flanagan, 1951; Lord, 1950) that considers two scores equivalent if their corresponding percentile ranks are equal.  As  such, equipercentile equating makes no assumptions about the score distributions for the two forms.  The procedure first  requires the computation of percentile ranks for the raw score distributions. score.  These-percentile ranks are; plotted against the raw  Smoothing of the line may be done by hand although  computer programs that simulate the smoothing process have been developed (Lindsay and Prichard, 1971).  Scores from the smoothed  curves are plotted against one another and this new curve is then smoothed.  Score conversions are taken from this final smoothed  curve. Equating Error Three error measures are considered when assessing the degree of equivalence as determined by an equating method. first two, the conditional root-mean-square (CRMS) and the  The  average absolute difference (AAD) are used to judge the equivalence between scores with the sample(s) used.  The third,  i  the standard error of equating (s *), is used to assess the degree y  of sampling error in the estimates across| repeated replications. Conditional Root Mean Square The C R M S is a measure of the fit between an observed score on X and the equivalent score on X derived from a knowledge of a score on Y. The formula used to calculate CRMS is: C R M S = V[£(X-Y*)2/n-1],  where  (18)  X  is the the score on Form X '  Y*  is the the score derived frolm Form Y, and  n  is the the number of subjects writing both Form X and Form Y (Bianchini & Loret, 1974, p.  158).  This error is an estimate of "the degree to which a score read from the equating table would differ from the score a pupil would have earned had he been given the equated test" (Jaeger, 1973, p. 7). Holmes (1981) judged discrepancies of 12. 
Design III: Anchor Test Design

The Anchor Test Design consists of a set of common items that can be described as forming a test U. Tests X and Y are then regressed on U to form estimates for the means (μx and μy) and standard deviations (σx and σy). The values for A and B can be calculated according to the following equations:

    A = σy/σx                                                 (12)

and

    B = μy - A·μx,                                            (13)

where the estimated means are:

    μx = Mx1 + bxu1(μu - Mu1)                                 (14)

    μy = My2 + byu2(μu - Mu2),                                (15)

and the estimated standard deviations are:

    σx = √[s²x1 + b²xu1(σ²u - s²u1)]                          (16)

    σy = √[s²y2 + b²yu2(σ²u - s²u2)],                         (17)

where Mu1 and Mu2 are the subsample means on the anchor test U, μu and σ²u are the mean and variance of U for the total group, and bxu1 and byu2 are the regression coefficients of X on U and Y on U for the respective subsamples (Angoff, 1971, pp. 564-577).

Equipercentile Equating

Equipercentile equating is based on a definition of equivalent scores (Flanagan, 1951; Lord, 1950) that considers two scores equivalent if their corresponding percentile ranks are equal. As such, equipercentile equating makes no assumptions about the score distributions for the two forms. The procedure first requires the computation of percentile ranks for the raw score distributions. These percentile ranks are plotted against the raw scores. Smoothing of the line may be done by hand, although computer programs that simulate the smoothing process have been developed (Lindsay & Prichard, 1971). Scores from the smoothed curves are plotted against one another, and this new curve is then smoothed. Score conversions are taken from this final smoothed curve.
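A bare-bones sketch of the equipercentile idea follows: midpoint percentile ranks are computed for each form, and each Form X score is mapped to the Form Y score with the nearest percentile rank. The hand smoothing described above is omitted, so this is only a schematic of the procedure, and the score lists are invented.

```python
from collections import Counter

def percentile_ranks(scores):
    """Midpoint percentile rank of each distinct raw score."""
    freq = Counter(scores)
    n, below, pr = len(scores), 0, {}
    for x in sorted(freq):
        pr[x] = 100.0 * (below + freq[x] / 2.0) / n
        below += freq[x]
    return pr

def equipercentile_table(x_scores, y_scores):
    """Map each Form X raw score to the Form Y raw score whose
    percentile rank is closest (no smoothing applied)."""
    pr_x, pr_y = percentile_ranks(x_scores), percentile_ranks(y_scores)
    return {x: min(pr_y, key=lambda y: abs(pr_y[y] - p))
            for x, p in pr_x.items()}

# Invented raw score lists for the two forms
form_x = [28, 30, 31, 31, 33, 35, 36, 38, 40, 44]
form_y = [30, 31, 33, 34, 34, 36, 38, 39, 42, 45]
print(equipercentile_table(form_x, form_y))
```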
Given  Lord's strong preference and, as will be reported in Chapter 5, only the equipercentile standard error of equating for Design I is discussed here. Design I.  For Design I, the Equivalent Groups Design, the  error is estimated by the equation:  (22)  Sy* = V [ s y p q ( n " + n - )/(t) ] 1  2  x  where s  2 y  1  2  y  is the estimated variance of the Form Y scores,  p  is the proportion of scores below a score, xn,  q  is  <> (  is the standard normal density: at the unit normal below  1-p,  which p of the cases fall, n  x  is the number of examinees writing Form X, and  n  y  is the number of examinees writing Form Y,  (Petersen,  Kolen & Hoover, 1989, p. 251)  Design II.  For Design II, the Test-Retest Design, the  appropriate formula and related discussion is found in Lord (1982). Design III.  For Design III, the Anchor Test Design, the  appropriate formula and related discussion is found in Jarjoura & Kolen (1985). Use of Standard Error of Equating The standard error of equating (s *) was used to judge the y  the equivalency of the two forms in the Anchor Test Study.  Linn  (1975) reported errors of less than one score point were found in the Anchor Test Study (p. 207). Dorans and Lawrence (1990) used this error (S *) to place a y  heuristic confidence band around the equating function (p. 247). They suggested that the null hypothesis for equating be that the equating function equals the identity function (p. 253).  They then  suggested placing bands of plus or minus two standard errors of equating ( ± 2 S * ) around the equating function to define the region y  in which the identity function must fall if the tests are judged to be equivalent at the .05 level of significance.  To present this  information, Dorans and Lawrence suggested using a graph with the ratio of the difference between predicted score and observed score divided by the standard error of equating (s *) plotted on the y  ordinate and the observed score of a test plotted on the abscissa. Forms were judged equivalent if the plotted function does not exceed bands of ± 2 units. While attractive due to its ease of interpretation, the graph of this ratio versus raw score has two flaws.  First, the standard  error of equating is not constant throughout the range of scores, but this is not apparent using this graph since the standard error of equating is not plotted.  Second, the plot of the equating  function is distorted, giving the appearance of being least equivalent near the mean of the scores.  Both these shortcomings  are corrected by first using bands formed;by ± 1 . 9 6 S y * which are placed around the line y=0.  The line ,y=0, is equivalent to the  difference between the equating function and the identity function if the forms are perfectly equivalent. Forms are judged as equivalent if the plot of difference between the equating function and  the identity function falls within the standard error bands.  This plot accurately displays both the nature and magnitude of the equating function and the standard error of equating.  Magnitude of the Standard Error of Equating Sample size, equating design and equating method all affect the magnitude of the standard error of equating. Within a given equating method, sample size and design interact to determine the magnitude of equating error.  With Design I, large sample sizes  may be needed to reduce the standard error of equating.  
Magnitude of the Standard Error of Equating

Sample size, equating design, and equating method all affect the magnitude of the standard error of equating. Within a given equating method, sample size and design interact to determine the magnitude of equating error. With Design I, large sample sizes may be needed to reduce the standard error of equating. For example, Dorans and Lawrence (1990) used two random groups of 48,639 and 43,845 subjects in an equating study in which the statistical equivalence of nearly identical test editions of the Scholastic Aptitude Test (SAT) was investigated. Angoff (1971) found that for a test-retest reliability of .80, ten times the sample size is needed to reduce the standard error of equating for Design I to the standard error of equating for Design II (p. 575). With Design III, the anchor test scores are used to reduce error caused by group differences, whether the differences result from random or non-random group assignment. Continuing his earlier example, Angoff states that when the correlation between the anchor test and the forms to be equated equals .87, the equating error is one-fourth as large at the mean as it would be using Design I. The sample size using Design I would need to be four times that of the Design III sample if the same magnitude of error is desired. Given a set sample size and a set equating design, the method of equating also influences equating error. As described earlier, the error of equipercentile equating is generally larger than that of linear equating throughout most of the score range.

Selection of Data Collection Design

Since this study used forms as similar as possible to a British Columbia Grade 12 Biology examination, the content area to be tested included the entire core curriculum, which is taught throughout the course. Also, since students should be given the opportunity to learn the material to be tested, the testing should take place as near as possible to the end of the school year. However, teachers, wary of the pending examinations, expressed concern about the time that could be spared for a research project. It was therefore decided, given the limited time for testing and in light of cost considerations, that Design II, the Test-Retest Design (see Table 2), would be employed to collect the data. Further, these restrictions were the reasons that two rather than three test forms were constructed at one time for a single subject, Biology 12.

Selection of Equating Model

Linear equating is appropriate if the X and Y distributions differ only in their means and standard deviations (Braun & Holland, 1982, p. 17). As it is totally analytical and free from smoothing errors, it is the preferred method if the differences between the shapes of the score distributions can be regarded as trivial. Dorans and Lawrence (1990) support the choice of the linear method, arguing that because the null hypothesis for parallel forms is a linear identity function, it does not make sense to use equipercentile equating if forms are expected to be nearly parallel (p. 253).

If a linear equating function cannot be assumed, then equipercentile equating can be used. However, the lack of assumptions about the score distributions and the consequent wider applicability has a price. The error of equipercentile equating is larger than that of linear equating throughout the score range (Lord, 1982, pp. 173-174). This makes it less powerful in detecting nonequivalency between forms (Dorans & Lawrence, 1990, p. 253). Ceiling and floor effects allow equipercentile equating to be used when a linear equating produces equated scores outside of the test score range. This occurred in the Anchor Test Study (Bianchini & Loret, 1974).

Skaggs and Lissitz (1986) suggest that no single method of equating is consistently superior to others.
If tests are reliable, nearly equal in difficulty, and samples are nearly equal in ability, then most linear, equipercentile, and IRT models will produce satisfactory results (p. 523). For small sample sizes (n<1000), it is unclear which of these methods yields the best results. Earlier, Kolen and Whitney (1982), using the Test of General Educational Development in an equating study with a sample size of 200, found that linear and Rasch (a one-parameter logistic item response model) equating methods produced acceptable results for the horizontal equating of achievement tests, while equipercentile and three-parameter logistic (item response model) equating methods produced unacceptable results.

Footnote 4: Equating tests of similar difficulty is referred to as horizontal equating.

The examination forms for this study were simultaneously assembled according to the specifications of a common specification table. The intent of the new Ministry procedure is to construct equivalent forms; there should be no difference in difficulty. Equally reliable test forms would also be expected, since most examination pairs are already equally reliable. A test-retest design was used, so there should be no difference in the ability of the examinees from one testing day to the next. Under these conditions, linear equating (Petersen, Cook, & Stocking, 1983) using the linear method for equally reliable tests (equations 3-6) appeared to be the most appropriate method to follow. However, given the uncertainty in the literature about which method is superior, it was also decided that both Angoff's linear procedure for Design II, in which the data are pooled across order of administration, and equipercentile equating using pooled data would be performed. If the linear equating using pooled data proved feasible, only this method would be retained.

CHAPTER 3
EXAMINATION CONSTRUCTION

As described in Chapter 1, beginning with the 1989/90 school year the three forms of each course examination are to be constructed at the same time in an attempt to produce a greater degree of equivalency among the three forms. The degree to which this desired outcome is met was tested in this study. Due to restrictions on the time available for testing, two rather than three test forms were constructed at one time for a single subject, Biology 12. The construction of these two forms is described in the present chapter.

Content, Materials, and Assembly Team

Choice of Subject

The choice of Biology 12 was based on three factors. First, there was a large enrollment in this course. The large enrollment means a large potential sample size representing a wide cross section of Grade 12 students. Second, there was a sufficiently large number of multiple choice items available that would be suitable for use with the 1990 Table of Specifications adopted for this study. Finally, an intact, experienced team of school teachers was available to assemble the Biology 12 forms from the pool of available items.

Number of Forms

Two forms, each containing 52 items, were produced. Producing three forms, as is the intention of the full Ministry examination program, would have required a larger sample of both items and people as well as increased testing time, all factors which could not be accommodated in the present study. The forms were administered to students near the end of the school year to better approximate the actual time of writing. Teachers, while willing to cooperate, were willing to administer only two forms.
However, if the two forms for this study, which were constructed following procedures set forth by the Ministry of Education, were found to be equivalent, then it would be reasonable to expect the process would be capable of yielding three equivalent forms. Contrarily, if the two forms were found not to be equivalent, then it would be impossible that three forms would be. Type of Item In 1987, 1988, and 1989, the provincial Biology 12 examination has consisted of approximately 50 multiple choice and 5 supply items.  These items formed the pool of field tested items  that were used for construction of the two forms administered in the present study.  Since copies of previous examinations may be  obtained from the Ministry, it is likely that each student would be familiar with at least some of the items in this pool, particularly since many teachers often use these items as review material. This creates a potential problem.  An item that is recognized by an  examinee because of previous experience with that item would be more likely to be answered correctly than if the item were new to the examinee.  If two examination forms are created to be of equal  difficulty  previous  using  item difficulty statistics, yet  one  form  contains more items that are recognized by the examinees sitting  the two forms, this form will appear easier due to the items.  recognized  Since the items to be used were to be selected from a pool  of items available to the examinees, the greater number of multiple choice items than supply items provided a larger pool from which to select the items for the two forms to be constructed.  This greater number of multiple choice items means  that the likelihood of students recognizing a multiple choice  item  would be lower in comparison to the likelihood of recognizing a supply item.  More importantly, the frequency of item recognition  over the entire examinee population should be randomly equal on the two forms.  With the much smaller sample of supply items,  recognition of items is more likely to occur and an unequal number of items recognized on each form is more likely to result.  Thus,  given the limited time for testing and the greater number of potential multiple choice items, the decision was taken to include only multiple choice items on the two forms. Content Tested The Biology 12 curriculum consists of a core area and six options.  The provincial examination for Biology 12 reflects this  structure; students must write a set of core items and then select two of the six options.  Assuming for the sake of argument that  each option is chosen by an equal number of examinees, then each option is attempted by approximately 1/3 of the total examinees. If it were essential to obtain item statistics on the optional items equivalent  in precision to the statistics for items included  in the core area, a sample of examinees three times as large as  that for the core items would be needed.  Consequently, given the  restrictions mentioned earlier, the sample of items for the tests used in this study was restricted to the core content only. Item Bank The item bank available consisted of 288 multiple choice items in the core topics of 'cells' and 'humans' that were used on the 1987, 1988, and 1989 Biology 12 examinations.  However, only  the 184 items from the 1987 and 1988 examinations had any available item statistics, and only the percentage chose each option was available.  
of students who  Statistics for the 1989 items  were not available at the time of examination  assembly.  Examination Assembly Team The team of five teachers responsible for the construction of the two forms of the Biology 12 examination used in this study were members of a team currently involved in the Ministry's item banking project for Biology 12.  The five teachers each had  received training and gained experience  in item writing, item  classification, and mini-test construction.  The five teachers  established  had  a stable working relationship through working on the  item banking project for over a year.  The; team chairperson was a  teacher who, for several years prior to the item banking project, had served on the Ministry's Biology 12 examination production team.  As such, they were comparable in knowledge and experience  to the Ministry of Education teams that construct the Biology 12 examinations.  Due to illness, one member of the team later  dropped out of the project.  Table of  Specifications  The specifications table used, presented in Table 3, was adapted from the 1990 Table of Specifications for Biology 12. Table 3 1990 Biology 12 Table of Specifications Cognitive K*  Topic/subtopic Methods & principles I Experimental Design II Homeostasis Cells III Cell compounds IV Ultrastructure V Ultraprocesses VI Photosynthesis  XII  U/A  HMP  Total  5 ITEMS  1 1  Subtotal Humans VII VIII IX X XI  level  19  Cells, Tissues and Digestive Circulatory Nervous Excretion & Respiration Endocrine  Subtotal  10  18  5  33  TOTAL  15  29  8  52  *K = Knowledge  U/A = Understanding/Application  HMP = Higher Mental Process  The  specifications table includes the core topics of 'Cells' and  'Humans' and reflects the restriction to multiple choice items.  It  should be noted that each item was classified by a two dimensional grid of "topic" and "cognitive level".  The British  Columbia Ministry of Education uses Bloom's taxonomy (Bloom, 1956) collapsed to three levels: knowledge (K), understanding/application (U/A), and higher mental processes (HMP).  This table contains the information given to the  examination assembly team to form a basis for their selection of items. The  numbers within the cells are the number of items for  each topic by cognitive level cell.  A total of five items from the  'Cells' and 'Humans' topics are also included by the Ministry in the category 'Methods and Principles'.  Since these five items are  already included within the 'Cells' or 'Humans' topics, they are not included again in the item subtotals or item total. Examination Construction Process The Ministry document titled 'FLOWCHART and TIMELINES' for the Production of the 1991 Examinations DRAFT' (see Appendix A) describes the entire new procedure to be followed to construct the forms of the examination.  Only the stages that are directly  related to this study are discussed below. instructions, the examination specifications  A package consisting of table, items, and item  option p-values (when available) was sent to all team members. The  materials with the exception of copies of the items are  included in Appendix B.  Examination Construction Prior to meeting on the examination assembly day the team leader assigned a topic area and corresponding subtopics Table 3) to each member of the team.  
(see  Each team member was  asked to confirm through personal professional judgement the curricular validity of the items referenced on the basis of previous years classification for the subtopics for which they were responsible.  Their judgements were then discussed at the  beginning of the meeting to achieve consensus on the curricular validity.  Only items which the team unanimously judged to  measure the subtopic were retained. Allocation of Items to Forms The published table of specifications (Table 3) gives subtotals for only the very broad topics of 'Cells' and 'Humans' by cognitive level.  Following identification of suitable items,  the  assembly team determined the number of items to be included for each subtopic by cognitive level.  The results of this assignment  are shown in Table 4. To actually assemble the test forms, the team divided into two pairs, each pair responsible for one of the topic areas. Working within each > subtopic by cognitive level sub-cell, each pair first categorized each retained item according to difficulty level. The proportion of examinees answering each item correctly (item p values) were available for the 1987 and 1988 items. difficulty  levels for the  1989  by each pair of team members.  The  items were established  subjectively  Table 4 Form A and Form B Table of Specifications Cognitive Topic/subtopic Form  .  K A,B  Methods & principles I Experimental Design II Homeostasis  a  level U/A A;B  HMP A,B  Subtotal  5 ITEMS  Cells III IV V VI  Cell compounds Ultrastructure Ultraprocesses Photosynthesis  Subtotal Humans VII VIII IX X XI XII Subtotal TOTAL  a  Cells, Tissues Digestive; Circulatory Nervous Excrete & Respire Endocrine  1,1 1,1 3,3 0,0  2,3 3;3 3,2 3,3  1,1 0,0 2,2 0,0  5,5  11,11  3,3  1,1 3,3 2,2 3,3  0,0 4,4 4,4 4,4  0,0 0,0 1,1 0,1  2,2 1,1  4,4 3,3  1,0 0,0  12.12  19,19  2.2  17,17  30,30  5,5  Form A number of Items, Form B number of items.  19,19  33,33  52,52  Once this had been done, items within the same subcell were paired according to similar difficulty. One item of each pair was then randomly assigned to each form.  Three different  occurred during the matching of items.  First, both items in a pair  had p value information. There were 24 such pairs.  situations  Second, an  item with known p value was matched with an item whose difficulty was judged subjectively by the test assembly  team.  There were 17 such pairs; 8 items with known p values were placed on Form A and 9 items on form B.  The remaining 11 pairs  were formed with no prior p value information about either item. In four instances, a matched pair of items within a subtopic area could not be formed.  In these cases, items from different  subtopics within the same topic area, at the same cognitive level, and with similar difficulty levels were combined for purposes.  assignment  The final number of items selected for each cell is  reported in Table 4.  As shown in Table 4, the number of items  classified as either K or U/A is greater than the intended number shown in Table 3.  This difference is due to the lack of availability  of matched item pairs within some subtopics at the HMP cognitive level. Items were placed on each form according to the order given in the table of specifications (see Table 4). form were prepared.  Draft copies of each  These draft forms were checked by the team  for duplication of items or the possibility that information in one item could be used to correctly answer another item.  
Assessment of Comparability

The composition of the two forms is shown in Table 4, and a copy of each examination form is provided in Appendix C.(5) The item subtotals for both the Topic by Cognitive Level and Subtopic by Cognitive Level are identical on both forms. The only differences between the structures of the two test forms are in the 'Cells by U/A' cell ('cell compounds' and 'ultraprocesses' subcells) and the 'Humans by HMP' cell ('nervous' and 'excrete and respire' subcells). These differences were not expected to create any differences in examinee performance between the two forms. The mean p value for the items of known p value was 0.64 for each form.

Production of Examination Booklets

The final forms of the examination were printed by a professional printer using camera ready copy provided by the researcher. Prior to printing, the camera ready copy was reviewed by a Biology 12 teacher who was not on the examination assembly team. This was done to provide an independent check of the technical aspects of the form: the clarity and accuracy of the diagrams, correctness of spelling, spacing, and item formatting.

(5) Form A and Form B were initially referred to as Form I and Form II respectively. The designations Form I and Form II are used consistently throughout the appendices.

CHAPTER 4
METHODOLOGY AND PRELIMINARY ANALYSES

Chapter Four consists of two major sections. In the first section, the procedure used to select the sample is described. This is followed by a description of the test administration and scoring procedures. The second section contains a description of the preliminary analysis of the data. The purposes of the preliminary analysis were to determine the characteristics of the sample and the characteristics of the two forms that were relevant to the choice of equating methods used to test the equivalence of the two forms.

Sample Selection

Sample Size

The desired sample size was set at 500. For forms that should be very nearly parallel, one might want to detect a one mark difference between the means using an alpha level of .05. Using these values, and in light of the results of previous Biology 12 examinations (standard deviation of 10 for approximately 50 items), the t-test between independent groups indicated a sample size of 400 would provide the statistical precision necessary to test the equivalence of the two forms using the test-retest design with independent groups (Angoff, 1971, pp. 58-59).

School Selection

Sample schools were selected in two stages. First, officials in four school districts in the Okanagan, Thompson, and North Thompson geographical areas in British Columbia were asked for
permission to approach Biology 12 teachers in their districts to request that they and their students participate in the study (see Appendix D). These four districts were selected so that the combined Biology 12 course enrollment was greater than 700, thereby allowing for possible nonresponse at the district, school, and student levels. Initial verbal inquiries of teachers in these districts yielded a high expressed willingness to participate; therefore an initial target sample size of 700 was judged adequate.

Permission was granted in three districts. Officials in the fourth district indicated that they did not wish to have students in their district approached. No reason for refusal was given nor was one requested. Attempts to replace this district with a district of similar population size in the same region were unsuccessful due to the lateness of the request. Consequently, the potential sample size at the end of stage 1 was approximately 600.

In the second stage, a letter was sent to all Biology 12 teachers within participating districts. This letter contained a brief explanation of the process of the study and a request that their Biology 12 classes take part (see Appendix D). Of the 16 schools in which these teachers taught, teachers in 15 schools initially agreed. The single Biology 12 teacher in the remaining school explained that because of time considerations his classes would not be able to participate. All Biology 12 teachers in each of the remaining 15 sample schools agreed to administer the two forms of the examination to all students in their Biology 12 classes. Based on teacher enrollment figures, the total number of potential subjects was 597. This figure of 597 is somewhat high, as teachers who did not have exact figures readily available when contacted were asked to overestimate. As this estimate was used to determine the number of examination forms to be printed, overestimation was more desirable than underestimation.

The request for permission letters, along with administration instructions and the 'Request for Ethical Review' form, were sent to The University of British Columbia Behavioral Sciences Screening Committee For Research and Other Studies Involving Human Subjects. A copy of the Certificate of Approval received is provided in Appendix D.

Examination Administration

Students were tested by their teachers during the last four weeks of the school year. The test administration design and the estimated enrollment as reported by the teachers for their respective schools are shown in Table 5. As shown, Form A was scheduled first in 5 schools and Form B first in 10 schools. While there are advantages to interleaving the forms within classes, students within a given class were all given the same form at the same time. Further, all classes within a school were given the forms in the same order and on the same day. The concern that students would share information between test sittings was the reason for both practices.

Teachers were asked to allow up to a maximum of 5 school days between administrations and not to administer both forms at one sitting.

Table 5
Test Administration Design and Estimated Enrollment

Order               Schools     Estimated enrollment
AB (Form A first)       5               291
BA (Form B first)      10               306
Total                  15             n = 597

Note. Estimated enrollments for individual schools ranged from 7 to 130 students.

Beyond these instructions, teachers were allowed to schedule the examinations at their convenience. This was necessary given the time of year at which the testing was completed. During the last four weeks of school there is a general increase in the number of activities that impinge upon the time spent in the classroom. However, despite the existence of these activities, it was necessary to administer the tests toward the end of the school year, as the content examined required the students to have completed the core sections of the Biology 12 curriculum.

Teachers were asked not to inform the students following the first test administration that they (the students)
would be tested with the second form. Further, they were asked not to discuss content relevant to the examinations during the interval between testings.

The necessary number of test forms (see Appendix D) and instructions (see Appendix E) were sent to each cooperating school during the first week of May. Students were allowed 40 minutes to write each test form. Students recorded their responses on an NCS General Purpose Answer Sheet. The same sheet was used on both occasions, with Form A answers recorded on side 1 and Form B answers on side 2. Following both test administrations, teachers collected the answer sheets and returned them to the researcher for scoring and analysis.

Scoring and Data Preparation

The completed answer sheets were processed using the optical scanner maintained by the Educational Measurement Research Group (EMRG) at the University of British Columbia. The tests were scored and an item analysis was performed using LERTAP (Nelson, 1974). Scoring and statistical analyses were completed using various SPSS-X (SPSS Inc., 1988) computer programs. The analyses were performed on the Amdahl 470/V8 computer maintained by the University Computing Centre at the University of Alberta.

Preliminary Analyses

Response Rate

Altogether, answer sheets were received for 312 students, or 52.3% of total teacher estimated enrollment. After initial agreement to participate, teachers at two schools opted out of the study, reducing the potential sample size by 175 students. The reason given for opting out was lack of time. An additional 73 answer sheets from one school were lost in the mail. The mail loss, the two schools opting out, and the 312 completed forms account for 560 of the estimated 597 students. The difference of 37 (6.19%) is likely due to a number of causes. Students who had written one or neither of the two forms would not have had an answer sheet returned. Class enrollments may also have decreased slightly due to students dropping Biology 12 near the end of the school year.

From the answer sheets received, those from one school with seven students were deleted due to failure to follow instructions. The students in this school were allowed to begin their second form once they had completed their first form, rather than waiting until the next class, the stipulated minimum amount of time between testings. Eight students who, due to absence, completed only one of the two forms were removed from the sample. Lastly, 11 students who failed to see Item #52 on Form A were removed. This problem is discussed in detail in the preliminary classical item analysis below. Thus, the final number of usable Form A and Form B pairs was 286.

The loss of 248 students due to schools opting out and mail loss reduced the sample size considerably. The schools were urban schools that enrolled an estimated two to six classes of Biology 12. The 1990 Biology 12 examination means and standard deviations for the two schools that opted out revealed little difference between either of these schools and the means and standard deviations for the district. Based on these results, there is no suggestion of bias due to the loss of these two schools.

Preliminary Classical Item Analysis

The student responses were analyzed using LERTAP (Nelson, 1974). Inspection of the percentage of students who responded to each item in each form revealed that the last item in Form A had a larger nonresponse than the preceding 51 items. Twelve subjects (4.0%) of the sample did not respond. This nonresponse rate is substantially higher than the mean item nonresponse rate of 0.1% (SD = 0.2%) for the remaining items. The value of the point biserial, 0.00, for the nonresponse category indicates no correlation between the ability of the subjects and the lack of response to this item. Further, Item #52 appeared alone on the second to last page while the last page was blank. The placement of the item on the form, together with the item results, suggested that the 12 students likely did not see this item.

However, of the 12 students, one subject also omitted Item #51. Thus, in contrast to the other 11 students, it seemed plausible that this particular student omitted Item #52 for reasons other than poor placement of the item. Consequently, only 11 students were deleted from the sample.
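The nonresponse check just described is straightforward to reproduce. The point biserial for the nonresponse category is simply the Pearson correlation between a 0/1 "omitted Item #52" indicator and the total score; the vectors below are hypothetical stand-ins for the sample's answer sheets.

    # A minimal sketch of the point-biserial nonresponse check.
    import numpy as np

    def point_biserial(indicator, totals):
        """Pearson r between a dichotomous indicator and a continuous score."""
        indicator = np.asarray(indicator, dtype=float)
        totals = np.asarray(totals, dtype=float)
        return np.corrcoef(indicator, totals)[0, 1]

    omitted = np.array([0, 0, 1, 0, 1, 0, 0, 0])               # 1 = left #52 blank
    total_score = np.array([31, 40, 28, 35, 22, 45, 30, 38])   # hypothetical scores
    r_pb = point_biserial(omitted, total_score)
    # A value near 0.00, as observed, suggests omission was unrelated to ability.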
Order of Administration

The order of administration was tested with a 2x2 (order of administration-by-test form) fixed effects ANOVA with repeated measures on the second factor. The results are summarized in Table 6. As revealed in Table 6, there is no significant examination form order of administration effect (F = 1.33, p = .25), but there is a significant examination form effect (F = 25.18, p < .05).

Table 6
Summary ANOVA

Part of model                  df        MS         F        p
Between subjects
  Order                         1      209.51      1.33     .25
  Subjects(order)             284      156.99     12.61     .00
Within subjects
  Form                          1      313.56     25.18     .00
  Order x Form                  1       61.45      4.93     .03
  Form x Subjects(order)      284       12.45

There is also a significant examination form by administration order interaction (F = 4.93, p = .03). Of interest here was whether this interaction was attributable to within form differences between the two examination occasions. Examination of the cell (form by administration order) means and application of Scheffe's multiple comparison procedure revealed that these differences were not significant (p > .05). The significant interaction was attributable to differences between means on the two forms within schools in which the two test forms were administered in the same order. In the sample of eight schools in which Form B was administered first, the Form B means exceeded the Form A means in all but one school (a small school: n = 9). In the 3 schools in which Form A was administered first, the Form B means were greater (see Table 7).

Table 7
Cell Means

                          Order
                    AB               BA
n                   84              202
Form A       29.12 (56.00%)   28.51 (54.83%)
Form B       31.46 (60.50%)   29.42 (56.58%)

Deletion of the one anomalous school from the analysis revealed little change in the results. Consequently, given the lack of a known cause for this one school's results and the lack of significant change when this school was removed from the sample, the students in this school were retained.
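The 2x2 order-by-form analysis reported in Table 6 can be reproduced with modern tools. The following is a sketch using the pingouin package's mixed-design ANOVA in place of the original SPSS-X run; the long-format layout, column names, and scores below are hypothetical.

    # Order is the between-subjects factor; form is the repeated factor.
    import pandas as pd
    import pingouin as pg

    # One row per student per form.
    long = pd.DataFrame({
        "student": [1, 1, 2, 2, 3, 3, 4, 4],
        "order":   ["AB", "AB", "AB", "AB", "BA", "BA", "BA", "BA"],
        "form":    ["A", "B", "A", "B", "A", "B", "A", "B"],
        "score":   [29, 31, 30, 33, 28, 29, 26, 28],   # hypothetical scores
    })

    table = pg.mixed_anova(data=long, dv="score", within="form",
                           subject="student", between="order")
    print(table)   # rows for order, form, and the order-by-form interaction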
CHAPTER 5
ANALYSES AND RESULTS

Chapter Five begins with the presentation of the psychometric characteristics of the two forms. The characteristics of the two forms are then used to justify the equating techniques used. The second portion of the chapter consists of a description of the analyses and presentation of results in the sequence in which the analyses were conducted. The chapter concludes with an exploration of the source of the nonequivalence identified.

Pre-Equating Analyses

Statistical Properties of the Forms

Presented in Table 8 are the mean, standard deviation, range, internal consistency (Hoyt, 1941), standard error of measurement (SEM), skewness, and kurtosis of the distributions of scores on Form A and Form B. As shown, the mean scores of Form A, 28.69 (55.17%), and Form B, 30.02 (57.73%), are significantly different (t = 5.02, p < .05), while the standard deviations (t = .58, p = .56) and internal consistency estimates (t = 0.92, p = .36) are not (see Chapter 1, Hypotheses 1, 2, and 3). The values of skewness and kurtosis for Form A and for Form B are, respectively, within 1 standard error of each other. The intercorrelation coefficient corrected for attenuation (.97) exceeds the value of .95 that Linn (1975) describes as a commonly accepted cut-off for equating studies (see Chapter 1, Hypothesis 4).

Table 8
Psychometric Properties of the Two Forms (n = 286)

                              Form A     Form B
Mean                           28.69      30.02
Standard deviation              9.46       8.97
Minimum score                   9.00      12.00
Maximum score                  51.00      51.00
Internal consistency(a)          .89        .87
SEM                             3.14       3.15
Skewness                        0.24       0.28
Kurtosis                        2.13       2.24
Correlation coefficient              .85
Disattenuated correlation            .97

(a) Hoyt (1941)

Equating

Linear Equating

Angoff's Design II

Since the internal consistency of the two tests did not differ significantly, and the intercorrelation coefficient corrected for attenuation (.97) exceeded .95, Lord's (1950) linear equating method for equally reliable tests for Design II (Angoff, 1971) was used. Although Angoff (1982) described a variant of Design II in which the data are pooled across order of administration (see Chapter 2), the Form A and Form B samples were not initially combined. Before equating, the weighted means and variances for each form to be used for the linear equating were compared with the pooled estimates that Angoff suggested could be used. As shown in Table 9, the weighted means and variances and the corresponding pooled means and variances for Form A and for Form B were found to be essentially identical (less than 0.1% difference in the Form A and Form B variances). The equivalence of the weighted and pooled estimates means the equating results based on the pooled variances would be identical to those obtained using the weighted variances. All equatings were completed using the pooled data.

Table 9
Weighted and Pooled Means and Variances

Sample        Number of cases     Mean     Variance      SD
Form A
  AB                 84          29.12       99.60      9.98
  BA                202          28.51       85.56      9.25
  Weighted          286          28.69       89.42      9.46
  Pooled            286          28.69       89.39      9.46
Form B
  AB                 84          31.46       88.74      9.42
  BA                202          29.42       76.04      8.72
  Weighted          286          30.02       80.33      8.96
  Pooled            286          30.02       80.41      8.97

Linear Equating

The results of the linear equating using the pooled estimates are summarized in Table 10 and illustrated in Figure 1. As shown in Table 10, while the mean of the derived scores on Form B and the mean of the obtained scores on Form B are equal, the CRMS is large, 4.87. The value for the AAD is also large, 3.74. Further, the corresponding range of the differences between B and B*, -19 to 13, is wide.

Table 10
Results of the Linear Equating

Variable     Mean      SD        Range
Score A      28.69     9.46     9 to 51
Score B      30.02     8.97    12 to 51
Score B*     30.01     8.97    12 to 51
B - B*        0.01     4.87   -19 to 13
AAD           3.74     3.11     0 to 19

The linear conversion underlying these results is sketched below.
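For the linear method, standardized deviates on the two forms are set equal, so a Form A raw score x maps to B*(x) = mean_B + (SD_B / SD_A)(x - mean_A). The following is a minimal sketch using the pooled moments from Table 9.

    # Linear equating of Form A scores to the Form B scale.
    import numpy as np

    mean_a, sd_a = 28.69, np.sqrt(89.39)   # pooled Form A moments (Table 9)
    mean_b, sd_b = 30.02, np.sqrt(80.41)   # pooled Form B moments (Table 9)

    def linear_equate(x):
        """Derived Form B score (B*) for a Form A raw score x."""
        return mean_b + (sd_b / sd_a) * (x - mean_a)

    scores_a = np.arange(0, 53)            # possible raw scores, 0..52
    derived_b = linear_equate(scores_a)
    # e.g. linear_equate(15) is roughly 17.0, about 2 marks above the
    # identity line, matching the pattern seen for low-scoring examinees.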
Figure 1 is a graph of derived Form B scores versus Form A scores. The identity function, B = A, has been added to the graph to serve as a point of reference, as this is the line that would result if the two forms were perfectly equivalent.

[Figure 1. Linear equating: derived Form B scores plotted against Form A scores (0 to 60 on each axis), with the identity function B = A and the equating function shown.]

Examination of Figure 1 reveals that the significant difference of -1.33 (2.56%) observed between the mean on Form A and the mean on Form B (Table 10) does not appear to be constant across all performance levels. That is, the equating line is not parallel to the linear identity function B = A. In particular, students at the lower performance levels (below a score of 26 (50.0%) on Form A) would have received derived scores 2 marks (3.8%) above those they would have received had there been perfect equivalence. In contrast, for those in the middle range of performance the difference was about 1 mark, while the difference between the derived score and the score that would have been received had there been perfect equivalence for the most able examinees (above a score of 39 (75.0%) on Form A) was essentially zero.

These linear equating results lend support to the conclusion of nonequivalency derived from the significant difference of the means, and add the suggestion that the nonequivalency arises because of the poor fit between the derived scores and observed scores on Form B for the less able examinees. Collectively, these results confirm the finding that the two forms were not equivalent for the sample of students included in this study.

Equipercentile Equating

The steps followed to complete the equipercentile equating were as follows:

1. A table of relative cumulative frequencies was prepared separately for the distribution of scores on Form A and the distribution of scores on Form B.
2. The cumulative frequencies and the raw scores for each distribution were plotted on arithmetic graph paper with raw scores placed on the horizontal axis. Hand smoothed curves were drawn through the plots of the two distributions.
3. Score values from the smoothed plots were read and recorded for each percentile within each distribution.
4. The score values from step 3 for Form A were plotted against the score values for Form B and a smoothed curve drawn.
5. A table of equivalent score values was prepared from this final curve. (Angoff, 1971, pp. 571-576)

Tables of cumulative frequencies and score values (steps 1, 3, and 5) are provided in Appendix F. A computational analogue of these steps is sketched after Table 11.

Equipercentile equating results are summarized in Table 11 and illustrated in Figure 2. The means for Form A and Form B and the derived scores on B, along with their corresponding standard deviations, are reported in Table 11 together with the values of the CRMS and AAD. The mean difference, B - B*, (0.18) is likely due to rounding error and smoothing error at the various stages of the equipercentile equating process. The CRMS is large, 4.86. The value for the AAD is also large, 3.73. Further, the corresponding range of the differences between B and B*, -20 to 13, is wide.

Table 11
Results of Equipercentile Equating

Variable     Mean      SD        Range
Score A      28.69     9.46     9 to 51
Score B      30.02     8.97    12 to 51
Score B*     29.84     8.91    11 to 51
B - B*        0.18     4.86   -20 to 13
AAD           3.73     3.10     0 to 20
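The following is a computational sketch of steps 1 through 5 above, with linear interpolation standing in for the hand-smoothed curves (an assumption of this sketch; the original equating was done graphically). The score vectors are hypothetical stand-ins for the 286 observed scores on each form.

    # Equipercentile equating via matched percentile ranks.
    import numpy as np

    def percentile_ranks(scores, points):
        """Relative cumulative frequency (midpoint convention) of each
        value in `points` within the distribution `scores`."""
        scores = np.sort(scores)
        n = scores.size
        below = np.searchsorted(scores, points, side="left")
        at = np.searchsorted(scores, points, side="right") - below
        return (below + 0.5 * at) / n

    def equipercentile_equate(scores_a, scores_b, x_values):
        """Map Form A scores to Form B scores with equal percentile ranks."""
        p = percentile_ranks(scores_a, x_values)
        grid_b = np.arange(scores_b.min(), scores_b.max() + 1)
        p_b = percentile_ranks(scores_b, grid_b)
        return np.interp(p, p_b, grid_b)    # invert B's cumulative curve

    # Example with hypothetical score vectors:
    rng = np.random.default_rng(0)
    scores_a = rng.binomial(52, 0.55, size=286)
    scores_b = rng.binomial(52, 0.58, size=286)
    derived = equipercentile_equate(scores_a, scores_b, np.arange(9, 52))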
Examination of Figure 2 reveals that the significant difference of -1.33 (2.56%) observed between the mean on Form A and the mean on Form B (Table 11) does not appear to be constant across all performance levels. That is, the equating line is not parallel to the linear identity function B = A.

[Figure 2. Equipercentile equating: derived Form B scores plotted against Form A scores (0 to 60 on each axis), with the identity function B = A and the equating function shown.]

In particular, students at the lower performance levels (below a score of 22 (42.3%) on Form A) would have received derived scores 2 marks (3.8%) above those they would have received had there been perfect equivalence. In contrast, for those in the middle range of performance the difference was about 1 mark, while the difference between the derived score and the score that would have been received had there been perfect equivalence for the most able examinees (above a score of 37 (71.2%) on Form A) was essentially zero. These equipercentile equating results lend support to the conclusion of nonequivalency derived from the significant difference of the means and add the suggestion that the nonequivalency arises because of the poor fit between the derived scores and observed scores on Form B for the less able examinees.

Comparison of the Two Equatings

The results of the linear and equipercentile equatings were very similar. The conditional root mean squares for the two equatings were 4.87 (9.37%) and 4.86 (9.35%) respectively. The mean absolute differences between observed scores on Form B and predicted scores on Form B were 3.74 (7.19%) and 3.73 (7.17%) respectively.

The two equating results also suggest the two forms perform differently for varying ability levels. There is approximately a 2 mark difference at the lower level of examination performance, about a 1 mark difference for the middle range of scores, and no difference for the upper level of achievement.

Earlier in the chapter, given that the linear weighted and linear pooled means and variances agreed, the decision to pool was taken. Using the pooled data, the equipercentile equating results agreed with the results of the linear equating. As there is no advantage to equipercentile equating, as demonstrated by the CRMS and the AAD, all subsequent analyses were conducted using linear equating with the pooled sample of 286 students.

Differential Performance at Lower Achievement Levels

Both equatings revealed greater differences in the performance on the two forms by lower achieving students. This is contrary to a suggestion by Petersen, Kolen, and Hoover (1989) that test forms may appear equivalent at chance levels of achievement (p. 243). The interpretability of scores, particularly the lower scores, may be questionable due to the phenomenon of guessing. If these lower scores are suspect, then the significant differences found may in fact be due to guessing. Various studies indicate, however, that there is no clear advantage to either applying a correction-for-chance formula or removing chance scores from the sample (Angoff, 1989; Albanese, 1988; Donlon, 1981; Tinkelman, 1971).

In contrast, the performance of high scoring students was similar on both forms. This might be explained by the possible presence of a ceiling effect. If both forms were too easy, then high performing students would be expected to achieve similar and near perfect scores. However, the observed score distributions do not show the strong negative skewing (see Table 8) that would be expected if a ceiling effect were present. Therefore this explanation was rejected.

Assessment of Equivalence of Forms

CRMS and AAD

The conditional root mean square, CRMS, was described in Chapter 2 as an indicator of the degree to which a score earned by a student would differ from a score derived for that student from an equating table. Similarly, the absolute average difference, AAD, provides an indication of the degree to which an individual would achieve the same score on the two forms to be equated. Both indices are sketched below.
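The following is a minimal sketch of the two agreement indices, under the interpretation consistent with Tables 10 and 11: CRMS as the root mean square of the per-student difference B - B*, and AAD as the mean absolute difference.

    # CRMS and AAD for a Design II data set, where each student has both
    # an observed Form B score and a score derived from Form A.
    import numpy as np

    def crms(observed_b, derived_b):
        """Conditional root mean square error of equating."""
        d = np.asarray(observed_b, float) - np.asarray(derived_b, float)
        return np.sqrt(np.mean(d ** 2))

    def aad(observed_b, derived_b):
        """Average absolute difference between observed and derived scores."""
        d = np.asarray(observed_b, float) - np.asarray(derived_b, float)
        return np.mean(np.abs(d))

    # In this study both arguments would be length-286 vectors: each
    # student's actual Form B score and the B* derived from Form A.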
As reported in Table 8, the standard error of measurement (SEM) was 3.15 (6.1%), and the CRMS was 4.87 (9.37%) (Table 10). The SEM provides a logical lower bound for the CRMS: Form B cannot be more parallel to Form A than Form A is to itself, nor can Form A be more parallel to Form B than Form B is to itself. The CRMS was 1.55 times greater than the SEM.

The AAD, like the CRMS, can serve as an indicator of the usefulness or interpretability of the parallel form scores. As reported in Table 10, the AAD was 3.74 (7.19%). In the ideal situation, the AAD would be zero. This value is judged too large to be an indication that Form A and Form B are parallel forms.

Standard Error of Equating

The standard error of equating (s_D*) reflects the degree to which the equating results would vary if the equating was repeated with a different sample. The modified Dorans and Lawrence (1990) approach described in Chapter 2 allows a graphical interpretation of equivalence. As stated there, the confidence bands are established about the line that would be obtained given exact equivalency. As shown in Figure 3, the plot of the equating function cuts through the 95% confidence band limits about the identity function. As the difference between the equating function and the identity function exceeds the standard error of equating, the two forms are judged to be overall nonequivalent (see Chapter 1, Hypothesis 5).

[Figure 3. Standard error of equating, total examination: deviation of the equating function from the identity function plotted with 95% confidence bands.]

As suggested by Figures 1 and 2, and confirmed by Figure 3, while the forms are equivalent for the highest achievers, for most of the score range a difference that is too large to be attributed to equating error is found.

Source of Nonequivalence

As described in Chapter 3, prior p value information was available and used to form 24 pairs of items. One item from each pair was randomly assigned to each form. An additional 17 pairs of items in which only one item had a prior p value were created. Again, one item from each pair was assigned to each form so that 8 items with prior p value information were placed on one form and 9 items on the other form. For the remaining 11 pairs of items, judgements were made about item p value similarity prior to item assignment to form. The three different conditions of p value information within item pairs were the basis for the formation of three subtests.

Equating of Subtests

It was hypothesized that the lack of parallelism noted in the two forms was attributable to differences arising from an inability to match items on difficulty level when p values were not available. To test this hypothesis, the linear equating was replicated for each of the three subtests now formed within each test. The equating results were examined using the modified Dorans and Lawrence (1990) graphical analysis; a computational sketch of this check is given below. As the separate subtests are comprised of varying numbers of items, all reported figures on the graphs are percentages in order to facilitate comparison among the three.

As previously described, the first pair of subtests consisted of 24 items each. Difficulty indices (p values) were known for all these items prior to administration. The mean p value for the Form A subtest items was .64, while for the Form B subtest the mean p value was .65. Equivalent subtests were expected. As shown in Figure 4, the equating function lies between the standard error of equating bands (±1.96 s_D*); the two forms comprised of pairs of items for which p values were known for both items are equivalent.

[Figure 4. Standard error of equating, p values known for both items: the equating function stays within the 95% confidence bands.]
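The following sketches the graphical check used in Figures 3 through 6: the deviation of the equating function from the identity line, D(x) = B*(x) - x, is examined against ±1.96 standard-error bands placed about exact equivalency (D = 0). The analytic Design II standard error is not reproduced here; a bootstrap over students stands in for it, which is an assumption of this sketch rather than the thesis's computation.

    # Deviation-from-identity plot data with bootstrap error bands.
    import numpy as np

    def deviation_and_bands(scores_a, scores_b, x_values, n_boot=1000, seed=0):
        """Deviation of the linear equating function from identity, with a
        bootstrap estimate of the 95% band half-width at each score point."""
        rng = np.random.default_rng(seed)
        n = scores_a.size

        def equate(a, b, x):
            return b.mean() + (b.std(ddof=1) / a.std(ddof=1)) * (x - a.mean())

        deviation = equate(scores_a, scores_b, x_values) - x_values
        boots = np.empty((n_boot, x_values.size))
        for i in range(n_boot):
            idx = rng.integers(0, n, n)      # resample students (score pairs)
            boots[i] = equate(scores_a[idx], scores_b[idx], x_values) - x_values
        half_width = 1.96 * boots.std(axis=0, ddof=1)
        return deviation, half_width

    # Where |deviation| exceeds half_width over part of the score range,
    # the forms are judged nonequivalent in that range.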
The second set of subtests consisted of 17 items each. Eight items with known p values were placed on Form A, while the remaining nine items with known p values were placed on Form B. The Form A subtest mean p value was .62 for the items with prior p values. The Form B subtest had a mean p value of .63 for the items with prior p values. As with the first pair of subtests, equivalent tests were expected. As seen in Figure 5, the subtests in which the p value for only one item of each pair was known also resulted in equivalent forms.

[Figure 5. Standard error of equating, p value known for one item only: the equating function, plotted with its standard error bands about exact equivalency, stays within the bands over the score range.]

The third set of subtests consisted of 11 items each. None of these items had prior p values. The results for these subtests, formed from pairs of items for which the difficulty was qualitatively judged, are markedly different (see Figure 6). These subtests are not equivalent. The source of nonequivalence of the total forms appears to be due to the pairs of items for which p value data were not available for either item.

[Figure 6. Standard error of equating, p values unknown for both items: the equating function falls outside the confidence bands.]

CHAPTER 6
CONCLUSIONS

This chapter contains four sections. First, a summary of the study including its purpose, procedure, and findings is given. Limitations of the study are then discussed. The limitations are followed by the conclusions of the study. The chapter ends with implications for practice and implications for research.

Summary

Purpose and Problem

Questions have been raised concerning the equivalency of the January, June, and August forms of the British Columbia provincial Grade 12 examinations for a given subject. The implementation of a new procedure for constructing these forms began in the 1990/91 school year. The change in procedure was expected to improve the ability of the examination construction team to produce equivalent forms. The purpose of this study was to duplicate this new procedure and assess the equivalency of the forms that result.

Procedure

An examination construction team, all of whom had previous experience with the British Columbia Ministry of Education's Student Assessment Branch, simultaneously constructed two forms of a Biology 12 examination from a common table of specifications using a pool of multiple choice items from previous examinations. Some of the items had accompanying p value information at the time of examination construction, while other items lacked this information. A sample of students was obtained in the Okanagan, Thompson, and North Thompson areas of British Columbia. Both forms were administered to each student, as required by the test equating design chosen (Design II; Angoff, 1971). The final usable data sample consisted of responses from 286 students, with Form A administered first to a subsample of 84 students and Form B administered first to the other 202 students.

Analysis and Results

The data were first analyzed using a classical item analysis followed by a 2x2 order-by-form fixed effects ANOVA with repeated measures on the second factor.
Classical item analysis revealed that all items on both forms performed satisfactorily, ruling out the alternate hypothesis of a flawed item(s) being the cause of the lack of equivalence found. The ANOVA revealed a significant difference in the means of the two forms. While no significant order effect was found, a significant order-by-form interaction was present. No other significant differences between the two forms were found.

Linear and equipercentile equatings were carried out. Since the weighted and pooled estimates for the variances of the two forms were found to be equal, the pooled estimates were used for these analyses. Both equatings yielded essentially the same results, with the same degree of agreement between observed scores and derived scores (B derived from A) in the sample as estimated by the CRMS and the AAD. Both revealed that students at high levels of achievement found the forms of equal difficulty, while students at lower levels of achievement found one form to be easier than the other. Since the linear and equipercentile methods yielded similar results, and in light of the arbitrariness associated with equipercentile equating, all further analyses were conducted using the linear equating method and computation of the standard error of equating. The forms were judged to be nonequivalent because the CRMS and AAD were large and because the plot of the deviations of the equating function from the identity function fell outside the range of the standard error of equating estimated for the sample.

The source of the nonequivalency was examined by separating each of the two forms into three subtests based on the items possessing or lacking p values at the time of test construction. The first subtest pair of forms consisted of 24 pairs of items with prior p value information available for both items in the pair. The second subtest pair of forms was created from an additional 17 pairs of items in which only one item of the pair had a prior p value. The third pair of subtest forms was created from the remaining 11 pairs of items for which qualitative judgements had been made about item difficulty prior to item assignment to form. The forms were judged to be equivalent or nonequivalent based on whether the plot of the deviations of the equating function from the identity function fell inside or outside the range of the standard error of equating estimated for the sample. The first pair of subtests was judged to be equivalent, as was the second pair. Only the pair of subtests formed from items with no p value present for either item was judged to be nonequivalent.

Limitations

The generalizability of the study is limited by several factors. This study was restricted to multiple choice items in the core curriculum of one subject area. Factors not considered were supply items, optional areas within a subject area, and other subject areas. However, since the multiple choice items on the core areas are most probably the most reliable and stable items, if two equivalent forms of a test using only these items cannot be constructed, there is little hope that equivalency using the other forms of items or other topic areas can be achieved.

Regarding the sample, although there was sufficient range in scores to examine student performance at all plausible achievement levels (scores ranged from 17% to 98%) on the experimental examinations, the score distributions obtained cannot be assumed to represent provincial norms.
Conclusions

The central problem of this study was: will the new procedure for the construction of equivalent forms from a common table of specifications result in equivalent forms? The major conclusion is that the procedure in its present state cannot be relied on to produce equivalent forms. Subsequent investigation following the finding of nonequivalent forms suggests that the lack of equivalency results from the inability of an experienced examination assembly team to accurately match levels of difficulty for pairs of items which do not have prior item statistics accompanying them. Conversely, an experienced examination assembly team is able to produce equivalent forms if prior p value information is available. This study utilized an item pool in which the proportion of items with prior p value information was greatly in excess of the 20% goal for the 1990 examinations (see Appendix A). The lack of equivalency would be expected to worsen as the proportion of items with prior p values decreases.

Equivalency of forms can be judged using a combination of a fit statistic, the conditional root mean square (CRMS), and a measure of sample error in the equating, the standard error of equating (s_D*).

Implications

Implications for Practice

The procedure for examination construction used in this study cannot be expected to produce equivalent forms within a given subject area. The use of a trained and experienced examination construction team will not guarantee equivalent forms; items with p value information must be used.

The magnitude of the CRMS and AAD indicates that a student's score on one form cannot be taken as a reasonable estimate of the score that would have been obtained had a different form been written. This is of particular importance when interpreting the student scores obtained on the August supplemental examination. The score from the August supplemental examination is used in place of a January or June score that is either absent or unsatisfactory. The group writing the August supplemental cannot be assumed to be randomly equivalent to the larger population.

The awarding of provincial scholarships depends in part on scores achieved on these examinations. Scores from a student writing an examination in January, for example Biology 12, must be compared with scores for another student writing a different Biology 12 examination in June. The modified Dorans and Lawrence (1990) procedure could be used to judge January and June forms as equivalent or nonequivalent in the score range of interest. If a pair of forms is equivalent, no equated score need be used; the scores are interchangeable. If the January and June forms are judged to be nonequivalent in the score range of interest, then an equated score could be given to the January student. In either situation, the January and June examinees in a given subject area can be compared.

Implications for Future Research

The implication that classical item statistics can be used to produce parallel forms should be tested for multiple mark (supply) questions. The possibility of creating statistically parallel optional sections within an examination form should also be explored. Other subject areas should be investigated. There should be an attempt to duplicate the results that suggested equivalent examination forms were produced when item pairs in which only one member of a pair possessed prior item statistics were used.

REFERENCES

Albanese, M. A. (1988).
The projected impact of the correction for guessing on individual scores. Journal of Educational Measurement, 25, 149-157.

Anderson, J. O., Muir, W., Bateson, D. J., Blackmore, D., & Rogers, W. T. (1990). The impact of provincial examinations on education in British Columbia: General report. Victoria, BC: Ministry of Education.

Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed.) (pp. 508-600). Washington, DC: American Council on Education.

Angoff, W. H. (1982). Summary and derivation of equating methods used at ETS. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 55-69). New York: Academic Press.

Angoff, W. H. (1989). Does guessing really help? Journal of Educational Measurement, 26, 323-336.

Bianchini, J. C., & Loret, P. G. (1974). Anchor Test Study. Final report. Project report and volumes 1 through 30. (ERIC Nos. ED 092 061 through ED 092 631)

Bloom, B. S. (Ed.). (1956). Taxonomy of educational objectives. Handbook 1: The cognitive domain. New York: McKay.

Braun, H. I., & Holland, P. W. (1982). Observed-score equating: A mathematical analysis of some ETS equating procedures. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 9-49). New York: Academic Press.

Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York: Holt, Rinehart and Winston.

Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational measurement (2nd ed.) (pp. 443-507). Washington, DC: American Council on Education.

Donlon, T. F. (1981). Uninterpretable scores: Their implications for testing practice. Journal of Educational Measurement, 18, 213-219.

Dorans, N. J., & Lawrence, I. M. (1990). Checking the equivalence of nearly identical test editions. Applied Measurement in Education, 3, 245-254.

Flanagan, J. C. (1951). Units, scores, and norms. In E. F. Lindquist (Ed.), Educational measurement (pp. 695-763). Washington, DC: American Council on Education.

Glass, G. V., & Hopkins, K. D. (1984). Statistical methods in education and psychology. Englewood Cliffs, NJ: Prentice-Hall.

Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.

Hambleton, R. K. (1989). Principles and selected applications of item response theory. In R. L. Linn (Ed.), Educational measurement (3rd ed.) (pp. 147-200). New York: MacMillan.

Holmes, B. J. (1981). Individually administered intelligence tests: An application of anchor test norming and equating procedures in British Columbia. Unpublished doctoral dissertation, University of British Columbia, Vancouver.

Hoyt, C. J. (1941). Test reliability estimated by analysis of variance. Psychometrika, 6, 153-160.

Jaeger, R. M. (1973). The national test-equating study in reading (The anchor test study). NCME Measurement in Education, 4 (Whole No. 4).

Jarjoura, D., & Kolen, M. J. (1985). Standard errors of equipercentile equating for the common item nonequivalent populations design. Journal of Educational Statistics, 10, 143-160.

Kolen, M. J., & Whitney, D. R. (1982). Comparison of four procedures for equating the tests of general educational development. Journal of Educational Measurement, 19, 279-294.

Lindsay, C. A., & Prichard, M. A. (1971). An analytical procedure for the equipercentile method of equating tests. Journal of Educational Measurement, 8, 203-207.

Linn, R. L. (1975). Anchor Test Study: The long and the short of it. Journal of Educational Measurement, 12, 201-204.

Lord, F. M. (1950).
Notes on comparable scales for test scores (Research Bulletin). Princeton, NJ: Educational Testing Service.

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.

Lord, F. M. (1982). The standard error of equipercentile equating. Journal of Educational Statistics, 7, 165-174.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

Loret, P. G., Seder, A., Bianchini, J. E., & Vale, C. A. (1974). Anchor test study: Equivalence and norms tables for selected reading achievement tests (grades 4, 5, 6). Washington, DC: U.S. Government Printing Office.

Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed.) (pp. 13-102). New York: MacMillan.

Millman, J., & Linof, J. (1964). The comparability of fifth grade norms of the California, Iowa, and Metropolitan achievement tests. Journal of Educational Measurement, 1, 135-137.

Ministry of Education. (1983). Ministry policy circular. Victoria, BC: Author.

Ministry of Education. (1985). Let's talk about schools: A report to the Minister of Education and the people of British Columbia. Victoria, BC: Author.

Ministry of Education. (1988). Biology 12 report to schools. Victoria, BC: Author.

Ministry of Education. (1989). Report to schools. Victoria, BC: Author.

Ministry of Education. (1990). Flowchart and timelines for the production of the 1991 examinations (Draft). Victoria, BC: Author.

Nelson, L. R. (1974). Guide to LERTAP use and interpretation [Computer program manual]. Dunedin, New Zealand: University of Otago, Education Department.

Petersen, N. S., Cook, L. L., & Stocking, M. L. (1983). IRT versus conventional equating methods: A comparative study of scale stability. Journal of Educational Statistics, 8, 137-156.

Petersen, N. S., Kolen, M. J., & Hoover, H. D. (1989). Scaling, norming, and equating. In R. L. Linn (Ed.), Educational measurement (3rd ed.) (pp. 221-262). New York: MacMillan.

Petersen, N. S., Marco, G. L., & Stewart, E. E. (1982). A test of the adequacy of linear score equating models. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 71-136). New York: Academic Press.

Rogers, W. T. (1990). Current educational climate in relation to testing. Alberta Journal of Educational Research, 36, 52-64.

Rogers, W. T., & Holmes, B. J. (1987). Individually administered intelligence test scores: Equivalent or comparable? Alberta Journal of Educational Research, 33, 2-20.

Skaggs, G., & Lissitz, R. W. (1986). IRT test equating: Relevant issues and a review of recent research. Review of Educational Research, 56, 495-529.

SPSS Inc. (1988). SPSS-X user's guide (3rd ed.) [Computer program manual]. Chicago: Author.

Sullivan, B. M. (1988). A legacy for learners: The report of the Royal Commission on Education. Victoria, BC: Ministry of Education.

Tinkelman, S. N. (1971). Planning the objective test. In R. L. Thorndike (Ed.), Educational measurement (2nd ed.) (pp. 46-80). Washington, DC: American Council on Education.

Wesman, A. G. (1958). Comparability vs. equivalence of test scores. Test Service Bulletin No. 53. New York: The Psychological Corporation.

Appendix A
Ministry Examination Construction Process

[Appendix A reproduces the Ministry's scanned document 'FLOWCHART and TIMELINES for the Production of the 1991 Examinations (DRAFT)'. Phase 1: exam writing teams develop items sufficient for THREE provincial and THREE scholarship examinations, with team editing meetings required.]
[The remainder of the scanned flowchart outlines the subsequent stages: desktop publishing of the material submitted by the exam writing teams; an item pool drawing on the item banking project, with an ideal goal of 20% item-bank field-tested items on each examination; selection of items by the four-person Exam Preparation Team to produce three sets of examinations (each set comprising a provincial and a scholarship examination); preparation of review-ready first drafts by the Desktop Publishing Unit; a three-day external review of the three sets of examinations; processing of the suggested changes by the Preparation Team; a joint review meeting of reviewers and preparers; an internal review by the Student Assessment Branch; trial writing of the three sets of examinations; and delivery of camera-ready copy to the Queen's Printer for blue-line production.]

Appendix B
Materials for Construction Team

[The first page of Appendix B reproduces the scanned 1990 Biology 12 Provincial Examination Table of Specifications, showing topic by cognitive level (Knowledge, Understanding and Application, Higher Mental Processes) allocations for the full examination, including both multiple-choice and written-response sections, with handwritten revisions. A note explains that Methods and Principles questions (Experimental Design and Homeostasis) are embedded within the Cells and Humans topics.]

[The remaining pages of Appendix B reproduce the Ministry's multiple-choice summary tables supplied to the construction team: Table 5A, January 1987 (N = 1,924); Table 5B, June 1987 (N = 7,288); Table 5A, January 1988 (N = 2,139); and Table 5B, June 1988 (N = 7,992). For each item, the tables report the topic and cognitive level codes and the percentage of students choosing each alternative, with the correct response starred; the percentage choosing the correct response indicates the difficulty (p value) of the item. Items deleted at the recommendation of the markers are footnoted, and percentages are based on all students who responded to each item.]
Appendix C
Examination Forms

FORM I

NAME: ________    NUMBER: ________    SCHOOL: ________

BIOLOGY 12

GENERAL INSTRUCTIONS

1. Write your name and school in the space provided above.
2. DO NOT WRITE YOUR NAME ON THE ANSWER SHEET.
3. Ensure that the Form Number (I or II) on this booklet and the answer sheet are the same.
4. Ensure that the I.D. Number (1 to 600) on the answer sheet is the same as on the exam booklet.
5. When writing the second form (I or II), ensure that the I.D. Number is the SAME I.D. as on the first form (I or II).
6. When instructed to open this booklet, check the numbering of the pages to ensure that they are numbered in sequence from page 1 to the last page, which is identified by END OF EXAMINATION.
7. At the end of the examination, place your answer sheet inside the front cover of this booklet and return the booklet and your answer sheet to the supervisor.

Value: 52 marks (one mark per question)

Time: 40 minutes

INSTRUCTIONS: For each question, select the BEST answer and record your choice on the answer sheet provided.
Using an HB pencil, completely fill in the circle that has the letter corresponding to your answer.

Use the following diagram to answer question 1.

[Table of three-letter mRNA codons and the amino acids specified by the codons.]

1. Using the above table of codons, which amino acid sequence below is formed from the DNA strand?

   A T A C G A C A A G C C

   A. Tyrosine-alanine-valine-arginine.
   B. Tyrosine-arginine-leucine-alanine.
   C. Methionine-alanine-leucine-alanine.
   D. Methionine-arginine-glutamic acid-cysteine.

Use the following information to answer question 2.

   glucose + fructose -> sucrose + water

2. The above reaction represents

   A. reduction.
   B. synthesis.
   C. hydrolysis.
   D. denaturation.

Use the following diagram to answer question 3.

[Diagram of a section of a nucleic acid strand with four numbered boxes.]

3. The box which indicates a single nucleotide is

   A. 1.
   B. 2.
   C. 3.
   D. 4.

4. Which of the following would describe the tertiary structure of a protein?

   A. A chain of amino acids in a linear sequence.
   B. A molecule whose specific shape is spiral.
   C. A molecule characterized by side chains of amino acids.
   D. A molecule whose specific shape is determined by folding back on itself.

Use the following table to answer question 5.

   1. CELL MEMBRANE
   2. CELL WALL
   3. CHLOROPLAST
   4. NUCLEAR MEMBRANE
   5. RIBOSOME
   6. MITOCHONDRION

5. Which of the above structures are found in eukaryotic cells BUT NOT in prokaryotic cells?

   A. 1, 3, 5.
   B. 1, 2, 5.
   C. 2, 4, 6.
   D. 3, 4, 6.

9. The 5-carbon compound that reacts with carbon dioxide in the Calvin cycle is

   A. PGA (phosphoglyceric acid).
   B. RuBP (ribulose biphosphate).
   C. PGAL (phosphoglyceraldehyde).
   D. NADP (nicotinamide adenine dinucleotide phosphate).

Use the following diagram to answer question 10.

[Two graphs, X and Y, showing potential energy versus progress of reaction for the same reaction, with Ea = energy of activation.]

10. Ea is greater in graph X than in graph Y. The BEST explanation for this is that the energy of activation in Y is

   A. raised by the addition of an enzyme.
   B. lowered by the addition of an enzyme.
   C. raised by the addition of more substrate.
   D. lowered by the addition of more substrate.

Use the following graph to answer question 11.

[Graph of rate of reaction versus substrate concentration at optimum temperature; the curve plateaus and then rises again at a point labelled C.]

The above graph contains data on the interaction of an enzyme at optimum temperature with varying amounts of substrate.

11. Which statement BEST explains why the graph rises at C?

   A. More enzyme was added.
   B. More substrate was added.
   C. The temperature was lowered.
   D. A competitive inhibitor was added.

12.
12. The cell membrane surrounds some molecules in solution and takes
    them into the cell.  This is an example of

    A.  diffusion.
    B.  pinocytosis.
    C.  phagocytosis.
    D.  facilitated transport.

13. During yeast cell fermentation, pyruvic acid is converted to

    A.  alcohol.
    B.  glycogen.
    C.  lactic acid.
    D.  active acetate.

Page - 6

Use the following diagram to answer question 14.

[Diagram: two compartments, X and Y, separated by a membrane
permeable to salt and water; X contains a salt solution.]

14. If the experimental set-up illustrated above were allowed to sit
    for 5 hours, what net movement of salt and water would occur?

    A.  Salt moves from Y to X; water moves from X to Y.
    B.  Salt moves from X to Y; water moves from Y to X.
    C.  Salt moves from Y to X; water moves from Y to X.
    D.  Salt moves from X to Y; water moves from X to Y.

Use the following diagram to answer question 15.

[Diagram: the stages of cellular respiration, numbered 1 to 4.]

15. In aerobic respiration, most of the ATP is produced at

    A.  1.
    B.  2.
    C.  3.
    D.  4.

Page - 7

16. The difference between active transport and facilitated
    transport is that ONLY

    A.  active transport requires ATP.
    B.  facilitated transport requires ATP.
    C.  active transport uses carrier molecules.
    D.  facilitated transport uses carrier molecules.

17. Which of the following occurs in BOTH cyclic and noncyclic
    photophosphorylation?

    A.  O2 is produced.
    B.  ATP is produced.
    C.  CO2 is broken down.
    D.  H2O is a hydrogen donor.

Use the following diagram to answer question 18.

[Diagram: CO2, glucose, and visible light connected by numbered
pathways.]

18. The name given to the pathway labelled "1" is

    A.  hydrolysis.
    B.  Krebs cycle.
    C.  Calvin cycle.
    D.  photophosphorylation.

19. The production of glucose occurs during

    A.  aerobic respiration.
    B.  anaerobic respiration.
    C.  the carbon dioxide-reducing reactions.
    D.  the light-capturing reactions.

20. Which of the following is classified as an organ?

    A.  Bone.
    B.  Skin.
    C.  Blood.
    D.  Muscle.

21. Where in the digestive tract does the chemical digestion of
    protein begin?

    A.  Mouth.
    B.  Stomach.
    C.  Small intestine.
    D.  Large intestine.

22. Protein is digested by

    A.  lipase.
    B.  trypsin.
    C.  amylase.
    D.  secretin.

23. Which enzyme catalyses the following reaction?

        fat droplets + water  --->  glycerol + fatty acids

    A.  Lipase.
    B.  Pepsin.
    C.  Maltase.
    D.  Trypsin.

24. An example of chemical digestion is

    A.  chewing.
    B.  absorption.
    C.  hydrolysis.
    D.  peristalsis.

25. Which of the following is NOT a component of gastric juice?

    A.  Water.
    B.  Pepsin.
    C.  Amylase.
    D.  Hydrochloric acid.

26. Blockage of the bile duct would MOST LIKELY

    A.  decrease bile production.
    B.  affect the digestion of fat.
    C.  raise the pH of the duodenum.
    D.  decrease the quantity of feces.

Page - 9

Use the following diagram to answer question 27.

[Diagram: part of the digestive tract, with a structure labelled X.]

27. Identify the structure labelled X in the above diagram.

    A.  Cecum.
    B.  Duodenum.
    C.  Cardiac sphincter.
    D.  Pyloric sphincter.

28. Which ion is necessary for blood clotting?

    A.  Iron.
    B.  Sodium.
    C.  Calcium.
    D.  Chloride.
29. Which of the following are absorbed by lacteals?

    A.  Starches.
    B.  Fatty acids.
    C.  Amino acids.
    D.  Monosaccharides.

Page - 10

30. MOST of the carbon dioxide transported to the lungs by the
    circulatory system is in the form of

    A.  carbonic acid.
    B.  bicarbonate ions.
    C.  calcium carbonate.
    D.  carbonic anhydrase.

31. If a person with blood group A receives type AB blood in a
    transfusion, which of the following occurs?

    A.  Agglutination.
    B.  Clotting.
    C.  Erythroblastosis.
    D.  No reaction.

32. Which blood vessels have the greatest surface area to volume
    ratio?

    A.  Veins.
    B.  Venules.
    C.  Arterioles.
    D.  Capillaries.

33. In a blood pressure reading of 120/80, 80 represents the
    pressure

    A.  in the capillaries.
    B.  in the veins.
    C.  when the ventricles have relaxed.
    D.  when the ventricles have contracted.

Page - 11

Use the following diagram to answer question 34.

[Diagram: a sequence of blood vessels with points labelled X, Y,
and Z.]

34. Which of the following is true of blood velocity?

    A.  Velocity is fastest at Z, slowest at X.
    B.  Velocity is fastest at Y, slowest at Z.
    C.  Velocity is fastest at X, slowest at Y.
    D.  Velocity is fastest at X, slowest at Z.

35. Which of the following is associated with increased heartbeat
    and breathing rate?

    A.  Aldosterone.
    B.  Adrenalin.
    C.  Acetylcholine.
    D.  Cholinesterase.

36. What type of neuron transmits an impulse from a receptor to the
    CNS (central nervous system)?

    A.  Motor.
    B.  Efferent.
    C.  Sensory.
    D.  Interneuron.

37. The correct pathway of an impulse passing through a reflex arc
    is

    A.  receptor, sensory neuron, motor neuron, effector.
    B.  effector, sensory neuron, motor neuron, receptor.
    C.  sensory neuron, motor neuron, effector, receptor.
    D.  motor neuron, receptor, sensory neuron, effector.

Page - 12

38. Which of the following describes an effect of alcohol on an
    individual?

    A.  Narcotic.
    B.  Stimulant.
    C.  Depressant.
    D.  Hallucinogen.

39. The control of heartbeat, breathing, and reflex activities lies
    with the

    A.  cerebrum.
    B.  thalamus.
    C.  cerebellum.
    D.  medulla oblongata.

40. The sodium/potassium pump in neurons is involved with

    A.  excretion of salts.
    B.  resting potential.
    C.  synaptic transmission.
    D.  threshold level.

Use the following diagram to answer question 41.

[Graph: an ACTION POTENTIAL, with membrane potential running from
-50 mV through 0 mV to +40 mV and stages labelled W, X, Y, and Z.]

41. During which stage in the above graph is energy used?

    A.  W
    B.  X
    C.  Y
    D.  Z

Page - 13

Use the following diagram to answer question 42.

[Diagram: a neuron with structures numbered; not reproducible from
the scan.]

42. The structure labelled 2 in the above diagram is

    A.  an axon.
    B.  a dendrite.
    C.  a nucleus.
    D.  a node of Ranvier.

43. Receptors in the breathing centre of the medulla are stimulated
    by

    A.  low O2 levels.
    B.  high O2 levels.
    C.  low CO2 levels.
    D.  high CO2 levels.

44. Urea is produced in the

    A.  liver.
    B.  kidney.
    C.  pancreas.
    D.  large intestine.

45. During expiration (exhalation) the diaphragm

    A.  relaxes and the rib cage lifts.
    B.  relaxes and the rib cage drops.
    C.  contracts and the rib cage lifts.
    D.  contracts and the rib cage drops.

46. Which would be an indication of kidney malfunction?

    A.  Salts in the loop of Henle.
    B.  Glucose in the glomerulus.
    C.  Hemoglobin in the Bowman's capsule.
    D.  Uric acid in the collecting ducts.

Page - 14

47. In the kidney, nutrients and salt ions are selectively
    reabsorbed into the

    A.  renal artery.
    B.  collecting duct.
    C.  afferent arteriole.
    D.  peritubular capillaries.

48. Fluids appear in the Bowman's capsule as a result of

    A.  reabsorption.
    B.  blood pressure.
    C.  osmotic pressure.
    D.  tubular excretion.

49. If a person has had very little to drink on a hot day, the
    pituitary gland would respond by secreting more

    A.  adrenalin.
    B.  thyroxin.
    C.  PTH (parathormone).
    D.  ADH (antidiuretic hormone).

50. A hormone is injected into muscle tissue.  The muscle cells are
    observed to increase their rate of cellular respiration.  This
    hormone was MOST LIKELY obtained from the

    A.  testes.
    B.  thyroid.
    C.  pancreas.
    D.  posterior pituitary.

51. Hormones travel to their target cells by way of

    A.  villi.
    B.  ducts.
    C.  lacteals.
    D.  capillaries.

Page - 15

52. Which of the following BEST displays a graphical representation
    of a homeostatic mechanism?

[The four response graphs for question 52 are not reproducible from
the scan.]

Page - 16

FORM II

NAME: __________________    NUMBER: ________

SCHOOL: __________________

BIOLOGY 12

GENERAL INSTRUCTIONS

1.  Write your name and school in the space provided above.
2.  DO NOT WRITE YOUR NAME ON THE ANSWER SHEET.
3.  Ensure that the Form Number (I or II) on this booklet and the
    answer sheet are the same.
4.  Ensure that the I.D. Number (1 to 600) on the answer sheet is
    the same as on the exam booklet.
5.  When writing the second form (I or II), ensure that the I.D.
    Number is the SAME I.D. as on the first form (I or II).
6.  When instructed to open this booklet, check the numbering of the
    pages to ensure that they are numbered in sequence from page 1
    to the last page, which is identified by END OF EXAMINATION.
7.  At the end of the examination, place your answer sheet inside
    the front cover of this booklet and return the booklet and your
    answer sheet to the supervisor.

Value:  52 marks (one mark per question)        Time:  40 minutes

INSTRUCTIONS:  For each question, select the BEST answer and record
your choice on the answer sheet provided.  Using an HB pencil,
completely fill in the circle that has the letter corresponding to
your answer.

Use the following diagram to answer question 1.

[Diagram: the flow of genetic information in a cell, with locations
labelled W, X, Y, and Z.]

1.  Which is the correct relationship?

    A.  DNA at X produces mRNA (messenger RNA) which is used at W.
    B.  DNA at Y produces mRNA (messenger RNA) which is used at X.
    C.  DNA at W is used to make protein at X.
    D.  DNA at X is used to make protein at Z.

2.  A carbon compound has 2 hydrogen atoms for every oxygen atom.
    This compound is a

    A.  fat.
    B.  protein.
    C.  nucleotide.
    D.  carbohydrate.

Page - 1

Use the following chart to answer question 3.

        BASE COMPOSITION OF A PIG'S THYMUS CELLS

        X       Y       Guanine     Z
        27%     21%     21%         27%

The above chart shows an experimental analysis of the DNA bases
contained in the cells of a pig's thymus.

3.  The correct identification of the bases X, Y and Z respectively
    is

    A.  adenine, cytosine, thymine.
    B.  thymine, adenine, cytosine.
    C.  cytosine, adenine, thymine.
    D.  cytosine, thymine, adenine.

4.  The anticodon on tRNA is

    A.  identical to the codon on mRNA.
    B.  identical to the triplet code on rRNA.
    C.  complementary to the codon on mRNA.
    D.  complementary to the triplet code on DNA.

5.  DNA molecules isolated from onion and human cells differ in
    their

    A.  type of sugars.
    B.  sequence of bases.
    C.  number of strands.
    D.  order of phosphates.

Page - 2

Use the following diagram to answer question 6.

[Diagram: a cell with organelles labelled; one structure is
labelled Z.]

6.  Identify the structure labelled Z in the above diagram.

    A.  Lysosome.
    B.  Ribosomes.
    C.  Golgi apparatus.
    D.  Endoplasmic reticulum.

7.  According to the fluid mosaic model of the cell membrane,

    A.  proteins are scattered throughout a double layer of
        phospholipid.
    B.  proteins are sandwiched between two layers of cellulose.
    C.  phospholipids are sandwiched between two layers of protein.
    D.  phospholipids are scattered throughout a double layer of
        cellulose.

Page - 3

Use the following chart to answer question 8.

[Diagram: two compartments, X and Y, separated by a SEMI-PERMEABLE
MEMBRANE; Y contains 2% salt and 98% water, and X contains a salt
solution that is 92% water.]

8.  In the above experimental situation, osmosis is the term used to
    describe the movement of

    A.  salt from X to Y.
    B.  salt from Y to X.
    C.  water from Y to X.
    D.  water from X to Y.

9.  Which cellular function would be interfered with if lysosomes
    were destroyed?

    A.  RNA synthesis.
    B.  Cell secretion.
    C.  Cell digestion.
    D.  Packaging of molecules.

Use the following information to answer question 10.

        SUBSTRATE A  --->  SUBSTRATE B
            NAD      --->  NADH2

10. What kind of chemical reaction does NAD ---> NADH2 represent?

    A.  Reduction.
    B.  Synthesis.
    C.  Oxidation.
    D.  Phosphorylation.

Use the following diagram to answer question 11.

        X  --enzyme 1-->  Y  --enzyme 2-->  Z
        ^                                   |
        +---- inhibition (negative feedback)

11. With reference to the above diagram, when the amount of "Z"
    increases in a cell, enzyme "1" will convert

    A.  less X to Y.
    B.  less Y to X.
    C.  more X to Y.
    D.  more Y to X.

12. The cell process which uses energy to bring substances into the
    cell is

    A.  osmosis.
    B.  diffusion.
    C.  active transport.
    D.  facilitated transport.

13. With all other factors constant, which of the following will
    increase the quantity of the end product of an enzyme-catalysed
    reaction?

    A.  Increase the temperature to 100 C.
    B.  Increase the amount of the substrate.
    C.  Maintain the pH.
    D.  Maintain the reaction at body temperature.

14. The Calvin cycle reactions in photosynthesis produce

    A.  ATP.
    B.  water.
    C.  oxygen.
    D.  carbohydrates.

Page - 5

Use the following graph to answer question 15.

[Graph: OXYGEN PRODUCTION DURING PHOTOSYNTHESIS vs. TEMPERATURE;
amount of oxygen released plotted against temperature from 5 to
25 degrees C.]

15. According to the above graph, which of the following conditions
    must be kept constant?

    A.  Temperature and relative humidity.
    B.  Water supply and oxygen concentration.
    C.  Light intensity and carbon dioxide concentration.
    D.  Oxygen and carbon dioxide concentration.

16. Substances which may contribute atoms to an enzyme-catalyzed
    reaction are called

    A.  inhibitors.
    B.  coenzymes.
    C.  apoenzymes.
    D.  heavy metals.

17. The Calvin cycle in photosynthesis uses energy from

    A.  NADPH2 produced only in the thylakoids.
    B.  NADPH2 and ATP produced in the thylakoids.
    C.  NADPH2 and ATP produced in the stroma.
    D.  NADPH2 produced only in the stroma.

18. A lowered concentration of environmental carbon dioxide would
    affect a plant by
    A.  raising its rate of respiration.
    B.  lowering its rate of respiration.
    C.  raising its rate of photosynthesis.
    D.  lowering its rate of photosynthesis.

19. During the light-capturing reactions of photosynthesis, light
    energy is

    A.  used to make glucose.
    B.  used to make carbon dioxide.
    C.  changed into three carbon sugars.
    D.  transformed into chemical energy.

20. Which one of the following is an example of a tissue?

    A.  Eye.
    B.  Skin.
    C.  Blood.
    D.  Pancreas.

21. Blockage of the substances leaving the pancreas would

    A.  decrease bile production.
    B.  raise the pH of the duodenum.
    C.  decrease the quantity of feces.
    D.  affect the digestion of protein.

22. The conversion of amino acids to glucose occurs mainly in the

    A.  liver.
    B.  thymus.
    C.  spleen.
    D.  pancreas.

23. Which of the following would produce the greatest amount of
    energy per gram?

    A.  Fat.
    B.  Sugar.
    C.  Starch.
    D.  Protein.

24. The correct sequence of structures that food contacts as it
    moves along the digestive system is

    A.  mouth, stomach, large intestine, anus.
    B.  pharynx, small intestine, stomach, large intestine, anus.
    C.  mouth, esophagus, stomach, small intestine, large intestine.
    D.  esophagus, pharynx, stomach, large intestine, small
        intestine.

Page - 7

25. The substance secreted by the walls of the stomach that prevents
    the stomach from digesting itself is

    A.  water.
    B.  mucus.
    C.  pepsin.
    D.  gastrin.

26. Which digestive enzyme is incorrectly matched to its substrate?

    A.  Lipase - fat.
    B.  Pepsin - protein.
    C.  Trypsin - nucleic acid.
    D.  Salivary amylase - starch.

Use the following information to answer question 27.

[Diagram: the digestive organs, numbered 1 to 4.]

27. Which number represents the organ that secretes amylase?

    A.  1.
    B.  2.
    C.  3.
    D.  4.

Page - 8

28. Deoxygenated hemoglobin is found in its greatest concentration
    in

    A.  pulmonary veins.
    B.  systemic arterioles.
    C.  pulmonary arteries.
    D.  coronary arteries.

29. Failure of the lymphatic system to collect tissue fluids results
    in

    A.  infection.
    B.  tissue swelling.
    C.  excessive urination.
    D.  growth of adipose tissue.

30. Which of the following are necessary for the production of
    thrombin in an area of injured tissue?

    A.  Fibrin and calcium ions.
    B.  Fibrin and thromboplastin.
    C.  Calcium ions and prothrombin.
    D.  Fibrinogen and thromboplastin.

31. A blood pressure reading of 165/100 is characteristic of

    A.  hypotension.
    B.  hypertension.
    C.  a normal resting adult.
    D.  insufficient dietary salt.

32. Blood plasma with fibrinogen removed is known as

    A.  serum.
    B.  lymph.
    C.  formed elements.
    D.  tissue fluid.

Page - 9

Use the following diagram to answer question 33.

[Diagram: the heart and its major vessels, with one vessel
labelled X.]

33. Which blood vessel is labelled X in the above diagram?

    A.  Anterior vena cava.
    B.  Pulmonary trunk.
    C.  Pulmonary vein.
    D.  Aorta.

34. In the fetus, blood passes from the right atrium to the left
    atrium through the

    A.  semi-lunar valves.
    B.  ductus venosus (venous duct).
    C.  foramen ovale (oval opening).
    D.  ductus arteriosus (arterial duct).
35. Which part of the nervous system regulates body temperature?

    A.  Thalamus.
    B.  Cerebellum.
    C.  Hypothalamus.
    D.  Cerebral cortex.

36. A sensory neuron carries information

    A.  to muscles.
    B.  from the central nervous system.
    C.  to the central nervous system.
    D.  between interneurons.

Page - 10

37. When frightened, a person's pupil size increases.  The part of
    the nervous system that controls this response is the

    A.  sensory nervous system.
    B.  somatic nervous system.
    C.  sympathetic nervous system.
    D.  parasympathetic nervous system.

38. Sodium ions are moved out of a neuron by

    A.  passive transport.
    B.  diffusion.
    C.  osmosis.
    D.  active transport.

39. The type of sensation experienced as a result of a nerve impulse
    depends on the

    A.  area of the brain stimulated.
    B.  part of the spinal cord that is stimulated.
    C.  strength of the impulse.
    D.  type of effector involved.

40. Damage to the occipital lobes of the brain may result in
    impaired

    A.  smell.
    B.  speech.
    C.  vision.
    D.  hearing.

Page - 11

Use the following diagram to answer question 41.

[Diagram: components of a nervous pathway numbered 1 to 4.]

41. Which number represents an effector?

    A.  1
    B.  2
    C.  3
    D.  4

42. Which of the following causes human lungs to inflate?

    A.  Contraction of abdominal muscles.
    B.  Contraction of the diaphragm.
    C.  Relaxation of the diaphragm.
    D.  Relaxation of intercostal muscles.

43. Below the pharynx, which is the correct descending anatomical
    order?

    A.  Larynx, epiglottis, trachea, alveoli.
    B.  Epiglottis, trachea, larynx, alveoli.
    C.  Epiglottis, larynx, trachea, alveoli.
    D.  Larynx, trachea, epiglottis, alveoli.

44. If airplane passengers were to experience a lower level of
    atmospheric oxygen, what response would occur immediately?

    A.  The heart rate would decrease.
    B.  Calcitonin levels would increase.
    C.  The breathing rate would increase.
    D.  More hemoglobin would be produced.

Page - 12

45. After prolonged sweating and little water intake, your body
    would respond by

    A.  secreting more ADH.
    B.  reducing aldosterone levels.
    C.  increasing urine production.
    D.  decreasing collecting duct permeability.

46. Which of the following is the BEST indicator of abnormal kidney
    function?

    A.  Glucose in the urine.
    B.  Protein in the urine.
    C.  Hormones in the urine.
    D.  Electrolytes in the urine.

47. Hemoglobin pigments are excreted by the

    A.  liver.
    B.  marrow.
    C.  spleen.
    D.  pancreas.

Use the following diagram to answer question 48.

[Diagram: a kidney nephron with regions numbered 1 to 4.]

48. Pressure filtration occurs at the region labelled

    A.  1
    B.  2
    C.  3
    D.  4

Page - 13

49. Damage to the posterior pituitary gland would have an effect on

    A.  adrenal glands.
    B.  kidney function.
    C.  the pancreas.
    D.  the thyroid gland.

50. Which of the following glands has both an endocrine and an
    exocrine function?

    A.  Adrenals.
    B.  Thyroid.
    C.  Pituitary.
    D.  Pancreas.

51. Deamination of amino acids produces ammonia which is then
    converted to

    A.  urea.
    B.  protein.
    C.  creatinine.
    D.  bile.
52. Which hormone is a secretion of the adrenal medulla?

    A.  Cortisol.
    B.  Adrenalin.
    C.  Aldosterone.
    D.  Testosterone.

END OF EXAMINATION

Page - 14

Appendix D
Request and Permission Forms

CONSENT FORM

I, _______________________, Superintendent of School District
_________, consent to the involvement of Biology 12 students in the
examination equivalency study as outlined in the letter from Peter
MacMillan.

Peter MacMillan has permission to contact any or all Biology 12
teachers in the district requesting their participation.  It is the
teachers' professional decision as to whether or not they wish their
classes to participate.

Appendix E
Teacher Administration Instructions

ADMINISTRATION INSTRUCTIONS

TO THE TEACHER

Advance Information

Students may be told ahead of time that they will be writing two
forms of an examination that is equivalent to the multiple choice
section of the Biology 12 government examination.

Students should be told one class in advance of when an exam
writing will occur.

Students may be given any information about the study that you wish
to give BUT only after the second form has been written.

If you or the students wish any further information or have any
comments about the study, please send the requests or comments to
me when the answer sheets are being returned.

Prior to the Examination

Please check that you have equal numbers of FORM I and FORM II of
the examination.  Feel free to xerox extra exam copies if
necessary.  Extra answer sheets have been included with the package.

Your school has been selected by lot to write either FORM I or
FORM II first.  Please ensure that all students in all classes in
your school write the two forms in the sequence selected for you.

Please attempt to coordinate so that discussion between classes in
which the forms are written is minimized.

The other form should be written as soon as feasible after the
first form; back to back or within two to three class periods would
be ideal.

At the Time of Administration

Give the students answer sheets and tell them TO PUT NAMES ON THEM
BUT NOT TO BUBBLE IN THEIR NAMES.

Review directions for marking bubble sheets found on side 2 of the
answer sheet.

Have students bubble in the sex, grade, and birth date columns.
DO NOT bubble in any I.D. number, even if you are intending to mark
the sheets using a scanner in your school.

IF YOUR SCHOOL IS SELECTED TO WRITE FORM I FIRST, BUBBLE IN THE '0'
BUBBLE OF SPECIAL CODE K.  IF YOU ARE WRITING FORM II FIRST, BUBBLE
IN THE '0' BUBBLE OF SPECIAL CODE L.

WHEN WRITING FORM I, BUBBLE IN ANSWERS 1 TO 52.  WHEN WRITING
FORM II, BUBBLE IN ANSWERS 101 TO 152.
The students should read the Student Instructions, replacing
general instructions 2 to 5 when appropriate.

All other rules shall be the same as for the Biology 12 government
examination, e.g. calculator usage, scrap paper, non-assistance.
The exception is that the students writing are supervised by their
classroom teacher.

After Administration of First Form

Collect all test booklets with answer sheets placed in the
booklets.

Check for names, but remove any bubbled-in names on answer sheets.

Separate and store examination booklets and answer sheets in a
secure location.

DO NOT RETURN, OR MARK, OR DISCUSS THE EXAM AT THIS TIME!

Administration of the Second Form

Return the student answer sheets to the original owners.

Follow the same administration procedure as for the first form.

Appendix F
Equipercentile Equating Tables

13 Sep 90 22:01:32    SPSS-X RELEASE 3.0 FOR IBM MTS
                      University of Alberta

SCOREA

(Value labels were blank; valid percent equals percent, as there
were no missing cases.)

VALUE    FREQUENCY    PERCENT    CUM PERCENT
   9          1          .3           .3
  10          1          .3           .7
  12          2          .7          1.4
  13          4         1.4          2.8
  14          4         1.4          4.2
  15          7         2.4          6.6
  16          5         1.7          8.4
  17          9         3.1         11.5
  18         15         5.2         16.8
  19          9         3.1         19.9
  20          7         2.4         22.4
  21         16         5.6         28.0
  22         11         3.8         31.8
  23          8         2.8         34.6
  24          9         3.1         37.8
  25         11         3.8         41.6
  26          9         3.1         44.8
  27          9         3.1         47.9
  28          9         3.1         51.0
  29         14         4.9         55.9
  30          6         2.1         58.0
  31         12         4.2         62.2
  32         10         3.5         65.7
  33          8         2.8         68.5
  34         12         4.2         72.7
  35          5         1.7         74.5
  36          3         1.0         75.5
  37          7         2.4         78.0
  38          8         2.8         80.8
  39          9         3.1         83.9
  40          6         2.1         86.0
  41          8         2.8         88.8
  42          7         2.4         91.3
  43          2          .7         92.0
  44          6         2.1         94.1
  45          5         1.7         95.8
  46          4         1.4         97.2
  47          2          .7         97.9
  48          5         1.7         99.7
  51          1          .3        100.0
TOTAL       286       100.0

PERCENTILE    VALUE
   1.00       11.740
 100.00       [not legible in the scan]

VALID CASES  286      MISSING CASES  0

13 Sep 90 22:01:32    SPSS-X RELEASE 3.0 FOR IBM MTS
                      University of Alberta

SCOREB

(Value labels were blank; valid percent equals percent, as there
were no missing cases.)

VALUE    FREQUENCY    PERCENT    CUM PERCENT
  12          3         1.0          1.0
  14          1          .3          1.4
  15          3         1.0          2.4
  16          4         1.4          3.8
  17          3         1.0          4.9
  18          7         2.4          7.3
  19         15         5.2         12.6
  20         11         3.8         16.4
  21         10         3.5         19.9
  22          9         3.1         23.1
  23         15         5.2         28.3
  24         19         6.6         35.0
  25          6         2.1         37.1
  26          9         3.1         40.2
  27         12         4.2         44.4
  28          7         2.4         46.9
  29          9         3.1         50.0
  30         11         3.8         53.8
  31         10         3.5         57.3
  32         14         4.9         62.2
  33          7         2.4         64.7
  34          5         1.7         66.4
  35         15         5.2         71.7
  36         11         3.8         75.5
  37          3         1.0         76.6
  38         10         3.5         80.1
  39          6         2.1         82.2
  40          9         3.1         85.3
  41          8         2.8         88.1
  42          7         2.4         90.6
  43          5         1.7         92.3
  44          2          .7         93.0
  45          3         1.0         94.1
  46          3         1.0         95.1
  47          5         1.7         96.9
  48          4         1.4         98.3
  49          2          .7         99.0
  50          1          .3         99.3
  51          2          .7        100.0
TOTAL       286       100.0

PERCENTILE    VALUE
   1.00       12.000
 100.00       [not legible in the scan]

VALID CASES  286      MISSING CASES  0
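The CUM PERCENT columns above are the raw material for the equating
tables that follow.  As a rough illustration of the computation
(this sketch is illustrative only; the study used SPSS-X and LERTAP,
and the function and variable names here are hypothetical), a
frequency table of this form can be produced as follows:

    from collections import Counter

    def frequency_table(scores):
        # Build VALUE / FREQUENCY / PERCENT / CUM PERCENT rows in
        # the style of the SPSS-X FREQUENCIES output shown above.
        n = len(scores)
        counts = Counter(scores)
        rows, cum = [], 0
        for value in sorted(counts):
            cum += counts[value]
            rows.append((value, counts[value],
                         100.0 * counts[value] / n, 100.0 * cum / n))
        return rows

    # Hypothetical fragment of Form A raw scores (out of 52):
    score_a = [9, 10, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14]
    for value, freq, pct, cum_pct in frequency_table(score_a):
        print(f"{value:5d} {freq:10d} {pct:9.1f} {cum_pct:12.1f}")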
Table F3

Equipercentile Equating (Pooled Variance Estimates)

Score A    Score B*    Difference
  1-8         --           --
    9         11           -2
   10         12           -2
   11         13           -2
   12         14           -2
   13         15           -2
   14         16           -2
   15         17           -2
   16         18           -2
   17         19           -2
   18         20           -2
   19         21           -2
   20         22           -2
   21         23           -2
   22         24           -2
   23         24           -1
   24         25           -1
   25         26           -1
   26         27           -1
   27         28           -1
   28         29           -1
   29         30           -1
   30         31           -1
   31         32           -1
   32         33           -1
   33         34           -1
   34         35           -1
   35         36           -1
   36         37           -1
   37         37            0
   38         38            0
   39         39            0
   40         40            0
   41         41            0
   42         42            0
   43         43            0
   44         45           -1
   45         46           -1
   46         47           -1
   47         48           -1
   48         49           -1
   49         50           -1
   50         51           -1
   51         52           -1
   52         --           --

Table F4

Cumulative Percentiles (CP) and Scores on Form A and Form B

        Score               Score               Score
CP      A     B      CP     A     B      CP     A     B
 1    11.0  12.0     35   23.2  24.6     68   32.8  34.2
 2    12.5  14.6     36   23.5  24.8     69   33.3  34.4
 3    13.2  15.5     37   23.7  25.1     70   33.5  34.6
 4    13.8  16.3     38   24.1  25.4     71   33.9  34.9
 5    14.4  17.0     39   24.3  25.6     72   34.4  35.1
 6    15.0  17.7     40   24.6  25.9     73   34.7  35.5
 7    15.3  17.9     41   24.9  26.1     74   35.0  35.8
 8    15.8  18.2     42   25.2  26.5     75   35.6  36.1
 9    16.2  18.4     43   25.5  26.7     76   36.0  36.6
10    16.6  18.6     44   25.8  27.0     77   36.4  37.0
11    16.9  18.8     45   26.1  27.4     78   36.8  37.3
12    17.2  19.0     46   26.4  27.7     79   37.2  37.7
13    17.4  19.3     47   26.7  28.0     80   37.5  38.0
14    17.7  19.5     48   27.0  28.3     81   37.8  38.4
15    17.9  19.7     49   27.3  28.6     82   38.3  38.8
16    18.1  20.0     50   27.7  29.0     83   38.7  39.?
17    18.3  20.3     51   28.0  29.2     84   39.1  39.6
18    18.5  20.5     52   28.2  29.5     85   39.5  39.9
19    18.9  20.9     53   28.5  29.7     86   40.0  40.3
20    19.1  21.0     54   28.7  29.8     87   40.4  40.7
21    19.5  21.3     55   28.9  30.3     88   40.8  41.1
22    19.7  21.6     56   29.3  30.6     89   41.1  41.5
23    19.9  21.9     57   29.7  30.9     90   41.5  41.8
24    20.2  22.1     58   30.0  31.2     91   42.0  42.3
25    20.5  22.3     59   30.2  31.5     92   42.6  43.0
26    20.7  22.6     60   30.5  31.7     93   43.3  43.8
27    20.9  22.8     61   30.7  31.9     94   44.0  44.6
28    21.1  23.1     62   31.0  32.1     95   44.7  45.4
29    21.4  23.2     63   31.3  32.4     96   45.2  46.3
30    21.7  23.4     64   31.6  32.9     97   46.0  47.0
31    21.9  23.7     65   31.9  33.1     98   46.9  48.8
32    22.2  23.8     66   32.2  33.6     99   47.5  49.0
33    22.3  24.0     67   32.6  34.0    100   51.0  51.0
34    22.8  24.2

(The final digit of the Form B score at CP 83 is not legible in the
scan.)
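Tables F3 and F4 embody the equipercentile rule: a Form A score and
a Form B score are treated as equivalent when they cut off the same
cumulative percentage of examinees.  A minimal sketch of that
mapping, assuming unsmoothed integer score distributions (the names
below are hypothetical, and this is not the code used in the
study), is:

    def percentile_rank(dist, x):
        # Mid-percentile rank of score x in a {score: frequency}
        # table: all frequencies below x plus half the frequency
        # at x.
        n = sum(dist.values())
        below = sum(f for s, f in dist.items() if s < x)
        return 100.0 * (below + 0.5 * dist.get(x, 0)) / n

    def score_at_percentile(dist, p):
        # Smallest score whose cumulative percentage reaches p -- a
        # coarse inverse of the cumulative distribution in Table F4.
        n = sum(dist.values())
        cum = 0
        for s in sorted(dist):
            cum += dist[s]
            if 100.0 * cum / n >= p:
                return s
        return max(dist)

    def equipercentile_table(dist_a, dist_b, max_score=52):
        # Form A -> Form B conversion in the spirit of Table F3:
        # each Form A score maps to the Form B score of (roughly)
        # equal percentile rank.
        return {a: score_at_percentile(dist_b,
                                       percentile_rank(dist_a, a))
                for a in range(max_score + 1)}

In practice the conversion is computed on continuized (interpolated)
distributions rather than on raw integers, which is why Table F4
reports fractional score values at each cumulative percentile.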
