UBC Theses and Dissertations

Examinee control of item order effects on latent trait model and classical model test statistics Scales, Michael J. 1990

EXAMINEE CONTROL OF ITEM ORDER EFFECTS ON LATENT TRAIT MODEL AND CLASSICAL MODEL TEST STATISTICS

by

MICHAEL J. SCALES

B.Ed., University of Victoria, 1978

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS

in

THE FACULTY OF GRADUATE STUDIES
(Department of Educational Psychology and Special Education)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

July 1990

(c) Michael Scales, 1990

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of
The University of British Columbia
Vancouver, Canada
DE-6 (2/88)

ABSTRACT

The purpose of this study was to determine what effect changes in the item order had on classical and on latent trait test statistics. As well, comparisons were made between students who were allowed to answer the questions in any order, and students who were required to answer the questions in the order presented in the test booklet. The results were then analyzed using the student's ability level as an additional independent factor.

Four different formats of a forty item mathematics test were used with 590 students in grade eight. Half of the booklets had the items sequenced from easiest to hardest. The other booklets were sequenced from hardest to easiest.
In addition, half of the tests of each sequence had special directions which prevented students from altering the given item difficulty sequence. The classroom teachers provided a rating of each student's ability in mathematics.

The order of the items was found to have a significant effect. Tests which were sequenced from hard to easy had a lower mean score. Although students with test booklets with restrictive directions had lower scores on average, the difference was not statistically significant. There were no significant interactions found. Classical and latent trait item difficulty statistics showed a high degree of correlation.

It was concluded that under certain circumstances, the order of the items could affect both classical and latent trait statistics. It was also recommended that care should be taken when assumptions are made about parallel forms or local independence.

TABLE OF CONTENTS

I. INTRODUCTION
    Context of the Research Problem
    Purpose of the Study
II. REVIEW OF LITERATURE
    Initial Studies
    Critical Re-examinations
    Initial Studies: Conclusions
    Anxiety and Item Order
    Achievement and Item Order
    Other Interactions
    Interaction Research: Conclusions
    Simple Effect Re-examined
    Controlling Other Factors
    Test Wiseness
    Latent Trait and Context Effects
    Context Effect and Item Order
    Summary and Conclusions
III. PROBLEM
    Statement of the Problem
    Rationale
    Hypotheses
IV. METHOD
    Design
    Subjects
    Instrument and Tasks
    Procedure
    Analysis
V. RESULTS
    Main Effects
    Interactions
    Item Difficulties
VI. SUMMARY AND CONCLUSIONS
    Purpose of The Study
    Sequence
    Directions
    Ability
    Interactions
    Latent Trait
    Limitations
    Implications
    Future Research
REFERENCES
APPENDIX I: Teacher Instructions
APPENDIX II: Test Booklet Covers and End Pages
APPENDIX III: P-level Item Analysis Data and B-value Item Parameter Estimates
APPENDIX IV: Sample Test Booklet, Format 3
AUTHOR INDEX

List of Tables

Table 1: List of Item Order Research: Results, Types, and Samples
Table 2: Test Means and Sample Sizes of Student Ability Levels
Table 3: Summary of Analysis of Variance of Test Scores by Ability (Ab), Item Order (Or), and Test Directions (Dir)
Table 4: Summary of Multivariate Analysis of Variance of Test Item P-levels
Table 5: Summary of Multivariate Analysis of Variance of Test Item B-values
Table 6: Test Format P-level Means, B-value Means, and Score Means
Table 7: Pearson Correlation Coefficients of P-levels and B-values for Easy to Hard (EH), Hard to Easy (HE), Restricted (R), and Unrestricted (U) Test Formats
Table 8: Summary of Analysis of Variance of Theta Values by Ability (Ab), Item Order (Or), and Test Directions (Dir)

List of Figures

Figure 1: Low Ability Students
Figure 2: High Ability Students

ACKNOWLEDGEMENTS

I would like to thank the members of the thesis committee for their assistance. Dr. Robert Conry, Dr. Donald Allison, and Dr. David Bateson provided me with much needed advice and support under what I am sure were conditions of very short notice.
I would also like to thank the many teachers and students who provided me with their greatly appreciated cooperation.

Finally, I would like to express my deepest gratitude to my wife, Margot Lakoduk, my son, Riordan, and my daughter, Alysha. Their sacrifices on my behalf have been sincerely appreciated.

Chapter I

Introduction

Context of the Research Problem

From the time multiple choice tests were considered the "new-type examinations" (Ruch, 1929) to the present descriptions of computerized marking systems (Hopkins & Antes, 1985), many textbook authors have advised that multiple choice tests should be arranged with the easiest questions at the beginning and the hardest questions at the end. This advice has had a great deal of intuitive appeal and assumed certainty. For example, one author states:

    The level of difficulty of objective test items is used as a basis for arranging these items in a test by placing the easy ones first, the more difficult ones later, and the most difficult ones last. Such an arrangement has advantages for the average and below average pupil. With this kind of test he uses the testing time allowed more efficiently, and his morale is improved. If the difficult test items appear first, many pupils of average or low achievement will waste a great deal of time trying to answer them. They may fail to answer easier test items later in the test because so much time was spent on the first ones. Moreover, they may quickly become discouraged or even hostile. On the other hand, if the easier test items are listed first, these same pupils will at first make smooth progress in the test, and consequently feel encouraged. When they later encounter the more difficult test items, they no doubt will have time to attack them. Even if they fail to answer some of them correctly, as will very likely happen, the resulting disappointment will be moderated by the knowledge that they already have responded to some items in a manner that is probably correct. (Ahmann & Glock, 1963, p. 115)

However, despite such conviction about what examinees will no doubt do and feel, empirical research does not support this same lack of doubt. Research over the years has been inconclusive. It is not a certainty that the arrangement of test items will make a difference to the score of the examinee. In fact, the issue of item order effects has been an area of research for nearly forty years. As Leary and Dorans (1985) pointed out, the research has reflected the interests and statistical abilities of the times. So, while the accuracy of tests has improved, the need for more precise statistics has also increased. As a result, item order continues to be a concern due to conflicting research results; one researcher will conclude that item order has no significant effect (Allison, 1984) whereas another researcher will conclude that the effect is significant (Hambleton & Traub, 1974).
In fact, Lane, Bull, Kundert, and Newman (1987) reported finding significant order effects in their first study and non-significant effects in their second study.

The first research on item order began in the early 1950s and examined the simple main effect of item order on classical test statistics. Researchers wanted to test the axiom that tests should be constructed with the easiest questions first. A variety of arrangements were tried such as easy to hard, hard to easy, random, and spiralling. Some initial studies reported a significant effect (Mollenkopf, 1950; MacNicol, 1956; Sax & Carr, 1960; Sax & Cromack, 1966; Flaugher, Melton, & Myers, 1968; Sirotnik & Wellington, 1974; Hambleton & Traub, 1974; Kleinke, 1980; Hodson, 1984). However, some later research reported that item order did not make a significant difference (Brenner, 1964; Huck & Bowers, 1972; Monk & Stallings, 1970; Klosner & Gellman, 1973; Kestenbaum & Weiner, 1970; Allison, 1984).

In the late 1960s there was a concern about the emotional state of the exam takers and their level of anxiety, so the emphasis of the research shifted to examine these internal states. In addition, statistical techniques using factor analysis were in more common usage, and researchers could examine the interaction effect of reported anxiety level and test results. As in previous research, a variety of item arrangements were used.
The results of this research were also mixed, with some studies reporting significant effects (Munz & Smouse, 1968; Smouse & Munz, 1969; Towle & Merrill, 1975; Plake, Ansorge, Parker, & Lowry, 1982) whereas other research found the main effects and the interaction effects to be non-significant (French & Greer, 1964; Smouse & Munz, 1968; Berger, Munz, Smouse, & Angelino, 1969; Marso, 1970; Munz & Jacobs, 1971; Plake, 1980; Plake, Thompson, & Lowry, 1980; Plake, Melican, Carter, & Shaughnessy, 1983; Plake & Ansorge, 1984; Klimko, 1984).

Recent research has returned to a concern about simple item order, but the researchers have begun to use a more modern computerized analysis involving latent trait models of test statistics rather than classical test statistics. The limited number of studies to date have reported significant effects of item order on some item parameters (Whitely & Dawis, 1976; Yen, 1980; Kingston & Dorans, 1984).

This recent research raises some important issues. For one, if item order has a significant effect, then some of the results of previous research may be questionable since they may have lacked the power or statistical sophistication to detect an item order effect. Previous concerns and conclusions may have to be re-examined in light of new findings. Of course, this discrepancy between the latent trait model findings and some of the classical model findings may be due to fundamental differences in the types of factors under study.
This research may also bring into question a basic premise of the latent trait model that each item is locally independent. If item order has an effect such that the probability of getting one item right is affected by the probability of getting some other question right, then the assumption of local independence is violated. Therefore, any test that did show item order effects would not be perfectly suitable for using latent trait model item parameters. Hopefully, latent trait models are robust and can tolerate small violations of some basic assumptions; however, the extent and effect of this source of error need further research.

Research into item order must investigate several areas of growing concern. Not only must the item order effect reported by latent trait model studies be examined, but also the source of the mixed results among the classical model studies must be considered. It is unclear whether the classical model studies lacked the power or the sensitivity of the latent trait studies or whether some of the classical model studies did not properly control a factor in their design which may have influenced the results they obtained. Nevertheless, comparisons between present and past research are called for to shed further light on an ongoing measurement problem.
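
The local independence assumption discussed above is usually written as a factorization of the joint response probability. The statement below is a standard textbook form and is not taken from the thesis; the symbols $U_i$ (response to item $i$), $u_i$ (its observed value), $\theta$ (the latent trait), and $n$ (the number of items) are assumed notation.

```latex
% Local independence: given the latent trait, item responses are independent.
P(U_1 = u_1, \ldots, U_n = u_n \mid \theta) = \prod_{i=1}^{n} P(U_i = u_i \mid \theta)

% An item order (context) effect would mean that, for some items i and j,
P(U_j = u_j \mid \theta, U_i = u_i) \neq P(U_j = u_j \mid \theta)
```

Under this reading, an item whose probability of success still depends on an earlier item's outcome after conditioning on $\theta$ breaks the product form, which is the violation the text describes.
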
One study, using classical statistics, that did find a significant effect from item order suggested that previous studies without significant results had been in error since they did not control for within-subject rearrangement of test item order (Hambleton & Traub, 1974). If subjects are allowed to skip hard questions and do the easy ones first, Hambleton and Traub reasoned, then the effect of item order would be masked and the differences between item order arrangements would appear to be insignificant. Hambleton and Traub, as a result, developed a mathematics test with a test booklet format that prevented within-subject rearrangement. Their significant results do question the validity of the findings of previous research. However, the lack of a control group that did not have a restricted format in Hambleton and Traub's study limits the generalizability of their findings. Still, if Hambleton and Traub's findings are correct, then within-subject rearrangement is perhaps a random error factor that may have been causing the inconsistent results in this field. In addition, within-subject rearrangement may be such a significant factor that all examinees should be made aware of its potential so that they might use it when they are taking a test, just as examiners must be aware of its capacity to cause error when they design a test.
There is a need to replicate Hambleton and Traub's research of examinee control of item order, but with a control group, to determine if item order and examinee control of the order are significant factors. In addition, both latent trait and classical statistics could be used in the analysis to determine if the results are duplicated with both types of statistics.

Another area of concern is the effect that item order has on low achieving students. Surprisingly, even though much of the concern over item order involved this interaction effect with low achieving students, only four studies included this as a factor (Sax & Cromack, 1966; Klosner & Gellman, 1973; Hodson, 1984; Allison, 1984). The results were inconclusive, but further research was recommended (Klosner & Gellman, 1973).

Purpose of the Study

This study replicated the procedures of the item order research of Hambleton and Traub (1974). In addition, this study examined the effects of within-subject rearrangement, as defined by Hambleton and Traub, by using an additional group of students as a control group who were not given test booklets with restrictive directions.

Another aspect studied was a comparison of the performance of high ability students with low ability students. An easy to hard arrangement has been generally believed to be of benefit to low ability students who are supposedly easily frustrated by the hard to easy arrangement.
On the other hand, high ability students may be able to avoid this frustration by using within-subject rearrangement when test booklets do not prevent such changes to the item order. This study used a large sample of students with a wide range of ability levels to compare the performance of students with different ability levels under easy to hard or hard to easy item difficulty sequences and under restricted or unrestricted test booklet formats.

A final aspect studied was a comparison of item difficulty statistics based on classical test theory with item difficulty statistics based on latent trait models. This was done to determine if studies using latent trait statistics are comparable to studies using classical statistics.

Chapter II

Review of Literature

Initial Studies

Initial research did tend to support the view that the context of a test item could influence the score of the examinee. Mollenkopf (1950) was the first to study the effects of changing item order. In addition, his study examined the effect of reducing time limits. He used 382 grade 11 and 12 students who were divided into four groups to take one of two forms of a combined verbal and mathematics exam. Then each group was assigned to finish their test under one of two timing conditions. Item order was modified only slightly. The tests were rearranged by sections with each section arranged internally from easy to hard, and with content areas kept together.
The time limits had more substantial changes. The time limits were 1 hour 45 minutes for half the students, while the other half were given only 35 minutes. The group with the short time limit was, however, allowed to complete the test with a different coloured pencil.

Most of Mollenkopf's results were as he expected. For one, changing the order of whole sections did not cause any changes in the performance of the students on the mathematics test. In addition, decreasing the time limits caused a deterioration in performance. Mollenkopf recommended that, to get useful test statistics, time limits should be long enough to allow at least half of the students to complete the test.

One unexpected result was that with only minimal re-arrangement of the verbal test items, there was a statistically significant change in the difficulty level of those items (p < .05). Items placed at the end of the test had a lower proportion of correct responses. He dismissed this finding as a small insignificant error possibly related to fatigue that could be ignored by test developers.

Another early study into item order was an unpublished report by K. MacNicol in 1956. It was cited by several authors (Flaugher, Melton, & Myers, 1968; Monk & Stallings, 1970; Hambleton & Traub, 1974; Plake, 1980; Hodson, 1984; Leary & Dorans, 1985; Lane, Bull, Kundert, & Newman, 1987). According to Leary and Dorans (1985), MacNicol randomly gave 1,500 high school students one of three forms of a verbal analogies test.
The mean of the hard to easy arrangement was significantly lower than that of the easy to hard arrangement whereas the random arrangement was not significantly different from the easy to hard arrangement. Unfortunately, the 30 minute time limit on a 50 item test may have been a factor, particularly since some students reportedly did not finish the test.

Further research was conducted by Sax and Carr (1962), who used 325 college freshmen taking two forms of the Henmon Nelson Mental Ability Test for College Students. The test was arranged in two forms. One form had the test unaltered with the difficulty levels alternating among easy, medium or hard, and with the content categories intermixed. This type of arrangement is called spiral-omnibus form. The other type of arrangement was to regroup the items into their three content types: vocabulary, mathematics, and spatial relationships. All students took both forms. Sax and Carr tried to reduce the effects of speed by increasing the time limit from the recommended 30 minutes to 40 minutes. Unfortunately, many items were not completed by the students, so speed was a confounding factor.

The conclusion reached by Sax and Carr was that the order of the items did make a difference (p < .001). Students got more answers correct with the spiral-omnibus format. In addition, students omitted fewer items at the end on the spiral-omnibus form. The greatest number of omissions occurred in the mathematics section of the content based test.
They concluded that the presence of increasingly complex items tends to discourage students from responding to the more difficult items.

Critical Re-examinations

It was a classroom teacher who wanted to solve the practical problem of whether or not he could randomly rearrange test items from a test item bank to create several forms of the same test. He wanted to have two forms of the test in class to prevent cheating in crowded situations, and he wanted to change his test format over the years without writing all new items each year or jeopardizing the security of his items. M. H. Brenner (1964) used the results of his Educational Psychology 407 midterm tests to compare the reliability, discrimination and difficulty statistics of rearranged pairs of tests administered over four terms. Brenner compared easy to hard arrangements against hard to easy arrangements. As well, he compared an easy to hard order on the first ten items with a hard to easy order on the first ten items. On both forms, the last thirty items were in random order. Unfortunately, he did not report the number of subjects involved, nor did he adequately describe the statistical tests that he used to analyze his multiple comparisons.

Brenner reported only one statistically significant difference in twelve comparisons. One pair of tests had significantly different (p < .05) discrimination indexes. However, without an adequate description of the type of t-test used, this could merely be a chance event.
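
The worry that one significant result in twelve comparisons could be a chance event can be made concrete with a short calculation. The sketch below is illustrative only: it assumes the twelve comparisons were independent and that each was tested at the .05 level, neither of which Brenner's report confirms, and the function names are hypothetical.

```python
def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent,
    which is an illustrative assumption rather than a reported fact.
    """
    return 1.0 - (1.0 - alpha) ** comparisons


def expected_false_positives(alpha: float, comparisons: int) -> float:
    """Expected count of false positives when every null hypothesis is true."""
    return alpha * comparisons


if __name__ == "__main__":
    alpha = 0.05       # per-comparison significance level cited in the text
    comparisons = 12   # number of comparisons Brenner reported
    rate = familywise_error_rate(alpha, comparisons)
    count = expected_false_positives(alpha, comparisons)
    print(f"P(at least one false positive) = {rate:.3f}")
    print(f"Expected false positives       = {count:.2f}")
```

Under these assumptions the chance of at least one spurious significant result is roughly 46 percent, so a single significant pair out of twelve is weak evidence of a real item order effect.
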
Since his results indicated that changing of the item order did not make a difference in student performance, he recommended that college instructors not bother arranging test items based on item difficulties. The generalizability of this study is limited since fourth year education students taking a required course are a very motivated and sophisticated group of students. It seems unlikely that such students would become discouraged by any arrangement.

Other researchers also wanted to find out if item order made a difference, especially G. Sax, whose first study (Sax and Carr, 1962) found a significant effect from the item arrangement. In his earlier research he had found that students writing a spiral format test did better than students who had the more traditional increasing difficulty arrangement. These findings were in contrast to the commonly held view that recommended easy to hard arrangements (Ahmann and Glock, 1963).

As a result, Sax and Cromack (1966) rearranged the Henmon-Nelson Tests of Mental Ability into the following four forms: easy to hard, hard to easy, spiral, and random. The four forms were then administered to 467 first year college students who were allowed one of two time limits. Half the students had a generous 48 minutes, which is 18 more than the manual suggests, while the other half had the suggested 30 minutes. In addition, cumulative grade points of all students were used as a covariate factor in the analysis of the results to divide the group into high ability and low ability.
Predictably, Sax and Cromack found that students given more time performed better on the tests. In addition, their study found that if a restrictive time limit was imposed, then the mean of the easy to hard form was significantly higher than the mean of the hard to easy arrangement (p < .001). The other two arrangements were not significantly different. On the other hand, if there were longer time limits, then the arrangement did not make a difference. In comparing the results of students with high grade point averages and low grade point averages there were no significant differences for time or order except when students were given greater amounts of time and were answering questions on a hard to easy format test. In this case, high achieving students performed better. This unusual interaction was significant at the .05 level.

Despite finding that under certain circumstances low achieving students do in fact have more difficulty with one order than another, Sax and Cromack concluded "...little is gained in arranging items if time limits are generous." However, while it may be true that there is no evidence that one arrangement will help low achieving students, the higher scores by high achieving students on the hard to easy arrangement may indicate a difference in attitude that is related to item order effects. In this study, although low achievers did not seem to be discouraged, high achieving students may have been challenged by the unusual format and actually performed better as a result.
Item order may affect people differently, and as a result have both positive and negative effects. Of course, this study is not indicative of a wide population since the college students involved probably performed within a very limited, but very high, range of achievement.

Until 1968, most studies on item order involved small classroom samples. Large scale test developers such as the College Entrance Examination Board needed to know if they could rearrange small banks of items on different tests without adverse effects. Flaugher, Melton, and Myers (1968) used the C.E.E.B.'s Scholastic Aptitude Test to test 5,000 college applicants with 4 different forms. The arrangements varied the easy to hard arrangement within blocks of five similar content items, and varied the sequence of the content based blocks. Students had 30 minutes to complete 40 verbal type questions and 30 minutes to complete the 25 mathematics items. Although this study did not involve substantial changes in item order, they found that under their somewhat speeded condition, item order did make a difference on verbal items (p < .001). They did not find a difference with differing arrangements of mathematics questions. They concluded that since some of the relatively easy verbal items occurred last and were omitted by some students, differing numbers of unanswered questions were a factor. Item order, they felt, was an error factor, but an error factor smaller than the test's standard error of measurement.
Nonetheless, this factor would have to be considered if tests involved item rearrangement.

Initial Studies: Conclusions

To summarize the findings of the early research up to the late 1960s, time limits were shown to have a definite impact. Item statistics became more prone to item order effects as more questions were omitted by the students. One effect of time limits in some studies was that questions which were not reached or omitted were given only a random chance level of being correct. This would cause easy questions that were completed by all students at the beginning of one test to be reported as more difficult if placed at the end, where they would not be reached by students taking the hard to easy test. In addition, fatigue or frustration are other factors that might account for easy questions at the beginning of one test seeming to be hard at the end of another. The more speeded the test becomes, the more error and uncertainty develop.

Anxiety and Item Order

Researchers in the late sixties began to turn their attention to more internal responses of the students. Concerns over the anxiety and stress level of the students prompted researchers to examine these variables. One of the first studies to consider item order and stress was by French and Greer (1964). The study involved 152 first grade students. The students were given four different versions of the Pictorial Test of Intelligence in a counter-rotated order.
The four forms were either easy to hard within subtests, easy to hard within the whole test, random, or a spiral of two easy and one hard. In addition to the P.T.I., students also took the California Test of Mental Maturity, the General Anxiety Scale for Children, and the Test Anxiety Scale for Children. The children were also rated on the P.T.I. Behavioral Rating Scale and measured for skin resistance on a polygraph recorder.

Despite the large number of assessment techniques, little data was reported. The only results that French and Greer reported to be significant were the data that indicated that regardless of the order, the first performance out of the four exposures was lower. There were not any increases in galvanometer readings as a result of changing the order. The authors concluded that if the P.T.I. was used with a similar group of students, they would not be sensitive to different item arrangements.

Unfortunately, their sample of students limits the applicability of the French and Greer study. For one, the reported I.Q. scores of the students ranged from 100 to 125. Most of the concern about item order involves the frustration of low ability students, but their sample did not include low ability students. In addition, first grade students may not have had sufficient school experience to find tests stressful or frustrating. Also, the item arrangements given did not involve the possibly most frustrating arrangement of hard to easy.
The authors did mention that this sample may not have been sufficiently anxious, nor were the pictures of the test arousing enough to register any reaction on the galvanometer.

In addition to improving the generalizability of their study, French and Greer could have reported more details about their findings. For one, they did not provide the details of the difficulty level of the items used. Secondly, the results of the anxiety measures were not reported, and finally there was not a factorial analysis to determine if students with high anxiety had reactions to any particular order that were different from those of students with low anxiety.

Researchers not only began to take an interest in anxiety in the late 1960s, but they also began to use the F test statistics to look at both the main effects and the interaction effects. As a result, the interaction effect of high anxiety with hard to easy item difficulty sequence was not overlooked by Smouse and Munz (1968). In their study, 113 college freshmen were given one of three forms of a psychology final exam. The items on the exam were arranged by difficulty level either easy to hard (E-H), hard to easy (H-E), or random (R). In addition, the Multiple Affect Adjective Check List, a test for anxiety, was included at the end of the test. The three different item groups were randomly assigned to two different test directions groups.
One group received anxiety provoking information concerning steps to prevent widespread cheating along with their directions for the test while the other group just received neutral, non-arousing test directions.

Smouse and Munz (1968) found no significant differences in test results among any of the groups. The item arrangement did not make a difference, the type of anxiety group did not make a difference, and the interactions between those factors were not significant. The results of the anxiety measure at the end of the test only showed that everyone was highly anxious. The authors were disappointed with their result and concluded that any differences that could be caused by anxiety were possibly masked by the already highly anxiety producing situation of a final exam. They also reasoned that there may be individual reactions to tests which can be affected by item sequence.

In a follow-up study Munz and Smouse (1968) used the Achievement Anxiety Test and their psychology final exam to compare 120 college freshmen. The students had to take one of three tests with the difficulty levels arranged (H-E), (E-H), or (R), as in the previous study. However, in this study the students' results were divided according to their A.A.T. scores into four groups of anxiety levels for statistical analysis.

While Munz and Smouse (1968) did not show a significant main effect for item order, there was a fairly complex interaction between anxiety level and form of the test (p < .01).
One group labelled "non-affecteds" performed lowest on the random arrangement but highest on the hard to easy arrangement. Another group, the "high-affecteds", did best on the random but worst on the easy to hard. The "debilitators" did poorly on all forms while the "facilitators" did well on all but the hard to easy form. The hard to easy form was found to have the smallest variance with no significant differences between the groups.

Munz and Smouse's conclusions were based on an inverted-U level of arousal theory. If arousal levels are increased, some people will respond well while others will respond poorly. For each person there is an optimum level of arousal that serves as the peak of an inverted-U graph of their performance. The different formats provided different levels of arousal and, as a result, different levels of performance. Increases in performance by certain anxiety types tended to be cancelled out by decreases by other types, so the main effect was not significant. However, Munz and Smouse noted that the hard to easy arrangement had the least variance, so they recommended that the hard to easy sequence would be the best format to use to minimize personality variables.

In a follow-up study, Smouse and Munz replicated their study and findings, but tempered their conclusion with a caution that even though the hard to easy sequence may eliminate some unwanted variance, it may introduce other test taking contaminants not examined by their studies (Smouse & Munz, 1969).
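The cancellation argument in the inverted-U account above can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the quadratic shape of the curve, the arousal level attached to each test form, and the two hypothetical examinee types are not taken from Munz and Smouse, who reported four anxiety types and no such parameters.

```python
# Hypothetical illustration only: none of these numbers come from the studies reviewed.

def performance(arousal, optimum, peak=30.0, width=2.0):
    """Inverted-U: the score is highest when arousal equals the person's optimum."""
    return peak - ((arousal - optimum) / width) ** 2

# Assumed arousal level induced by each test form (arbitrary units).
FORM_AROUSAL = {"E-H": 3.0, "R": 5.0, "H-E": 7.0}

# Assumed optimum arousal for two hypothetical types of examinee.
OPTIMA = {"low-optimum": 3.0, "high-optimum": 7.0}

def cell_means():
    """Expected score for every (examinee type, test form) combination."""
    return {
        (group, form): performance(arousal, optimum)
        for group, optimum in OPTIMA.items()
        for form, arousal in FORM_AROUSAL.items()
    }

def form_means(cells):
    """Average over examinee types: the quantity a main effect of form compares."""
    return {
        form: sum(cells[(group, form)] for group in OPTIMA) / len(OPTIMA)
        for form in FORM_AROUSAL
    }

cells = cell_means()
print(form_means(cells))
```

Under these assumed values the two examinee types differ by four points on both the E-H and the H-E form, in opposite directions, yet the two form means are identical. A difference of that kind would appear as an interaction in a factorial analysis while leaving little or no main effect of item order.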
Another follow-up study by Munz and Jacobs (1971) also essentially replicated the procedures and the results of the study by Munz and Smouse (1968). One change of procedure was to try to determine the difficulty level of each item by subjective judgement procedures rather than the typical 'p' level statistical method. This was an attempt to address the issue that if the subjective difficulty of an item differs among individuals, then arrangements of difficulty based on 'p' level may not be actually sequencing the items for the subject as the researcher intended. They asked 142 psychology students and 9 instructors to rate the difficulty level of each question which would be administered to the second group of 133 college students on their psychology final exam. The inter-observer agreement showed a moderately high relationship, and was reported as r = .62. The average difficulty of the group of items was reported as not differing significantly. They felt their results with this new procedure demonstrated that the difficulty of the test item was more complex than was reflected by the typical 'p' index.

Oddly, Munz and Jacobs (1971) did not report the correlation between the subjective difficulty ratings and 'p' index ratings. In addition, their findings did not seem to differ from studies which did use the 'p' index. As a result, the increased effort in obtaining subjective ratings does not seem to be necessary in examining the effects of item difficulty sequence.
This is particularly the case if there is a strong correlation between the 'p' index and the subjective rating. Of course, as Munz and Jacobs concluded, further research on the relationship of subjective difficulty levels with other variables is needed.

The interaction effect that Munz and Smouse (1968) found with achievement tests and anxiety types was not found to be present with ability tests (Berger, Munz, Smouse, & Angelino, 1969). Berger et al. used a format similar to that of the previous studies by Smouse and Munz. They had the three different forms of item difficulty and identified the four different anxiety types. However, they used the Henmon-Nelson Test of Mental Abilities with 330 high school students rather than a college final exam. In this study they also had two test instruction conditions intended to generate 2 levels of anxiety. One group of students received instructions that the mental ability scores would be used on their permanent record while the others were told that the scores were to be used only for research purposes.

Not only did Berger et al. not find any interaction effects, but the main effects from changing item order and the main effect from giving different test instructions were also found to be non-significant. However, anxiety type was a significant factor. "Facilitators" scored highest followed by "non-affecteds" then "high-affecteds" with "debilitators" last. Berger et al. concluded that the item difficulty sequence does not affect ability tests. However, they felt differential reaction to test taking anxiety does have a significant effect on aptitude tests.

Several explanations were given by Berger et al. for this lack of effect. For one, there is the possible stability of aptitude tests. Another possible explanation is that the high school student population has a greater variation of intelligence, test taking ability, and test taking motivation. This may be true for all high school students or for just this sample, and the authors are remiss in not providing details of the intelligence scores and socioeconomic status of the sample, especially since that information was reportedly gathered. Such information might also indicate if there was an unusual lack of variation in the intelligence scores of their sample. A lack of variance could mask any changes caused as a result of changing the order since those changes are supposedly most noticeable among the students with the lowest ability.

Further attempts were made to replicate the studies of Smouse and Munz. Marso (1970) reported two studies that examined anxiety and item sequence. His first study involved 122 first year college students randomly assigned to three item arrangements (E-H, H-E, and R) with item difficulties ranging from 0% to 100%. The students had to complete 139 questions of the Quick Word Test.
In addition, each student completed a series of test anxiety measures prior to taking the Q.W.T. so that their test results could be grouped into high, average, or low anxiety test taking groups. There were no time limits, but a record was kept to determine how long each student took to complete his or her test.

Marso's analysis of variance found no significant effect from item order nor from an interaction of item order and anxiety level. Also, neither item order nor anxiety level had a significant impact on the length of time that students used to complete their test. However, anxiety level was found to be a significant factor in the level of performance (p < .01). The most anxious students had the lowest scores.

Marso's second study found similar results. Only the anxiety level was found to be a significant factor in the level of performance (p < .01). The second study involved 156 college students writing their psychology final exam. As in the previous study, students were grouped for analysis into high, average, and low anxiety levels based on a series of anxiety tests. This experiment was, however, quite different in the arrangement. The actual item difficulties were not used, but the order in which the topics were presented in class was used for the basis of the arrangement. One test had the items presented in the order that their content was presented by the teacher, another test was presented in a reverse order of presentation, and a final form was arranged in random order.
Marso's conclusions from the two tests were that tests without time limits do not have to be arranged in difficulty order, or in order of class presentation, or in groups of similar content. These conclusions are made even though his first experiment with the Quick Word Test may not have had subjects who were motivated enough to experience typical classroom levels of anxiety and frustration. His second study could have involved high levels of anxiety, but the tests were not arranged by difficulty levels. In particular, the hard to easy sequence was not tested.

In contrast to Marso's study is the Towle and Merrill (1975) study, which also was an attempt to replicate the work of Munz and Smouse (1968), comparing different anxiety levels and item orders. Towle and Merrill used 82 volunteers from educational psychology courses and community college mathematics courses to take the Florida Statewide Twelfth-Grade Mathematics Achievement Test. The test had a reported wide range of difficulties, and the items were arranged in the three common difficulty patterns (E-H, H-E, and R). The students were also reported to have a wide range of mathematics abilities. In addition to the mathematics test, the students completed the Achievement Anxiety Test and the State-Trait Anxiety Inventory (S.T.A.I.) prior to the mathematics test and the S.T.A.I. after the mathematics test.

Towle and Merrill found no significant interactions, and anxiety levels were not a significant factor.
However, unlike the previous studies, they did find that the order of the item difficulties was significant (p < .05). Towle and Merrill concluded that the significant effect of item order was a result of the slight time limits placed on the exam. Many of the students did not have time to consider every problem. Another significant result was the increase in anxiety level as recorded on the S.T.A.I. post test. This anxiety increase was, however, not related to the different item orders, so none of the orders seemed to increase anxiety more than any other.

While the inverted-U theory of Munz and Smouse (1968) was not supported with a significant interaction, Towle and Merrill felt that the data showed a tendency that would indicate the presence of such an effect. They felt that the effect may have been masked by the fact that the anxiety level groups are based on sample norms in each study rather than standardized norms. As a result, an individual could be placed in a different anxiety level group in each study depending on the sample used in the study.

In summary, research on item order and its relation to anxiety was not proving to be a fruitful line of research. For one, the lack of anxiety level norms limited the generalizability of the findings. Further, only three studies found a significant interaction (Munz & Smouse, 1968; Smouse & Munz, 1969; Munz & Jacobs, 1971). The other five studies, including two with Smouse and Munz, failed to find an interaction.
One, however, did find a significant effect of order alone, but as with other studies, it involved a test with time limits.

Achievement and Item Order

An important area of interaction research is with achievement levels and item sequencing. It has been theorized that students with average or below average achievement would perform most confidently and efficiently with an easy to hard arrangement (Ahmann & Glock, 1963). Few studies have considered the effect of this factor. Sax and Cromack (1966), as previously reviewed, reported that high ability students did better on a hard to easy test if they were also given a generous amount of time.

Klosner and Gellman (1973) ranked 54 graduate students on the basis of their midterm marks as either high or low achievers. Students were randomly assigned to take one of three forms of the final exam. The item format was arranged using item difficulty in a typical classroom manner. One test was an easy to hard order within similar content groupings. Another was random order within similar content groupings. The third format was the more common easy to hard arrangement. There were no time limits or item difficulties reported.

Klosner and Gellman did not show a significant interaction or main effect of item order. However, the authors felt that the interaction was almost significant (p < .15) and therefore showed a trend. The low achieving students seemed to do best on the easy to hard with the content groups arrangement.
Their suggestion that further research should proceed is indeed warranted since there is reason to believe that their study lacked power due to the small sample size. In addition, a more generalizable study should be done since the sample of this study was high ability graduate students who probably do not demonstrate some of the typical behaviour patterns of students normally considered low achievers.

One typical characteristic of low achievers is their low achievement motivation. Although Kestenbaum and Weiner (1970) did not examine specifically various achievement levels, they did examine achievement motivation. In addition, they examined the relationship between item order, test anxiety and achievement motivation. They used 79 seventh and eighth graders who were administered a reading test in either random or easy to hard sequence. Also, the students were administered the Test Anxiety Scale for Children and the Children's Achievement Motivation Scale. There were no significant effects as a result of different order, but there was a significant correlation between motivation, anxiety, and performance scores. They did not report how motivated the students were in the study, but they did conclude that highly motivated students with low anxiety tend to persist at endeavours despite failure.
It would have been interesting to see the interactions that may have resulted in this study if it had included the possibly most frustrating sequence of hard to easy.

Hodson (1984), however, did use various achievement levels and a hard to easy sequence. He compared 157 students between the ages of 16 and 19 who were taking the British school system's A-level exam in chemistry. Unfortunately, due to the highly academic and competitive nature of the exams, the students who took the exam were reported to be high ability and highly motivated students. Nonetheless, the students were grouped into three different ability levels based on their previous O-level exam results. The students were given one of three tests arranged in the typical formats: easy to hard, random, or hard to easy.

The students who took the hard to easy arrangement had the lowest mean, 26.6 on the fifty item test. In addition, most of the students who failed to complete the test within the time limits had taken the hard to easy test. Item order had an effect since there was a significant difference between the means of all the tests (p < .01). The mean of the easy to hard test was the highest, 31.5, and the mean of the random format was 29.6. The ability level was also reported as a significant main effect (p < .01), but there was no significant interaction effect between the item order and the ability levels. Sex of the student was also examined as a factor, but no significant main effects or interactions were found.
Surprisingly, Hodson concluded, "Apart from a slightly inflated mean score, which might have some motivational value, there was no evidence to support the practice of presenting multiple choice chemistry questions in an easy to hard sequence." However, his findings show more than a slightly inflated mean score. They show a significant difference between test scores. The three tests are clearly not equivalent. From an academic student's point of view there is also a very important difference between receiving a score of 31.5 on the test as compared with a 26.6 on the test and possibly not having enough time to finish. Such a difference on a crucial exam could have very significant effects on a student's future as well. Despite the suggestions by Hodson, chemistry teachers should in fact spend the effort to sequence items in a way which at the very least avoids the theoretical discouragement of the hard to easy sequence.

Other Interactions

The interactions of item order with various other unusual factors were also of particular interest to B. S. Plake. Plake was involved in five studies. Although other factors were examined, the primary area of interest in three studies was the effect of item order, and of the students' knowledge of that order, on the students' performance and perceptions.

Plake's first study (Plake, 1980) used 104 psychiatric nurses taking three forms of their midterm exam: easy to hard, spiral, or random.
Half of the nurses were given information in the test instructions about the order and strategies to deal with the order. The other half received no such extra information. All students had to complete a questionnaire at the end of the test describing their perceptions of the test order, their performance, and their expected score. Unfortunately, an adequate description of her measure of student perception was not provided, so there are some questions about the validity of this measure.

In this first study, Plake (1980) did not find that item order or knowledge of the item order significantly affected either the students' test scores or their reported perceptions of the test. Plake did admit that the type of examination and the type of student used limited the generalizations possible from this study. However, she did theorize that anxiety may have been an interacting factor. She proposed that knowledge of easy to hard order may have caused the performance of the highly anxious to drop and offset the rise in performance by the less anxious student.

Anxiety along with item order and the knowledge of that order was the focus of the next study (Plake, Thompson, & Lowry, 1980). Anxiety was measured as in earlier research by Towle and Merrill (1975). They used the Achievement Anxiety Test before the exam and the STATE and TRAIT Anxiety inventories as pre-tests and post-tests. Item order, knowledge of order and student perceptions were the same as in Plake's first study.
In addition, two different scoring methods were used, number right or elimination. Knowledge of the scoring method was also a factor in the study, with the half of the students who did not receive information about the order receiving the information about the scoring system instead. Unfortunately, this procedure does not provide for a control group that receives no information about scoring or order. The subjects were 97 educational psychology students who volunteered to take the A.C.T. College Mathematics Placement Program test. The students received course credit for volunteering.

With so many factors, some of the results of Plake, Thompson, and Lowry were quite difficult to interpret. For example, the interaction of anxiety condition with knowledge of order with using the number right marking system was significant. The authors admitted that such a result may not be meaningful since the interaction between anxiety and the knowledge of order was already shown to be non-significant.

The implications of the other findings were a bit more clear. The main effect of order was not found to be significant. However, order did interact with the pre-test and post-test anxiety scores to produce a result that was almost significant (p < .10).
The lower post-test anxiety scores hint at a trend which might indicate that some item arrangements increased pre-test anxiety whereas others may have caused a decrease. The authors state:

Trends in the data do support the presence of some possible effects. Therefore, conclusions based on the results of this study should be tempered by the knowledge that the lack of significant effect may have been due in part to insufficient power and/or motivation in the research design. (Plake, Thompson, & Lowry, 1980, p. 218)

It is the concern with motivation that characterizes her third study on the effects of the knowledge of order (Plake, Ansorge, Parker, & Lowry, 1982). To get a larger and more motivated sample, the authors used 170 senior and graduate students enrolled in an introductory statistics course. The study was arranged in a manner similar to the previous study by Plake, Thompson, and Lowry: there were three common item arrangements (E-H, R, S), and half the students were given information about the order. In addition, the same mathematics test, anxiety tests, and perception questionnaires were used. The differences between the two studies were that the students were also given a Revised Mathematics Anxiety Rating Scale, that only one scoring system was used, and that, to motivate the students, they were told that the results of the mathematics test would be used to determine which students would qualify for extra remedial mathematics classes.
An additional change was that all of the tests were grouped by sex for factor analysis. This was done to examine the influence of sex on mathematics test performance and to examine the interactive effects of anxiety, sex of the subject, item arrangement, and knowledge of arrangement.

Even though the students in the Plake, Ansorge, et al. study had 75 minutes to complete a 48 item test, and even though the volunteers in the previous study by Plake, Thompson, and Lowry easily finished the test in the allotted time, these highly motivated students did not complete all items. Twenty percent of the students failed to finish the mathematics test. Due to some problems with directions, the two perception questions were also left blank by many students. As a result, a planned power test became a speeded test.

The effect of time limits on order has been well established. As previous research has shown, when there are significant time limits, item order can have an effect. So, with time limits involved, Plake, Ansorge et al. found an item order effect. While there was no significant main effect from item order, there was an unusual but significant interaction effect (p < .007). Not only did males outperform females on all mathematics tests combined (p < .002), but factor analysis showed that males actually did best on the easy to hard order and the random order, while males and females performed equally well on the spiral arrangement. The authors suggested that further research would be needed using non-mathematics tests and longer time limits.
Item order was also a significant factor in the self-reported perceived performance and perceived test difficulty. Although none of the pairwise comparisons were significant, the random form showed a tendency to have the lowest perceived difficulty rating and the highest performance rating.

One lesson that might be learned from the Plake, Ansorge et al. study is to not assume that a power test will in fact be a power test. Sometimes, it is the students who will determine if there will be time limits. Even when a researcher provides what one might consider to be generous time limits, and therefore relies on assumptions about power tests, if a number of students do not finish, then the assumptions about time limits and item order effects might apply.

The unusual finding of Plake, Ansorge et al. that item order interacted with the sex of the subject on a mathematics test was further examined in two follow-up studies (Plake, Melican, Carter, & Shaughnessy, 1983; Plake & Ansorge, 1984). Both studies used college students who were writing a psychology final exam. Plake, Melican et al. had 167 students write tests composed of three sections, with each section having questions arranged in one of the following three ways: easy to hard, spiral, or random. While the interaction of item order and sex of the subject was not found to be significant at the .05 level, it would have been significant at the .10 level.
There was a significant effect in connection with the sex of the subject, with the 128 females receiving higher scores than the 39 males (p < .05). On the other hand, Plake and Ansorge did not find any such sex of the subject effect with their 279 female and 73 male students. Both of these studies concluded that the findings of Plake, Ansorge et al., which involved higher scores by males on a mathematics test, were not applicable to non-quantitative tests. It was noted that comparisons were difficult to make since the tests in all the studies were not equally difficult. For example, the test used by Plake and Ansorge had a difficulty rating of .67, but the tests used by Plake, Melican et al. had difficulty ratings between .32 and .48.

Another area which causes difficulty in making comparisons, which was not mentioned by the authors of either study, was the non-random nature of their samples. Unfortunately, their samples were based on students who took certain courses. It does not seem appropriate to equate males and females who take a statistics class with males and females who take a psychology class. For a variety of reasons, these two samples may be different. The males or females in either course might not be the same as the males and females in the total population. Conclusions about gender differences must be taken with extreme caution.

Another item order study involving female and male students taking an educational psychology examination was by Klimko (1984).
Additional factors that were considered were anxiety level and cognitive entry level. Cognitive entry level was defined as prerequisite types of knowledge, skills and competencies which are essential to the learning of a particular new task or set of tasks. Cognitive entry was measured at the beginning of the course using a forty-five item test which was designed by Klimko. There were 93 female and 18 male college students who were randomly assigned to midterm examination formats containing the three common item arrangements (E-H, H-E, R).

Klimko found that cognitive entry characteristics were the only factor with a significant relationship to the performance score (p < .0001). Item order, sex of the subject, and anxiety level were not significant factors. His main conclusion was that item order does not influence achievement examination performance. He also concluded that cognitive entry was a meaningful predictor of achievement performance. Unfortunately, he provided little in the way of specific information detailing the parameters of cognitive entry characteristics. His forty-five item test of cognitive entry characteristics had no validity data other than that its results were similar to those of a psychology midterm examination.

Another factor based on cognitive theory was examined by Lane, Bull, Kundert, and Newman (1987). They used Bloom's taxonomy to subjectively determine the "cognitive difficulty" of every test item in a forty item education course examination.
The test items were arranged in five different formats. Two forms had the items with increasing cognitive difficulty and either increasing or decreasing statistical difficulty. Two other forms had the items with decreasing cognitive difficulty and either increasing or decreasing statistical difficulty. A fifth format had the questions in random order. These tests were used in two different studies.

In the first study by Lane et al., 59 male and 96 female college students wrote one of the five different examination formats. Although item order was not a significant factor in the total scores, some subscores did seem to be affected by the item order (p < .05). Students who had the easy knowledge items first, followed by questions of increasing statistical and cognitive difficulty, had the highest mean scores. The lowest mean scores were received by students who had exams with decreasing cognitive difficulty and increasing statistical difficulty. Another unusual finding was that gender was a significant interactive and main factor, with females scoring higher than males.

In the second study, Lane et al. used only three of their original five formats. They used the form with increasing cognitive and statistical difficulty, the form with decreasing cognitive and statistical difficulty, and the form with the random order of difficulty.
In addition, half of the tests had labels on each item to indicate its cognitive difficulty. The six forms were randomly given to 78 male and 169 female college students as an exam in an education course.

The results of the second study differed from the first since item order had no significant effect on either the total score or the subscores. However, as in the first study, females had higher scores than males. The presence or absence of labels was also a significant factor. Both males and females had higher scores when the labels were present. In addition, there was an interaction effect between gender and labelling. Although all students did better when labels were included, the discrepancy between males and females decreased when the labels were included.

Lane et al. concluded that the presence of labelling was beneficial and could possibly negate the effect of item ordering based on item difficulty. However, the generalizability of this study may be limited by the fact that only subjects who understood Bloom's cognitive levels took the test. Lane et al. also suggested that further research was needed in the area of performance by females since this research contradicted the results of Plake and Ansorge (1984) on the performance of females on non-quantitative tests.

The effects of item order and sex of the subject were studied by Kleinke (1980), but speed was also included as a factor. He had 484 fourth grade students complete a thirty-six item social sciences test.
The test was presented in either an easy to hard arrangement or a uniform spiral arrangement. The students were given twenty minutes to complete the test, but after only ten minutes, students were asked to draw a line to indicate which question they had just finished. This provided data on each student's speed and accuracy under very speeded conditions. Unfortunately, only 314 students followed the directions and drew the line.

An additional area of study by Kleinke was response location. The item responses were placed on the left side of half the booklets and on the right side of the other half of the booklets. This response positioning was compared to the handedness and sex of the student to see how these factors related to the student's performance on the test.

Kleinke found that the easy to hard item arrangement had a higher mean under the speeded ten minute condition (p < .01). However, the total scores were equal under the twenty minute condition. He concluded that if examinees have ample time to complete a test, they will persist no matter what the arrangement. He was not able to draw conclusions on some of his other findings. For one, males had higher mean total scores, higher scores after ten minutes, and more questions completed after ten minutes. He also had no conclusion for his finding that left handed pupils had higher scores after ten minutes and more questions completed after ten minutes. Other effects and interactions were not found to be significant.
It is unfortunate that 170 subjects failed to draw a line at the ten minute point of their test. Conclusions must be limited by the fact that the remaining 314 are a less than random sample. None the less, it can be concluded that some students will get more questions correct at the beginning of a test if they begin the test with easy questions, in comparison with students whose test begins with items of varying difficulty. However, this study did not address the issue of what effect a series of difficult questions would have on a student's total score.

Interaction Research: Conclusions

The research into the interaction of item order with other measures did not produce clearly significant results. For example, studies involving anxiety showed some initial success (Munz & Smouse, 1968; Smouse & Munz, 1969), but the effects were not confirmed by other researchers (Smouse & Munz, 1968; Berger et al., 1969; Marso, 1970; Towle & Merrill, 1975).

Although anxiety has been one of the more common factors included in item order research, other factors have also been studied in conjunction with item order. However, most have had results of limited significance and limited applicability. For one, knowledge of arrangement was one line of inquiry that did not produce any significant results (Plake, 1980; Plake, Thompson et al., 1980; Plake, Ansorge et al., 1982). Klimko (1984) included cognitive entry characteristics along with sex, test anxiety, and item order.
He found cognitive entry characteristics to be the only predictor of examination performance. Lane et al. (1987) had very mixed results when they included cognitive and statistical item difficulty along with knowledge of the item arrangement and gender. In one study, the ordering of the test items based on cognitive and statistical methods showed a significant effect. However, in their other study, when the test items were labelled with their cognitive level, there were no significant order effects, but there were significant differences related to the presence of labels and the sex of the subject.

The sex of the subject was also a factor in several other studies. Plake, Ansorge et al. (1982) found an interaction effect between item order and sex of the subject. Several studies did not find an interaction effect, but did find a significant main effect (Plake, Melican, et al., 1983; Kleinke, 1980; Lane et al., 1987). Other studies that examined the sex of the subject factor did not find it to be a significant factor (Plake & Ansorge, 1984; Klimko, 1984). The mixed results of these studies would indicate that one must be cautious in making conclusions about gender differences since there may be other factors involved with any reported observations.
Achievement level as an interacting factor did show a tendency toward significance in the Klosner and Gellman (1973) study, but unlike Sax and Cromack (1966), it was still found to be a non-significant factor with the sample used. On the other hand, Hodson (1984) did not find an interaction effect with ability level, but he did find a significant main effect for item order. Considering the importance of ability levels to the justification of sequencing questions from easiest to hardest, the lack of research in this area is surprising.

Any study using slightly speeded tests found significant results, results which confirmed previous findings. The speeded tests either caused an outright main effect for item order (Towle & Merrill, 1975; Kleinke, 1980; Hodson, 1984) or an interaction effect with the sex of the student (Plake, Ansorge et al., 1982).

Simple Effect Re-examined

The interaction of variables with item order was not the only concern of later researchers investigating the effects of item order. The simple effect of item order remained a concern since textbooks of the day continued to suggest that items be arranged from easy to hard and since classroom teachers still had questions about the equivalency of multiple forms. In addition, the multifactor analysis procedures which allowed interaction analysis could also be used for multiple comparisons of the simple main effect among several similar groups.
Huck and Bowers (1972) reported the results of two very similar experiments that compared the effect of several different random arrangements on item difficulty. In the first experiment Huck and Bowers used ten different random variations of a psychology final exam with 120 psychology students. In the second study they used six different random forms of a psychology midterm with 162 students. The difficulty ratings ranged from .00 to 1.00 in the first study and .15 to 1.00 in the second. The average difficulty rating ranged from .54 to .61 in the first study's 10 tests and .66 to .70 in the second study's 6 tests. Huck and Bowers found no significant difference in the item difficulty ratings from any of the test forms. They concluded that the sequence effect hypothesis might not be a valid one and criticized other writers: "comments concerning a sequence effect should be somewhat qualified as compared with presently appearing statements."

This study was limited in its applicability. For one, as Huck and Bowers briefly mention, college students enrolled in a psychology class do not represent a random sample for the purpose of generalization. In theory, item order effects are most applicable to low ability students, and if "low ability" is a category defined relative to any given sample, then a college sample would have "low ability" students.
However, if "low ability" is in reference to a large, age based population, then a college sample probably does not include any low ability students for whom item order might make a difference. Secondly, the effect of item order is supposedly the result of a lack of success (Ahmann and Glock, 1963). Therefore, various random orders would have approximately equal amounts of easy questions at the beginning and correspondingly equal amounts of success at the beginning for the easily frustrated low ability student. Huck and Bowers' findings are therefore limited to only one type of item arrangement, random, and one type of ability level, high.

Sirotnik and Wellington (1974), on the other hand, did try to have their item order study generalize to a larger population, and they used content based arrangements as well as random ones. The content based arrangements had items grouped according to their basic subject areas: mathematics, social studies, science, language arts, or reading. In addition, they used a large grade based sample of 2,463 eighth grade students, so, by most definitions, low ability students were included. For an item pool they used the final exam questions from the five basic subject areas and systematically divided them into four tests arranged by content or four tests arranged in random sequence. This multiple matrix sampling design allowed them to rearrange one four hour test into eight one hour tests which they hypothesized would be equivalent.
The one hour time limit did introduce a slight speeding effect that the authors felt was negligible but typical of the school system. The time limit was another attempt to allow their results to have a broad and practical application.

Most of Sirotnik and Wellington's tests of significance supported their hypothesis that there was no difference in the means, variance, item difficulties, or KR-20 reliability for most of their tests. However, they did find that the means of the reading tests were significantly different. The mean of the content based reading test was 1.5 percent higher than that of the random arrangement. Although this result was statistically significant, it was dismissed by Sirotnik and Wellington as not being of practical significance.

Once again, the effect of time limits comes into play with item order. Research involving typical classrooms and large numbers of students must contend with the very real problem of time limits. As a result, item order is likely to have an effect. In Sirotnik and Wellington's research, item order only seemed to affect the results in one subject area. There may have been other interactions that were not analyzed in this study, such as sex or achievement level. Of course, the significance of the main effect could have been just a chance anomaly from the increased alpha error level due to repeated tests of significance.
The study could have been improved with the use of appropriate significance tests for multiple comparisons. None the less, it is notable that the mathematics tests did not show an effect of item order under time limits, which is contrary to some previous research (Towle & Merrill, 1975). Unfortunately, it is difficult to make accurate comparisons of the mathematics tests because Sirotnik and Wellington did not provide any details as to the difficulty rating of their tests and items, nor did they have the hard to easy variation of item order.

A study similar to the one by Sirotnik and Wellington is the study by Feldt and Forsyth (1974). They were, however, only concerned with the context effects caused by the matrix sampling techniques used with content based item groups. They used two forms of the Iowa Test of Educational Development on 530 students in grades 9 to 12. One group of language questions or one of three different groups of mathematics questions was drawn from one of the two regular test forms and included as a special section in the other test form. As a result, eight different test packages were developed and tested.

They found significant differences between the means of the mathematics questions (p < .05) and no significant differences between the means of the language questions. The means of mathematics questions in the special sections were higher than the means of the same questions when they were presented in the regular section. They cited several possible reasons for their mixed results.
For one, as with other studies, time limits seemed to have an effect. They said that even though students had enough time to finish the test, they may have felt more rushed to complete the questions when they were given in the regular form rather than when they were given in the shorter special section. Another possibility given was that the students may have been slightly more fatigued when answering questions when they were presented in the longer regular section. These conclusions are tempered by the fact that the effects were observed only with the mathematics sections but not with the language section. They felt that the mathematics sections may have been more rigorous. A final reason suggested was that item sequence effects may have been present in some subtle form since the item order of the questions in the special section was not identical to the order in the regular form.

It is unfortunate that Feldt and Forsyth did not control for the possible subtle effects of item sequence by administering different item arrangements of each special section. They also did not clearly address the issue of whether their tests were power tests or speeded tests. They state that the non-completion of the tests was not a problem, but they also state that the omission of mathematics items was common. This results in their being able to claim that time was a factor when there was a significant effect, and claim that time was not a factor when there was not a significant effect.
However, it is also possible that item sequence effects can be the cause of what seems to be a time problem rather than the effect. Results similar to speeded tests may occur with different item orders since students frustrated by one arrangement may tend to omit or fail to complete more questions than students experiencing a different order without the frustration. Time limits must be more carefully controlled and standardized to make research on item order effects more replicable and conclusions more certain.

Controlling Other Factors

Some of the shortcomings of previous research were more adequately controlled by Hambleton and Traub (1974). After conducting a fairly careful review of previous research, Hambleton and Traub concluded that several previous studies which had not shown an effect from item order had not controlled three fairly important factors.

The first factor that Hambleton and Traub felt was not well controlled was the examinee's control of item order, or the within-subject rearrangement factor. It is commonly assumed that an easy to hard test or a hard to easy test is taken in that order. However, it may not be the case that students, in fact, do the items in the order that was intended by the researcher. Particularly in the case of hard to easy arrangements, many students may try to do the easy ones first and therefore statistically mask the effect of item order.
While Hambleton and Traub did not have an estimate of how extensive this within-subject rearrangement was in the high school population they sampled, they attempted to prevent examinee control of the item order by instructing the students to do the questions in the order presented in their booklets. The booklets were also specially printed with only one question on each page to discourage students from searching for easy questions to do first and hard questions to do last.

The second reason that Hambleton and Traub felt other studies had failed to find an effect was the limited range of the item difficulties used in the tests. They pointed out that no information on the variation of item difficulties was published. Unfortunately, Hambleton and Traub's article reported only the rank correlation between item position in the test and the position the item should have been in, based on item difficulty level as estimated from the data of their study. However, their test was a standardized and published test with item difficulty information available in the test manual.

A third factor Hambleton and Traub dealt with was student motivation. Hambleton and Traub contended that some of the previous studies may have lacked realistic or effective motivation of the subjects. They pointed out that most studies gave no clear description of how important the students perceived the tests to be.
Item order effects, they felt, would be directly related to the importance a student attaches to the test. To attempt to ensure student motivation, Hambleton and Traub reported that they told the students that the results would be used by their classroom teacher to arrive at a final grade. This point is also significant since Plake, Thompson, and Lowry (1980) used volunteers and found no significant effects, but when Plake, Ansorge, Parker, and Lowry (1980) used motivated students in a similar study, they did get significant results.

With these three factors controlled, Hambleton and Traub felt that item order should have an effect. They hypothesized that the effect was caused by two factors. For one, fatigue, as suggested by Mollenkopf (1950), could cause the easier questions at the end of a hard to easy arrangement to seem harder. Another possible cause was a personality trait such as anxiety. As Munz and Smouse (1968) suggested, some arrangements such as hard to easy improve the performance of those students who need to have a higher anxiety level on the test while at the same time performance is lowered for those who are debilitated by the higher anxiety of the hard to easy arrangement.

To test their theories, Hambleton and Traub used 106 eleventh grade mathematics students enrolled in a mathematics summer school program. Their mathematics ability was considered low to average.
They were given an Achievement Anxiety Test two weeks prior to the mathematics test to determine the highest scoring 25 percent and the lowest scoring 25 percent for analysis of anxiety trait reactions. On the day of the test all students were given the Cooperative Mathematics Test Algebra II published by the Educational Testing Service. The students were randomly assigned one of two forms, either the easy to hard or the hard to easy arrangement. The test booklets were designed to discourage students from changing the arrangement order. The students were told that the exam could be used for marks. To measure stress levels students had their pulse measured every 10 minutes. Unfortunately, in one class the pulse meter did not work, so pulse rates were collected for only half of the students. The test was reported as a power test, but some of the students did not finish, so the test must be considered as slightly speeded.

Hambleton and Traub's results indicated a significant effect due to item order (p < .05). Scores on the hard to easy arrangement were lower than on the easy to hard arrangement. While some students did not finish their tests, the authors felt it was an insignificant number based on a chi squared test for contingency. They further analyzed the types of questions not completed and felt that the results would not have been substantially different if all students had finished the test. As a result, they felt that speediness was not a plausible explanation of the results.

In addition, item order affected test anxiety.
The hard to easy item order produced more stress as measured with the pulse meter than the easy to hard item order. This result is marred by the small numbers, and the analysis was difficult since the two samples had different means to begin with, but Hambleton and Traub did tentatively conclude that the order does affect the stress level. The performance of the two anxiety levels as identified with the A.A.T. did not show a significant main effect nor a significant interaction with item order. The results did show a trend with a level of significance between .05 and .10; however, the results of Munz and Smouse (1968) were not replicated.

Interestingly, this was one of the first studies to report an analysis based on sex. Hambleton and Traub did not find any significant difference between male and female students in this study even though in a later study by Plake, Ansorge, Parker, and Lowry (1980) there was an effect related to sex and a sex by item order interaction.

Hambleton and Traub concluded that item order does make a difference, and they caution against the practice of making several forms of a test in a class to reduce the chance of cheating. Two tests with different orders are two tests with different properties and therefore make comparisons invalid. They felt that the cause of this effect was what Cronbach (1946, 1950) called response set.
In a multiple choice format, students expect an easy to hard arrangement, so when a student begins with the hardest items, he becomes more anxious since his expectation is to have even harder questions as he progresses through the test.

Monk and Stallings (1970) were concerned about Hambleton and Traub's recommendation against using multiple forms in a class to prevent cheating. They examined the results of twenty-two tests they had administered over a three year period in their geography course. The tests were made from items chosen from an item pool and administered in pairs. Each pair had identical questions grouped together by content categories. There were no arrangements based on item difficulty, but each pair had its content categories randomly rearranged to discourage cheating. Nearly 2000 subjects were involved since each form had approximately 100 students writing it. The tests were slightly speeded as is typical in large scale testing situations. They tested the significance of their results by repeated t-tests, and found that nine out of eleven pairs were not significantly different. They concluded that equivalent tests could be produced by rearranging items. However, they did concede that two pairs of tests were significantly different. One was at the .01 level of significance, and the other was at the .001 level. As a result, they cautioned that item order effects may in fact be present in large scale testing programs.
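The risk that repeated testing poses for these eleven paired comparisons can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that the eleven t-tests are independent and that no true differences exist; neither assumption comes from Monk and Stallings, and the function name is hypothetical. Under those assumptions it gives the chance of at least one spuriously significant pair at a given significance level.

```python
def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Probability of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent
    (an illustrative simplification, not a claim about the original data).
    """
    return 1.0 - (1.0 - alpha) ** comparisons


# Eleven pairs of test forms, as in the design described above.
N_PAIRS = 11

for alpha in (0.05, 0.01, 0.001):
    rate = familywise_error_rate(alpha, N_PAIRS)
    print(f"alpha = {alpha}: P(at least one significant pair) = {rate:.3f}")
```

Under these assumptions the chance of at least one significant pair is roughly 43 percent at the .05 level, but only about 10 percent at the .01 level and about 1 percent at the .001 level.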
Monk and Stallings's analysis is marred by the repeated use of the t-test without a description of the type of t-test used. If an inappropriate t-test were used, then from chance alone there could be at least one significant result in the eleven comparisons at the .05 level of significance. Comparisons with other studies are also limited by the absence of data on item difficulties and the absence of data on the ordering of easy questions in relation to hard questions. These weaknesses hide the possibility that if the item difficulty range was minimal, if the changes in item sequence were minimal, and if the appropriate t-tests were used, then the fact that two tests showed any effect may be evidence that item order is a significant factor.

The research findings of Hambleton and Traub (1974) were also of concern to Allison (1984). It was noted by Allison that most of the previous research into item order had involved students who were probably mature enough to use the item rearranging strategy cited by Tuck (1978). While Hambleton and Traub had tried to control this factor with a restricted test format, Allison simply chose a sample of young students. He felt that students in grade six would generally not yet use this strategy. In addition to being one of the few studies to examine item order outside of a college classroom, Allison's study was also one of the few studies to examine the interaction effect of item order with low and high ability students.
In his study, Allison randomly gave 364 grade six students a science exam arranged in one of the three common orders: easy to hard, hard to easy, and random. The items had a wide range of difficulty as recommended by Hambleton and Traub (1974). The items ranged in difficulty from .178 to .981 with a mean of .673. Students were given an ample 90 minutes to complete the 64 item test. Motivation was a factor that was taken into account since the students were told that the test was an important part of their program. High and low ability students were identified from I.Q. scores in the students' school records. Thirty-five students did not have such information on file so only 327 students were used in the analysis. Their mean I.Q. was 113.31. Finally, as so many other studies have done, the scores of the 160 boys were compared with the scores of the 167 girls to see if item order has any interactive effect with gender.

In contrast to Hambleton and Traub's findings, Allison found no significant difference between the means of the three different test formats. Nonetheless, there were significant main effects associated with I.Q. and gender. Boys and students with high I.Q. scores had higher means on this science test. Allison drew no conclusions regarding the two significant main effects. In examining the interactions, Allison reported that there were no significant interaction effects involving the factors of sex, I.Q., and item order. These results are similar to other studies where the tests were not speeded.
Allison concluded that measurement specialists should hesitate before recommending any one item arrangement over another.

Since both the study by Allison and the study by Hambleton and Traub were very thorough in controlling many of the factors involved in item order research, it leaves open the question of why these two studies differed in their results. For one, where Hambleton and Traub did not have a control group of unrestricted students, Allison did not have an experimental group with restricted students. Although his subjects were young, they may still have used the technique that was controlled by Hambleton and Traub. Another difference is that while Hambleton and Traub used a math test, Allison used a science test. Students may be more sensitive to a series of difficult math questions than to a series of difficult science questions. Science questions which are statistically difficult may not be subjectively as difficult for students. On the other hand, math questions may have a better match between subjective and statistical difficulty. As a result, difficult science questions may produce less of an item order effect.

Of course the issue of speeded tests as compared to power tests may be a factor. Although both studies claim to have used power tests, there was some suggestion that time may have been a factor in the Hambleton and Traub study. If time was a significant factor, then both studies merely confirm previous research.

Test Wiseness

The studies by Hambleton and Traub (1974) and Allison (1984) were the only studies of item order effects to suggest that subjects may possess some sort of skill in modifying the effects of item order. They suggested that the examinees could control the item order by answering the easiest questions first rather than answering the questions in the order presented by the researcher. This skill is one of several under the general category described as "test-wiseness". Test-wiseness is defined by Millman and Bishop (1965) as "a subject's capacity to utilize the characteristics and formats of the test and/or the test taking situation to receive a high score." Hambleton and Traub reasoned that if too many subjects did the easiest questions first by omitting the hard questions until the end of the test, then the effect of item order on the students who did not use this strategy would be masked by the results of those who did use the strategy.

This conclusion of Hambleton and Traub is significant in light of later studies. For one, Tuck (1978) questioned ninety psychology students about the strategies they used on multiple choice tests. He found that 69% of the students reported that they would seek out the easy questions first and leave the difficult questions until last. Further evidence of within-subject rearrangement was uncovered by Klimko (1984) in his item order study.
Klimko included a self-report questionnaire at the end of a midterm examination with his 111 psychology students. He found that 43 subjects took the test strictly in the order given. On the other hand, 58 students went in order, but skipped over the hard questions to work on them at the end of the test. In addition, 5 students skipped around looking for any easy questions to do first, 3 students flipped through the test to begin at a random point, and 2 students did not use any of the methods listed on the questionnaire.

In a very extensive study, Allison and Thomas (1986) gave a short questionnaire on the student's preferred strategy for answering achievement test items to 415 students from grade four through to fifth year university. All grades reportedly had some students who would use one of the three different types of strategies: easy to hard, as presented, or hard to easy. The easy to hard within-subject rearrangement strategy was used by 58.4% of the students in the intermediate grades, 62.7% of the students in junior high school, 49.6% of the students in senior high school, and 59.6% of the students in their third to fifth year of university. Most of the remaining students used the "as presented" strategy. Although these results are evidence that students may use rearrangement strategies, Allison and Thomas conclude:

Whether the students' own test-taking strategies supersede the item-difficulty sequence intended by the examiner is not clear.
There does not seem to be enough evidence to doubt the majority of the studies on item-difficulty sequence simply because the actual sequence of responding to items was not controlled. In fact, it may be that the results of studies involving item-difficulty sequence can be more readily generalized to typical testing situations when students are allowed to use whichever strategy they usually choose. It is also possible to argue that students perceive the questions they can answer to be the easy items and the questions they cannot answer easily to be the hard items. In other words, it may be that the majority of students say, quite reasonably, that they answer the questions they can answer first and the questions they cannot answer are left until later. (Allison & Thomas, 1986, p. 869)

It is certainly not clear if students can improve their scores by answering the more difficult questions later. Rindler (1980) had 160 college volunteers write a thirty item verbal aptitude test with special scoring sheets designed to identify if items were skipped. Students were also put into one of three different timed conditions: 20 minutes, 25 minutes, or 65 minutes. Grade point averages of all students were also obtained to divide the students into high, medium, or low ability rankings for a comparison of their performance. The results indicated a complex interaction between ability groups and skipping questions. While there were some students who skipped in every ability group and under most timing conditions, only the high ability students who skipped questions had consistently higher scores under every timing condition.
On the other hand, middle ability students who did not skip had the higher scores under every timing condition. In contrast, low ability students who skipped had higher scores only when they had 25 minutes for the test. Low ability students did not skip any questions when they had the ample 65 minute time limit. Rindler concluded that some students may be better advised to spend more time on the easier questions which are usually at the beginning of the test. This tactic would help to make sure that those students would spend the most time on the questions they were most likely to get correct. She felt it was unfair to suggest to all examinees that their time would be used effectively if they skipped the difficult questions.

Another fact shown by this research is that using a test-wiseness strategy and using it well are two different skills. One of the procedures associated with skipping difficult questions is to return to the skipped questions. When low ability students skipped questions, they returned to only 5% of the questions skipped. High ability students, depending on the amount of time available, would return to between 20% and 98% of their skipped questions.

Latent Trait and Context Effects

The research into item order has been problematical. There are not enough clearly significant results to conclusively predict or dismiss item order effects. This uncertainty began to have an impact in the late 1970s as another new statistical technique gained increasing popularity.
Item response theory, or latent trait model theory, had several advantages over classical test theory that test constructors found attractive, but its statistics involved an assumption that each test item was locally independent. In other words, to be statistically effective, the context of questions before and after an item could not change the probabilities of how students responded to that item. Therefore, latent trait theory required that item order not be a significant factor.

Whitely and Dawis (1976) were the first to test this assumption of local independence and the effect of context on item parameters. They administered sixty item verbal analogies tests to 1,568 junior and senior high school students. There were seven forms which had a common core of fifteen questions with each item in the same position on each of the tests. The other forty-five questions were unique questions developed for each of the seven tests to provide seven totally different contexts around the core items. While this study was not designed to test specifically whether item order has an effect on students, it does address the issue of the effect that one question might have on another question. As a result, Whitely and Dawis's results do not indicate if some students do best with a test that starts with the easy questions. However, their results did show a significant difference in the Rasch item parameters obtained for nine of their core items (p < .05).
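In standard notation, which is not taken from the studies reviewed here, the Rasch model and the local independence assumption can be sketched as follows. The symbols are assumptions introduced for illustration: θ_j is the ability of examinee j, b_i is the difficulty of item i, and X_ij is the scored response (1 for correct, 0 for incorrect) on a test of n items.

```latex
% Rasch model: probability that examinee j answers item i correctly
P(X_{ij} = 1 \mid \theta_j) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}

% Local independence: given ability, responses to the n items are independent
P(X_{1j} = x_1, \ldots, X_{nj} = x_n \mid \theta_j) = \prod_{i=1}^{n} P(X_{ij} = x_i \mid \theta_j)
```

If the position or context of an item changes the probability of a correct response for examinees of the same ability, the difficulty b_i is no longer invariant across forms, which is the kind of difference that a comparison of core item parameters across forms is meant to detect.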
In addition, they found a significant difference between the means of the core items on two of their tests (p < .05). They concluded that item-parameter-invariant models must have their assumptions tested before developing equivalent measures. Unfortunately, Rasch item parameters are usually established in relation to the other items on the test. Since each test was different, the Rasch parameters would be different for the core group of items. So Whitely and Dawis had to use an uncommon statistical method to establish a common point of reference for the core group of items before they could conclude that there was in fact a significant difference.

Whitely and Dawis raised some concerns among test constructors who were using latent trait models to pretest questions at one test administration and then use them later at another sitting using what were assumed to be invariant item parameters. As a result, Kingston and Dorans (1984) tested context effects within the pre-operational calibration of the Graduate Record Examinations General Test. Their research design involved 1500 examinees who took one of two forms of the G.R.E. General Test. Each test had different questions divided into four operational sections of similar content. A fifth pre-operational section was composed of a random selection of questions from the other test form.
Six versions of the first form were made with different questions from the second form in the fifth pre-operational section, and six versions of the second form were also made with different questions from the other form. As a result, every operational question of each form was used in one version of the other form in the pre-operational fifth section. Rasch model item difficulty parameter estimates for verbal, quantitative, and analytical items could then be compared between their operational and their pre-operational placements.

Kingston and Dorans did find some items to be affected by context while many were unaffected. Of the verbal item types, items involving antonyms displayed a slight practice effect when they were placed after a series of such questions, while "reading comprehension" showed a slight fatigue effect if located in the final section (p < .05). Quantitative items showed little change with the exception that one item on one form in "data interpretation" was significantly more difficult when placed at the end of the test (p < .01). By comparison, in the analytical section, "analysis of explanation" and "logical diagrams" both showed significant effects from practice on both forms (p < .01). "Analytical reasoning" in that same section did not show any significant differences.
As with the previous research, Kingston and Dorans's study does not show the specific effects of item order. However, it does indicate that some items can be affected by placement at either the beginning or the end of the test. Of course, these findings are limited by the fact that students who take the Graduate Record Examination are not a random cross section, but may be unusually motivated and competent students. Nonetheless, Kingston and Dorans felt that, for their purposes, they could conclude that location effects were specific to certain types of items, and the elimination, or the proper placement, of those types of items would solve the problem.

Context Effects and Item Order

One study using latent trait item parameters did examine the specific effect of item order as well as more general context effects. Yen (1980) had 1,300 sixth grade students complete the California Achievement Test mathematics section while 1,100 fourth grade students completed the appropriate reading section. Each person used one of the seven different forms for either the mathematics group or the reading group. Each form was a different combination of five different sets of questions. Six of the seven forms contained set A questions, which had a range of difficulties and relatively good model fit and were used as the common core of questions for item parameter anchoring purposes. Also randomly intermixed were set X on four of the forms and set Y on the other three forms.
These sets were of primary interest and were questions with good model fit and discrimination, but more limited in their range of difficulty. The final variation of forms was that one test with sets A and X also had a set V while one form with sets A and Y also had a set W. Sets V and W had relatively poor fit and extreme difficulties or low discriminations. All items from the sets that were included in the form were intermingled so that some forms contained all identical questions but in randomly different orders while other forms had many identical questions in a different order but also had some additional questions that changed the context of the test items.

Yen found that changing the order or including additional questions did significantly alter the difficulty parameters and the discrimination parameters (p < .05). Item order effects were demonstrated by the fact that the greater the correlation of sequence, the greater the similarity of item parameter estimates. Speediness was not considered a factor since 93 percent of the students did answer the last question of one booklet examined. Of course, some students did omit some items. As a result, the computer analysis gave only those students who omitted an item a chance level of answering the question, but ignored the missing answers of students who did not reach an item.
This is one reason that Yen concluded that fatigue or impatience to finish, rather than a computer scoring anomaly, were possible causes of some questions near the end of the test seeming to be more difficult. However, not all questions at the end were more difficult, so Yen felt that other factors, such as the content of the preceding item, may have caused the instability. Although Yen's study does indicate an effect, it is similar to other recent studies using latent trait analysis and does not address the issue of whether it is best to sequence tests from easiest to hardest for the benefit of the low ability students. Yen's study did use a large and heterogeneous sample of elementary students, but her complex altering of context added other factors, such as test length, into her analysis that limit the conclusions about item order.

Summary and Conclusions

Thirty-seven studies published over the last 40 years show the continued interest in item order research and demonstrate the difficulty that any researcher would have in drawing conclusions. The definitive study has not yet been published, but the many attempts to settle the item order controversy have served to create a large pool of conflicting results. Table 1 presents the research results in a format similar to the analysis by Leary and Dorans (1985).
Unfortunately, results of item order effects do not always fall simply into the category of non-significant main or interaction effects (non) or the category of significant main or interaction effects (sig). Studies that were non-significant overall, but that reported findings which to a limited extent show some possible effect of item order, are indicated as non-significant with additional information (non +). The tests in Table 1 were also divided according to type, based on whether they were power tests, which reported that all students finished the test, or timed tests, which reported that some students did not finish the test. This is also a difficult judgement to make since some studies gave no indication one way or the other. Studies which do not clearly state the effects of their time limits are indicated as power tests. In addition, a study which involved a large number of students could be essentially a power test for most of the students but a speeded test for a few. Such a test is listed as a timed test. Finally, although the number of students involved is one of the easiest judgments to make, compromises between brevity and accuracy were made in Table 1. In the case of the study by Monk and Stallings (1970), the same experiment was repeated 11 times with 11 different small samples, but only the total aggregate number is listed.

Table 1
List of Item Order Research: Results, Types, and Samples

Result   Authors                   Year   Type   Sample   n
non      French & Greer            1964   p      elem.    152
non      Smouse & Munz             1968   p      col.     113
non      Berger et al.             1969   p      sec.     330
non      Marso (a)                 1970   p      col.     122
non      Marso (b)                 1970   p      col.     156
non      Kestenbaum & Weiner       1970   p      sec.     79
non      Huck & Bowers (a)         1972   p      col.     120
non      Huck & Bowers (b)         1972   p      col.     162
non      Plake                     1980   p      col.     104
non      Plake & Ansorge           1984   p      col.     352
non      Klimko                    1984   p      col.     111
non      Allison                   1984   p      elem.    327
non      Lane et al. (b)           1987   p      col.     247
non +    Brenner                   1964   p      col.     -
non +    Munz & Smouse             1968   p      col.     120
non +    Smouse & Munz             1969   p      col.     181
non +    Monk & Stallings          1970   p      col.     2,000
non +    Munz & Jacobs             1971   p      col.     133
non +    Klosner & Gellman         1973   p      col.     54
non +    Plake, Thompson et al.    1980   p      col.     97
non +    Plake, Melican et al.     1983   p      col.     167
sig      Mollenkopf                1950   t      sec.     382
sig      MacNicol                  1956   t      sec.     1,500
sig      Sax & Carr                1962   t      col.     335
sig      Sax & Cromack             1966   t      col.     467
sig      Flaugher et al.           1968   t      sec.     5,000
sig      Sirotnik & Wellington     1974   t      sec.     2,463
sig      Feldt & Forsyth           1974   t      sec.     530
sig      Hambleton & Traub         1974   t      sec.     106
sig      Towle & Merrill           1975   t      col.     82
sig      Whitely & Dawis           1976   t      sec.     1,568
sig      Yen                       1980   t      elem.    2,400
sig      Kleinke                   1980   t      elem.    484
sig      Plake, Ansorge et al.     1982   t      col.     170
sig      Kingston & Dorans         1984   t      col.     4,000
sig      Hodson                    1984   t      sec.     157
sig      Lane et al. (a)           1987   p      col.     155

Note. non = not significant; non + = additional relevant effects; sig = significant; p = power; t = timed; elem. = elementary; sec. = secondary; col. = college; - = not given

The results of Table 1 show sixteen studies reporting significant effects related to item order and twenty-one reporting non-significant effects of item order.
As Leary and Dorans (1985) concluded, those studies that indicated that speediness was a factor, due to students not finishing the test within the given time limits, were studies that reported significant order effects. Item order seems to be a factor whenever speed is a factor. One suggested reason is that students with the easiest questions first will get more questions correct before they run out of time. This conclusion is supported by the fact that, with one exception, any study which had no effect from item order was a power test, and any study which had a significant item order effect had students who did not finish.

Another explanation for this coincidence is that studies that attempted to increase their statistical power to find significant effects, or studies that were using latent trait statistics, needed to have larger numbers. With larger numbers of students, the need for time limits increased, as did the likelihood that some students would not finish the exam. In addition, attempts were made to improve the generalizability of the study, so samples with a wider range of abilities than the typical first year college student sample were used. This resulted in samples from the public school system and therefore samples with a wider variation in performance speed. Time limits became a necessary and typical factor of studies that were powerful enough to find an item order effect.
Table 1 shows that the sample type and the sample size seem to be factors, but they are factors related to time. A study with over 300 students will probably have a significant item order effect or context effect, but it will probably also have time limits on such a large number of subjects. Eleven out of fifteen studies show such a pattern. As well, ten out of fifteen studies involving secondary or elementary students reported significant item order effects, but they also reported that time limits were a factor for some students. Finally, three studies involving latent trait models indicated significant item order effects, but since latent trait models require large numbers, they also involved time limits as a factor (Yen, 1980; Whitely & Dawis, 1976; Kingston & Dorans, 1984).

The obvious relationship between the speediness of the test and the item order effects may involve other factors. Whether a test is considered a timed or a power test involves a judgmental problem related to correlation as opposed to causation. While there is a correlation between item order and time, item order may, in fact, cause some students not to finish on time rather than time causing some students to be affected by the item order.
A strong correlation could also be a result of researchers who unexpectedly found an item order effect and chose to dismiss it as a side effect of their time limits; as a result, they reported students who failed to complete the test and declared that their test was speeded. On the other hand, researchers who found no effect from item order may have chosen not to mention that students did not complete the test. One related example is Hambleton and Traub (1974), who expected to find an item order effect and considered their test to be a power test. However, since some of their subjects did not finish the test, their study is considered to involve speeded tests (Leary & Dorans, 1985).

Clearly, there is a need for further research about the factors that influence item order effects. It seems that item order effects may involve a question of statistical power. So, a sample size over 400 would seem to be an appropriate size to ensure adequate power. In addition, item order effects may also involve subject variance. Therefore, sampling should be from a population with a wide range of abilities. Further, item order research may also involve controlling such confounding variables as within-subject rearrangement, gender, test content, subject motivation, and the item difficulty range, all of which have been shown by the research to be potential sources of error.
Finally, the research must use a power test to avoid problems with test scores that are the result of speediness.

Chapter III

Problem

Statement of the Problem

The main problem addressed in this study was whether or not a test with the items arranged from hardest to easiest is more difficult than the same test arranged from easiest to hardest. Two tests were used. One test had the questions arranged in the ascending difficulty sequence, and the other test had the questions arranged in the descending difficulty sequence.

However, since research would indicate that several conditions can alter the size of the item order effect, several other problems were addressed to determine the influence of these other variables. For one, the problem of within-subject rearrangement was addressed by replicating the restrictive test booklet directions used in the Hambleton and Traub (1974) study. However, to determine if the test booklet directions are in fact a significant factor, a control group with unrestrictive directions was also used.

A third problem was to determine whether or not low ability students performed differently than high ability students with the two different item order arrangements or with the two different test booklet formats. A large and diverse sample was used to ensure an ample range of ability levels.
Finally, the problem of whether or not studies using latent trait statistics are comparable to studies using classical statistics was addressed by using both types of statistics in the analysis of the data.

Rationale

Forty years of research into item order arrangements has still not resolved the issue of whether or not the item order sequence will affect test scores. This study was intended to help clarify this controversy. In addition, several variables have been suggested by the research as factors which might influence the presence or absence of an item order effect.

One possible variable is indicated by the research of Hambleton and Traub (1974). In their study they did not allow students to rearrange the order of the items by doing only the easy ones first regardless of the item's location in the test. This prevented students from using a test taking strategy that some students seem to possess (Millman & Bishop, 1965; Tuck, 1978; Rindler, 1980; Klimko, 1984; Allison & Thomas, 1986). However, as Hambleton and Traub concluded, when any students who may have this strategy are not able to use it, then item order has an effect that is statistically significant. They may be prevented from using their strategy by the test booklet, as in Hambleton and Traub's study, or they may be hindered in using their strategy by the presence of time limits, which tend to discourage skipping back and forth between items.
If this within-subject rearrangement technique is a significant factor, then much of the previous research on item order must be more seriously questioned for its failure to control this variable. The validity of previous item order studies can be questioned since a study may be measuring an additional factor of knowledge and usage of a test-wiseness strategy.

A second variable is the possibility that when low achieving students are involved in the study, these students may lack the skill to effectively use the strategy of within-subject rearrangement and are therefore affected by the item order. This is in keeping with the more common historically held view that supports the need for arranging items from easy to hard to avoid frustrating low achieving students. Their frustration may be caused by a lack of this test taking strategy. If some examinees can control the detrimental effects of item order, then tests should be arranged from easy to hard to obtain the best results from students who are not adept at omitting the hard questions. Also, students who are unaware of this strategy could be taught how to use the within-subject rearrangement strategy to perform better on tests.
A third variable could be that latent trait item parameters are more sensitive to item order effects than classical model statistics. Although Yen (1980) found both classical and latent trait statistics to be sensitive to the item difficulty sequence, this could be one reason why the studies involving item order and latent trait statistics reported that changes in the context were associated with significant changes in the item parameters (Whitely & Dawis, 1976; Kingston & Dorans, 1984). It needs to be determined if non-significant results with classical statistics would be significant if latent trait statistics were used instead. If latent trait statistics indicate different conclusions than classical statistics, then it would be inappropriate to compare results from a test using latent trait statistics with one that used classical statistics. In addition, if latent trait statistics are particularly sensitive to item order changes, then the latent trait model assumption of local independence must certainly be questioned.

Hypotheses

Hypothesis #1: A test with the items arranged from easy to hard will be easier than the same test with the items arranged from hard to easy.
This effect will be evident from the easy to hard arrangement having a higher mean, a higher average of classical item p-level difficulty indexes, and a lower average of latent trait model item parameter difficulty indexes.

Hypothesis #2: A test with test booklet directions which restrict within-subject rearrangement will be more difficult than tests without such restrictions. This effect will be evident with the restricted test having a lower mean and a lower classical difficulty index. The latent trait model difficulty indexes will be higher.

Hypothesis #3: There will be an interaction effect between the ability of the student, the test arrangement and the test format. Students with high ability will have their lowest mean and classical difficulty index on the restricted format, hard to easy arrangement. Their latent trait model difficulty index will be highest on that same combination. On the other hand, low ability students will have their lowest mean, their lowest classical difficulty index, and the highest b-value parameter on any hard to easy arrangement. Figure 1 and Figure 2 display this hypothesis.
Figure 1
Low Ability Students

                Unrestricted        Restricted
Easy to Hard    Mean = high         Mean = high
                p-level = high      p-level = high
                b-value = low       b-value = low

Hard to Easy    Mean = low          Mean = low
                p-level = low       p-level = low
                b-value = high      b-value = high

Figure 2
High Ability Students

                Unrestricted        Restricted
Easy to Hard    Mean = high         Mean = high
                p-level = high      p-level = high
                b-value = low       b-value = low

Hard to Easy    Mean = high         Mean = low
                p-level = high      p-level = low
                b-value = low       b-value = high

Hypothesis #4: Classical and latent trait test statistics will indicate similar patterns of results to each other. Comparisons between classical p-level statistics and latent trait b-value statistics will have a high degree of correlation.

Chapter IV

Method

Design

The design was a post-test only control group design. Four different test booklets were used as the four different treatment groups. The booklets had either restrictive or unrestrictive directions, and they had either an easy to hard item order or a hard to easy item order. The students were placed randomly into either the control group of unrestricted, easy to hard test booklets or into one of the other treatment groups that used the other test booklets. The mathematics test itself served as the treatment and as the post-test.

The independent variables were item order, test format, and ability level. Item order was either the easy to hard arrangement or the hard to easy arrangement. Test format was either restricted or unrestricted.
Ability level was rated by the teacher on a scale from "1" to "6". The 10% of students with the lowest ability in mathematics received a "1". A "2" was used for the next 15%, a "3" was used for the next 25%, a "4" was used for the next 25%, a "5" was used for the next 15%, and a "6" was used for the 10% of students with the highest mathematical ability.

The dependent variables were the test scores and the item difficulties. Item difficulty was compared using classical p-level statistics and using latent trait b-value item parameters.

Subjects

Students in grade eight from approximately 25 different classrooms in three different secondary schools in two suburban school districts near Vancouver, B.C. were used. The 590 students were from every grade eight class in each of the three schools, so a variety of socioeconomic backgrounds and ability levels were included in the sample. Students with learning problems significant enough to not be enrolled in a regular classroom, or students who had an excused absence on the day the test was administered, did not participate in the study.

Instrument and Tasks

A grade 8 mathematics test was developed using 40 randomly chosen items from the 150 items in the Second I.E.A. International Study of Mathematics (Robitaille & Garden, 1987). The difficulty levels had a range from p = .13 to p = .89 with the mean of difficulty levels at .492.
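
For reference, the two difficulty indexes named above can be written out as follows. This is the standard textbook formulation rather than anything quoted from the thesis, and the symbols (u_ij for a scored response, N for the number of examinees, theta_j for ability, b_i for item difficulty) are notation assumed here for illustration. The Rasch form is shown because the b-values analyzed in the Results chapter are Rasch item parameters.

```latex
% Classical p-level of item i: the proportion of the N examinees who answer
% item i correctly, where u_{ij} = 1 for a correct response by examinee j
% and u_{ij} = 0 otherwise.
p_i = \frac{1}{N} \sum_{j=1}^{N} u_{ij}

% Rasch (one-parameter logistic) model: probability that examinee j, with
% ability \theta_j, answers item i, with difficulty b_i, correctly.
P(u_{ij} = 1 \mid \theta_j, b_i) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}
```

Under these definitions a higher p-level marks an easier item while a higher b-value marks a harder one, which is why the hypotheses pair a higher mean and p-level with a lower b-value.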
There was no attempt to reflect the provincial curriculum in the items chosen since individual schools differed in the timing of their instruction of the curriculum.

Four mathematics test booklets were prepared using these 40 randomly chosen questions. Two of the test booklets had the questions arranged from easy to hard while the other two booklets were arranged from hard to easy. In addition, both arrangements were presented in one of two test booklet formats. One format had instructions which directed the students to do the questions only in the order that they were presented. The other format had instructions which allowed the student to look back and forth through the test booklet. Both booklet formats were printed with one question per page, and both booklet formats were colour coded so that test takers and test supervisors could be sure that the proper restricted or unrestricted instructions were being followed.

In order to make a practical assessment of students' ability, teachers were asked to complete a simple six point rating scale. This scale asked the teacher to use a normal curve distribution to rank the students' mathematics ability based on the teacher's personal judgement.
A rating of "1" indicated the students with the lowest ability in math, and a rating of "6" indicated the students with the highest ability in math.

Procedure

All four test booklet types were distributed alternately to the students in each classroom. It is assumed that this systematic distribution produced an essentially random sample. Students were asked to put their name, school, and mathematics class identification information on the test booklet answer sheet. Next, students were given directions for completing the test and the restrictions for those with the restricted format booklets. Both the motivation and the cooperation of the students were sought in the directions. The students were asked to cooperate since this was part of an experiment to see if going back and forth in a test booklet would help students who had to take tests. In addition, the students were told that even though it was part of an experiment, the results might be used by the teacher in determining the student's final grade. Students were then allowed the remaining fifty minutes of class time to complete the test.

Analysis

The test scores were analyzed with a 2 x 2 x 6 Anova format using the S.P.S.S.-X program. It was assumed that the means were normally distributed, homogeneous in variance, and independent. It was also assumed that the factors of school and classroom were non-significant random factors with equal means.

The item difficulties were analyzed with a Manova repeated measures format using the S.P.S.S.-X program.
It was assumed that each item's difficulty was measured four times, with each test booklet being presented to equivalent samples of the population. Further, it was assumed that the means were normally distributed, homogeneous in variance, and independent. It was also assumed that the factors of school and classroom were non-significant random factors with equal means.

The alpha level chosen for the tests of significance was .05. This level was chosen to replicate the statistical conditions established in the study by Hambleton and Traub (1974).

Chapter V

Results

A mathematics test of forty questions was administered to 590 grade eight students. The forty questions were presented in four different test booklets with the four booklets randomly given to the students. Two of the booklets had the items arranged from easiest to hardest and the other two booklets had the questions arranged from hardest to easiest. Further, each sequencing format was presented in two different test booklets. One booklet had directions which restricted subjects from rearranging the item order, and the other booklet had no such restrictions.

The four tests were designed to answer the following questions:

1) Does altering the order of test items result in changes in means of the test scores?

2) Do directions on the test booklet which restrict the within-subject rearrangement of test items result in greater effects associated with the changes in item order?
3) Is there an interaction effect between a student's ability, the item order and the test booklet format? Specifically, do low ability students have their lowest score whenever they have a test which begins with hard items, and do high ability students only have low scores when the test begins with hard items and when they are not allowed to alter the item difficulty sequence?

4) Are any changes in p-level difficulty statistics similar to changes in b-value item parameter statistics?

Main Effects

The means of the two different item sequences were found to be significantly different (p < .001). The 297 students who had an exam with an easy to hard sequence had an average score on the 40 item test of 18.56. The 293 students who had a test which began with hard questions had an average score of only 15.90.

The means of the two different test formats were not found to be significantly different. The 296 students with the unrestricted test format booklet had an average score of 17.58. On the other hand, the 294 students with the restricted sequence did about the same with an average score of 16.89.

The four formats demonstrated a high level of reliability. Cronbach's coefficient alpha was calculated for each format. The first format with an easy to hard sequence and unrestricted directions had a reliability of .845. The second format with a hard to easy sequence and unrestricted directions had a reliability of .855. The third format with an easy to hard sequence and restricted directions had a reliability of .829. Finally, the fourth format with a hard to easy sequence and restricted directions had a reliability of .835.

The means of the different teacher-rated ability levels were found to be significantly different (p < .001). The teacher rating and the student's score were significantly correlated (p < .001), with a Pearson correlation coefficient of .6340. The results are presented in Table 2.

Table 2
Test Means and Sample Sizes of Student Ability Levels

Level   Ability Label   Mean    n
1       Lowest 10%      11.27   44
2       Next 15%        11.76   90
3       Next 25%        14.96   178
4       Next 25%        18.92   112
5       Next 15%        21.63   115
6       Highest 10%     26.41   51
Total                   17.37   590

Interactions

There were no significant interaction effects. Low ability students did not seem to be any more likely to receive a lower score on a test which began with hard questions than did high ability students. The effects associated with changes of item order affected all ability groups equally. As well, test directions did not have any interaction effects. So, in addition to the fact that there was no overall main effect associated with different test directions, different ability groups were not more likely to receive higher scores if they had a different type of test direction. The results of the 2 x 2 x 6 analysis of variance are presented in Table 3.
Table 3
Summary of Analysis of Variance of Test Scores by Ability (Ab), Item Order (Or), and Test Directions (Dir)

Source           Sum of Squares    DF    Mean Square         F     Prob.
Main                  12743.090     7       1820.441    65.715    < .001
Ability               11629.609     5       2325.922    83.962    < .001
Order                   694.458     1        694.458    25.069    < .001
Directions               32.034     1         32.034     1.156      .283
Ab x Or                 255.123     5         51.025     1.842      .103
Ab x Dir                 80.120     5         16.024      .578      .717
Or x Dir                  6.453     1          6.453      .233      .630
Ab x Or x Dir           179.829     5         35.966     1.298      .263
Explained             13249.491    23        576.065    20.795    < .001
Residual              15679.289   566         27.702
Total                 28928.780   589         49.155

Item Difficulties

Due to the close mathematical relationship between the mean and the p-level, it is not surprising that a significant difference was also found between the means of the p-levels of each test. A multivariate analysis of variance for repeated measures was used to determine if there was a significant difference between the item difficulties on each test. Each item had p-level values from four tests. The samples were assumed to be equivalent, with the variance primarily a result of the explained variance between the tests or ability levels as reported in the previous ANOVA results in Table 3. As expected, a significant difference was found between the different item difficulties of each test, as shown in Table 4.

Table 4
Summary of Multivariate Analysis of Variance of Test Item P-levels

Source          Sum of Squares    DF    Mean Square        F     Prob.
P-levels                   .10     3            .03    10.26    < .001
Within Cells               .18    57           .004

The same type of analysis was applied to the Rasch item parameters. The Rasch item parameters were estimated using the Microcat Testing System (1988) with ability levels standardized. Table 5 shows that the Rasch item parameters have a level of significance similar to that of the classical item difficulties in Table 4.

Table 5
Summary of Multivariate Analysis of Variance of Test Item B-values

Source          Sum of Squares    DF    Mean Square        F     Prob.
B-values                  3.14     3           1.05     9.09    < .001
Within Cells              6.56    57           0.12

The similarity of results is also apparent when comparing the means of the test scores, the p-levels, and the b-values for each test as given in Table 6. The changes in the p-levels and the test means are inversely related to the changes in the b-values.

Table 6
Test Format P-level Means, B-value Means, and Score Means

Format    Item           Test    P-level    B-value    Score
Number    Order          Type    Mean       Mean       Mean
1         Easy - Hard    Unr.    .474       .128       18.946
3         Easy - Hard    Res.    .454       .295       18.169
2         Hard - Easy    Unr.    .405       .493       16.190
4         Hard - Easy    Res.    .390       .652       15.603

Note. Unr. = unrestricted directions; Res. = restricted directions.

The Pearson correlation coefficients in Table 7 also demonstrate a strong relationship between an item's p-levels and its b-values regardless of which test format is used with the item.
Table 7
Pearson Correlation Coefficients of P-levels and B-values for Easy to Hard (EH), Hard to Easy (HE), Restricted (R) and Unrestricted (U) Test Formats

P-level               B-value Test Formats
Test Formats    EH, U      HE, U      EH, R      HE, R
EH, U          -.9652     -.9001     -.9225     -.8141
HE, U          -.9385     -.9050     -.9350     -.8294
EH, R          -.8796     -.8558     -.8818     -.7697
HE, R          -.9125     -.8820     -.9099     -.8181

Note. All correlations are significant at p < .001.

Finally, Table 8 further demonstrates the similarities between classical and latent trait statistics. Table 8 uses theta values calculated from b-values standardized for difficulty.

Table 8
Summary of Analysis of Variance of Theta Values by Ability (Ab), Item Order (Or), and Test Directions (Dir)

Source           Sum of Squares    DF    Mean Square         F     Prob.
Main                    239.955     7         34.279    55.127    < .001
Ability                 217.250     5         43.450    69.875    < .001
Order                    14.997     1         14.997    24.117    < .001
Directions                 .212     1           .212      .341      .559
Ab x Or                   5.052     5          1.010     1.625      .151
Ab x Dir                  2.614     5           .523      .841      .521
Or x Dir                   .002     1           .002      .004      .951
Ab x Or x Dir             4.160     5           .832     1.338      .247
Explained               251.773    23         10.947    17.604    < .001
Residual                351.954   566           .622
Total                   603.727   589          1.025

Chapter VI
Summary and Conclusions

Purpose of the Study

This study was an attempt to determine if the sequence of test items has an effect on the performance of students. Further, this study tried to determine if some examinees were able to mitigate any such effects by personally rearranging the item order as presented by the researcher in the test booklet.
Whether or not the item order can have an effect on test scores has been an area of research for forty years. Recently, with the increased usage of latent trait statistics, the issue of context effects and local independence has become more of a concern. As a result, this study also examined the effect of item order on latent trait statistics.

Four different test booklets were used and given randomly to 590 grade eight math students. Two booklets had the items arranged in sequence from easy to hard questions, and the other two booklets had the items arranged in sequence from hard to easy. For each type of sequence, one booklet allowed the students to rearrange the order of item presentation by skipping back and forth between questions, and one booklet did not allow such within-subject rearrangement.

The results of the study supported several of the hypotheses about the effects of item order and the factors associated with it.

Sequence

The sequence of the test items can affect the performance of the students. Students who took the test with the items arranged from easy to hard had a significantly higher mean than the students who had the hard to easy arrangement (p < .001). The students who took the easy to hard tests had a mean score of 18.6, compared with a mean score of 15.9 for the students who took the hard to easy tests. The hard to easy mean was therefore about 2.7 points lower, a difference of roughly 7% of the 40 possible marks.
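As a rough check on the size of the sequence effect, the gap between the two reported means can be expressed in several ways. The sketch below uses only the means reported in the results (18.56 and 15.90 on the 40 item test); the function and variable names are illustrative and are not part of the original analysis.

```python
# Means reported for the two item sequences on the 40-item test.
EASY_TO_HARD_MEAN = 18.56
HARD_TO_EASY_MEAN = 15.90
N_ITEMS = 40


def score_gap(higher_mean, lower_mean, n_items):
    """Return the raw gap, the gap as a percentage of the test length,
    and the gap as a percentage of the higher mean."""
    diff = higher_mean - lower_mean
    pct_of_test = 100.0 * diff / n_items
    pct_of_mean = 100.0 * diff / higher_mean
    return diff, pct_of_test, pct_of_mean


diff, pct_of_test, pct_of_mean = score_gap(
    EASY_TO_HARD_MEAN, HARD_TO_EASY_MEAN, N_ITEMS
)
print(f"raw difference:        {diff:.2f} items")
print(f"as % of 40 items:      {pct_of_test:.2f}%")
print(f"as % of the higher mean: {pct_of_mean:.2f}%")
```

Under this reading, the figure of about 7% corresponds to the difference expressed as a share of the total marks available; relative to the easy to hard mean, the same gap is closer to 14%.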
Although there was no actual measure of the students' theoretical frustration and discouragement while they were taking the test, the common concern about beginning a test with too many hard questions may have some justification. It is clear that a hard to easy arrangement can result in lower scores for the students. As a result, caution should be exercised when developing two forms of the same test. It is possible to create formats with significantly different test characteristics even though the items are identical.

Directions

The difference between the two types of directions was not significant. However, there was a trend toward significance, since the mean of a test with restricted directions was lower than that of a comparable test with unrestricted directions. In comparing the overall test scores, those students who were allowed to do the questions in any order and to go back and forth between the questions had a mean of 17.6. The students who could not rearrange the order of the exam had a mean score of 16.9, which was not significantly lower.

The effort to prevent students from rearranging the item order does not seem to greatly improve the likelihood of finding a significant item order effect. Widespread and effective use of this test-wiseness strategy was not clearly demonstrated.
Therefore, this study supports the conclusion of Allison and Thomas (1986) that there is not enough evidence to believe that the majority of item order studies would have had different findings if this factor had been controlled. However, in light of the trends indicated in the data, this factor may be a problematic variable in some situations, as Hambleton and Traub (1974) suggested.

Ability

The main effect associated with ability levels was significant. The fact that teachers are able to predict how well their students will do on a mathematics test is not an unexpected finding. As a result, no conclusions will be drawn from this finding.

Interactions

A more noteworthy finding is that there were no significant interactions between any of the factors, including ability. Students who were considered to have limited mathematical ability were affected by the sequence and the directions to the same degree as the students who were considered to have high mathematical ability. This calls into question one of the justifications for the concern over item order. The easy to hard order does not appear to help the low ability student any more than it helps the high ability student. So while a concern for the feelings of low achieving students is admirable, the results of this study provide no special justification for arranging the test items from easy to hard on their behalf.
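The absence of interaction effects can be checked against the sums of squares reported in Table 3. The following sketch recomputes each interaction F ratio as the effect mean square divided by the residual mean square. The figures are taken from Table 3; the names used here are illustrative only, and the sketch does not reproduce the significance probabilities, which would require the F distribution.

```python
# Residual (error) term from Table 3.
RESIDUAL_SS = 15679.289
RESIDUAL_DF = 566

# Interaction rows from Table 3: name -> (sum of squares, degrees of freedom).
INTERACTIONS = {
    "Ab x Or": (255.123, 5),
    "Ab x Dir": (80.120, 5),
    "Or x Dir": (6.453, 1),
    "Ab x Or x Dir": (179.829, 5),
}


def f_ratio(effect_ss, effect_df, resid_ss, resid_df):
    """F = (effect SS / effect df) / (residual SS / residual df)."""
    effect_ms = effect_ss / effect_df
    resid_ms = resid_ss / resid_df
    return effect_ms / resid_ms


f_values = {
    name: round(f_ratio(ss, df, RESIDUAL_SS, RESIDUAL_DF), 3)
    for name, (ss, df) in INTERACTIONS.items()
}

for name, f in f_values.items():
    print(f"{name:<14} F = {f:.3f}")
```

Each recomputed ratio agrees with the F column of Table 3 to three decimal places, and all four are small compared with the F ratios for the ability and order main effects.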
Another nonsignificant result involves the interaction between the test directions and the ability levels. The lack of any significant differences contradicts the findings of Rindler (1980) that all ability levels possess the test-wiseness strategy of skipping questions, but use it with varying degrees of success, as demonstrated by the complex interactions found in Rindler's study. However, the results of these studies do not preclude the value of teaching test-wiseness strategies to help students learn to use the item rearrangement strategy effectively.

Latent Trait

Studies which have used latent trait statistics in their analysis of item order or context effects have all found significant differences in their difficulty parameters. Only one study (Yen, 1980) used both latent trait statistics and classical statistics, and that study found the same significant effect of item order using both types of statistics.

Classical difficulty levels and latent trait difficulty levels were found to be significantly correlated in this study, and both demonstrated the same significant effect associated with changes to the item order. It can be concluded that the studies which used latent trait statistics and found a significant effect from changes in context would probably have found similar results had they used classical statistics.
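The close, negative relationship between p-levels and b-values can be illustrated with a small sketch. It computes each item's p-level (proportion correct) from a response matrix and then applies a logit transformation, ln((1 - p) / p), as a crude stand-in for a latent trait difficulty. This is an assumption made for illustration: it is not the estimation procedure used by the Microcat Testing System, and the response matrix below is hypothetical rather than data from this study.

```python
import math


def p_levels(responses):
    """Classical item difficulty: proportion of examinees answering each item correctly."""
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_examinees for j in range(n_items)]


def logit_difficulty(p):
    """Crude logit-scale difficulty; higher values mean harder items.
    Illustrative only, not a Rasch parameter estimate."""
    return math.log((1.0 - p) / p)


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical responses: 6 examinees (rows) by 4 items (columns), 1 = correct.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]

p = p_levels(responses)
b = [logit_difficulty(value) for value in p]
r = pearson(p, b)

print("p-levels:", [round(value, 3) for value in p])
print("logit difficulties:", [round(value, 3) for value in b])
print("correlation:", round(r, 4))
```

Because an easier item has a higher p-level and a lower difficulty on the logit scale, the correlation is strongly negative, which is the same direction as the coefficients reported in Table 7.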
Further, it is also possible that if some previous classical studies had used larger samples, their results might have been similar to those of the latent trait based studies.

The assumption of local independence cannot be supported by the results of this study. Latent trait difficulty parameters were affected by the difficulty parameters of preceding items. Caution must be exercised when comparing tests by using latent trait statistics, since a test which begins with harder questions cannot be assumed to be parallel to a test which begins with easier questions.

Limitations

The conclusions of this study must be limited to comparisons between a test arranged from easy to hard and one arranged from hard to easy. Other formats such as random or spiral were not included. Whether a random arrangement that began with primarily hard questions would have significantly lower scores than a random arrangement that began with primarily easy questions is open to conjecture and future research.

A second area of limitation involves the content of the tests. This study confirms many of the results found with studies that used quantitative tests (Hambleton & Traub, 1974; Feldt & Forsyth, 1974; Towle & Merrill, 1975; Yen, 1980; Plake, Ansorge, et al., 1982; Kingston & Dorans, 1985). However, this study may only generalize to mathematics tests. One possible reason is that the difficulty of an item may be highly subjective.
The statistical difficulty level may only be an estimate of the actual difficulty that is perceived by the individual encountering the item. In the case of a mathematics question, the statistical difficulty level may be a good predictor of how difficult each individual perceives the question to be. On the other hand, a science question may be statistically very difficult because most subjects answer it incorrectly, yet be perceived as a very easy question by the respondents due to the effectiveness of the distractors. As a result, a series of statistically difficult science questions may not result in the same effects as a series of difficult mathematics questions.

A third limitation is the result of the definition of mathematical ability used in this study. The ability levels used in this study were based on a teacher rating system and would be strongly influenced by classroom behaviour, student personality, and the errors of teacher judgement. Even though the teacher rating scale had a significant correlation of .6340 with the mathematics test scores (p < .001), the results of this study may differ from those of a study which uses a more valid criterion of mathematical ability.

The issue of difficulty involves another limitation. Items used in this study do not have difficulty levels which are identical to those of other studies.
The conclusions of this study are based on a series of items whose pre-tested difficulty levels had an average p-value of .49 with a standard deviation of .21. The p-levels ranged from .13 to .89. Unfortunately, it is not clear if the results of this study compare with the results of other studies since, as Hambleton and Traub (1974) pointed out, many studies do not give information about their item difficulty. The degree to which the mean and variation of difficulty levels influence item order effects is a subject for future research.

A definite limitation of this study is that the results only generalize to children enrolled in the intermediate or secondary school programs of Canadian public schools with a wide diversity of student ability levels. This study may not be applicable to a college setting where, as some previous research has indicated, changes in item order do not result in significant changes in the scores of college students. However, the results of this study do call into question some of the generalizations of previous research which used college students to conclude that item order does not have an effect.

Conclusions about latent trait statistics are restricted by the small sample size. Although there were 590 students in total, there were only about 147 students taking each test.
The latent trait parameters for each test format were established using only the students who were given that particular test booklet.

Generalizability may also be limited by the possible interaction of the selection process and the tests used in this study. This limitation is outlined as a possible weakness of post-test only control group designs by Campbell and Stanley (1963). While the three schools involved in the study are hopefully representative of the typical junior secondary school, it is possible that they were atypical. For one, they were the only three out of the five schools asked which agreed to participate in the study. The two schools which declined to participate did so because they felt that the district's labour difficulties had already significantly shortened their instructional time. It should be noted that two of the participating schools did not feel that the shortened instructional time was a hindrance to their participation. Therefore, since the factors involved with participation seem to be unrelated to the factors under study, it can be concluded that there probably was not any interaction between the selection process and the tests used in the study.

Campbell and Stanley also point out that the design is limited by the possible effect of reactive elements.
To a certain extent students were affected by the unusual nature of the testing procedure. For one, students who received the unrestricted test booklets may have reacted more positively to the testing situation, since some expressed pleasure at having received the unrestricted test booklet. Unfortunately, the testing situation may have also limited the possible effect related to directions, since some students may not have fully cooperated with the directions to not rearrange the item order. While the majority of students were cooperative, a few students in each school seemed to be uncooperative since they assumed that it was really just some type of an experiment. Those students passively resisted teacher attempts to have them follow the directions and do their best. This limits the accuracy of any conclusions about the effect of the test directions.

On the other hand, the item order effects were less likely to be influenced by such a factor, since the students were not told that the tests were also prepared with different item sequences. In fact, some teachers said that the hard to easy sequence was more likely to cause uncooperative behaviour rather than the uncooperative behaviour limiting the item order effects.
Nevertheless, despite the limitations of this study, its conclusions should not be in any way limited to speeded tests. Every attempt was made to have this test be a power test within the limitations of testing 590 students in the public school system. Most students easily finished the test within the time allotted. The teachers who administered the test stated that the time limits were ample and generous. Nonetheless, there are students who did not complete the test, and there are items that were omitted at the end of the test and technically classified as "not reached".

However, it is difficult to distinguish between a power test, in which all students completed the test and omitted only the most difficult questions, and a speeded test, in which some questions were not reached by some students. It is not realistically possible for tests of the type used in this study to avoid a small percentage of not reached questions. For example, one student, after trying the first question of the hard to easy sequence booklet, threw his test across the room and refused to complete the rest of the test. As a result, thirty-nine questions on his test can erroneously be scored as not reached rather than omitted. As another example, students were observed to use a test taking strategy of doing the questions at the beginning and the end first, while questions in the middle were left until last.
If these students did not have enough time to complete the test, their not reached questions would then be scored as omitted.

The difference between not reached and omitted questions is also not clear, since students who took the easy to hard format had 1.7% of their questions not reached, but the students with the hard to easy format had .9% of their questions not reached. Students with hard questions at the end of the test omitted more questions at the end of the test, which increases the number of technically not reached questions. While omitting questions on a power test is very different from not reaching questions on a speeded test, it is not accurate to make a statistical distinction between the two under the conditions of this study. For all intents and purposes, the tests used in this study were power tests with some students choosing to omit questions.

Implications

Caution should still be expressed by writers in the measurement field about item sequencing. Under certain circumstances, it is possible for the context of the items to influence the statistics of the items. Care must be taken in the development of parallel forms of a test to prevent significant differences in scores as a result of differences in the item sequencing.

Future Research

Many areas remain as subjects for future research.
For one, the d i f f e r e n t i a t i o n between the s i x d i f f e r e n t a b i l i t y groups c o u l d be the b a s i s of f u r t h e r r e s e a r c h . Students c o u l d be c l a s s i f i e d i n t o d i f f e r e n t groups based on a p r e - t e s t t h a t measures i n t e l l e c t u a l a b i l i t y , mathematical achievement, or both. The s c o r e s from those t e s t s c o u l d be used t o i d e n t i f y more or fewer groups as needed f o r the a n a l y s i s of any i n t e r a c t i o n between a b i l i t y l e v e l and other f a c t o r s . A second area of r e s e a r c h i s to determine i f th e r e i s a s i g n i f i c a n t d i f f e r e n c e between s u b j e c t i v e item d i f f i c u l t y and s t a t i s t i c a l item d i f f i c u l t y . Students c o u l d r a t e the s u b j e c t i v e d i f f i c u l t y of t e s t s , and those r a t i n g s c o u l d be compared with s t a t i s t i c a l r a t i n g s to determine the c o r r e l a t i o n . D i f f e r e n t s u b j e c t areas c o u l d be used to compare the c o r r e l a t i o n between content areas to determine i f the types of qu e s t i o n s with the hi g h e s t s u b j e c t i v e and s t a t i s t i c a l c o r r e l a t i o n are the content areas with the g r e a t e s t item order e f f e c t s . V a r i a t i o n s i n item and t e s t d i f f i c u l t y c o u l d a l s o be examined. For one, the number of d i f f i c u l t items a t the (115) b e g i n n i n g of a t e s t c o u l d be v a r i e d t o d e t e r m i n e the maximum number of d i f f i c u l t items t h a t c o u l d be t o l e r a t e d by s t u d e n t s w i t h o u t r e s u l t i n g i n lower t e s t s c o r e s . V a r i a t i o n s of the mean and range of i t e m d i f f i c u l t y would a l s o g i v e e v i d e n c e t o the s e n s i t i v i t y of s t u d e n t s t o i t e m d i f f i c u l t y . (116) References Ahmann, J . S., & Glock, M. D. (1963). 
E v a l u a t i n g p u p i l  growth (2nd ed.). Boston: A l l y n & Bacon. A l l i s o n , D. E. (1984). The e f f e c t of i t e m - d i f f i c u l t y sequence, i n t e l l i g e n c e , and sex on t e s t performance, r e l i a b i l i t y , and item d i f f i c u l t y and d i s c r i m i n a t i o n . Measurement and E v a l u a t i o n i n Guidance, i 6 , 2 1 1 - 2 1 7 . A l l i s o n , D. E., & Thomas, D. C. (1986). I t e m - d i f f i c u l t y sequence i n achievement examinations: Examinees' p r e f e r e n c e s and t e s t - t a k i n g s t r a t e g i e s . P s y c h o l o g i c a l Reports, 59, 867-870. Berger, V. F., Munz, D. C , Smouse, A. D., & A n g e l i n o , H. (1969). The e f f e c t s of item d i f f i c u l t y sequencing and a n x i e t y r e a c t i o n type on a p t i t u d e t e s t performance. J o u r n a l of Psychology. 71, 253-258. Brenner, M. H. (1964). Test d i f f i c u l t y , r e l i a b i l i t y , and d i s c r i m i n a t i o n as f u n c t i o n s of item d i f f i c u l t y o r d e r . J o u r n a l of A p p l i e d Psychology. 48, 98-100. Campbell, D. T., & S t a n l e y , J . C. (1963) Exper imental  and q u a s i - e x p e r i m e n t a l designs f o r r e s e a r c h . Boston: Houghton M i f f l i n . Cronbach, L. J . (1946). Response s e t s and t e s t v a l i d i t y . E d u c a t i o n a l and P s y c h o l o g i c a l Measurement. 6, 475-494. Cronbach, L. J . (1950). F u r t h e r evidence on response s e t s and t e s t d e s i g n . E d u c a t i o n a l and P s y c h o l o g i c a l  Measurement, 10, 3-31. F e l d t , L. S., & F o r s y t h , R. A. (1974). An examination of the context e f f e c t i n item sampling. J o u r n a l of E d u c a t i o n a l  Measurement, 1 1 , 73-82. Flaugher, R. L., Melton, R. S., & Myers, C. T. (1968). Item rearrangement under t y p i c a l t e s t c o n d i t i o n s . E d u c a t i o n a l and P s y c h o l o g i c a l Measurement, 28, 813-824. French, J . L., & Greer, D. 
(1964). E f f e c t of t e s t - i t e m arrangement on p h y s i o l o g i c a l and p s y c h o l o g i c a l behavior i n p r i m a r y - s c h o o l c h i l d r e n . J o u r n a l of E d u c a t i o n a l Measurement. 1 , 151-153. Hambleton, R. K., & Traub, R. E. (1974). The e f f e c t s of item order on t e s t performance and s t r e s s . J o u r n a l of  Experimental E d u c a t i o n , 43, 40-46. (117) Hodson, D. (1984). The e f f e c t of changes i n item sequence on student performance i n a m u l t i p l e - c h o i c e c h e m i s t r y t e s t . J o u r n a l of Research i n Science Teaching. 21, 489-495. Hopkins, C. D., & Antes, R. L. (1985). Classroom measurement & e v a l u a t i o n (2nd.). I t a s c a , IL: Peacock. Huck, S. W., & Bowers, N. D. (1972). Item d i f f i c u l t y l e v e l and sequence e f f e c t s i n m u l t i p l e - c h o i c e achievement t e s t s . J o u r n a l of E d u c a t i o n a l Measurement. 9, 105-111. Kestenbaum, J . M., & Weiner, B. (1970). Achievement performance r e l a t e d to achievement m o t i v a t i o n and t e s t a n x i e t y . J o u r n a l of C o n s u l t i n g and C l i n i c a l Psychology. 34, 343-344. Kingston, N. M., & Dorans, N. J . (1984). Item l o c a t i o n e f f e c t s and t h e i r i m p l i c a t i o n s f o r IRT equating and a d a p t i v e t e s t i n g . A p p l i e d P s y c h o l o g i c a l Measurement. 8, 147-154. K l e i n k e , D. J . (1980). Item order, response l o c a t i o n and examinee sex and handedness and performance on a m u l t i p l e - c h o i c e t e s t . The J o u r n a l of E d u c a t i o n a l  Research. 73, 225-229. Klimko, I. P. (1984). Item arrangement, c o g n i t i v e e n t r y c h a r a c t e r i s t i c s , sex, and t e s t a n x i e t y as p r e d i c t o r s of achievement examination performance. J o u r n a l of Experimental E d u c a t i o n . 52, 214-219. Klo s n e r , N. C , & Gellman, E. K. (1973). 
The e f f e c t of item arrangement on classroom t e s t performance: I m p l i c a t i o n s f o r content v a l i d i t y . E d u c a t i o n a l and P s y c h o l o g i c a l Measurement, 33, 413-418. Lane, D. S., B u l l , K.-S., Kundert, D. K., & Newman, D. L. (1987). The e f f e c t s of knowledge of item arrangement, gender, and s t a t i s t i c a l and c o g n i t i v e item d i f f i c u l t y on t e s t performance. E d u c a t i o n a l and P s y c h o l o g i c a l Measurement. 47, 865-879. Leary, L. F., & Dorans, N. J . (1985). I m p l i c a t i o n s f o r a l t e r i n g the context i n which t e s t items appear: A h i s t o r i c a l p e r s p e c t i v e on an immediate concern. Review of E d u c a t i o n a l Research. 55, 387-413. MacNicol, K. (1970). E f f e c t s of v a r y i n g order of item d i f f i c u l t y i n an unspeeded v e r b a l t e s t . Unpublished manuscript, E d u c a t i o n a l T e s t i n g S e r v i c e , P r i n c e t o n , NJ. (118) Marso, R. N. (1970). Test item arrangement, t e s t i n g time, and performance. J o u r n a l of E d u c a t i o n a l Measurement, 7, 113-118. Mi c r o c a t t e s t i n g system (3rd Ed.). (1988). S t . P a u l , MN: Assessment Systems C o r p o r a t i o n . Millman, J . , & Bishop, C. H. (1965). An a n a l y s i s of t e s t - w i s e n e s s . E d u c a t i o n a l and P s y c h o l o g i c a l Measurement, 25, 707-726. Mollenkopf, W. G. (1950). An experimental study of the e f f e c t s on i t e m - a n a l y s i s data of changing item placement and t e s t time l i m i t . Psychometrika, 15, 291-315. Monk, J . J . , & S t a l l i n g s , W. M. (1970). E f f e c t s of item order on t e s t s c o r e s . The J o u r n a l of. E d u c a t i o n a l , Research, 63, 463-465. Munz, D. C , & Jacobs, P. D. (1971). An e v a l u a t i o n of p e r c e i v e d i t e m - d i f f i c u l t y sequencing i n academic t e s t i n g , B r i t i s h J o u r n a l of E d u c a t i o n a l Psychology. 
41, 195-205.

Munz, D. C., & Smouse, A. D. (1968). Interaction effects of item-difficulty sequence and achievement-anxiety reaction on academic performance. Journal of Educational Psychology, 59, 370-374.

Plake, B. S. (1980). Item arrangement and knowledge of arrangement on test scores. Journal of Experimental Education, 49, 56-58.

Plake, B. S., & Ansorge, C. J. (1984). Effects of item arrangement, sex of the subject, and test anxiety on cognitive and self-perception scores in a nonquantitative content area. Educational and Psychological Measurement, 44, 423-430.

Plake, B. S., Ansorge, C. J., Parker, C. S., & Lowry, S. R. (1982). Effects of item arrangement, knowledge of arrangement, test anxiety and sex on test performance. Journal of Educational Measurement, 19, 49-57.

Plake, B. S., Melican, G. J., Carter, L., & Shaughnessy, L. C. (1983). Differential performance of males and females on easy to hard item arrangements: Influence of feedback at the item level. Educational and Psychological Measurement, 43, 1067-1075.

Plake, B. S., Thompson, P. A., & Lowry, S. (1980). Effect of item arrangement, knowledge of arrangement, and test anxiety on two scoring methods. Journal of Experimental Education, 49, 214-219.

Rindler, S. E. (1980). The effects of skipping over more difficult items on time-limited tests: Implications for test validity. Educational and Psychological Measurement, 40, 989-998.

Robitaille, D. F., & Garden, R. A. (Eds.) (1987). The second international mathematics study: Vol. 2. Context and outcomes of school mathematics.
Albany, NY: International Association for the Evaluation of Educational Achievement.

Ruch, G. M. (1929). The objective or new-type examination. Chicago: Scott Foresman.

Sax, G., & Carr, A. (1962). An investigation of response sets on altered parallel forms. Educational and Psychological Measurement, 22, 371-376.

Sax, G., & Cromack, T. R. (1966). The effects of various forms of item arrangements on test performance. Journal of Educational Measurement, 3, 309-311.

Sirotnik, K., & Wellington, R. (1974). Scrambling content in achievement testing: An application of multiple matrix sampling in experimental design. Journal of Educational Measurement, 11, 179-188.

Smouse, A. D., & Munz, D. C. (1968). The effects of anxiety and item difficulty sequence on achievement testing scores. Journal of Psychology, 68, 181-184.

Smouse, A. D., & Munz, D. C. (1969). Item difficulty sequencing and response style: A follow-up analysis. Educational and Psychological Measurement, 29, 469-472.

SPSS-X user's guide. (1983). New York, NY: McGraw-Hill.

Towle, N. J., & Merrill, P. F. (1975). Effects of anxiety type and item-difficulty sequencing on mathematics test performance. Journal of Educational Measurement, 12, 241-249.

Tuck, P. J. (1978). Examinees' control of item difficulty sequence. Psychological Reports, 42, 1109-1110.

Whitely, S. E., & Dawis, R. V. (1976). The influence of test context on item difficulty. Educational and Psychological Measurement, 36, 329-337.

Yen, W. M. (1980).
The extent, causes and importance of context effects on item parameters for two latent trait models. Journal of Educational Measurement, 17, 297-311.

APPENDIX I

TEACHER INSTRUCTIONS
MATHEMATICS 8 EXAMINATION
THESIS PROJECT OF M. J. SCALES

I. General Directions
II. Thesis Project Background
III. Detailed Directions (optional)
    a. Start Explanations
    b. Booklets
    c. Answer Sheets
    d. Identification Number
    e. Name Section
    f. Gender
    g. Grade
    h. Birth Date
    i. Answer Sheet Usage
    j. Start Examination
    k. End Examination
    l. Collect Testing Materials
    m. Math Ability Rating
    n. Return Testing Materials
IV. Appendix
    a. Student Identification Sample

I. GENERAL DIRECTIONS

Materials Required by the Examiner

A. A copy of these instructions.
B. A class set of mixed test booklets, complete with an answer sheet and some scrap paper to give one booklet to each student.
    Test Format 1 (orange)
    Test Format 2 (yellow)
    Test Format 3 (blue)
    Test Format 4 (green)
C. A supply of sharpened soft-lead pencils.
D. An extra supply of booklets, complete with answer sheets and scratch paper.

1. A class period of one hour should be sufficient to explain (15 min.) and administer (45 min.) the test.
2. Explain to the students that today they will be taking a test as part of a study to determine if skipping back and forth between test questions will help students to do better on tests. Some students will receive tests which allow them to skip back and forth.
Other students will receive booklets which require that they do not skip ahead but must do the questions in the same order as in their booklet. Do not discuss the order of the items in the test or the test-taking strategy of skipping the hard questions to do the easy questions first.
3. Be sure all students have a pencil (No. 2 or HB).
4. Caution students not to open their test booklet until they are told to do so.
5. Distribute one test booklet with appropriate answer sheet and scrap paper to each student. Alternate evenly between the four different types of colour coded test booklets.
6. Have the students carefully remove the answer sheet and the piece of scrap paper from the test booklet. Have them check to see if box A in the "Identification No." section of their answer sheet has been marked with the number that corresponds to the test format number on the front cover.
7. Have the students complete the name, sex, birth date, and grade sections of their answer sheets. If the class is unsure how to complete these sections, read the appropriate "Detailed Directions" of this booklet to the class or use the sample sheet in the appendix as a guide.
8. To get the students to be realistically motivated, please tell them that these test results may be used to calculate their final marks.
9. Encourage the students to read the remaining directions from number 5 to the end of the page. If necessary, read them to the whole class.
10. Remind those students with "Special Instructions" that they may not skip ahead to new questions or go back to old ones.
They must do the questions in the same order as they are presented in the test. If they can't answer a question, they may omit that question and go on to the next one. Nonetheless, they should at least try their best to answer every question on the test.
11. When you are sure that all students understand the directions, begin the test (45 minutes).
12. During the testing period, students might ask for help. Encourage them to read and respond to each item to the best of their abilities. Do NOT change the wording of any items, or explain specific terms, or discuss the ordering of the questions. Treat this testing situation as normally and as seriously as any other examination.
13. After 45 minutes, or sooner if all students are finished, end the test. Collect the test booklets in colour coded groups. Collect the answer sheets and check to see that all of the identification sections of the answer sheets have been completed correctly.
14. On a class list, rate each student's ability to do mathematics. Using a six point scale, record a number from 1 to 6 that represents your best estimate of each student's mathematical abilities. This rating should be somewhat independent from overall intelligence or classroom behaviour. Use a "1" for those students with the lowest 10% of mathematical ability, a "2" for the next 15% of students with higher mathematical ability, a "3" for the next 25%, a "4" for the next 25%, a "5" for the next 15%, and a "6" for the 10% of students with the highest mathematical ability.
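
As an illustration only (it is not part of the original teacher instructions), the percentage bands in direction 14 can be turned into approximate head counts for a class of a given size. The Python sketch below assumes the teacher has already ranked the class informally from weakest (rank 1) to strongest with no ties; the names rating_for_rank and rating_counts are hypothetical, and the study itself relied on each teacher's judgment rather than on any computation.

```python
# Cumulative upper bounds (in percent) for ratings 1 to 6, taken from the
# 10% / 15% / 25% / 25% / 15% / 10% split described in direction 14.
CUMULATIVE_PERCENT = [10, 25, 50, 75, 90, 100]


def rating_for_rank(rank, class_size):
    """Return the 1-6 rating for the student at `rank` (1 = weakest).

    Assumes a tie-free ranking; this is an illustrative convention,
    not something the instructions require.
    """
    if not 1 <= rank <= class_size:
        raise ValueError("rank must be between 1 and class_size")
    for rating, bound in enumerate(CUMULATIVE_PERCENT, start=1):
        # Integer comparison avoids floating-point percentile rounding.
        if 100 * rank <= bound * class_size:
            return rating


def rating_counts(class_size):
    """Return how many students fall into each rating, 1 through 6."""
    counts = [0] * 6
    for rank in range(1, class_size + 1):
        counts[rating_for_rank(rank, class_size) - 1] += 1
    return counts


print(rating_counts(20))  # [2, 3, 5, 5, 3, 2]
```

For class sizes that do not divide evenly, the bands can only be approximated, so the counts should be read as a rough guide to the intended spread of ratings.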

THESIS PROJECT BACKGROUND

Ever since multiple choice tests first came out in the early 1920s, most textbook authors have suggested that these tests should be arranged with the easiest questions at the beginning and the hardest questions at the end. One justification for such an arrangement is to help low ability students avoid early frustration with the test. However, much of the research over the last 40 years has generally found that the item order does not make much of a difference to the final results of the tests.

One purpose of this study is to examine the discrepancy between what research has statistically found and what teachers and textbook writers have intuitively found. Since most of the past research has used college students, this study will involve a younger and more diverse sample of high school students. It is the hypothesis of this study that students who are in fact affected by the item order are more likely to be found in a typical public school than in a college classroom.

If students of low ability are, in fact, easily discouraged by starting tests with the more difficult questions, then another purpose of this study is to examine one of the skills that high ability students may use to avoid that discouragement. One possible skill of the more able students is the tactic of omitting the hard questions until they have finished the easy questions. This may be a skill that the low ability students are either unaware of or just fail to use.

A third purpose of this study is a more esoteric one which involves examining the results of this test using two types of test statistics. The studies which have found no differences as a result of item order have used classical statistics to examine their data. However, recent studies which have found some effects of item order have used the more modern latent trait statistics. This study will compare the results obtained from each type of statistical method.

A final reason, of course, is to complete the requirements to obtain a Master of Arts degree in the Faculty of Education at the University of British Columbia in the Department of Educational Psychology and Special Education with a specialization in measurement, evaluation and research methodology.

III. DETAILED DIRECTIONS (optional)

All directions that you can read to the students are indented so that they stand out. You may read them exactly as they are written, using a natural tone and manner. If necessary, you may supplement the directions with your own explanations, but do not give help on specific test questions.

Try to maintain a natural classroom atmosphere during the test administration. Encourage students to do their best, and advise them not to spend too much time on any one question. Check periodically to make sure that students are recording their answers properly, are following instructions, and are working to the end of the test, or as far as they can.

The scoring machine used to process the answer sheets is capable of almost 100% accuracy if the answer sheets are marked correctly and kept in good condition. Remind the students to handle the sheets with care; to record their answer with heavy, dark marks; and to avoid making stray marks on their answer sheets. Answer sheets should never be folded, clipped, or torn.

a. Start Explanations

(Have all desks cleared, and see that each student has a soft-lead pencil (No. 2 or HB), and an eraser. Say:)

    "You are going to take a special math test today. Don't open your test book or make any marks on it until I tell you what to do."

b. Booklets

(Give one test booklet to each student. As you hand out the tests, alternate between the four different types of test booklets to evenly distribute the four types among your students. Place the booklet with the front cover up. Also, make sure each student has an answer sheet and a piece of scratch paper in his booklet. When the booklets have been distributed, say:)

    "Please don't open your booklets until you are told to do so by me."

    "Four different booklets have been distributed as part of a special experiment to see if students can do better on tests if they are allowed to skip around between test questions. Those of you with the yellow or orange tests are allowed to go back and forth in the test booklet and do the questions in whatever order you wish. Those of you with blue or green test booklets are requested not to skip ahead to a new question and then go back to an old one. You must answer the questions in the order they appear on the test.
    Students with the blue or green test booklets will also find that they have some special instructions on their test booklets and on their test questions to remind them of these special rules."

(Pause and answer questions. Do not discuss the special order of the items. Try to maintain your normal testing routine. Try to obtain the cooperation and motivation of the students.)

c. Answer Sheets

(Say:)

    "Carefully remove the answer sheet from the inside front cover of your test booklet. Your answer sheet is going to be scored by machine, so be careful with it. Keep it as clean as possible, and don't bend it or fold the corners."

d. Identification Number

    "I may be using the results of this exam to help me determine your final grade at the end of the year. It is therefore important that you do your best. It is also important that I know which test you took. I want everyone to find the box marked 'Identification No.' on their answer sheet and the test format number on the front of their test booklets."

(Show the location of the 'Identification No.' section on the back of the answer sheet and the test format number on the front of the test booklet.)

    "In the box labelled 'A' in the identification number section, make sure that the number of the test format of your test is in that box. The orange test is format one. The yellow test is format two. The blue test is format three. The green test is format four."

(Pause)

e. Name Section

    "Find the spaces for your name."

(Demonstrate)

    "First in the boxes print as many letters of your last name as you can.
    Use one box for each letter. Then, leave one box as a space. Next, print as many letters of your first name as you can. Then, leave another space. Finally, print your middle initial. If you cannot fit your full name in the space provided, try to print at least most of your last name, a space, your first initial, a space, and finally your middle initial."

(Pause)

    "Now in the column below each box, fill in the circle that has the same letter or space as the letter or space in the box above it. Be sure that you mark only one circle in each column. Fill in the blank circle at the top of every column in which you have left a space. Be sure to make heavy, shiny marks that cover the whole circle. If you make a mistake, erase your mark completely. If you have any questions, raise your hand."

(Pause until all students have finished filling in the name section. Then say:)

    "You should have 19 circles filled in under the name boxes. Count and make sure."

(Pause)

f. Gender

(After students have checked the name section, say:)

    "Now look at the box below the columns you filled in for your name."

(Demonstrate)

    "Fill in the circle next to 'Male' if you are a male or next to 'Female' if you are female."

(Pause)

g. Grade

    "Now look at the box and circled numbers to the right labelled 'Grade or Education'. Just fill in the circle with an 8 since this is a grade 8 course."

(Pause)

h. Birth Date

    "Now look at the columns underneath the box labelled 'Birth Date'."

(Demonstrate)

    "Fill in the circle next to the month in which you were born."

(Pause)

    "Fill in the boxes labelled 'Day' with two numbers for the day of your birth.
    For example, if you were born on the seventh of the month, you would write zero seven."

(Pause)

    "Fill in the circles in the columns underneath the boxes labelled day to show the number in the box above the column. Be sure to only fill in one circle in each column."

(Pause)

    "Now fill in the boxes labelled year with the two numbers for the year you were born in, and fill in the circle under each box to indicate the number in the box."

(Pause)

    "Now check to make sure that you have correctly filled in all the required information."

(Pause)

i. Answer Sheet Usage

    "Before I tell you to open your test booklet and start, I am going to tell you how to properly mark your answer sheet. Listen carefully so that you will know how to mark your answers. You are to mark all your answers on your answer sheet. Don't make any stray marks on it and do not write in your booklet. You should already have some scratch paper for any figuring that you might have to do. For each question, choose the best answer. Then, on your answer sheet, find the number for the question, and mark the space for your answer. Be sure to mark only one answer space for each question. Make your mark heavy and shiny, and see that it completely fills the answer space. If you change your mind after you've marked an answer, erase the wrong mark completely; then make your new mark."

(On the chalkboard, show students how to fill in an answer space. Answer all questions.)

    "You will have 45 minutes to work on this test. If you have any trouble reading a question, raise your hand and I will help you. Of course, you may not use a calculator.
    If you're not sure about the answer to a question, do the best you can, but don't spend too much time on any one question. You may omit a question if you are sure that you cannot answer it."

    "Make sure that you have turned your answer sheet over to side one, so the name section is face down, so the side with the picture of the pencil is face up, and so the answer space for question one is face up."

(Demonstrate and check to make sure everyone is starting on side 1.)

j. Start Examination

(When you feel that everyone understands the directions, say:)

    "You may start working now."

(Record the starting and ending times on the chalkboard. While students are working, walk around the room to make sure that the students are following directions. Try your best to make sure that students do not change the order of the exam questions if they are in the blue or green booklets with the special instructions. If you see that a student is having difficulty reading a problem, you may help the student read the problem; however, do not give help in answering any of the questions.)

k. End Examination

(After 45 minutes, or sooner if all students have finished, say:)

    "Stop! Put your pencil down now, and close your booklet so that the front cover is up. I will collect your test booklets and answer sheets."

l. Collect Testing Materials

(Collect the test booklets into the four colour coded groups. Collect the answer sheets and check to make sure that the student identification sections have been correctly filled out. Collect the scratch paper and dispose of it.
Collect any of the extra pencils loaned to the students.)

m. Math Ability Rating

On a class list, rate each student's ability to do mathematics. Using a six point scale, record a number from 1 to 6 that represents your best estimate of their mathematical abilities. This rating should be somewhat independent of overall intelligence and general classroom behaviour. Use a "1" for those students with the lowest 10% of mathematical ability, a "2" for the next 15% of students with higher mathematical ability, a "3" for the next 25%, a "4" for the next 25%, a "5" for the next 15%, and a "6" for the 10% of students with the highest mathematical ability.

This information will be kept strictly confidential and used only to identify which students, from a teacher's point of view, may be either frustrated by the arrangement of the test questions or hindered by the directions of the test booklets.

Please keep answer sheets grouped in classes with their class list. The results for each of your students will be sent to you at your request.

n. Return Testing Materials

Please return the testing materials (the tests, the answer sheets, the pencils, and the rating lists) to Michael Scales.

The test will be scored and analyzed by Michael Scales, graduate student at the University of British Columbia, and teacher at Aldergrove Secondary. The results will be kept strictly confidential with the results of individual students only being sent to that student's classroom teacher if so requested.
It is not the intention of this study to make comparisons between individual classes, schools, teachers, or students.

Thank you for your cooperation and your efforts.

APPENDIX II

MATHEMATICS 8 EXAMINATION
TEST FORMAT 1

INSTRUCTIONS

1. Do NOT open the test booklet until you are told to do so. You will have 45 minutes to complete this test.
2. Carefully remove the answer sheet from inside the front cover and make sure there is a 1 marked in box A of the Identification No. section of the answer sheet.
3. Be sure you have a pencil, an eraser, and some scratch paper.
4. Fill in your answer sheet with your name, sex, grade, and birth date.
5. Do NOT use a calculator or a protractor.
6. For each question, select the best answer. Mark your choice on the answer sheet by filling in the bubble under the correct letter. Make sure the question number is the same as the question number in the test booklet.
7. Do not spend too long on any one question. Try your best to pick a good answer to every question.
8. If you make a mistake, completely erase your first choice and fill in the bubble of your new choice.
9. Do NOT write in the test booklet. Mark only your answer sheet. If your booklet already has any inappropriate marks, ask for a clean booklet.

FINISHED?

Close your test booklet. Make sure you have filled in your answer sheet with your name, sex, grade, and birth date. Make sure that the Identification No. Box A has a 1 in it. Turn in your test booklet and answer sheet.

THANK YOU

MATHEMATICS 8 EXAMINATION
TEST FORMAT 2

INSTRUCTIONS

1. Do NOT open the test booklet until you are told to do so. You will have 45 minutes to complete this test.
2. Carefully remove the answer sheet from inside the front cover and make sure there is a 2 marked in box A of the Identification No. section of the answer sheet.
3. Be sure you have a pencil, an eraser, and some scratch paper.
4. Fill in your answer sheet with your name, sex, grade, and birth date.
5. Do NOT use a calculator or a protractor.
6. For each question, select the best answer. Mark your choice on the answer sheet by filling in the bubble under the correct letter. Make sure the question number is the same as the question number in the test booklet.
7. Do not spend too long on any one question. Try your best to pick a good answer to every question.
8. If you make a mistake, completely erase your first choice and fill in the bubble of your new choice.
9. Do NOT write in the test booklet. Mark only your answer sheet. If your booklet already has any inappropriate marks, ask for a clean booklet.

FINISHED?

Close your test booklet. Make sure you have filled in your answer sheet with your name, sex, grade, and birth date. Make sure that the Identification No. Box A has a 2 in it. Turn in your test booklet and answer sheet.

THANK YOU

MATHEMATICS 8 EXAMINATION
TEST FORMAT 3

INSTRUCTIONS

1. Do NOT open the test booklet until you are told to do so. You will have 45 minutes to complete this test.
2. Carefully remove the answer sheet from inside the front cover and make sure there is a 3 marked in box A of the Identification No. section of the answer sheet.
3. Be sure you have a pencil, an eraser, and some scratch paper.
4. Fill in your answer sheet with your name, sex, grade, and birth date.
5. Do NOT use a calculator or a protractor.
6. For each question, select the best answer. Mark your choice on the answer sheet by filling in the bubble under the correct letter. Make sure the question number is the same as the question number in the test booklet.
7. Do not spend too long on any one question. Try your best to pick a good answer to every question.
8. If you make a mistake, completely erase your first choice and fill in the bubble of your new choice.
9. Do NOT write in the test booklet. Mark only your answer sheet. If your booklet already has any inappropriate marks, ask for a clean booklet.

SPECIAL INSTRUCTIONS

1. You must begin with question 1. When you have chosen the best answer and marked your answer sheet, then you must go to question 2. When you have finished question 2, then you must go on to question 3, then question 4, then question 5, and so on to the end of the test.
2. Try each question once and only once. If you can't answer a question, go on to the next one. Do NOT skip ahead to new questions or go back to old ones. Try to answer each question in its proper order.

END OF TEST

Close your test booklet. Do not go back to any of the questions. Make sure you have filled in your answer sheet with your name, sex, grade, and birth date. Make sure that the Identification No. Box A has a 3 in it. Turn in your test booklet and answer sheet.

THANK YOU

MATHEMATICS 8 EXAMINATION
TEST FORMAT 4

INSTRUCTIONS

1. Do NOT open the test booklet until you are told to do so. You will have 45 minutes to complete this test.
2. Carefully remove the answer sheet from inside the front cover and make sure there is a 4 marked in box A of the Identification No. section of the answer sheet.
3. Be sure you have a pencil, an eraser, and some scratch paper.
4. Fill in your answer sheet with your name, sex, grade, and birth date.
5. Do NOT use a calculator or a protractor.
6. For each question, select the best answer. Mark your choice on the answer sheet by filling in the bubble under the correct letter. Make sure the question number is the same as the question number in the test booklet.
7. Do not spend too long on any one question. Try your best to pick a good answer to every question.
8. If you make a mistake, completely erase your first choice and fill in the bubble of your new choice.
9. Do NOT write in the test booklet. Mark only your answer sheet. If your booklet already has any inappropriate marks, ask for a clean booklet.

SPECIAL INSTRUCTIONS

1. You must begin with question 1. When you have chosen the best answer and marked your answer sheet, then you must go to question 2. When you have finished question 2, then you must go on to question 3, then question 4, then question 5, and so on to the end of the test.
2. Try each question once and only once. If you can't answer a question, go on to the next one. Do NOT skip ahead to new questions or go back to old ones. Try to answer each question in its proper order.

END OF TEST

Close your test booklet. Do not go back to any of the questions.
Make sure you have filled in your answer sheet with your name, sex, grade, and birth date. Make sure that the Identification No. Box A has a 4 in it. Turn in your test booklet and answer sheet.
THANK YOU

APPENDIX III

P-level Item Analysis Data

        Easy to Hard                  Hard to Easy
        (As Given Order)              (Reversed Order)
Item    Seq.   Form 1   Form 3        Seq.   Form 2   Form 4
        No.    Unr.     Res.          No.    Unr.     Res.
A        1     .913     .919          40     .728     .678
B        2     .906     .838          39     .707     .562
C        3     .866     .831          38     .762     .651
D        4     .832     .723          37     .619     .521
E        5     .678     .635          36     .558     .548
F        6     .765     .723          35     .599     .589
G        7     .812     .750          34     .565     .527
H        8     .624     .696          33     .531     .445
I        9     .604     .520          32     .537     .527
J       10     .705     .676          31     .660     .575
K       11     .651     .669          30     .449     .459
L       12     .470     .466          29     .435     .384
M       13     .631     .595          28     .483     .452
N       14     .617     .541          27     .503     .527
O       15     .503     .493          26     .442     .404
P       16     .369     .385          25     .374     .288
Q       17     .362     .399          24     .272     .418
R       18     .416     .351          23     .313     .342
S       19     .490     .527          22     .435     .438
T       20     .456     .392          21     .361     .397
U       21     .376     .358          20     .367     .411
V       22     .477     .419          19     .429     .308
W       23     .523     .527          18     .435     .500
X       24     .389     .378          17     .367     .329
Y       25     .329     .291          16     .374     .432
Z       26     .456     .459          15     .306     .377
AA      27     .315     .284          14     .293     .267
BB      28     .356     .351          13     .354     .390
CC      29     .275     .324          12     .272     .253
DD      30     .201     .257          11     .204     .267
EE      31     .362     .297          10     .313     .281
FF      32     .369     .439           9     .367     .370
GG      33     .255     .257           8     .218     .240
HH      34     .275     .209           7     .238     .219
II      35     .242     .223           6     .252     .151
JJ      36     .228     .250           5     .272     .288
KK      37     .242     .189           4     .238     .137
LL      38     .262     .182           3     .190     .205
MM      39     .161     .169           2     .122     .199
NN      40     .181     .176           1     .245     .247

Note. Unr. = Unrestricted; Res. = Restricted

B-value Item Parameter Estimates

        Easy to Hard                  Hard to Easy
        (As Given Order)              (Reversed Order)
Item    Seq.   Form 1   Form 3        Seq.   Form 2   Form 4
        No.    Unr.     Res.          No.    Unr.     Res.
A        1     -2.743   -3.028        40     -1.257   -1.004
B        2     -2.653   -2.096        39     -1.134   -0.357
C        3     -2.206   -2.037        38     -1.476   -0.845
D        4     -1.909   -1.255        37     -0.641   -0.139
E        5     -0.915   -0.741        36     -0.325   -0.284
F        6     -1.427   -1.255        35     -0.535   -0.504
G        7     -1.752   -1.429        34     -0.360   -0.175
H        8     -0.633   -1.089        33     -0.188    0.262
I        9     -0.531   -0.128        32     -0.222   -0.175
J       10     -1.063   -0.970        31     -0.861   -0.430
K       11     -0.772   -0.931        30      0.233    0.189
L       12      0.126    0.155        29      0.292    0.602
M       13     -0.667   -0.520        28      0.051    0.226
N       14     -0.599   -0.234        27     -0.051   -0.175
O       15     -0.037    0.014        26      0.257    0.487
P       16      0.633    0.592        25      0.611    1.178
Q       17      0.669    0.517        24      1.195    0.411
R       18      0.392    0.783        23      0.950    0.839
S       19      0.028   -0.163        22      0.292    0.299
T       20      0.192    0.555        21      0.684    0.525
U       21      0.598    0.744        20      0.647    0.449
V       22      0.093    0.407        19      0.327    1.047
W       23     -0.135   -0.163        18      0.292   -0.030
X       24      0.529    0.630        17      0.647    0.921
Y       25      0.850    1.148        16      0.611    0.337
Z       26      0.192    0.191        15      0.990    0.641
AA      27      0.925    1.191        14      1.070    1.314
BB      28      0.704    0.783        13      0.721    0.563
CC      29      1.162    0.941        12      1.195    1.409
DD      30      1.657    1.369        11      1.658    1.314
EE      31      0.669    1.106        10      0.950    1.223
FF      32      0.633    0.299         9      0.647    0.680
GG      33      1.287    1.369         8      1.558    1.506
HH      34      1.162    1.710         7      1.416    1.660
II      35      1.374    1.608         6      1.325    2.259
JJ      36      1.465    1.415         5      1.195    1.178
KK      37      1.374    1.872         4      1.416    2.403
LL      38      1.245    1.929         3      1.762    1.768
MM      39      1.981    2.047         2      2.389    1.824
NN      40      1.812    1.987         1      1.370    1.457

Note. Unr. = Unrestricted; Res. = Restricted

APPENDIX IV

MATHEMATICS 8 EXAMINATION
TEST FORMAT 3

INSTRUCTIONS
1. Do NOT open the test booklet until you are told to do so. You will have 45 minutes to complete this test.
2. Carefully remove the answer sheet from inside the front cover and make sure there is a 3 marked in box A of the Identification No. section of the answer sheet.
3. Be sure you have a pencil, an eraser, and some scratch paper.
4. Fill in your answer sheet with your name, sex, grade, and birth date.
5. Do NOT use a calculator or a protractor.
6. For each question, select the best answer. Mark your choice on the answer sheet by filling in the bubble under the correct letter. Make sure the question number is the same as the question number in the test booklet.
7. Do not spend too long on any one question. Try your best to pick a good answer to every question.
8. If you make a mistake, completely erase your first choice and fill in the bubble of your new choice.
9. Do NOT write in the test booklet. Mark only your answer sheet. If your booklet already has any inappropriate marks, ask for a clean booklet.

SPECIAL INSTRUCTIONS
1. You must begin with question 1. When you have chosen the best answer and marked your answer sheet, then you must go to question 2. When you have finished question 2, then you must go on to question 3, then question 4, then question 5, and so on to the end of the test.
2. Try each question once and only once. If you can't answer a question, go on to the next one. Do NOT skip ahead to new questions or go back to old ones. Try to answer each question in its proper order.

ITEM A: QUESTION 1
The circle graph shows the proportions of various grain crops produced by a country. Which of the following statements is TRUE?
• A More oats than rye is produced.
B The largest crop is barley.
C Equal quantities of wheat and barley are produced.
D The smallest crop is oats.
E Wheat and oats together make up less than half the total grain crop.
PLEASE DO NOT TURN BACK TO THIS PAGE.

ITEM B: QUESTION 2
162 x 45 is equal to
A 1378
B 1458
C 5890
D 6290
• E 7290

ITEM C: QUESTION 3
A team scores an average of 3 points per game over 5 games.
How many points altogether were scored in the 5 games?
C 3
D 5
• E 15

ITEM D: QUESTION 4
In a discus-throwing competition, the winning throw was 61.60 metres. The second place throw was 59.72 metres. How much longer was the winning throw than the second place throw?
A. 1.12 metres
• B. 1.88 metres
C. 1.92 metres
D. 2.12 metres
E. 121.32 metres

ITEM E: QUESTION 5
If 10² x 10³ = 10ⁿ, then n is equal to
A 4
• B 5
C 6
D 8
E 9

ITEM F: QUESTION 6
A group of children was divided into 7 teams with nine in each team. Later, the same group of children was divided into teams with seven in each team. How many teams were there then?
A 7
B 8
C 9
D 16
E 63

ITEM G: QUESTION 7
Here is a table that shows the number of trees planted along a highway in a week.

Days of the Week          Mon   Tues   Wed   Thurs   Fri
Number of Trees Planted   80    50     60    90      75

On the diagram below, the graph for the first two days' plantings has been [illegible]. If the graph were completed, which point would indicate the top of the bar on Thursday?
[Bar graph with the days Mon to Fri on the horizontal axis]
A P
B Q
C R
D S
E T

ITEM H: QUESTION 8
What is the volume of a rectangular box with interior dimensions 10 cm long, 10 cm wide, and 7 cm high?
A 21 cm³
B 70 cm³
C 140 cm³
D 280 cm³
• E 700 cm³

ITEM I: QUESTION 9
If the ratio of 2 to 5 equals the ratio of n to 100, then n is equal to
A 10
B 20
• C 40
D 150
E 250

ITEM J: QUESTION 10
[Figure: angles labelled 5x and 4x on line AB]
If AB is a straight line, what is the measure in degrees of angle BCD?
A 20
B 40
C 50
• D 80
E 100

ITEM K: QUESTION 11
In a school of 800 pupils, 300 are boys. The ratio of the number of boys to the number of girls is
A. 3:8
B. 5:8
C. 3:11
D. 5:3
E. 3:5

ITEM L: QUESTION 12
Which of the following equals 7 x (3 + 9)?
• A (7 x 3) + (7 x 9)
B (7 x 9) + (3 x 9)
C (7 x 3) + (3 x 9)
D 7 x 27
E 21 + 9

ITEM M: QUESTION 13
0.40 x 6.38 is equal to
A. .2552
B. 2.452
• C. 2.552
D. 24.52
E. 25.52

ITEM N: QUESTION 14
When x = 2, [expression illegible] is equal to
[answer options illegible]

ITEM O: QUESTION 15
[Figure: rectangle with sides labelled 20 m and 15 m]
A square is removed from the rectangle as shown. What is the area of the remaining part?
A. 316 m²
B. 300 m²
C. 284 m²
D. 80 m²
E. 16 m²

ITEM P: QUESTION 16
If segment 'RJ' were drawn for each figure shown below, it would divide one of the figures into two congruent triangles. Which figure?
[Figures and answer options illegible]

ITEM Q: QUESTION 17
7 [fraction illegible] is equal to
A. 7.03
• B. 7.15
C. 7.23
D. 7.3
E. 7.6

ITEM R: QUESTION 18
What is the square root of 12 x 75?
A 6.25
B 30
C 87
D 625
E 900

ITEM S: QUESTION 19
If x = -3, the value of -3x is
[answer options illegible]

ITEM T: QUESTION 20
How many pieces of pipe each 20 metres long would be required to construct a pipeline 1 kilometre in length?
A 5
• B 50
C 500
D 5000
E 50,000

ITEM U: QUESTION 21
2 metres + 3 millimetres is equal to
A. 2.0003 metres
B. 2.003 metres
C. 2.03 metres
D. 2.3 metres
E. 5 metres

ITEM V: QUESTION 22
[Figure: rectangle with sides labelled 8.[digit illegible] m and 5.9 m]
Which of the following is the closest approximation to the area of the rectangle with measurements given?
A. 48 m²
B. 54 m²
C. 56 m²
D. 63 m²
E. 72 m²

ITEM W: QUESTION 23
Three hours after starting, car A is how many kilometres ahead of car B?

ITEM X: QUESTION 24
The arithmetic mean (average) of 1.50, 2.40, 3.75 is equal to
A 2.40
B 2.55
C 3.75
D 7.65
E None of these

ITEM Y: QUESTION 25
[Item content illegible]

ITEM Z: QUESTION 26
If x - y = z = 1, then [expression illegible] is equal to
A -2
B -1
• C 0

ITEM AA: QUESTION 27
[Figure: rectangle with dotted cut lines; four segments labelled 2 cm]
The rectangle shown above is cut along the dotted lines and the three parts put together, without overlapping, to give the figure shown below. The area in square centimetres of this figure is
A 8 cm²
B 10 cm²
• C 12 cm²
D 14 cm²
E 16 cm²

ITEM BB: QUESTION 28
[Figure: parallelogram; one side labelled 10 cm]
What is the area of the above parallelogram?
A 30 cm²
B 36 cm²
C 48 cm²
• D 60 cm²
E 80 cm²

ITEM CC: QUESTION 29
Suppose you start at point M(-1,-1), move a distance of one unit to N(-1,-2), then turn left and move one unit to the point P(0,-2). If you again turn left and move one unit, you will now be at the point with coordinates
A ([illegible], -2)
B (0, -3)
• C (0, -1)
D (-1, -2)
E None of the above

ITEM DD: QUESTION 30
0.00046 is equal to
A. 46 x 10⁻³
B. 4.6 x 10⁻⁴
C. 0.46 x 10³
D. 4.6 x 10^[exponent illegible]
E. 46 x 10^[exponent illegible]

ITEM EE: QUESTION 31
If, in the given figure, PQ and RS are intersecting straight lines, then x + y is equal to
A 15
B 30
• C 60
D 180
E 300

ITEM FF: QUESTION 32
The table below compares the height from which a ball is dropped (d) and the height to which it bounces (b).

d   50   80   100   130
b   25   40   50    75

Which formula describes this relation?
A b = d[illegible]
B b = 2d
• C b = d/2
D b = d + 25
E b = d - 25

ITEM GG: QUESTION 33
[Figure: two triangles; two sides labelled 8 cm]
The total area of the two triangles is
[answer options A to E are fractional expressions in cm² and are illegible; surviving fragments: 6 x 8, 10 x 6, 16 x 12, 20 x 12]

ITEM HH: QUESTION 34
PQRS is a rectangle. Its image after a transformation is the rectangle P'Q'R'S', as shown above. The transformation used could have been
• A a rotation about the origin.
B a reflection in the y-axis.
C a translation parallel to the x-axis.
D a reflection in the x-axis.
E a translation parallel to the y-axis.

ITEM II: QUESTION 35
One of the following points can be joined to the point (-3,4) by a line segment which cuts NEITHER the x NOR the y axis. Which one?
• A (-2,3)
B (2,-3)
C (2,3)
D (-2,-3)
E (4,-3)

ITEM JJ: QUESTION 36
In a quadrilateral, two of the angles each have measure of 110°, and the measure of a third angle is 90°. What is the measure of the remaining angle?
• A 50°
B 90°
C 130°
D 140°
E None of the above

ITEM KK: QUESTION 37
The symbol P ∩ Q represents the intersection of sets P and Q and the symbol P ∪ Q represents the union of sets P and Q. Which of the following represents the shaded portion of the diagram below?
A [illegible]
B P ∪ (Q ∩ R)
C P ∩ (Q ∪ R)
D (P ∩ Q) ∩ R
E (P ∪ Q) ∩ R

ITEM LL: QUESTION 38
There are 7,000,000 girls under the age of 21 in a country with a total population of 36,000,000.
If a circle graph were drawn showing the distribution of the population, the angle in the sector representing girls under the age of 21 would have measure
A 7°
B 20°
C 21°
• D 70°
E 72°

ITEM MM: QUESTION 39
If 5x + 4 = 4x - 31, then x is equal to
A. -35
B. -27
C. 3
D. 27
E. 35

ITEM NN: QUESTION 40
Given v and w as shown in the figure above, what is OB, the vector from O to B?

END OF TEST
Close your test booklet. Do not go back to any of the questions. Make sure you have filled in your answer sheet with your name, sex, grade, and birth date. Make sure that the Identification No. Box A has a 3 in it. Turn in your test booklet and answer sheet.
THANK YOU

AUTHOR INDEX

Ahmann & Glock (1963) 1-2, 13, 28, 48
Allison (1984) 3, 7, 58-61, 73
Allison & Thomas (1986) 62-63, 79, 104
Berger et al. (1969) 4, 23-24, 44, 73
Brenner (1964) 3, 12-13, 73
Campbell & Stanley (1963) 110, 111
Cronbach (1946) 57
Cronbach (1950) 57
Feldt & Forsyth (1974) 50-52, 73, 107
Flaugher et al. (1968) 3, 10, 15-16, 73
French & Greer (1964) 4, 17-19, 73
Hambleton & Traub (1974) 3, 6-7, 10, 52-57, 58, 59, 60, 61, 62, 73, 76, 78, 79, 88, 104, 107, 109
Hodson (1984) 3, 7, 10, 30-32, 45, 46, 73
Hopkins & Antes (1985) 1
Huck & Bowers (1972) 3, 46-48, 73
Kestenbaum & Weiner (1970) 3, 29-30, 73
Kingston & Dorans (1984) 4, 68-69, 73, 75, 81, 107
Kleinke (1980) 3, 42-44, 45, 46, 73
Klimko (1984) 4, 39-40, 44, 45, 62, 73, 79
Klosner & Gellman (1973) 3, 7, 28-29, 45, 73
Lane et al. (1987) 3, 10, 40-42, 44, 45, 73
Leary & Dorans (1985) 2, 10, 71, 74, 76
MacNicol (1956) 3, 10-11, 73
Marso (1970) 4, 25-26, 44, 73
Millman & Bishop (1965) 61, 79
Mollenkopf (1950) 3, 9-10, 54, 73
Monk & Stallings (1970) 3, 10, 57-58, 72, 73
Munz & Jacobs (1971) 4, 22-23, 28, 73
Munz & Smouse (1968) 4, 20-21, 22, 23, 26, 27, 28, 44, 54, 56, 73
Plake (1980) 4, 10, 32-33, 44, 73
Plake & Ansorge (1984) 4, 37-38, 45, 46, 73
Plake, Ansorge et al. (1982) 4, 35-37, 44, 45, 54, 56, 73, 107
Plake, Melican et al. (1983) 4, 37-38, 45, 73
Plake, Thompson et al. (1980) 4, 33-35, 36, 44, 54, 73
Rindler (1980) 64-65, 79, 105
Robitaille & Garden (1987) 85
Ruch (1929) 1
Sax & Carr (1962) 3, 11-12, 13, 73
Sax & Cromack (1966) 3, 7, 13-15, 28, 45, 73
Sirotnik & Wellington (1974) 3, 48-50, 73
Smouse & Munz (1968) 4, 19-20, 44, 73
Smouse & Munz (1969) 4, 21-22, 28, 44, 73
Towle & Merrill (1975) 4, 26-27, 33, 44, 46, 60, 73, 107
Tuck (1978) 58, 62, 79
Whitely & Dawis (1976) 4, 66-68, 73, 75, 81
Yen (1980) 4, 69-71, 73, 75, 81, 106, 107
