UBC Theses and Dissertations

A criterion-referenced approach to test construction in physical education Brar, Surinder 1986

A CRITERION-REFERENCED APPROACH TO TEST CONSTRUCTION IN PHYSICAL EDUCATION

by

SURINDER BRAR

B.P.E., The University of British Columbia, 1979

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTERS OF PHYSICAL EDUCATION

in

THE FACULTY OF GRADUATE STUDIES (PHYSICAL EDUCATION)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

October 1986

© Surinder Brar, 1986

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Surinder Brar
Department of Physical Education
The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3

Date: October 10, 1986

Abstract

The primary purpose of this study was to apply criterion-referenced (CR) test construction procedures in the development of a Basic Fitness Theory Exam (BFTE) for the British Columbia Recreation Association - Fitness Branch as part of a registration model for Basic Fitness Leaders. The BFTE will be used to determine whether or not individuals have sufficient knowledge to be certified as fitness instructors in British Columbia. A secondary purpose was to provide a clear set of procedures for CR test development of Physical Education knowledge tests.
The development of the BFTE consisted of a number of stages and involved three pilot tests. Pilot #1 consisted of 57 items and was administered to physical education students at The University of British Columbia (92 in Year I and 72 in Year II). After the results of the statistical analyses were presented to the Provincial Fitness Advisory Committee (PFAC), 22 items were revised, 22 items were replaced, no items were deleted, 3 items were added, and 13 items were left unchanged. Pilot #2, consisting of 60 items, was administered to 94 instructed (upon completion of a 40 hour course) and 106 uninstructed (upon commencement of a 40 hour course) subjects. Again, comprehensive statistical analyses were performed and then the results were presented to the PFAC. In all, 15 items were deleted, one item was revised, and no items were added in constructing Pilot #3. Pilot #3, the final test, consists of 45 items and is presently being used in the British Columbia Recreation Association - Fitness Branch's registration model for the Basic Fitness Leader (Level I). The BFTE has a "pass" criterion of 70% (32 out of 45).

Table of Contents

1. Introduction
2. Criterion-referenced Test Construction: An Overview
   Item Construction
   Item Analysis and Selection
      Informal examinee feedback
      Item difficulty
      Item discrimination
      Item homogeneity
   Cut-off Score Selection
   Validity
      Content validity
      Construct validity
   Reliability
      Threshold loss
      Squared-error loss
      Domain score estimation
   Score Interpretations
3. Structure and Purpose of Fitness Instructor Certification Program
   Organizational Structure of Sport Governing Bodies
   Purpose of Certification Program
   Overview of Certification Process
   CR Test Construction within the BCRA-Fitness Branch Framework
4. Test Construction Procedures and Results for the BFTE
   Overview
   Pilot Test #1
      Identification of Learning Outcomes
      Ranking specific areas
      Submitting items
      Randomization Procedures
      Validity
      Administration
      Results
      Subjective feedback
      Statistical analyses - overview
      Statistical analyses - item level
      Statistical analyses - Subtest (ST) and Total Test (TT) levels
   Pilot Test #2
      Item-objective Congruence
      Administration
      Statistical Analyses
      Reliability
      Hoyt's Estimate of Reliability
      Standard Error of Measurement
      Cronbach's Composite Alpha
   Pilot Test #3
      Cut-off Score
      Results
5. Summary and Recommendations
6. References
7. Appendices

List of Tables

Table 1  Test Construction Developmental Stages
Table 2  Proposed Time Line for Procedures
Table 3  Priorization of Specific Areas
Table 4  Item p-values for Pilot #1
Table 5  Descriptive Statistics for Pilot #1
Table 6  Correlations for Pilot #1
Table 7  Summary of Actual Procedures
Table 8  Summary of Item Revisions from Pilot #1 to Pilot #3

List of Appendices

A. Guidelines for Submitting Items
B. Item Statistics
C. Item-Objective Congruence
D. Subjective Feedback
E. Reliability Procedures
F. Organizational Structure of the Sport Governing Bodies
G. Rating of "Specific Areas"
H. Guidelines for Item Statistics Interpretation
I. Statistics for Pilot #2
J. Number of False-positives and False-negatives for Varying Cut-off Scores
K. Basic Fitness Theory Exam--Pilot #3

Introduction

The need for criterion-referenced (CR) tests was first demonstrated by Glaser (1963) over twenty years ago and subsequently expanded upon by Popham and Husek (1969). The purpose of the CR tests was to provide information relative to a domain of well-defined objectives or competencies which could then be used to make decisions regarding individuals (promotion, certification) or specific populations (program evaluation).
Since the first discussions about CR testing there has been a growing trend in physical education to formalize competency evaluation. While physical educators have realized that motor skills as well as knowledge can be explicitly defined, specific levels of competency have only been defined in relatively few domains (e.g., British Columbia Ministry of Education, 1979, 1980; Shifflett & Schuman, 1982). The British Columbia Physical Education Curriculum Guide (B.C. Ministry of Education, 1980) is an example of the effort made by physical educators to define levels of individual competency in a variety of activities. A list of specific behavioral objectives is given for each of the hierarchical levels of the activities, with the amount of detail given making it easy to use the objectives as a criterion for promotion of the students. Another example of individual competency evaluation and certification is the Canadian Association of Sport Sciences' certification and accreditation program for fitness appraisers. A final example is the Physical Education Learning Assessment Report (B.C. Ministry of Education, 1979), which shows how CR tests can be used to make evaluative statements about a specific population. Many other examples of both individual and programmatic decisions are available.
Due to the nature of the decisions that are made on the basis of CR testing, it is important that the tests be developed using test construction procedures that are appropriate to their function. CR tests are "those which are used to ascertain an individual's status with respect to some criterion" (Popham & Husek, 1969, p. 2), and show how an examinee stands with respect to some specific objective or domain of objectives. This is different from traditional or norm-referenced (NR) tests, "which are used to ascertain an individual's performance in relationship to the performance of other individuals on the same measuring device" (Popham & Husek, 1969, p. 2). NR tests provide little information regarding an individual's absolute degree of skill or competency, but rather provide the position of an individual relative to some sample or population (e.g., z-scores, percentiles).

Tests in physical education, including those mentioned above, are being used to make criterion-related decisions even though they were not developed according to CR test construction procedures. For example, Wilson (1980) developed a knowledge test according to standard NR methodology, and when the test was administered the scores were interpreted as CR test scores.
That is, the scores were interpreted in terms of a criterion that was not established during the test construction. Another area in which CR tests are being used in physical education is in the evaluation of physical skills. For example, Shifflett & Schuman (1982) have developed a CR test for archery using a combination of CR and NR techniques. As with Wilson's knowledge exam, the interpretation of the archery test scores in a truly CR sense is hard to justify.

The inappropriate interpretations shown above can result in organizations being accredited or individuals being certified or promoted because of high relative scores, but not necessarily high absolute scores. It is, therefore, necessary to develop, explain, and justify explicit test construction procedures for CR assessment in physical education. Thus, one of the purposes of this study is to provide evidence, theoretical and empirical, of the procedures needed to develop a CR knowledge test in physical education.

More specifically, the purpose of this study is to apply CR test construction procedures in the development of a Basic Fitness Theory Exam (BFTE) for the British Columbia Recreation Association - Fitness Branch as part of a registration model for Basic Fitness Leaders.
The requirements of the model state that, in order to be eligible for registration, the individual must "successfully complete a course which is recognized by the British Columbia Recreation Association, practical requirements, and Basic Fitness Theory Exam." The BFTE will be used to determine whether or not individuals have sufficient knowledge to be certified as fitness instructors.

Many authors have argued over the appropriateness of various psychometric indices and procedures (e.g., Berk, 1980a; Hambleton, Swaminathan, & Algina, 1978; Huynh, 1979; Subkoviak, 1980). Therefore, while developing the exam, this study will also contrast specific CR test construction procedures. Strengths, weaknesses, and applicability of the procedures to this specific situation will be discussed. Upon completion, this study will provide an exemplary model of test construction for CR tests in physical education.

Criterion-Referenced Test Construction: An Overview

Since CR testing was first popularized by Glaser (1963) and Popham and Husek (1969), many researchers have become involved in this rapidly growing area of study. Even as early as 1978, Hambleton et al. (1978) stated that there were over six hundred references on the topic of CR testing and that "there are as many ideas about what a criterion-referenced test is as there are contributors to the field." The number of contributors has continued to increase over the past few years; however, much more consensus has been reached over definitions and over the appropriateness of various CR test construction procedures.

An overview of the developmental stages for constructing a CR test is shown in the first column of Table 1 below (adapted from Berk, 1980b, p. 6). For comparison, the second column of Table 1 shows the developmental stages for a traditional NR test. The following sections discuss the development of the different stages of CR test construction.

Item Construction

Table 1 shows that the first three stages for CR test construction are similar to the first four stages in NR test construction. However, CR test construction places more emphasis on clear objective and domain specification and on item generation procedures. CR test procedures could be used to generate items for an NR test; however, NR test construction procedures are not strict enough to generate good CR test items (Martuza, 1977, chap. 16).
Table 1
Test Construction Developmental Stages

Criterion-referenced              Norm-referenced
Content domain specification      Content outline
Item construction                 Objective outline (LO)
Item domain construction          Content objective matrix
Item analysis                     Write items/construct test
Item selection                    Item analysis
Cut-off score selection           Item selection
Validity                          Validity
Reliability                       Reliability
Score interpretations             Score interpretations

Hambleton et al. (1978, p. 3) also confirm this by stating that "a norm-referenced test can be used to make criterion-referenced measurement, and a criterion-referenced test can be used to make norm-referenced measurement, but neither use will be particularly satisfactory." It should be pointed out that CR test construction methodology has only developed over the past twenty years; until very recently CR tests were constructed using NR methodology and then the resulting scores were just interpreted in a CR sense. As mentioned above, this was a very unsatisfactory method of obtaining the information needed to make CR decisions.

The first stage in CR test construction is to define the domain of content or behaviors to be measured.
This is a very important stage because the "validity and interpretation of the test scores are contingent upon the precision of the domain specifications" (Berk, 1980b, p. 13). The precision used in this very first stage alone sets CR testing apart from standard NR testing, where the content and objective outlines can be done quite arbitrarily. While most authors agree that this is an important stage, very few (Berk, 1980b; Hambleton et al., 1978; Millman, 1974) have actually discussed methodology for developing good domain specifications.

The next stages involve item and item domain construction. The items should be generated according to a specific set of rules to ensure that they address the content defined in the first stage. A sample of item construction guidelines can be seen in Appendix A. The content domain and the item generation rules should be specified in such a rigid fashion that the items produced are totally independent of the person writing them. Popham (in Berk, 1980b, chap. 2) even discusses the use of algorithms for computer generated test items which would all be very uniform in nature and also very closely related to the domain specifications. The item domain is constructed as a result of generating items within the content domain specified in the first stage.
It can be either finite or infinite depending on the precision of the content domain specification strategy employed. A finite set of items will result from a very precisely defined content domain.

Item Analysis and Selection

Item analysis, and consequently item selection and revision, are based on item-objective congruence and various item statistics as listed in Appendix B. The data for these statistics are obtained by the following methods:

a) pretest-posttest
b) uninstructed-instructed group difference
c) individual gain
d) net gain

Although the methods for data collection are traditional, the actual statistics used are different from, or in some cases just interpreted differently than, the standard NR statistics. The main reason for needing new statistics, or different interpretations, is the lack of variance in the CR test scores.

The first and most important analysis at the item level is the item-objective congruence. It shows "the extent to which an item measures the objective it is intended to measure" (Berk, 1980b, p. 51). Each of the items should be subjectively rated by a series of individuals who are experts in the content domain specified in stage one. A summary of the judges' ratings will show how well each item is addressing the objective it was meant to test.
Ensuring a high level of item-objective congruence will help towards constructing a test with high content validity. A sample item-objective congruence rating sheet can be seen in Appendix C.

Informal examinee feedback. Feedback can be obtained by attaching a questionnaire to the test as it is administered. The questionnaire should consist of direct questions regarding ambiguities, confusing words, poor wording, difficulty level, etc. A list of possible questions can be found in Appendix D. This feedback can provide "valuable insights and directions for improving the test that otherwise would not be disclosed from a quantitative analysis" (Berk, 1980b).

Item difficulty. Item difficulty, expressed as a "p-value", refers to the percentage of examinees choosing the correct response. A higher p-value for the correct response indicates that many examinees answered correctly; the item may be too easy. It should be noted that this paper will refer to the p-value of the distractors as well as the correct response. In this case, the p-value for a given response will represent the percentage of examinees who chose that response, regardless of whether or not it is the correct response. The p-value of distractors can provide invaluable information regarding the discrimination ability of the item.
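The p-value described above can be illustrated with a minimal Python sketch. The function name `option_p_values`, the four-option format, the keyed answer "B", and the response data are all hypothetical and are not taken from the BFTE; p-values are returned as proportions (multiply by 100 for percentages).

```python
from collections import Counter


def option_p_values(responses, options="ABCD"):
    """Return the proportion of examinees choosing each option of one item."""
    if not responses:
        raise ValueError("at least one response is required")
    counts = Counter(responses)
    n = len(responses)
    return {opt: counts.get(opt, 0) / n for opt in options}


# Hypothetical responses to a single four-option item; "B" is the keyed answer.
uninstructed = list("ABCDBACBDA")
instructed = list("BBBABBCBBB")

p_unin = option_p_values(uninstructed)
p_inst = option_p_values(instructed)

# p-value of the correct response in each group
print("uninstructed:", p_unin["B"], "instructed:", p_inst["B"])

# p-values of the distractors show which wrong answers attract each group
for opt in "ACD":
    print(opt, p_unin[opt], p_inst[opt])
```

In this made-up example the correct response is chosen far more often by the instructed group, while the distractors draw mostly uninstructed examinees, which is the pattern one would hope to see in an item that discriminates between the groups.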
Item discrimination. Item discrimination indices should demonstrate changes from pretest to posttest or differences between instructed and uninstructed subjects. These indices need to show maximum discrimination between groups and minimum discrimination among individuals within any one group. Berk (1980b) lists, describes, and evaluates several indices that have been suggested by various authors. These indices (shown in Appendix B) range from three simple proportions, which are very sensitive to the accurate definition of criterion groups, to three types of correlations which require more sophisticated computer programs for their calculations. The final index Berk (1980b) evaluates is Millman's (1974) discrimination index, which involves a stepwise regression that incorporates interitem correlations. This index not only requires a sophisticated computer program, but also requires a very large sample size. Regardless of which of the indices is chosen, it is important to ensure that all items discriminate well; good item discrimination will result in a test with a high level of decision validity.

Item homogeneity.
Item homogeneity refers to statistics that are used to verify "that items congruent with an instructional objective behave similarly on a single testing or on repeated testings" (Berk, 1980b, p. 64). Item homogeneity is more appropriate in situations where the objective is very specifically defined; in a case where the objective is generally defined to cover a variety of skills it may be unrealistic to expect the items to be homogeneous in terms of item difficulty indices, etc. Berk (1980b) evaluates four possible indices that can be used to demonstrate item homogeneity.

In general, item statistics should not serve as the sole criterion for item selection or revision, but rather should provide guidelines for these procedures. Millman (1974, p. 339) suggests that "Item statistics can, however, be used to detect flawed items."

Cut-off Score Selection

Many researchers have attempted to solve the problem of determining the best method for establishing a cut-off score for criterion-referenced tests which would assign examinees to mastery/nonmastery groups (Hambleton & Eignor, 1979; Hambleton et al., 1978; Millman, 1973; Shepard, 1980). The problem has yet to be resolved to a point which is satisfactory. Hambleton et al. (1978, p. 26) state that "The arbitrariness of the proposed solution has proved troubling to some measurement people, to the point where they seriously question the merits of determining and using cut-off scores at all."

Cut-off scores can also have a profound effect on the test score reliability and validity. Researchers have provided evidence which suggests that the value of a cut-off score influences student learning and their attitudes. A higher cut-off score corresponds to students who study harder and have a more positive attitude towards the test. As a result, their test scores will be higher.

The selection of cut-off scores will always be associated with errors. Examinees who are very close to the cut-off score (just above or below) will be very similar in their abilities. Hambleton (1978), Shepard (1980), and others, while realizing that the standard setting methods are somewhat arbitrary, suggest that "potentially flawed standards are better than none." If pass-fail decisions are inevitable, good test information, even with an arbitrary cut-off score, will lead to better decisions than those that would be made without the test.

Shepard (1980) reviews, evaluates, and summarizes exemplary standard-setting methods.
The major methods fall in the following categories:

(1) absolute judgements of test content
(2) judgements about mastery-nonmastery groups
(3) norms and passing rates
(4) empirical methods for discovering standards
(5) empirical methods for adjusting cut-off scores, given a standard on an external criterion measure.

Most of the methods are based on decision theory (Hambleton & Novick, 1973; Subkoviak, 1980); the object is to maximize decision validity once the external criterion is established. The dichotomy of the test is matched to the dichotomy of the criterion while minimizing the number of false-positives and false-negatives. Huynh (1976) also presents a reasonable procedure for setting the cut-off score once the external criterion is established.

Millman (1973), Shepard (1980), and others have suggested techniques for adjusting the cut-off score to protect against the more serious type of error. In some applications (certification, promotion, etc.) a false-positive could be considered a much more serious error than a false-negative. Some of the other methods which reduce errors on both sides of the cut-off score are discussed in more detail in the validity and reliability sections.

Shepard (1980) also suggests criteria for selecting the appropriate standard-setting method for specific uses.
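The decision-theoretic idea above, matching the pass/fail dichotomy of the test to an external mastery criterion while keeping false-positives and false-negatives low, can be sketched in Python. This is an illustration only and not a procedure taken from the cited authors: the function names, the error weights `fp_weight` and `fn_weight`, and the score data are all hypothetical.

```python
def classification_errors(scores, is_master, cutoff):
    """Count (false-positives, false-negatives) for one cut-off score.

    scores    -- test scores, one per examinee
    is_master -- True/False status on the external criterion, same order
    cutoff    -- minimum score needed to pass
    """
    pairs = list(zip(scores, is_master))
    false_pos = sum(1 for s, m in pairs if s >= cutoff and not m)
    false_neg = sum(1 for s, m in pairs if s < cutoff and m)
    return false_pos, false_neg


def best_cutoff(scores, is_master, candidates, fp_weight=1.0, fn_weight=1.0):
    """Return the first candidate cut-off with the smallest weighted error count."""
    def loss(cutoff):
        fp, fn = classification_errors(scores, is_master, cutoff)
        return fp_weight * fp + fn_weight * fn
    return min(candidates, key=loss)


# Hypothetical scores on a 45-item test with known criterion-group membership.
scores = [25, 28, 30, 31, 33, 29, 32, 34, 36, 40]
is_master = [False] * 5 + [True] * 5

# Tabulate both error types for a range of candidate cut-off scores.
for c in range(28, 37):
    print(c, classification_errors(scores, is_master, c))

equal_weights = best_cutoff(scores, is_master, range(28, 37))
strict_on_fp = best_cutoff(scores, is_master, range(28, 37), fp_weight=3.0)
print(equal_weights, strict_on_fp)
```

With these made-up data, weighting a false-positive three times as heavily as a false-negative moves the selected cut-off from 32 to 34, which illustrates how a cut-off can be adjusted when one type of error is judged more serious.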
The types of information needed are organized into three categories based on the level of decision-making that will be done. The three categories are: (1) individual diagnosis, (2) individual certification, and (3) program evaluation. Each of the three types would require a different strategy for establishing the cut-off score.

Validity

"A test has validity if it measures what it purports to measure" (Allen & Yen, 1979, p. 95). Any test, whether it is CR or NR, must be valid in order to allow the user to attach meaningful interpretations to the measurements provided. Several methods can be used to assess validity, depending on the test and its intended use. The two major types of validity are content validity and construct validity. The recommended procedures for assessing each of these types of validity for CR tests are similar to the traditional NR approaches; however, because correlation coefficients, factor analysis, and multitrait-multimethod analysis all require a large variance, new statistics and new interpretations of the old statistics have been developed (Berk, 1980b; Popham & Husek, 1969).

Content Validity. Content validity can be established through a logical examination of the content of a test in relation to a domain or a set of objectives.
It can be divided into two forms, face validity and logical validity, both of which are subjective in nature. Face validity can be established by having an individual (anyone from an expert to an examinee) critically examine the test and conclude that it does measure the relevant trait. Logical or sampling validity "involves the careful definition of the domain of behaviours to be measured by a test and the logical design of items to cover all the important areas of this domain" (Allen & Yen, 1979, p. 96).

Content validity is quite often the only type of validity that is addressed by makers of CR tests (Popham & Husek, 1969; Linn, 1980). Because the development of CR technology has increased the precision with which the test items are linked to the domain or list of objectives, many tests are constructed without mention of the content validity. Linn (1980) and others have suggested that many test constructors assume that the validity of a criterion-referenced test is guaranteed by the definition of the domain and the process used to generate the items. This is further confirmed by Popham and Millman (in Berk, 1980c) in their discussion of domain specifications, item generation forms, and amplified objectives. All three techniques have been developed to ensure that the items are truly reflective of the domain being tested.
In order to establish a certain level of content validity for CR test items, Hambleton et al. (1978) have suggested that the general approach involves judgements of test items by content specialists. Berk (1980b) has summarized three methods for obtaining these subjective evaluations of the content validity. The first method is the item-objective congruence as discussed in the earlier section on item analysis and selection. The second method is to use a rating scale, in which each judge rates the items subjectively on their content in relation to the domain being tested. The last method involves having the judges try to match each of the items to a specific objective from the domain being tested. All three of these methods can be used to demonstrate the content validity of a CR test.

Construct Validity. Construct validity is much more difficult to establish and is often not even reported for a CR test. A test's construct validity is "the degree to which it measures the theoretical construct or trait that it was designed to measure" (Allen & Yen, 1979). One of the possible reasons can again be associated with the low variance in the test scores from CR exams. The homogeneous results are partly due to the fact that CR tests are generally administered to a group just prior to, or just after, a series of lessons on the domain being tested. This low variance deters the test maker from using the traditional methods of establishing construct validity such as factor analysis and multitrait-multimethod analysis.
Instead, once the content validity of the items is ensured, the test constructor should concentrate on establishing the construct validity through the use of an external criterion. This involves decision or criterion-related validity and is established through empirical methods which include setting a cut-off score and testing predictions. As mentioned earlier, test validity is closely related to the cut-off score. A cut-off score that is too high or too low will result in too many false-negatives or false-positives, respectively, which will put the construct validity in doubt. Berk (1976) suggests that an alternative method for identifying the optimal cut-off score is to compute a validity coefficient for each possible cut-off score. The subjects would be dichotomized according to an external criterion (e.g., uninstructed = 0, instructed = 1) and then their test scores would be dichotomized (a score below the cut-off score = 0, a score at or above the cut-off score = 1). The validity coefficient is simply the Pearson correlation (phi coefficient) between these two dichotomous variables. A test with high validity would result in a high positive correlation. Therefore, the test constructor would select the optimal cut-off score at the point where the validity (correlation) coefficient is maximized.
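The cut-off search described above can be illustrated with a minimal sketch. It assumes integer test scores and a 0/1 external criterion coded as in the text (uninstructed = 0, instructed = 1); the function names and the ten-examinee data set are hypothetical and are not taken from Berk (1976) or from the BFTE data.

```python
from math import sqrt

def phi_coefficient(x, y):
    """Pearson correlation between two 0/1 sequences (the phi coefficient)."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    # If everyone falls on one side of the cut-off, phi is undefined;
    # this sketch reports 0.0 in that case (an assumption, not a convention
    # stated in the source).
    return (a * d - b * c) / denom if denom else 0.0

def best_cutoff(scores, instructed):
    """Compute phi for every candidate cut-off and return the maximizing one."""
    results = {}
    for cutoff in range(min(scores), max(scores) + 1):
        # Score below the cut-off = 0, at or above the cut-off = 1.
        passed = [1 if s >= cutoff else 0 for s in scores]
        results[cutoff] = phi_coefficient(passed, instructed)
    best = max(results, key=results.get)
    return best, results

# Hypothetical data: ten examinees, five uninstructed and five instructed... 
# (one instructed examinee scores low, one uninstructed examinee scores mid-range)
scores = [3, 4, 5, 5, 6, 7, 8, 8, 9, 10]
instructed = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

best, results = best_cutoff(scores, instructed)
print(best, round(results[best], 3))
```

With these made-up data the coefficient peaks at a cut-off of 7; in practice ties or a flat maximum are possible, and the choice between tied cut-offs would depend on which type of misclassification is considered more serious.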
Allen & Yen (1979) refer to this cut-off selection procedure as demonstrating concurrent validity, which is very similar to predictive or criterion-related validity.

Reliability

A CR test has reliability if it shows "consistency of decision making across parallel forms of the criterion-referenced test or across repeated measurements" (Hambleton & Novick, 1973, p. 166). Relative to validity, much more attention has been given to reliability in CR test construction. More than a dozen indices have been proposed by various researchers, and critical reviews have been conducted by Berk (1980a), Hambleton, Swaminathan, Algina, & Coulson (1978), Millman (1980), and Subkoviak (1978). Although these indices have been available for a number of years, they are seldom reported for CR tests. Usually only a Kuder-Richardson 20 or 21 is reported for the internal consistency of the test.

The various indices that are available have been placed, by Hambleton et al. (1978), into categories based on the similarities in their assumptions and methodology. The three categories, as shown in Appendix E, are threshold loss, squared-error loss, and domain score estimation.

Threshold loss. Threshold loss is defined by Berk (1980b) as "the consistency of mastery-nonmastery classification decision-making across repeated measures with one test form or on parallel test forms."
The concept of using a threshold loss function was first proposed by Hambleton and Novick (1973, p. 168). Its use assumes (Berk, 1980b):

a) a dichotomous, qualitative classification of students as masters and nonmasters of an objective based on a threshold or cutting score.
b) the losses associated with all false mastery and false nonmastery classification errors are equally serious regardless of their size.

Several authors have proposed variations on the original index proposed by Hambleton and Novick. In total, six approaches have been suggested and are listed in Appendix E. All of the suggested indices are proportions based on one of two basic indices, "p0, proportion of individuals consistently classified as masters and nonmasters across (classically) parallel test forms, and κ, proportion of individuals consistently classified beyond that by chance" (Berk, 1980a). There are obvious advantages and disadvantages for using each of these indices depending on the context of the decisions being made with the test scores. Berk (1980a) discusses each of the six indices, in detail, in the context of the test being used for individual decisions, for certification, and for program evaluation.

Squared-error loss. Squared-error loss functions are based on the squared deviations of the individual scores from the cut-off scores.
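As a brief aside on the threshold loss category, the two basic indices named in this section, p0 and κ, can be sketched for two parallel test forms as follows. This is an illustration under stated assumptions only: the function names, the cut-off of 7, and the scores are hypothetical, and κ is computed here in its usual chance-corrected form, (p0 - pc) / (1 - pc), where pc is the agreement expected by chance from the marginal mastery rates.

```python
def classify(scores, cutoff):
    """Master = 1 if the score is at or above the cut-off, else nonmaster = 0."""
    return [1 if s >= cutoff else 0 for s in scores]

def agreement_indices(form_a, form_b):
    """Return (p0, kappa) for two lists of 0/1 mastery classifications."""
    n = len(form_a)
    # p0: proportion of examinees classified the same way on both forms.
    p0 = sum(1 for a, b in zip(form_a, form_b) if a == b) / n
    # pc: agreement expected by chance, from the marginal mastery proportions.
    pa = sum(form_a) / n
    pb = sum(form_b) / n
    pc = pa * pb + (1 - pa) * (1 - pb)
    # kappa is undefined when pc equals 1; this sketch returns 0.0 then.
    kappa = (p0 - pc) / (1 - pc) if pc < 1 else 0.0
    return p0, kappa

# Hypothetical scores for the same ten examinees on two parallel forms.
scores_a = [8, 9, 7, 4, 5, 9, 3, 8, 7, 6]
scores_b = [9, 8, 6, 5, 4, 8, 5, 7, 9, 7]

p0, kappa = agreement_indices(classify(scores_a, 7), classify(scores_b, 7))
print(round(p0, 3), round(kappa, 3))
```

In this made-up example eight of the ten examinees receive the same decision on both forms (p0 = 0.8), while κ is lower (about 0.58) because part of that agreement would be expected by chance alone.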
The squared-error approach is similar to the threshold loss approach except that it builds in a sensitivity to the degree of mastery or nonmastery. Misclassifying examinees who are far above or below the cut-off score is considered more serious than misclassifying those who are near the cut-off score. Also, the degree of error associated with false-positives and false-negatives is not assumed to be equal. Only two indices are available in this category, one proposed by Livingston (1972, 1977) and one by Brennan (1980). Both indices are very similar in their assumptions and even in their calculations. Again, see Berk (1980a) for a detailed discussion of their common features as well as their slight differences.

Domain score estimation. Domain score estimation involves the use of statistics which are concerned generally with estimating the stability of an individual's score or proportion correct in the item domain independent of any mastery standard (Berk, 1980a). Five statistics, as listed in Appendix I, have been proposed by Brennan (1980), Berk (1980c), Cochran (1963), Lord (1959), Lord & Novick (1968), and Millman (1974). All of these statistics can be used in the context of individual or programmatic decision making and are calculated from parallel or randomly parallel test forms.
The main utility of this approach is in situations where a cut-off score is not available or is not necessary for that particular application. A detailed discussion of the actual indices is beyond the scope of this paper and can be found in Berk (1980a).

Score Interpretations

The interpretation of the test scores is listed as the final stage of test construction in Table 1. In actual fact, many of the earlier procedures are dependent on the context in which the scores will be interpreted. Score interpretation is not only important when the test is complete, but also essential throughout the developmental stages. Some of the interpretations that can be made include: (1) referencing the test score to an objective, a domain, or a cut-off score in order to assign pass/fail grades for minimum competency in a certification process, (2) monitoring an individual's progress through an educational sequence, or (3) making decisions regarding program effectiveness. Many decisions made during the construction of a CR test, such as selection of appropriate validity and reliability procedures, are based on the context of the test score interpretations. For example, a test score that will be used solely for the purpose of assigning pass/fail grades should not employ reliability or validity procedures which are dependent on several cut-off scores. The interpretation of the test scores should be established at the start of the test construction and used as a guide throughout the developmental stages.

Structure and Purpose of Fitness Instructor Certification Program

Organizational Structure of Sport Governing Bodies

The BFTE that was constructed as part of this study was a CR test for the British Columbia Recreation Association-Fitness Branch. Because the exam was developed specifically for the BCRA-Fitness Branch it was necessary to operate within the framework and philosophy of this organization. Thus, it is appropriate to describe the administrative arrangement and terms of reference for the sport governing bodies associated with the project.

The overall authority for initiating, evaluating, and monitoring any fitness-oriented projects in Canada is Fitness Canada. A subcommittee for Fitness Canada is the National Committee on Fitness Leadership Training in Canada, which is directly responsible for the training of fitness leaders in Canada. Another group closely affiliated with Fitness Canada is the Canadian Association of Sport Sciences (CASS). CASS has a subcommittee, the Committee on Fitness Appraisal and Accreditation (CFAA), which has developed the registration model to be implemented throughout Canada. Both of these subcommittees work jointly with each other as well as with the sport governing bodies within each province.
In British Columbia, the sport governing body is Recreation and Sport Branch, British Columbia, which gets its funding from, and is, therefore, accountable to the Government of British Columbia. Working under the authority of the Recreation and Sport Branch, B.C. is the British Columbia Recreation Association (BCRA), and more specifically the British Columbia Recreation Association-Fitness Branch. Therefore, the BCRA-Fitness Branch is responsible for the implementation of the nationwide fitness programs in British Columbia.

The BCRA-Fitness Branch established the Provincial Fitness Advisory Committee (PFAC) to provide the leadership and guidance during the implementation of the above programs. The members of the PFAC were appointed by the BCRA-Fitness Branch and included representatives from the following groups or organizations:

Sports Medicine Council
British Columbia Government
British Columbia Medical Association
British Columbia YMCA-YWCA
Canadian Association of Sport Sciences
Private sector
Corporate community
British Columbia Recreation Association-Fitness Branch

An overview of this structure of sport governing bodies can be seen in Appendix F.

The School of Physical Education and Recreation of the University of British Columbia (UBC) dealt jointly with the BCRA-Fitness Branch and the PFAC only during the test construction phase of the fitness instructor certification program implementation. UBC acted as consultant on test construction procedures and methodology, providing instruction and guidance during all stages. UBC provided all statistical and computer analyses at the item, subtest, and total test levels.
Originally the BCRA-Fitness Branch had proposed a three-month timeline for test construction, but later adopted the UBC suggestion of a nine-month timeline to ensure an exam of reasonable quality. This timeline is shown in Table 2. It should be stressed that the end result was not a UBC test; the BCRA-Fitness Branch made all final decisions (usually after consulting with UBC).

Purpose of Certification Program

The sudden increase in the popularity of fitness classes resulted in a drastic shortage of qualified, experienced fitness instructors. Coincidentally, the number of injuries, from minor muscle strains to torn ligaments and stress fractures, had also increased dramatically. Both federal and provincial sport governing bodies became concerned about the high number of unqualified fitness instructors, and assumed that many of these injuries would be preventable by taking the proper precautions and following sound training principles. It was also assumed that by better preparing the instructors, the participants could also develop appropriate training habits and, therefore, minimize injuries.

It is not uncommon for a participant to become an instructor after several months of attending fitness classes. Variable experiences serve as the basis for the knowledge that is learned and that will be passed along to any classes that he/she teaches, regardless of the worth of the information. As a result, many myths, as well as truths, are perpetuated very quickly without anyone questioning their validity.
For example, until recently, it was not uncommon to observe groups of people performing situps in pairs with one partner sitting on the other's straight legs for more leverage. It is well known, by informed professionals, that this procedure is very detrimental to the lower vertebrae because of the excessive force applied to this area as a result of the situp motion in this position. Bent-legged situps or partial curl-ups, without a partner, are the prescribed procedure for this exercise. Another example of a harmful exercise is included in many warm-up procedures: the whole group is led through a systematic rotation of their heads in clockwise and counterclockwise circles in an attempt to stretch the muscles around the neck. In actuality, these rotations produce very high stress through the grating of the small vertebrae found in the neck. The prescribed procedure for the neck muscles has been shown to be side-to-side and forward-and-back motions rather than rotations. Therefore, the major purpose of the certification is to provide an increased minimum content and a more uniform body of knowledge, better instruction, and a reduction in the number of injuries among uninformed consumers.

Overview of Certification Process

The fitness instructor certification program was to be implemented through several phases.
The first phase involved establishing accredited courses for fitness leaders, with the same set of objectives throughout the province. The next phase involved having fitness leaders complete the accredited course and then apply to the PFAC for certification. The PFAC would review the applicant with respect to their experience and completion of the accredited course, and then would do an on-site evaluation of their teaching abilities. After this, the applicant would have to pass the Basic Fitness Theory Exam (BFTE) in order to receive certification. The BFTE is the criterion-referenced multiple choice test developed within this study. The last phase of the program was for the fitness center itself to be evaluated in order to establish accredited fitness centers throughout British Columbia. These would be judged on various criteria, one of which was the qualifications of instructors. The end result would be accredited fitness centers with proper facilities and certified instructors.

CR Test Construction Within the BCRA-Fitness Branch Framework

As stated previously, all final decisions regarding the test were made by the BCRA-Fitness Branch with input from the PFAC and UBC. UBC acted primarily as a consultant and advisor throughout the test construction process. The steps outlined earlier (in the CR Testing Overview section) were used primarily as guidelines for this process.

The first restriction the BCRA-Fitness Branch placed on the test was to insist that the test be less than one hour in length. They also wanted a multiple choice test for ease of marking. With the high number of applicants expected to write the test, the BCRA-Fitness Branch wanted to keep the administrative and marking costs to a minimum.
The last restriction involved the passing or failing of the examinees. The examinees would pass or fail the test based on the total test score only. The subtest scores would not be considered, and thus an examinee could completely fail one subtest provided that he/she did reasonably well on the other two subtests.

Test Construction Procedures and Results for the BFTE

Overview

Table 2 below shows the general timeline for the procedures which were proposed for the construction of the BFTE. It is based on the first column of Table 1, which listed the developmental stages for CR test construction as suggested by Berk (1980b). Table 2 also shows that CR test construction is basically an iterative process which continues indefinitely (pilot #1, #2, #3, etc.); a "finished" test must be continually monitored and, when necessary, updated to the current application (population, test use, etc.). This section of the paper deals with the construction procedures that were actually followed and the results obtained while constructing the BFTE.

Table 2
Proposed Time Line for the Procedures

Date                   Procedure
Sept. 15 to Dec. 15    Domain specification
                       Prioritization and grouping of specific areas
                       Construct specifications for item generation and mail to committee members
Jan. 12                Collect items from committee members
Jan. 26                Construct pilot #1 (feedback from committee)
Jan. 31                Final revisions of pilot #1
Feb. 2                 Administer pilot #1 (two UBC classes; N=200)
Feb. 12                Computer analyses (LERTAP statistical package)
                       Item and factor analyses
                       Item selection and revision
Feb. 20                Construct pilot #2
                       Distribute for feedback
                       Select a cut-off score
Feb. 25                Final revisions of pilot #2
Feb. 28                Administer pilot #2
Mar. 25                Computer analyses
                       Item and factor analyses
                       Reliability
                       Validity
                       Item selection and revision
Mar. 31                Present results to the PFAC and the BCRA-Fitness Branch
Apr. 30                Complete "final" draft (pilot #3)

Pilot Test #1

Identification of Learning Outcomes

Once the BCRA needs had been established and quality control issues resolved, the actual test construction began. The first stage involved defining the domain of knowledge (i.e., the objectives) to be tested. In Table 1 this stage is referred to as content domain specification. As the test was to be administered to students who had just completed the forty-hour instructor's course, this stage of the project involved defining the content of this course. The course content was specified in terms of main learning objectives which were categorized into eight "specific areas." These were contained in the "Fitness Instructor Criteria Prospectus" which was provided by the PFAC on behalf of the BCRA. The "specific areas" to be tested included:

1. Planning
2. Basics of Anatomy and Physiology
3. Safety
4. Exercise Principles for Adult Fitness Classes
5. Learning Theories
6. Teaching Strategies
7. Leadership
8. Evaluation

Ranking Specific Area.
Because of the large number of "specific areas", with each containing many "main objectives", the members of the PFAC were asked to rate the relative importance of the eight "specific areas." The members were asked to subjectively rate the three most important and the three least important "specific areas." A copy of the rating sheet that was used can be found in Appendix G. Eleven members responded, and after tabulating the results a clear order was established. From the most to least important, the areas were ranked as in column one of Table 3.

Table 3
Prioritization of Specific Areas

Rank   Area                                        New Category   No. of Items
1.     Principles of Adult Fitness Classes         1              12
2.     Basics of Anatomy and Physiology            2              12
       [including Nutrition]
3.     Safety                                      3              9
4.     Teaching Strategies                         4              8
5.     Leadership
8.     Learning Theories
6.     Planning                                    5              7
7.     Evaluation
                                                                  Total=48

Because of the consistently low rating given to categories four through eight, and also the subjective comments provided by the judges, it was decided that these "specific areas" should be grouped together to form two larger, more general categories. Teaching Strategies, Leadership, and Learning Theories were combined to establish the new fourth category, while Planning and Evaluation were combined for the new fifth category. The five new categories are shown in column three of Table 3. Table 3 also shows the proposed number of test items for each of the five new categories.
This again is based on the relative importance of the five categories and also on the preference for a test length of approximately 45 items. It has been shown that one can assume that examinees require approximately one minute per item for multiple choice items with four options if the items are testing at or below the application stage in Bloom's taxonomy (Berk, 1980b). It has also been shown that the minimum number of items required for any specific area is approximately six (Berk, 1980b; Wilcox, 1981). These factors were all considered when the number of items for each subtest was determined.

Submitting Items. At this stage in the test construction, members of the PFAC were asked to submit items relating directly to "main objectives" within their chosen "specific areas." They were asked to pick "specific areas" about which they felt most knowledgeable. Each member was familiar with the list of "main objectives" in the Fitness Instructor Criteria Prospectus and was provided with guidelines for submitting items, as shown in Appendix A (Berk, 1980b), which included five sample items. The submitted items were to be categorized by the "specific area" and "main objective" being tested. It was also important to stress that the correct answer be supplied (circle, asterisk, etc.).
Items were collected over several months with the anticipation that pilot number one would be constructed with approximately ninety-six items, twice the number needed for the final version. It is generally assumed (Berk, 1980b) that approximately half of the items used in the first pilot will have to be removed or drastically revised. If fewer than twice the needed number of items are used in a first pilot, then the test constructor runs the risk of not having enough items in the final version. If more than twice the needed number of items are in the first pilot, then the test constructor could run into administrative problems such as:

1. fatigue in examinees
2. motivation of examinees
3. time constraints (classes, subjects, etc.)
4. cost of administration (personnel invigilating, location, photocopying, data entry/analysis)

Over two hundred items were collected from nine members of the PFAC. The breakdown of the items received as they related to the five new categories was as follows:

Specific Category                                       Number of Items Submitted
1. Principles of Adult Fitness Classes                  51
2. Basics of Anatomy and Physiology (+ Nutrition)       83
3. Safety                                               44
4. Teaching Strategies/Leadership/Learning Theories     13
5. Planning/Evaluation                                  17
                                                        Total=208

A relatively large number of items had been submitted for categories one, two, and three; however, only a few questions had been submitted for categories four and five, and most of these were either intuitively obvious (no valid distractors were possible) or inappropriate for the basic fitness leader. Following lengthy meetings with course instructors and PFAC members it was concluded that the questions in categories four and five did not reflect the contents of the forty-hour course for basic fitness leaders. It was agreed that although the material in these areas is within the overall list of objectives, in actual fact very little is taught in these areas during the limited time available in the course. Closer examination of the detailed table of contents showed that the basic fitness leader is not required to handle administrative procedures such as advertising, overall program planning and evaluation, and individual fitness appraisal and exercise prescription. The basic fitness leader is more concerned with evaluations, based on exercise and safety principles, done while conducting a class. Also, the material in category four is much too complex to teach in detail during a forty-hour course. Because of the subjective nature of the material, only a minimal treatment is possible in this time.
Therefore, it was decided that the examinee's knowledge of material in categories four and five would be assessed through the Individual Competency Evaluation (ICE), rather than the BFTE. The ICE would be an assessment of the examinee's knowledge and skills in a practical situation. This would allow for a much more objective evaluation of the examinee's knowledge with the BFTE.

Based on the above decision, a new breakdown was derived for the proposed number of items in each of the three specific areas which were retained. The breakdown would reflect the relative importance of each category. As mentioned earlier, ideally, pilot #1 would consist of approximately ninety items. However, as the subject pool of examinees for pilot #1 was to be Physical Education students at The University of British Columbia, a maximum time limit of 50 minutes was imposed (this is the length of the classes). This also implied that a maximum of 50 items should be used, based on the theory that approximately one minute is required for each item (Berk, 1980b). On the assumption that these students could continue to write the exam into the 10-minute break they have between classes, a 57-item test was constructed. It consisted of 20 items from category one (Principles of Adult Fitness Classes), 22 items from category two (Basics of Anatomy and Physiology, including Nutrition), and 15 items from category three (Safety).

Randomization Procedures

The items from each of the three categories were scrambled using a table of random numbers. This was done to ensure that there was no pattern in the selection of the questions from each category. After this was complete, the table of random numbers was again used to ensure that no specific pattern emerged for the correct response to each question.
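The two randomization steps described above can be sketched in code. This is only an illustration: it assumes a seeded pseudo-random generator as a stand-in for the printed table of random numbers that was actually used, and the function names and seed values are hypothetical rather than part of the study.

```python
import random


def scramble_items(item_ids, seed):
    """Return the item ids in a pseudo-random order (assumed stand-in for a random number table)."""
    rng = random.Random(seed)
    order = list(item_ids)
    rng.shuffle(order)
    return order


def scramble_options(options, correct_index, seed):
    """Shuffle the options of one item and report the new position of the keyed answer."""
    rng = random.Random(seed)
    positions = list(range(len(options)))
    rng.shuffle(positions)  # positions[k] is the original index now shown in slot k
    shuffled = [options[i] for i in positions]
    return shuffled, positions.index(correct_index)


# Item ids 1..57, matching the length of the first pilot.
item_order = scramble_items(range(1, 58), seed=1986)

# Options of one four-option item; the keyed answer starts at index 0.
options = [
    "The efficiency of the heart and lungs",
    "The ability to sprint 100 metres in 10 seconds",
    "Vital capacity plus residual volume",
    "The mobility of the joint",
]
shuffled, new_key = scramble_options(options, correct_index=0, seed=7)
print(item_order[:5], "keyed answer now in position", new_key + 1)
```

Fixing the seed makes the ordering reproducible, which plays the same role as recording the starting point used in a printed table.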
While many randomization procedures exist, a simple table of random numbers was employed for randomizing both the categories and the correct responses.

Validity

Once the items and the correct responses had been randomized, a 57-item first draft of pilot #1 was constructed. At this point, several PFAC members subjectively evaluated the test to provide information concerning its face validity. Face validity is "established when a person examines the test and concludes that it measures the relevant trait. The person making this examination can be anyone from an expert to an examinee" (Allen and Yen, 1979, p. 96). A committee was then formed to make the final revisions to pilot #1. These changes were made based on the following considerations:

1. the information available concerning face validity
2. deviations from item construction guidelines
3. the need to correct and to ensure the consistency of the grammar throughout each item and the entire test

Pilot #1 of the test was now ready to be administered.

Administration

Two classes in the Physical Education program at UBC were designated as the examinees. The first course, Physical Education 163 (Biodynamics of Physical Activity), is compulsory for students entering the Bachelor of Physical Education program. These students (Year I group) were assumed to be relatively uninstructed in the body of knowledge being tested through pilot #1. The pilot test was administered to this group within the first few weeks of enrolling in this course. The second course, Physical Education 391 (Human Functional Anatomy and Applied Physiology), is also compulsory, but is taken by students in their second year.
These students (Year II) had completed one term of their course when they wrote the BFTE pilot test. These two "diverse" groups of subjects permitted an external variable to be used in evaluating the discrimination ability of each item and the test as a whole. Several other factors were also considered to be important in the selection of these two classes. The factors included:

1. sample size -- both classes are relatively large (approximately one hundred students each)
2. background of the subjects -- interest in physical fitness
3. convenience
   -- easy to arrange
   -- easy to administer
   -- location (able to provide consistent administrative conditions, instructions, etc.)
4. cost -- no delivery/return costs
5. time -- no delay in getting completed exams back

In total, pilot #1 was administered to 92 Year I and 72 Year II subjects.

Results

Subjective feedback. To acquire useful subjective feedback, direct questions were asked of the examinees (the questions which were included with pilot #1 are shown in Appendix D). This feedback was valuable in detecting ambiguous questions, difficult or inappropriate terminology, and poor distractors, as well as many problems which do not become apparent through the analysis of the item responses. The differences and similarities between the subjective responses of the two criterion groups served to highlight imperfections in item construction and identified potential problem areas.
The feedback, in conjunction with the statistical analysis, was used to make the changes to pilot #1. For example, 38 examinees (16 Year I and 22 Year II) indicated that they did not understand the word "varus" in question number forty-five. The data analysis showed that only 41% of all subjects answered this item correctly and that it discriminated poorly between the Year I and Year II subjects. As a result of the above, and following closer examination of the course objectives, as well as meetings with course leaders, the PFAC decided to replace the item.

It should be noted that many of the examinees commented that it would have been more useful if the subjective feedback questionnaire had been placed at the start of the exam instead of the end. This would allow the examinees to be more critical, as they could read each item and assess it according to the guidelines in the questionnaire. When it was presented at the end, many of the examinees could not answer the questions about the items very well because they did not remember where the problems had been encountered as they did the test.

Statistical analyses - Overview. The data collected from pilot #1 were analyzed on the UBC mainframe computer using the Laboratory of Education Research Test Analysis Package (LERTAP).
The LERTAP package provides statistical information at a variety of levels, including total test, subtest, item, and individually for all four possible responses. It also provides subtest and total test scores for each of the examinees. The output was used to do an analysis of pilot #1 at the item, subtest, and total test levels. As the total test was to be used to determine whether or not individuals have sufficient knowledge to be certified as basic fitness leaders, its most important characteristic is its ability to discriminate between those who are knowledgeable and those who are not. The overall test score must discriminate while minimizing the number of false-negatives and false-positives. The item level analysis is described in the following section (see Appendix H for definitions and more information on the interpretation of item statistics).

Statistical analyses - Item level. The item analysis results are described through the use of two examples of major types of item statistics that can emerge: an item with a high p-value and an item with a low p-value for the correct response (due to the limitations of space in this paper, only two types of items are considered). For each type, the discussion will consist of the following five parts:

1. label the type of item
2. present the original item
3. present the item statistics, including the subjective comments
4. identify the patterns and suggest revisions
5. present the new version of the item

High P-value (Q1, subtest EP, item #1)

                        Correlations              Means
Option    P-value    ST      TT      EC       ST      TT      EC
* 1        96.3      0.04    0.10    0.04    14.43   34.14   0.44
  2         0.0      0.0     0.0     0.0      0.0     0.0    0.0
  3         2.4     -0.03   -0.06    0.02    14.00   31.75   0.50
  4         0.0      0.0     0.0     0.0      0.0     0.0    0.0
  Other     1.2     -0.02   -0.09   -0.10    14.00   29.50   0.00

1. Cardiorespiratory endurance can be best defined as:
* a) The efficiency of the heart and lungs
  b) The ability to sprint 100 metres in 10 seconds
  c) Vital capacity plus residual volume
  d) The mobility of the joint

No subjective comments were made by the Year I group. One Year II examinee referred to the item as too easy. The high p-value (96.3) suggests that perhaps the item was too easy for both the Year I and Year II subjects. This is confirmed by the low (almost zero) correlation with the external criterion and by the mean value for the external criterion of 0.44. Both of these statistics suggest that the correct answer was chosen by approximately equal numbers of Year I and Year II subjects. The fact that virtually everyone selected the correct response also explains the low correlation between the correct answer and both the subtest scores and the total test score. A much higher degree of variance in the scores is necessary before a reasonably high correlation is possible. Berk (1980b) and Allen and Yen (1979) have suggested that an ideal p-value is between 40.0 and 70.0. Distractors two and four both have p-values of zero, indicating that nobody selected them as possible correct answers. As a result, both of them were replaced. Distractor three was selected by 2.4 percent (n = 6) of the subjects.
Unfortunately, four of these were from the Year II group. The negative, but small, correlations with subtest and total test scores indicate that the subjects choosing this distractor are less knowledgeable. Therefore it was assumed that the four Year II subjects who chose distractor three were actually less knowledgeable in this area. This is also shown by comparing the total test mean (31.75) for the subjects who chose distractor three with that of those who chose the correct answer (34.14). This indicates that distractor three is helping to discriminate. That is, these four Year II subjects are not really being classified as false-negatives. After consideration of the subjective comments and the item statistics, distractors two (b) and four (d) were replaced. The new version of the item was ready.

1. Cardiorespiratory endurance can be best defined as:
* a) The efficiency of the heart and lungs
  b) Stroke volume times heart rate
  c) Vital capacity plus residual volume
  d) Cardiac output minus residual volume

Low P-value (Q8, subtest AP, item #4)

                        Correlations              Means
Option    P-value    ST      TT      EC       ST      TT      EC
  1        36.6     -0.13   -0.06   -0.11    12.72   33.55   0.37
* 2        36.0      0.32    0.26    0.28    14.66   36.07   0.63
  3        17.1     -0.23   -0.21   -0.14    11.68   31.29   0.29
  4         8.5     -0.02   -0.03   -0.05    13.07   33.43   0.36
  Other     1.8

8. Ulnar deviation means:
  a) To turn the wrist outwards
* b) To turn the wrist inwards
  c) To turn the wrist up
  d) To turn the wrist down

One subject in the Year II group indicated that the item was confusing. No other comments were made by subjects from this group. Two subjects from the Year I group also found the item confusing. Two others felt there was either no answer or more than one answer. Eight subjects from the Year I group suggested that the term "ulnar deviation" was unfamiliar and presented difficulty.

The correct answer had positive correlations with the subtest, total test, and the external criterion. It also had subtest and total test means which were much higher than those for each of the distractors. All of the distractors had negative correlations with the subtest, total test, and the external criterion. The correlations for distractor 4(d) were not significantly different from zero. That is, it is the only option that does not appear to discriminate between the knowledgeable and those with little or no knowledge in this area. However, even distractor 4(d), as do the other distractors, attracts more Year I than Year II subjects. This was shown by the mean external criterion value of 0.36. In general, it could be stated that this item discriminates quite well not only between the knowledgeable and those without sufficient knowledge, but also between the Year I and Year II subjects.
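The statistics reported for the keyed response to this item can be approximately reproduced from group-level counts. The sketch below assumes that 22 of the 92 Year I subjects and 37 of the 72 Year II subjects answered correctly; these counts are back-calculated from the group p-values listed for item 8 in Table 4 (23.9 and 51.4) and are not taken from the LERTAP output. It also assumes that the external criterion is coded 0 for Year I and 1 for Year II, so that its correlation with a right/wrong score is a phi coefficient. The function and variable names are illustrative only.

```python
import math


def keyed_option_stats(correct_y1, n_y1, correct_y2, n_y2):
    """Combined p-value, mean external criterion, and phi for the keyed option."""
    n_correct = correct_y1 + correct_y2
    n_total = n_y1 + n_y2
    p_value = 100.0 * n_correct / n_total
    # With Year I coded 0 and Year II coded 1 (an assumption), the mean criterion
    # value is the share of Year II subjects among those who chose this option.
    mean_ec = correct_y2 / n_correct

    # 2x2 table: rows are Year II / Year I, columns are correct / incorrect.
    a, b = correct_y2, n_y2 - correct_y2
    c, d = correct_y1, n_y1 - correct_y1
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return p_value, mean_ec, phi


p_value, mean_ec, phi = keyed_option_stats(22, 92, 37, 72)
print(round(p_value, 1), round(mean_ec, 2), round(phi, 2))
```

Under these assumed counts the three values round to 36.0, 0.63, and 0.28, in line with the p-value, the mean external criterion, and the external criterion correlation shown for option 2.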
However, after the PFAC (revision committee) was presented with the above findings, they recommended that the entire item should be replaced. They felt that even though the item discriminated well, the low p-value (36.0) suggested that it was too difficult. That is, they felt that too many false-negative decisions were made. It was also felt that the item was testing for knowledge of terminology that was beyond the scope of the forty-hour course. Knowledge of the term "ulnar deviation" was not stressed as an important objective for the course. The item was replaced by the following item:

8. In anatomical position, the bone located medially in forearm is called the:
  a) radius
* b) ulna
  c) tibia
  d) fibula

The item replacement here re-emphasizes the fact that the item statistics did not entirely dictate the item revision process, but were just used to guide the test constructor in the decision-making process.

Table 4 shows a comparison of the p-values between the Year I and Year II groups for each of the items in pilot #1. For items which discriminate well, the p-value is considerably higher for the Year II group than for the Year I group. For example, item #19 discriminates well, showing a difference of 60.9 between the Year II (75.0) and the Year I (14.1) groups. An example of a poor discriminator is item #25, which has p-values (Year I = 34.8 and Year II = 34.7) which are virtually equal.
The items have been labelled for easy identification of the potential problems. Those labelled with a "*" indicate that the Year II group does have a higher p-value than the Year I group, but by only five or less points. Those labelled with a "?" indicate an item on which the Year I group actually has a higher p-value than the Year II group. Both of these types of situations with p-values can warn the test maker of items which may be ambiguous, incorrectly keyed, poorly worded, or contain poor distractors.

Table 4

Subtest  Item  Year I  Year II        Subtest  Item  Year I  Year II
EP         1    95.7    97.2   *      S         30    54.3    73.6
AP         2    85.9    94.4          EP        31    60.9    61.1   *
AP         3    62.0    43.1   ?      EP        32    76.1    84.7
S          4    10.9    19.4          EP        33    65.2    66.7   *
S          5    12.0    11.1   ?      S         34     6.5    15.3
S          6    20.7    43.1          S         35    48.9    47.2   ?
AP         7    60.9    84.7          AP        36    87.0    95.8
AP         8    23.9    51.4          S         37    57.6    65.3
AP         9    87.0    88.9   *      AP        38     8.7    66.7
EP        10    91.3    97.2          EP        39    73.9    91.7
S         11    60.9    68.1          AP        40    63.0    94.4
S         12    40.2    40.3   *      EP        41    87.0    95.8
EP        13    95.7    95.8   *      AP        42    35.9    31.9   ?
AP        14    84.8    83.3   ?      AP        43    38.0    22.2   ?
EP        15    97.2    73.6   ?      AP        44    25.0    66.7
AP        16    34.8   100.0          AP        45    41.3    40.3   ?
AP        17    59.6    80.6          EP        46    93.5    97.2   *
AP        18    32.6    54.2          S         47    27.2    48.6
AP        19    14.1    75.0          AP        48    63.0    40.3   ?
S         20    25.0    13.9   ?      EP        49    75.0    90.3
EP        21    47.8    69.4          EP        50    91.3    97.2
S         22    17.4    25.0          S         51    66.3    58.3   ?
EP        23    45.7    56.9          S         52    77.2    84.7
AP        24    94.6    88.9   ?      AP        53    54.3    68.1
EP        25    34.8    34.7   ?      EP        54    31.5    34.7   *
EP        26    51.1    73.6          EP        55    68.5    62.5   ?
EP        27    95.7    90.3   ?      S         56    66.3    72.2
AP        28    87.0    77.8   ?      EP        57    28.3    41.7
AP        29    55.4    44.4   ?

1. Year I (N = 92)
2. Year II (N = 72)
3. ? Year I p-value is greater than or equal to Year II
4. * Year II p-value is greater than Year I by 5 or less
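The labelling rule stated in the notes to Table 4 can be written as a small function. This is a sketch only: the function name and the `margin` parameter are hypothetical, and the rows are a handful of values copied from Table 4 rather than the full data set.

```python
def flag_item(p_year1, p_year2, margin=5.0):
    """Return '?', '*', or '' following the labelling rules in the notes to Table 4."""
    if p_year1 >= p_year2:
        return "?"   # Year I did as well as, or better than, Year II
    if p_year2 - p_year1 <= margin:
        return "*"   # Year II ahead, but by five points or less
    return ""        # Year II ahead by more than five points


# (item number, Year I p-value, Year II p-value) for a few rows of Table 4
rows = [
    (1, 95.7, 97.2),
    (8, 23.9, 51.4),
    (19, 14.1, 75.0),
    (25, 34.8, 34.7),
    (54, 31.5, 34.7),
]
flags = {item: flag_item(y1, y2) for item, y1, y2 in rows}
print(flags)
```

An unflagged item is not necessarily a good item; the rule only screens for weak or reversed differences between the two groups, which is why the flags were treated as prompts for closer examination.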
Table 4 also shows which subtest the items belong to with the symbols EP (Exercise Principles for Adult Classes), AP (Anatomy and Physiology plus Nutrition), and S (Safety). This allows for a quick identification of a subtest which may contain many poor items. Overall, 8 items were labelled with a "*", and 17 items were labelled with a "?". This shows that, based on p-values, at least 25 out of 57 items needed further examination for possible revisions or deletions.

Statistical analyses - Subtest (ST) and Total Test (TT). The ST and TT statistics provided very useful information for the test construction process. These included means, standard deviations, low score, and high score for each subtest and for the total test, as shown in Table 5 for pilot #1. The Year II group had a higher mean score than the Year I group on the total test (65% and 56%, respectively), and on each of the subtests. This can be interpreted as lending support to the validity of the test, as the majority of the subjects would be categorized accurately by the total test scores. Subtest three (Safety) appeared to be the most difficult, or at least it had the lowest success rate, with both groups scoring under fifty percent. This indicates that some problems exist either in the items, in the list of objectives, or in the assumptions about these subjects.
Closer examination of the item level statistics showed that there were problems with the Safety items.

Table 5
Descriptive Statistics for Pilot #1

                                   Year I         Year II        Combined
Total Test
  # of observations                92             72             164
  mean                             31.76 = 56%    36.92 = 65%    34.02 = 60%
  s                                 4.98           5.64           5.85
  low                              20             26             20
  high                             45             48             48
  # of items                       57             57             57
Subtest #1: Exercise Principles
  mean                             13.86 = 69%    15.13 = 76%    14.41 = 72%
  s                                 1.91           2.37           2.21
  low                              11              9              9
  high                             18             19             19
  # of items                       20             20             20
Subtest #2: Anatomy and Physiology
  mean                             11.99 = 55%    14.93 = 68%    13.28 = 60%
  s                                 2.93           2.81           3.22
  low                               5              8              5
  high                             20             20             20
  # of items                       22             22             22
Subtest #3: Safety
  mean                              5.91 = 39%     6.86 = 46%     6.33 = 42%
  s                                 1.88           2.18           2.07
  low                               2              3              2
  high                             10             13             13
  # of items                       15             15             15

The dispersion of scores, as represented by the standard deviation, was very similar for both groups in all three subtests and for the total test. Also, no major differences were detected between groups on the low or high scores in each category.

Table 6 also deals with subtest and total test statistics. It gives the correlations between subtests, total test, and an external criterion (Year I or Year II) for pilot #1. All correlations are positive, as expected, which shows that there are no major problems with the subtests. All correlations with the total test are 0.67 or higher for the combined groups, which again supports the reliability of the total test score as well as the subtests.
The lowest correlations are with subtest three (Safety). This also supports the hypothesis, presented above, that some of the Safety items need revision or replacement. The correlations with the external criterion are also positive and significant, thus lending support to the validity of the test. The scores place most people in the same categories as the groupings in the external criterion variable.

The above statistics for pilot #1 provide additional evidence for the validity and reliability of the test. If negative or zero correlations had been found, then major revisions may have been necessary. This suggests that the care taken in the earlier stages of the test construction had proved to be worthwhile.

Table 6
Correlations for Pilot #1

Combined Groups
        EP      AP      S       TT      EC
EP      1.0     0.50    0.19    0.72    0.29
AP              1.0     0.44    0.90    0.46
S                       1.0     0.67    0.23
TT                              1.0     0.44
EC                                      1.0

Year I Group
        EP      AP      S       TT
EP      1.0     0.37    0.05    0.62
AP              1.0     0.42    0.89
S                       1.0     0.64
TT                              1.0

Year II Group
        EP      AP      S       TT
EP      1.0     0.52    0.21    0.76
AP              1.0     0.36    0.86
S                       1.0     0.66
TT                              1.0

Item revision or replacement was done with a variety of factors taken into consideration. Revision and replacement of the item stems, distractors, correct responses, and sometimes even the complete item, were systematically considered. The 57 items in pilot #1 varied in their patterns for p-values, correlations, and mean scores. The resulting large number of item statistics patterns makes it impossible to discuss the decisions that were made on each one of the fifty-seven items.
However, the examples provided (and Appendix H) should provide sufficient evidence for the types of decisions that were made. It should be noted at this point that the statistical analyses were used only as a guide to item revision or replacement, as item statistics should not be used as the sole basis for the decision-making process. Other factors needed to be considered before a decision, for change or against it, was made. Several of the factors are discussed below.

Test conditions can greatly influence the statistics that will result. For example, if one group has plenty of time to do the test and another group is rushed, the results can be very different. Pilot #1 was administered under similar conditions to both groups; the subjects had no previous warning of the test and were given 50 minutes (in a classroom) to complete the test.

Along with the test conditions, the appropriateness of the subjects can also influence the way the results need to be interpreted. In pilot #1 the Year II group turned out to be only slightly more knowledgeable than the Year I group. This also reflected the attitude of the students in the Year II group; these second year students had already been subjected to numerous research projects. Also, the Year II group appeared to be only slightly more knowledgeable because of the specificity of the terminology in the area being tested.

The purpose of the test and the eventual interpretation of the test scores are also factors that can influence decisions regarding item revision or replacement. In the case of this test, the results were to be used as a basis for certification of Fitness Instructors.
As the certification program was to be implemented on a voluntary basis, any decision made was greatly influenced by the ability of the test items to discriminate with a very low number of false-positives and false-negatives. Too many false-positives would have resulted in certified instructors without adequate knowledge; this implies that the objectives of the program, to improve the quality of instruction and to reduce injuries, would not be met. On the other hand, too many false-negatives would have resulted in many frustrated subjects who did have sufficient knowledge, but were unable to become certified. This would result in a very quick collapse of this voluntary certification program.

Another factor that influenced the decision-making process during test revisions was the knowledge of the experts in the area being tested. In some cases there was a subjective desire, by one or more experts, to include an item without revisions, even though the statistical analysis showed that problems existed. As noted earlier, UBC acted as consultants in the test construction process; the final decisions were made by the BCRA-Fitness Branch. Several items from pilot #1 which resulted in "poor" statistics were retained as a result of expert judgement.
Some of these items were later discarded or revised after pilot #2, while some of them showed "better" statistics with pilot #2.

All of the above factors were accounted for during test revisions. Also, to eliminate individual biases, the test revisions were done by a committee which consisted of several individuals who were experts in the subject matter (PFAC members) and one member who had enough statistical knowledge to interpret the computer printouts and guide the committee.

This concluded the analysis of pilot test #1. In all, 22 items were revised, 22 items were replaced, no items were deleted, 3 items were added, and 13 items were left unchanged. The first draft of pilot test #2, a 60-item test, was constructed.

Pilot Test #2

Item-objective Congruence

In the next stage of the validation procedure, eight members of the PFAC were asked to judge how well each of the items could be matched to a particular objective from the forty-hour course. Each judge was asked to rate the item-objective congruence for each of the 60 items. The domain being tested was specified by giving the name of the subtest to which the item belonged. Space was also provided beside each rating for any additional comments. The Item-objective Congruence sheet which each reviewer received can be found in Appendix C. The reviewers were also asked to complete the Subjective Feedback questionnaires that the examinees had completed at the end of pilot #1. The results of the ratings, in conjunction with the subjective comments, both provided by the judges, were used to construct the final draft of pilot #2. The test consisted of 60 items.
Administration

The sample used for administering pilot #2 was more representative of the specific target population than the subjects used for pilot #1; these subjects were all intending to become certified fitness instructors. The test was administered at various locations throughout British Columbia at which the accredited forty-hour course for fitness instructors was being taught. Some groups wrote the exam at the commencement of the course (uninstructed), while others wrote it at the end of the course (instructed). These groups were then put into two categories, Year I and Year II, respectively. This external criterion provided an even more diverse group than in pilot #1 and, therefore, can be considered a good test of the exam's discriminating ability. Overall, pilot #2 was administered to 200 subjects, 94 Year I and 106 Year II.

Statistical Analyses

Statistics were obtained through the LERTAP package. As for pilot #1, the items were examined at the item, subtest, and total test levels with respect to p-values, correlations, and means. A discussion of these indices is omitted at this point because of their similarity to those in pilot #1. The actual values of these indices can be found in Appendix I, Tables I-1, I-2, and I-3. However, as well as repeating the same statistics, several new analyses were also done to help determine the reliability of the test. These were postponed until this stage in the test construction because of the need to initially establish the validity of the test.
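As a rough illustration of the item-level indices mentioned above, the following sketch computes item difficulty (p-values) and an item-total correlation from a matrix of dichotomously scored responses. It is not the LERTAP procedure itself: the function names and the toy response matrix are hypothetical, and correlating each item with the total of the remaining items is an assumption made here for illustration.

```python
from math import sqrt

def p_values(scores):
    """Proportion of examinees answering each item correctly (item difficulty)."""
    n = len(scores)
    k = len(scores[0])
    return [sum(row[j] for row in scores) / n for j in range(k)]

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either sequence has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def item_total_correlations(scores):
    """Correlate each item with the total score on the remaining items."""
    k = len(scores[0])
    result = []
    for j in range(k):
        item = [row[j] for row in scores]
        rest = [sum(row) - row[j] for row in scores]
        result.append(pearson(item, rest))
    return result

# Hypothetical data: 6 examinees (rows) by 4 items (columns); 1 = correct, 0 = incorrect.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]

difficulty = p_values(toy_scores)
discrimination = item_total_correlations(toy_scores)
```

An item with a p-value near 1.0 is answered correctly by almost everyone, and an item with a low or negative item-total correlation is a candidate for revision, which is the kind of evidence the committee weighed alongside expert judgement.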
Reliability

Many different reliability indices have been suggested by various authors (see Appendix E). Because of the similarities between many of these indices and the ready access to LERTAP, only three indices were used here. These are: Hoyt's Estimate of Reliability, Cronbach's Composite Alpha, and the Standard Error of Measurement (SEM). Each of these indices is defined and discussed below. Others, such as the Kappamax and the P, are discussed in more detail earlier in this paper (also see Berk, 1980a). The three indices were examined at this point to establish reliability within the subtests and for the total test. Another estimate of reliability which was also used here is the examination of the correlations between the subtests themselves and with the total test. For a discussion of this procedure refer back to pilot #1.

Hoyt's Estimate of Reliability. Hoyt's r is an estimate of the internal consistency of the total test, which ignores the subtests and treats the items as equal, but separate, entities. The values calculated for pilot #2 are all positive and significant as shown in Appendix I, Table I-1, with the estimate of reliability for the whole test being 0.86. This implies that in general the items are testing one body of knowledge.
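Hoyt's estimate is usually described as an analysis-of-variance approach to internal consistency. The sketch below assumes the standard persons-by-items formulation, r = 1 - MS(residual) / MS(persons), which is algebraically equal to coefficient alpha; whether LERTAP computes exactly this is not stated in the text. The function names and the toy data are hypothetical.

```python
def hoyt_reliability(scores):
    """Hoyt's r from a persons-by-items ANOVA: 1 - MS_residual / MS_persons."""
    n = len(scores)       # number of examinees
    k = len(scores[0])    # number of items
    grand = sum(sum(row) for row in scores) / (n * k)
    person_means = [sum(row) / k for row in scores]
    item_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_persons = k * sum((m - grand) ** 2 for m in person_means)
    ss_items = n * sum((m - grand) ** 2 for m in item_means)
    ss_residual = ss_total - ss_persons - ss_items

    ms_persons = ss_persons / (n - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))
    return 1.0 - ms_residual / ms_persons

def sample_variance(values):
    """Unbiased sample variance (n - 1 in the denominator)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def coefficient_alpha(scores):
    """Coefficient alpha over items, used here only as a cross-check."""
    k = len(scores[0])
    item_vars = [sample_variance([row[j] for row in scores]) for j in range(k)]
    total_var = sample_variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

# Hypothetical data: 6 examinees by 4 dichotomously scored items.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]

hoyt_r = hoyt_reliability(toy_scores)
alpha = coefficient_alpha(toy_scores)
```

The two functions agree on the same data, which is why a high Hoyt's r is read as evidence that the items are measuring one body of knowledge.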
Estimates for the individual subtests are also positive and greater than 0.45. This would be somewhat low for a traditional test, but not for a CR test, which has less variance. The lack of variance, especially in the subtests which have only 15 to 20 items, restricts the possible range of the reliability index.

Standard Error of Measurement. The SEM is an estimate of the standard deviation that would occur "for a specific examinee over repeated independent testings with the same test or parallel tests" (Allen and Yen, 1979, p. 88). It treats the observed test score as having a true part and an error part, with the SEM being an estimate of the error portion in the observed score. In pilot #2 the SEM is only 3.29 for the total test with both groups combined. This value is quite small considering there are 60 items in the test. It shows that 68% of the time, the interval formed by an individual's observed score plus or minus 3.29 will contain the individual's true score. The SEMs for all three subtests are also relatively small compared to the number of items in each subtest. Again, this demonstrates the small errors in measurement; that is, it lends support to the relatively high level of reliability, especially for the total test.

Cronbach's Composite Alpha.
Cronbach's Composite Alpha is only available for the total test, not for the subtests. It does not consider the item-level differences as in Hoyt's index; instead, only the subtest scores are considered as separate entities and an index is calculated to show how well the subtests hold together. That is, the index shows to what degree the subtests are measuring a similar body of knowledge. The value of this index for pilot #2 is 0.78. As with the other measures, this also supports the reliability of the test.

Overall, these three indices show support for the total test score reliability. The subtests have a varying degree of reliability, with the most reliable being subtest two (Anatomy and Physiology) and the least reliable being subtest three (Safety). These results also correspond to the correlations between the subtests themselves and between the subtests and the total test as shown in Appendix I, Table I-2. Subtest three correlates lowest (0.77) with the total test, and its correlations with the other two subtests (0.53 with subtest one (EP) and 0.62 with subtest two (AP)) are lower than the correlation between those two subtests (0.70). The combination of the correlations and the reliability indices demonstrates the reliability of the total test score.

This concluded the analysis of pilot #2 and a committee was again formed to revise pilot #2 based on these results.
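The SEM and composite alpha discussed above can be illustrated with a short sketch. It assumes the classical relation SEM = SD x sqrt(1 - r) and treats composite alpha as coefficient alpha computed over subtest scores rather than items; the text does not confirm that LERTAP uses exactly these formulas. The reliability (0.86) and SEM (3.29) are the reported pilot #2 values, while the observed score of 40, the subtest scores, and all function names are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1.0 - reliability)

def score_band(observed, sem, z=1.0):
    """Interval observed +/- z * SEM (z = 1 gives the roughly 68% band)."""
    return (observed - z * sem, observed + z * sem)

def sample_variance(values):
    """Unbiased sample variance (n - 1 in the denominator)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def composite_alpha(subtest_scores):
    """Alpha with each subtest score treated as one part of the total score."""
    k = len(subtest_scores[0])
    part_vars = [sample_variance([row[j] for row in subtest_scores]) for j in range(k)]
    total_var = sample_variance([sum(row) for row in subtest_scores])
    return k / (k - 1) * (1.0 - sum(part_vars) / total_var)

# Reported pilot #2 values for the total test.
reported_reliability = 0.86
reported_sem = 3.29

# Total-score standard deviation implied by the assumed formula (about 8.8).
implied_sd = reported_sem / sqrt(1.0 - reported_reliability)

# Hypothetical examinee with an observed score of 40 out of 60.
low, high = score_band(40, reported_sem)

# Hypothetical subtest scores: 5 examinees by 3 subtests.
toy_subtests = [
    [12, 15, 8],
    [10, 12, 7],
    [15, 18, 10],
    [8, 9, 6],
    [13, 14, 9],
]
toy_composite_alpha = composite_alpha(toy_subtests)
```

Under these assumptions the hypothetical examinee's band runs from about 36.7 to 43.3, which is the sense in which an SEM of 3.29 is small relative to a 60-item test.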
In all, only one item was revised, no items were replaced or added, 15 items were deleted, and 44 items were left unchanged. Pilot #3, a 45-item test, was constructed.

Pilot Test #3

Pilot #3 consisted of only 45 items. Deleting 15 items meant that only the best discriminating items were retained. This test conformed to the original restriction to keep the test to under fifty items.

Cut-off Score

The number 45 was not randomly chosen, but was carefully selected by the PFAC during the final phase of the test construction in which UBC was involved. This phase involved setting a cut-off score for the total test that would determine whether a subject passed or failed. Appendix J shows a table that helped the committee in their decision-making process. It shows, based on the sample size and statistics from pilot #2, the number of false-negatives and false-positives that would result with different cut-off scores, from 50% to 90%, for tests with 60, 46, and 36 items. This is similar to a procedure described earlier in this paper and also in Berk (1980a), where he suggests calculating validity coefficients to find the optimal cut-off score.

For all three test lengths the table shows that a low cut-off score, such as 50%, results in more false-positives than a higher cut-off score of approximately 80%.
A low cut-off score means that most people will pass and that many may not actually have sufficient knowledge. The opposite is true if the cut-off score is set too high; very few will pass and, as a result, many subjects who have the knowledge will still not receive certification. There are also the previously mentioned practical aspects, such as ensuring that enough people become certified, rather than frustrated, so that this volunteer certification program will not collapse.

After several meetings with the PFAC and the BCRA-Fitness Branch it was decided that a false-positive was more serious than a false-negative, because it would result in certification of the unqualified fitness instructor that the program is trying to eliminate. On the other hand, the majority of the committee felt that the public would not accept such an extremely high cut-off score as 75% or higher. Both the 60-item and 36-item tests would require this in order to keep the number of false-positives low. The 46-item test would keep the false-positives low even at a 65% cut-off score (only two false-positives); however, the PFAC felt that the public would infer that the test was too easy. Therefore, the cut-off score was set at 70%, which results in the same number of false-positives, but many more false-negatives than 65% (82 compared with 69).
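The trade-off described above can be sketched as a simple count of classification errors at different cut-off scores. The classification rule is an assumption based on the instructed/uninstructed external criterion: an uninstructed subject who passes is counted as a false-positive and an instructed subject who fails as a false-negative. The scores and function names are hypothetical and are not the pilot #2 data.

```python
from math import ceil

def passing_score(n_items, cutoff_percent):
    """Smallest number of correct responses that meets the percentage cut-off."""
    return ceil(n_items * cutoff_percent / 100)

def classification_errors(uninstructed, instructed, n_items, cutoff_percent):
    """Return (false_positives, false_negatives) for one cut-off score."""
    needed = passing_score(n_items, cutoff_percent)
    false_positives = sum(1 for score in uninstructed if score >= needed)
    false_negatives = sum(1 for score in instructed if score < needed)
    return false_positives, false_negatives

# Hypothetical raw scores on a 45-item test.
uninstructed_scores = [20, 25, 28, 31, 33]
instructed_scores = [30, 31, 34, 36, 40, 41]

errors_by_cutoff = {
    percent: classification_errors(uninstructed_scores, instructed_scores, 45, percent)
    for percent in (50, 70, 90)
}
```

With these toy scores, raising the cut-off from 50% to 90% removes the false-positives but increases the false-negatives, which mirrors the pattern the committee read from the table in Appendix J. The same helper gives 32 correct responses for a 70% cut-off on a 45-item test.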
The test length was set at 45 items (one item was deleted just to round off the number) for pilot #3. For the 45-item test a passing score of 70% corresponds to 32 correct responses (31.5 rounded up).

Results

Pilot #3 was constructed, in conjunction with the PFAC, after a statistical and subjective analysis of the data obtained from the administration of pilot #2. The statistics derived were similar to those obtained for pilot #1, with the addition of the reliability indices and the setting of the cut-off score. In all, 15 items were deleted from pilot #2, one item was revised, and no items were added. The various indices showed strong support for the reliability of the total test score and the cut-off score was set at a point which demonstrated maximum validity. It was concluded that pilot #3, a 45-item CR test, with a cut-off score of 32 (70%), was ready to be used in the Basic Fitness Leader certification program.

Summary and Recommendations

Summary

Pilot test #3 was the "finished" CR test that the BCRA-Fitness Branch started using in the certification process for fitness instructors. This ended UBC's involvement in the project and the BCRA-Fitness Branch was satisfied that pilot #3 was sufficiently valid and reliable. A copy of pilot #3 can be found in Appendix K.
The procedures followed were close to the ones proposed in Table 2, with the exception of the time line being extended slightly due to unforeseen delays. For easy reference, a summary of the actual test construction procedures that were implemented in this project can be found in Table 7. A summary of the changes in the items (revisions, replacements, etc.) is shown in Table 8. For example, Table 8 shows that item number one was revised after pilot #1, but then was not changed for pilot #3. Some items (e.g., 3, 4, and 11) were revised or replaced when pilot #2 was constructed and then deleted for pilot #3. Other items, such as 14, 16, and 28, were written well enough for pilot #1 and did not have to be changed for pilot #2 or pilot #3.

Overall, when pilot #2 was constructed from the analysis of the data obtained from pilot #1, 22 items were revised, 22 items were replaced, no items were deleted, 3 items were added, and only 13 items were left unchanged. When pilot #3 was constructed from the analysis of the data obtained from pilot #2, only one item was revised, no items were replaced or added, 15 items were deleted, and 44 items were left unchanged. These figures demonstrate the general improvement in the quality of the test items. The number of test items was increased from 57 in pilot #1 to 60 in pilot #2, and then reduced to 45 for pilot #3.
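The item counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the tallies given in the text; the dictionary keys and function names are hypothetical.

```python
def accounted_items(changes):
    """Items from the previous pilot covered by the tally (added items are new)."""
    return (changes["revised"] + changes["replaced"]
            + changes["deleted"] + changes["unchanged"])

def next_test_length(previous_length, changes):
    """Length of the new pilot after deletions and additions."""
    return previous_length - changes["deleted"] + changes["added"]

# Tallies reported for constructing pilot #2 from pilot #1.
pilot2_changes = {"revised": 22, "replaced": 22, "deleted": 0, "added": 3, "unchanged": 13}
# Tallies reported for constructing pilot #3 from pilot #2.
pilot3_changes = {"revised": 1, "replaced": 0, "deleted": 15, "added": 0, "unchanged": 44}

pilot1_length = 57
pilot2_length = next_test_length(pilot1_length, pilot2_changes)  # 60
pilot3_length = next_test_length(pilot2_length, pilot3_changes)  # 45
```

Each tally accounts for every item in the preceding pilot (57 and 60 respectively), so the reported figures are internally consistent.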
Table 7
Summary of Actual Procedures

List of objectives received from the BCRA-Fitness Branch
Rating and ranking of Specific Areas (reduced from 8 to 3)
Guidelines for submitting items mailed out
  - item generation forms
Items submitted
Constructed first draft of Pilot #1
  - randomized items and correct responses
  - presented to the PFAC for feedback
Constructed Pilot #1 with PFAC
  - correct grammar
  - item revision
Administered Pilot #1 at UBC (Physical Education)
  - 92 subjects in Year I
  - 72 subjects in Year II
  - subjective questionnaire attached to test
Statistical analyses for Pilot #1
  - informal examinee feedback
  - item difficulty (p-values)
  - item discrimination
      max possible gain (response means, r)
      combined groups (item r with ST and TT)
      item criterion (partial r)
  - descriptive statistics (ST and TT level)
      mean
      standard deviation
      low, high scores
      number of items
  - correlations (ST, TT, and EC)
Constructed first draft of Pilot #2
Presented results to the PFAC
  - subjective feedback
  - item-objective congruence (content validity)
Constructed Pilot #2 with PFAC
Administered Pilot #2
  - 94 uninstructed subjects
  - 106 instructed subjects
Statistical analyses for Pilot #2
  - as for Pilot #1 plus the following
  - set a cut-off score for TT mastery/nonmastery
  - reliability
      Hoyt's r
      S.E.M.
      threshold loss errors (false-neg/pos)
  - validity
      criterion-related
      decision validity
Presented results to the PFAC
Constructed Pilot #3 with the PFAC
Presented Pilot #3 to the BCRA-Fitness Branch

Table 8
Summary of Item Revision from Pilot One to Three

     Pilot Number               Pilot Number
One     Two     Three      One     Two     Three
 1      REV     NC         31      REV     NC
 2      REV     NC         32      REV     NC
 3      REV     DEL        33      REV     NC
 4      REP     DEL        34      REV     NC
 5      REP     NC         35      REP     DEL
 6      REV     NC         36      NC      NC
 7      REV     NC         37      NC      NC
 8      REP     NC         38      REP     NC
 9      REV     NC         39      REV     NC
10      REP     NC         40      REV     DEL
11      REV     DEL        41      REV     NC
12      REP     DEL        42      REP     NC
13      REV     NC         43      REP     NC
14      NC      NC         44      REP     NC
15      REP     NC         45      REP     NC
16      NC      NC         46      REP     DEL
17      REV     NC         47      REP     NC
18      REP     NC         48      NC      DEL
19      REV     NC         49      REP     NC
20      REP     NC         50      REV     NC
21      NC      DEL        51      REV     NC
22      REP     NC         52      NC      NC
23      REP     NC         53      NC      NC
24      NC      DEL        54      REP     DEL
25      REV     REV        55      REV     NC
26      NC      DEL        56      REP     DEL
27      REV     NC         57      NC      NC
28      NC      NC         58      ADD     NC
29      NC      NC         59      ADD     NC
30      REP     DEL        60      ADD     DEL

                                  Pilot Two   Pilot Three
# of items in last pilot             57          60
# of items in new pilot              60          45
# of items revised (REV)             22           1
# of items replaced (REP)            22           0
# of items deleted (DEL)              0          15
# of items added (ADD)                3           0
# of items not changed (NC)          13          44

Future plans for the BFTE include periodically substituting a few similar, but new, items into the test, with constant monitoring of the test score validity and reliability. In this way, it is hoped that a large pool of valid and reliable items will be generated. The plan also includes a desire to increase the item pool until it is large enough to generate equivalent forms of the test randomly. To date, only a few new items have been added to the item pool and the BFTE has been administered to about one hundred subjects, with ninety of these people receiving certification as a Basic Fitness Leader Level I.
The future of the BFTE and the implementation program looks very optimistic.

Recommendations

Although the test has been given to the BCRA-Fitness Branch for use in the certification model, this does not imply that the test construction process is complete. The test needs to be continually monitored to ensure that it is valid and reliable, and that the cut-off score is still appropriate. A change in the number of false-positives or false-negatives may indicate that the cut-off score needs to be adjusted.

Future projects of this nature need to ensure several points in order to maintain control of the test construction procedures and thereby produce a quality product. Firstly, the list of objectives should be developed in conjunction with the test constructor in order to establish a well-defined domain of objectives that can be expressed in behavioural terms and measured at the appropriate cognitive level (domain specifications). This allows the test constructor to establish content validity during the early stages of the test construction. Secondly, the test constructor should maintain final approval over decisions regarding item revision and the setting of cut-off scores. If this is not the case, many other factors, such as the public's rejection of low cut-off scores in this project, can interfere with the decision-making process. Lastly, when an external criterion is to be used (such as instructed-uninstructed groups) to establish construct validity, the test constructor should ensure that two diverse groups actually exist.

References

Aiken, L. R. (1979). Relationships between the item difficulty and discrimination indexes. Educational and Psychological Measurement, 39(4), 821-824.

Allen, M. J. & Yen, W. M. (1979). Introduction to measurement theory. Monterey, California: Brooks/Cole Publishing.

Berk, R. A. (1976). Determination of optional cutting scores in criterion-referenced measurement. Journal of Experimental Education, 45, 4-9.

Berk, R. A. (1978). Consumer's guide to CRT item statistics. Measurement in Education, 9(1).

Berk, R. A. (1980a). A framework for methodological advances in criterion-referenced testing. Applied Psychological Measurement, 4, 563-573.

Berk, R. A. (1980b). A consumer's guide to criterion-referenced test reliability. Journal of Educational Measurement, 10, 159-170.

Berk, R. A. (1980c). Criterion-referenced measurement: The state of the art. Baltimore: Johns Hopkins University Press.

Block, R. D., Mislevy, R., & Woodson, C. (1982). The next stage in educational assessment. Educational Researcher, 11(3), 4-11, plus 16.

Brennan, R. L. (1980). Applications of generalizability theory. In R. A. Berk (Ed.), Criterion-referenced measurement: The state of the art. Baltimore: Johns Hopkins University Press.

British Columbia Ministry of Education. (1979). British Columbia Assessment of Physical Education. Victoria, B.C.: Queen's Printer.

British Columbia Ministry of Education. (1980). Secondary Physical Education Curriculum and Resource Guide. Victoria, B.C.: Queen's Printer.

Canadian Association of Sport Sciences. Fitness appraisal certification and accreditation program. Manuscript in progress.

Chester, W. H., Aiken, M. C., & Popham, W. J. (Eds.). (1974). Problems in criterion-referenced measurement. CSE Monograph Series in Evaluation, 3.

Cochran, W. G. (1963). Sampling techniques (2nd ed.). New York: Wiley.

Cohen, J. (1983). The cost of dichotomization. Applied Psychological Measurement, 7(3), 249-254.

Cranton, P. A. (1976). An introduction to criterion-referenced measurement. Canadian Journal of Education, 1(4), 83-92.

Cronbach, L. J. (1971). Validation of educational measures. In R. L. Thorndike (Ed.), Educational Measurement (2nd ed.). Washington: American Council on Education.

Darst, P. W. & Steeves, D. (1980). A competency based approach to secondary student teaching in physical education. Research Quarterly for Exercise and Sport, 51(2), 274-285.

De Gruijter, D. N. M. & Hambleton, R. K. (1983). Using item response models in criterion-referenced test item selection. In R. K. Hambleton (Ed.), Applications of item response theory (pp. 123-141). Vancouver, British Columbia: Hemlock Printers.

Eignor, D. R. & Cook, L. L. (1981). [Review of A criterion-referenced measurement model with corrections for guessing and carelessness]. Applied Psychological Measurement, 5(1), 137-139.

Fitzpatrick, A. R. (1983). The meaning of content validity. Applied Psychological Measurement, 7(1), 3-13.

Glaser, R. (1963). Instructional technology and the measurement of learning outcomes. American Psychologist, 18, 519-521.

Glaser, R. & Bond, L. (Eds.). (1981). Testing: Concepts, policy, practice, and research. American Psychologist, 36(10).

Green, K. E. (1983). Subjective judgement of multiple choice item characteristics. Educational and Psychological Measurement, 43(2), 563-570.

Haladyna, T. M. & Roid, G. H. (1981). The role of instructional sensitivity in the empirical review of criterion-referenced test items. Journal of Educational Measurement, 18(1), 39-54.

Haladyna, T. M. & Roid, G. H. (1983). A comparison of two approaches to criterion-referenced test construction. Journal of Educational Measurement, 20(3), 271-282.

Hambleton, R. K. (1980). Contributions to criterion-referenced testing technology: An introduction. Applied Psychological Measurement, 4, 421-424.

Hambleton, R. K. (1983). Applications of item response models to criterion-referenced assessment. Applied Psychological Measurement, 7(1), 33-44.

Hambleton, R. K. (Ed.). (1980). Contributions to criterion-referenced testing technology. Applied Psychological Measurement, 4, 421-575.

Hambleton, R. K. (Ed.). (1982). Item response theory. Applied Psychological Measurement, 6(4), 373-492.

Hambleton, R. K. (Ed.). (1983). Applications of item response theory. Vancouver, British Columbia: Hemlock Printers.

Hambleton, R. K. & De Gruijter, D. N. M. (1983). Application of item response models to criterion-referenced test item selection. Journal of Educational Measurement, 20(4), 355-368.

Hambleton, R. K. & Eignor, D. R. (1978). Guidelines for evaluating criterion-referenced tests and test manuals. Journal of Educational Measurement, 15(4), 321-327.

Hambleton, R. K., Mills, C. N., & Simon, R. (1983). Determining the length for criterion-referenced tests. Journal of Educational Measurement, 20(1), 27-38.

Hambleton, R. K., Swaminathan, H., & Algina, J. (1978). Criterion-referenced testing and measurement: A review of technical issues and developments. Review of Educational Research, 48, 1-47.

Hambleton, R. K. & Van der Linden, W. J. (Eds.). (1982). Advances in item responses and applications. Applied Psychological Measurement, 6(4), 373-473.

Huynh, H. (1976). On the reliability of decisions in domain-referenced testing. Journal of Educational Measurement, 13, 253-264.

Huynh, H. (1977). The kappamax reliability index for decisions in domain-referenced testing. Paper presented at the annual meeting of the American Educational Research Association, New York.

Huynh, H. (1979). Statistical inference for two reliability indices in mastery testing based on the beta-binomial model. Journal of Educational Statistics, 4(3), 231-246.

Kane, M. T. (1982). A sampling model for validity. Applied Psychological Measurement, 6(2), 125-160.

Linn, R. L. (1980). Issues of validity for criterion-referenced measures. Applied Psychological Measurement, 4, 547-561.

Livingston, S. A. (1972). Criterion-referenced applications of classical test theory. Journal of Educational Measurement, 9, 13-26.

Livingston, S. A. (1977). Psychometric techniques for criterion-referenced testing and assessment. In J. D. Cone and R. P. Hawkins (Eds.), Behavioral assessment: New directions in clinical psychology. New York: Brunner/Mazel.

Livingston, S. A. (1980). Comments on criterion-referenced testing. Applied Psychological Measurement, 4, 575-581.

Livingston, S. A. & Wingersky, M. (1979). Assessing the reliability of tests used to make pass/fail decisions. Journal of Educational Measurement, 16(4), 247-260.

Lord, F. M. (1959). Tests of the same length have the same standard error of measurement. Educational and Psychological Measurement, 17, 510-521.

Macready, G. B. & Dayton, C. M. (1980). The nature and use of mastery models. Applied Psychological Measurement, 4, 493-516.

Marshall, J. L. (1976). The mean split-half coefficient of agreement and its relation to other test indices: A study based on simulated data (Technical Report No. 350). Madison, WI: Wisconsin Research and Development Centre for Cognitive Learning.

Martuza, V. R. (1977). Applying norm-referenced and criterion-referenced measurement in education. Boston: Allyn and Bacon.

Millman, J. (1974). Criterion-referenced measurement. In W. J. Popham (Ed.), Evaluation in education: Current applications. Berkeley, CA: McCutchan.

Millman, J. (1979). Reliability and validity of criterion-referenced test scores. In R. E. Traub (Ed.), New Directions for testing and measurement: Methodological Developments. San Francisco, CA: Jossey-Bass.

Mills, C. (1983). A comparison of three methods of establishing cut-off scores on criterion-referenced tests. Journal of Educational Measurement, 20(3), 283-292.

Nitko, A. J. (1980). Distinguishing the many varieties of criterion-referenced tests. Review of Educational Research, 50(3), 461-485.

Poizner, S. B., Nicewander, W. A., & Gettys, C. F. (1978). Alternative response and scoring methods for multiple choice items: An empirical study of probabilistic and ordinal response models. Applied Psychological Measurement, 2, 83-96.

Popham, W. J. & Husek, T. R. (1969). Implications of criterion-referenced measurement. Journal of Educational Measurement, 6, 1-9.

Rogosa, D. R. & Willett, J. B. (1983). Demonstrating the reliability of the difference score in the measurement of change. Journal of Educational Measurement, 355-344.

Safrit, M. J. & Stamm, C. L. (1980). Reliability estimates for criterion-referenced measures in the psychomotor domain. Research Quarterly for Exercise and Sport, 51(2), 359-368.

Shepard, L. (1980). Standard setting issues and methods. Applied Psychological Measurement, 4, 447-467.

Shiflett, B. & Schuman, B. J. (1982). A criterion-referenced test for archery. Research Quarterly for Exercise and Sport, 53, 330-335.

Smith, J. K. (1982). Converging on correct answers: A peculiarity of multiple choice items. Journal of Educational Measurement, 19(3), 211-220.

Stacking, M. L. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7(2), 201-210.

Stamm, C. L. & Moore, J. E. (1980). Applications of generalizability theory in estimating the reliability of a motor performance test. Research Quarterly for Exercise and Sport, 51(2), 382-388.

Stricker, L. J. (1982). Identify items that perform differentially in population subgroups: A partial correlation. Applied Psychological Measurement, 6(3), 261-274.

Subkoviak, M. J. (1976). Estimating reliability from a single administration of a mastery test. Journal of Educational Measurement, 13, 265-276.

Subkoviak, M. J. (1978). Empirical investigation of procedures for estimating reliability for mastery tests. Journal of Educational Measurement, 15(2), 111-116.

Subkoviak, M. J. (1980). Decision-consistency approaches. In R. A. Berk (Ed.), Criterion-referenced measurement: The state of the art. Baltimore: Johns Hopkins University Press.

Traub, R. E. & Rowley, G. L. (1980). Reliability of test scores and decisions. Applied Psychological Measurement, 4, 517-545.

Van der Linden, W. J. (1980). Decision models for use with criterion-referenced tests. Applied Psychological Measurement, 4, 469-492.

Van der Linden, W. J. (1981). A latent trait look at pretest-posttest validation of criterion-referenced test items. Review of Educational Research, 51(3), 379-402.

Weltman, A. & Regan, J. (1982). A reliable method for the measurement of constant load maximal endurance performance on the bicycle ergometer. Research Quarterly for Exercise and Sport, 53(2), 180-183.

Wilcox, R. A. (1980). Determining the length of a criterion-referenced test. Applied Psychological Measurement, 4, 425-446.

Wilcox, R. A. (1981). [Review of Criterion-referenced measurement: The state of the art]. Applied Psychological Measurement, 5(1), 133-135.

Wilcox, R. R. (1979). On false-positive and false-negative decisions with a mastery test. Journal of Educational Statistics, 4(1), 59-73.

Wilson, G. (1980). The Construction of a Criterion-Referenced Physical Education Knowledge Test. Master's Thesis, University of British Columbia.

Yalow, E. S. & Popham, W. J. (1983).
Content v a l i d i t y a t the c r o s s r o a d s . E d u c a t i o n a l R e s e a r c h e r , J_2(8), 10-14, p l u s 21. 78 Appendix A G u i d e l i n e s f o r S u b m i t t i n g Items 1. The stem s h o u l d be p h r a s e d as a d i r e c t q u e s t i o n or an i n c o m p l e t e s e n t e n c e . 2. A l l of the answers t o a g i v e n i t e m s h o u l d be of s i m i l a r l e n g t h and g r a m m a t i c a l form. 3. D i s t r a c t o r s s h o u l d be p l a u s i b l e enough t o draw some r e s p o n s e s . 4. Answers s h o u l d not be of the form: " a l l of the above" "a,b, and d, but not c" e t c . 5. Words used f o r emphasis, or which negate a p h r a s e , s h o u l d be u n d e r l i n e d . 6. Four re s p o n s e s s h o u l d be p r o v i d e d f o r each i t e m , o n l y one of which i s c o r r e c t . 7. Each i t e m s h o u l d be r e p r e s e n t a t i v e of a s p e c i f i c " e n a b l i n g o b j e c t i v e . " Example of t e s t items 1. The p o s t e r i o r i l i a c m u s c l e s a r e used m a i n l y a s : a) R o t a t o r s of t h e t h i g h b) A b d u c t o r s of the t h i g h c) A d d u c t o r s of the t h i g h d) E x t e n s o r s of the t h i g h 79 The main purpose of aerobic exercise i s to improve: a) Power b) F l e x i b i l i t y c) Endurance d) Coordination Which of the following programs would use the Far t l e t r a i n i n g p r i n c i p l e ? 
a) W e i g h t l i f t i n g program b) Gymnastics program c) V o l l e y b a l l program d) Running program Cardiac Output i s the product of: a) Blood pressure x heart rate b) VO max x 0 pressure 2 2 c) Stroke volume x resting blood pressure d) Heart rate x stroke volume Ligaments are used to connect: a) Bone to bone b) Cartilage to muscle c) Muscle to muscle d) Muscle to bone Appendix B Item Statistics 80 Informal examinee feedback Item d i f f i c u l t y Item discrimination (maximum between groups, minimum within groups) Maximum possible gain B index Internal sensitivity Combined groups (item-total r) Item criterion (partial r) Change-item r (correlation/multiple regression) Item homogeneity Four levels of Chi square 81 Appendix C Item-Objective Congruence Reviewer; Date: Fir s t , read through the domain specifications and the test items. Next, please indicate how well you feel each item reflects the domain specifications i t was written to measure. Judge a test item solely on the basis of the match between its content and the content defined by the domain specification. Please use the five-point rating scale shown below: Poor 1 Fair 2 Good 3 Very Good 4 Excellent 5 Circle the number corresponding to your rating beside the test item number. 
The domain specifications have been abbreviated as follows:

(AP) Anatomy and Physiology
(EP) Exercise Principles for Adult Fitness Classes
(S)  Safety

Domain   Test Item   Item Rating   Comments
EP        1          1 2 3 4 5
AP        2          1 2 3 4 5
AP        3          1 2 3 4 5
S         4          1 2 3 4 5
S         5          1 2 3 4 5
S         6          1 2 3 4 5
AP        7          1 2 3 4 5
AP        8          1 2 3 4 5
AP        9          1 2 3 4 5
EP       10          1 2 3 4 5
S        11          1 2 3 4 5
S        12          1 2 3 4 5
EP       13          1 2 3 4 5
AP       14          1 2 3 4 5
EP       15          1 2 3 4 5
AP       16          1 2 3 4 5
AP       17          1 2 3 4 5
AP       18          1 2 3 4 5
AP       19          1 2 3 4 5
S        20          1 2 3 4 5
EP       21          1 2 3 4 5
S        22          1 2 3 4 5
EP       23          1 2 3 4 5
AP       24          1 2 3 4 5
EP       25          1 2 3 4 5
EP       26          1 2 3 4 5
EP       27          1 2 3 4 5
AP       28          1 2 3 4 5
AP       29          1 2 3 4 5
S        30          1 2 3 4 5
EP       31          1 2 3 4 5
EP       32          1 2 3 4 5
EP       33          1 2 3 4 5
S        34          1 2 3 4 5
S        35          1 2 3 4 5
AP       36          1 2 3 4 5
S        37          1 2 3 4 5
AP       38          1 2 3 4 5
EP       39          1 2 3 4 5
AP       40          1 2 3 4 5
EP       41          1 2 3 4 5
AP       42          1 2 3 4 5
AP       43          1 2 3 4 5
AP       44          1 2 3 4 5
AP       45          1 2 3 4 5
EP       46          1 2 3 4 5
S        47          1 2 3 4 5
AP       48          1 2 3 4 5
EP       49          1 2 3 4 5
EP       50          1 2 3 4 5
S        51          1 2 3 4 5
S        52          1 2 3 4 5
AP       53          1 2 3 4 5
EP       54          1 2 3 4 5
EP       55          1 2 3 4 5
S        56          1 2 3 4 5
EP       57          1 2 3 4 5
AP       58          1 2 3 4 5
EP       59          1 2 3 4 5
S        60          1 2 3 4 5

In addition, complete the "subjective feedback" form found at the end of the test.

Appendix D
Subjective Feedback

Please respond to the following questions. Do not hesitate to provide additional comments about the entire exam or individual items.

1. Did any of the items seem confusing?
2. Did any of the items have no correct answer or more than one correct answer?
3. Did any of the words give you difficulty?
4. Did any of the items seem too easy?
5. How would you rate the difficulty of the entire exam from one (easy) to ten (hard)?
6. Provide any additional comments.
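
The ratings collected with the Appendix C form lend themselves to a simple per-item summary. The following is a minimal sketch only, not the procedure used in this study: the ratings, the function name `summarize_congruence`, and the flagging threshold of 3.0 ("Good") are all hypothetical and are introduced purely for illustration.

```python
from statistics import mean

# Hypothetical data: item number -> (domain, one 1-5 rating per reviewer).
# These values are invented for illustration; they are not study results.
ratings = {
    1: ("EP", [5, 4, 4]),
    2: ("AP", [3, 2, 2]),
    3: ("AP", [4, 5, 3]),
    4: ("S", [2, 3, 1]),
}


def summarize_congruence(ratings, flag_below=3.0):
    """Return (item, domain, mean rating, flagged) tuples, ordered by item.

    An item is flagged for review when its mean rating falls below
    `flag_below`. The default of 3.0 is an assumed threshold.
    """
    summary = []
    for item, (domain, scores) in sorted(ratings.items()):
        # Guard against ratings outside the five-point scale.
        if any(score < 1 or score > 5 for score in scores):
            raise ValueError(f"item {item}: ratings must be between 1 and 5")
        average = mean(scores)
        summary.append((item, domain, round(average, 2), average < flag_below))
    return summary


summary = summarize_congruence(ratings)
for item, domain, average, flagged in summary:
    note = "review" if flagged else "ok"
    print(f"{domain:>2} item {item:>2}: mean rating {average:.2f} ({note})")
```

A flagged item would then be a candidate for rewriting or for reassignment to a different domain specification, with the reviewers' written comments guiding that decision.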
Appendix E
Reliability Procedures

Index                  Source

Threshold loss
p_o                    (Hambleton & Novick, 1973)
κ                      (Swaminathan, Hambleton, & Algina, 1974)
β (p_o)                (Marshall, 1976)
p_o, κ                 (Subkoviak, 1976, 1980)
p_o, κ                 (Huynh, 1976)
KM                     (Huynh, 1977)

Squared-error loss
k²(X, T_x)             (Livingston, 1972, 1977)
Φ(λ)                   (Brennan, 1980)

Domain score estimation
Ep                     (Berk, 1980, Cochran, 1963, Millman, 1974)
S.E. meas. (X_a)       (Lord, 1959)
                       (Lord & Novick, 1968)
Φ                      (Brennan, 1980)
                       (Brennan, 1980)

Appendix F
Organizational Structure of the Sport Governing Bodies

Fitness Canada
CASS
National Committee on Fitness Leadership Training in Canada
Committee on Fitness Appraisal and Accreditation
BCRA-Fitness Branch
B.C. Government Recreation and Sport Branch
Provincial Fitness Advisory Committee
The UBC School of Physical Education and Recreation

Appendix G
Rating of "Specific Areas"

                                                      3 most       3 least
                                                      important    important
A. PLANNING
B. BASICS OF ANATOMY AND PHYSIOLOGY
C. SAFETY
D. EXERCISE PRINCIPLES FOR ADULT FITNESS CLASSES
E. LEARNING THEORIES
F. TEACHING STRATEGIES
G. LEADERSHIP
H. EVALUATION

Appendix H
Guidelines for Item Statistics Interpretations

Terminology

p-value
For a given response, this is the percentage of examinees choosing this response. It indicates the item difficulty level. A higher p-value for the correct response implies that many subjects answered correctly; the item may be too easy. It should be noted that this paper will refer to the p-value of distractors as well as the correct answer. This differs from some authors, who insist that the p-value is only applicable to the correct response. The p-value of distractors can provide invaluable information regarding the discrimination ability of the item.

item-subtest correlation
For a given option, within an item, an examinee who chooses this option is coded as a one and all others are coded as zeros. Then a point-biserial correlation is calculated between these zero/one scores for this option and the examinees' scores on the subtest that contains this item. Because one variable is dichotomous and the other is continuous/multistep, it is appropriate to calculate a point-biserial correlation. This procedure is repeated for all four of the possible options.

item-total test correlation
The subjects are again coded as zeros and ones, for each option, and then these values are correlated with the examinees' total test scores.

item-external criterion correlation
The subjects are coded as zeros and ones as above. They are then also coded on an external variable based on whether or not they were considered knowledgeable in this area. Examinees were coded as ones if they belonged to the instructed group and zeros if they belonged to the uninstructed group. A correlation was then calculated between the external criterion variable zero/one scores and the zero/one scores for each of the four possible options.

item-subtest means
For a given option, within an item, all examinees who choose this option will be included in the calculations. These examinees' scores, on the subtest to which the item belongs, are then averaged to find the mean score.

item-total test means
For a given option, within an item, all examinees who choose this option will be included in the calculations. These examinees' total test scores are then averaged to find the mean score.

item-external criterion means
For a given option, within an item, all examinees who choose this option will be included in the calculations. These examinees have been previously coded as one (instructed) or zero (uninstructed) based on which group they belong to. Then all of these zeros and ones will be averaged to find the mean score.

Guidelines

Item statistics obtained from the LERTAP package provide, for the test constructor, a good summary of the subjects' responses to each item. These statistics, combined with the subjective feedback, are used to guide the decision-making process regarding item revision or replacement. Keeping in mind that the items will probably never be perfect in their discrimination ability in all populations, it is possible to place the items on a continuum based on their discrimination ability as shown below.

Discrimination Continuum

No Discrimination ----------> Perfect Discrimination

In reality, exams would seldom contain items at either end of the continuum; these are mainly theoretical reference points. In fact, most tests used today would contain items that fall somewhere along the continuum.
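
The option-level statistics defined under Terminology can be illustrated with a short computation. This is a minimal sketch under stated assumptions and is not the LERTAP implementation: the function names `pearson` and `option_statistics`, the eight-examinee data set, and the option labels are hypothetical. The point-biserial correlation is computed here as an ordinary Pearson correlation between the zero/one option codes and the other variable, and it is reported as undefined (`None`) when either variable has no variance.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r.

    Returns None when either variable has zero variance, in which case the
    correlation is undefined.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def option_statistics(responses, total_scores, instructed, options="abcd"):
    """Compute, for each option of one item, the statistics described above.

    responses    -- option chosen by each examinee
    total_scores -- each examinee's total test score
    instructed   -- external criterion: 1 = instructed, 0 = uninstructed
    """
    n = len(responses)
    stats = {}
    for option in options:
        # Code examinees choosing this option as one and all others as zero.
        chose = [1 if r == option else 0 for r in responses]
        k = sum(chose)
        stats[option] = {
            "p": 100.0 * k / n,  # percentage choosing this option
            "r_total": pearson(chose, total_scores),
            "r_criterion": pearson(chose, instructed),
            "mean_total": sum(t for c, t in zip(chose, total_scores) if c) / k if k else None,
            "mean_criterion": sum(g for c, g in zip(chose, instructed) if c) / k if k else None,
        }
    return stats


# Hypothetical data: option "a" is keyed correct and is chosen by every
# instructed examinee and by no uninstructed examinee.
responses = ["a", "a", "a", "a", "b", "c", "d", "b"]
instructed = [1, 1, 1, 1, 0, 0, 0, 0]
total_scores = [50, 48, 52, 47, 20, 25, 22, 18]

stats = option_statistics(responses, total_scores, instructed)
for option, values in stats.items():
    print(option, values)
```

The item-subtest statistics would follow in the same way by passing subtest scores in place of total test scores.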
For criterion-referenced tests it is important to attempt to develop items that are closer to the right-hand side of the scale; such an item would discriminate without any false-negatives or false-positives. Table H-1 shows an example of the LERTAP output that would result from such an item. The table shows that the output would have the following characteristics:

1. The p-value (difficulty index) for the correct response should be equivalent to the percentage of the total number of subjects who are instructed. This assumes that only the instructed will select the correct response. In Table H-1, 50% of the subjects were from the instructed group and the p-value is 50.

2. The p-value for each distractor should be equivalent to the percentage of the total number of subjects who are uninstructed divided by the number of distractors. In a four option test, the p-value should be about 16.7 (50/3) if equal numbers of instructed and uninstructed subjects were used.

In Table H-1, it should be noted that the correlations are shown as 1.0 and -1.0. In actual fact these values would be impossible under the conditions described. Because of the homogeneity of each of the groups, there could be no variance on each variable within groups. This implies that the correlation would be zero. If we, however, assume that the subtest and total test scores are also dichotomous (which they really are because all subjects score perfectly or get zero) then these extreme correlations are possible.

3. The correct response will have a high positive correlation with the scores from the subtest of which the item is a part. Table H-1 shows this to be a theoretical value of 1.0.

4. The distractors will each have a high negative correlation (-1.0) with the scores from the subtest of which the item is a part.

5. The correct response will have a high positive correlation (1.0) with the total test scores.

6. The distractors will each have a high negative correlation (-1.0) with the total test scores.

7. The correct response will have a high positive correlation (1.0) with the external criterion.

8. The distractors will each have a high negative correlation (-1.0) with the external criterion.

9. The mean score on the subtest for those who chose the correct response should be equal to the number of items in the subtest. Table H-1 shows the subtest mean to be a perfect score of twenty-two. This assumes that only the instructed subjects will choose the correct response and that all of the instructed subjects will choose the correct response.

10. For each distractor, the mean score on the subtest for those who chose a given distractor should equal zero. Table H-1 shows subtest means of zero for all three distractors. The assumptions are similar to those in point nine.

11. For the correct response, the mean score on the total test for those who chose this response should be equal to the number of items on the total test (57). Also see point number nine.

12. For each distractor, the mean score on the total test, for those who chose this distractor, should be equal to zero. Also see point number nine.

13. For those subjects who chose the correct response, the mean score on the external criterion should be equal to the code number assigned to the instructed group of subjects. Table H-1 shows that examinees from the instructed group were coded as ones.

14. For those who chose a given distractor, the mean score on the external criterion should be equal to the code number assigned to the uninstructed group. Table H-1 shows that examinees from the uninstructed group were coded as zeros.

NOTE: The above comments are made under the assumption that all other items in the test are also perfect discriminators.

Table H-1
A Perfect Discriminator

                          Correlations               Means
Option    N      P       ST      TT      EC       ST    TT    EC
* 1       90     50      1.00    1.00    1.00     22    57    1.0
  2       30     16.7   -1.00   -1.00   -1.00      0     0    0.0
  3       30     16.7   -1.0    -1.0    -1.0       0     0    0.0
  4       30     16.7   -1.0    -1.0    -1.0       0     0    0.0
Total    180

Table H-2
A Non-discriminator

                          Correlations               Means
Option    N      P       ST      TT      EC       ST    TT    EC
* 1       45     25      0.0     0.0     0.0      11    28    0.5
  2       45     25      0.0     0.0     0.0      11    28    0.5
  3       45     25      0.0     0.0     0.0      11    28    0.5
  4       45     25      0.0     0.0     0.0      11    28    0.5
Total    180

NOTE:
1. The item statistics are based on a sample size of 180. Of this total, 90 were considered to be instructed and 90 were considered to be uninstructed.
2. The subtest consisted of 22 items.
3. The total test consisted of 57 items.

No perfect discriminators were found during the item analysis of pilot one. While some items could be rated as good discriminators, many would fall closer to the other end of the continuum. A few could even be rated as being very close to the non-discriminating item. Table H-2 demonstrates some of the potential characteristics of the item statistics that would be derived from the non-discriminating item. These characteristics are listed below.

1. Each possible response should have an equal number of subjects choosing it; that is, all of the options should have equal p-values. Table H-2 shows that each option received 45 of the total 180 (25%) possible responses. No distinction is made between the distractors and the correct response.

2. The correlation between the correct response and the subtest would be very low. Table H-2 shows this value to be zero. This implies that there is no relationship between whether the subject chooses the correct response and how they score on the subtest. A subject who chooses the correct response will not necessarily do well (or even poorly) on the subtest from which the item was taken. No accurate predictions can be made.

3. The correlations between each of the distractors and the subtest would also be very low. Again, Table H-2 shows the value to be zero. This implies that there is no relationship between which distractor a subject chooses and how well they score on the subtest.

4. The correlation between the correct response and the total test score would be very low.

5. The correlation between each of the distractors and the total test would also be very low. Both 4 and 5 show that there is no relationship between the option that a subject chooses and the subject's total test score.

6. The correlation between the correct response and the external criterion would be very low.

7. The correlation between each of the distractors and the external criterion should be very low. Both 6 and 7 are shown as zeros in Table H-2. This demonstrates that there is no relationship between the option that a subject chooses and whether they were from the instructed group or not. This implies that instructed, as well as uninstructed, subjects were choosing all four options. No distinction can be made between the distractors and the correct response based on the criterion groups.

8. The mean score on the subtest for those subjects who chose the correct response should be equal to about half the number of items in the subtest. Table H-2 shows the value to be eleven.

9. For each distractor, the mean score on the subtest, for those who chose the distractor, is also equal to about half the number of items in the subtest. For 8 and 9, Table H-2 shows the mean value to be eleven for a twenty-two item subtest. No discrimination is made between the distractors and the correct response. The mean score being one-half of the total number of items reflects the fact that an equal number of instructed and uninstructed subjects were selecting each option.

10. For the correct response, the mean score on the total test, for those who chose this option, should be equal to about half the number of items in the total test.

11. For each distractor, the mean score on the total test, for those who chose this distractor, should be equal to about half the number of items in the total test.

Note: For all of the above, assume that all of the other items in the subtest and total test are perfect (good) discriminators.

12. Assume that the instructed subjects were coded as ones and the uninstructed subjects were coded as zeros; for those subjects who chose the correct response, the mean score on the external criterion should be 0.5.

13. Assume that the instructed subjects were coded as ones and the uninstructed subjects were coded as zeros; for those subjects who chose a given distractor, the mean score on the external criterion should be equal to 0.5. Points 12 and 13 both demonstrate the fact that half the subjects choosing any option (distractor or correct response) are instructed and the other half are uninstructed.

APPENDIX I
STATISTICS FOR PILOT #2

TABLE I-1
Descriptive Statistics for Pilot #2

                                              Uninstructed    Instructed
                                              Group           Group           Combined
Total Test      Number of Observations        106             94              200
                Mean                          28.5 = 48%      38.2 = 64%      33.1 = 55%
                s                             7.9             7.2             9.0
                Low                           12              19              12
                High                          48              52              52
                # of items                    60              60              60
                Hoyt's r                      0.82            0.81            0.86
                S.E.M.                        3.34            3.61            3.29
                Cronbach's alpha              0.73            0.76            0.78

Subtest 1       Mean                          11.8 = 56%      14.0 = 67%      12.9 = 61%
Exercise        s                             2.9             2.7             3.0
Principles      Low                           4               8               4
                High                          18              20              20
                # of items                    21              21              21
                Hoyt's r                      0.59            0.52            0.61
                S.E.M.                        1.85            1.81            1.85

Subtest 2       Mean                          10.2 = 44%      15.7 = 68%      12.8 = 56%
Anatomy &       s                             4.3             3.8             4.9
Physiology      Low                           1               6               1
                High                          21              23              23
                # of items                    23              23              23
                Hoyt's r                      0.75            0.72            0.82
                S.E.M.                        2.13            1.99            2.07

Subtest 3       Mean                          6.5 = 41%       8.4 = 53%       7.4 = 46%
Safety          s                             2.2             1.9             2.3
                Low                           2               4               2
                High                          13              13              13
                # of items                    16              16              16
                Hoyt's r                      0.40            0.32            0.46
                S.E.M.                        1.66            1.54            1.63

Table I-2
Correlations for Pilot #2

Combined Groups
        EP      AP      S       TT      EC
EP      1.0     0.70    0.53    0.85    0.37
AP              1.0     0.62    0.94    0.56
S                       1.0     0.77    0.42
TT                              1.0     0.54
EC                                      1.0

Uninstructed Group
        EP      AP      S       TT
EP      1.0     0.58    0.45    0.81
AP              1.0     0.54    0.91
S                       1.0     0.74
TT                              1.0

Instructed Group
        EP      AP      S       TT
EP      1.0     0.72    0.44    0.87
AP              1.0     0.47    0.93
S                       1.0     0.68
TT                              1.0

Table I-3
Item p-values for Pilot #2

SUBTEST   ITEM   UNIN(1)   IN(2)
EP         1     94.3      90.4
AP         2     57.5      81.9
AP         3     34.9      48.9
S          4     24.5      39.4
S          5     50.0      79.8
S          6     41.5      46.8
AP         7     73.6      89.4
AP         8     49.1      87.2
AP         9     56.6      86.2
EP        10     90.6      94.7
S         11     59.4      69.1
S         12     48.1      62.8
EP        13     90.6      95.7
AP        14     42.5      66.0
EP        15     72.6      89.4
AP        16     29.2      67.0
AP        17     43.4      58.5
AP        18     20.8      57.4
AP        19     60.4      79.8
S         20     17.9      11.7
EP        21     40.6      48.9
S         22     42.5      93.6
EP        23     44.3      68.1
AP        24     45.3      75.5
EP        25     28.3      33.0  *
EP        26     50.9      76.6
EP        27     88.7      93.4  *
AP        28     21.7      55.3
AP        29     28.3      55.3
S         30     84.9      92.6  *
EP        31     50.9      79.8
EP        32     51.9      80.9
EP        33     52.8      64.9
S         34      8.5       8.5  ?
S         35     59.4      52.1  ?
AP        36     60.4      84.0
S         37     50.0      54.3  *
AP        38     46.2      71.3
EP        39     76.4      90.4
AP        40     45.3      75.5
EP        41     78.3      54.3  ?
AP        42     34.0      42.6
AP        43     54.7      63.8
AP        44     56.6      86.2
AP        45     42.5      71.3
EP        46     10.4      44.7
S         47     46.2      63.8
AP        48     31.1      56.4
EP        49     20.8      48.9
EP        50     94.3      92.6  ?
S         51     12.3      47.9
S         52     77.4      98.9
AP        53     40.6      61.7
EP        54     29.2      35.1  *
EP        55     45.3      40.4  *
S         56     11.3      75.5
EP        57     40.6      48.9
AP        58     45.3      53.2
EP        59     32.1      40.4
S         60     15.1      11.7  ?

1. UNIN - Uninstructed group (N=106)
2. IN - Instructed group (N=94)
3. ? - UNIN p-value is greater than or equal to IN
4. * - IN p-value is greater than UNIN by 5 or less

Appendix J
Number of False-Positives and False-Negatives for Varying Cut-Off Scores

Cut-off scores: 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%
Entries are F/P.

60 Items
UNIN: 60/46*, 75/31, 87/19, 94/12, 97/9, 103/3, 105/1, 106/0, 106/0
IN:   14**/80, 24/70, 35/59, 47/47, 60/34, 72/22, 86/8, 90/4, 94/0

46 Items
UNIN: 83/23, 91/15, 99/7, 104/2, 104/2, 105/1, 106/0
IN:   30/64, 38/56, 60/34, 69/25, 82/12, 93/1, 94/0

36 Items
UNIN: 57/49, 64/42, 72/34, 85/21, 92/14, 93/1, 104/2, 105/1, 106/0
IN:   5/89, 10/84, 16/78, 23/71, 30/64, 37/57, 66/28, 79/15, 94/0

1. UNIN - Uninstructed group (N=106)
2. IN - Instructed group (N=94)
3. F - Fail
4. P - Pass
5. * - False positive
6. ** - False negative

Appendix K
BFTE - Pilot #3

1. Cardiorespiratory endurance can be best defined as:
   a) The efficiency of the heart and lungs
   b) Stroke volume times heart rate
   c) Vital capacity plus residual volume
   d) Cardiac output minus residual volume

2. The most effective method for improving flexibility is through:
   a) Isotonic contractions
   b) Static stretches
   c) Isometric contractions
   d) Ballistic stretches

3. Heat exhaustion may be caused by all of the following except:
   a) Exercising while dehydrated
   b) Holding your breath while exercising
   c) Inadequate ventilation in room
   d) Exercising in multi-layered clothing

4. Stretching the hamstrings is important for low back problems because it:
   a) Strengthens the abdominals
   b) Re-aligns the spine
   c) Allows the pelvis to tilt backward
   d) Allows the pelvis to tilt forward

5. A significant proportion of lower back problems are directly related to a lack of muscular strength, particularly in the:
   a) Abdominal area
   b) Lower back area
   c) Upper back area
   d) Hamstring

6. The bones in your forearm are called the:
   a) Ulna and Tibia
   b) Radius and Ulna
   c) Fibula and Radius
   d) Tibia and Radius

7. The by-product of anaerobic work is:
   a) Glycogen
   b) Free fatty acids
   c) Carbohydrates
   d) Lactic acid

8. Which of the following is not considered a primary component of fitness:
   a) Muscular strength
   b) Flexibility
   c) Speed
   d) Cardio-vascular endurance

9. The best method for reducing body fat is to:
   a) Decrease dietary carbohydrate intake
   b) Increase your activity level
   c) Decrease dietary fat intake
   d) Decrease caloric intake and increase your activity level

10. Human skeletal muscle is composed of fast twitch and slow twitch fibers. Which one of the following is a characteristic of slow twitch muscle fibers?
   a) Functions by anaerobic metabolic processes
   b) Rich in glycolytic enzymes
   c) More resistant to fatigue
   d) Has a higher power output

11. While performing strength exercises, participants should be encouraged to:
   a) Breathe out during the relaxation phase
   b) Breathe out during the exertion phase
   c) Hold their breath to improve fitness
   d) Breathe only between repetitions of the exercise

12. The term that best describes the position of the lumbar vertebrae relative to the cervical vertebrae is:
   a) Superior
   b) Inferior
   c) Proximal
   d) Medial

13. When exercising at very intense levels the main fuel source is:
   a) Fat
   b) Protein
   c) Glycogen
   d) Free fatty acids

14. Your gracilis muscles are located:
   a) In your upper back
   b) In your lower leg
   c) In your thigh
   d) In your forearm

15. The group of muscles mainly responsible for hip abduction are the:
   a) Hamstrings
   b) Quadriceps
   c) Gluteals
   d) Abdominals

16. Sweating causes:
   a) Temperatures in working muscles to increase
   b) Salt concentration within you to increase
   c) An increased blood flow to the working muscles
   d) Oxygen utilization to be increased

17. Swelling can be limited BEST through:
   a) Cold treatments combined with elevation
   b) Moist heat treatments with aspirin
   c) Heat treatments combined with elevation and compression
   d) Cold treatments combined with elevation and compression

18. To lose a pound of body fat you'd have to experience a deficit of:
   a) 3200 calories
   b) 3500 calories
   c) 3800 calories
   d) 3000 calories

19. A person with high blood pressure in your fitness program should be told all of the following, except:
   a) Do not hold your breath on exertion
   b) Limit or delete strictly upper body exercises
   c) Avoid any extended isometric contractions
   d) Stand still and just do upper body and arm exercises during the cardio section of class

20. Doing a push-up on the hands and the feet instead of on the hands and the knees:
   a) Increases the workload
   b) Decreases muscular strength
   c) Provides better balance
   d) Decreases the resistance

21. Cardiac output is the product of:
   a) Heart rate x blood pressure
   b) Blood pressure x stroke volume
   c) Stroke volume x heart rate
   d) Heart rate x oxygen uptake

22. Which chamber of the heart receives the returning venous blood?
   a) Left atrium
   b) Left ventricle
   c) Right atrium
   d) Right ventricle

23. Of the choices below, the best equation for approximating maximum heart rate is:
   a) 220 - Age = Maximum Heart Rate
   b) 3 x Resting Heart Rate = Maximum Heart Rate
   c) 220 - Resting Heart Rate = Maximum Heart Rate
   d) 170 - Age = Maximum Heart Rate

24. In order to maximize gains in muscular strength one must increase the:
   a) Duration of workouts
   b) Frequency of workouts
   c) Number of repetitions
   d) Resistance of the workload

25. The most important concept in learning the proper mechanics of a sit-up is to:
   a) Avoid using the arms to develop momentum
   b) Curl up, with the back rounded
   c) Keep the back straight
   d) Anchor the legs for support

26. In treating a heat stroke victim, which of the following will best hasten the cooling process when applied in conjunction with sponging?
   a) Applying ice cubes to the armpit and groin
   b) Applying cold compresses to the neck and forehead
   c) Giving the patient cold drinks
   d) Fanning the patient

27. Among the following, all are sources of energy for exercise except:
   a) Glycogen
   b) Fats
   c) Vitamins
   d) Glucose

28. The first indication of oxygen shortage would be:
   a) Cyanosis and dilated pupils
   b) Difficulty in breathing and cyanosis
   c) Dilated pupils and difficulty in breathing
   d) Increased respiration and pulse rates

29. Glycogen is not stored in the:
   a) Stomach
   b) Muscles
   c) Kidney
   d) Liver

30. A warm-up for a moderate paced fitness class should not include:
   a) Maximal isometric contractions
   b) Light jogging
   c) Light stretching
   d) Stationary walking

31. The main reason for doing joint mobilizing exercises as a warm-up activity is to:
   a) Prepare the musculoskeletal system for subsequent full range of motion activity
   b) Increase synovial fluid volume
   c) Increase joint cartilage thickness
   d) Dispose of waste products built up in the muscle cells

32. All of the following are considered to be connective tissue except:
   a) Tendons
   b) Ligaments
   c) Skin
   d) Collagen

33. Which of the following are the largest vertebrae in the spinal column?
   a) Lumbar
   b) Thoracic
   c) Coccyx
   d) Cervical

34. Lateral arm raises are associated with contraction of the:
   a) Serratus Anterior
   b) Deltoid group
   c) Latissimus dorsi
   d) Trapezius

35. The correct order of sections of the spine from top to bottom would be:
   a) Thoracic, lumbar, cervical, sacral, coccyx
   b) Cervical, thoracic, lumbar, sacral, coccyx
   c) Coccyx, cervical, sacral, thoracic, lumbar
   d) Cervical, sacral, thoracic, lumbar, coccyx

36. Pregnant women participating in your fitness class should:
   a) Avoid doing pelvic tilt exercises
   b) Avoid doing slow sit-ups
   c) Avoid reaching their maximal heart rate
   d) Avoid drinking fluids one hour prior to class

37. According to Canadian Food Guides, of your total daily nutritional consumption ___% should be comprised of carbohydrates:
   a) 10-20%
   b) 20-30%
   c) 35-45%
   d) 50-60%

38. Of the following progressions in locomotive cardiovascular training, which is the most appropriate for safe improvement at a beginner's fitness level?
   a) Walking, to intermittent jogging, to continuous jogging, to long slow distance running
   b) Jogging, to interval speed work, to long distance running
   c) Interval running, to varied pace running, to long slow distance running
   d) Long slow distance, to interval fast running, to varied pace running

39. The initial assessment, a search for immediate life threatening problems, should be conducted in the following order:
   a) Check the airway for obstruction
      Check if they are breathing
      Check if the heart is beating to circulate blood
   b) Check the airway for obstruction
      Check if the heart is beating to circulate blood
      Check if they are breathing
   c) Check if they are breathing
      Check if the heart is beating to circulate blood
      Check the airway for obstruction
   d) Check if they are breathing
      Check the airway for obstruction
      Check if the heart is beating to circulate blood

40.
Wearing p l a s t i c exercise apparel i s : a) B e n e f i c i a l , because the increased perspi ra t ion causes greater fat loss b) B e n e f i c i a l , because i t causes heat retention and thereby keeps the muscles warm . c) Not benef ic ia l , because i t causes a reduction in blood c i r c u l a t i o n d) Not b e n e f i c i a l , because i t does not allow the body heat to d i s s ipa te 41. The hamstring muscles are important for f lexing the: a) Hip b) Knee c) Pe lv i s d) Ankle 42. A f i f t y year o ld man, considerably overweight, has been r e l a t i v e l y sedentary for the past two decades. What should he concentrate on f i r s t in order to improve his f i tness? a) Muscular and cardiovascular endurance b) F l e x i b i l i t y and muscular endurance c) F l e x i b i l i t y and card iovascular endurance d) Cardiovascular endurance and muscular strength 43. When teaching "target heart rate monitoring", i t i s most important to emphasize that : a) Maximal stroke volume increases with age b) A lower res t ing heart rate implies a better f i tness level c) . A heartrate of 170 beats per minute i s bet ter than one of 150 d) Maximal heart rate decreases with age 115 44. Which blood vessels carry nourishment to the heart muscle? a) Coronary a r t e r i e s b) Coronary veins c) Carot id a r t e r i e s d) Pulmonary ar tery 45. Residual lung volume does not increase wi th : a) Age b) Smoking c) Exercise d) Asthma 
