UBC Theses and Dissertations

A comparison of some methods of evaluating outcomes of laboratory instruction in high school chemistry Chapman, Victor Lennie 1952

A COMPARISON OF SOME METHODS OF EVALUATING OUTCOMES OF LABORATORY INSTRUCTION IN HIGH SCHOOL CHEMISTRY

by

VICTOR LENNIE CHAPMAN

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS in the School of Education

We accept this thesis as conforming to the standard required from candidates for the degree of MASTER OF ARTS

Members of the Department of

THE UNIVERSITY OF BRITISH COLUMBIA
OCTOBER, 1952

ABSTRACT

This study compares methods of evaluating the outcomes of laboratory instruction in high school chemistry and reports the instruments developed for that purpose.

The objectives evaluated were: the ability of students in basic laboratory skills; the ability of pupils in the selection of materials, apparatus and methods; and knowledge of facts that are outcomes of laboratory instruction. These three objectives were selected from some fourteen general objectives gleaned from the literature pertaining to laboratory chemistry. They were chosen as representing outcomes due solely to laboratory instruction, as compared with others that may have been achieved, at least in part, by the routine lessons.

The experimental method was to evaluate 72 high school students of chemistry by means of:
1. a practical test of laboratory work designed to conform with the objectives chosen, referred to as the criterion test.
2. a group pencil and paper test somewhat parallel to the criterion test.
3. the laboratory notebooks of the students.
4. the teacher's estimates of student progress toward the objectives.

Three classes of chemistry were evaluated in the Spring of 1952. The teacher's estimates were prepared in February from observation of the students at work in the laboratory. The laboratory reports had been marked weekly for six months prior to the experiment, and the total score on fifteen reports was taken as the notebook measure of laboratory knowledge.
In March the criterion test was administered in two sections. Section I tested chiefly manipulations and was an individual test. Section II consisted of a series of small tests based on the course of study. About one week later the group pencil and paper test was administered to the three classes in successive class periods. The test consisted of two parts: 1. multiple-choice items, and 2. items matching diagrams with statements.

The following statistical measures were reported for all tests: mean, standard deviation, reliability. For the criterion and pencil and paper tests the following were also reported: the internal consistency of test items and their difficulties. The validities of the items of the pencil and paper test were also reported.

The correlations between the different tests were calculated as a means of appraising the predictive value of each. The simple regression and multiple regression equations and beta coefficients for predicting the criterion from the pencil and paper test were compared. T-scores were tabled for the pencil and paper test, as well as derived scores on the basis of a mean of 63 and a standard deviation of 13, designed so as to set 50 as the critical score to cut off 15 percent of the testees. To compare the ability of the test to predict the upper quarter on the criterion with its ability to predict the lower quarter, a chi-square test of significance was applied.
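The derived-score scaling described above can be illustrated with a short sketch. It is not taken from the thesis: the function names and the sample raw scores are hypothetical, and a normal distribution of scores is assumed. Rescaling to a mean of 63 and a standard deviation of 13 places a score of 50 exactly one standard deviation below the mean, which under the normal assumption cuts off a little under 16 percent of testees, in line with the roughly 15 percent quoted.

```python
from statistics import NormalDist, mean, pstdev

TARGET_MEAN = 63.0     # mean of the derived scale, as stated in the abstract
TARGET_SD = 13.0       # standard deviation of the derived scale, as stated
CRITICAL_SCORE = 50.0  # critical (cut-off) score, as stated


def derived_scores(raw_scores):
    """Linearly rescale raw scores to the target mean and standard deviation.

    Assumption: the population standard deviation is used, and the raw
    scores are not all identical (otherwise the division fails).
    """
    m = mean(raw_scores)
    s = pstdev(raw_scores)
    return [TARGET_MEAN + TARGET_SD * (x - m) / s for x in raw_scores]


def fraction_below(cutoff):
    """Proportion expected below `cutoff` if derived scores were normal."""
    return NormalDist(TARGET_MEAN, TARGET_SD).cdf(cutoff)


if __name__ == "__main__":
    raw = [12, 18, 21, 25, 30, 34]  # hypothetical raw scores, not thesis data
    print([round(d, 1) for d in derived_scores(raw)])
    print(round(fraction_below(CRITICAL_SCORE), 3))
```

Because the transformation is linear, it changes only the scale of the scores and leaves the rank order of students untouched.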
The following conclusions appear to be defensible:
1. The group pencil and paper test, in predicting the criterion, was significantly superior to other methods.
2. The laboratory notebooks failed significantly to predict the outcomes being tested.
3. The teacher's estimates did not materially assist the pencil and paper test to predict the outcomes being tested.
4. The two tests possess a range of difficulty conforming to the requirements of a good test.
5. The test items having indices of validity of less than .23 contribute little to the predictive value of the pencil and paper test.
6. The pencil and paper test predicts the criterion equally well at either the upper or lower levels.

ACKNOWLEDGMENTS

This study was made possible through the cooperation of Mr. E.L. Yeo, Principal of Britannia High School, Vancouver, British Columbia. The author is further indebted to Mr. Yeo, and also to Mr. C.W. Abercrombie, for reading the manuscript and for valuable suggestions.

To Mr. J.T. Young, Mr. E.H. Vollans and Mr. G.L. Phillips, the writer is grateful for their service in administering the experimental test to their classes prior to its revision, and for their forthright criticism and worthwhile suggestions.

To Dr. J.R. McIntosh, for his guidance both in this study and in the preparation of the manuscript, the writer is especially indebted.

V.L.C.
CONTENTS

LIST OF TABLES

Chapter

I THE PROBLEM
    General
    The Problem
    Limitations of the Study
    Summary

II STUDIES RELATED TO THE PROBLEM
    Laboratory Studies
    Tests of Objectives
    Standardized Laboratory Tests
    Summary

III THE PROCEDURE
    Objectives
    Objectives for Instruction in the High School Chemistry Laboratory
    The Criterion Test
    The Pencil and Paper Test
    Preliminary Administration
    Correlation with the Criterion
    Item Analysis
    Reliability Coefficient
    The First Revision
    The Second Revision
    The Laboratory Notebooks
    The Method of Scoring the Laboratory Notebooks
    The Teacher's Estimates
    The Method of Estimating
    The Assembling of the Data
    The Chemistry 91 Practical Test
    The Revised Horton Test
    The Pencil and Paper Test
    Summary

IV ANALYSIS OF RESULTS
    Methods of Analysis
    The Reliability Coefficient
    The Internal Consistency of Items
    Item Analysis Indices
    Item Indices Based on a Continuum Dichotomized for Convenience
    The Difficulty of Items
    The Data
    The Criterion
    The Reliability Coefficient (Criterion Test)
    The Internal Consistency of Items (Criterion Test)
    The Difficulty of Items (Criterion Test)
    The Pencil and Paper Test
    The Reliability of the Pencil and Paper Test
    The Internal Consistency of Items (Pencil and Paper Test)
    The Validity Coefficients of Items (Pencil and Paper Test)
    The Difficulty of Items (Pencil and Paper Test)
    Correlations with the Pencil and Paper Test
    The Laboratory Notebooks
    The Teacher's Estimates
    The Revised Horton Test
    The Chemistry 91 Laboratory Test
    The Multiple Regression Equation
    The Beta Coefficients
    The Simple Regression Equation
    Standard Scores, Derived Scores, and Percentiles
    Elimination of Items with Internal Consistencies Below .23
    Summary

V SUMMARY AND CONCLUSIONS
    Conclusions

BIBLIOGRAPHY

APPENDIX
    A - Objectives
    B - Approved list of laboratory techniques ranked according to importance
    C - The Criterion Test
        1. The Revised Horton Test
        2. The Chemistry 91 Practical Laboratory Test
    D - The Pencil and Paper Test
    E - Check sheet for scoring The Revised Horton Test
    F - Answer sheet for the Chemistry 91 Practical Laboratory Test
    G - Data
    H - Internal Consistencies, Validities, and Difficulties of Items in the Pencil and Paper Test
    I - Internal Consistencies and Difficulties of Criterion Test Items
    J - T-scores and Percentiles for the Pencil and Paper Test
    K - Pencil and Paper Test Scaled to place Fifteen Percent Below a Critical Score

LIST OF TABLES

Table
I Some Statistical Measures of the Tests of Laboratory Outcomes
II Criterion Test; Comparison of Internal Consistencies by the Methods Indicated
III Pencil and Paper Test; Comparison of Internal Consistencies by the Methods Indicated
IV Pencil and Paper Test; Comparison of Item Validities by the Methods Indicated
V Product-Moment Correlations Between the Pencil and Paper Test and Five Other Measures
VI Correlation Between the Notebooks and Five Other Measures
VII Correlations Between the Teacher's Estimates and Five Other Measures
VIII Correlations Between the Revised Horton Test and Four Other Measures
IX Correlations of the Chemistry 91 Laboratory Test and Four Other Measures
X Correlations of the Criterion and Other Measures after Deleting Inconsistent Items

CHAPTER I

THE PROBLEM

GENERAL

For over forty years methods of laboratory instruction have been under discussion and investigation.
The failure to arrive at any definite conclusion has been due, chiefly, to the conflict in the findings of the investigators. Two notable studies that failed to agree were those of Kiebler and Woody¹ and Horton.² The former, an earlier study, distinctly favored the demonstration method, while the latter strongly supported the individual method. The situation was further complicated when a number of schools began placing a new emphasis on certain objectives with respect to the scientific method and the scientific attitude.

1 Kiebler, E.W., and Woody, Clifford, "The Individual Laboratory Versus the Demonstration Method of Teaching Physics," Journal of Educational Research, 7:50-58, January, 1923.
2 Horton, Ralph E., Measurable Outcomes of Individual Laboratory Work in High School Chemistry. (Teachers College Contribution to Education, No. 303.) New York: Bureau of Publications, Teachers College, 1928, p. 105.

With the re-orientation of the objectives for secondary education, and with the new philosophy of "education for everyman's child", the secondary school population increased rapidly. In this connection there arose a demand for general science education, without detailed technical knowledge. With the increased school population, the expense of supplying laboratory equipment rose sharply. Hence any means of holding or reducing costs became urgent. Consequently, the less costly demonstration method gained favor. At the same time, some of the more important objectives of individual laboratory instruction were lost sight of. Particularly was this true where there was little or no opportunity for students to handle apparatus and reagents. Nevertheless, laboratory work is an integral part of the training of a true scientist, and since high school special science courses are generally prerequisites for this training they ought to reflect the elements of the training, even to laboratory instruction.
If desirable objectives for laboratory work can be justified, then there is an obligation to appraise the progress of students toward these objectives by the most valid, reliable and convenient method available. At present there appear to be five methods in use. Weaknesses are evident in all these methods:

1. Marking the students' laboratory notebooks

It is conceivable that a neatly written and carefully prepared book of assignments may in no way indicate the student's ability to perform an experiment, or to manipulate apparatus. It is possible that he may have plagiarized his reports from the book of a student of a previous year. It is even possible that he may have submitted the work of another student.

2. Estimating the student's laboratory proficiency

Since a person's estimate may vary from time to time, and since different persons' estimates of the same student often disagree, such estimates are probably highly unreliable. The variance could be due to the methods of estimating. It could also be due to the differing standards of judging as well as to the changing of standards while judging.

3. Marking the chemical products prepared in the laboratory

While this method may evaluate some of the outcomes of the laboratory, it may also lead to one of the greatest failures of traditional laboratory instruction; viz., the failure to promote growth in scientific integrity, by permitting students to submit substitutions for the products they have prepared in the laboratory.

4. Keeping attendance records

It is difficult to conceive how attendance alone can contribute to outcomes of instruction without there being some evidence of time profitably spent. However, an attendance record as a check on experiments performed would certainly have some merit when the reports were being scored.

5. Administering a practical test

Provided the test were valid and reliable, it would probably be the best test of progress in laboratory work, as it would be appraising either identical or related elements of the laboratory in their natural setting, the laboratory. This method appears not to be in general use, probably because it is so time-consuming.

Since the foregoing objections may be raised in connection with the usual methods of appraisal, there seems to be need for a new approach to laboratory evaluation. In this connection a group pencil and paper test that would conform to the requirements of a good test and at the same time measure the attributes demanded by the objectives suggests itself. The advantages of such a test would seem to be:

1. It would save time

Practical tests are, as a rule, considered to be most suitable to evaluate skills of manual dexterity. They are, however, usually individual tests and as such are very time-consuming in comparison with a group test, with which as many as thirty candidates at a time may be supervised by one examiner in contrast to one candidate. Furthermore, it usually requires more time to perform a task than to select an answer to an objective test item.

2. It would be easy to administer

Printed objective group tests with directions to examiners are not difficult to administer, can be reliable, and can usually be scored by a clerical staff. On the other hand, a practical test, to be reliable, requires an experienced and capable administrator whose judgments have a minimum of variability.

3. It would be more reliable

The reliability and the validity of the scores on a test are affected by the methods of scoring, as well as by the conditions under which the test is administered.
Various studies have shown that the scores of a test, when the marking is done objectively, are more reliable than the scores when the marking requires the subjective judgment of the examiner. When the conditions under which a test is administered are subject to a high degree of control, scores are more reliable than when the conditions are subject to little control. In administering the group pencil and paper test to different classes the external conditions can be well controlled. It would be very difficult to administer the individual practical laboratory test to different groups and control such external factors as the time of the day, the mood of the examiner, and the physical conditions of the laboratory. For these reasons the group pencil and paper test would appear to be more reliable than the practical laboratory test when both tests are being administered by different examiners.

4. It could be used as a basis of promotion

Provided the group test does possess the advantages listed under headings 1, 2, and 3, then an attempt might be made by some authorities to replace current promotional practices with the better measuring instrument.

5. It would be useful to evaluate teaching

Tests that are not too lengthy and are easily scored may have some diagnostic value, particularly from the point of view of detecting gaps in the instruction of students. Teachers should welcome any device that could be used for such a purpose.

6. It could be used to investigate some phases of the learning process

It would be interesting to know what effect a thorough training in one laboratory science would have on one's ability in another laboratory science. It has been suggested that certain attitudes, such as care with delicate instruments and confidence in the use of apparatus, may be transferred from training in one science to another.
It is only by investigations of these unknown educational processes that teaching can be advanced and modified. The development of tests that can give a measure of achievement in any field of endeavor has its place in assisting to discover some relationship in another field.

7. It might indicate methods of evaluation at the college level

If a pencil and paper test can be shown to correlate highly with actual performance in the laboratory at the high school level, it would point the way to a similar test for measuring the extent to which the laboratory is achieving its objectives at the advanced level.

Hendricks¹ sums up some of the subtler advantages of such a test as follows:

    If a pencil and paper test can be developed that will have only tolerable validity, it will help to determine what our chemistry teaching program is doing. To be more specific, if we knew with some certainty just what our laboratory is doing for our students we could review our procedures with more confidence and eliminate useless parts.

1 Hendricks, B. Clifford, "Pencil and Paper Tests in the Laboratory," Journal of Chemical Education, 22:543, November, 1945.

Mallinson¹ comes to the conclusion that there is need for reliable and valid tests for evaluating the outcomes of science teaching, other than the acquisition of factual knowledge. If the objectives of science teaching now considered of prime importance are accepted, then it would be desirable to have valid instruments to measure their attainment. This is a considered opinion after reviewing some eighty-four articles, all but nine of which were published between 1940 and 1948.

For these reasons it would seem feasible to investigate the possibility of testing some outcomes of the laboratory by means of pencil and paper tests. This, of course, will necessitate not only the determination of the objectives but also the construction of a measuring device to appraise achievement in the laboratory.
THE PROBLEM

Mention has been made of several possible methods to appraise laboratory work in high school chemistry. The problem is two-fold and may be stated:

1. To prepare a valid, reliable and usable pencil and paper test pertaining to the objectives of laboratory chemistry.
2. To compare different methods of evaluating the outcomes of instruction in high school laboratory chemistry.

1 Mallinson, George G., "The Implications of Recent Research in Teaching of Science at the Secondary School Level," Journal of Educational Research, 43:321-42, January, 1950.

The specific methods to be employed in pursuing the investigation are:

1. An individual practical test of the objectives of laboratory instruction, to be conducted in the laboratory. This test will be called the criterion.
2. A group pencil and paper test of the same objectives as the practical test.
3. The teacher's estimates of progress in attaining the objectives of laboratory chemistry.
4. The grading of the traditional laboratory notebooks.

LIMITATIONS OF THE STUDY

The study will be limited to high school students of chemistry. Caution must, therefore, be taken in transferring any generalizations resulting from the study to other high school sciences.

The tests, both criterion and group pencil and paper, while possessing curricular validity for students of schools in British Columbia, may well be invalid, at least in part, for students whose chemistry courses deviate from the basis of the tests.

Since the students will be tested on certain objectives of laboratory work in chemistry, the study does not presume to say how other outcomes of instruction in chemistry may be appraised.
The study will not attempt to generalize as to what is assessed by the measures involved, except in so far as the measures involved deal with the chosen objectives.

The experimental factor will have been the method of appraising outcomes of laboratory instruction with respect to the objectives chosen.

SUMMARY

The purposes of the study are to compare methods of evaluating the outcomes of laboratory instruction in chemistry and to develop instruments for making such comparisons.

In order to investigate more fully the contributions of laboratory science, it is important to have devices for evaluation in which confidence can be placed. Experiments in the field of teaching methods require means of appraising outcomes of instruction. Until it has been found which methods are valid and reliable, little progress can be made in the methodology of laboratory science.

An attempt has been made to explore several methods of appraisal with respect to the results of laboratory attainment in chemistry.

CHAPTER II

STUDIES RELATED TO THE PROBLEM

For the purpose of convenience, previous studies of testing the objectives of chemistry will be considered under the following headings: laboratory studies, tests of objectives, and standardized tests.

LABORATORY STUDIES

Of all the studies reviewed that bear on the present problem Horton's¹ is most noteworthy. In his study an attempt was made to discover if there were outcomes of laboratory instruction not tested by the typical high school chemistry examination. For the purpose of evaluating these outcomes, Horton² devised practical individual tests of predetermined laboratory objectives. In his tally Horton used sixteen laboratory manuals and chose 102 skills. The complete catalogue was submitted to a jury of sixteen teachers of chemistry or heads of chemistry departments in high schools. Each item was marked as: 1, habit; 2, model; or 3, to be omitted as undesirable. Of the 102 items 56.2% averaged as habits, 32.5% averaged as models and 10.4% were undesirable. From the replies to his questionnaire Horton then ranked the 55 techniques judged to be desirable as habits.

1 Horton, Ralph E., Measurable Outcomes of Individual Laboratory Work in High School Chemistry. (Teachers College Contribution to Education, No. 303.) New York: Bureau of Publications, Teachers College, 1928, p. 49.
2 Ibid., p. 74.

In connection with the study Horton¹ also prepared a pencil and paper test of some fifteen diagrams and twelve statements of laboratory preparations or procedures. The student was tested on his ability to match twelve of the fifteen diagrams with the twelve statements.

In the Horton study no other pencil and paper tests pertaining to laboratory achievement were used, and no attempt was made to correlate the results of the practical tests with the pencil and paper test.

TESTS OF OBJECTIVES

Hendricks² has published some nine test items on outcomes of instruction in the chemical laboratory. The items, while not sufficient in number to form a reliable test, have published validity indices ranging from .50 to .25. Each item is prefaced by a statement of the outcome of instruction to be tested.

1 Horton, Ralph E., op. cit., p. 74.
2 Hendricks, B. Clifford, "Pencil and Paper Tests in the Laboratory," Journal of Chemical Education, 22:543-46, November, 1945.

If a catalogue of outcomes and test items could be compiled for laboratory sciences, then reliable and valid tests could be assembled from these items.¹

Numerous tests have been prepared on those aspects of science teaching that are considered fundamental, and some of these are excellent. However, most of these tests measure the outcomes of science that are achieved jointly by classroom methods and the laboratory. In fact, it is conceivable that in many cases good classroom instruction without any laboratory work would show high returns on some of these tests.
The following is a sample item from a test by Hendricks, Tyler and Frutchey.²

    In one experiment carbon monoxide and hydrogen were heated under pressure and a catalyst to 350°C. In a second experiment, under the same conditions, but with the temperature at 1500°C, will there be any difference in the reaction and why?

    (a) The reaction in the second experiment will proceed less rapidly ( )
    (b) A smaller amount of methanol will be obtained in the first experiment than in the second ( )
    (c) The reaction in the second will proceed more rapidly than in the first ( )
    (d) The amount of methanol will be the same in each experiment ( )
    (e) A larger amount of methanol will be obtained in the first experiment ( )

    Check the statements that give the reasons for the answers you checked above.

    (1) Temperature has no effect on rates of reaction in these experiments ( )
    (2) In this reaction an increased temperature favors the rate of reaction decomposing the product ( )
    (3) Some catalysts retard rates of chemical change ( )

1 Mallinson, George G., "The Implications of Recent Research in Teaching of Science at the Secondary School Level," Journal of Educational Research, 43:321-42, January, 1950.
2 Hendricks, B.C., Tyler, R.W., and Frutchey, F.P., "Testing Ability to Apply Chemical Principles," Journal of Chemical Education, 11:611-3, November, 1934.

By learning the Laws of Mass Action and gaining a full understanding of them one should be able successfully to answer questions of this type. It is conceivable that experimental evidence would help mentally to fix a principle, thus assisting in answering statement 4; but it is not necessary.

Buckingham and Lee¹ prepared a unique scheme for testing unified concepts in science. The test consisted of four parts:

(1) The student answered true-false items on the field of science to be tested.
(2) He checked those statements that he would require in order to write a theme on the field being tested.
(3) He added any significant principles he would require in his essay.
(4) He wrote the essay unifying the scientific concepts in parts 2 and 3.

1 Buckingham, Guy E., and Lee, Richard E., "A Technique for Testing Unified Concepts in Science," Journal of Educational Research, 30:20-27, September, 1936.

The method would seem to be worthy of consideration for the purpose of testing ability to write laboratory reports.

The open book method has been suggested by Quam,¹ and he gives a sample test. This method is useful in the classroom, but has difficulties for departmental or standardized examinations because the textbooks are not uniform. Such a test probably does measure the student's ability to use reference sources, but may also indicate his familiarity with his own text. By using diagrams or tables one might adapt the method to test ability to apply principles or to reason with scientific materials.

One of the objectives of science teaching which it is difficult to test is the ability to use the scientific method. A student may understand a general statement of the steps to be followed in the scientific method and still not be able to outline the specific steps in a logical manner or execute the procedures necessary to complete an investigation. Again, one may possess the habit of logical thinking, so necessary to apply the method, and yet lack the patience to complete an investigation.

The best test of the ability to use the scientific method is to carry out an investigation, even at the high school level, according to "the method". Keeslar's² statement of the elements of the scientific method would be a good basis from which to evaluate an investigation.

1 Quam, G.N., "Neglected Types of Examinations," Journal of Chemical Education, 17:363-5, August, 1940.
2 Keeslar, Oreon, "Elements of the Scientific Method," Science Education, 29:273-8, December, 1945.

To get around the difficulty of the time element, however, one could use tests already developed. Such tests usually measure isolated elements of the method such as attitudes,¹ ability to apply principles,² or the ability to interpret experimental data.³ It would be interesting to know how well a battery of tests of elements of the scientific method would predict the ability to apply the method in its entirety.

Webb and Beauchamp⁴ devised an interesting test in laboratory resourcefulness. It was of the individual type, practical in nature, requiring the minimum of materials but considerable time to administer. The thirteen items were tabulated in order of difficulty. Laboratory resourcefulness did not find a place in the list of objectives in Appendix A, although it merits mention as an objective. It would seem that a practical test in laboratory resourcefulness could be extended and further study made in this phase of training in science. One could envisage parallel items of a pencil and paper test that would reduce the time and labor in evaluating laboratory resourcefulness.

1 Ter Keunst, John, and Bugbee, Robert E., "A Test on the Scientific Method," Journal of Educational Research, 36:489-501, March, 1943.
2 Hendricks, Tyler and Frutchey, op. cit., 11:611-3.
3 Hendricks, B. Clifford, "Measuring the Ability to Interpret Experimental Data," Journal of Chemical Education, 13:62-4, February, 1936.
4 Webb, H.A., and Beauchamp, R.V., "A Test of Laboratory Resourcefulness," School Science and Mathematics, 22:259-6?, March, 1922.

STANDARDIZED LABORATORY TESTS

Standardized laboratory tests have been scarce, and standardized tests in chemistry have had few items pertaining to the laboratory. In most instances the same criticism is applicable, viz., these tests measure learning of a factual nature that could be achieved by studying a text or laboratory manual with diagrams of traditional laboratory experiments.

One of the first of these to be published was a test by Persing¹ in which the items were related chiefly to the preparation and collection of gases.

The Stanford Aptitude Test² has some ingenious test items that could be adapted to testing a number of the objectives of chemistry instruction.

The Ruch-Popenoe General Science Test³ is very factual and tests little of the other objectives of the laboratory.

1 Persing, K.M., Persing Laboratory Chemistry Test. (Form A), Bloomington, Ill., Public School Publishing Co.
2 Zyve, D.L., Stanford Scientific Aptitude Test for High School and College Students. Stanford, Cal., Stanford University Press.
3 Ruch, G.M., and Popenoe, H.F., Ruch-Popenoe General Science Test. Yonkers-on-Hudson, New York, World Book Co.

The University of Chicago Tests in Educational Progress in Biological Sciences¹ have items that are excellent for testing outcomes of the biology laboratory and are much better than the physical science counterpart for a similar purpose. This, it should be said, is in no way a condemnation of the latter test, which is excellent for testing many of the general objectives of the subject.

The paucity of good standardized tests in laboratory performance makes it desirable to have studies conducted to improve the situation, not only in chemistry, but also in all laboratory sciences.
Only when this is done may the revisions in our teaching methods be instituted with a background of knowledge based on experimental evidence.

SUMMARY

Of those studies reviewed in connection with testing the outcomes of laboratory instruction, Horton's² is the only one that considers outcomes other than those tested in a typical high school chemistry examination.

A number of excellent tests dealing with objectives in chemistry instruction have been published. Most of these tests, however, measure factual information, or the attainment of objectives that may be achieved in part by classroom instruction and in part by laboratory work.

1 University of Chicago, Tests in Educational Progress in Biological Sciences. (Study of Educational Progress), Chicago, University of Chicago.
2 Horton, Ralph E., Measurable Outcomes of Individual Laboratory Work in High School Chemistry. (Teachers College Contribution to Education, No. 303.) New York: Bureau of Publications, Teachers College, 1928, p. 105.

The few standardized laboratory tests listed by publishers, and the standardized chemistry tests reviewed for this study, appear to test few, if any, of the outcomes of objectives achieved solely by laboratory chemistry.

CHAPTER III

THE PROCEDURE

OBJECTIVES

Before it was possible to proceed with the preparation of the testing devices it was necessary to have a list of acceptable objectives. For the purpose of this study, eight lists of objectives of laboratory chemistry were studied in order to choose those general objectives ranked most often.

The list of objectives for teaching of chemistry in the 46th yearbook of the National Society for the Study of Education¹ was taken as a basis. These objectives were broken down into more specific ones and in some cases reworded.
To these were added any additional ones from the seven other sources. A frequency distribution was made of the listed objectives, which were then written in order of recurrence and examined to determine which were applicable to training in the laboratory.² From the list the following were chosen as those that should be distinctly achievable in the laboratory, as compared to others whose achievement accrues in part, at least, from daily class methods of science teaching.

1 National Society for the Study of Education, Science Education in American Schools (Part I), Chicago: The Society, 1947, p. 25.
2 See Appendix A, p. 74.

OBJECTIVES FOR INSTRUCTION IN THE HIGH SCHOOL CHEMISTRY LABORATORY

1. Ability to perform basic laboratory skills.
2. Ability to select appropriate materials and apparatus.
3. Ability to make accurate observations.
4. Ability to recall and use facts that are an outcome of laboratory instruction.
5. Ability to make an accurate record of observations.
6. Ability to write an acceptable piece of scientific literature or a report.

For the present study it was decided to concentrate on objectives 1, 2 and 4 of the above list. The three general objectives were separated into specific objectives. The specific laboratory skills chosen as most suitable for the present purpose were those of the Horton list of fifty-five basic techniques.¹ The ability to select suitable materials and apparatus could be tested to a degree in appraising the basic manipulations. The facts that are the outcome of the laboratory instruction would have to be determined prior to the experiment. In order that they be curricularly valid it was necessary to choose these objectives from the chemistry program (entitled Chemistry 91) of the students involved. Tests relating to the outcomes of the actual experiments of the chemistry course would serve to assess objectives 2 and 4.
1 Horton, Ralph E., Measurable Outcomes of Individual Laboratory Work in High School Chemistry, New York: Bureau of Publications, Teachers College, Columbia University, 1928, p. 49.

THE CRITERION TEST

It was decided in setting up the criterion to use a revision of tests prepared by Horton¹ supplemented by a test of Chemistry 91 laboratory learning, which was prepared for this study.

Horton's test entitled "Individual Performance of Laboratory Manipulations"² consists of seven parts, each of the first four parts of which could be administered to a class of twenty-five students in one period of fifty minutes. Items five to seven would require about fifteen minutes per pupil. A testing procedure of this latter type would make it difficult to obtain comparable results in testing a class of twenty-five pupils and would also consume eight periods of about fifty minutes. In order to reduce the time consumed and also to cover the majority of the class in one period, items five to seven of Horton's original test³ were revised into three more balanced tests⁴ of approximately the same elements.

1 Horton, R.E., op. cit., p. 74.
2 Ibid., p. 74.
3 Ibid., p. 74.
4 See Appendix C, p. 80.

The test of learning of Chemistry 91 from laboratory experiments was designed so that each student being tested worked simultaneously on a different test item and the group was rotated every four minutes. In this way ten students could do ten test items in forty minutes. By duplicating the test materials twenty pupils per period could be accommodated, providing there were laboratory places available.

The score on the revision of the Horton test totals thirty-four (34) items and that of the Chemistry 91 Laboratory test totals thirty (30) items. Adding the two scores sets up a measure of the achievement of at least two phases of Chemistry 91 laboratory work. It may be argued that in combining the two tests standard scores rather than raw scores should be added.
However, since the two tests are aspects of the same criterion and since the rank orders of the students in the tryout tests did not differ materially, it was deemed satisfactory to add the raw scores.

All reliability formulas are based on the assumption that the greater the number of items the greater the reliability. It is advisable, therefore, to lengthen tests with valid items, but not beyond the point that they become unwieldy. Some sixty items requiring at least sixty minutes of each pupil's time and requiring an estimated average of twenty minutes per pupil of the teacher's time is as much as the traffic will bear. However, Davis claims that

    so great is the importance of having a criterion variable which measures the real objective of a selection program that no effort should be spared to obtain quantitative measurements of as many elements of the real objective - the ultimate criterion - as possible, even if these measurements can be made with a reliability only slightly greater than zero.¹

1 Davis, F.B., Utilizing Human Talent, Washington, D.C., American Council on Education, 1947, p. 64.

While Davis is speaking chiefly of a personnel selection program, he nevertheless indicates that his statement is applicable to tests in academic subjects, and he amplifies this point at some length. In fact, one of the chief implications of the armed services Testing Program of the United States is the necessity of validating tests and school marks against realistic criteria.

It should be pointed out again that Horton's criterion test was the result of very careful screening of objectives from numerous texts by a jury of competent chemists and teachers. The addition of items from the Chemistry 91 course of British Columbia would serve to include objectives of practical chemistry that are not solely manipulative but also interpretative.
The reliability coefficient of Horton's test¹ is given as .78 by the split-half method; the practical test for Chemistry 91 on a trial run gave a value for Rho of .84 (n = 32). The criterion, then, is a composite test that appears to fulfil the four prime considerations of validity, reliability, face validity and practicability.

1 Horton, op. cit., p. 74.

THE PENCIL AND PAPER TEST

Items for this test were prepared to parallel as nearly as possible the actual items tested in the two parts of the practical tests. This was impossible in some instances, since the choices in one question would undoubtedly have acted as cues for another related question. However, it does not matter too much that all items are not parallel, as the real test of the predictive value of the pencil and paper test is how well it correlates with the criterion.

After some preliminary consideration, it was decided to adhere to the multiple choice type of question in Part I, and items were prepared in this form with five choices per item. With fewer than five choices per item the factor of guessing is rather too high. More than five choices increases the time to administer the test, and the gain in reducing guessing is not worth the time consumed. Guessing is better handled by composing more attractive misleads. Furthermore, the difficulty of preparing six or seven choices of an attractive nature is great. It is obvious that a test item with several misleads so poor that no student chooses them becomes really a test item of only a few choices.

All items were revised in an attempt to minimize any ambiguity that appeared to exist as well as to eliminate cues. Where items had several parts, care was taken to avoid the situation where a given wrong answer in one part would affect the calculations of a later answer.
The pencil and paper test in its final form may be seen by referring to Appendix D¹ of this study. However, several typical questions are cited below.

1 See page 93.

1. A typical multiple-choice item to parallel a criterion test item

Criterion test item

The student was confronted with a beaker of solution, a glass plate, a stirring rod, a vial of red litmus paper, and a vial of blue litmus paper. Pinned to the table was the following question.

    A student has been preparing common salt by neutralization. Use the stirring rod and litmus paper to test the solution in the beaker marked '4'. Answer this question on your sheet. Should the student add a solution of (1) acid, (2) base, (3) neither. ( )

The students had been issued answer sheets¹ and had been instructed to follow the directions and to write the answers to the practical questions in the appropriate spaces on the answer sheets.

1 See Appendix F, p. 102.

Group pencil and paper test item to parallel the preceding item

    A student was preparing common salt by neutralization. On testing with litmus paper he found that the pink litmus paper became blue. What should he do?
    (1) Add a few drops of acid and test again.
    (2) Add a few drops of base and test again.
    (3) Add nothing, it is neutral.
    (4) Remove the litmus paper before evaporating.
    (5) Add a few drops of salt water to replace those used in testing. ( )

2. A typical multiple choice item not parallel to the criterion test item but later shown to have a high internal consistency and validity

    If you wished to compare the rates of reaction at two different temperatures, the most convenient temperatures to use would be: (1) 20° and 100°, (2) 10° and 90°, (3) 20° and 80°, (4) 4° and 100°, (5) 30° and 50° . . .
( )

Part II of the pencil and paper test was composed of Horton's¹ test matching diagrams of laboratory situations with statements describing those circumstances. Little revision was attempted except to rearrange the statements in order of difficulty after the tryout.

3. Typical matching item to parallel criterion test item

Criterion test item

    Prepare a filter and filter one-third of a test tube of a liquid in bottle number I into a beaker.

The pupil's work was scored on a check sheet.²

Group pencil and paper test item

    Apparatus to obtain quickly a suspended solid from a solution . . . ( )

1 Horton, Ralph E., Measurable Outcomes of Individual Laboratory Work in High School Chemistry, New York: Bureau of Publications, Teachers College, Columbia University, 1928, pp. 72-3.
2 See Appendix E, p. 101.

4. Typical matching item not parallel to the criterion test but later shown to have a high internal consistency and validity

    Apparatus to prepare hydrogen . . . ( )

Preliminary Administration

Two classes of Chemistry 91 students of Britannia High School were tested in the Spring of 1951. One-half of the pencil and paper test was administered and marked but the papers and marks were withheld. The criterion test was then administered over several weeks. Care was taken to eliminate as much as possible a leakage of test information by:

1. Testing all students of a class on one particular item at a time.
2. Testing the two classes on the same items on the same half of the school day.

Following the practical test, the remainder of the pencil and paper test was given.

In order to improve the test it was subjected to the following analysis:

1. Correlation with the criterion.
2. Item analysis.
3. (a) Validity coefficients, (b) Difficulty coefficients.
4. Reliability coefficients.
5. Analysis of responses.
6. Editing of items.
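For the first analysis in this list, the study reports rank-difference correlations (Rho). As a minimal sketch only, assuming the usual rank-difference formula rho = 1 - 6Σd²/(n(n² - 1)), the computation might be written as follows. The function names and the scores are hypothetical and are not data from this study.

```python
def average_ranks(scores):
    """Rank scores with the highest score as rank 1; tied scores share the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_difference_rho(x, y):
    """Rank-difference correlation between two equally long lists of scores.

    With tied ranks this formula is only an approximation.
    """
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Hypothetical scores for five students on the written test and the criterion.
written = [31, 28, 40, 22, 35]
criterion = [30, 33, 41, 20, 36]
print(round(rank_difference_rho(written, criterion), 2))
```

The sketch ranks each set of scores separately and then works only with the differences in rank, which is why the method suits small groups such as the tryout classes.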
Correlation with the Criterion

In the preliminary tryout, by rank difference a correlation of .63 (n = 25) was shown between the test and the practical criterion test. Following the item analysis, a second correlation was computed by the same method after deleting all items from the pencil and paper test that had a validity coefficient of less than .15. Rho for this calculation was .72.

Item Analysis

For this analysis it was decided to use Thorndike's chart¹ adapted from Flanagan's abac.² This chart requires the top and the bottom twenty-seven percent of the papers to be analyzed so as to give the percentage of successful responses for each item in the upper and lower groups. From these two values the validity coefficient (Pearsonian r) can be read off the chart.

1 Thorndike, Robert L., Personnel Selection, New York: John Wiley and Sons, Inc., 1949, Appendix B, pp. 347-351.
2 Flanagan, J.C., "General Considerations in the Selection of Test Items and a Short Method of Estimating the Product-Moment Coefficient from Data at the Tails of the Distribution", Journal of Educational Psychology, 30:678, December, 1939.

In a study reported by Kelley¹ it is shown that the ratio of the obtained difference to its standard error is a maximum when the top and bottom groups each include approximately twenty-seven percent of the population tested. Kelley states that the most satisfactory item validity index based on the upper and lower twenty-seven percent is the estimate of the coefficient of correlation between item and test obtainable from tables prepared by Flanagan.²

Thorndike³ points out that, if the items in a test blank are examined, they will be found to cover a rather narrow range in validity coefficients. An item with a validity coefficient as high as .30 usually represents an outstandingly valid item. The whole range of item validities from the most to the least may cover no more than thirty points.
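The upper and lower twenty-seven percent procedure can be illustrated in code. The following is a sketch under stated assumptions, not the procedure as actually carried out in this study: the function name, the 0/1 scoring layout and the rounding of the group size are all hypothetical, and the chart lookup that converts the two percentages into a validity coefficient is not reproduced.

```python
def upper_lower_analysis(responses, fraction=0.27):
    """Percent passing each item in the upper and lower groups.

    responses: one list per student of item scores, 1 for right and 0 for wrong
    (an assumed layout). Returns one tuple per item:
    (percent passing in upper group, percent passing in lower group, difference).
    """
    n_students = len(responses)
    # Assumption: the group size is 27 percent of the papers, rounded.
    group_size = max(1, round(fraction * n_students))
    # Rank papers by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    upper, lower = ranked[:group_size], ranked[-group_size:]

    results = []
    for item in range(len(responses[0])):
        pct_upper = 100 * sum(paper[item] for paper in upper) / group_size
        pct_lower = 100 * sum(paper[item] for paper in lower) / group_size
        results.append((pct_upper, pct_lower, pct_upper - pct_lower))
    return results


# Hypothetical papers: ten students, two items.
papers = [[1, 1]] * 5 + [[0, 1]] * 5
for number, row in enumerate(upper_lower_analysis(papers), start=1):
    print(number, row)
```

The two percentages for each item are the values that would be entered on the chart. The difference between them is only a rough guide to discrimination and is not itself the validity coefficient.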
Kelley suggests that an analysis for practical purposes should consist of the above method coupled with an index of difficulty based on the average of the item difficulty of the upper and lower groups.

Reliability Coefficient

The reliability coefficient of the unedited pencil and paper test, computed by the Kuder-Richardson formula, was .75. The mean of the test was 18.6 and the standard deviation was 4.6.

1 Kelley, T.L., "The Selection of Upper and Lower Groups for the Validation of Test Items", Journal of Educational Psychology, 30:17-24, January, 1939.
2 Flanagan, op. cit., p. 678.
3 Thorndike, op. cit., p. 245.

The First Revision

The first revision of the pencil and paper test was made to include:

1. Those items of a validity of .15 or better, arranged in order of difficulty where possible.
2. A revision of items where a considered opinion indicated changes that would probably increase the validity by removing the ambiguity or by substituting a more suitable mislead for one that discriminates in the reverse direction.

Some items were deleted and some new items were cast.

The Second Revision

In order further to improve the pencil and paper test it was administered to some sixty-four Senior Matriculation (Grade 13) students in three Vancouver High Schools, viz., King Edward, John Oliver and North Vancouver. In order to save a year's time, the test was administered at the end of September to the students of Chemistry 100 who had taken Chemistry 91 or its equivalent the previous year. By testing in the fall it was felt that, while the results might not be as high as if the students had been tested in June, nevertheless, useful information would be at hand for the final revision of the test.
The outcome of the analysis is as follows:

    Number of items          64
    Number of candidates     62
    Mean score               26.45
    Standard deviation       5.87
    r_t                      .73

From the item analysis it was possible to prepare a final paper of fifty items with internal consistencies of .20 or better. For the final draft each item was edited in order to replace misleads that failed to discriminate, or to recast them. The revised items were then listed in order of difficulty, re-edited and mimeographed.¹

The final draft of the paper resulted in twenty of the fifty items being related to the fifty-five basic techniques of Horton's study.² The remaining thirty reflected objectives of Chemistry 91 laboratory learning.

THE LABORATORY NOTEBOOKS

Bulletin IX of the Department of Education of British Columbia³ lists some thirty-one experiments which are starred in a list of fifty-eight experiments. It is intimated that the teacher should choose a suitable number of experiments including the starred list. It is possible to combine a number of the starred items into one exercise. Under the heading "Pupil Activities" it is indicated that the starred list is a minimum list and that a record of all experiments be kept in a notebook. In a curriculum directive from the Department of Education it was indicated that twenty experiments written up would constitute an acceptable laboratory notebook provided the instructor certified the book.

1 See Appendix D, p. 93.
2 See Appendix B, p. 95.
3 Bulletin IX, Department of Education, Program of Studies for the Senior High Schools of British Columbia, Victoria, B.C.: 1939, pp. 109-115.
For the present study the first fifteen experiments of the laboratory notebooks of the students used in the investigation were graded and the scores filed with Dr. J.R. McIntosh, School of Education, University of British Columbia, early in March, 1952, and prior to the collection of data for this study.

The Method of Scoring the Laboratory Notebooks

The experiments were each scored out of ten points with one exception where a score of twenty-two was possible. Each score was the subjective judgment of the investigator. The points kept in mind while scoring were:

1. Correct format and good English, including spelling.
2. Accuracy in procedure, materials used, formulas and equations.
3. Neatness, use of tabular outlines, legibility, and the inclusion of graphs and illustrations.
4. Originality of thought in the conclusions.

Errors were marked but no subdivision of marks was made for the above criteria. Many suggestions have been made on developing check-lists for this type of marking, one of which is that the scoring should be one point per item. However, it was felt that the average teacher marks his notebooks with less pains than perfection would require. For this study the method of the average teacher is indicated. A student's mark would be his score on the sum of the fifteen experiments.

THE TEACHER'S ESTIMATES

If teachers' estimates were valid and reliable then it would be most expeditious to use teachers' estimates in place of tests, as the estimates are time-saving and labor-saving. The estimated scores, prepared by one teacher, the investigator, will be used as one means of rating laboratory performance. No generalizations can be made from estimates in this one instance, although the comparisons to be made may be of interest.
The Method of Estimating

Each student was observed, without his knowledge, during several laboratory periods in the month of January and given a subjective score out of a possible 100 points. Scores ranged from 90 to 6. A list of students and their estimated scores was filed with Dr. J.R. McIntosh in February, 1952, prior to gathering data for the investigation.

THE ASSEMBLING OF THE DATA

It appears that pupils discuss factual answers rather than methods or procedures. As a consequence it was decided to administer the criterion test prior to the pencil and paper test. By this arrangement there would probably be less discussion of what was being tested, namely, the method of doing tasks.

If the carry-over from the first test situation to the second test situation was equal in amount and in the same direction for all students, then it should have no effect on the eventual results in determining correlations. However, what the transfer would be one cannot say.

Where possible, answers and scores were withheld from the student. These precautions were taken in an attempt to reduce the effect that the first testing might have on the scores of the second test. Students were promised that a thorough discussion of all the tests would be undertaken after the completion of the testing. They were quite satisfied with the explanation inasmuch as the results would contribute to their Easter grade in Chemistry. In fact, they realized the necessity of secrecy in order not to jeopardize their grades by warning others of the test items prior to testing.

The Chemistry 91 Practical Test

The test was divided into three parts, each of which was administered on successive days. Each class was divided into two parts by lot. The first part of each class was tested in three consecutive class periods on one day. The following day the remainder of each class was tested similarly.
Part two of the test was administered to each half of a class in the same period, the three classes being tested in succeeding periods. Part three of the test was administered to the whole class at one sitting, thus completing the testing of it in three successive periods. In all, this took four days to complete, but did not require all the time of every period. To achieve this end, the investigator took the testees from their regular classes to the chemistry laboratory for the test and they returned to the regular classroom after the testing was completed. The total time that a student was absent from class would approximate fifty minutes.

The parts of the test¹ were as follows:

Part 1. Items one to six inclusive.
Part 2. Items seven to eleven.
Part 3. Items twelve to thirteen.

1 See Appendix C, p. 79.

In administering parts one and two the test materials were set out in triplicate, that is, there were three groups of items, one group for each of four or five testees. Each student took his place at one station and performed the test. At the end of the allotted time each student moved, following chalk arrows on the floor, to the next station, where he performed the next test item, and so on. In this way it was possible to accommodate fifteen students on part one or twelve students on part two of the test at one time. An attempt was made to so place the stations that duplicate test items would be sufficiently far apart to prevent copying. The time was kept with a stop-watch and each student was allowed four minutes per station. At each station there was a printed sheet of instructions pinned to the table and also the required test materials.¹ Each student carried with him an answer sheet² on which he wrote the answers to the test. Students were instructed beforehand on the use of the answer sheet. At the completion of each part of the test the sheets were collected and scored for that part of the test.
The sheets were reissued for the next part of the test at the time of testing.

Part three of the test was done by teacher demonstration. The students wrote their answers on the test blank from questions on the blackboard, each of which was covered until the time of that particular test.

The Revised Horton Test

This test was administered in four parts.

Part 1. Test 1. Items one to eleven on the check-sheet.³
Part 2. Tests 2 and 3. Items twelve to twenty-one on the check-sheet.
Part 3. Test 4. Items twenty-two to twenty-eight on the check-sheet.
Part 4. Test 4. Items twenty-nine to thirty-four on the check-sheet.

1 See Appendix C, p. 79.
2 See Appendix F, p. 102.
3 See Appendix E, p. 101.

The student being tested worked behind a plywood screen at the demonstration bench and was marked by the investigator while the class proceeded with seat-work. Students averaged between three and four minutes per part of the test. In this way it was possible to test between ten and fourteen students in one class period. The students were scored directly on a check sheet using the symbol "1" for a correct response and "0" for an incorrect one. The order of testing students was by a random selection from a list of random numbers prepared by the investigator. The students were tested in the same order for each of the four parts of the test. In order not to over-penalize a student who made a blunder in part of the test, the examiner put the student right after having marked the erroneous procedure. In this test the instructions were printed and pinned to the desk and all necessary material was available.

The Pencil and Paper Test

Students were tested in three consecutive class periods by the investigator. There was no preliminary warning that a test of this nature was to be written, but the students had been told that the laboratory work would be tested for the Easter reports to parents.
The papers were distributed face down after the students had been instructed as to the nature of the test. After the directions had been read and discussed the students were given exactly forty minutes to complete the test. Papers were then collected but no discussion was allowed until all classes had been tested.

SUMMARY

A list of general objectives was prepared for laboratory chemistry as a basis for evaluation of these outcomes. Two tests of chemistry laboratory attainment were devised: a practical criterion test and a somewhat parallel pencil and paper test. Every effort was made to keep these tests valid and reliable. Seventy-two students selected from Britannia High School were rated on the basis of teacher's estimates, laboratory notebooks, the group pencil and paper test and the practical laboratory test. All possible precautions were taken to standardize the testing procedure.

CHAPTER IV

ANALYSIS OF RESULTS

METHODS OF ANALYSIS

The Reliability Coefficient

The reliabilities of all tests were computed by means of the Kuder-Richardson formula. Providing the assumptions upon which it is derived are scrupulously adhered to, this formula will give a value comparable with other methods and will avoid some of their difficulties. However, if these assumptions are not strictly followed then the results will be low. The formula is

\[ r_t = \frac{(S.D.)^2 - \sum pq}{\left(\sum \sqrt{pq}\right)^2 - \sum pq} \cdot \frac{\left(\sum \sqrt{pq}\right)^2}{(S.D.)^2} \]

where:
$p$ is the difficulty of each item, i.e., the percentage correct for each item.
$q$ is $1 - p$.
$S.D.$ is the standard deviation of the test.
$pq$ is the product of the $p$ and the $q$ for one item on the test.
$\sum pq$ is the sum of all the $pq$'s for all the items on the test.
$\sqrt{pq}$ is the square root of the product $pq$ for one item on the test.
$\sum \sqrt{pq}$ is the sum of the $\sqrt{pq}$'s for all items on the test.
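The quantities defined above can be computed directly from a table of item scores. The sketch below is illustrative only. It assumes that the formula combines them as r_t = [(S.D.)² - Σpq] / [(Σ√pq)² - Σpq] × (Σ√pq)² / (S.D.)², that p is expressed as a proportion between 0 and 1 rather than as a percentage, and that the standard deviation is taken over the whole group tested. The function name and the data are hypothetical.

```python
import math


def kuder_richardson(responses):
    """Reliability estimate from 0/1 item scores (one list per student).

    Assumed form:
        r_t = (var - sum_pq) / (sum_root_pq**2 - sum_pq) * sum_root_pq**2 / var
    where var is the squared standard deviation of the total scores.
    Raises ZeroDivisionError if all students have the same total score.
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Squared standard deviation of total scores (population form, an assumption).
    totals = [sum(paper) for paper in responses]
    mean = sum(totals) / n_students
    variance = sum((t - mean) ** 2 for t in totals) / n_students

    # Item difficulty p (proportion correct) and the product pq for each item.
    p = [sum(paper[i] for paper in responses) / n_students for i in range(n_items)]
    pq = [p_i * (1 - p_i) for p_i in p]

    sum_pq = sum(pq)
    sum_root_pq = sum(math.sqrt(value) for value in pq)

    return (variance - sum_pq) / (sum_root_pq ** 2 - sum_pq) * (sum_root_pq ** 2 / variance)


# Hypothetical papers: four students, two items.
print(kuder_richardson([[1, 1], [1, 1], [0, 0], [0, 0]]))
```

When every item has the same difficulty, the ratio (Σ√pq)² / [(Σ√pq)² - Σpq] reduces to k / (k - 1) for a test of k items, so the estimate depends only on the number of items, Σpq and the variance of the total scores.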
For the following reasons the Kuder-Richardson formula was chosen even though the optimum conditions for its use were not present.

1. The time required to administer the criterion test had been held to a minimum and to repeat the test was out of the question. Hence the test-retest procedure to determine the reliability coefficient could not be considered.
2. To avoid carry-over from one administration of the test to another, a considerable time lapse would be required. While this plan might have been arranged, there was a danger that an increase in laboratory knowledge, due to instruction in the meantime, would materially affect the scores and lower the reliability coefficient obtained for the test.
3. The test items were so dissimilar that it would have been difficult to divide the test into two comparable halves for the purpose of using the split-half method of computing reliability coefficients.
4. The tests used were not long and hence, to have reduced them to as few as twenty-five items would make them too short for the purpose of computing reliability coefficients.

The Internal Consistency of Items

The basis of internal consistency is the degree to which each item differentiates those students who are high from those who are low on the standard, i.e., the performance on the test. Each item purports to assess, in part, some simple aspect of ability. Also, the right answer for each item can be determined in advance, so it is possible to score the items on the test by a key prepared beforehand. In validating the test it is appropriate to discover to what extent each item measures the same abilities as does the test as a whole. Nevertheless, if the test is to have breadth and scope, the indices may not be expected to be extremely high, or conversely, if the indices are very high they must be overlapping in their function as well as highly reliable.
When an item index is very low, the item must either be very unreliable or measure functions quite different from the other items on the test. So it may be said, generally speaking, that items with extremely low or negative indices are undesirable, but those of intermediate size have their place along with those that are high.

Item Analysis Indices

There are two types of situations: (1) where the performance on an item is related to a continuous measure, for example, the test score of which the item is a component; (2) where performance is being related to a dichotomy, for example, comparing the performance on an item in two groups dichotomized, say, at the median or at some level of difficulty. An adaptation of this second situation will be used in this study.

Item Indices Based on a Continuum Dichotomized for Convenience

If the testees are divided at the median the upper group may be expected to score more highly on an item than the lower group. However, if two extreme groups of, say, five percent of the total group are taken at the upper and lower level, a much greater discrimination may be expected than in the previous case. Kelley¹ has shown that the ratio of the obtained difference to the standard error of the difference is a maximum when approximately twenty-seven percent of the total testees determines the upper and lower group.

Flanagan² has prepared a table of product-moment correlation coefficients on the assumption that the variables responsible for item success and test score are normally distributed.
One should note that in these coefficients, equal differences do not have the same significance at different levels, that is, the change from .10 to .15 is not equal to a change from .50 to .55. According to Thorndike³, "an item with a validity coefficient as high as .25 or .30 usually represents an outstandingly valid item." On the basis of 72 cases the one percent level of confidence is .30 and the five percent level is .23. Hence, any item over .30 is outstanding and any below .23 should perhaps be rejected as not being significantly different from zero.

1 Kelley, T.L., "The Selection of Upper and Lower Groups for the Validation of Test Items", Journal of Educational Psychology, 30:17-24, January, 1939.
2 Flanagan, J.C., "General Considerations in the Selection of Test Items and a Short Method for Estimating the Product-Moment Coefficient from the Data at the Tails of the Distribution", Journal of Educational Psychology, 30:674-80, December, 1939.
3 Thorndike, R.L., Personnel Selection, New York, John Wiley and Sons, Inc., 1949, p. 245.

On the basis of the present study, a comparison of the item indices computed by three methods shows them to be in agreement. The following three methods were used, of which the results of the first will be reported:

1. The upper and lower groups method according to Kelley¹ and Flanagan.²
2. A method of computing internal consistency utilizing the whole group and using the formula:

\[ r = \frac{np - nw}{pq} \]

   where
   $r$ is the validity coefficient.
   $p$ is the proportion of students passing an item, stated as a percent.
   $q$ is $100 - p$, the proportion of students failing an item, stated as a percent.
   $n$ is 100.
   $w$ is the number of students in the group q who passed the item.
3. The point biserial coefficient of correlation.

1 Kelley, op. cit., pp. 17-24.
2 Flanagan, op. cit., pp. 674-80.

The Difficulty of Items

The difficulty of items is an important consideration in a
Obviously, items that are passed by all testees do not discriminate, nor do items failed by all. For test construction, item difficulties require the following several considerations:

1. The highest reliability is achieved when item difficulty is at the fifty percent level, as the product of those passing and failing is at a maximum.
2. The greatest discrimination occurs when half the testees pass an item.
3. According to Adkins,
   As a general rule, the average item difficulty in a test should correspond to the average ability of the subjects; i.e., the items should be such that, on the average, about half the subjects will answer correctly.¹

If one wishes to select the top seventy percent then the difficulty should cluster around an index of .70. However, if the wish is to spread the whole group tested in rank order, then it is better to have the items of such difficulty that they range from easy to difficult with the majority at the average level of difficulty for the group. Since the purpose of each of the tests in this investigation is to rank all the students, it would be best to have item difficulties range from easy to hard with a cluster near the .50 index level.

1 Adkins, D.C., Construction and Analysis of Achievement Tests, Washington, D.C.: U.S. Government Printing Office, 1947, p. 147.

THE DATA

The results of the tests are assembled in Appendix G, in which are listed, in order of scores on the criterion test:

1. Intelligence Quotient (IQ) as taken from student record cards. The quotients were based mainly on the Otis Self-Administering Test.
2. Raw score on the criterion (maximum 64 points).
3. Raw score on the group pencil and paper test (maximum 50 points).
4. Total score on the students' notebooks (maximum 162 points).
5. The teacher's estimates (maximum 100 points).
6.
Revised Horton Test (maximum 34 points).
7. Chemistry 91 Laboratory Test (maximum 30 points).

Owing to the absence, at various times, of different testees, scores on all measures are available for seventy-two out of some ninety participants.

THE CRITERION

On the assumption that the criterion conforms with a number of the objectives of the course in Chemistry 91, it can be said to have curricular validity. However, a study of the internal consistency and difficulty of the items, as well as the reliability of the criterion, would permit a better judgment to be made of the ability of the test to do its appointed task.

The Reliability Coefficient (Criterion Test)

Table I, based on the results in Appendix G, shows the reliability of the sixty-four item criterion test to be .82.

TABLE I
SOME STATISTICAL MEASURES OF THE TESTS OF LABORATORY OUTCOMES

Measure                   Range    Mean     S.D.    S.E.m   S.E.sd   r_t
Criterion                 50-20    33.75    6.571   0.775   0.547    .823
Pencil and Paper Test     41-15    28.08    6.316   0.756   0.526    .760
Laboratory Notebooks      148-46   114.86   21.01   2.477   1.750    .117
Teacher's Estimates       86-6     52.94    18.96   2.235   1.580    .470
Revised Horton Test       24-8     17.69    3.75    0.442   0.312    .590
Chemistry 91 Lab. Test    26-7     16.22    4.20    0.495   0.350    .610

S.D. refers to the standard deviation.
S.E.m refers to the standard error of the mean.
S.E.sd refers to the standard error of the standard deviation.
r_t refers to the Kuder-Richardson reliability of the measure.

The criterion, therefore, appears to be sufficiently reliable to give a true picture of the status of student achievement. In order to raise the reliability coefficient to .90 it would be necessary to lengthen the test from 64 items to 124 items. The formula used was
$$n = \frac{r_{nn}(1 - r_{11})}{r_{11}(1 - r_{nn})}$$
where n is the number of times the test must be lengthened to attain $r_{nn}$,
$r_{nn}$ is the reliability coefficient of the lengthened test,
$r_{11}$ is the reliability coefficient of the original test.

Such an increase in the length of the test would make it too unwieldy for testing any reasonably large number of subjects.

The Internal Consistency of Items (Criterion Test)

On the basis of Flanagan's¹ table, and on the basis of the formula² $r = \frac{pq - nw}{pq}$, internal consistencies were computed for the criterion test. A comparison is given in Table II. From this it will be seen that the Flanagan indices vary from -.23 to .71. Of these, six are negative and seventeen are positive but below .23.

1 Thorndike, R.L., Personnel Selection, New York, John Wiley and Sons, Inc., 1949, pp. 347-351.
2 See page 43.

TABLE II
CRITERION TEST: COMPARISON OF INTERNAL CONSISTENCIES BY THE METHOD INDICATED

                                     Flanagan r      r = (pq - nw)/pq
Range                                .71 to -.23     .79 to -.12
Median index                         .32             .42
Number of items exceeding .23        40              26
Total items                          64              64

The Difficulty of Items (Criterion Test)

In computing data for Table II, item difficulties emerged routinely in the calculation of internal consistencies. The items of the criterion range in difficulty from .04 to .96 with a median of .53, a mean of .54 and one-half the items between .40 and .75. The test, therefore, is neither too difficult nor too easy, and has a desirable distribution of item difficulties.

THE PENCIL AND PAPER TEST

The Reliability of the Pencil and Paper Test

The reliability as determined by the Kuder-Richardson formula¹ is .76. To produce a reliability of .90 would require a test of 142 items, that is, another 92 items equivalent in every sense to the original 50. The formula¹ used was
$$n = \frac{r_{nn}(1 - r_{11})}{r_{11}(1 - r_{nn})}$$

1 See page 39.

The Internal Consistency of Items (Pencil and Paper Test)

The internal consistencies of the items are compared in Table III.
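The lengthening formula used above for both the criterion and the pencil and paper test is the Spearman-Brown relation solved for n. The sketch below reproduces the two item counts quoted in the text from the reported reliabilities (.823 and .76) and a target of .90; the function names are illustrative, and rounding to the nearest whole item is an assumption.

```python
def lengthening_factor(r_original, r_target):
    """Number of times a test must be lengthened to reach r_target:
    n = r_nn * (1 - r_11) / (r_11 * (1 - r_nn))."""
    return (r_target * (1.0 - r_original)) / (r_original * (1.0 - r_target))


def lengthened_reliability(r_original, n):
    """Reliability of a test lengthened n times with equivalent items
    (the same relation, solved for the lengthened reliability)."""
    return (n * r_original) / (1.0 + (n - 1.0) * r_original)


# Criterion test: 64 items, reliability .823, target .90.
criterion_items = round(64 * lengthening_factor(0.823, 0.90))
# Pencil and paper test: 50 items, reliability .76, target .90.
pencil_items = round(50 * lengthening_factor(0.76, 0.90))

print(criterion_items, pencil_items)
print(round(lengthened_reliability(0.76, 3), 2))  # tripling the 50-item test
```

Tripling the 50-item test gives a reliability of about .90, which agrees with the 142-item figure being close to 150.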
The three methods serve to screen out the same items in most cases.

TABLE III
PENCIL AND PAPER TEST: COMPARISON OF INTERNAL CONSISTENCIES BY THE METHODS INDICATED

                                 Flanagan r     r = (pq - nw)/pq    Point Biserial
Range                            .31 to .00     .60 to -.11         .55 to -.08
Median index                     .37            .28                 .31
Number of items over .23         37             28                  33
Total items                      50             50                  50

This gives us confidence in those items that are consistently good.

1 See page 47.

The Validity Coefficients of Items (Pencil and Paper Test)

The validity coefficients were determined by the same three methods as the internal consistencies, except that the individual items were compared with the total scores on the criterion test rather than with the total scores on the test itself. It will be noted by comparing Table IV with Table III that the validities of the items tend to be somewhat lower than the internal consistencies.

TABLE IV
PENCIL AND PAPER TEST: COMPARISON OF ITEM VALIDITIES BY THE METHODS INDICATED

                                 Flanagan r     r = (pq - nw)/pq    Point Biserial
Range                            .65 to -.33    .50 to -.22         .63 to -.19
Median index                     .28            .22                 .18
Number of items over .23         30             22                  20
Total items                      50             50                  50

The Difficulty of Items (Pencil and Paper Test)

As before, the indices of item difficulty were calculated in the preparation of item validities. The range of difficulty is from .86 to .08 with a median of .61 and a mean of .57. The middle half of the indices ran from .44 to .71. These results compare favorably with those of the criterion.

Correlations of the Pencil and Paper Test

Table V shows the correlations between the various measures and the pencil and paper test, computed by the product-moment method. The correlation between the pencil and paper test and the criterion is .69 ± .06 from the data available.
This value may be taken as the validity coefficient for the pencil and paper test as a whole, since the practical criterion test is the surest measure of the outcomes of laboratory instruction that can be obtained.

The predictive value of a correlation coefficient of .69 can be inferred from the standard error of estimate. For the pencil and paper test predicting the criterion test the standard error of estimate is 4.55, calculated from the formula
$$\mathrm{S.E.}_{est} = \mathrm{S.D.}\sqrt{1 - r^{2}}$$
where $\mathrm{S.E.}_{est}$ is the standard error of estimate,
S.D. is the standard deviation of the pencil and paper test,
r is the correlation of the pencil and paper test with the criterion test.

The value 4.55, computed from the above formula, which is based on Kelley's coefficient of alienation, may be interpreted as follows: when the pencil and paper test is used to predict the criterion, the chances are 68 out of a hundred that the true score would lie within ± 4.55 points of the predicted score. Stated another way, a correlation of .69 has an index of forecasting efficiency of 28 percent.

It should be noted that the pencil and paper test predicts the criterion to a much greater extent than it does either the manipulation of apparatus (Revised Horton Test) or the knowledge of laboratory situations (Chemistry 91 Laboratory Test). The explanation is probably twofold. Since both the criterion and the experimental test were prepared with a view to consisting of two dissimilar elements, it would be expected that the correlation between the experimental test and the criterion would be greater than between the experimental test and the two parts. The fact that the two parts of the criterion are not long would also contribute to keeping the correlations low.
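The standard error of estimate and the index of forecasting efficiency discussed above both follow from the coefficient of alienation, the square root of 1 - r². The sketch below is a minimal illustration using the figures reported in the text; the function names are hypothetical, and the small gap between the computed value and the reported 4.55 is presumably due to rounding of the inputs.

```python
import math


def coefficient_of_alienation(r):
    """k = sqrt(1 - r^2), the share of error not removed by prediction."""
    return math.sqrt(1.0 - r * r)


def standard_error_of_estimate(sd, r):
    """S.E.est = S.D. * sqrt(1 - r^2)."""
    return sd * coefficient_of_alienation(r)


def forecasting_efficiency(r):
    """E = 1 - sqrt(1 - r^2), as a proportion (multiply by 100 for percent)."""
    return 1.0 - coefficient_of_alienation(r)


# Values from the text: S.D. = 6.316 (pencil and paper test), r = .69.
print(round(standard_error_of_estimate(6.316, 0.69), 2))  # close to 4.55
print(round(100 * forecasting_efficiency(0.69)))          # about 28 percent
print(round(100 * forecasting_efficiency(0.47)))          # about 12 percent
```

The last line uses the correlation of .47 between the teacher's estimates and the criterion, for comparison with the 28 percent obtained for the pencil and paper test.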
TABLE V
PRODUCT-MOMENT CORRELATIONS BETWEEN THE PENCIL AND PAPER TEST AND FIVE OTHER MEASURES

Measure                          Correlation
Criterion                        .69 ± .06
Laboratory Notebooks             .20 ± .11
Teacher's Estimates              .67 ± .06
Revised Horton Test              .38 ± .10
Chemistry 91 Laboratory Test     .41 ± .10

The validity coefficient is affected by the reliability of the test. To reduce chance factors will increase the validity. Since lengthening the test will reduce chance factors, it will also raise the validity coefficient. It has been shown that tripling the length of the test will raise the reliability of the test to .90. If this were done the validity would rise from .69 to .75. The formula used was
$$r_{(nx)y} = \frac{r_{xy}}{\sqrt{\frac{1 - r_{xx}}{n} + r_{xx}}}$$
where $r_{(nx)y}$ is the validity coefficient of the lengthened test,
$r_{xy}$ is the validity coefficient of the original test,
$r_{xx}$ is the reliability of the original test,
n is the number of times the original test is lengthened.

It should be pointed out that since the reliabilities are probably low (due to the method of computation), the validity corrected for attenuation would probably be high. Hence, .75 may be high for the validity coefficient of this test when increased from 50 to 150 items.

1 See page 47.

THE LABORATORY NOTEBOOKS

Correlations were computed for the relation of the notebooks to the other measures in the investigation. Table VI indicates that there is a lack of relationship, with the exception of the notebooks and teacher's estimates. One would surmise that the weekly marking of a set of notebooks would colour the teacher's judgment as to the ability of students to do laboratory work.

TABLE VI
CORRELATION BETWEEN THE NOTEBOOKS AND FIVE OTHER MEASURES

Measure                          Correlation
Criterion                        .12 ± .11
Pencil and Paper Test            .20 ± .11
Teacher's Estimates              .70 ± .06
Revised Horton Test              .06 ± .12
Chemistry 91 Laboratory Test     .22 ± .11
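The effect of lengthening on the validity of the pencil and paper test, quoted earlier in this section as a rise from .69 to .75, can be checked with the standard formula for the validity of a test lengthened n times. This is a sketch under the assumption that this standard formula is the one the text intends; the function name is illustrative.

```python
import math


def lengthened_validity(r_xy, r_xx, n):
    """Validity of a test lengthened n times:
    r_(nx)y = r_xy / sqrt((1 - r_xx) / n + r_xx)."""
    return r_xy / math.sqrt((1.0 - r_xx) / n + r_xx)


# Values from the text: validity .69, reliability .76, test tripled (50 to 150 items).
print(round(lengthened_validity(0.69, 0.76, 3), 2))  # about .75
```

With n = 1 the function returns the original validity unchanged, which is a quick sanity check on the form of the expression.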
It is conceivable that neat, well-ordered notebooks would leave a favorable impression on the teacher that would be reflected in estimating progress.

It would be well to point out the low correlation between the notebooks and the criterion. Where correlations are not substantial, this indicates either (1) marked dissimilarity, (2) unreliability, (3) coarse grouping, or (4) non-linear relationships. In the present study the last two reasons may be dismissed, but either marked dissimilarity or the unreliability of marks assigned to the notebooks, or both, in comparison with the criterion is a possibility. Either reason would seem sufficient to deem the notebook unworthy of use in evaluating the progress of the student in laboratory work.

THE TEACHER'S ESTIMATES

The correlations of the teacher's estimates with the other measures are not high. The best correlation is with the notebooks and it has been discussed on page 54. The correlation with the criterion, .47, has an index of forecasting efficiency of 12 percent as compared with one of 28 percent for the pencil and paper test. The standard error of estimate¹ of a criterion score predicted from the teacher's estimates is 5.78, which is considerably higher than that of one predicted by the experimental test. There is the possibility that the particulars of the pencil and paper test had so engrossed the investigator that they influenced his estimation of student achievement. This factor might account for the correlation of .67 between the test and the estimates.

TABLE VII
CORRELATIONS BETWEEN THE TEACHER'S ESTIMATES AND FIVE OTHER MEASURES

Measure                          Correlation
Criterion                        .47 ± .09
Pencil and Paper Test            .67 ± .06
Laboratory Notebooks             .70 ± .06
Revised Horton Test              .31 ± .10
Chemistry 91 Laboratory Test     .49 ± .09

1 See page 51.

THE REVISED HORTON TEST

For his original test, Horton¹ reported a reliability coefficient of .88 by the split-half method.
The revised test of 34 items, as compared to 36 items of the original, gave a reliability coefficient of .59 using the Kuder-Richardson formula. The original test had a median of 28.5 as compared to 17.8 for the revised test. There may be two possible explanations for the discrepancy in the results, if we assume that the conditions for administering the tests were not too different:

1. The emphasis in science teaching has changed in the last twenty-five years from the more rigorous and narrow to the less precise and general.
2. The high school student of two decades ago was more scholastically inclined than the high school student of today.

There are no marked correlations; this may be due to the unreliability of the test, or to the lack of similarity between the test and the correlatives, or to the fact that the test is short, being about one-half the length of the experimental test. If we assume that the Revised Horton test of laboratory manipulations is reliable, then the results would indicate that there is considerable dissimilarity between it and the Chemistry 91 test of laboratory facts and associated laboratory knowledge, since the correlation coefficient is .39. The results also show that the pencil and paper test is a better measure of the combined abilities of the two tests than it is of either one individually. Its correlation with the criterion is .69 (Table V), with the Revised Horton test .38 (Table VIII), and with the Chemistry 91 test .41 (Table IX).

1 Horton, Ralph E., Measurable Outcomes of Individual Laboratory Work in High School Chemistry, New York: Bureau of Publications, Teachers College, 1928, p. 74.
TABLE VIII
CORRELATIONS BETWEEN THE REVISED HORTON TEST AND FOUR OTHER MEASURES

Measure                          Correlation
Pencil and Paper Test            .38 ± .10
Laboratory Notebooks             .06 ± .12
Teacher's Estimates              .31 ± .10
Chemistry 91 Laboratory Test     .39 ± .10

THE CHEMISTRY 91 LABORATORY TEST

The correlations of the Chemistry 91 Laboratory test do not run high, perhaps because of its shortness, and perhaps also because there may be a lack of relationship with the correlatives. Since checking laboratory notebooks emphasizes, in the mind of the teacher, the experiments performed, the teacher's estimates would be expected to show some correlation with the laboratory experiments, a correct assumption (r = .41).

TABLE IX
CORRELATIONS OF THE CHEMISTRY 91 LABORATORY TEST AND FOUR OTHER MEASURES

Measure                          Correlation
Pencil and Paper Test            .41 ± .10
Laboratory Notebooks             .22 ± .10
Teacher's Estimates              .49 ± .09
Revised Horton Test              .39 ± .10

THE MULTIPLE REGRESSION EQUATION

This investigation is concerned with deriving the best method of assessing a student's worth on the criterion. Hence, a multiple correlation was run between the criterion, on one hand, and the pencil and paper test and the teacher's estimates on the other. The resulting correlation was .6901. The correlation between the pencil and paper test and the criterion has been reported as .69. The extremely small increase in the correlation is indicative of the negligible amount the teacher's estimates contribute to predicting the criterion when combined with the group pencil and paper test. The multiple regression equation was derived to be:
$$X_1 = .707X_2 + .0048X_3 + 13.64 \qquad (1)$$
where $X_1$ is the predicted criterion score,
$X_2$ is the actual pencil and paper test score,
$X_3$ is the actual estimate by the teacher of laboratory progress.

This equation shows the relative influences of $X_2$ and $X_3$ in predicting the criterion. The maximum value of the term $.0048X_3$ can only be .48, which is about one-half point in 59.
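Equation (1) can be applied directly, as in the following minimal sketch. The function name is illustrative; the mean scores used in the example (28.08 and 52.94) are taken from Table I, and the check that they reproduce the criterion mean of 33.75 is a property any least-squares regression equation should have.

```python
def predict_criterion(pencil_paper_score, teacher_estimate):
    """Predicted criterion score from equation (1):
    X1 = .707 * X2 + .0048 * X3 + 13.64."""
    return 0.707 * pencil_paper_score + 0.0048 * teacher_estimate + 13.64


# At the group means the prediction should fall at the criterion mean (33.75).
print(round(predict_criterion(28.08, 52.94), 2))

# The teacher's estimate (maximum 100 points) can move the prediction
# by at most .48 of a point.
spread = predict_criterion(30, 100) - predict_criterion(30, 0)
print(round(spread, 2))
```

The second calculation makes the point of the paragraph above concrete: across the whole range of the teacher's estimates the predicted score changes by less than half a point.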
THE BETA COEFFICIENTS

To get a clearer picture, the Beta coefficients were computed and compared. These standard partial regression coefficients show the relative importance of the two variables $X_2$ and $X_3$ in predicting variable $X_1$, disregarding the differences in standard deviation.

For the variable $X_2$: $\beta_{12.3}$ = .680?
For the variable $X_3$: $\beta_{13.2}$ = .0140

The ratio $\beta_{12.3} / \beta_{13.2} = 48.5$ indicates that the pencil and paper test is almost fifty times as important as the teacher's estimates in predicting the criterion. It must be reiterated that generalizations cannot be made from one case of teacher's estimates. However, the size of the ratio of the Beta coefficients may be explained by the fact that there is a high correlation between the pencil and paper test and the teacher's estimates, which reduces the size of $\beta_{13.2}$ greatly, thus increasing the ratio.

THE SIMPLE REGRESSION EQUATION

The simple regression equation was computed to be:
$$Y = .717X + 13.61 \qquad (2)$$
Compare this equation with equation (1) and note the similarity in the first and last terms on the right side. When the second term is deleted, equation (1) becomes:
$$X_1 = .707X_2 + 13.64 \qquad (3)$$
By the deletion of the term $.0048X_3$ the predicted score is lowered by .48 points when $X_3$ is at its maximum.

STANDARD SCORES, DERIVED SCORES, AND PERCENTILES

The criterion score can be predicted from the pencil and paper test score by means of the regression equations. However, in some cases it is desirable to compare scores, and one with a possible 64 would not be suitable. For this purpose, percentiles and standard scores are useful, although the ± character of standard scores is cumbersome. Derived scores have the advantage of being positive and of being geared to any predetermined standard deviation and mean. Two sets of derived scores have been determined:

1. based on a mean of 50 and a standard deviation of 10.
2.
based on a mean of 63 and a standard deviation of 13.

The first is sometimes called a T-score, and the second is the method employed by the Department of Education of British Columbia in scaling marks for departmental examinations.

After setting a critical score of 50 that would cut off the lower 15 percent in a normal distribution, a comparison was made of the predictive quality of the written test with respect to the upper and lower quarters of the distribution. From a 2 × 2 contingency table, the chi-square value was computed to be 1.12. Since it requires a chi-square value of 3.842 to be significant at the five percent level of confidence, the hypothesis that the test will predict equally well at any level has not been disproved.

The percentiles were interpolated from a graph prepared from the decile values as computed from the frequency distribution of data in Appendix G. These values are reported in Appendix J.

There is no doubt that these results would be modified by taking a larger sample. It could also be argued that results of students' work from one school, under one teacher, would tend to be more homogeneous than those of the whole high school population. Hence, a greater variance in the larger population would be expected, with the percentiles spread over a greater range, and the derived scores would be compressed. An increase in the mean would lower the derived scores and a decrease in the mean would raise them.

ELIMINATION OF ITEMS WITH INTERNAL CONSISTENCIES BELOW .23

The tests were rescored after eliminating the items of low internal consistency and validity. Correlations were computed to compare the effects of the deletion. The results are reported in Table X. By a comparison with Table V it will be seen that the elimination of debatable items has had very little effect on the correlations.

1 See Appendix K, p.
108.

TABLE X
CORRELATIONS OF THE CRITERION AND OTHER MEASURES AFTER DELETING INCONSISTENT ITEMS

Measure                      Number of Items Deleted
Criterion                    22
Pencil and Paper Test        15
Criterion Reduced

Correlation with Criterion Reduced and with Pencil and Paper Test Reduced: .70  .67  .68

SUMMARY

For the purpose of analysis the following statistics were computed:

1. The reliabilities of the six measures.
2. The intercorrelations of the six measures.
3. The internal consistencies of items on the criterion and experimental tests.
4. The validities of items on the experimental test.
5. The difficulty of items on the criterion and experimental test.
6. The multiple regression equation for predicting the criterion from the experimental test and teacher's estimates.
7. The simple regression equation predicting the criterion from the pencil and paper test.
8. Derived scores and percentiles.
9. Chi-square test of consistency of the pencil and paper test with respect to predicting the upper and lower groups on the criterion.
10. Correlations of the criterion and experimental test after eliminating the inconsistent items.

CHAPTER V
SUMMARY AND CONCLUSIONS

The present investigation was undertaken to discover whether a carefully prepared, valid and reliable pencil and paper test of outcomes in laboratory instruction is as effective in measuring a student's worth in the laboratory as the traditional methods of evaluation.

The problem eventually was stated:

1. To prepare a valid, reliable and usable group pencil and paper test pertaining to the objectives of laboratory chemistry.
2. To compare different methods of evaluating the outcomes of instruction in high school laboratory chemistry.

After the objectives were chosen and limited, the study proceeded to measure students' achievement in laboratory chemistry by:

1. The traditional laboratory notebook.
2. The teacher's estimates.
3.
A group pencil and paper test of the outcomes of the objectives chosen.
4. A practical test of the outcomes of the objectives chosen.

It has been indicated by related studies that traditional examinations have neglected the objectives of laboratory instruction and that these could be measured by practical individual tests. These studies do not indicate to what extent pencil and paper tests could replace the practical type of test.

The subjects selected for the experiment were the students of Chemistry 91 in grades eleven and twelve in Britannia High School, Vancouver, British Columbia.

It was decided to run a preliminary investigation in which one class of students provided data for refining the measuring devices and techniques used.

The following year in March, the students' laboratory notebooks were graded and the teacher's estimates were prepared prior to the experiment proper.

The practical laboratory test was administered in two parts:

1. The test of manipulation of apparatus, called the Revised Horton test.
2. The test of practical knowledge in Chemistry 91 laboratory work, called the Chemistry 91 Laboratory test.

About one week later the pencil and paper test was administered to all students of chemistry in Britannia High School. Complete results were obtained for seventy-two students.

To evaluate the effectiveness of the different measures in assessing the student's worth in laboratory work, correlations were calculated between all measures. Simple and multiple regression equations predicting the score on the criterion from the experimental test and the teacher's estimates were derived. Furthermore, reliabilities, internal item consistencies and validities were computed to evaluate tests and discover trends. Percentiles and derived scores were prepared for comparisons when further work on the problem is done.
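The derived scores mentioned above rescale a raw score to a chosen mean and standard deviation. The sketch below illustrates the two scales named in the thesis (mean 50 with standard deviation 10, and mean 63 with standard deviation 13), using the criterion mean and standard deviation from Table I; the function name and the linear conversion through a standard score are assumptions about how the conversion was carried out.

```python
def derived_score(raw, mean, sd, new_mean=50.0, new_sd=10.0):
    """Convert a raw score to a derived score with the chosen mean and S.D.

    The raw score is first expressed as a standard score z, then rescaled:
    derived = new_mean + new_sd * z.
    """
    z = (raw - mean) / sd
    return new_mean + new_sd * z


CRITERION_MEAN, CRITERION_SD = 33.75, 6.571  # from Table I

# A raw score one standard deviation above the criterion mean.
raw = CRITERION_MEAN + CRITERION_SD
print(round(derived_score(raw, CRITERION_MEAN, CRITERION_SD), 1))          # 50/10 scale
print(round(derived_score(raw, CRITERION_MEAN, CRITERION_SD, 63, 13), 1))  # 63/13 scale
```

Because the conversion is linear, it preserves the rank order of students while removing the negative values that make standard scores awkward to report.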
A chi-square test of consistency of the pencil and paper test to predict the criterion was attempted on the basis of the upper and lower quarters of the criterion scores.

The results obtained were:

1. The pencil and paper test was a significantly better predictor of the criterion than any of the other measures used (r = .69).
2. The inclusion of the teacher's estimates in the multiple regression equation did not significantly improve the predictive value of the simple regression equation.
3. The notebooks and teacher's estimates correlate to the extent of .70.
4. Of the measures tested, the students' notebooks show the lowest correlation with the criterion, it being not significantly different from zero.
5. After the inconsistent items were deleted and the papers rescored, the correlations between the criterion and the experimental test were not changed materially.
6. In comparing the degree to which the pencil and paper test will predict the upper and lower quarters of the criterion, chi-square was computed from a 2 × 2 contingency table to be 1.12, for which value the null hypothesis is not to be rejected.

CONCLUSIONS

The conclusions have been arranged in two divisions as they apply to the two divisions of the problem.

A. Conclusions with respect to the reliability and the validity of the pencil and paper test

1. The range and distribution of difficulties for the criterion and for the experimental test conform to the requirements for a good test.
2. About two-thirds of the items of the experimental test have internal consistencies of .23 or better, and about one-half the items have indices of validity of at least .23.
3. Since there is little change in the correlation coefficients on the deletion of items whose internal consistencies and validities are less than .23, it would appear that these items do not contribute anything to the correlation.
4. By inspection, there appears to be some evidence that items of satisfactory validity but low internal consistency, or vice versa, are reducing the correlation. Until more information is available regarding the indices, it would seem to be a wise compromise to drop only those items definitely invalid.

B. Conclusions with respect to the comparison of methods of evaluating outcomes of instruction in high school chemistry

1. Assuming the evaluation of laboratory abilities is best done by a practical test in the laboratory, this investigation, based on the scores of seventy-two high school students, has shown that the best substitute for the time-consuming practical test is the group pencil and paper test, with respect to the objectives chosen.
2. It has further shown that the students' notebooks have failed to predict, significantly, the outcomes of these same objectives.
3. The teacher's estimates seem as successful in predicting the score on the students' notebooks as the pencil and paper test is in predicting the criterion. Since the only teacher's estimate possible was that made by the investigator himself, any generalizations regarding estimates must be very cautiously advanced. Even though the estimates were made well in advance, the investigator was not unaware of what the various factors in the testing program were to be. The estimates by the investigator might be expected, therefore, to agree more with the scores on the experimental test than would the estimates of another teacher.
4. Since the teacher's estimates correlate with the experimental test to the extent of .67, with the notebooks to the extent of .70, and yet with the criterion test to the extent of only .47, it would appear that some element not present in the criterion is common to the other two measures. One hypothesis would suggest that the common element is related to the ability to write a report.
5.
Both the multiple regression equation and the Beta coefficients indicate that the teacher's estimates do not materially assist the group pencil and paper test in predicting the outcomes of the laboratory instruction. This conclusion is based on the similarity of the simple and multiple regression equations when the term $.0048X_3$ is deleted from equation (1)¹. This is further indicated since the ratio of the Beta coefficients shows that the pencil and paper test is almost fifty times as important as the teacher's estimates in predicting the criterion. By computing a multiple correlation coefficient between the criterion and the combination of the pencil and paper test and teacher's estimates, it has been shown that a simple correlation of .69 was raised to only .6901. Such an increase is negligible, further strengthening the case for discarding teacher's estimates in this instance.
6. The relatively low correlation between the two parts of the criterion serves to support the contention that the criterion is composed of at least two dissimilar elements, viz., a test of manipulations and a test of laboratory knowledge.

1 See page 58.

SUGGESTIONS FOR FURTHER RESEARCH

1. Further research is indicated in the realm of testing the objectives of the laboratory. Investigations regarding the writing of a scientific report may vindicate the use of the laboratory notebook as a measuring device for attainment of that objective of chemistry. Other objectives that might be tested are laboratory resourcefulness and the ability to apply the scientific method.
2. Similar investigations in the fields of physics and biology would seem to have their place in providing suitable devices for measuring the outcomes of laboratory work in those areas of science teaching.
3. The present investigation has only begun to probe the field of testing outcomes of laboratory instruction in chemistry.
Since the validities of one-half the pencil and paper test items were below .23, the five percent level of confidence for these data, the test will require further revision before it can be used with much confidence. New items should be cast and the final form administered to a sufficiently large and representative cross-section of students to develop reliable norms and statistics.
4. In the development of test items it appears that items with diagrams tend to have greater validity than verbal items, and it might be worthwhile to concentrate on pictorial or diagrammatic items.
5. The improvement of instruction depends in part on the ability to evaluate that instruction. When suitable tests of the outcomes of objectives become available, investigators of methods of instruction will have tools to assess their efforts and point the way to better teaching, backed up by knowledge based on experimental evidence.

BIBLIOGRAPHY

Adkins, Dorothy C., Construction and Analysis of Achievement Tests, Washington, D.C.: U.S. Government Printing Office, 1947, p. 292.
Buckingham, Guy E., and Lee, Richard E., "A Technique for Testing Unified Concepts in Science," Journal of Educational Research, 30:20-27, September, 1936.
Carmody, W.R., "Elementary Laboratory Instruction," Journal of Chemical Education, 12:233-238, May, 1935.
Curtis, Francis D., A Digest of Investigations in the Teaching of Science (Textbooks in Science Education), Philadelphia: P. Blakiston & Son & Co., 1926, p. 341.
Curtis, Francis D., "Milestones in the Teaching of Science," Journal of Educational Research, 44:161-178, November, 1950, p. 177.
Davis, F.B., Utilizing Human Talent, Washington, D.C.: American Council on Education, 1947, p. 85.
Department of Education of British Columbia, Program of Studies for the Senior High Schools of British Columbia (Bulletin IX), Victoria, B.C.: The Department, 1937, pp. 105-119.
Edwards, Allen L., Statistical Analysis, New York: Rinehart and Co., Inc., 1946, p.360.

Flanagan, J.C., "General Considerations in the Selection of Test Items and a Short Method for Estimating the Product-moment Coefficient from the Data at the Tails of the Distribution," Journal of Educational Psychology, 30:674-680, December, 1939.

Fuller, Robert W., "Demonstration or Individual Laboratory Work for High School," Journal of Chemical Education, 13:262-264, June, 1936.

Hawkes, Herbert E., Lindquist, E.F., and Mann, C.R., The Construction and Use of Achievement Examinations, Boston: Houghton Mifflin Co., 1936, p.491.

Hendricks, B. Clifford, "Measuring the Ability to Interpret Experimental Data," Journal of Chemical Education, 13:62-64, February, 1936.

Hendricks, B. Clifford, "Pencil and Paper Tests in the Laboratory," Journal of Chemical Education, 22:543-546, November, 1945.

Hendricks, B.C., Tyler, E.W., and Frutchey, F.P., "Testing Ability to Apply Chemical Principles," Journal of Chemical Education, 11:611-613, November, 1934.

Horton, Ralph E., Measurable Outcomes of Individual Laboratory Work in High School Chemistry (Teachers College Contribution to Education, No.303), New York: Bureau of Publications, Teachers College, 1928, p.105.

Hunter, George W., and Spore, Leroy, "The Objectives of Science in the Secondary Schools of U.S.," School Science and Mathematics, 43:633-47, October, 1943.

Keeslar, Oreon, "Elements of the Scientific Method," Science Education, 29:273-78, December, 1945.

Kelley, T.L., "The Selection of Upper and Lower Groups for the Validation of Test Items," Journal of Educational Psychology, 30:17-24, January, 1939.

Mallinson, George G., "The Implications of Recent Research in Teaching of Science at the Secondary School Level," Journal of Educational Research, 43:321-42, January, 1950.

The National Society for the Study of Education, Science Education in American Schools
(Part 1), Chicago: The Society, 1947, p.306.

Persing, K.M., Persing Laboratory Chemistry Test (Form A), Public Schools Publishing Co., Bloomington, Ill.

Quam, G.N., "Neglected Types of Examinations?" Journal of Chemical Education, 17:363-5, August, 1940.

Richardson, M.W., and Kuder, G.F., "The Calculation of Test Reliability Coefficients Based on the Method of Rational Equivalence," Journal of Educational Psychology, 30:681-7, December, 1939.

Ruch, G.M., and Popenoe, H.F., Ruch Popenoe General Science Test, Yonkers-on-Hudson, N.Y.: World Book Co.

Schlesinger, H.J., "The Contributions of Laboratory Work to General Education," Journal of Chemical Education, 12:$42-8, November, 1935.

Stewart, A.W., "Measuring Ability to Apply Principles," School Science and Mathematics, 35:695-9, October, 1935.

Ter Keunst, John, and Bugbee, Robert E., "A Test on the Scientific Method," Journal of Educational Research, 36:489-501, March, 1943.

Thorndike, Robert L., Personnel Selection, New York: John Wiley & Sons, Inc., 1949, p.358.

University of Chicago, Tests in Educational Progress in Biological Sciences (Study of Educational Progress), Chicago: University of Chicago.

Webb, H.A., and Beauchamp, R.V., "Test of Laboratory Resourcefulness," School Science and Mathematics, 22:259-67, March, 1922.

Wise, Harold E., "A Determination of the Relative Importance of Principles of Physical Science for General Education (1 and 2)," Science Education, 25:371-79, December, 1941, and 26:8-12, January, 1942.

Wrinkle, F.B., Improving Marking and Reporting Practices in Elementary and Secondary Schools, New York: Rinehart, 1947, p.120.

Zyve, D.L., Stanford Scientific Aptitude Test for High School and College Students, Stanford, Cal.: Stanford University Press.

APPENDIX A

OBJECTIVES

This list of fourteen objectives has been derived from eight sources and has been ranked in order of frequency.

1. Ability to make conclusions from observations. (Rank 1)
2.
Ability in basic laboratory skills. (Rank 2)
3. Ability in the selection of materials and apparatus. (Rank 3)
4. Understanding of the scientific method. (Rank 3)
5. The student is developing an interest in science. (Rank 3)
6. Ability to make accurate observations. (Rank 4)
7. Ability to make an accurate record of observations. (Rank 4)
8. Understanding of principles. (Rank 4)
9. Ability to apply principles. (Rank 4)
10. Facts that are an outcome of laboratory instruction. (Rank 4)
11. Ability to write an acceptable piece of scientific literature or a report. (Rank 4)
12. Develop habits of accuracy. (Rank 4)
13. Development of attitudes. (Rank 4)
14. Appreciation of science. (Rank 4)

APPENDIX B

APPROVED LIST OF LABORATORY TECHNIQUES RANKED ACCORDING TO IMPORTANCE¹

¹ Horton, Ralph E., Measurable Outcomes of Individual Laboratory Work in High School Chemistry, New York: Bureau of Publications, Teachers College, 1928, p.49.

1. Twist or screw a stopper into a tube.
2. Twist or screw a glass tube into a rubber stopper.
3. Smooth the ends of freshly cut glass tubing (fire-polishing).
4. Always pour concentrated sulfuric acid into water - never water into concentrated acid.
5. Smell gases by fanning toward the nose - never inhaling.
6. Wash all glassware when through using.
7. Turn the water faucet off when through using.
8. Avoid pointing the mouth of the test tube containing a reaction at anyone's face.
9. Always replace reagent bottle in exact place where found immediately after using.
10. Throw all solid waste in waste jars - not in sink.
11. Flush sink after pouring in acid.
12. Be able to cut a glass tube at any point by making a scratch with a file and then breaking with pressure.
13. Wash the table top after each experiment.
14. Avoid 'sucking back' of a delivery tube by disconnecting, or by taking the end from the water, as soon as heating is completed.
15. In filtering, keep the liquid below the edge of the filter paper.
16. Use the tip of the bunsen flame - not the base - when applying heat.
17. Use a flame spreader when heating glass tubing to be bent.
18. Clamp a test tube firmly but without pressure.
19. Fold a filter paper to form a smooth cone to fit a funnel.
20. Take a stopper from a bottle by turning the palm upward and holding the stopper between the fingers.
21. Hold the stopper in the hand until through using the bottle, then replace it in the bottle.
22. When washing the table, squeeze the sponge and take up excess water.
23. Dry glass vessels on the outside before heating them.
24. In heating a glass vessel move the heat around - do not heat in one place.
25. Begin to heat any vessel of glass gradually.
26. Use a wire gauze or asbestos beneath beakers and flasks when heating them.
27. Be able to adjust a ringstand clamp to any height or any angle.
28. Wet a filter paper before using it for filtering.
29. In evaporating to dryness, remove the flame before the last bit of water disappears.
30. In using a thistle tube in a generator, be sure that the lower end is below the surface of the liquid in the generator.
31. Put powders on creased papers and pour them into small mouthed bottles.
32. Without admitting air, be able to invert a bottle of water with a glass plate over the mouth beneath the water in a trough.
33. Insert the delivery tube beneath an inverted bottle of water in a trough without admitting air.
34. Set up bottles of gas, upright or inverted as determined by the weight.
35. When necessary use a pestle and mortar to pulverize coarse materials.
36. When about to light a bunsen burner, light the match before turning on the gas.
37. For ordinary use, turn the flame down to about four inches.
38. Keep the flame down below the level of the liquid in a vessel which is being heated.
39. Wet a rubber stopper when connecting it to glass and wet a glass tube when inserting it into rubber tubing.
40. Slide solids into test tube with the tube in an oblique position, to avoid breaking the tube.
41. When a crucible is to be heated select a pipestem triangle for its support on the ringstand.
42. When a dry gas, lighter than air but soluble in water, is to be collected, collect it in an inverted bottle by the displacement of air.
43. When a dry gas, heavier than air but soluble in water, is collected, displace air from an upright bottle.
44. Wash and save zinc after using a hydrogen generator.
45. To correct the striking back of a bunsen burner, extinguish the flame and relight.
46. When heating a solid in a test tube, hold the tube in an almost horizontal position with the mouth slightly lower than the closed end.
47. When a funnel is to be set on the table, stand it with the mouth down.
48. Be able to make a smooth, rounded, right angle bend from a straight glass tube.
49. Test the force of water before putting a vessel beneath the faucet.
50. Be able to estimate approximately five grams by reference to the weight of a nickel coin.
51. In weighing, use the right hand pan for weights, placing object to be weighed on the left.
52. Read a centigrade thermometer to 0.5 of a degree.
53. Rotate a bottle when pouring powders from it.
54. Devise a condenser by surrounding a test tube with cold water in a beaker or pan.
55. Touch the sides of the receiving vessel with the end of a funnel when making filtration.

APPENDIX C

THE REVISED HORTON TEST
THE CHEMISTRY 91 LABORATORY TEST

1. REVISED HORTON TEST

1. Prepare a filter and filter one-third of a test tube of a liquid in bottle number 1 into a beaker.
REQUIREMENTS
1. A shelf of reagents including one marked '1'.
2. A filter stand.
3. A funnel.
4.
A rack of test tubes.
5. A sink and tap.
6. A box of filter paper.

2. Light a bunsen burner; adjust the flame for use. Correct the flame that has struck back.
REQUIREMENTS
1. A bunsen burner connected to the gascock.
2. A box of matches.

3. Half fill a test tube with water; clamp it to the ring stand and heat it to boiling.
REQUIREMENTS
1. A ring stand and clamp.
2. A bunsen burner.
3. A rack of test tubes.
4. A box of matches.

4. Take about five grams of powder from each of the bottles '1' and '2'. Mix the powders and place in a test tube. After you have finished set it up to generate a gas by heating the mixture.
REQUIREMENTS
1. A bottle of powder marked '1'.
2. A bottle of powder marked '2'.
3. A pad of paper.
4. A rack of test tubes.
5. A piece of rubber hose.
6. A rubber stopper with glass tube inserted.
7. A spatula.
8. A pestle and mortar.
9. A beaker.

5. Set up a jar to collect hydrogen in the usual way. Show how you would set a jar of hydrogen on the table where it is to remain for several hours.
REQUIREMENTS
1. Two gas bottles.
2. Two glass plates.
3. A pneumatic trough.
4. A rubber tube 24" long.
5. Water tap.
6. A sink.

2. CHEMISTRY 91 LABORATORY TEST

Test 1. The three solutions marked 1, 2, and 3 may contain iodine. Test a few c.c.'s of each solution with hypo (sodium thiosulfate) and state which contains iodine.
1. The bottle marked ____ contains iodine. ( )
REQUIREMENTS
Three solutions:
1. Ferric chloride.
2. Potassium dichromate.
3. Iodine and potassium iodide solution.
Test solution: Hypo (sodium thiosulfate) solution.

Test 2. A student has been preparing common salt by neutralization. Use the stirring rod and litmus paper to test the solution in the beaker marked '4'. Answer these questions on your sheet.
2. Should the student add a solution of (1) acid, (2) base, (3) neither? ( )
3. What acid or base should he use? If none, write 'nil' in the blank. ( )
REQUIREMENTS
1. Slightly basic salt solution.
2.
Red litmus paper.
3. Blue litmus paper.
4. A glass plate.
5. A stirring rod.

Test 3. DO NOT TOUCH THE BURETTE! Before titration the burette was filled with base to the zero mark. The investigator used the pipette for the acid and completed the titration. The base is 0.20 N.
4. Has the end point (1) been reached? (2) been overrun? (3) not been reached? (4) been neutralized? ( )
5. What volume of base has been used? ( )
6. Assuming neutralization to be complete at 20.0 c.c.'s, calculate the normality of the acid. ( N.)
REQUIREMENTS
1. A burette filled to 15.3 c.c.
2. A 10 ml. pipette.
3. A beaker containing 25 c.c.'s of solution colored red with phenolphthalein.

Test 4. Smell each of these solutions as a preliminary test and then verify it using the reagents in front of you. If any gas is not present write "nil" in the parentheses.
7. Which solution contains hydrogen sulfide? ( )
8. Which solution contains sulfur dioxide? ( )
9. Which solution contains carbon dioxide? ( )
REQUIREMENTS
1. A solution of sulfur dioxide marked "4".
2. A solution of hydrogen sulfide marked "5".
3. A solution of carbon dioxide marked "6".
4. A solution of limewater reagent (calcium hydroxide).
5. Lead acetate paper.
6. A DILUTE SOLUTION of potassium permanganate labelled 'red dye'.

Test 5. The jars marked '7', '8' and '9' contain one each of the following: gypsum (CaSO4.2H2O), common salt (NaCl) and potassium nitrate (KNO3). By dissolving a small portion of each in water discover which sample is:
10. most soluble in cold water ( )
11. second most soluble in water ( )
12. least soluble in water ( )
REQUIREMENTS
1. A rack of test tubes.
2. A spatula.
3. A jar of sodium chloride labelled '7'.
4. A jar of gypsum labelled '8'.
5. A jar of potassium nitrate labelled '9'.
6. A pad of paper, 4" x 4".

Test 6. DO NOT TOUCH THE BALANCE OR RAISE THE PANS! You may handle the weights with forceps. Return the weights to the pan when you are finished.
The crucible and contents have been weighed.
13. Show how you would calculate the weight.
14. What weight has the crucible and contents? ( gm.)
REQUIREMENTS
1. A balance with
(a) a crucible of salt on the left hand pan.
(b) the following weights on the right hand pan: 10, 2, and 1 grams; 500, 200, 50 and 5 milligrams.

Test 7. One flask contains lead chloride precipitated and the other contains silver chloride precipitated. Shake each flask well and pour about 5 c.c. of the suspension into two separate test tubes.
15. Heat each test tube in turn and decide which flask contains lead chloride. ( )
DO NOT EXTINGUISH THE BURNER!
REQUIREMENTS
1. A flask of lead chloride precipitated.
2. A flask of silver chloride precipitated.
3. A burner.
4. A rack of test tubes.
5. A test tube clamp.

Test 8. Each of the three bottles marked '12', '13', and '14' contains one of the following salts in solution: sodium chloride, sodium bromide, and sodium iodide. Using the chlorine water, bromine water, and benzene, test a small sample of each solution to determine:
16. Which bottle contains the iodide? ( )
17. Which bottle contains the bromide? ( )
18. Which bottle contains the chloride? ( )
REQUIREMENTS
1. A solution of sodium chloride marked '14'.
2. A solution of sodium bromide marked '13'.
3. A solution of sodium iodide marked '12'.
4. A flask of chlorine water.
5. A flask of bromine water.
6. A bottle of benzene.
7. A rack of test tubes.

Test 9. The unknown solution in bottle '15' may contain silver ions and barium ions. Test for the presence of each ion using about a 5 c.c. sample for each.
19. Does the sample contain silver ions? ( )
20. Does the sample contain barium ions? ( )
REQUIREMENTS
1. A solution of silver nitrate marked '15'.
2. Hydrochloric acid reagent.
3. Ammonium hydroxide reagent.
4. Sulfuric acid reagent.
5. A rack of test tubes.
Test 10. DO NOT CONTAMINATE THE SOLUTIONS BY CHANGING THE WIRES. Test each of the solutions '16', '17' and '18' to determine by a flame test which solution contains:
21. a barium salt ( )
22. a sodium salt ( )
If a solution is absent write 'nil' in the blank.
REQUIREMENTS
1. A flask of concentrated sodium chloride marked '16' and containing a flame test wire.
2. A flask of concentrated barium chloride marked '17' and containing a flame test wire.
3. A flask of concentrated calcium chloride marked '18' and containing a flame test wire.
4. A lighted burner.

Test 11. In the rack are five precipitates of metallic sulfides. By their colours choose:
23. copper sulfide ( )
24. cadmium sulfide ( )
25. antimony sulfide ( )
REQUIREMENTS
A rack of test tubes containing:
(1) zinc sulfide precipitated.
(2) antimony sulfide precipitated.
(3) manganous sulfide precipitated.
(4) cadmium sulfide precipitated.
(5) copper sulfide precipitated.

Test 12.
26. What term is best applied to the solution? (1) superheated. (2) supersaturated. (3) oversaturated. (4) superconcentrated. (5) hydrated. ( )
27. The process of solidification is called: (1) crystallization. (2) precipitation. (3) consolidation. (4) coagulation. (5) petrifaction. ( )
REQUIREMENTS
1. A flask of supersaturated hypo (sodium thiosulfate).
2. A crystal of hypo.

Test 13.
28 & 29. What test is being performed? ( ........ test for ........ )
30. Was the unknown present? ( )
REQUIREMENTS
1. A solution of sodium nitrate.
2. Concentrated sulfuric acid.
3. A freshly prepared solution of ferrous chloride.

The teacher performs test 12 by adding a crystal of hypo to the supersaturated solution and showing the pupils the crystallization. Test 13 is performed by the teacher illustrating the brown ring test for nitrates.

APPENDIX D

THE PENCIL AND PAPER TEST

NAME ........ SCHOOL ........
CHEMISTRY 91 Laboratory Examination.
DATE ........

You are being tested on your knowledge of: (1) laboratory procedures you have learned in chemistry, and (2) experiments you have learned, observed or performed.

DIRECTIONS
Read each question carefully and place the number of the best answer in the space provided at the right of each question.

EXAMPLE: About five grams of salt should be: (1) one-quarter teaspoonful. (2) one teaspoonful. (3) one and one-half teaspoonfuls. (4) two teaspoonfuls. (5) five teaspoonfuls. ( 2 )
A (2) is placed in the parentheses because it is the best answer.

1. When a chemist is identifying a gas by smell he should: (1) have his antidotes for poison on the bench beside him. (2) sniff it gently first and only deeply if it is not nauseating or irritating. (3) waft it gently toward him and sniff cautiously. (4) hold a damp cloth near his nose in order to reduce the concentration of the gas. (5) stand by an open window in case the gas smells. ( )

2. Which block of diagrams shows the correct sequence for heating a solid mixture in a test tube to produce a gas? ( )
[Diagram blocks not reproduced.]

3. In finding the percent of water of crystallization in a salt by heating the hydrate to the anhydride, how often should you alternately heat it and weigh it? (1) Until the calculated amount of water has been driven off. (2) Just once is enough. (3) Twice for accuracy. (4) Until the last weight is unchanged from the previous one. (5) As often as class time permits. ( )

4. A student was determining the combining weight of magnesium and found it to be 12.0 grams. The true combining weight is 12.16 grams. He made a calculation (.16 x 100 / 12.16). What was he attempting to calculate? (1) Percent yield. (2) Percent deviation. (3) Percent error. (4) Average percent. (5) Percent correct. ( )

5. Into a clear solution a small crystal was dropped. The solution immediately solidified and became warm. What term best applies to the solution? (1) Superheated. (2) Supersaturated. (3) Oversaturated. (4) Superconcentrated. (5) Hydrated. ( )

6. A student was preparing common salt by neutralization. On testing with litmus he found the pink litmus became blue. What should he do? (1) Add a few drops of acid and test again. (2) Add a few drops of base and test again. (3) Add nothing, it is neutral. (4) Remove the litmus paper before evaporating. (5) Add a few drops of salt water to replace those used in testing. ( )

7. After you have prepared hydrogen with zinc and HCl and are cleaning up, which step is most important? (1) Throw the unused zinc in the waste jar and pour the acid down the sink. (2) Save the acid solution and return it to the HCl Winchester. (3) Burn all the hydrogen left over and so prevent an explosion. (4) Wash the acid down the sink with plenty of water. (5) Put both the acid and zinc in the waste jar. ( )

8. When hydrochloric acid is being poured from a reagent bottle, the chemist should: (1) lay the stopper on the table. (2) lay the stopper on a clean piece of glass. (3) withdraw the stopper between the fingers of the right hand with the palm facing down. (4) withdraw the stopper between the fingers of his right hand with the palm facing up. (5) place the stopper in the rack provided. ( )

9.
In lighting a bunsen burner the first thing to do is: (1) turn the gas on strong before lighting the match. (2) turn the gas on weak before lighting the match. (3) light the match before turning on the gas. (4) open the air valve at the base of the burner before turning on the gas. (5) light the gas before closing the air valve at the base of the burner. ( )

10. If you accidentally spilled a little spot of sulfuric acid on your coat, you should: (1) put it near the radiator so the acid will evaporate quickly. (2) put your coat in water immediately. (3) sponge the area affected with dilute ammonium hydroxide and water. (4) pour a dilute sodium hydroxide solution on the affected part. (5) sponge with water and let it dry. ( )

11. Which of the following grades is not found on labels in the laboratory storeroom? (1) C.P. (2) U.S.P. (3) Tech. (4) S.Q. (5) Meets A.C.S. standards. ( )

12. A group of students were doing an experiment involving the differences in several readings of temperature. They decided to let one boy do all the readings and chose him by lot. Their reason for having one boy read the thermometer was: (1) if the results were poor they would know whom to blame. (2) that any errors in one person's readings would most likely be consistent and cancel out. (3) that by choosing him by lot they would not likely get the poorest person to read the thermometer.
(4) too much time would be spent in arguing if more than one person read the thermometer. (5) it would fit into a plan to divide up the work in doing the experiment. ( )

13. A student was confronted with water solutions of the following gases: (1) carbon dioxide, (2) hydrogen sulfide, (3) nitrogen, (4) oxygen, and (5) sulfur dioxide. He smelled them and chose one that smelled like low-tide. He tested it with lead acetate paper. The result was dark coloration. The gas was ( )

14. The solution of gas (listed in question 13) that irritated his nostrils and bleached a red dye colorless was ( )

15. The third solution (listed in question 13) tested had no odour but gave a white precipitate with calcium hydroxide solution. The dissolved gas was ( )

16. If the gas flame of a bunsen burner strikes back (i.e. burns at the base of the burner) one should: (1) turn it off and relight. (2) turn it off and get another burner. (3) close the air valve at the base of the burner and it will be corrected. (4) call the instructor and have him relight it. (5) reduce the gas pressure at the stopcock. ( )

17. A student mixed some fertilizer and lime to test for ammonia. The resulting gas smelled like ammonia but did not affect either red or blue litmus paper. His most probable error was in: (1) identifying the gas by smell. (2) using old
litmus paper. (3) not wetting the litmus paper. (4) using the wrong indicator. (5) using the wrong chemicals. ( )

18. Into a clear solution a small crystal was dropped. The solution immediately solidified and became warm. The process of solidification is best called: (1) crystallization. (2) precipitation. (3) consolidation. (4) coagulation. (5) petrifaction. ( )

19. In making a test for an unknown acid radical the student added five cubic centimeters of freshly prepared ferrous sulfate solution to an equal volume of the unknown. He then carefully poured concentrated sulfuric acid down the inside of the test tube containing the mixture just prepared. The test performed was to test the presence of: (1) sulfate. (2) chlorate. (3) phosphate. (4) chloride. (5) nitrate radical. ( )

20. The name of the test described in question 19 is the: (1) sulfate test. (2) molybdate test. (3) reduced iron test. (4) oxidized iron test. (5) brown ring test. ( )

21. The preparation of dilute sulfuric acid from concentrated in the laboratory is a slow process because: (1) the acid is not very soluble and so takes some time to dissolve. (2) the sudden heat generated would break any common glass vessel unless it is mixed slowly. (3) the acid vaporizes and so must be kept covered. (4) sulfuric acid is oily and so it is difficult to mix it with water. (5) if the acid gets too hot it will dissolve the glass container. ( )

22. Before titration the 100 c.c. burette was filled to the zero mark. After one titration the level of the base appeared as in the diagram. The acid was delivered from the pipette shown. The base was 0.20 N. Titration was continued until the phenolphthalein indicator became a deep red. The volume of base used was: (1) 50.1 c.c. (2) 31.0 c.c. (3) 32.20 c.c. (4) 30.2 c.c. (5) 30.22 c.c. ( )
[Diagram: burette showing the level of the base, and a 10 ml. pipette; not reproduced.]

23. The end point is said to have been: (1) reached. (2) overrun. (3) achieved. (4) not reached. (5) confirmed. ( )

24. Assuming the burette to read 12.0 c.c., then the normality of the acid would be: (1) 0.24 N. (2) 0.4 N. (3) 0.06 N. (4) 0.60 N. (5) 0.167 N. ( )

25. If the experimenter wished to repeat the experiment he should take a fresh sample of acid and indicator and then: (1) use a funnel to fill the burette. (2) proceed with his titration to 62 c.c. (3) proceed with his titration until the end-point is reached. (4) drain the burette to an even volume (e.g. 40 c.c.) before proceeding with the titration. (5) empty the burette and wash before filling with 0.20 N. base. ( )

26. How is the folded filter paper held in position in the funnel before filtering is commenced? (1) Use one hand to hold the paper and pour the liquid from the vessel with the other hand. (2) The cohesion between the dry paper and the glass will keep it in position. (3) The adhesion between the dry paper and the glass will hold it in position. (4) Wet the filter paper with your solvent after inserting it in the funnel. (5) Wet the filter paper with your solution after it is in
position in the funnel. ( )

27. The following colours are produced by the vapours of different metals: (1) brick red, (2) light green, (3) yellow, (4) crimson red, (5) violet, and (6) blue green. Barium would produce what colour? ( )

28. Using the colours stated in question 27, write the number of the colour produced by sodium vapour. ( )

29. You have three unknowns which contain (1) a chloride, (2) a bromide and (3) an iodide in solution. In order to test and identify each you would add: (1) chlorine water. (2) bromine water. (3) carbon disulfide. (4) chlorine water and then carbon disulfide. (5) bromine water and then carbon disulfide. (6) either chlorine water or bromine water and then carbon disulfide. (7) none of the methods stated above. You can only determine it by eliminating the other two halides. Which of the above statements is the best explanation of determining:
the bromide? ( )
30. the iodide? ( )
31. the chloride? ( )

32. You have five flasks containing yellow solutions. They are (1) impure hydrochloric acid, (2) colloidal arsenic trisulfide, (3) methyl orange indicator, (4) dilute ferric chloride solution, and (5) dilute potassium chromate solution. Which of the above will be precipitated by adding a few c.c.'s of dilute ammonium hydroxide? ( )

33.
Which of the solutions in question 32 would be precipitated by adding a few c.c.'s of hydrochloric acid? ( )

34. If you wished to compare the rates of reaction at two different temperatures, the most convenient temperatures to use would be: (1) 20°C. and 100°C. (2) 10°C. and 90°C. (3) 20°C. and 80°C. (4) 4°C. and 100°C. (5) 30°C. and 50°C. ( )

35. A sample of baking powder undergoing analysis produced the following tests: (1) the filtrate tested for sulfate. (2) the filtrate tested for phosphate. (3) No ammonium salts were in the filtrate. Which two of the following substances were definitely present in the baking powder? (1) combined calcium. (2) molybdates. (3) combined aluminum. (4) tartrates. (5) yeast. (6) ammonium bicarbonate. ( )
36. ( )

37. In the procedure of lighting a bunsen burner one should: (1) open the air valve at the base before turning on the gas. (2) turn the gas on weak until it is lighted. (3) hold the lighted match close to the burner. (4) turn the gas on strong until it is lighted. (5) test the gas pressure before attaching the burner. ( )

38. If you were using a 200 c.c. graduate with 10 c.c. graduations and measured out 150 c.c. of water, to which was added 120 c.c. of alcohol, what percent of the whole mixture was alcohol? Choose the answer that you can be most sure of. (1) 44%. (2) 40%. (3) 44.4%. (4) 44.44%. (5) 44.444%.
PART II

Select from the sketches of the apparatus on the opposite page, the apparatus best designed to do the task required in each of the following cases. Write the number of the apparatus in the parentheses provided at the right.

1. Apparatus to obtain quickly a dissolved solid from solution. (   )
2. Apparatus to prepare a gas heavier than air, soluble in water and made from heating a liquid and a solid. (   )
3. Apparatus to obtain quickly a suspended solid from solution. (   )
4. Apparatus to distil water. (   )
5. Apparatus used to prepare a gas heavier than air, soluble in water and made by heating two solids. (   )
6. Apparatus used to make a gas lighter than air, soluble in water, and formed by the action of a liquid on a solid without heating. (   )
7. Apparatus used to make crystals of a solid from a solution of the solid. (   )
8. Apparatus to prepare a gas lighter than air, soluble in water and made by heating two solids. (   )
9. Apparatus to prepare oxygen. (   )
10. Apparatus to prepare hydrogen chloride gas. (   )
11. Apparatus to prepare hydrogen. (   )
12. Apparatus to prepare chlorine. (   )

APPENDIX E

CHECK SHEET for scoring THE REVISED HORTON TEST

1. Folds paper properly.
2. Inserts paper in funnel correctly.
3. Wets paper.
4. Pours liquid not above paper.
5. Touches funnel to edge of beaker.
6. Takes stopper between fingers palm up.
7. Keeps stopper in hand while pouring.
8. Keeps bottle in hand until through.
9. Replaces stopper and bottle to right place.
10. Holds test tube obliquely.
11. Catches last drop on edge of test tube.
12. Lights match before turning on the gas.
13. Turns gas on strong at first.
14. Holds match high.
15. Closes air inlet before lighting.
16. Turns flame down to four inches.
17. Puts paper in jaws of clamp.
18. Slopes the test tube.
19. Applies heat to the top of the water.
20. Adjusts the clamp to the proper height.
21. Clamps firmly but without excessive pressure.
22. Rotates bottle when pouring.
23. Estimates one teaspoonful.
24. Mixes it on a piece of paper.
25. Uses V paper to insert it in test tube.
26. Twists stopper when inserting in test tube.
27. Sets test tube horizontal.
28. Twists glass into rubber tube.
29. Tests water pressure before filling jar.
30. Fills pan to suitable depth.
31. Points overflow into sink.
32. Uses glass to cover bottle when inverting.
33. Allows no air to enter.
34. Sets bottle on table inverted.

[Scoring columns headed by student names, e.g. Jarvis A., Olsen C.; the remaining column headings are illegible in the source.]

APPENDIX F

PRACTICAL LABORATORY TEST (Answer Sheet)

Test I       1. (      )
Test II      2. (      )   3. (      )
Test III     4. (      )   5. (      c.c.)   6. (      N.)
Test IV      7. (      )   8. (      )   9. (      )
Test V       10. (      )   11. (      )   12. (      )
Test VI      13. ( ... test for ... )   14. (      gm.)
Test VII     15. (      )
Test VIII    16. (      )   17. (      )   18. (      )
Test IX      19. (      )   20. (      )
Test X       21. (      )   22. (      )
Test XI      23. (      )   24. (      )   25. (      )
Test XII     26. (      )   27. (      )
Test XIII    28. (      )   29. (      )   30. (      )

APPENDIX G

DATA

Name            IQ   Criterion  Pencil &    Notebooks  Teacher's  Criterion  Criterion
                     Test       Paper Test             Estimates  A          B
Duncan M.       120  50         37          140        75         24         26
Ratushny F.     115  47         40          148        87         24         23
Glaum L.        130  46         36          46         37         24         22
Greenough R.    101  45         35          90         72         24         21
Mah G.          118  45         39          110        63         22         23
Westlund W.     133  45         38          145        85         21         24
Costanzo P.     141  44         30          109        44         24         20
Gronlie M.      112  44         28          130        65         19         25
Scrimgeour G.   133  44         41          103        86         23         21
Gillingham J.   133  43         29          136        58         23         20
Lortie G.       101  42         32          109        18         23         19
Davies J.       121  40         35          82         58         19         21
Johanssen J.    111  40         27          137        70         19         21
Lum W.          108  40         29          121        66         19         21
Rosen L.        119  40         33          141        71         23         17
Wilson T.       147  40         39          103        63         17         23
Crane R.        126  39         27          127        66         19         20
Brown R.        127  38         37          127        76         17         21
Con B.          103  38         24          125        47         22         16
Hall J.         121  38         35          140        75         20         18
Lamb K.         149  38         36          135        82         21         17
Mitchell W.     131  37         37          136        81         19         18
Roscoe M.       114  36         34          123        61         14         22
Baker C.        100  35         29          113        33         19         16
Mitchell R.     123  35         31          133        78         17         18
Vea A.          129  35         26          102        60         16         19
Wong C.         107  35         30          136        66         22         13
Yip Y.          114  35         22          101        34         22         13
Campbell R.     110  34         35          104        51         17         17
Jarvis A.       126  34         24          102        50         18         16
Kraft D.        114  34         22          116        38         20         14
Moore R.        109  34         32          141        70         18         16
Ottewell D.     106  34         23          109        32         22         12
Tillyer D.      113  34         22          100        39         17         17
Brown D.        132  33         22          99         35         17         16
Dennis G.       94   33         28          117        65         17         16
Fortin L.       114  33         32          93         52         18         15
Carle R.        130  33         30          128        62         13         20
Lee N.          122  33         34          121        63         15         18
Baker G.        90   32         31          98         58         21         11
Bell H.         122  32         29          121        54         18         14
Carfrae M.      102  32         38          132        74         18         14
Chin R.         120  32         22          122        46         21         11
Knight R.       129  32         20          129        57         17         15
Makort A.       113  32         30          46         59         19         13
Williams F.     117  32         22          105        24         20         12
Goff G.         106  31         37          118        31         14         17
Lee C.          114  31         28          111        63         17         14
Kihara S.       103  31         21          126        50         12         19
Bouzevetsky N.  116  30         28          77         33         14         16
Godson K.       127  30         19          109        55         20         10
Hendry P.       127  30         30          104        66         16         14
Welbourn C.     113  30         22          113        38         14         16
Yee B.          122  30         26          113        40         14         16
Borsato F.      90   29         32          144        80         15         14
Kisielewich P.  117  29         23          78         33         14         15
Lowe D.         120  29         19          123        67         18         11
Newton S.       115  29         25          112        30         13         16
Saimoto J.      88   29         21          125        37         13         16
Shynkaryk W.    109  29         25          100        50         13         16
Smith C.        90   29         15          111        10         20         9
Brisseau G.     119  27         22          125        59         16         11
Henderson P.    117  27         23          128        54         14         13
Potter R.       123  27         23          55         6          16         11
Englemann M.    108  25         19          108        44         14         11
Lawrence W.     105  25         17          113        25         17         8
Oberholtzer B.  111  25         22          102        52         13         12
Shillington S.  115  25         27          122        26         14         11
Sweet D.        107  25         24          126        51         11         14
Perdia N.       91   23         19          105        67         12         11
Lessman E.      107  20         18          118        21         13         7
Smith J.        128  20         27          121        41         8          12
POSSIBLE SCORE  -    64         50          162        100        34         30

CRITERION TEST is composed of two parts: A. The Revised Horton Test, a test of manipulating apparatus. B. The Practical Test on the Laboratory Experiments of Chemistry 91.
PENCIL AND PAPER TEST is a written test of fifty items based on the criterion.
THE NOTEBOOK is the score on the first fifteen experiments in the student's notebook prior to the investigation.
THE TEACHER'S ESTIMATE is an estimated score of the student's ability to do laboratory work by his teacher, viz., the investigator.

APPENDIX H

INTERNAL CONSISTENCIES, VALIDITIES AND DIFFICULTIES OF ITEMS ON PENCIL AND PAPER TEST

Item  Internal     Validity     Difficulty    Item  Internal     Validity     Difficulty
      Consistency  Coefficient  Index               Consistency  Coefficient  Index
1.    .00          -.25         .85           26.   .31          *.05         .49
2.    .31          .26          .50           27.   .24          .15          .76
3.    .68          .33          .83           28.   .00          .33          .75
4.    .15          -.30         .86           29.   .26          .05          .41
5.    .48          .40          .70           30.   .33          .26          .26
6.    .48          .16          .68           31.   .35          .33          .26
7.    .00          -.05         .53           32.   .36          .36          .52
8.    .15          .31          .78           33.   .21          -.07         .29
9.    .00          -.06         .63           34.   .33          .23          .63
10.   .26          .44          .61           35.   .36          .21          .50
11.   .10          .41          .44           36.   .10          .21          .44
12.   .35          .40          .75           37.   .31          .10          .43
13.   .60          .45          .60           38.   .00          -.40         .03
14.   .59          .51          .61           39.   .41          .22          .63
15.   .38          .51          .65           40.   .81          .65          .71
16.   .51          .47          .79           41.   .38          .33          .63
17.   .45          .33          .65           42.   .70          .55          .51
18.   .25          .13          .79           43.   .75          .59          .64
19.   .35          .21          .79           44.   .59          .58          .76
20.   .20          .24          .82           45.   .45          .28          .58
21.   .45          .21          .53           46.   .45          .23          .46
22.   .24          .00          .23           47.   .59          .20          .35
23.   .44          .65          .75           48.   .44          .44          .26
24.   .42          .33          .60           49.   .68          .60          .26
25.   .22          .33          .33           50.   .21          -.06         .26

These validities were determined from a table of values of the Product-moment Correlation in a normal Bivariate Population corresponding to given proportions of success, given by Thorndike (1) and prepared by the Cooperative Test Service from a chart by Flanagan. The upper and lower groups were determined on the basis of the scores on the Pencil and Paper Test.

(1) Thorndike, R.L., Personnel Selection. New York: John Wiley and Sons, Inc., 1949, pp. 347-351.

APPENDIX I

INTERNAL CONSISTENCIES AND DIFFICULTIES OF CRITERION TEST ITEMS

Item  Coefficient  Difficulty    Item  Coefficient  Difficulty
1.    .27          .96           1.    .37          .46
2.    .55          .78           2.    -.23         .54
3.    .35          .72           3.    .21          .54
4.    .18          .87           4.    .63          .69
5.    .56          .54           5.    .60          .43
6.    .48          .74           6.    .68          .17
7.    .48          .95           7.    .55          .49
8.    .10          .89           8.    .50          .43
9.    .15          .54           9.    .21          .67
10.   .18          .75           10.   .00          .49
11.   .15          .40           11.   .00          .52
12.   -.10         .89           12.   .39          .68
13.   .25          .89           13.   .63          .54
14.   -.11         .36           14.   .52          .65
15.   .24          .73           15.   .12          .56
16.   .34          .33           16.   -.05         .43
17.   .33          .68           17.   .11          .29
18.   -.05         .89           18.   .30          .48
19.   .30          .74           19.   .28          .68
20.   .00          .20           20.   .15          .54
21.   .06          .33           21.   .48          .72
22.   .25          .06           22.   .71          .79
23.   .21          .38           23.   .50          .49
24.   .40          .22           24.   .07          .18
25.   .55          .78           25.   .16          .36
26.   .07          .24           26.   .42          .79
27.   .54          .17           27.   .67          .76
28.   .1*0         .35           28.   .11          .63
29.   -.15         .04           29.   .40          .70
30.   .40          .20           30.   .51          .76
31.   .48          .22
32.   .59          .60
33.   .36          .28
34.   .68          .79

These validities were determined from a table of values of the Product-moment Correlation in a normal Bivariate Population corresponding to given proportions of success, given by Thorndike (1) and prepared by the Cooperative Test Service from a chart by Flanagan. The upper and lower groups were determined on the basis of the scores on the criterion test.

(1) op. cit., pp. 347-351.

APPENDIX J

T-SCORES FOR THE PENCIL AND PAPER TEST

Raw    Percentile  T        Raw    Percentile  T
Score              Score    Score              Score
50                 85       25     42          45
49                 83       24     36          44
48                 81       23     30          42
47                 80       22     24          41
46                 78       21     17          39
45     100         77       20     12          37
44     100         75       19     9           36
43     100         74       18     6           34
42     100         72       17     4           33
41     99          70       16     3           31
40     98          69       15     2           29
39     96          67       14     1           28
38     93          66       13     0.5         26
37     90          64       12     0           25
36     86          61       11                 23
35     82          59       10                 22
34     79          58       9                  20
33     75          56       8                  19
32     73          55       7                  17
31     71          54       6                  15
30     67          53       5                  14
29     63          51       4                  12
28     59          50       3                  10
27     54          48       2                  9
26     48          47       1                  7
                            0                  6

Percentiles computed from graph prepared from frequency distribution.

The derived scores were computed from the formula:

$$\text{T.S.} = \frac{10\,(X - M)}{\text{S.D.}} + 50$$

where T.S. is the derived score, X is the raw score, M is the mean of the distribution, viz., 28.08, and S.D. is the standard deviation, viz., 6.316.

APPENDIX K

PENCIL AND PAPER TEST SCALED TO PLACE FIFTEEN PERCENT BELOW A CRITICAL SCORE OF 50

Raw Score  Scaled Score    Raw Score  Scaled Score
10         10              30         66
11         11              31         68
12         12              32         70
13         20              33         72
14         25              34         74
15         30              35         76
16         34              36         77
17         38              37         79
18         41              38         82
19         45              39         86
20         46              40         90
21         48              41         92
22         50              42         95
23         52              43         97
24         54              44         97
25         56              45         98
26         59              46         98
27         61              47         99
28         63              48         99
29         65              49         99
                           50         100

The Scaled Score was derived from cumulative frequency curves based on (1) the raw scores of the pencil and paper test, and (2) a normal distribution of scores with the median set at 63 and the standard deviation set at 13. This method is employed by the British Columbia Department of Education in scaling scores on University Entrance Examinations.
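As a minimal sketch, not part of the thesis, the derived-score formula of Appendix J and the target distribution named in Appendix K can be checked numerically in Python. The function names `t_score` and `fraction_below` are illustrative assumptions. The mean 28.08 and standard deviation 6.316 are the values quoted in Appendix J, and the median 63 and standard deviation 13 are those quoted in Appendix K. The second function only reports the share of a normal curve lying below a given score; it does not reproduce the cumulative-frequency scaling procedure itself.

```python
from statistics import NormalDist

# Constants quoted in Appendix J for the pencil and paper test.
MEAN_RAW = 28.08
SD_RAW = 6.316


def t_score(raw, mean=MEAN_RAW, sd=SD_RAW):
    """Derived score: T.S. = 10 * (X - M) / S.D. + 50."""
    return 10 * (raw - mean) / sd + 50


def fraction_below(score, median=63.0, sd=13.0):
    """Share of a normal distribution (median and S.D. as quoted in
    Appendix K) that falls below `score`."""
    return NormalDist(mu=median, sigma=sd).cdf(score)


if __name__ == "__main__":
    # Spot-check a few raw scores against the Appendix J table.
    for raw in (50, 28, 25, 0):
        print(raw, round(t_score(raw)))
    # Share of the target normal curve lying below the critical score of 50.
    print(round(fraction_below(50), 3))
```

With these constants, raw scores of 50, 28, 25 and 0 round to derived scores of 85, 50, 45 and 6, which agree with the corresponding entries in Appendix J. A score of 50 lies exactly one standard deviation below a median of 63, so roughly 16 percent of the target distribution falls below it, close to the fifteen percent named in the title of Appendix K.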
