UBC Theses and Dissertations

An investigation of the effects of discovery learning on retention at two levels of mental functioning Kroeker, Leonard Paul 1967

AN INVESTIGATION OF THE EFFECTS OF DISCOVERY LEARNING ON RETENTION AT TWO LEVELS OF MENTAL FUNCTIONING

A Thesis Presented to the Faculty of the College of Education
The University of British Columbia

In Partial Fulfillment of the Requirements for the Degree Master of Arts

by
Leonard Paul Kroeker
August 1967

We accept this thesis as conforming to the required standard

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Education
The University of British Columbia
Vancouver 8, Canada

Date: August 15, 1967

ABSTRACT

It was hypothesized that greater mean retention would occur at the Knowledge and Application levels of mental functioning, as defined by Bloom's Taxonomy of Educational Objectives, among students taught by a discovery method than among students taught by a more conventional, lecture-demonstration method. It was also hypothesized that greater mean retention would occur at the Application level than at the Knowledge level among students taught by a discovery method than among students taught by a lecture-demonstration method.

Each of two ninth grade science classes in a single school was taught a heat unit using one of the methods mentioned above.
The teaching methods were assigned randomly to intact classes, both handled by the same teacher. Two multiple choice achievement tests covering the content of the heat unit were constructed, one consisting of items in the Knowledge category and the other consisting of items in the Application category. A tryout of each of these tests was conducted upon 160 students in a single school, thus allowing the elimination of unsuitable items, specifically, items whose discrimination indices were negative and those whose difficulty indices were either too high or too low. The resulting unit tests, with forty Knowledge and thirty-two Application items respectively, were administered to the students of the two classes both immediately following and six weeks following the conclusion of the heat unit. The reliability coefficients of the Knowledge and Application tests, estimated by correlating the half test scores and applying the Spearman-Brown formula, were .82 and .80 respectively.

Covariates, chosen on the basis of their correlation with loss scores (measures of retention), were used to adjust experimental and control group loss score means. The analysis of covariance showed a significant difference between Application loss score means and a non-significant difference between Knowledge loss score means at the pre-set 5 percent significance level. It was therefore concluded that this experiment provided evidence for the acceptance of the experimental hypothesis dealing with the retention of application objectives and for the rejection of the experimental hypothesis dealing with the retention of the knowledge objectives.

Items, matched on the basis of content, difficulty index, and discrimination index, were selected from the two unit tests to form Knowledge and Application subtests.
Loss scores were calculated from the subtest results and the differential retention hypothesis was tested using a Z statistic. The analysis revealed a non-significant difference between differential loss score means at the pre-set 5 percent significance level.

It was concluded that there was no statistical basis to support the hypothesis that greater mean differential retention (between Application and Knowledge) would occur among students taught by a discovery method than among students taught by a lecture-demonstration method. It was, however, suggested that further experimental refinements might possibly produce significant results when testing the differential retention hypothesis in a future replication of the study. Theoretical and practical implications of the findings were also discussed.

ACKNOWLEDGEMENT

I wish to express my thanks to Prof. H. Cannon of the Faculty of Education for his patient guidance throughout. I am also indebted to Dr. D. McKie of the Faculty of Education for his time spent in discussion and for his many helpful suggestions.

Thanks are also due to Mrs. J. Woodrow of the Faculty of Education for her careful proof-reading and many suggestions regarding the writing of the thesis, and to the late Mr. D. Webster of the Faculty of Education for his careful analysis of the tests used. Finally, I wish to thank my wife, Mrs. A. Kroeker, for her careful typing of the manuscript.

TABLE OF CONTENTS
CHAPTER I. THE PROBLEM AND HYPOTHESIS
  1.0 Introduction
  1.1 The Problem
    a) Statement of the problem
    b) Statement of the hypotheses
    c) The null hypotheses
    d) The alternative hypotheses
  1.2 Definition of Terms
  1.3 Discussion of the Problem
  1.4 Importance of the Study

CHAPTER II. REVIEW OF THE LITERATURE
  2.0 Introduction
  2.1 Literature Related to Teaching Methods
  2.2 Literature Related to Instructional Objectives and Retention
  2.3 Summary

CHAPTER III. SPECIFICATION OF RELEVANT PREPARATIONS
  3.0 Introduction
  3.1 The Content of the Unit
  3.2 The Unit Tests
    a) Content of the unit tests
    b) Type of test
  3.3 The Pretests
  3.4 The Subjects
  3.5 The Tryout Problem
  3.6 The Experimental Design

CHAPTER IV. METHODOLOGY
  4.0 Introduction
  4.1 Preparation of the Students
  4.2 The Instruction of the Unit
  4.3 Preparation of the Tryout Tests
  4.4 Administration of the Tryout Tests
    a) The problem of guessing
    b) The results of the tryout tests
  4.5 The Administration of the Pretests
  4.6 The Preparation and Administration of the Final Test
    a) The selection of the items
    b) The construction of the unit tests
    c) The administration of the unit tests
  4.7 The Administration of the Retest
    a) Administration of the retest
    b) The test results

CHAPTER V. THE STATISTICAL ANALYSIS
  5.0 Introduction
  5.1 The Unit Test Results and Reliabilities
    a) The unit test scores
    b) The reliability of the unit tests
  5.2 The Pretest Results
  5.3 The Statistical Test of Hypothesis I
  5.4 The Statistical Test of Hypothesis II
  5.5 The Loss Score Reliabilities
  5.6 The Subtests Used in the Test of Hypothesis III
  5.7 The Statistical Test of Hypothesis III

CHAPTER VI. CONCLUSIONS AND SUMMARY
  6.0 Introduction
  6.1 Limitations of the Study
  6.2 Conclusions
  6.3 Implications of the Study
  6.4 Possibilities for Further Research
  6.5 Summary of the Investigation

BIBLIOGRAPHY
APPENDIX A. Test Directions
APPENDIX B. Sample Lessons
APPENDIX C. The Unit Tests
APPENDIX D. The Data
APPENDIX E. Derivation of Test Statistic

LIST OF TABLES

  I. Distribution of the Item Difficulty Indices for the Knowledge and Application Tryout Tests
  II. Distribution of the Item Discrimination Indices for the Knowledge and Application Tryout Tests
  III. Distribution of Items by Content Category in the Knowledge and Application Unit Tests
  IV. Distribution of the Unit Test Scores
  V. Distribution of the Difficulty and Discrimination Indices for the Unit Tests
  VI. Distribution of the Loss Scores
  VII. Distribution of the Verbal Reasoning and Numerical Ability Pretest Scores
  VIII. Distribution of the Read General Science Pretest Scores
  IX. Distribution of Items by Content Category in the Knowledge and Application Subtests
  X. Distribution of Difficulty Indices in the Subtests

CHAPTER I

1.0 INTRODUCTION

The old high school general science courses in British Columbia, although intended to achieve major objectives such as the development of fundamental scientific concepts, the development of understanding of scientific principles and the acquisition of an appreciation of the scientific method, have in fact tended toward an overemphasis upon factual recall for examination purposes.¹ The advent of the P.S.S.C., B.S.C.S., and the CHEM. STUDY courses in the senior high school program has issued a challenge to the old general science courses in that the former may tend to achieve common desirable objectives such as those listed above in a significantly greater way than the latter.
It is believed, for example, by the authors of the new science courses listed above that intuitive concept development is given greater opportunity to flourish under conditions of empirical laboratory investigation and that these concepts so developed lead to a greater understanding of the inter-relationships within the structure of the subject.

¹British Columbia Department of Education, Science Curriculum Bulletin (Victoria, B.C.: The Queen's Printer, 1963), p. 1.

As a result of the concern whether high school general science courses met objectives such as those listed above, science revision committees in British Columbia have been prompted to write alternative science units which have recently been collected and adopted as the prescribed general science courses in British Columbia high schools by the Curriculum Division of the Department of Education.

The physics units in these general science courses for grades eight, nine and ten are short physics courses at a very elementary level. The units, developed within the framework of objectives alluded to above, have been designed to allow students to discover basic physical principles for themselves.² The basis for the implementation of these courses has been the belief that self discovery learning will achieve the above objectives more effectively than other instructional methods. The extent to which this belief is justifiable is a problem worthy of investigation.

Whether or not students will discover principles for themselves will depend upon, among other things, the materials and facilities available, the time allotted, and the preparation, industriousness, and resourcefulness of the teacher. Therefore, any investigation central to the problem of self discovery in the classroom situation requires information concerning the factors mentioned above.

It is within the context of the above discussion that the present study finds its roots.
A single aspect of discovery learning in the classroom will be investigated in order to shed some light on the efficiency of self discovery classroom learning compared to lecture demonstration learning.

²British Columbia Department of Education, Science Curriculum Bulletin (Victoria, B.C.: The Queen's Printer, 1965), p. 1.

1.1 THE PROBLEM

a) Statement of the problem. The purpose of this experiment is to compare the effectiveness of a laboratory centered method of science teaching and a lecture demonstration method, using retention as the criterion variable at two levels of mental functioning as defined by Bloom's Taxonomy of Educational Objectives.³

A unit on heat will be taught to two classes of grade nine students. The first class will be taught through self discovery and the second through a lecture demonstration approach. Retention of the material in the heat unit will be measured by the difference in an individual's scores on the same test given on two occasions separated by about one and a half months.

Two self-defining achievement tests will be constructed in order to test two separate levels of mental functioning; as defined by the Taxonomy of Educational Objectives,⁴ they are the knowledge level and the application level.

Each of these two tests will then be used separately in the determination of retention on the Knowledge criterion and on the Application criterion. Each separate set of test results will be compared to determine whether retention has been affected by the teaching method used.

³Benjamin S. Bloom, et al., Taxonomy of Educational Objectives. Handbook I: Cognitive Domain (New York: Longmans, Green & Co., 1956).
⁴Ibid.
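The retention measure described above can be illustrated with a minimal sketch. It assumes the signed difference is taken as first score minus second score, which is how the loss score is defined in Section 1.2. The function names and all scores below are hypothetical and are not data from this study; the sketch also omits the covariance adjustment used in the actual analysis.

```python
from statistics import mean


def loss_scores(first, second):
    """Return the signed difference (first minus second) for each student."""
    if len(first) != len(second):
        raise ValueError("each student needs a first and a second score")
    return [a - b for a, b in zip(first, second)]


def mean_loss(first, second):
    """Average loss score for one group of students."""
    return mean(loss_scores(first, second))


# Hypothetical scores for illustration only; not data from the study.
discovery_first = [30, 28, 35, 25]
discovery_retest = [28, 27, 31, 24]
lecture_first = [31, 27, 34, 26]
lecture_retest = [26, 24, 30, 22]

discovery_mean_loss = mean_loss(discovery_first, discovery_retest)
lecture_mean_loss = mean_loss(lecture_first, lecture_retest)

# A smaller mean loss score corresponds to greater mean retention.
print(discovery_mean_loss, lecture_mean_loss)
```

Under these assumptions, the mean score on the retest plus the mean loss score equals the mean score on the first administration, so the two measures carry complementary information about the same group.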
A secondary investigation will also be carried out in order to determine whether teaching method differentially affects retention on the Knowledge criterion and retention on the Application criterion.

b) Statement of the hypotheses. It is hypothesized that there is significantly greater mean retention at the Knowledge level of mental functioning among ninth grade science students taught by means of a self discovery method than among those taught by means of a lecture demonstration method.

Similarly, it is hypothesized that there is significantly greater mean retention at the Application level of mental functioning among ninth grade science students taught by means of a self discovery method than among those taught by means of a lecture demonstration method.

Finally, it is hypothesized that there is significantly greater mean differential retention at the Application level than at the Knowledge level of mental functioning among ninth grade science students taught by means of a self discovery method than among those taught by means of a lecture demonstration method.

c) The null hypotheses.

1. There is no significant difference in mean loss score at the Knowledge level between the experimental group and the control group when the significance level is set at five percent.

2. There is no significant difference in mean loss score at the Application level between the experimental group and the control group when the significance level is set at five percent.

3. There is no significant difference between the difference in experimental and control group loss score means at the Application level and the difference in experimental and control group loss score means at the Knowledge level when the significance level is set at five percent.

d) The alternative hypotheses.

1. The experimental group mean loss score is significantly smaller than the control group mean loss score at the Knowledge level when the significance level is set at five percent.

2. The experimental group mean loss score is significantly smaller than the control group mean loss score at the Application level when the significance level is set at five percent.

3. The difference in experimental and control group loss score means at the Application level is significantly greater than the difference in experimental and control group loss score means at the Knowledge level when the significance level is set at five percent.

1.2 DEFINITION OF TERMS

Knowledge level. According to Bloom, knowledge "involves the recall of specifics and universals, the recall of methods and processes, or the recall of a pattern, structure, or setting. ... The knowledge objectives emphasize most the psychological processes of remembering."⁵ It is this definition which will be referred to when the phrase "knowledge level" is used in this study.

⁵Bloom, op. cit., p. 201.

Application level. According to Bloom, application refers to the use of an appropriate abstraction in the solution of a problem in which no mode of solution has been specified. The abstractions referred to may be used in particular and concrete situations. They "may be in the form of general ideas, rules of procedures, or generalized methods and may also be technical principles, ideas, and theories which must be remembered and applied."⁷ The behaviors thus specified will be referred to as the "application level" in this study.

⁶Bloom, op. cit., p. 120.
⁷Bloom, op. cit., p. 205.

Academic-technical stream. The academic-technical stream refers to the British Columbia high school stream composed of students who are interested primarily in furthering their education at the university level.

Experimental group. The experimental group refers to the ninth grade science students in this study who were taught by means of a self discovery method.

Control group. The control group refers to the ninth grade science students in this study who were taught by means of a lecture demonstration method.

Loss score. Loss score refers to the signed difference obtained when an individual subject's second score is subtracted from his first score on the same test.

Mean loss score. Mean loss score refers to the average of the signed differences of all the subjects within the same group.

Mean retention. Mean retention is defined as the measure which is the average score of all the subjects within a single group on the second administration of the test to the group. Mean retention and mean loss score are complementary in that their sum is the average score of the same subjects on the first administration of the test.

Self discovery method. Self discovery method is defined as the procedure in which students are guided by the teacher in discovering for themselves physical principles in the laboratory.

The behavior of the teacher will depend upon the goals to be achieved and the present behavior of the students. These goals include ones cited earlier, namely, the development of fundamental scientific concepts, the development of understanding of scientific principles, and the acquisition of an appreciation of the scientific method.
Teaching procedures which may lead to these goals are the following:

i) the opportunity for each student to proceed at his own rate;
ii) the provision for a basic learning sequence for all students;
iii) the provision for a laboratory experience as an integral part of the learning sequence; and
iv) the provision for enrichment through the investigation of optional problems.

Student understanding of the scientific method of inquiry is brought about through emphasis on careful observation, diligent record keeping, critical evaluation of assumptions and circumspect procedure in the drawing of inferences.

The teacher is to define the task for the students by requesting they prepare at home for the experiment to be performed by reading carefully the distributed mimeographed experiment description and guide questions and by discussing with them on the day the experiment is to be performed the nature of the problem under investigation, the possible hypotheses and methods of investigation, and the operation of any unfamiliar equipment. The teacher is not to demonstrate how the experiment is to be performed, however.

The teacher is to be well organized and have all materials ready for distribution when the students enter the classroom. He is expected to communicate well with his students and manage his classroom effectively in terms of productive student behavior and time allotted to various activities. The teacher is expected to maintain a friendly atmosphere and to draw on his own resourcefulness and creativity to promote intrinsic student motivation by preparing additional materials for optional investigation. He is to give only sufficient information to kindle interest and is not to solve a problem for the student.

The students are to be grouped in threes and are to perform the experiment, answer the guide questions, and formulate their own conclusions.
They are to submit the experimental report in a format of their own choice at the end of the science period and they will receive the graded report at the beginning of the following science period. The grading procedure is calculated to encourage the students through constructive written comments and stimulate additional investigation. A brief discussion concerning the results of the experiment will also take place when the reports are returned.

It is expected that some students will be ahead of others; they may therefore continue with the next experiment in the sequence or continue with investigation of interesting problems arising from their previous work.

Lecture demonstration method. Lecture demonstration is defined as the procedure in which students learn through listening to the teacher's verbal presentation of material, through watching his demonstration of physical principles and through completing the written assignments required of them by the teacher.

The chief goals for this method are the same as those for the self discovery method, namely, the development of scientific concepts, the development of understanding of scientific principles, and the acquisition of an appreciation of the scientific method. During lectures as well as demonstrations the students are encouraged to observe carefully, keep records accurately, evaluate assumptions critically, and draw inferences cautiously.

The teacher is to define the task for the students by presenting aspects of theory both verbally and written on the blackboard in order to illuminate certain physical principles in the material taught. He is to consider problems similar to those posed in the self discovery method. In a similar way he is to discuss the nature of the problem to be investigated by demonstration experiment and the possible hypotheses and methods that might be used in the investigation. The teacher is to
The teacher i s to  perform the demonstration experiment and ask specific questions of  students in order to provide cues for observation.  Guide questions  10  similar to those used in the self discovery method are to be answered by the students and finally the students are asked to formulate their own conclusions. The teacher i s similarly expected to exhibit the same characteri s t i c s of organization, communication, management, friendliness, and creativity in the performance of his tasks in the lecture demonstration method as are exhibited in the self discovery method.  Adequate prepara-  tion and anticipation should serve to control these characteristics Q  which have been related to student performance. Since the teacher may have more time available in the lecture demonstration method because of the expected efficiency in comparison to the self discovery method he may e l i c i t more responses from students to assist in the formulation of concepts and the presentation of ideas. Additional demonstrations pertinent to the development of the unit may also be performed. The students are to complete demonstration experiment reports in a given format and are to submit the reports and guide questions at the end of a science period in which a demonstration experiment has been performed.  These are to be graded in the same manner as those submitted  by students in the self discovery method and are also to be returned at the beginning of the following science period with a f u l l discussion of the experimental results and answered guide questions. ^Fletcher G. Watson, "Research on Teaching Science," Handbook of Research on Teaching, N. L. Gage, editor (Chicago: Rand McNally & Co., 19&3), PP. 1037-1C4O.  11 1.3  DISCUSSION OF THE PROBLEM The question of the relative merits of various teaching methods  has been considered by a number of educational researchers.  
Although  the research to date on the effects of teaching method on achievement is inconclusive and i t i s unlikely that any one method i s superior to any other when the over-all effects are appraised, "the best one might hope for would be slight differences in teaching effectiveness within narrow aspects of the learning process, and this i s roughly what i s found by empirical research."9 One of the problems of the researcher in science education i s whether or not the new general science courses which have been designed to be taught by discovery methods are necessarily superior to the science courses which have been replaced. This i s not the object of the present study.  However, the discovery method has such an apparently  heavily invested interest in the new science courses i t has been proposed in this study to determine the effectiveness of a self discovery teaching method in comparison to a lecture demonstration method which one might reasonably expect to be used for instructional purposes because of inadequate materials or f a c i l i t i e s , inadequate teacher preparation in laboratory science or overlarge classes. In order to clarify this problem a discussion of the problem in context i s in order.  Traditionally the laboratory in high school  science courses has been given a place of secondary importance since  ^Norman E. Wallen and Robert M. W. Travers, "Analysis and Investigation of Teaching Methods," Handbook of Research on Teaching, N. L. Gage, editor (Chicago: Rand McNally & Company, 196377 P* 500.  12 the laboratory has long been regarded as a location where verification of classroom science learning occurred.  Scientists, however, use the  laboratory primarily as a place for investigation.  If the student is  to appreciate how the scientist works and thinks i t would seem apparent that the laboratory should serve the needs of science teaching in the same manner. 
Many individuals currently involved in science education believe increases in student involvement in science program activities is essential to greater subject matter understanding. This belief is borne out by the replacement of science courses cited earlier. Whether or not this action can be supported by research findings is a question whose solution may in a partial measure be indicated by the results of this study.

Suchman asserts that educators "have been prompted to reformulate their methods to capitalize on the intense motivation and deep insight that seem to accrue from the 'discovery' approach to concept attainment."¹⁰ The Physical Science Study Committee has adopted the view that students can be sufficiently interested in physics by an experimental approach and can thus be motivated to perform the difficult task of scientific discovery. The Department of Education's philosophy underlying the new science courses appears to be similar in this respect.

Although it appears that no single teaching method has been demonstrated to be clearly superior to any other in terms of achievement, the effectiveness of a teaching method seems to be increased when student inquiry is emphasized.¹¹ Thus, it is this aspect of student inquiry which one might reasonably expect to be stimulated to a greater extent by self discovery learning than by a lecture demonstration method.

¹⁰J. Richard Suchman, "Inquiry Training: Building Skills For Autonomous Discovery," Merrill-Palmer Quarterly, vol. 7 (July, 1961), p. 147.

In order to form a basis upon which to evaluate the relative merits of the two teaching methods listed above, several criteria other than achievement might be considered. Among these, the most prominent are retention, transfer of training, and attitudes towards science.
For the purposes of this study retention at two levels of mental functioning as defined by the Taxonomy has been chosen as the criterion upon which to evaluate the two methods.

1.4 IMPORTANCE OF THE STUDY

It has been stated earlier that the replacement of general science courses has occurred in British Columbia upon limited supportive research evidence. The working hypothesis of the Department of Education has evidently been that a laboratory centered approach to science teaching is more beneficial to students in terms of understanding scientific facts and laws than a non-laboratory centered approach.¹²

This study will attempt to support this working hypothesis by comparing a typical non-laboratory centered approach with a laboratory centered approach to science teaching. As such an attempt, it is expected to provide information concerning only a single portion of the problem, leaving sufficient scope for much further investigation. However, should it prove fruitful, a significant contribution to this field will have been made.

¹¹U.S. Department of Health, Education, and Welfare, Office of Education, New Dimensions in Higher Education (Effectiveness in Teaching No. 2. Washington, D. C.: 1960), p. 15.
¹²British Columbia Department of Education, Junior Secondary School Science (Victoria, B. C.: The Queen's Printer, 1960), p. 3.

It is apparent from previous research findings that teaching method appears to have a non-significant effect on achievement. This does not suggest, however, that further research on teaching methods be abandoned but rather that further research be carried out to devise other teaching methods which perhaps make as much use as possible of a wide range of learning principles.

Also, further investigation is necessary to determine the possibility of the existence of a significant relationship among other instructional objectives and teaching methods. One of these instructional
One of these instructional  objectives is retention, which of course is the basis for the present study. Although much investigation concerning retention has been of a general nature, very l i t t l e has dealt with the nature of specific outcomes. This study is not only attempting to establish retention as a function of teaching method but also retention as a function of behavioral objectives.  Research to date has included investigation  concerning differential retention among the behavioral objectives defined by Bloom at the knowledge and the comprehension l e v e l . ^  The usefulness  to educators of research findings Indicating significantly greater retention on the upper levels of mental functioning is clearly evident especially when considered with the relatively low relationship between  ^William P. McDougall, "Differential Retention of Course Outcomes in Educational Psychology," Journal of Educational Psychology, 49, No. 2, April 1958, p. 53.  t e s t s of some of the more complex cognitive a b i l i t i e s and s k i l l s and measures of higher  intelligence. '* 1  On a more immediate, u t i l i t a r i a n l e v e l , confirmation of the research hypotheses might w e l l be s u f f i c i e n t encouragement f o r science teachers t o use the s e l f discovery method, despite the inconveniences, instead of abandoning i t out o f sheer  Sjloom, og. c i t . , p. 2 2 .  frustration.  CHAPTER II REVIEW OF THE LITERATURE 2.0  INTRODUCTION This chapter surveys the literature pertaining to relevant  instructional methods and objectives and to noteworthy retention studies. It also provides a c r i t i c a l discussion of a study, similar to the present one, Investigating the effects of Instructional methods on achievement. 2.1  LITERATURE RELATED TO TEACHING METHODS. Teaching methods based on philosophical tradition, folklore, and  personal needs of teachers have frequently been used in Instructional method comparison studies without much success.  
This is perhaps attributable to the fact that very little has been done to develop teaching methods on the basis of scientific knowledge of learning.

Studies comparing teaching methods have also lacked scientific sophistication in that the variables involved have reflected few of the properties of well developed scientific variables. Wallen and Travers state that these variables tend to be intuitively derived rather than empirically derived.¹ The following results should therefore come as no surprise. In the 1930's, many studies were undertaken in which outcomes of education in schools reputed to be progressive were compared with those in schools reputed to be more traditional. The independent variable in this case was not differences in teaching practices but differences in reputation. Characteristically, these studies have shown negative results.²

¹Wallen and Travers, op. cit., p. 466.

The Eight-Year Study, a comparison between "progressive" and "traditional" school graduates on the basis of success in college, showed that a positive correlation existed between the degree of success in college and the degree of experimentation in the school. However, several major criticisms regarding the internal validity of the study have been made: first, the lack of subject randomization and its potential consequences, such as selection-treatment interaction bias; and second, the possible confounding of the variable under study by factors such as teacher competence, personality, and enthusiasm.³

Although exhaustive designing of the two teaching methods is not contemplated, it would seem appropriate for the purposes of this study to specify characteristics of optimum teacher behavior with respect to classroom activities for each of the methods. This would meet the criticism levelled at so many methods comparison studies that these studies essentially compare two unknowns and do not provide for the replication of the study.
Artificiality of treatment also has less opportunity to endanger the validity of the study.

The value of laboratory work in high school science courses has also been the subject of a large number of studies. Cunningham has found that many of these early studies have suffered from inadequacy of statistical treatment, invalidity of design, unreliability of results, and lack of clarity of objectives.⁴

Recently, Boldt has compared, on the basis of achievement, a group of grade nine and ten students taught by a self discovery method with a similar group taught by a demonstration method.⁵ He reported a significant difference between the means of the two methods groups at the one percent level of significance. The demonstration group performed at a significantly higher level than the experimental group on a teacher constructed, self defining achievement test.

⁴H. A. Cunningham, "Lecture method versus individual laboratory method in science teaching - A Summary," Science Education, 30 (1946), pp. 70 - 82.
⁵Walter B. Boldt, "Grade Placement of an Experimental Unit in Secondary School Physics" (unpublished Master's thesis, the University of British Columbia, Vancouver, 1963).

His findings should be interpreted with considerable caution, however. Most of the students involved lacked experience in the use of the laboratory. Specifically, they lacked sufficient knowledge, skill, and experience in handling the apparatus and in interpreting their data. No follow up discussions were held after a lesson, with the result that the students in the experimental group appeared to be in doubt concerning the correctness of their conclusions. The experimental group also appeared to have insufficient time to check their conclusions by repeating parts of the experiment. No attempt was made to compare the groups on the basis of any other teaching objectives.

2.2  LITERATURE RELATED TO INSTRUCTIONAL OBJECTIVES AND RETENTION

Specification of the outcomes in this experiment is to be made in terms of the broad classifications of educational objectives listed in the Taxonomy of Educational Objectives.⁶

The Taxonomy is to be chosen for use in the study because of its classification of educational objectives in observable, describable, behavioral terms.⁷ Detailed descriptions and illustrations of the behaviors listed in each category facilitate classification of test items to be used in this study in terms of the objectives which they are to measure.

In view of the results of studies on the retention value of different outcomes, it is felt that the educational objectives known as the Knowledge and Application levels in the Taxonomy are representative of the cognitive skills and abilities one would expect to develop through the instruction of an experimental science unit such as the one to be used in this study.

Tyler's studies show that knowledge of specific information is not a lasting outcome of instruction in comparison with the ability to apply principles to new situations.⁸

⁶Bloom, op. cit.
⁷Ibid., p. 5.
⁸Ralph W. Tyler, "Permanence of Learning," With the Technicians, Journal of Higher Education, IV (April 1933), pp. 203 - 204.

Furst notes:

The performance of learning is likely to be greatly enhanced if the various outcomes have acquired some interrelation in behavior. Isolated skills and items of information tend to be forgotten rapidly but those aspects of behavior which bear a functional relationship to other aspects have a much greater probability of being called in to use periodically and of being reinforced.
It is known from research studies that those learning outcomes which are continually reinforced and progressively developed into more generalised modes of reaction are most likely to survive in the long run.⁹

Tyler correlated tests designed to measure different types of outcomes and found low correlation between scores on information and scores on application.¹⁰ He concluded that since students did not develop corresponding degrees of ability in the course outcomes studied, the outcomes must have been different from each other. Tyler also produced evidence that the more complex objectives were retained somewhat better than the less complex objectives. Furst notes that an educational implication of Tyler's findings is that it is necessary in teaching to aim explicitly at each of these major objectives, rather than to assume that the development of ability to think arises from the acquisition of information.¹¹

⁹Edward J. Furst, "Effect of the organization of learning experiences upon the organization of learning outcomes," Journal of Experimental Education, 18 (1950), p. 215.
¹⁰Ralph W. Tyler, "The relation between recall and higher mental processes," Education as Cultivation of the Higher Mental Processes, C. H. Judd, editor (New York: The Macmillan Co., 1936), pp. 6 - 17.
¹¹Edward J. Furst, Constructing Evaluation Instruments (New York: Longmans, Green and Co., 1958), pp. 49 - 50.

A major problem facing educators is that of "making learning experiences of students more lasting."¹²

¹²William P. McDougall, op. cit., p. 53.

The Taxonomy points out the need for further research concerning retention of specific course outcomes in the following quotation:

For the most part research on the problems in retention, growth and transfer has not been very specific with respect to the particular behavior involved.
Thus, we are not usually able to determine whether one kind of behavior is retained for a longer period of time than another or which kinds of educative experiences are most efficient in producing a particular kind of behavior. Many claims have been made for different educational procedures, particularly in relation to permanence of learning; but seldom have these been buttressed by research findings.¹³

¹³Bloom, op. cit., p. 23.

McDougall reports that there was differential retention among the behavioral objectives he measured, namely: (a) knowledge, (b) translation, (c) interpretation, and (d) extrapolation.¹⁴ The tests constructed to measure these four behavioral outcomes were administered as a pre-test before a unit on educational psychology was studied, at the completion of the unit, and after a period of four months had elapsed. "The results of the study of retention indicated that the abilities to interpret and extrapolate were retained to a significantly greater degree than the ability to recall or translate this knowledge from one form to another."¹⁵ About 79 percent of the gains in interpretation and extrapolation abilities were retained after four months. Gains in knowledge and in the ability to translate knowledge were retained to a significantly lesser degree, approximately 73 percent.

McDougall also found that tests constructed to measure certain behavioral outcomes performed separate functions as evaluation devices. He suggested that if multiple course outcomes, in line with the instructional objectives, were to be achieved, it would be necessary to design evaluation instruments to accomplish these separate functions.¹⁶

¹⁴McDougall, op. cit., p. 59.
¹⁵Ibid.
¹⁶McDougall, op. cit., p. 59.

Smeltz investigated the extent to which learnings of high school chemistry were retained one year following the completion of the course.¹⁷ He reported that pupils retained approximately 68 percent of the information achieved during the year of instruction as measured by certain standardized chemistry tests. He also concluded that the amount of chemistry retained was more closely related to achievement than to intelligence.

¹⁷John R. Smeltz, "Retention of Learnings in High School Chemistry," The Science Teacher, 23, October 1956, p. 285.

In the realm of pure research, as opposed to technological research, O'Kelley and Heyer have shown that a simple habit learned by male albino rats under a high degree of motivation is retained more effectively than a similar habit learned under a low degree of motivation.¹⁸ A similar study involving human subjects showed similar significant results.¹⁹

¹⁸Lawrence I. O'Kelley and Albert W. Heyer, "Studies in Motivation and Retention," The Journal of Comparative and Physiological Psychology, 41, 1948, p. 466.
¹⁹Lawrence I. O'Kelley and Albert W. Heyer, "Studies in Motivation and Retention: II. Retention of Nonsense Syllables Learned Under Different Degrees of Motivation," The Journal of Psychology, 27, 1949, p. 143.

Bruner, discussing human memory, states that "unless detail is placed into a structured pattern, it is rapidly forgotten".²⁰ He suggests that a vivid detail that carries the meaning of an event may be a technique of condensation. "What learning general or fundamental principles does is to ensure that memory loss will not mean total loss, that what remains will permit us to reconstruct the details when needed."²¹ Bruner implies that a method involving discovery is a good means of instilling attitudes concerning the organization of student learning in such a way that learning is made useable and meaningful in thinking.²² He implies that learning gained by a student through discovery may be more accessible to him in the future.

²⁰Jerome S. Bruner, The Process of Education (New York: Vintage Books, 1960), p. 24.
²¹Bruner, op. cit., p. 25.
²²Bruner, op. cit., p. 20.

2.3  SUMMARY

This concludes the survey of the relevant literature. An attempt has been made to discuss methods comparison studies with respect to experimental improvements that might be incorporated within the design of the present study. This has drawn attention to the need for a more precise statement of educational objectives, among other things.

Studies relevant to the retention of various educational outcomes were then cited. The findings of Tyler and McDougall provide support for the present experimental hypotheses concerning differential retention among educational objectives.

CHAPTER III

SPECIFICATION OF RELEVANT PREPARATIONS

3.0  INTRODUCTION

The test of the experimental hypotheses required the choice of an appropriate experimental design, the choice of relevant pretests, and the construction of pertinent unit tests. Prior to the inception of relevant procedures a number of decisions were to be made. This chapter concerns itself with these decisions and also with the essential remaining experimental preparations.

3.1  THE CONTENT OF THE UNIT

The hypotheses stated in Chapter I, section 1 indicated that an experimental unit from the revised grade nine science curriculum was to be used in the study.
In order to reduce the possible Hawthorne effect, it was felt that the unit should be one taught not at the beginning of the course but at some later date, when the students had become accustomed to the teacher and also to the laboratory procedures. The heat unit appeared to be a suitable choice, especially since the author, who was to teach the unit, was more familiar with physics than with the other scientific disciplines.

The unit on heat was prepared by the British Columbia Junior Secondary Science Revision Committee. There were 26 experiments listed in the unit, with several of the experiments having a varying number of parts. The suggested time to be taken for the instruction of the unit was 22 class periods, each of one hour duration. Certain material in the Junior Secondary Science Revision Committee curriculum outline was deleted because it did not appear to contribute materially to the understanding of the unit, while in other instances further clarification of relevant concepts was sought by increasing the number of guide questions. Finally, a sequence of 17 lessons was developed, each lesson to be taught within a 55 minute class period.

The choice of material in the unit was governed not only by the factors listed above but also by the equal treatment time factor. It was decided that both treatments (methods) were to have equal class time to minimize maturational and historical interference which could adversely affect the internal validity of the study. All of the basic experiments recommended by the revision committee and a number of the enrichment sections were included in the final unit. Several members of the Science Education department at the University of British Columbia, namely, Mrs. J. Woodrow, Mr. D. Webster, and Mr. H. Cannon, were asked to confirm the choice of material to be included in the unit.

The content of the unit was divided into eight sections which might be given the following titles:

1.
The relationship between work and heat;
2.  Sources of heat;
3.  The expansion of solids, liquids, and gases;
4.  Temperature and thermometers;
5.  The measurement of heat;
6.  The transference of heat energy;
7.  Heat is a form of molecular motion; and
8.  Heat causes a change of state.

The heat unit was taken from the Science 9 Experimental Edition of the 1966 curriculum bulletin.¹

¹B. C. Department of Education, Junior Secondary School Science (Victoria, B. C.: The Queen's Printer, 1966).

3.2  THE UNIT TESTS

a) Content of the Unit Tests. As indicated by the earlier discussion regarding objectives, it was decided that the outcomes of interest were to be the Knowledge and Application categories of the Taxonomy. Since it has been generally accepted that the Taxonomy categories in the Handbook I: Cognitive Domain form a progression from simple to complex, and that each category contains the behaviors included in previous categories, it was felt that the choice of the objectives listed above was also desirable because of the degree of their categorical separation.

After thorough familiarization with the heat unit, the author designed the test items and classified them by objective. Several difficulties were noted during item classification: the first, that classification is relative to the nature of the instruction, and the second, that classification is relative to the behavior considered critical in the item, since most items involve more than one type of behavior. With respect to the former, Furst states:

that one must know or assume something about the nature of the students' prior experiences before one can classify a test item in a particular category. Thus, while on the surface a test item may seem to deal with the application of a principle, in the actual situation one could not be sure unless one knew whether the situation was in fact new to the students or whether it had been discussed previously. To consider the item as an application situation, one would have to assume that it was new to the students; otherwise it would fall in the recall-of-information category.³

²Edward J. Furst, Constructing Evaluation Instruments (New York: Longmans, Green and Co., 1958), p. 95.
³Ibid.

To ensure the least number of errors in item classification, several members of the Science Education department referred to earlier were asked to check the classifications.

Following this item classification two tests were to be constructed: one containing items evoking behaviors classifiable as Knowledge and the other containing items evoking behaviors classifiable as Application.

b) Type of Test. It was decided that both tests should be power tests, since the degree of speededness was of no interest as a factor in the study. Generous time limits were to be provided so that at least 90 percent of the students might attempt an answer to the last question on the test.

In comparison with other commonly used types of objective test items, multiple choice items are relatively high in ability to discriminate between better and poorer students. It was for this reason that multiple choice items were to be constructed for the two tests. The problem of the number of responses to each item was considered and it was felt that, in the interests of analysis efficiency, there should be an equal number of responses to each item. Further, five responses for each item would have been desirable from the standpoint of chance score reduction, but four responses per item appeared a more practical objective, especially in view of the difficulty of constructing valid distracting responses.
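The chance score consideration above can be made concrete. The following is a minimal sketch and is not taken from the thesis: the function name is hypothetical, the 40-item test length is used only for illustration, and it assumes that a student guesses blindly and independently on every item, so that the expected chance score is the number of items divided by the number of responses per item.

```python
def expected_chance_score(n_items: int, n_responses: int) -> float:
    """Expected number correct when every item is answered by blind guessing."""
    if n_items < 0 or n_responses < 1:
        raise ValueError("need n_items >= 0 and n_responses >= 1")
    # Each guess is correct with probability 1 / n_responses.
    return n_items / n_responses


# Illustrative comparison for a hypothetical 40-item multiple choice test.
four_option = expected_chance_score(40, 4)
five_option = expected_chance_score(40, 5)
print(four_option, five_option)
```

Under these assumptions, moving from four to five responses per item would lower the expected chance score by two points on a 40-item test, which indicates the size of the reduction that was weighed against the difficulty of writing a fifth plausible distractor.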
It was expected on a priori grounds that the mean difficulty index of the items on the Knowledge test would be greater than its counterpart on the Application test, so that one might be required to allow a greater portion of time per question for the Application items. This would imply decreasing the number of useable Application items on the one test compared with the larger number of Knowledge items on the other. This disparity in item numbers between tests could almost certainly be expected to yield different reliability coefficients for the two tests, since the greater the number of items on an examination, other things being equal, the more reliable the scores obtained from it.

Professional test writers are able to produce achievement test reliability coefficients above .90 with fewer than one hundred items. However, it was deemed impractical to design eighty to one hundred items for each of the two tests in view of the short duration of "experimental" instruction, and undesirable in view of the overlong testing time required.

The decision on the number of items to be included on each test, therefore, was largely determined by the amount of time available for test administration within a class period. A prominent consideration in the decision was the balance attempted between increasing the number of items to increase reliability and decreasing the number of items to decrease speededness. A practical compromise was struck and it was decided that, given a 50 minute test writing period, a test containing 40 items would be suitable on the basis of requirements mentioned previously.
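The relationship between test length and reliability described above is conventionally expressed by the Spearman-Brown prophecy formula. The following is a minimal sketch and not a procedure reported in the thesis: the function name is hypothetical, the starting figures (a 100-item test with reliability .90, shortened to 40 items) are chosen only to echo the numbers quoted above, and the formula assumes that the items added or removed are parallel to the remaining ones.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability of a test whose length is multiplied by length_factor.

    Assumes the added (or removed) items are parallel to the existing ones.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical illustration: a 100-item test with reliability .90 cut to 40 items.
projected = spearman_brown(0.90, 40 / 100)
print(round(projected, 2))
```

On these assumed figures the shortened test would be projected to have a reliability of roughly .78, which is in line with the expectation stated above that, other things being equal, fewer items yield less reliable scores.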
3.3  THE PRETESTS

The concept of pretest use in experimental design has been well entrenched in the methodology of research workers in psychology and education even though randomization between groups is recognised as the most adequate all-purpose assurance of lack of initial biases.⁴ However, when the use of intact groups precludes randomization, one may use statistical techniques to adjust group means on the dependent variable for variations among the groups on one or more concomitant variables, provided that the latter correlate to some degree with the dependent variable. In order to provide for such equalization among intact groups to be used in this study, it was necessary to identify those factors which might be expected to affect the experimental dependent variable differentially.

⁴Donald T. Campbell and Julian C. Stanley, "Experimental and Quasi-Experimental Designs for Research on Teaching," Handbook of Research on Teaching, N. L. Gage, editor (Chicago: Rand McNally & Co., 1963), p. 195.

It was felt that numerical ability, verbal reasoning, and prior general science knowledge were such factors. It is known that numerical ability and verbal reasoning are highly related to science achievement.⁵ Evidence from the study of behaviors related to achievement would tend to suggest that many of the same behaviors may be responsible for retention.⁶ The authors of the Taxonomy have noted that behaviors which can be generalized and applied in a number of different situations may be expected to exhibit greater permanence of learning than those which are so specific that they are likely to be encountered only rarely throughout the educational program.⁷

⁵G. K. Bennett, et al., Differential Aptitude Tests Manual (New York: The Psychological Corporation, 1963).
⁶Donald M. Medley and Harold E. Mitzel, "Measuring Classroom Behavior by Systematic Observation," Handbook of Research on Teaching, N. L. Gage, editor (Chicago: Rand McNally & Co., 1963), pp. 286 - 290.
⁷Bloom, op. cit., p. 42.

Similarly, prior general science knowledge may be expected to influence retention on a priori grounds. If a student scores well on a measure of general science knowledge then his performance can be explained in terms of his manifestation of greater interest and motivation.⁸

⁸Pauline S. Sears and Ernest R. Hilgard, "The Teacher's Role in the Motivation of the Learner," Theories of Learning and Instruction, N.S.S.E. Yearbook, Part I (Chicago: The University of Chicago Press, 1964), p. 182.

Since standardized tests providing measures on the three criteria were readily available commercially, it was decided that the following tests were to be administered as pretests prior to the instruction of the unit:

1.  The Numerical Ability Test of the Differential Aptitude Tests;
2.  The Verbal Reasoning Test of the Differential Aptitude Tests;⁹
3.  The Read General Science Test: Evaluation and Adjustment Series.¹⁰

⁹G. K. Bennett, et al., op. cit.
¹⁰John G. Read, Read General Science Test Manual (New York: World Book Co., 1951).

The Numerical Ability Test and the Verbal Reasoning Test were chosen for reasons cited above. The reliability coefficients range from .82 to .89 for very large samples of junior high school students throughout the United States. Both tests exhibit long term consistency to the degree exhibited by the following reliability coefficients. The Numerical Ability coefficients ranged from .74 to .75 whereas the Verbal Reasoning coefficients ranged from .82 to .87 for groups comparable to the ones mentioned above.

One of the principal criticisms of the Differential Aptitude Test battery has been that intercorrelation between tests has not been as low as might be desirable. Ideally one would like to deal with pure instead of overlapping factors in order to account for a greater portion of variability in one's variable of interest.
Carroll illustrates this hybrid nature of the tests when he suggests that the Verbal Reasoning Test probably measures a combination of the verbal ability and deductive reasoning factors.¹¹ The intercorrelation between the Numerical Ability Test and the Verbal Reasoning Test is reported to be .58. Although this is higher than one might like it to be, it is not expected to impair the validity of experimental measurement.

¹¹John B. Carroll, "Differential Aptitude Test Review", Fifth Mental Measurements Yearbook, Oscar K. Buros, editor (New Jersey: The Gryphon Press, 1959), pp. 670 - 673.

The Read General Science Test was chosen for its high physical science orientation. The test is reported to contain 42 percent physics items, 28 percent biology items, 4 percent chemistry items, and 26 percent general science items. The test was standardized on students whose median chronological age was that of ninth graders and the reliability coefficients on small samples reported by the test author ranged from .85 to .88. In these respects it appeared to be the most suitable standardized test available.

Benjamin Bloom notes that the test attempts to measure knowledge of basic facts and principles of science as well as the ability to apply knowledge in problem solving situations.¹² With respect to these objectives it was felt that prior general science knowledge measured by the Read Test would correlate well with the measures obtained by the two unit tests and hence with the retention variable.

¹²Benjamin S. Bloom, "Read General Science Test Review", Fourth Mental Measurements Yearbook, Oscar K. Buros, editor (New Jersey: The Gryphon Press, 1953), p. 628.

3.4  THE SUBJECTS

The subjects to be used in the investigation were members of two of the author's three grade nine classes at Britannia Secondary School in Vancouver, British Columbia.
All of the students were on the academic technical program and most had attended the school during the previous year.

During the summer of 1966, prior to the timetabling of students to specific classes, efforts were made to assign students randomly to the five academic technical classes available. These efforts were unsuccessful as a result of subsequent timetabling difficulties. Although the use of intact groups was considered far from ideal practice, it was nevertheless decided to proceed with intact groups since indirect or statistical control could remove potential sources of bias in the experiment.

The question of which three of the five science classes were to be assigned to the author was essentially determined by timetabling considerations. Although randomization in assigning classes to teachers would have been preferred, it is highly unlikely that any systematic bias was present in this assignment. It was therefore considered unlikely that the author's classes were not representative of the school's grade nine academic technical population.

There was no evidence available on optimum sample size, so an arbitrary limit of two classes appeared to provide a suitable number of experimental subjects. At first the use of several teachers in the experiment was contemplated but this was abandoned for a variety of reasons. Among these were the apparent difficulties of achieving uniformity of teaching quality and uniformity of treatment. It was also felt that the addition of another dimension to the study without theoretical expectation of greater than average gains in design efficiency was an unsound scientific practice.
Accordingly, it was decided to proceed with two groups of thirty-eight and thirty-nine students respectively, both groups being taught by the same teacher.

3.5  THE TRYOUT PROBLEM

Prior to the use of the final unit tests in the study, decisions had to be made concerning the number of items to be used, the difficulty and discrimination levels to be required, and the reliability coefficients to be expected. To obtain information upon which to base these decisions, tryout tests were to be planned. These tests were to identify weak or defective items and to provide the difficulty and discriminating power of individual items. The number of items on each tryout test could be expected to be considerably reduced when all decisions regarding the final tests were complete.

The time limit on the tryout tests was set at fifty minutes each, the same as that set for the final tests. To answer the items on the Knowledge test would require about one minute each. Thus a sample of 50 items was to constitute the Knowledge test and a somewhat smaller sample of 45 items was to constitute the Application test.

In accordance with information found in the literature it was at first thought that several schools should be involved in the experimental tryout of the tests. Conrad suggests that the assessment of the adequacy of a tryout sample solely in terms of the number of students tested is inappropriate since school differences as well as pupil differences ought to be taken into account.¹³

A variety of reasons prevented the use of several schools in the tryout sample. A number of schools were either teaching parts of the grade nine revised science course or none at all. Others had either completed the instruction of the heat unit much earlier or were still in the process of completion. The former were not considered due to the well known effect of student indifference toward tests which are not credited toward final grades.
It was finally decided that a single school was to be used: one in which the grade nine students had just completed the study of the heat unit. It was felt that this procedure would certainly provide a better measure of the performance of the tests under conditions similar to those prescribed by this study's methodology than would procedures involving several schools.

Although the school was not randomly chosen, there were no indications of systematic bias which might lead one to believe that any differences on relevant criteria between tryout and experimental samples were due to anything other than chance fluctuation. Further considerations concerning the tryout sample are given in Chapter 4, section 3.

¹³Herbert S. Conrad, "The Experimental Tryout of Test Materials," Educational Measurement, E. F. Lindquist, editor (Washington: American Council on Education, 1951), p. 253.

3.6  THE EXPERIMENTAL DESIGN

The experimental design chosen for use in this study was a variation of one which has frequently been used in educational research. The latter experimental design involves an experimental group and a control group, both given a pretest and a post-test, but in which the two groups do not have pre-experimental sampling equivalence. The modification of this design concerns a shift in the time of treatment from between the test and retest to before the test, since the experimental variable of interest is retention.

When test-retest loss specific to the experimental or control group is to be explained by the experimental hypothesis, alternative hypotheses must be effectively eliminated.
Such alternative hypotheses may arise from factors threatening the internal validity of a study. Factors listed by Campbell and Stanley, such as the specific events occurring between the first and second measurement in addition to the experimental variable, the processes within the students operating as a function of time, the effects of taking a test upon the scores of a second testing, biases resulting in differential selection of students for the comparison groups, and differential selection due to experimental mortality, were not seen as a threat to the internal validity of this study.¹⁴ However, regression and interaction between specific selection differences distinguishing the two groups and the variables mentioned above could pose a serious threat to internal validity. It was decided to control the potential difficulty of regression by analysis of covariance, and it was felt that interaction would not present difficulty in the present study due to the occurrence of both tests following the treatment.

¹⁴Donald T. Campbell and Julian C. Stanley, "Experimental and Quasi-Experimental Designs for Research on Teaching," Handbook of Research on Teaching, N. L. Gage, editor (Chicago: Rand McNally & Co., 1963), p. 175.

According to Campbell and Stanley the factors which may jeopardise external validity or generalizability are the interaction effect of testing, the interaction effects of selection biases and the experimental variable, and the reactive effects of experimental arrangements.¹⁵ The first of these, in which a pretest might increase or decrease the student's responsiveness to the experimental variable, would not be considered of great importance in this study because the pretest has followed the treatment.
In order for this factor to pose a threat, the likelihood of a student becoming more highly motivated to engage in additional learning by the test rather than by the instruction which preceded it should be reasonably high. Naturally, a basic requirement for the above argument is the lack of awareness on the part of the student that he will write the same test at a later date.

The second factor concerns the possibility that the experimental effects which may validly be demonstrated hold only for the unique population from which the two groups were jointly selected. Thus, although one's results may be internally valid, the difficulty lies in the attempts to generalise the results to the population of interest. There is no convenient way to resolve this problem, although it has been suggested that the requirements of the present design place fewer limitations on sampling than do similar designs, thus partially reducing the potential threat to external validity.¹⁶

The last factor concerns unrepresentativeness brought about by artificiality of experimental setting. A student's knowledge that he is participating in an experiment is expected to have an extraneous effect on his behavior.

In the present study one of the objectives was the removal of any artificiality in the teaching methods through deliberate teacher consideration of and provision for previously observed classroom circumstances as outlined in Chapter IV, section 1, so that a comparison of two methods actually found in practice might be made without compromise toward requirements of experimental design. Further, it was felt that conditioning of the experimental group prior to the commencement of the experiment should minimize the novelty of the experimental setting.

This concludes the account of the preparations essential to procedural implementation and allows the description of experimental methodology.

¹⁵Ibid.
¹⁶Ibid., p. 220.

CHAPTER IV

METHODOLOGY

4.0  INTRODUCTION

This chapter deals with the preparation of the tryout and final tests as well as the administration and results of the pretests, the tryout tests and the final tests. The methodology pertinent to the preparation of the students and the instruction of the unit is also discussed.

4.1  PREPARATION OF THE STUDENTS

It was decided that it was necessary to familiarize the prospective members of the experimental groups with the procedures followed in scientific investigation. The familiarization procedures did not appear to favor subsequent learning by any particular group.

Each of the author's three grade nine classes was divided into small groups at the beginning of the year and about four or five periods were spent in small group discussion. Various aspects of scientific method, deductive reasoning and inference were discussed by the students and reports made. All persons in the three groups were given the same familiarization treatment.

The students were taught a unit on chemistry in which they were frequently called upon to use parts of the information gathered during the group discussions.
Experimentation was performed on a rotation basis with several small groups engaged in discovery under supervision at any one time. The students not called upon to perform an experiment on a particular day solved chemistry problems at their desks.

By late October the three classes were randomly assigned to three teaching methods in a preliminary investigation of the effects of instructional methods on achievement. The three methods were a discovery method, a demonstration method, and a lecture method. The unit taught was one on forces, taken from the revised grade nine science course. The members of the discovery and demonstration groups received daily instruction sheets related to the experimental laboratory work. The primary difference between these two methods at this time was that the teacher demonstrated the experiment for one class and not for the other. Otherwise conditions were similar in that both classes answered the same questions and formulated their own experimental conclusions.

An achievement test of 40 items constructed to measure a variety of objectives ranging from Knowledge to Analysis was administered to all three groups following the completion of the unit.¹ The results were rather inconclusive since the group test means were in close proximity. The demonstration group test mean was slightly larger than either of the remaining group means. No statistical analysis of the means was carried out.

Following the instruction of the unit on forces all classes were again taught by a combination of methods, that is, lecture, discussion, and demonstration methods. The discovery method was not used, however. The unit taught in this way dealt with magnetic and electrical properties of matter.

¹Bloom, op. cit.
4.2  THE INSTRUCTION OF THE UNIT

The class taught by the lecture method during the preliminary investigation was randomly deleted and hence was not used in the experiment on retention. The group formerly taught by the self discovery method was now taught using the same method, whereas the class taught by the demonstration method was now taught using a modified version of the demonstration method.

Part of the instruction of the unit rests upon the completion of a sequence of experiences. Some notion of the presentation used may be obtained by examining the content of the sample lessons reproduced in appendix B.

Any equipment new to the students in the experimental group was demonstrated but its use in the solution of the problem was not discussed. Equipment was then distributed and the students proceeded to perform the experiment. The teacher meanwhile was free to circulate among the groups of students. If difficulties arose, the students were asked questions to bring out ideas which might be useful in the solution or clarification of the difficulty.

Approximately ten to fifteen minutes before the end of each period the students were reminded to begin formulating conclusions if they had not already begun. The last three minutes of the period were reserved for clean-up activities. The students submitted laboratory reports at the end of each period.

The laboratory reports were checked daily and helpful comments written where required. These reports were then returned to the students at the beginning of the following period. It was thought that a daily check would maximize motivational factors and also lead to more efficient use of student time. Reports were also collected from the control group whenever a demonstration was performed and graded in a manner similar to that used for the experimental group.
A number of topics received only lecture treatment in the lecture-demonstration treatment while others received both lecture and demonstration treatment. The lectures were generally ten to fifteen minutes in length and were usually followed by student note making. The points raised in the lecture occasionally served as starting points for class discussion. Other points were illustrated by means of diagrammatic analysis at the blackboard.

Members of the author's thesis committee acted as judges of the teaching procedures used in order to confirm the implementation of the two methods as defined in Chapter I.

4.3  PREPARATION OF THE TRYOUT TESTS

It was evident that no commercial achievement tests on the heat unit were available for the purposes of this study. This necessitated the preparation of tryout tests and, subsequently, the final tests.

First, the Taxonomy was used in making an outline of objectives defining the ways in which the students were to deal with various parts of unit content. Following this a table of specifications was prepared on the basis of the statement of objectives and the outline of unit content. This table was then used in the preparation of the two sets of items. An abbreviated form of this table may be seen in Table III.

In the preparation of the items appropriate evaluation situations were sought so that individual items would reflect the attainment or non-attainment of the relevant educational objective. The wording of the items was carefully scrutinized to prevent the inclusion of unintentional clues. Item distractors were constructed on the basis of the author's experience of student errors.
This was considered more effective in some cases than in others since it was thought that discussions with students and corrected laboratory reports revealed only a fraction of the misconceptions held by students. Frequent consultation of Ebel's Measuring Educational Achievement provided numerous helpful suggestions for writing the multiple choice items.² The Taxonomy and Gerberich's Specimen Objective Test Items were also consulted in order to minimize classification errors.³

The number of items within a content section of each test was roughly proportional to the amount of class time devoted to the instruction of the content section. Thus the two tests were similar in item proportions within content sections.

A total of 97 items was constructed: 52 belonging to the Knowledge category and 45 to the Application category. Several members of the Science Education department then checked the items for content validity, scientific accuracy, and correctness of classification. Two items from each category were deleted for reasons of classification disagreement and unforeseen item weaknesses, leaving 50 Knowledge items and 43 Application items.

The items were then classified with respect to difficulty by the author. A broad classification of three groups ranging from easy to difficult was first established for the items of each test. A finer classification containing five groups was then established and the items within each group were re-examined to determine their likely position on this crude scale.

²Robert L. Ebel, Measuring Educational Achievement (Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1965), pp. 171-200.
³J. R. Gerberich, Specimen Objective Test Items (New York: Longmans, Green & Co., 1957).
Items within each test were subsequently matched on content and difficulty, each member of the matched pair being randomly assigned to an odd or even position on the tryout test. This was done to facilitate the calculation of reliability coefficients. The items of each test were arranged in order of increasing difficulty, a practice which is commonly followed in test construction. These arrangements were examined to determine the presence of possible correct response patterns and to detect the presence of unequal correct response proportions among the four alternatives. Changes in item position were then effected to remedy the imperfections noted.

4.4  ADMINISTRATION OF THE TRYOUT TESTS

a)  The problem of guessing. Upon considering whether to correct the scores on the tryout tests for guessing it was noted that most experimental studies on the subject have shown that the effect of announced correction for guessing has been a very slight improvement in the reliability and validity of the scores.⁴ Thus, corrected scores will rank students in approximately the same relative positions as uncorrected scores. Also, it is not considered bad pedagogy to accept rational guessing since one frequently finds himself in a situation requiring a decision but without sufficient evidence upon which to base that decision. Further, the correction formula assumption that all wrong answers are the result of blind guessing is untenable since numerous exceptions may be noted. Also, on a speeded test one could expect slower students to guess blindly on the items near the end of the test, but since both tryout tests are power tests a correction for guessing would be much less useful than for a speeded test.

⁴Ebel, op. cit., p. 227.
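The correction formula under discussion is not written out in the text. A minimal Python sketch follows, assuming the conventional form S = R - W/(k - 1), where R is the number right, W the number wrong and k the number of alternatives per item (four on these tests). The function name and the sample figures are hypothetical and are not taken from the study.

```python
def corrected_score(right, wrong, alternatives=4):
    """Conventional correction for guessing: S = R - W / (k - 1).

    Omitted items count neither as right nor as wrong.
    The formula assumes every wrong answer came from a blind guess,
    which is the assumption the text calls untenable.
    """
    if alternatives < 2:
        raise ValueError("an item needs at least two alternatives")
    return right - wrong / (alternatives - 1)


# Hypothetical student: 30 right, 9 wrong, 1 omitted on a 40-item test.
print(corrected_score(30, 9))  # 27.0
```

Because the deduction is a fixed fraction of the wrong answers, students who omit few items keep nearly the same rank order under either scoring, which is consistent with the argument that corrected and uncorrected scores rank students similarly.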
All things considered, then, it was decided that no correction for guessing should be made but that a well worded set of instructions should be adopted to encourage making optimum use of partial information but to advise against blind guessing.⁵

The same test directions were used for the tryout tests as for the final tests. These directions are reproduced in appendix A. Test instructions concerning time limits were given to the tryout test administrators verbally. It was later reported by the test administrators that all instructions had been carried out.

b)  The results of the tryout tests. The two tryout tests were written by the members of six grade nine classes from Montgomery Junior Secondary School in Coquitlam. A total of 160 students wrote the tests.

With few exceptions, students attempted all of the items on the Knowledge test. There was no evidence to indicate that the test had not functioned as a power test. However, there was evidence of speededness on the Application test. Eight percent of the students did not complete the last seven items. Admittedly, these had been judged the most difficult items, but more than ninety-five percent of the students had attempted all earlier items.

Also, there was evidence that despite the instructions, considerable guessing took place. In this case the guessing problem is more likely to be associated with the Application test than with the Knowledge test.

⁵Instructions adapted from T. D. M. McKie, "An Investigation of the Relationship between the Relevance Category of Achievement Test Items and their Indices of Discrimination," (unpublished Master's Thesis, The University of British Columbia, 1962), pp. 81-82.

For each test, the papers were scored and the students' performance on each item recorded, including omissions.
The papers were then ranked in order of decreasing score and divided into three groups to facilitate the calculation of the Johnson discrimination index.⁶ This procedure requires the separation of the top 27 percent and the bottom 27 percent of the ranked test papers from the middle 46 percent.

Difficulty indices were computed for each item and are shown in Table I. The difficulty index of an item is the proportion of students in the sample who answered the item correctly.

Discrimination indices were calculated for each item and are shown in Table II. The discrimination index used in this study is the difference in proportion of correct responses between the group scoring in the top 27 percent on the total test and the group scoring in the bottom 27 percent on the same test.

⁶A. Pemberton Johnson, "Notes on a Suggested Index of Item Validity: The U-L Index," Journal of Educational Psychology, 42 (1951), pp. 499-504.

TABLE I
DISTRIBUTION OF DIFFICULTY INDICES COMPUTED FROM TRYOUT TESTS*

Difficulty indices (percents) are grouped in class intervals of five, from 80 - 84 down to 10 - 14.

Number of items per interval, read from the highest interval downward:
Knowledge:    2, 3, 1, 4, 5, 11, 4, 6, 2, 3, 3, 1, 3, 1, 1
Application:  1, 1, 1, 1, 1, 4, 2, 2, 8, 6, 5, 6, 3, 2 (fourteen entries for fifteen intervals; the empty interval cannot be identified from the source text)

                                     Knowledge   Application
N                                       50           43
Mean Difficulty Index (percent)         51.8         41.0

*Corrections have been made for omissions.

TABLE II
DISTRIBUTION OF DISCRIMINATION INDICES COMPUTED FROM TRYOUT TESTS¹

Discrimination indices are grouped in class intervals of .05, from .70 - .74 down to .00 - .04. The number of items per interval is not legible in the source text.

                                     Knowledge   Application
N²                                      48           41
Mean Discrimination Index               .354         .274

¹Corrections have not been made for omissions.
²Two items from each test discriminated negatively and thus were not included in this table.
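The two item statistics defined above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the rounding rule used to size the 27 percent groups, and the ten-student data set are assumptions, not the procedure or data of the study.

```python
def difficulty_index(item_correct):
    """Proportion of students who answered the item correctly (1 = correct, 0 = not)."""
    return sum(item_correct) / len(item_correct)


def upper_lower_index(total_scores, item_correct, fraction=0.27):
    """Upper-lower (U - L) discrimination index for one item.

    Students are ranked by total test score; the index is the proportion
    correct in the top group minus the proportion correct in the bottom group.
    Group size is fraction * n, rounded (an assumption made for this sketch).
    """
    n = len(total_scores)
    k = max(1, round(fraction * n))
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data: ten students' total scores and their result on one item.
totals = [38, 35, 33, 30, 28, 25, 22, 20, 15, 10]
item = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]

print(difficulty_index(item))                     # 0.6
print(round(upper_lower_index(totals, item), 2))  # 0.67
```

A negative value from `upper_lower_index` corresponds to an item that discriminates in the direction opposite to the rest of the test, the condition under which items were excluded from Table II.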
4.5  THE ADMINISTRATION OF THE PRETESTS

The students were informed well beforehand that the three tests (to be used as covariates) were to be written. They were urged to do as well as they could and were assured that test performance did not dictate failure in the course.

Each of the pretests was administered on a separate day during regular class periods. Timetable difficulties prevented the two classes involved in the experiment from writing the tests at the same time, and since the classes were not scheduled into successive blocks it was necessary to maximize test security by administering the test first to one of the groups during their regular science period and then to the second during a period in which they were to receive instruction in another subject. The author was able to administer the tests himself. It was assumed that little or no information passed between classes since the time between classes was a short four minute period and since the students recognized that test performance did not dictate success or failure in the course.

4.6  THE PREPARATION AND ADMINISTRATION OF THE FINAL TEST

a)  The selection of the items. The difficulty index distributions of the two tryout tests were markedly different. Similarity between these two distributions would have facilitated greatly the testing of the third experimental hypothesis. Since this was not the case an alternative procedure, to be described in section 5.5, was used.

It is known that a test will provide a maximum number of discriminations among test candidates if the test items are uncorrelated and if they are all of 50 percent difficulty. Further, if the items are all perfectly correlated, the number of discriminations made by
Clearly, when student rank on some criterion i s  desired, i t i s to one's advantage to design a test in which item d i f f i c u l t y indices cluster as closely as possible around the 50 percent level since item inter-correlations are generally rather low. However, the point has been made that not only the theoretical soundness of this procedure should be considered i n the construction of achievement tests but also i t s psychological soundness. With reference to the latter one need only consider the plight of the duller than average youngster who writes the test, proceeding item by item, only to sense that his chance of failure on each Item i s greater than his chance of success.  It Is not d i f f i c u l t to see how this might lead to more guessing  and lower test r e l i a b i l i t y coefficients. A possible solution was to distribute items about a somewhat higher value than  50. ( 60. was aimed at), and to avoid items of extremely low  difficulty indices. Two Items on the Knowledge test discriminated negatively, that i s to say they were discriminating in a direction opposite to that of the remaining items of the test, and were discarded. Knowledge items with d i f f i c u l t y indices below 34.0 were then discarded except for one item whose d i f f i c u l t y index was 19.0 and whose discrimination Index was sufficiently high to warrant i t s inclusion.  This l e f t a t o t a l of forty  items with a mean d i f f i c u l t y index of 53.9 . Two items on the Application test also had negative discrimination indices and were discarded. Nine items with difficulty Indices below 25.0 were then discarded leaving only one item with a difficulty index  50 below 25.0  . Thirty two items remained at this point but the difficulty  index distribution was rather positively skewed, that i s , the distribution was not symmetric, and had a disproportionate number of d i f f i c u l t Items. 
Thus the choice lay in rejecting even more items with almost certain loss of potential reliability or in writing additional items to provide a more symmetric distribution. Since there was insufficient time to prepare and administer a second tryout test, ten new items were written and were administered to three of the six classes involved in the tryout sample. Six of the original thirty-two items were then replaced with six of the new items, thus providing a reasonably symmetric difficulty index distribution with a mean of 46.2.

The discrepancy between the means of the difficulty index distributions for the two tests indicated that the two sets of test items could not be regarded as samples drawn from populations with the same distribution and could therefore not be used directly to test the hypothesis of differential retention. It was therefore necessary to choose matched items from the two tests on the basis of the performance by the students in the experimental and control groups. This procedure will be described in Chapter V.

b)  The construction of the unit test. It was reported that in each tryout test, items had been matched on the basis of content and expected difficulty level. Each of the final tests was processed in this manner using the items selected from the tryout tests. However, on this occasion matching could be accomplished on the basis of difficulty indices obtained from the tryout sample and this, it was expected, would increase test reliability to a considerable extent. The members of the paired items were again randomly assigned to an odd or even position on the test. Table III indicates the item distribution by content category in the two unit tests.

The items on each test were then arranged in order of increasing difficulty, checked for equal distribution of correct responses, and the alternative choices dispersed in random fashion to prevent correct response patterns.
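The random assignment of matched pairs to odd and even positions, which sets up the two half-tests used later for reliability estimation, can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function name, the item labels and the use of a seeded generator are hypothetical, and the sketch does not reproduce the subsequent reordering by difficulty.

```python
import random


def assign_pairs(pairs, seed=None):
    """Split matched item pairs into two half-tests.

    For each matched pair a coin flip decides which member goes to the
    odd-numbered half and which to the even-numbered half, so each half
    receives exactly one member of every pair.
    """
    rng = random.Random(seed)
    odd_half, even_half = [], []
    for first, second in pairs:
        if rng.random() < 0.5:
            first, second = second, first
        odd_half.append(first)
        even_half.append(second)
    return odd_half, even_half


# Hypothetical item labels, matched on content and difficulty.
pairs = [("K01", "K02"), ("K03", "K04"), ("K05", "K06")]
odd_half, even_half = assign_pairs(pairs, seed=1)
print(odd_half, even_half)
```

Because each half contains one member of every matched pair, the two halves are roughly parallel in content and difficulty, which is what a split-half reliability estimate requires.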
The tests are reproduced in appendix C.

The time limits for both tests were set at fifty minutes. Upon consideration of the tryout test results and the number of questions on each test it was believed that all students would have time to consider all items, thus making the tests true power tests.

c)  The administration of the unit tests. The students in the experimental and control groups were familiar with the procedures of responding to multiple choice tests since they had written several of these tests during the year. The strategy of test writing had been discussed with the students throughout the year so that it was felt that blind guessing could be curbed to some considerable degree.

The Knowledge test was administered to both groups during the same period on the morning of the day following the completion of the heat unit. Similarly, the Application test was administered in the same manner on the school day following the writing of the first test. All students indicated that they were able to consider all items on both tests. It appeared, judging from the omitted items, that the instructions concerning guessing had been observed.

Two students were absent during the writing of the Knowledge test and four during the writing of the Application test.

TABLE III
ITEM DISTRIBUTION BY CONTENT CATEGORY IN THE TWO UNIT TESTS

                                   Number of Items          Percentage of Total
Content Category                 Knowledge  Application    Knowledge  Application
Work and Energy Changes              2          2             5.         6.2
Heat Sources                         3          0             7.5        0
Expansion                            5          7            12.5       22.
Temperature and Thermometers         4          3            10.         9.3
Measurement of Heat                  6          5            15.        15.5
Heat Transfer                        8          7            20.        22.
Molecular Motion                     4          4            10.        12.5
Change of State                      8          4            20.        12.5
TOTAL                               40         32

4.7  THE ADMINISTRATION OF THE RETEST

a)  Administration of the retest.
The students had been informed several times throughout the year that they would be examined on various parts of the course toward the latter part of the school year. This was done in an attempt to curb possible hostility toward surprise retesting, which was to occur approximately six to seven weeks following the first unit test administration. The students were also told that these tests given late in the school year would be counted toward their final grade in science. However, attention was not specifically drawn to the retesting of the heat unit.

Six weeks following the first test administration the students were informed during their morning roll call period that they were to write a science test during the first two class periods of that morning. It had been decided to administer both unit tests at the same time since the variable of interest, namely retention, would in all likelihood be greatly affected by any student foreknowledge of retesting.

The students were then randomly assigned to the writing of either the Knowledge or the Application test during the first period. This was done to counterbalance potential practice or order effects. During the second period the students who had written the Knowledge test wrote the Application test and vice versa. The time limit of fifty minutes for each test was again strictly observed.

b)  The test results. There were five absentees on the day the retests were written. Altogether, data on the Knowledge criterion were obtained for thirty-three students in the experimental group and thirty-eight students in the control group. Data on the Application criterion were obtained for thirty-two students in the experimental group and thirty-seven students in the control group.

It appeared from the number of items left out that the instructions regarding guessing had been observed.
The test scores were processed by the University of British Columbia Computing Centre.

CHAPTER V

THE STATISTICAL ANALYSIS

5.0  INTRODUCTION

In this chapter the unit test scores, the pretest scores, and the retest scores are presented together with test reliabilities. Two tests of significance are made and interpreted in the light of the assumptions of the analysis of variance and covariance model. The selection of items constituting the subtests which are used to test the differential retention hypothesis is described and a test of significance is made.

5.1  THE UNIT TEST RESULTS AND RELIABILITIES

a)  The unit test scores. The distribution of scores for the experimental and control groups on the two unit tests may be seen in Table IV.

It can readily be seen from Table IV that the means and medians for the control group are greater than the corresponding means and medians for the experimental group. It cannot be asserted that these differences are due to chance since the groups were not initially chosen randomly. Tests of the retention hypotheses will need to take account of such initial differences. The statistical "equalization" of groups with respect to the testing of experimental hypotheses will be discussed later in this chapter.

The difficulty and discrimination indices for the two unit tests are shown in Table V and the loss scores for the two groups on each test are shown in Table VI. The latter will be used in testing the retention hypothesis.

TABLE IV
DISTRIBUTION OF UNIT TEST SCORES

Test scores are grouped in class intervals of two, from 36 - 37 down to 8 - 9, with frequencies given separately for each group on each test. The frequency entries are not legible in the source text.

             Knowledge                   Application
          Exp. Group   Cont. Group    Exp. Group   Cont. Group
N            37           38             35           38
M            25.9         27.2           15.8         17.4
Mdn          26.1         28.8           15.6         17.5

Of the variance (s²) and standard deviation (s) rows, only the entries s² = 33.80, 21.22 and s = 5.81, 4.61 are legible.

TABLE V
DISTRIBUTION OF DIFFICULTY AND DISCRIMINATION INDICES FOR THE UNIT TESTS

Difficulty indices are grouped in class intervals from .93 - .97 down to .18 - .22, and discrimination indices from .63 - .67 down to (-).07 - (-).03. The frequency entries are not legible in the source text.

          Difficulty Indices           Discrimination Indices
          Knowledge   Application      Knowledge   Application
N            40          32               40          32
M            .68         .53              .35         .34

TABLE VI
DISTRIBUTION OF LOSS SCORES

Loss scores range from -7 to 8 in unit steps, with frequencies given separately for each group on each test. The frequency entries are not legible in the source text.

             Knowledge                   Application
          Exp. Group   Cont. Group    Exp. Group   Cont. Group
N            33           38             32           37
M            .15         -.05           -1.34         .03
s²           5.40         5.56        (illegible)     5.56
s            2.32         2.36           2.11         2.36

b)  The reliability of the unit tests. The construction of the unit tests has already been described in Chapter IV. The items within each test were matched on the basis of content and difficulty. The reliability of each test was determined by correlating the scores on the two halves and subsequently using the results in the Spearman-Brown formula

$$ r_{11} = \frac{2\, r_{\frac{1}{2}\frac{1}{2}}}{1 + r_{\frac{1}{2}\frac{1}{2}}} $$

The reliability coefficients ($r_{11}$) obtained for the Knowledge and Application tests were .82 and .80 respectively. They were estimated by the Spearman-Brown formula from the half-length coefficients ($r_{\frac{1}{2}\frac{1}{2}}$), which were .69 and .67 respectively.
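The step from half-length to full-length reliability can be checked numerically. The Python sketch below assumes the general Spearman-Brown form r_k = k r / (1 + (k - 1) r), of which the split-half formula is the case k = 2; the function name is hypothetical.

```python
def spearman_brown(r, k=2.0):
    """Projected reliability of a test lengthened k-fold.

    r is the reliability of the original test (for k = 2, the correlation
    between the two half-tests). Assumes the added items behave like the
    existing ones.
    """
    return k * r / (1 + (k - 1) * r)


# Half-length coefficients reported for the Knowledge and Application tests.
print(round(spearman_brown(0.69), 2))  # 0.82
print(round(spearman_brown(0.67), 2))  # 0.8
```

The same function covers other lengthening factors: for a 32-item test with reliability .80 extended to 40 items, `spearman_brown(0.80, 40 / 32)` gives approximately .83.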
The test reliability coefficients were not as high as desired but were accepted as reasonably good in view of the small sample size. As a matter of interest, the Application test reliability coefficient, calculated by the Spearman-Brown formula, becomes .83 if the length of the test is increased to forty items, as in the Knowledge test. Of course, it is assumed that the eight additional items exhibit the same characteristics as do the thirty-two items of the actual test.

c)  The correlation between the tests. The correlation coefficient between the Knowledge and Application tests was .72. This provided a measure of the degree of relationship between the two sets of test scores. It was inferred that in so doing, the correlation coefficient showed the extent to which the two tests measured different aspects of mental functioning. This inference was made rather cautiously since it is so highly dependent upon test validity. This correlation is rather higher than most of those reported in the literature² though not as high as some.³

An assumed true correlation coefficient r = .72 yields the following partitioning of Application test score variance:

(i)  error variance: 20 percent;
(ii)  variance explainable by the variability of Knowledge test scores: 52 percent; and
(iii)  true variance, reliably measured and different from the variability of Knowledge test scores: 28 percent.

The first figure is the complement of the Application test reliability (1 - .80), the second is the square of the correlation (.72 squared is approximately .52), and the third is the remainder. Evidently the two kinds of performances involve some common abilities, but each requires certain abilities that the other does not. It is of course assumed that the test items have all been accurately classified according to the specifications of the Taxonomy.

¹Robert L. Ebel, op. cit., p. 315.
²Edward J. Furst, "Effect of the Organization of Learning Experiences upon the Organization of Learning Outcomes," Journal of Experimental Education, 18 (1954).
³T. R. McConnell, "A Study of the Extent of Measurement of Differential Objectives of Instruction," in An Appraisal of Techniques of Education: Symposium (Washington: American Educational Research Association, National Education Association, February, 1940).

5.2  THE PRETEST RESULTS

The distributions of scores on the three pretests may be seen in Tables VII and VIII.

The control group means were greater than the experimental group means for each of the three pretests. It was therefore evident that the two groups could not be regarded as random samples from a normally distributed population and that indirect or statistical control would need to be exercised in the "equalization" of the groups prior to comparison.

TABLE VII
DISTRIBUTION OF PRETEST SCORES

Test scores are grouped in class intervals of two, from 45 - 46 down to 9 - 10, with frequencies given separately for each group on each test. The frequency entries are not legible in the source text.

          Verbal Reasoning Test         Numerical Ability Test
          Exp. Group   Cont. Group      Exp. Group   Cont. Group
N            33           38               33           38
M            27.3         31.1             28.2         31.7

TABLE VIII
DISTRIBUTION OF PRETEST SCORES

Read General Science Test scores are grouped in class intervals of two, from 69 - 70 down to 17 - 18. The frequency entries are not legible in the source text.

          Exp. Group   Cont. Group
N            33           38
M            42.2         48.4

5.3  THE STATISTICAL TEST OF HYPOTHESIS I

The statistical hypothesis to be tested was the null hypothesis, $H_0 : \mu_e \geq \mu_c$, the alternative hypothesis being $H_1 : \mu_e < \mu_c$, where $\mu$ represents an adjusted mean in the analysis of covariance sense. Since $t_{(\nu)} = \sqrt{F_{(1,\nu)}}$, where 1 and $\nu$ are the respective degrees of freedom of the numerator and denominator in the F ratio obtained in the analysis of variance, a one-tailed t test was to be made. The level of significance for the F ratio was preset at 5 percent.

Loss scores, the difference between initial and final unit test scores, were calculated for each student on the basis of the Knowledge test results. The experimental and control groups' loss scores were plotted separately against their original Knowledge test scores, their Numerical Ability scores, their Verbal Reasoning scores, and their Read General Science Test scores. The scatter plots indicated small positive correlations between Knowledge loss scores and both Numerical Ability scores and original Knowledge scores. Knowledge loss scores and Read General Science scores indicated a small negative correlation. However, Knowledge loss scores were correlated positively with Verbal Reasoning scores for the control group and negatively for the experimental group. This impaired the value of the Verbal Reasoning factor as a covariate and led to its rejection in the testing of the first hypothesis.

The remaining scatter plots were examined for linearity of regression and there was no reason to believe that any of these relationships was nonlinear. Initial knowledge of the unit as measured by the Knowledge test, numerical ability as measured by the Numerical Ability test, and prior general science knowledge as measured by the Read General Science test were then chosen as covariates to be used in the testing of the first hypothesis.

There was no reason to believe that within class regression coefficients differed to a degree greater than that expected by chance.
\ , where 1 and » are the respective (x>) V, > 1 degrees of freedom of the numerator and denominator i n the F r a t i o ±  obtained i n the analysis of variance, a one-tailed t was to be made.  The l e v e l of significance f o r the F  => *f F  test  r a t i o was pre-  set at 5 percent. Loss scores, the difference between i n i t i a l and f i n a l unit test scores, were calculated f o r each student on the basis of the Knowledge test r e s u l t s .  The experimental and control groups' scores were plotted  separately against t h e i r o r i g i n a l Knowledge test scores, t h e i r  Numerical  A b i l i t y scores, t h e i r Verbal Reasoning scores, and t h e i r Read General Science Test scores.  The scatter plot graphs indicated small p o s i t i v e  correlations between Knowledge loss scores and both Numerical scores and o r i g i n a l Knowledge scores.  Ability  Knowledge loss scores and Read  General Science scores indicated a small negative c o r r e l a t i o n .  However,  Knowledge loss scores were correlated p o s i t i v e l y with Verbal Reasoning scores f o r the control group and negatively f o r the experimental group. This impaired the value of the Verbal Reasoning factor as a covariate  ,  6h and led to Its r e j e c t i o n i n the t e s t i n g of the f i r s t  hypothesis.  The remaining scatter plots were examined f o r l i n e a r i t y of regression and there was lationships was  no reason to believe that any of these re-  nonlinear.  I n i t i a l knowledge of the unit as measured  by the Knowledge t e s t , numerical a b i l i t y as measured by the Numerical A b i l i t y t e s t , and p r i o r general science knowledge as measured by the Read General Science t e s t were then chosen as covariates to be used i n the t e s t i n g of the f i r s t There was  hypothesis.  no reason to believe that within class regression  c o e f f i c i e n t s d i f f e r e d to a degree greater than that expected by chance. 
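The screening of covariates described above can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions: the scores below are invented for the example and are not the study's data (which are given in appendix D), and the helper names `pearson_r` and `same_sign_in_both_groups` are hypothetical. The sketch shows the kind of check that led to the rejection of Verbal Reasoning, namely a covariate whose correlation with loss scores is negative in one group and positive in the other.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def same_sign_in_both_groups(r_first, r_second):
    """True when the two within-group correlations share a sign."""
    return r_first * r_second > 0


# Invented illustrative scores (NOT the thesis data)
exp_covariate = [20, 24, 28, 32, 36]
exp_loss = [3, 1, 2, -1, -2]
cont_covariate = [22, 26, 30, 34, 38]
cont_loss = [-2, 0, -1, 2, 3]

r_exp = pearson_r(exp_covariate, exp_loss)
r_cont = pearson_r(cont_covariate, cont_loss)

# A covariate is kept here only if the within-group relationships agree in sign
keep_covariate = same_sign_in_both_groups(r_exp, r_cont)
print(r_exp < 0, r_cont > 0, keep_covariate)
```

A sign check of this kind is only a rough screen. A formal test of the homogeneity of within-class regression coefficients would be the fuller procedure.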
Both skewness and variance disparity were in evidence to a small degree among the experimental and control group Application loss scores but neither was in evidence among the Knowledge loss scores. It was felt that possible violation of the normality and homogeneity of residual variance assumptions would pose no threat to hypothesis test validity since the F test in the analysis of covariance is known to be robust with respect to these two violations.4

4 B. J. Winer, Statistical Principles in Experimental Design (New York: McGraw Hill Book Co., 1962), p. 586.

In the analysis of covariance procedure group means are first adjusted on the basis of performance on the covariates. This adjustment is necessary to "equalize" the groups and eliminate any selection bias which may have been in operation.

The analysis of covariance was performed and produced the following results. The adjusted means for the control group and the experimental group were .04 and .05 respectively.

Source    df    SS (Adj.)    MS (Adj.)    F
Groups     1       .002         .002      0.00
Error     66     704.07        10.67

Clearly, since F is less than 1, the null hypothesis cannot be rejected.

In view of this result, it was concluded that there was no statistical basis for maintaining that there is greater mean retention at the knowledge level among students taught by means of a discovery method than among those taught by means of a lecture demonstration method.

All loss scores and covariate scores for the two groups are shown in appendix D.

5.4  THE STATISTICAL TEST OF HYPOTHESIS II

The statistical hypothesis to be tested was the null hypothesis, $H_0: \mu_e = \mu_c$, the alternative hypothesis being $H_1: \mu_e < \mu_c$, where $\mu$ represents an adjusted mean. A one-tailed $t = \sqrt{F}$ test was to be made and the level of significance for the F ratio was to be set at 5 percent.

Loss scores were calculated for each student on the basis of the Application test results. The experimental and control groups' loss scores were again plotted separately against the original Application test scores, the Numerical Ability test scores, the Verbal Reasoning test scores, and the Read General Science test scores. The results indicated a positive correlation between Application loss scores and original Application scores, a negative correlation between Application loss scores and Numerical Ability scores, and a small negative correlation between Application loss scores and Read General Science test scores. There appeared to be no correlation between Application loss scores and Verbal Reasoning test scores. Therefore, the covariates used in the test of the first hypothesis were also used in the test of the second.

The remaining data were then examined as described in section 5.3 to determine whether or not the analysis of covariance assumptions had been met. These appeared to be satisfied and so the analysis was performed. The adjusted means obtained for the control group and the experimental group were .34 and -1.70 respectively. The results of the analysis of covariance are shown in the following table.

Source    df    SS (Adj.)    MS (Adj.)    F
Group      1      61.58        61.58      6.79 **
Error     64     580.37         9.07

** p < .01 for a one-tailed t test.

Since $t_{(64)} = \sqrt{F_{(1,\,64)}} = 2.61$ and the tabled value of $t_{(64)}$ at the 1 percent level is 2.390, the difference in means is not only significant at the 5 percent level but also well beyond the 1 percent level. Thus the null hypothesis was rejected and the alternative hypothesis, $H_1: \mu_e < \mu_c$, was accepted.
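The F ratios and the corresponding t value can be recomputed from the adjusted sums of squares reported in the two analysis of covariance tables. The sketch below is illustrative Python, not the original computation. The function name `f_ratio` is an assumption, and the critical value 2.390 is the tabled figure quoted in the text rather than one computed here.

```python
from math import sqrt


def f_ratio(ss_groups, df_groups, ss_error, df_error):
    """F ratio formed from adjusted sums of squares and their degrees of freedom."""
    ms_groups = ss_groups / df_groups
    ms_error = ss_error / df_error
    return ms_groups / ms_error


# Hypothesis I (Knowledge loss scores): values from the first table
f_knowledge = f_ratio(0.002, 1, 704.07, 66)

# Hypothesis II (Application loss scores): values from the second table
f_application = f_ratio(61.58, 1, 580.37, 64)

# With one numerator degree of freedom, t is the square root of F.
# The square root carries no sign, so the direction of the difference
# must be read from the adjusted means (-1.70 against .34).
t_application = sqrt(f_application)

TABLED_T_ONE_PERCENT = 2.390  # one-tailed value quoted in the text
print(round(f_knowledge, 2), round(f_application, 2), round(t_application, 2))
print(t_application > TABLED_T_ONE_PERCENT)
```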
It was therefore concluded that there was a statistical basis for maintaining that there is greater mean retention at the application level among students taught by means of a discovery method than among those taught by means of a lecture demonstration method.

The obtained value of t is significant at the 1 percent level so that there is little chance of this result being a type I error, though this is always a possibility which only replication will resolve.

5.5  THE LOSS SCORE RELIABILITIES

The reliability of the difference between two scores can be expressed in a simple formula, which reads

$$r_{\text{diff}} = \frac{\tfrac{1}{2}\,(r_{11} + r_{22}) - r_{12}}{1 - r_{12}}$$

where $r_{11}$ is the reliability of one measure, $r_{22}$ is the reliability of the other measure, and $r_{12}$ is the correlation between the two measures.5

5 Robert L. Thorndike and Elizabeth Hagen, Measurement and Evaluation in Psychology and Education (New York: John Wiley & Sons, Inc., 1961), p. 192.

The reliability coefficient for the Knowledge loss scores obtained by using the Knowledge test reliability coefficient, .82, and the pre-post Knowledge test correlation, also .82, in the above formula is 0.0. The reliability coefficient for the Application loss scores, obtained in the same manner using the Application test reliability coefficient, .80, and the pre-post Application test correlation, .73, is .26.

Much larger loss score reliability coefficients had been hoped for. This reduction of reliability is of course a tremendous challenge in studies involving loss or gain scores. The fact that such clear-cut results were obtained in the case of the Application hypothesis is quite remarkable in view of the severe measurement error, and renders the conclusion of more than ordinary interest.

5.6  THE SUBTESTS USED IN THE TEST OF HYPOTHESIS III

In order to test the hypothesis concerning differential retention it was necessary to provide comparable tests, matched with respect to content and difficulty. The low reliabilities of the two tryout tests prevented their use in the construction of the tests to be used in a test of this hypothesis. Instead, items were chosen from the two final unit tests on the basis of difficulty index and discrimination index comparability. The distribution of the eighteen items thus selected is shown in Table IX. Most of the matched item difficulty indices were within .07 of each other, with the greatest difference being .13. The mean difficulty indices of the Knowledge and Application subtests were .63 and .62 respectively; the mean discrimination indices, .39 and .34 respectively. The difficulty index distributions, shown in Table X, had the same range, approximately the same form, reasonable symmetry, and approximately equal variability.

The reliability coefficients for the Knowledge and Application subtests, obtained by the procedure mentioned in section 5.1, were .67 and .57 respectively. These were low in comparison to normal test reliability figures, but were considered reasonable in view of the small number of items used.

TABLE IX
DISTRIBUTION OF ITEMS BY CONTENT CATEGORY IN THE SUBTESTS

Content Category  Number of Items Knowledge Subtest  1 2 3  Application  3 1 3 1 1 2 2 5  2 1 2 3 2  4  5 6 7 8  4  Subtest

MATCHED SUBTEST ITEMS

Item No. on Knowledge Test:    3 5 8 9 11 16 18 19 22 24 25 28 29 31 33 34 36 37
Item No. on Application Test:  19 1 14 3 4 10 26 27 20 12 16 5 9 15 28 29 8 11

TABLE X
DISTRIBUTION OF DIFFICULTY INDICES IN THE SUBTESTS

Difficulty Indices .88 .83 .78 .73 .68 .63 .58 .53 .48 .43 .38 •33  -  .92 .87 .82 .77 .72 .67 .62 -57 .52 .47 .42 .37  Frequency Knowledge Subtest  Application Subtest 1 1  1 1 1 3 1 1 4 1 2 1  2 2 4 1 3  2  3 1

N =  18  18
M =  .63  .62
S² =  .033  .025
S =  .18  .16

5.7  THE STATISTICAL TEST OF HYPOTHESIS III

The statistical hypothesis to be tested was the null hypothesis, $H_0: \mu_e = \mu_c$, the alternative hypothesis being $H_1: \mu_e < \mu_c$, where $\mu$ is an unadjusted mean. This required a one-tailed test. The level of significance had again been set at 5 percent.

Loss scores were calculated on both the Knowledge subtest and the Application subtest for each student. Next, mean loss scores were calculated for both experimental and control groups on each of the two subtests. Each of the four mean loss scores was divided by its own standard error of difference and the four terms were combined to form experimental and control group mean differential loss scores. Finally, each of the two mean differential loss scores was divided by its standard error of difference. A test was then to be made to determine whether the difference in the respective scaled mean differential loss scores of the experimental and control groups was significant.

The division of each difference by its standard error accomplished the transformation of all loss scores to a similar scale, that is, all differences were scaled to their respective standard deviations.
This scaling procedure allowed the formation of the differences between Application and Knowledge mean loss scores to be used in testing the differential retention hypothesis.

The derivation and interpretation of the above terms and the test statistic are given in appendix E.6

6 Derivation based on personal communication with Dr. T. D. M. McKie, The University of British Columbia.

No adjustment to "equalize" groups was possible using this statistical procedure. Lack of adjustment was not considered a serious problem in this case since initial bias appeared to favor the control group and hence would not prejudice possible significant results in favor of the experimental group.

The test statistic, Z, was calculated and its value found to be -.768, the negative sign referring to a net gain instead of loss. For the one-tailed test, the probability of obtaining a value of Z less than -.768 was determined from the table of the normal probability integral to be .2212.

The null hypothesis could not be rejected and was therefore retained. It was concluded that there was no statistical basis for maintaining that there was greater mean differential retention among students taught by means of a discovery method than among those taught by means of a lecture demonstration method.

CHAPTER VI
CONCLUSIONS AND SUMMARY

6.0  INTRODUCTION

This study was designed to examine the effects of discovery learning on retention. This was achieved by comparing a discovery method of instruction with a demonstration method on the basis of retention of objectives defined by the Taxonomy. There were, however, several limitations of the study which will be discussed prior to the statement of conclusions.
6.1  LIMITATIONS OF THE STUDY

First, there are inferential difficulties in that only a cautious generalization of the experimental results to the population of interest may be made. If one attempts to generalize to the population of British Columbia high school science students one must consider the limitations resulting from the use of a single content unit, a single teacher, and a small sample. However, insofar as the students in the sample are representative of the population (and there is no evidence of nonrepresentativeness), the results may be generalized to this population. A stronger generalization would require a larger sample and more representative sampling.

Secondly, the lack of randomization necessitated the use of covariates in equating the experimental and control groups. Since the correlations between the covariates and the dependent variable were rather small there was some question as to whether all of the preexperimental bias in favor of the control group was removed. Although this effect operated to strengthen the results obtained, it should be noted that in the event of replication of this study the existence of possible covariates more highly correlated with loss scores than the present covariates ought to be investigated.

As implied in the previous paragraph, the low reliability of the loss scores was undoubtedly responsible for the concealment of a sizeable portion of the experimental effect. Improvement of loss score reliability might have been achieved through increasing the number of test items. Decreasing the correlation between test and retest would also help, but unfortunately it is not within the control of the experimenter.

6.2  CONCLUSIONS

First, there is no statistical basis for asserting that discovery learning results in greater retention at the Knowledge level of mental functioning than does learning acquired by means of a lecture demonstration method. In view of the loss score reliability of 0.0 on the Knowledge criterion these results are to be expected.

Secondly, there is a statistical basis for asserting that discovery learning results in greater retention at the Application level than does learning acquired by means of a lecture demonstration method. The statistical test shows the difference between the experimental and control groups to be significant beyond the 1 percent level although the level set for null hypothesis rejection was only 5 percent. An effect of this magnitude is all the more remarkable when considered in the light of the low reliability obtained for the Application loss scores.

Behaviors involving Application are probably more frequently encountered during discovery learning than during learning obtained by means of a lecture demonstration method. This would appear to give a reasonable explanation of the results obtained, namely, the absence of a significant difference on the Knowledge retention criterion and the presence of a significant difference on the Application retention criterion.

Finally, there is no statistical basis for the belief that discovery learning results in greater differential retention than does learning acquired by means of a lecture demonstration method. The results obtained are in the right direction, however, and this effect would have been enhanced had a correction been possible to remove the initial biases of the groups. Conceivably the effect might then have been too large to be attributable to sampling fluctuation.
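The loss score reliabilities referred to in these conclusions follow from the difference-score formula of section 5.5, and the arithmetic can be retraced as below. This is a minimal sketch in Python rather than the original computation. The function name `difference_reliability` is an assumption, and the final line uses a purely hypothetical pre-post correlation of .50 to show why a lower test-retest correlation would raise the reliability of loss scores.

```python
def difference_reliability(r11, r22, r12):
    """Reliability of a difference score from the reliabilities of the two
    measures (r11, r22) and the correlation between them (r12)."""
    return (0.5 * (r11 + r22) - r12) / (1.0 - r12)


# Knowledge: test reliability .82, pre-post correlation .82
knowledge_loss = difference_reliability(0.82, 0.82, 0.82)

# Application: test reliability .80, pre-post correlation .73
application_loss = difference_reliability(0.80, 0.80, 0.73)

# Hypothetical case: same test reliability, pre-post correlation of .50
hypothetical_loss = difference_reliability(0.80, 0.80, 0.50)

print(round(knowledge_loss, 2), round(application_loss, 2),
      round(hypothetical_loss, 2))
```

When the pre-post correlation equals the test reliability, as for the Knowledge test, the numerator vanishes and the loss scores carry no reliable variance.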
6.3  IMPLICATIONS OF THE STUDY

Despite the lack of loss score reliability, clear-cut results were obtained for the Application hypothesis. This finding supports the earlier Tyler studies which claimed that Application behaviors were more readily retained than Knowledge behaviors and provides justification for the current movement in science education concerning course revision. Clearly, if discovery learning contributes to the development of greater retention among Application objectives than among Knowledge objectives, then a greater emphasis on Application objectives is warranted. Further, if Application behaviors are more likely to be encountered than Knowledge behaviors during discovery learning, then science courses designed to be taught using discovery methods hold a decided advantage over the old science courses.

6.4  POSSIBILITIES FOR FURTHER RESEARCH

There are several possibilities for further research. The first concerns a replication of the present study to investigate more thoroughly differential retention between Knowledge and Application objectives. A greater number of items (perhaps about 60) matched with respect to content and difficulty should be constructed and reliability coefficients in excess of .90 should be aimed at for both tests. This problem, because of its magnitude, might well be undertaken by an institutional research group rather than by a single individual.

Further investigation of the scaling problems involved in making such comparisons is needed, and perhaps also of the associated covariance adjustment problem.

Investigation of differential retention between other objectives may also be carried out.
Finally, a great deal of research might profitably be directed toward the systematic design of instructional methods based on learning principles.

6.5  SUMMARY OF THE INVESTIGATION

It was hypothesized on a priori grounds that:

1. greater mean retention would occur at the Knowledge and Application levels among students taught by a discovery method than among students taught by a lecture demonstration method; and

2. greater mean retention would occur at the Application level than at the Knowledge level among students taught by a discovery method than among students taught by a lecture demonstration method.

Each of two ninth grade science classes in a single school was taught a heat unit using one of the methods mentioned above. The instructional methods were assigned randomly to intact classes both handled by the same teacher.

Two multiple choice achievement tests covering the content of the heat unit were constructed, one consisting of items in the Knowledge category and the other consisting of items in the Application category. A tryout of each of these tests was conducted upon 160 students in a single school, thus allowing the elimination of unsuitable items. The resulting unit tests, with forty Knowledge and thirty-two Application items respectively, were administered to the students of the experimental and control groups both immediately following and six weeks following the conclusion of the heat unit.

The reliability coefficients of the Knowledge and Application tests, estimated by correlating the half test scores and applying the Spearman-Brown formula, were .82 and .80 respectively.

Possible concomitant variables such as prior general science knowledge, verbal reasoning, and numerical ability were measured by appropriate pretests.
Covariates were chosen on the basis of pretest score correlation with loss scores. The analysis of covariance was performed and the results provided a statistical basis for the acceptance of the experimental hypothesis dealing with the retention of application objectives and for the rejection of the experimental hypothesis dealing with the retention of knowledge objectives.

Items were matched on the basis of content and difficulty index to form Knowledge and Application subtests. A normal Z statistic was used to test the differential retention hypothesis and the results indicated that there was no statistical basis for the acceptance of this experimental hypothesis.

BIBLIOGRAPHY

A.  BOOKS

Bloom, Benjamin S., et al. Taxonomy of Educational Objectives. Handbook I: Cognitive Domain. New York: Longmans, Green and Co., 1956.

Bloom, Benjamin S. "Read General Science Test Review," Fourth Mental Measurements Yearbook, Oscar K. Buros, editor. New Jersey: The Gryphon Press, 1953.

Bruner, Jerome S. The Process of Education. New York: Vintage Books, 1960.

Campbell, Donald T., and Julian C. Stanley. "Experimental and Quasi-Experimental Designs for Research on Teaching," Handbook of Research on Teaching, N. L. Gage, editor. Chicago: Rand McNally and Co., 1963.

Carroll, John B. "Differential Aptitude Test Review," Fifth Mental Measurements Yearbook, Oscar K. Buros, editor. New Jersey: The Gryphon Press, 1959.

Conrad, Herbert S. "The Experimental Tryout of Test Materials," Educational Measurement, E. F. Lindquist, editor. Washington: Amer. Council on Educ., 1951.

Ebel, Robert L. Measuring Educational Achievement. New Jersey: Prentice-Hall, Inc., 1965.

Edwards, Allan L. Experimental Design in Psychological Research. New York: Holt, Rinehart and Winston, 1965.

Furst, Edward J. Constructing Evaluation Instruments. New York: Longmans, Green and Co., 1958.

Gerberich, J. R. Specimen Objective Test Items. New York: Longmans, Green and Co., 1957.

Medley, Donald M., and Harold E. Mitzel. "Measuring Classroom Behavior by Systematic Observation," Handbook of Research on Teaching, N. L. Gage, editor. Chicago: Rand McNally and Co., 1963.

Sears, Pauline S., and Ernest R. Hilgard. "The Teacher's Role in the Motivation of the Learner," Theories of Learning and Instruction, N. S. S. E. Yearbook, Part I, E. R. Hilgard, editor. Chicago: The University of Chicago Press, 1964.

Thorndike, Robert L., and Elizabeth Hagen. Measurement and Evaluation in Psychology and Education. New York: John Wiley and Sons, 1961.

Tyler, Ralph W. "The Relation between Recall and Higher Mental Processes," Education as Cultivation of the Higher Mental Processes, C. H. Judd, editor. New York: The MacMillan Co., 1936.

Wallen, Norman E., and Robert M. W. Travers. "Analysis and Investigation of Teaching Methods," Handbook of Research on Teaching, N. L. Gage, editor. Chicago: Rand McNally and Co., 1963.

Watson, Fletcher G. "Research on Science Teaching," Handbook of Research on Teaching, N. L. Gage, editor. Chicago: Rand McNally and Co., 1963.

Winer, B. J. Statistical Principles in Experimental Design. New York: McGraw Hill Book Co., 1962.

B.  PUBLICATIONS OF GOVERNMENT AND OTHER ORGANIZATIONS

Bennett, G. K., et al. Differential Aptitude Tests Manual. New York: The Psychological Corp., 1963.

British Columbia Department of Education. Science Curriculum Bulletin. Victoria, B. C.: The Queen's Printer, 1963.

British Columbia Department of Education. Science Curriculum Bulletin. Victoria, B. C.: The Queen's Printer, 1965.

British Columbia Department of Education. Junior Secondary School Science. Victoria, B. C.: The Queen's Printer, 1966.

McConnell, T. R. "A Study of the Extent of Measurement of Differential Objectives of Instruction," in An Appraisal of Techniques of Education: Symposium. Washington: Amer. Educ. Research Assoc., N. E. A., 1940.

Read, John G. Read General Science Test Manual. New York: World Book Co., 1951.

U. S. Department of Health, Education, and Welfare, Office of Education, New Dimensions in Education (Effectiveness in Teaching No. 2). Washington: 1935.

C.  PERIODICALS

Cunningham, H. A. "Lecture method versus individual laboratory method in science teaching - A Summary," Science Education, 30 (1946), 70-82.

Furst, Edward J. "Effect of the Organization of Learning Experiences Upon the Organization of Learning Outcomes," Journal of Experimental Education, 18 (1950), 215.

Johnson, A. Pemberton. "Notes on a Suggested Index of Item Validity: The U-L Index," Journal of Educational Psychology, 42 (1951), 499-504.

McDougall, William P. "Differential Retention of Course Outcomes in Educational Psychology," Journal of Educational Psychology, 49, No. 2 (April, 1958), 53.

O'Kelley, L. I., and A. W. Heyer. "Studies in Motivation and Retention," The Journal of Comparative and Physiological Psychology, 41 (1948), 466.

O'Kelley, L. I., and A. W. Heyer. "Studies in Motivation and Retention: II. Retention of Nonsense Syllables Learned under Different Degrees of Motivation," The Journal of Psychology, 27 (1949), 143.

Smeltz, John R. "Retention of Learnings in High School Chemistry," The Science Teacher, 23 (October, 1956), 285.

Suchman, Richard J. "Inquiry Training: Building Skills for Autonomous Discovery," Merrill-Palmer Quarterly, 7 (July, 1961), 147.

Tyler, Ralph W. "Permanence of Learning," Journal of Higher Education, IV (April, 1933), 203-204.

D.  UNPUBLISHED MATERIALS

Boldt, Walter B. "Grade Placement of an Experimental Unit in Secondary School Physics." Unpublished Master's thesis, The University of British Columbia, Vancouver, 1963.

McKie, T. D. M. "An Investigation of the Relationship between the Relevance Category of Achievement Test Items and their Indices of Discrimination." Unpublished Master's thesis, The University of British Columbia, Vancouver, 1962.
APPENDIX A

DIRECTIONS TO STUDENTS

1. Many of the items on this test will appear different from any that you have seen before. Most of them require a little bit of thinking as well as just remembering something that you have learned. However, ALL of them CAN be done by any student who has learned the work done to date WELL. So however strange or hard a question may appear at first sight, remember: YOU KNOW ENOUGH TO ANSWER IT; all it needs is your knowledge plus a little thought!

2. You may answer questions even when you are not completely sure that your answers are correct. In such cases, intelligent consideration of the choices provided may help you to gain marks. HOWEVER, you should AVOID WILD GUESSING as this may result in a reduction in your score.

3. Give each question careful thought, but work as quickly as you can. If you find a question too difficult, do not linger over-long on it. Pass on to the next ones, and return later to any that you have missed.

4. For each question there are four possible answers. You are to decide which is the best one. Mark only ONE of the four choices, as shown in the sample item below.

   ITEM: The chemical symbol for mercury is:

   a) M     b) Mg     c) Hg     d) Ag

   The correct answer is c), so make a heavy mark in the space marked c). Erase completely any answer you may wish to change, and be careful not to make any stray marks of any kind on your answer sheet or on your test booklet.

   a)     b)     c)     d)

5. DO NOT OPEN THE TEST BOOKLET UNTIL YOU ARE TOLD TO START! The working time for this test is 50 minutes.

6. As soon as you receive your answer sheet, complete the required information on the top (i.e. NAME, SCHOOL, etc.).

7. On tests of this kind, it is not expected that any student will get all of the items correct. In fact, it is usual for most average students to score around 50%. The pass-mark will undoubtedly be much less than 50%, so if you have to leave quite a few questions unanswered, do not worry, and DO NOT GUESS.

APPENDIX B

SAMPLE LESSON FOR THE DISCOVERY METHOD

A brief review of the previous lesson is conducted.

The aim of this lesson is to study the effect of external pressure on the boiling point of water.

The teacher asks questions designed to probe student understanding of the definition of the terms and the purpose of the lesson. For example, "What is meant by the terms 'external pressure' and 'boiling point'?" and "Describe in your own words the problem to be investigated."

The teacher asks questions concerning the formulation of possible hypotheses and the testing of these hypotheses. For example, "If the external pressure is increased, would you expect the boiling point to increase, decrease, or stay the same?" and "How could the hypothesis be tested?"

Further student questioning elicits more specific information concerning hypothesis testing. For example, "If you expect the boiling point to increase when the external pressure increases, how are you going to arrange your equipment and what procedures will you carry out to check your hypothesis?"

Instead of following up this questioning by explanation at this point the teacher allows the students to decide for themselves which hypothesis to test and discuss among themselves the procedures to be followed and the results anticipated.

The teacher checks that the students can handle the apparatus and then sets them to work while he circulates among the groups to provide assistance where necessary.

The students perform the experiment and record their observations. Students may have questions such as the following: "If the water temperature is lower than 100° C. why does the water boil?" The teacher responds by asking a series of questions designed to lead the student to an understanding of and solution to his problem. For example, "What substance occupies the space above the water?", "How is this substance affected when it loses heat?", and "Is it as difficult for water molecules to move into the space above the liquid water after cooling the substance as it is before cooling it?"

The students receive some degree of assistance in their investigation from the guide questions on the experiment instruction sheets.

After testing one hypothesis a student may formulate and test alternative hypotheses. For example, instead of investigating the effects of decreasing the pressure above a liquid he may investigate the effects of increasing the pressure above the liquid.

The students formulate their conclusions, submit their reports, and return their laboratory equipment.

SAMPLE LESSON FOR THE LECTURE DEMONSTRATION METHOD

A brief review of the previous lesson is conducted.

The aim of this lesson is to study the effect of external pressure on the boiling point of water.

The teacher outlines by a lecture and blackboard presentation the theory involved in the lesson. For example, the students are asked to consider the effects of increasing and decreasing the pressure above a liquid. Molecular behavior is discussed with respect to both situations but the heating or cooling of the gas above the liquid is not mentioned. The students record the blackboard presentation in their notebooks.

The teacher asks questions concerning the definitions of terms and the purpose of the lesson as in the "discovery method". He also asks questions concerning the formulation of possible hypotheses and the testing of these hypotheses.
The students discuss with the teacher which of the hypotheses relate directly to the problem to be investigated and what information is to be gathered to test these hypotheses. The teacher directs the discussion toward the hypothesis which might most suitably be tested under existing conditions. In this case, for example, investigating the effects of decreasing the pressure above a liquid would be most suitable for purposes of demonstration.

Following the discussion, the teacher outlines the object, the hypothesis, and the plan of the investigation for the students on the blackboard. The students record this outline as part of their laboratory report.

The teacher performs the experiment and the students record their own observations. Wherever possible an experimental report format has been provided for the students. This involves the systematic tabulation of data.

The students are required to give written answers to a series of questions as part of the laboratory report. These questions pertain to noteworthy points in the demonstration. For example, "When the flask is cooled is the pressure of the water vapor above the liquid equal to atmospheric pressure?"

The students formulate their own conclusions and submit their report.

APPENDIX C

KNOWLEDGE TEST ON HEAT UNIT

1.  The specific heat of an object is the amount of heat one gram of the object
    a) absorbs without changing its temperature.
    b) radiates before its temperature can be raised by one degree C.
    c) requires to raise its temperature by one degree C.
    d) contains after its temperature is raised by one degree C.

2.  When a liquid is heated, the molecules
    a) move closer together.
    b) move into a definite pattern.
    c) become heat energy.
    d) move faster.

3.  The temperature at which water changes from liquid to solid (when the barometric pressure is 76 cm. of mercury) is
    a) 212° F.
    b) 100° F.
    c) 32° F.
    d) 0° F.

4.  When a warm piece of metal is placed in cold water, the heat lost by the metal is equal to
    a) the heat lost by the water.
    b) the heat absorbed by the water.
    c) the heat absorbed by the metal.
    d) the work done in changing the temperature of the water.

5.  When matter absorbs radiation, the radiation is converted to
    a) heat.
    b) chemical energy.
    c) work.
    d) temperature.

6.  Water and alcohol tend to
    a) cool at the same rate.
    b) expand at different rates.
    c) contract at the same rate.
    d) cool at the same rate and expand at the same rate.

7.  Which of the following conduct heat?
    a) solids and gases only.
    b) liquids and gases only.
    c) solids and liquids only.
    d) gases, solids, and liquids.

8.  Which of the following substances will have its temperature raised by one Centigrade degree when one calorie of heat is added to one gram of the substance?
    a) water
    b) turpentine
    c) glycerol
    d) alcohol

9.  Which of the following statements is correct?
    a) Water does not conduct heat.
    b) Water is a poor conductor of heat.
    c) Water is a moderate conductor of heat.
    d) Water is a good conductor of heat.

10. Which of the following statements is correct?
    a) Mercury does not conduct heat.
    b) Mercury and water conduct heat equally well.
    c) Mercury is a poorer heat conductor than water.
    d) Mercury is a better heat conductor than water.

11. A copper gauze, positioned as shown, causes a Bunsen flame to disappear because
    a) the flame cannot get through the gauze.
    b) the heat is conducted away too rapidly by the copper gauze.
    c) the natural gas has already been used up.
    d) the heat is radiated away too rapidly by the flame.

12. Which of the following are sources of heat?
    1. Sunlight
    2. Burning fuels
    3. Electric sparks
    a) 1 only
    b) 2 only
    c) 1 and 2 only
    d) 1, 2, and 3

13. The number of degrees between the freezing point and the boiling point of water on a Fahrenheit thermometer is
    a) 0
    b) 100
    c) 180
    d) 212

14. Which of the following is an example of a good conductor of heat?
    a) zinc
    b) water
    c) china
    d) pyrex

15. A flask containing water is fitted with a stopper and glass tubing as shown. When the flask is heated, the water level in the tube drops slightly because
    a) water contracts when heated.
    b) water expands when heated.
    c) the flask expands faster than the water.
    d) the flask contracts slower than the water.

16. When heat is added to a cubic foot of air, its volume
    a) remains the same and its temperature increases.
    b) increases and its temperature increases.
    c) decreases and its temperature increases.
    d) increases and its temperature remains the same.

17. When water freezes it
    a) gives out heat.
    b) takes in heat.
    c) neither gives out nor takes in heat.
    d) may either give out or take in heat.

18. When the temperature is above the freezing point of water, the numerical reading on the Fahrenheit thermometer
    a) is proportional to the reading on the Celsius thermometer.
    b) is greater than the reading on the Celsius thermometer.
    c) is less than the reading on the Celsius thermometer.
    d) may be greater than or less than the reading on the Celsius thermometer.

19. A stone falling from a height of six feet above the ground possesses a maximum of kinetic energy
    a) at a height of six feet above the ground.
    b) at a height of three feet above the ground.
    c) just before it strikes the ground.
    d) just after it strikes the ground.

20. The amount of heat required to raise the temperature of 1 gram of water from 99° C. to 100° C.
    a) is greater than the amount of heat required to raise the temperature of 1 gram of water from 0° C. to 1° C.
    b) is less than the amount of heat required to raise the temperature of 1 gram of water from 0° C. to 1° C.
    c) is the same as the amount of heat required to raise the temperature of 1 gram of water from 0° C. to 1° C.
    d) is the same as the amount of heat required to raise the temperature of 1 gram of ice from 0° C. to 1° C.

21. Bodies which are warmer than their surroundings are called
    a) heat sources
    b) heat indicators
    c) radiators
    d) conductors

22. For which of the following method(s) of heat transfer is no material medium required?
    a) conduction
    b) convection
    c) radiation
    d) conduction and radiation

23. Two aluminum blocks, one twice as heavy as the other, are both at the same temperature. The ratio of the amount of heat contained by the heavier block to the amount of heat contained by the lighter block is
    a) 1:2
    b) 1:1
    c) 2:1
    d) 4:1

24. Water cascading over a dam illustrates the following changes in forms of energy
    a) Mechanical to kinetic to electrical
    b) Potential to kinetic to heat
    c) Kinetic to mechanical to heat
    d) Kinetic to heat to potential

25. The operation of a compound bar thermostat is based upon the fact that
    a) Metals expand when heated.
    b) Different metals have different heat capacities.
    c) Metals bend more easily when heated.
    d) Different metals expand at different rates.

26. A thick walled glass jar and a thin walled pyrex beaker are heated by a Bunsen burner. Which of the following statements is most correct?
    a) The jar breaks because it does not expand uniformly.
    b) The beaker does not break because it does not expand uniformly.
    c) The jar breaks because it expands uniformly.
    d) The jar breaks because its walls are thicker.

27. When an impure substance dissolves in a liquid
    a) the boiling temperature increases.
    b) the freezing temperature increases.
    c) the boiling temperature decreases.
    d) the freezing temperature is not changed.

28. Decreasing the vapor pressure above water decreases its
    a) boiling temperature.
    b) water pressure.
    c) boiling pressure.
    d) water temperature.

29. The amount of heat an object possesses depends on the object's
    a) temperature only.
    b) mass and weight.
    c) weight and temperature.
    d) temperature and mass.

30. The specific heat of water is greater than
    a) alcohol.
    b) turpentine.
    c) both alcohol and turpentine.
    d) neither alcohol nor turpentine.

31. Heat applied to water at 100° C. tends to
    a) raise the temperature of the water.
    b) prevent heat loss from the boiling water.
    c) raise the temperature of the steam.
    d) change the state of the water.

32. Which of the following statements is correct?
    a) Water boils at 90° C.
    b) Water boils at 100° C.
    c) Water boils at 110° C.
    d) Water may boil at 90° C., or 100° C., or 110° C.

33. When water changes from a liquid to a gas
    a) heat is given out.
    b) heat is taken in.
    c) heat is neither given out nor taken in.
    d) heat is either given out or taken in.

34. The boiling point of water is not affected by
    a) adding a salt which does not dissolve in water.
    b) increasing the vapor pressure above the water.
    c) adding table salt to the water.
    d) decreasing the vapor pressure above the water.

35. Winds transfer heat by
    a) conduction.
    b) convection.
    c) radiation.
    d) convection and conduction.

36. The energy of molecular motion appears as
    a) friction.
    b) heat.
    c) temperature.
    d) potential energy.

37. Heat energy is radiated through space by
    a) molecular collisions.
    b) molecular group transfer.
    c) molecular motion.
    d) a process which does not depend on molecular motion.

38. An outdoor thermometer uses mercury for which of the following reasons?
    1. Mercury has a high boiling point.
    2. Mercury's freezing point is below that of water.
    3. Mercury expands as the temperature increases.
    4. Mercury does not stick to glass.
    a) 2 and 3 only
    b) 2, 3, and 4 only
    c) 1, 2, and 3 only
    d) 1, 2, 3, and 4

39. Which of the following statements is correct? Heat is usually produced when
    a) work is done.
    b) potential energy is entirely converted to kinetic energy.
    c) kinetic energy is entirely converted to potential energy.
    d) potential energy is lost.

40. Which of the following statements is correct?
    a) Shiny surfaces absorb energy better than black surfaces.
    b) Black surfaces reflect energy better than shiny surfaces.
    c) Black surfaces radiate and absorb energy better than shiny surfaces.
    d) Shiny surfaces radiate energy better than black surfaces.

APPLICATION TEST ON HEAT UNIT

1.  Wet clothing on a washline may require a long time to dry because
    a) the surrounding air molecules are not in motion.
    b) the specific heat of air is too low.
    c) the air pressure is too high.
    d) the amount of water vapor in the air is too high.

2.  Cork is a poor conductor of heat because
    a) it contains many air spaces.
    b) its molecules are separated by large distances.
    c) cork molecules do not absorb heat readily.
    d) it is a good reflector of heat.

3.  A wire is sealed into an evacuated bulb made of pyrex glass. The bulb, which is to be dipped into liquid air at a very low temperature (-200° C.), is also to be used at room temperature (+20° C.). The most important factor to consider when designing this apparatus is
    a) the rate of thermal expansion of the wire and the glass.
    b) the thickness of the glass.
    c) the shape of the bulb.
    d) the thickness of the wire.

4.  Water enters the coil
    a) at the bottom and rises as it cools.
    b) at the bottom and rises as it warms up.
    c) at the top and sinks as it cools.
    d) at the top and sinks as it warms up.

5.  A cork, whose diameter is slightly smaller than that of the tube, is allowed to slide freely in a tube partially filled with water. If the water begins to boil the cork will be forced up by
    a) air pressure.
    b) water pressure.
    c) atmospheric pressure.
    d) vapor pressure.

6.  A small bubble of mercury is introduced into a piece of glass tubing, one end of which is sealed. After the sealed end is placed in boiling water the bubble is at the position shown. After the sealed end is placed in a mixture of ice and water the bubble will be at
    a) position 1.
    b) position 2.
    c) position 3.
    d) position 4.

7.  A flask filled with air at room temperature is inverted in a beaker of water as shown. When the flask is heated the level of the water in the beaker will
    a) rise.
    b) fall.
    c) neither rise nor fall.
    d) first rise, then fall.

8.  A short distance from the spout of a steaming kettle the air contains a mist of water droplets, but the air between this mist and the spout is clear. The reason for this is
    a) that water vapor is always invisible.
    b) that all of the water molecules in this region are in the gaseous state.
    c) that there is no form of water present in this region.
    d) that the droplets in this region are travelling too fast to be seen.

9.  Under normal conditions liquid water boils at 100° C. and liquid nitrogen boils at -196° C. When equal amounts of water (at 99° C.) and liquid nitrogen (at -197° C.) are poured together
    a) both the liquid nitrogen and water freeze.
    b) the liquid nitrogen freezes.
    c) the water freezes.
    d) neither the liquid nitrogen nor the water freeze.

10. A certain metal ring cannot be slipped onto a glass rod because the diameter of the glass rod is a little too large. It is possible to put the ring on the glass rod by
    a) heating the rod.
    b) cooling the ring.
    c) cooling the rod.
    d) cooling both the ring and the rod.

11. Electricity passing through the coil of an electric kettle causes water in the kettle to boil because electrical energy is converted to
    a) chemical energy.
    b) kinetic energy.
    c) work.
    d) potential energy.

12. A wooden block is sent sliding across a table top. The kinetic energy of the block is converted to
    a) heat energy.
    b) friction.
    c) potential energy.
    d) friction and potential energy.

13. Two rods, one made of aluminum and the other of copper, are initially attached to a plate held at a constant temperature of 80° C. If both rods are initially at 20° C. then
    a) points 2 and 4 receive the same amount of heat at the same time.
    b) points 2 and 3 receive the same amount of heat at the same time.
    c) points 1 and 4 receive the same amount of heat at the same time.
    d) points 1 and 3 do not receive the same amount of heat at the same time.

14. Water at 0° C. is circulated through a coil which is surrounded by water at 80° C. The temperature of the water in the tank
    a) rises because work is done pumping water through the coil.
    b) falls because heat is absorbed by the coil.
    c) remains the same because the heat lost is equal to the heat gained.
    d) remains the same because heat is radiated by the tank.

15. One hundred grams of ice at 0° C. are added to one hundred grams of water at 0° C. in an insulated container which allows no heat losses. The mixture will
    a) become ice at 0° C.
    b) become water at 0° C.
    c) become mostly ice at 0° C.
    d) remain the same.

16. A flask containing water is fitted with a stopper and glass tubing. The flask is placed in a hot water bath and the level in the tubing rises to a height of 10 inches. A mixture of 50 percent water and 50 percent alcohol is poured into the same flask and the flask is again placed into the hot water bath. The level of the liquid in the tubing rises to a height of
    a) less than 10 inches because different liquids are used.
    b) 10 inches because the same amount of heat is absorbed.
    c) more than 10 inches because different liquids are used.
    d) 10 inches because the hot water bath is at a constant temperature.

17. A bar is made of a strip of iron and a strip of brass bonded together. For a one degree rise in temperature brass expands 1½ times as much as iron. If one end of the bar is clamped as shown and the temperature is lowered fifty degrees the bar will
    a) remain straight because iron is stronger than brass.
    b) curve downward.
    c) remain straight because the temperature is lowered.
    d) curve upward.

18. One end, X, of a closed tube filled with air is briefly cooled with an ice pack. When the ice pack is removed, the temperature will
    a) rise rapidly at X.
    b) rise rapidly at Y.
    c) fall slowly at X.
    d) fall slowly at Y.

19. A reading of 45 degrees on the Celsius (Centigrade) scale is the same as a Fahrenheit reading of
    a) 57 degrees.
    b) 77 degrees.
    c) 81 degrees.
    d) 113 degrees.

20. A closed box has two chimneys inserted at the top. A burning candle is placed underneath one of the chimneys as shown. Region A is just above the flame and region B is just above the chimney. Heat is transferred from region A to region B
    a) mainly by convection and conduction.
    b) only by conduction and radiation.
    c) mainly by convection and radiation.
    d) only by conduction and convection.

21. A 500 cm. rod changes 0.1 cm. in length for a one degree change in temperature (Celsius). If the rod's original temperature of 250° C. changes to 200° C. the length of the rod is
    a) 450.0 cm.
    b) 495.0 cm.
    c) 499.5 cm.
    d) 505.0 cm.

22. Water at 0° C. is circulated through a coil which is surrounded by water at 80° C. Which of the following statements is correct? (See diagram for 23-)
    a) Water below the coil neither rises nor falls.
    b) Water below the coil rises.
    c) Water above the coil rises.
    d) Water below the coil falls.

23. Materials X and Y, both at 80° C., are placed in a container filled with air at 20° C. If material X contains more heat than material Y the air above
    a) Y will rise.
    b) X will remain at rest.
    c) Y will move toward X.
    d) X will fall and move toward Y.

24. Telephone wires are hung as shown. The distance d is a minimum during
    a) winter.
    b) spring.
    c) summer.
    d) none of these seasons.

25. The removal of air from between the walls of a thermos bottle would reduce heat loss mainly due to
    a) conduction and radiation.
    b) convection and conduction.
    c) radiation and convection.
    d) conduction, convection, and radiation.

26. Water at 80° C. is
    a) twice as hot as water at 40° C.
    b) eighty times as hot as water at 0° C.
    c) two units hotter than water at 40° C.
    d) eighty units hotter than water at 0° C.

27. A heavy block, attached to a disc, is allowed to fall, pulling the disc through the water from position 1 to position 2. Which of the following statements is correct?
    a) The temperature of the water rises.
    b) The temperature of the water falls.
    c) The temperature of the water remains constant.
    d) The disc does no work on the water.

28. Ethylene glycol dissolves in water in beaker 1 while table salt dissolves in water in beaker 2.
    a) Both mixture 1 and mixture 2 freeze below 0° C.
    b) Neither mixture 1 nor mixture 2 freezes below 0° C.
    c) Only mixture 1 freezes below 0° C.
    d) Only mixture 2 freezes below 0° C.

29. The amount of energy stored in one gram of steam at 100° C. is
    a) greater than the amount of energy stored in one gram of water at 100° C.
    b) less than the amount of energy stored in one gram of water at 100° C.
    c) the same as the amount of energy stored in one gram of ice at 0° C.
    d) the same as the amount of energy stored in one gram of water at 100° C.

30. Several drops of water are placed on the bulb of a mercury thermometer and are allowed to evaporate. The mercury level in the thermometer drops because the evaporation of the water
    a) leaves the air around the bulb cooler than the surrounding air.
    b) removes heat from the region around the bulb.
    c) removes heat from the bulb.
    d) transfers heat energy to the mercury.

31. Two grams of water at 30° C. are poured onto a five gram piece of glass at 0° C.
The final temperature of water and glass is 20° C. The specific heat of glass is
    a) .1 per ° C.
    b) .2 per ° C.
    c) .4 per ° C.
    d) .5 per ° C.

32. Four grams of boiling water is poured into twenty grams of water at an unknown temperature. If the final temperature (after mixing) is 60° C., what was the unknown temperature?
    a) 68° C.
    b) 52° C.
    c) 48° C.
    d) 40° C.

APPENDIX D

HYPOTHESIS I

Control Group

    Loss     Orig. Knowl.   Read Test   N.A. Test
    Score    Score          Score       Score
      2        19             46          28
     -4        30             52          29
     -1        37             70          40
     -4        31             52          37
      2        32             56          38
     -3        32             55          34
     -2        28             48          38
     -3        17             49          34
     -1        21             44          16
      7        27             45          34
      6        31             51          30
     -2        29             52          30
      1        26             43          37
     -1        29             51          35
     -1        35             61          38
      2        23             46          34
      1        27             33          33
      0        37             51          34
      0        17             39          31
     -1        36             68          37
      0        31             55          38
     -4        31             56          35
     -7        20             35          21
      4        30             53          34
      1        27             53          30
      4        26             37          38
      0        30             57          35
      0        33             54          30
      0        32             54          32
      1        33             57          33
      4        31             39          25
     -3        30             32          34
      8        27             39          28
     -2        18             45          27
      1        18             40          27
     -1        21             46          31
     -1        16             40          23
     -5        16             36          18

Experimental Group

    Loss     Orig. Knowl.   Read Test   N.A. Test
    Score    Score          Score       Score
      0        28             40          17
      0        33             41          29
      0        14             36          20
      3        33             63          35
     -4        26             45          38
     -4        26             45          28
      0        18             42          24
     -2        26             47          23
     -1        24             47          23
      2        33             49          34
      4        20             41          31
      2        29             50          36
      7        23             49          31
      0        15             18          27
     -2        25             37          27
     -2        23             37          25
      0        30             48          29
      0        32             39          28
     -6        19             47          32
     -2        33             41          29
      3        26             42          38
     -3        24             54          31
     -4        32             37          22
      5        27             37          24
      1        34             60          36
     -2        25             40          36
      1        36             57          36
     -1        19             37          27
      6        27             41          23
      1        25             27          29
      7        21             18          16
      1        27             46          23
     -5        28             35          25

HYPOTHESIS II

Control Group

    Loss     Orig. App.     Read Test   N.A. Test
    Score    Score          Score       Score
     -2         9             46          28
      8        22             52          29
      0        27             70          40
     -2        19             52          37
     -5        17             56          38
      1        26             55          34
      3        17             48          38
     -3        15             49          34
      7        15             44          16
     -4        14             45          34
     -1       [lb]            51          30
      1        16             52          30
      2        18             43          37
      0        18             51          35
     -3        21             61          38
      1       [lb]            46          34
     -2        10             33          33
      5        24             51          34
      3        17             39          31
     -2        23             68          37
     -5        18             55          38
     -2        23             56          35
      2        16             35          21
     -1        18             53          34
      1        23             53          30
     -1        19             37          38
     -1        18             57          35
     -3        22             54          30
      5        23             54          32
     -3        18             57          33
      0        19             32          34
      3        11             39          28
      6        12             45          27
     -3        10             40          27
     -4        14             46          31
     -3         9             40          23
      3        13             36          18

Experimental Group

    Loss     Orig. App.     Read Test   N.A. Test
    Score    Score          Score       Score
     -1        16             40          17
      1        16             41          29
      1         9             36          20
     -3        25             63          35
     -6        16             45          38
      0        18             45          28
     -5        10             42          24
      1        15             47          23
     -3        17             47          23
     -5        16             49          34
      4        17             50          36
      3        16             49          31
     -1        11             18          27
      2        15             37          27
      0        13             37          25
      6        23             48          29
      2        20             39          28
    [_o]       13             47          32
     -4        21             41          29
     -3        13             42          38
     -5        19             54          31
      1        18             37          22
      0        15             37          24
      1        24             60          36
     -7        10             40          36
     -5        18             57          36
      3        12             37          27
     -2        14             41          23
     -5        14             27          29
     -5         9             18          16
     -1        16             46          23
     -5        13             35          25

HYPOTHESIS III

DISTRIBUTION OF SUBTEST LOSS SCORES

Differential Loss Scores: -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

Frequency, Control Group (N = 37): 1 1 2 2 6 3 3 4 3 2 2 3 1

Frequency, Experimental Group (N = 32): 1 1 1 3 5 7 5 1 1 2 1

APPENDIX E

THE DERIVATION OF THE TEST STATISTIC USED IN THE DIFFERENTIAL RETENTION HYPOTHESIS

The Knowledge and Application subtest scores are assumed to be normally distributed with means and variances as shown.

Assume:

\[ X_{G_i} \sim N\left(\mu_{X_{G_i}},\ \sigma^2_{X_{G_i}}\right), \qquad X = A \text{ or } K, \quad G = E \text{ or } C, \quad i = 1 \text{ or } 2 \]

where:
    A represents an Application subtest score,
    K represents a Knowledge subtest score,
    E represents the experimental group,
    C represents the control group,
    1 represents the initial subtest result, and
    2 represents the final subtest or retest result.

Consider

\[ \frac{A_{E_1} - A_{E_2}}{\sqrt{\sigma^2_{A_{E_1}} + \sigma^2_{A_{E_2}} - 2\rho_{A_E}\,\sigma_{A_{E_1}}\sigma_{A_{E_2}}}} \]

which is the experimental group individual Application subtest difference scaled to its own standard deviation. The denominator in this term is the standard error of the Application subtest score difference.

Now,

\[ \left(A_{E_1} - A_{E_2}\right) \sim N\left(\mu_{A_{E_1}} - \mu_{A_{E_2}},\ \sigma^2_{A_{E_1}} + \sigma^2_{A_{E_2}} - 2\rho_{A_E}\,\sigma_{A_{E_1}}\sigma_{A_{E_2}}\right) \]

where:
    $\rho_{A_E}$ represents the Application subtest-retest population correlation coefficient,
    $\sigma^2_{A_{E_1}}$ represents the experimental population initial Application subtest score variance,
    $\sigma^2_{A_{E_2}}$ represents the experimental population final Application retest score variance,
    $\mu_{A_{E_1}}$ represents the experimental population initial Application subtest mean, and
    $\mu_{A_{E_2}}$ represents the experimental population final Application retest mean.

Then, writing $\sigma_{A_E}$ for the standard error defined above,

\[ \frac{A_{E_1} - A_{E_2}}{\sigma_{A_E}} \sim N\left(\frac{\mu_{A_{E_1}} - \mu_{A_{E_2}}}{\sigma_{A_E}},\ 1\right) \]

It follows that,

\[ \frac{\bar{A}_{E_1} - \bar{A}_{E_2}}{\sigma_{A_E}} \sim N\left(\frac{\mu_{A_{E_1}} - \mu_{A_{E_2}}}{\sigma_{A_E}},\ \frac{1}{N_E}\right) \]

where:
    $\bar{A}_{E_1}$ represents the experimental group initial Application subtest mean,
    $\bar{A}_{E_2}$ represents the experimental group final Application retest mean, and
    $N_E$ represents the size of the experimental group.

Similarly,

\[ \frac{\bar{K}_{E_1} - \bar{K}_{E_2}}{\sigma_{K_E}} \sim N\left(\frac{\mu_{K_{E_1}} - \mu_{K_{E_2}}}{\sigma_{K_E}},\ \frac{1}{N_E}\right) \]

Therefore, for the experimental group,

\[ \frac{\bar{A}_{E_1} - \bar{A}_{E_2}}{\sigma_{A_E}} - \frac{\bar{K}_{E_1} - \bar{K}_{E_2}}{\sigma_{K_E}} \sim N\left(\frac{\mu_{A_{E_1}} - \mu_{A_{E_2}}}{\sigma_{A_E}} - \frac{\mu_{K_{E_1}} - \mu_{K_{E_2}}}{\sigma_{K_E}},\ \frac{2\left(1 - \rho_{AK_E}\right)}{N_E}\right) \]

or, simplifying symbols, $N(D_E, V_E)$, where $\rho_{AK_E}$ represents the experimental group Knowledge-Application loss score correlation coefficient.

Similarly, for the control group,

\[ \frac{\bar{A}_{C_1} - \bar{A}_{C_2}}{\sigma_{A_C}} - \frac{\bar{K}_{C_1} - \bar{K}_{C_2}}{\sigma_{K_C}} \]

is distributed as $N(D_C, V_C)$, with $V_C = 2\left(1 - \rho_{AK_C}\right)/N_C$.

Thus the difference between experimental and control group mean scaled differential loss scores is distributed as

\[ N\left(D_E - D_C,\ V_E + V_C - 2\rho_{EC}\sqrt{V_E V_C}\right) \]

However, $\rho_{EC} = 0$ since the experimental and control groups are independent, and $D_E - D_C = 0$ under the null hypothesis.

Thus, the test statistic,

\[ Z = \frac{\left[\dfrac{\bar{A}_{E_1} - \bar{A}_{E_2}}{\sigma_{A_E}} - \dfrac{\bar{K}_{E_1} - \bar{K}_{E_2}}{\sigma_{K_E}}\right] - \left[\dfrac{\bar{A}_{C_1} - \bar{A}_{C_2}}{\sigma_{A_C}} - \dfrac{\bar{K}_{C_1} - \bar{K}_{C_2}}{\sigma_{K_C}}\right]}{\sqrt{\dfrac{2\left(1 - \rho_{AK_E}\right)}{N_E} + \dfrac{2\left(1 - \rho_{AK_C}\right)}{N_C}}} \]

is distributed as $N(0, 1)$ under $H_0$.
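
The computation described in this appendix can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the author's own procedure: sample standard deviations and sample correlations of the loss scores stand in for the population values, the variance of each group's mean scaled differential loss score is taken as 2(1 - r)/N (the variance of a difference of two unit-variance scores with correlation r, averaged over N students), and the function names `pearson_r`, `group_terms`, and `differential_retention_z` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def group_terms(a1, a2, k1, k2):
    """Return (D, V) for one group.

    a1, a2: initial and retest Application scores, one pair per student.
    k1, k2: initial and retest Knowledge scores for the same students.
    D is the mean Application loss minus the mean Knowledge loss, each
    scaled by the standard deviation of its own loss scores.
    V is the assumed variance of D, 2 * (1 - r) / N.
    """
    a_loss = [x - y for x, y in zip(a1, a2)]
    k_loss = [x - y for x, y in zip(k1, k2)]
    n = len(a_loss)
    d = mean(a_loss) / stdev(a_loss) - mean(k_loss) / stdev(k_loss)
    v = 2.0 * (1.0 - pearson_r(a_loss, k_loss)) / n
    return d, v


def differential_retention_z(exp, ctl):
    """z statistic comparing two independent groups.

    exp and ctl are (a1, a2, k1, k2) tuples of score lists.
    """
    d_e, v_e = group_terms(*exp)
    d_c, v_c = group_terms(*ctl)
    return (d_e - d_c) / sqrt(v_e + v_c)
```

The function would be called with four score lists per group, for example `differential_retention_z((a1_e, a2_e, k1_e, k2_e), (a1_c, a2_c, k1_c, k2_c))`, and the result compared with standard normal critical values.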
