UBC Theses and Dissertations

Evaluation as protection: using curriculum evaluation to promote a just distribution of educational… Reitz, Cheryl Rene 1997

EVALUATION AS PROTECTION:
Using curriculum evaluation to promote a just distribution of educational resources in a private post-secondary English-language liberal arts institution in Canada for Japanese students which uses a leveled, modular, skills-based mastery-learning entry programme

by

CHERYL RENE REITZ

B.A., Anthropology, The University of Washington, 1971
Professional Development Program, Simon Fraser University, 1988
Post-Baccalaureate Diploma, TESL, Simon Fraser University, 1989

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS

in

THE FACULTY OF GRADUATE STUDIES
Centre for the Study of Curriculum and Instruction

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
April 1997
(c) Cheryl Rene Reitz, 1997

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of [handwritten entry illegible]
The University of British Columbia
Vancouver, Canada
Date: [handwritten entry illegible]

ABSTRACT

This thesis examines how one might evaluate the justice of educational resource distribution. It focusses on the criteria of institutional justice formulated by John Rawls: according to these criteria inequality in the distribution of resources is only allowed if it can be shown to benefit all groups, including 'the least favoured'.
The thesis also demonstrates how qualitative and quantitative research methods can be combined in order to reach a more accurate and 'just' evaluation. The research, which was conducted at a private post-secondary English-language liberal arts institution in British Columbia for Japanese students, compares annual student growth in English, both before and after the implementation of a three-to-ten-month leveled, modular, mastery-learning program for entry-level students. The research also includes interviews to determine teacher attitudes about the previous and present programs and their effect on students. In both the qualitative and quantitative studies, program effects on high-, medium-, and low-entry ability students are looked at separately (in order to use Rawls' criteria). The context of the research is clarified with short summaries of issues around mastery learning, leveling versus tracking, and Japanese versus western education. The quantitative research finds that, contrary to teacher impressions, the mean improvement for students in the present program is not significantly different from that in the previous program. The qualitative research, however, points out important justice implications not revealed by the other study. The thesis concludes that (1) there are some problems with using Rawls' criteria in an educational setting; (2) looking at program effects on three separate ability groupings can reveal trends having justice implications; and (3) assessments of the justice of educational resource distribution should attempt to triangulate with both qualitative and quantitative studies which attempt to answer the same question.
TABLE OF CONTENTS

Abstract  ii
Table of Contents  iii
List of Tables  v
List of Figures  vi
Acknowledgement  vii
Dedication  viii

1.0  Introduction  1
1.1  Definition of Terms  1
1.2  Context of the Question  7
1.3  Summary of the Thesis  16

2.0  Justice as Fairness - John Rawls  20
2.1  Two Visions of Democracy  20
2.2  Rawls' Principles  21
2.3  Two Questions About the Application of Rawls' Theories to Education  25
2.4  Summary of Use of the Notion of Justice as Fairness in Curriculum Evaluation  28

3.0  Changing Methods of Curriculum Evaluation  29
3.1  Qualitative/Quantitative Dualism  29
3.2  A Combination of Qualitative and Quantitative Methods in Curriculum Evaluation  31
3.3  Application of this Perspective to my Thesis  34

4.0  Summary of Issues Related Specifically to the Research Site  35
4.1  Mastery Learning  35
4.2  Use of Ability-Grouping ('leveling') versus 'Tracking'  45
4.3  Relevant Japanese Curricular, Evaluation, and Justice Issues  49

5.0  The Research Projects  58
5.1  Quantitative Project - Description and Results  61
5.2  Qualitative Project - Description and Results  80

6.0  Conclusions  125
6.1  Quantitative Research  125
6.2  Qualitative Research  129
6.3  A Stereoscopic View  137
6.4  Conclusions about the Kind of Information Gained by Combining these Two Types of Research  141
6.5  Conclusions about the Use of John Rawls' Criteria when Addressing Issues of Educational Justice  143
6.6  Summary of Research Findings and Conclusions  144
6.7  Suggestions for Further Research  146
6.8  A Postscript  149

7.0  Works Cited  151

8.0  Appendices
8.1  Samples of SPSS 6.1 Matched-Pairs Input and Transformed Data  157
8.2  Sample of SPSS 6.1 Output showing Means and Significance  159
8.3  Parallel 80%/50% Pass Mark Grading Schemes Used at College 'X'  160
8.4  Application: A Look at some Site-Specific Justice Issues  161
8.5  Description of SLEP Test  166

LIST OF TABLES

Number  Title  Page Number
Table 1  Summary of General Characteristics Defining Mastery Learning  39
Table 2  Defining Characteristics of the Three Kinds of Mastery Learning  40
Table 3  Proportion of Subjects in the Three Levels, by Matched-Pair Groupings  68
Table 4  Mean SLEP Entry Score and Gain (per Year and per Matched Pair Groupings)  71
Table 5  Student Responses to Programs by Level (as Remembered by Teachers)  130
Table 6  Summary of Quantitative and Qualitative Results  138
Table 7  Key to SPSS Input Data  157
Table 8  Sample of SPSS Input Data  157
Table 9  Key to SPSS Transformed Data  157
Table 10  Sample of SPSS Transformed Data  158

LIST OF FIGURES

Number  Title
Figure 1  Mean Increase in Total SLEP Points by Matched-Pair Years
Figure 2.1  Mean Increase in Listening SLEP Points by Matched-Pair Years
Figure 2.2  Mean Increase in Reading SLEP Points by Matched-Pair Years
Figure 3.1  Mean Increase in Listening SLEP Points by Matched-Pair Years and Entry Level
Figure 3.2  Mean Increase in Reading SLEP Points by Matched Pair Years and Entry Level

ACKNOWLEDGEMENT

This thesis would not have materialized had I not been the recipient of extraordinary kindnesses from many people whom it is my pleasure to thank at this time. I must thank first and foremost my family. Through some very difficult times over the past few years, they have consistently assured me it was extremely important to them that I continue working towards my educational goals. The unconditional support of my colleagues — both my fellow teachers at work and my fellow UBC M.A. and M. Ed. graduate students here in the Kootenays — has been a source of great comfort and inspiration. Dr. LeRoi Daniels of UBC first saw some value in my topic and Dr. Ian Andrews of Simon Fraser University advised me, while the project was still in an embryonic stage, that it was indeed worth pursuing. My UBC advisor, Dr. Jerrold Coombs (Educational Studies), and committee member Dr.
Murray Elliott (Educational Studies) have exercised admirable patience and restraint, frequently guiding me away from various seductive side-routes, assuring me that just because something is interesting to me does not mean it must be included in my master's thesis. Dr. Graham Mallett (Language Education) agreed to be my outside reader under extraordinary time constraints, for which I am most grateful. Two other UBC professors outside my department deserve a special thank-you as well. Dr. Marshall Arlin (Educational Psychology) took time to make some helpful observations about my research results. Statistician Dr. Arleigh Reichl helped me import data from one program to another, and made sure I was inputting data correctly. I must also thank several other colleagues who have assisted in the following areas: Dr. John Casey chaired the Research Committee at my institution which gave me both access to data and time to interview teachers; Dr. Carol Thew was persistent enough to locate vital data which all others thought had been lost and, along with UBC doctoral candidate Marilyn Low, provided helpful insights at crucial times. Finally I extend my deep gratitude to cultural liaison officer, Mr. Yoichi Oshima, who assured me that my description of Japanese education was fairly close to his experience as both a student and a teacher in Japan. The thesis is still far from perfect, but it is a lot better than it might have been if not for you. I must apologize for being unable to incorporate all of your advice. (For example: Unfortunately, it is impossible to make my thesis half the size and double the size at the same time!) In those rare instances in which your advice conflicted, I hope you will agree with my choices. Taking full credit for any and all errors and omissions, I extend to all of you my heartfelt thanks and best wishes. Though I started to give up on myself many times, you never gave up; your faith and good humour always brought me back from 'the edge'.
May God bless you all.

DEDICATION

To the memory of my parents
Dorothy Verne Mehl Hennen, August 16, 1921 - October 7, 1995
and
Robert Eugene Hennen, April 18, 1921 - October 19, 1996

1.0  INTRODUCTION

Because the distribution of educational resources is seen to be ultimately associated with the distribution of social, political, and economic power, the manner in which educational resources are distributed is a perennial concern in a democratic society. In this thesis, I will first present some proposals regarding ways to determine a just distribution of these resources. Next, I will suggest a way of combining qualitative and quantitative methods to aid in this determination. I will provide an example of a type of research and evaluation which uses a combination of qualitative and quantitative methods to examine how closely a curricular program conforms to a just arrangement, within the context of my own teaching environment, a private post-secondary English Language liberal arts institution in British Columbia for recent Japanese high school graduates. The thesis will conclude with suggestions of how my research could serve as a model for further inquiries into the justice of specific curricular decisions.

1.1  Definition of Terms

The terms used in the title - just, distribution, and educational resources - were used intentionally in contrast to equal, access, and education or knowledge. The rationale for doing this is below and is followed by a discussion of 'input' and 'output' definitions of equality and the notion of 'outcome distribution' in education.

1.1.1  'Just'

Equal treatment, in that it can mean identical treatment, is not necessarily just. The word 'justice' implies a more sensitive or appropriate fit between the individual and the treatment received than does 'equality'. By 'justice' I refer to any practice in which the principle of equality is followed.
This principle is an assertion that "in all public matters all persons should be treated identically, except in contexts where sufficient reasons exist for treating particular individuals or groups differently" ("Justice"). In defining the antithesis of justice, John Rawls refers to 'injustice' as "inequalities that are not to the benefit of all" (62). Clearly, the important relationship between equality and justice is complex.

1.1.2  'Distribution'

'Distribution' was used instead of 'access,' because the latter denotes simply that a 'commodity' is potentially available if the 'consumer' has the desire or motivation and the personal assets to take advantage of the opportunity to acquire it. If we see the consumer and his/her assets as the sole determiners of whether the commodity is acquired, and the role of the institution as simply to provide 'access' to it, this term is sufficient. However, if we are looking at the institution's role as being more complex than this, we might consider using a different word. While the term 'distribution' can mean simply the (one-way) act of dispersing and giving out shares of a commodity, I use it here in its second meaning: "the extent to which different groups, classes, or individuals share in the total production or wealth of a community" ("Distribution"). Here, the verb 'share' implies an intentional and cooperative giving and taking. This seems to more accurately describe that active two-way relationship between educational resources (a type of wealth) and learners which results in the learners' synthesis of knowledge (a more-or-less critical determiner of future production and wealth as well as a type of wealth in itself). While the term may not be totally accurate, in that the giving and taking are at times one way, at others reciprocal, and at times unintentional, it seems to be more appropriate than 'access' in this case. It also implies a statistical construct - a 'just' distribution in contrast to a 'normal' one.
1.1.3  'Educational resources'

It was tempting to use the term 'knowledge' instead of 'educational resources', as in John Goodlad's reference to "the distribution or democratization of knowledge" (792). However, because it fits more closely with the theories of John Rawls presented herein, I have primarily used the term 'educational resources' as interpreted by Jerrold R. Coombs ("Equal" 282) to mean "those conditions or objects which facilitate desirable educational achievements." It can mean teachers' time and attention, learning materials and methods, a comfortable classroom, and so on.

1.1.4  'Input/Output'

One way to determine whether these resources have been justly distributed (the 'input' interpretation) is through measuring how equal the programmes are which have been made available to various individuals or groups of students. Another way to determine whether resources have been justly distributed (the 'output' interpretation) is through measuring the degree to which students' observable educational achievements (sometimes called 'learning outcomes') are equal. The 'output' interpretation was given support by the U.S. Supreme Court Ruling of 1954 that 'separate but equal' schools for black and white students were not just because they (by implication: the separate schools themselves rather than any genetically determined student attributes) produced unequal effects upon the students. As James S. Coleman noted, this landmark decision furthered the notion that ". . . equality of opportunity depends in some fashion upon effects of schooling . . ." (my emphasis) ("Concept" 15). Coleman, et al. made liberal use of this output assumption in their influential Equality of Educational Opportunity, a report commissioned by the U.S. Civil Rights Commission in 1964.
In the first (input) case, then, it is the responsibility of institutions to give equal resources (input) while the learners bear responsibility for creating their own, differentiated educational achievements. In the second (output) case, equal achievement (output) — assumed to be leading to more-or-less equal power, equal satisfaction, equal income, etc. — becomes the goal, and the responsibility for promoting these results shifts, to some unspecified degree, to the institution; the implication is that input may have to become unequal in order to promote a more equal output. While not discounting the factors of personal choice and responsibility on the individual level, this output notion correctly acknowledges that providing equal input to all groups, even when the principle of equality is applied, does not guarantee that all groups will equally benefit (as determined by output measures), no matter how equality of input is determined. Differential utilization of educational resources, in particular among racial, cultural, and gender groupings, then, is seen to be another form of inequality — one that can serve to further additional inequality and polarization in the next generation — which educational institutions must eventually address. Jerrold R. Coombs ("Equal" 282) has identified fallacies integral to both input and output interpretations. The input interpretation, he points out, while correctly acknowledging the intentionality of learners, incorrectly assumes equality of learner groups — that is, that all start out on equal footing and that therefore, given equal programmes, the groups will achieve equal outcomes. On the other hand, the output interpretation assumes that students are passive recipients of 'knowledge' — that the institution can create achievement (and equality), neglecting the two-way intentional nature of learning and incorrectly assuming that equality of outcomes among groups is either desirable or possible.
According to the principle of equality, one could justify situations where students receive very different educational resources according to their age (different grade levels), personal interests (elective clubs, sports, and classes), and personal strengths (adjustments for various handicaps). This is considered just, although most would agree that (except in cases where restitutions are being made), using race, nationality, religion, or gender as a determiner of the distribution of educational resources is not. However, as Coombs ("Equal") points out, minority or less-powerful groups may have inherent values and interests which prevent their members from consuming educational resources 'equally' with predominant groups. As examples, he notes groups such as Amish, various aboriginal groups, and even females (as a less-powerful group), whose 'unequal' consumption of educational resources may be their own (conscious or unconscious) choice, based on their rejection of mainstream or 'predominant' religious, cultural, or gender values. He suggests that in cases such as these, culturally-determined or gender-specific programmes could be quite different from those of the mainstream, as long as resources are distributed in such a way that "neither group has reason to envy the resources given to the other" (291). This thesis will consider the use of this principle with various ability groups as well.

1.1.5  'Outcome Distribution'

It is very important to distinguish between determiners of justice at the level of the individual versus that of the group. While each individual within a group is going to have very different strengths, motivations, and achievements from others in the group, making the goal of equal outcomes on the individual level ludicrous, a case can often be made for striving for equality of outcome distributions among groups.
This is dependent on several factors such as whether the learning goals are mutually and equally shared by the groups, and whether the instrument chosen to measure the outcomes is not biased towards any particular group(s). In cases in which groups have very different attributes which would affect their learning (such as differences in prior learning or ability - criteria by which students are grouped at my institution), achievement among groups may be compared (outcome distribution determined) in terms of differences in improvement (change in achievement) instead of differences in absolute achievement — as long as all groups can be assessed using the same scale. Otherwise, the only way the lower groups' outcomes would ever be seen as 'equalized' with those of the higher group would be by preventing the higher group from advancing, a ludicrous situation, but nevertheless, one which we will ponder in our examination of the mastery learning literature.

1.1.6  Summary - How the Terms will be Used in this Thesis

In this thesis, the entire student sample is monoracial, monocultural, and of the same general socio-economic and age group as well. The smaller groups are based simply on performance on standardized tests of English proficiency — a function of native ability, motivation, and prior learning combined. Presumably, then, there is no significant difference among the groups in learning goals, and the test instrument is not biased towards any one group. While both input and output are considered, the emphasis of this thesis generally is on outcome distribution among the ability groups, not on specific individuals within them. However, standard deviation measures used in the quantitative study may be seen as an indication of the extent of individual variation. Note also that in the quantitative study, the distribution is that of change in scores among the various ability groups rather than the distribution of absolute scores.
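The idea of comparing groups by change in score rather than by absolute score can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the records, the level labels, and the function name `gain_summary` are hypothetical and are not data or procedures from the study, which used SPSS. The sketch simply groups invented entry and exit scores by entry level and reports the mean gain and its standard deviation for each group.

```python
from statistics import mean, stdev

# Hypothetical records: (entry_level, entry_score, exit_score).
# These numbers are invented for illustration; they are not study data.
records = [
    ("low", 30, 38),
    ("low", 32, 37),
    ("medium", 40, 46),
    ("medium", 42, 49),
    ("high", 50, 55),
    ("high", 53, 57),
]


def gain_summary(rows):
    """Return {level: (mean gain, sample standard deviation of gain)}."""
    gains_by_level = {}
    for level, entry, exit_score in rows:
        # The quantity compared is the change in score, not the exit score.
        gains_by_level.setdefault(level, []).append(exit_score - entry)
    return {
        level: (mean(gains), stdev(gains))
        for level, gains in gains_by_level.items()
    }


summary = gain_summary(records)
for level, (mean_gain, sd_gain) in summary.items():
    print(f"{level}: mean gain {mean_gain:.1f}, SD {sd_gain:.2f}")
```

In this invented example the high group finishes with the highest absolute scores but shows the smallest mean gain, which is the kind of contrast that a comparison of absolute scores alone would hide. The standard deviation of the gains gives a rough indication of individual variation within each group.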
Parts of the qualitative study, besides looking at the distribution of educational resources as determined by both inputs to and outputs from the various ability groups, will also look at the extent to which the curriculum is experienced as 'just' by individuals — both teachers and students. In short, the roles of both the institution and the learner in 'creating academic achievement' are acknowledged in this thesis. Consequently, the justice of educational resource distribution is evaluated both in terms of learners' access to resources (accessibility being a function of both resource and learner attributes), and learners' achievement from resources (again, a function of both resource and learner qualities). Both input and output will be considered, though outcome distribution will be stressed, particularly in the quantitative study. Not wishing to commodify knowledge nor trivialize the complex nature of its synthesis or expression, I hope the phrase 'just distribution of educational resources,' then, will prove useful in this case.

1.2  Context of the Question

In this section I hope, first, to show how the concerns of this thesis are not confined solely to the specific context of my institution; secondly, to establish the relevance of recurring themes of this thesis — horizontal versus vertical power arrangements; qualitative versus quantitative research and evaluation; and standardization versus teacher professionalization — and, finally, to demonstrate how these themes are intimately related in the present educational context and particularly in regard to determining the justice of curricular decisions.

1.2.1  Consolidation/De-centralization Trends

The political and economic reality of Canada in the late 1990s is that of balancing budgets and downsizing entitlements, programs, and bureaucracies in order that future generations not be saddled with the present generation's debts.
Thoughtful people, while coming to understand this reality, are asking what the effects will be, and how the negative ones might be mitigated. School districts in British Columbia, Alberta, and Ontario are presently being forced to consolidate (centralize) for budget reasons. The evaluation from the top down of schools, of principals, of teachers, and of the curriculum they teach will, as a result, become more centralized, but also more remote in that final decision-making will more frequently occur in places other than the local community. Private institutions such as my own, experiencing similar financial pressures, are likewise seeking to trim administrative costs; presumably this will result in less administrative time for curricular and teacher evaluation. What effect will these situations have on evaluation of curriculum? While there will inevitably, due to the greater degree of district centralization, be some additional external constraints (such as general proclamations from unseen bureaucrats rather than more negotiable site-specific mutual agreements among personally-affected parties), these may not be enforceable at the classroom level anyway due to the increasing remoteness of administrators. Some would even question whether present dictums are truly enforced — implementation studies have often shown that many teachers implement fully only those programs which they support, regardless of 'policy'. In addition, constraints due to greater district centralization will probably be countered by concurrent de-centralization trends at the federal and provincial levels providing more power to — but less 'outside' pressure on — the district office. This situation (remoteness from consolidated district offices plus decentralization at the federal and provincial levels) will presumably provide teachers, at least on a day-to-day basis, with more latitude to conduct their classrooms as they wish.
In a recent phenomenon related to this, various provincial educational initiatives (such as British Columbia's province-wide Year 2000 reforms) have been promoted vigorously, then suddenly dropped or changed radically; as a result, teachers abandoned on various 'bandwagons' are left with an appetite for innovations, but with a corresponding cynicism towards those imposed from above. This development, in rejecting what William Pinar, et al. (298) describe as a "monolithic, single-voice curriculum" imposed from above, parallels the notion of 'heteroglossia' (inclusion of multiple voices, acknowledgement of multiple truths) first promoted by Mikhail Bakhtin, popularized by James A. Whitson, and related to 'horizontal' or cooperative methods of curriculum development, research, and evaluation described below (see also Pinar). Increased remoteness from administrators is potentially supportive, then, of 'horizontal' (rather than 'vertical' or 'top-down') curriculum evaluation practices such as those advocated in the 1992 ASCD Yearbook edited by C. Glickman: evaluation by peers, by autobiographical self-exploration, or through action research (see also Gitlin & Goldstein; Gitlin). With these horizontal models, qualitative evaluation methods are generally preferred to quantitative ones. My contention is that while these 'horizontal' methods have great merit, they also contain some important flaws which must be addressed.

1.2.2  Horizontal versus Vertical Methods

These educational power realignments (consolidation/decentralization) and their associated features potentially have a great impact on what educational resources will be available, how they will be valued, and, most important to this thesis, how they will be allocated and distributed among students. Two conflicting movements characterize these realignments.
Vertical power realignments, such as district consolidation, are hierarchical by definition; they generally are utilitarian (concerned with maximizing 'the good') in nature and lead to standardization. Horizontal power realignments, such as decentralization, cooperative (and individual) teacher action research, and peer evaluation, are non-hierarchical by definition; they are generally laissez-faire in nature, and lead to multiple, nonstandardized results. For example, without a common provincial umbrella uniting them, districts will tend to become more unique: less like one another — and as Martin Carnoy (199) points out, less equal as well. Similarly, in a 'horizontal' power alignment, an individual's statement of her/his own internal experience is regularly considered valid even without the corroboration of objectively-collected 'data'. Obviously, the number of possible descriptions of one's personal internal experience (versions of reality or of 'the truth') is equal to the number of subjects queried. There are certainly varying postures from which to regard the nature of truth and reality: Is reality/truth an illusion? If not, is there one 'unitary' reality/truth or many? Is reality/truth 'objective' or is it the synthesis of multiple 'intersubjectivities'? and so on . . . However, the point is that significant challenges have been thrown to positivism, to the supremacy of 'objectivity', and to the primacy of quantification (if not to quantification itself). These challenges have had and will continue to have a major impact on curriculum research and evaluation practices. My contention is that the many educational philosophers who have put forth these challenges (see Pinar, et al.) are, generally speaking, supportive of the current 'horizontal' educational power realignments previously mentioned. It must be noted that 'horizontal' does not imply 'qualitative'; neither does 'vertical' imply 'quantitative'. These are simply terms that (i.)
show the general direction from which the locus of power originates and (ii.) label a collection of elements which are frequently but not always associated with each. While contending that 'horizontal' evaluation is primarily (though not solely) of a qualitative nature, I do not wish to imply that 'vertical' evaluation according to external standards is necessarily quantitative. For example, current accreditation procedures (primarily a 'top down' or 'vertical' — and summative — evaluation) frequently contain significant qualitative elements. My thesis is wholly supportive of this trend.

If either of these movements is allowed to progress without limits, however, unchecked by the other, injustice may result. In the schools, those on the bottom of the educational hierarchy, students, are potentially the most vulnerable to this injustice. Students are the ones to suffer most from the excesses of vertical power, empirical evaluation, and standardization such as being inadequately taught by dispirited teachers who have no sense of ownership of the curriculum, experiencing (personally) the cumulative effects of 'failure' according to standardized testing and the normal curve, or enduring overly-standardized, boring texts. However, students also suffer unjustly when there is insufficient vertical power and empirical evaluation. Without some means in place to monitor conditions in their schools, or compare them to other schools, students can, unbeknownst to the public, suffer from inadequate materials, lack of coordination and integration (from class to class, year to year, and school to school), disorganized or incompetent teaching, segregation and unequal treatment (according to gender, race, socio-economic class, or handicap), or unfair allocation of resources.

1.2.3  Attack on 'Vertical' Methods

Pinar, et al. note that as early as the 1960s, more than a few scholars (such as James B. Macdonald, Dwayne E. Huebner, Herbert M. Kliebard, Elliot W.
Eisner, Maxine Greene, Louise M. Berman, and Paul R. Klohr) had begun to criticize what they saw as related practices such as "behaviourism, scientism (a reduction of forms of knowing to quantifiable ones), dehumanizing technology, and an oppressive, alienating bureaucraticization of the schools . . . (They) . . . attacked behavioural objectives, . . . and quantified, standardized evaluation and measurement of learning" (my emphases) (Pinar, et al. 184; Huber). This was what Pinar, et al. refer to as 'The first stage of Reconceptualization' of the curriculum field which was to exert ever more influence on education, continuing on into the present day. Those who distrust vertical power, standardization, and empirical evaluation have many advocates and an attractive set of rationales. For example, a major trend in curriculum theory is to see traditional (i.e. standardized, mainstream, 'top-down') education negatively as a way of assimilating and controlling lower classes while indoctrinating them to the point of view of either the most powerful or most numerous group in society (Bowles & Gintis; Apple, "Curriculum"). Schools are seen as vehicles for reproducing current socio-economic hierarchies through what Phillip Jackson calls the 'hidden curriculum.' This idea has gone through several phases: the concepts of race and gender were added to that of class (Apple, Cultural); the notion of resistance by the lower status or minority group members was added (Willis; Giroux) and a 'liberation pedagogy,' based on the work done earlier by Paulo Freire ("Conscientizing") was advocated. Freire's idea was that the schools could be used in a more positive manner, developing within society's victims the intellectual tools they need in order to dismantle oppressive systems.
These reformers came to advocate an evaluation of curriculum by such techniques as determining its value according to how well it empowers and liberates students (Freire, Pedagogy) or as considering the curriculum an art form which can be judged aesthetically subject to 'critical connoisseurship' (Elliot Eisner, Enlightened). In addition, they urge evaluation of teachers by self (and by peers), and of students by qualitative, individualized, and descriptive methods. Note that one cannot completely separate the teacher from the curriculum any more than one can separate the medium from the message. Methods of evaluating students are also an integral part of the curriculum. Evaluation in curriculum, then, can easily come to include that of students and teachers as well.  Reitz 13  All of these theorists were opposed to the use of top-down evaluation if it served to further the successful reproduction of an oppressive status quo. Because standardized testing has been used so extensively to sort and label students, to limit their future opportunities, to rob them of their self-esteem, or to marginalize them (House, "Justice"; Bowles & Gintis; Books), many educators have also come to oppose standardized testing of students, particularly if the tests are used to determine or limit the future academic options open to a particular individual. However, both bureaucrats and the public continue to lend great credence to numbers, despite the protests of many evaluation critics. Districts are quite often judged by their students' performance on standardized tests. Students are admitted to university primarily on the basis of their high school grade point average. As well, as critics Blaine R. Worthen and James R. Sanders claim, ". . . quantitative work is still the dominant approach to educational inquiry, as even casual reading of the most influential journals in education . . . will reveal" (51). 
Nevertheless, I predict that horizontal realignments, because they involve less administrator time, and are generally easy and cheap to implement (in addition to other, more idealistic rationales), will become even more popular in the coming age of 'downsizing'. On the other hand, along with balancing budgets can come demands for greater accountability and quantitative justification of expenditures such as educational outcomes, 'payoffs' and 'dividends'. As well, economies of scale support standardization, and quantitative studies, like qualitative ones, can also concentrate on factors which are easy and cheap to measure. Though horizontal trends seem to be on the upswing, then, there is reason to predict that both trends will continue to influence evaluators of the foreseeable future.  Reitz 14  1.2.4  Standardization - Oppression or Protection?  Note Pinar, et al.'s association (thesis, page 11) of quantification with standardization; though the two often go together, neither requires nor assumes the other. However, the two are commonly associated by many people; this common association, combined with a negative perception of the purpose of standardized testing, may well have contributed to the present popularity of qualitative (as an alternative to quantitative) testing - and to that of curriculum evaluation and research methods, as well. The vertical ideal of 'a standardized curriculum' has waxed and waned over the decades. For example, in the 1920s scholars such as Franklin Bobbitt, in the words of Pinar, et al., saw "curriculum standardization and centralization . . . (as) . . . goals, not oppressive realities" (33). 
Reformers criticized fragmented, locally-controlled school systems of disparate quality under the assumption that, as David Tyack put it, "Regulation, bureaucratization, and centralization would equalize education by standardizing it, delegate decision making to experts, and 'Americanize' a diverse population" (my emphasis) (3). In the Progressive Era, the theories of John Dewey were to conflict somewhat with Bobbitt's goals; while Bobbitt was seen as an advocate for bureaucracy, Dewey was seen as one of teacher professionalism. As R. Corwin put it, "Bureaucracy, by its nature, requires a high degree of standardization, with stress on uniformity in both rules and conduct . . . Professionalization, however, is marked by a low degree of standardization" (Glanz 162). During the 1940s and 1950s, Bobbitt's and Dewey's visions of teaching were to alternately influence and temper the proliferation of mutually supportive standardized tests and textbooks. As Kellaghan and Madaus (89) note, external exams tend to restrict what is taught - objectives that are not to be tested, or are difficult to test (such as oral and manual skills) will simply not be emphasized in a class or in a text. External exams,  Reitz 15  including university entrance exams, came to "determine much of the curriculum and circumscribe the professional role of teachers" as Pinar, et al. (798) paraphrase Kellaghan and Madaus (89). Here we see a relationship among bureaucracies, external standards and standardization of materials which contains elements we will see again, though in a far more extreme form, in the discussion on Japanese education (thesis, pages 50-52). From the 1960s through the 1980s, there were various curricular reform movements, some calling for more standardization and external testing, some calling for less. 
In general, there was concern over both high dropout rates and poor preparation for post-secondary schooling and jobs, especially in urban public schools and among minority groups. A common response to this, noted Dennis Carlson, was the recommendation that schools return to 'basic skills' and more top-down evaluation through standardized testing, a phenomenon in evidence at my college as late as 1990, as will be seen in the qualitative section of this thesis.  1.2.5  A 'Blind Spot' of Solely-Horizontal Methods  After decades of a scientific-positivist (some would say 'utilitarian') approach to curricular decision-making, many now seek to bring a more human touch back to education. Words such as 'nurturing,' 'community,' and 'spirituality' are heard once again from academics. Many educators now recognize the mistakes of the past which resulted from the mis-use of curricular power and the unjust distribution of educational resources, often buttressed by claims of objectivity and reason, and supported by the results of quantitative research. As a consequence, these academics (and others) tend to question the exertion of top-down power, arbitrary controls, or external standards and question whether objectivity is possible or (Derrida; Deleuze & Guattari) if 'reason' actually exists. Pinar, et al. note many other curriculum theorists today who would have educators question (or even abandon) the positivist strictures of reason, logic, and  Reitz 16  empiricism which have directed (these theorists would probably say 'shackled') the progress of Western thought. To reject reason implies irrationality, a precarious stance at best; likewise, though we may not label ourselves 'positivists,' we devalue any of the positivist 'tools' at our own peril. This is not to say that we cannot put some new tools (e.g. qualitative, horizontal, liberating tools) into our toolbox. 
However, this does not imply that we need to throw out the old ones simply because people sometimes mis-use them. While acknowledging their positive intentions, logical rationale, and well-articulated methods, I will argue that the 'blind spot' of those who see curriculum in this way seems to be their distrust of power, seen in their negative attitude towards external standards and controls and their preference for qualitative evaluation. While just allocation of educational resources, equality of opportunity, and provision of the best education possible to our young are, unquestionably, their goals, precisely these ideals - justice, equality, and quality of education - may be threatened by the devaluation of external standards, quantitative evaluation, and external controls. Good intentions are not enough, as school personnel themselves may not even be aware that conditions at their school are unsatisfactory or unjust, or could be better, unless they have an external standard with which to compare them.  1.3  Summary of the Thesis  How should twenty-first century educators address the problem of justly distributing educational resources? They must consider the many failures of centralization and vertical, utilitarian control; of standardization; and of using primarily quantitative evaluation methods. They can't ignore the compelling arguments for decentralization and horizontal control, for including many - even conflicting - points of view (heteroglossia) in evaluation, for qualitative research, and for peer evaluation.  Reitz 17  How can they ensure equality of educational opportunity without creating a standardized regulatory monolith which destroys the quality of the educational experience? Is there any way to equalize educational opportunity for students while, at the same time, granting teachers more autonomy in the classroom? Are vertical and horizontal arrangements mutually incompatible? 
Or, by incorporating the strengths of both arrangements, and ensuring that neither predominates to the point of creating injustice, can they perhaps be woven together? Is some synthesis possible? If so, what criteria shall we use to determine whether the principle of equality has been adhered to and whether a just distribution of educational resources has been achieved? It is, perhaps, only human nature to blame the tools that people misuse rather than the motives of the people improperly wielding them. My thesis is that, ironically - despite their good intentions and the undeniable value of qualitative and horizontal methods of evaluation - postmodern theorists' distrust of power and their devaluation (or outright rejection) of quantitative methods and external, hierarchical (top-down) evaluation could easily lead to a laissez-faire education system. Without some external standards, and without some quantitative and top-down evaluation, there is no way to ensure justice in allocation of resources, equality of opportunity, or quality of pedagogy. Ensuring justice, equality, and quality is no more likely in a laissez-faire education system than in a laissez-faire marketplace. I am not advocating a return to utilitarian, solely-statistical, solely-top-down methods of curriculum evaluation. These methods alone are unaided by principles of justice, confined to a limited scope, and deaf to the nuances of the individual spirit; alone, neither the positivist tools of evaluation nor the new, often more qualitative tools which have been developed, are sufficient to ensure that all children truly encounter justice, opportunity, and quality education throughout their school experience.  
Reitz 18  Rather, I propose that formative evaluation of individuals (principals, teachers, students) should primarily be horizontal, qualitative, and mutually negotiable, as this form is potentially more meaningful and constructive on a personal level, as well as less threatening (hence, more readily acceptable) to the individual being evaluated. Furthermore, evaluation of curricula, schools, districts, 'the nation's schools,' etc. should have qualitative components as well as quantitative ones. However, a need remains for some external standards to guide all these individuals, institutions, and curricula and, it follows, for at least some quantitative assessment according to these external standards. To determine which method to use in a given case, I propose these guidelines. A combination of vertical and horizontal, quantitative and qualitative methods may be used at any time to add more information to an assessment. However, a combination of vertical and horizontal, quantitative and qualitative methods must be used in cases in which matters of justice, equality of opportunity, or quality of pedagogy, which impact directly upon students, are being assessed in a summative manner. This is because when such critical factors are being assessed, as much information as possible is needed. We simply can't assume that participants either possess sufficient knowledge of the big picture or are capable of accurately situating themselves in it. We also can't assume that all evaluators have the same standards unless they are articulated. To determine whether a just distribution of educational resources has been achieved, I originally turned to a theory of justice, sometimes referred to as 'justice-as-fairness,' propounded by Harvard University's John Rawls. Curriculum-evaluation scholar Ernest House ("Justice") had suggested that Rawls' theory might be useful in assessing the justice of curricular decisions. 
However, over time, I came to question (as have others such as Kenneth Strike as well as Rawls and House themselves) whether Rawls' theory was directly applicable to determining the just distribution of educational resources.  Reitz 19  Consequently, then, I will urge a reconsideration of some of the reasons for external standards, quantitative evaluation, and external controls, will question the implications of their devaluation, and will suggest two ways of mitigating their negative effects while retaining their positive ones: (i.) by using some principles of justice-as-fairness, and (ii.) by using a combination of quantitative with qualitative evaluation methods. We will now take a more scrutinizing look at previous queries into these issues, exploring in Section 2.0 the notion of justice-as-fairness as it applies to evaluation in education, particularly in the evaluation of educational resource distribution, and in Section 3.0, what research methods to use in this determination. As well, we will briefly examine literature which will help define the specific context in which the research was conducted, such as summaries in Section 4.1 of the ongoing debate over mastery learning; in Section 4.2 of the difference between 'leveling' and 'tracking' students; and in Section 4.3 of curricular, evaluation, and justice issues in the Japanese education system, with which my institution's students are most familiar. The two research projects are presented in Section 5.0 and the conclusions of the thesis in Section 6.0.  
Reitz 20  2.0  JUSTICE AS FAIRNESS - JOHN RAWLS  2.1  Two Visions of Democracy  The principles defining democratic thought would seem to be liberty or the freedom to determine one's own life course and to maximize one's own good, tempered by rational and moral constraints; equality or "justice as regularity" (Rawls 504); and fraternity or brotherhood; a harmony of interests for mutual benefit; one-for-all and all-for-one; or an expression of the feeling of ". . . not wanting to have greater advantages unless this is to the benefit of others who are less well off" (Rawls 105). John Rawls has proposed some intriguing guidelines whereby the morality of the allotment of rights and resources by democratic egalitarian institutions could be judged (to see why he recommends democratic egalitarian over natural libertarian, liberal egalitarian and natural aristocratic institutions see Rawls, Section 12). These guidelines or principles attempt to ensure a more genuine equality of opportunity for all through providing protection to those possessing the least resources with which to take advantage of these opportunities. Another popular vision of democracy emphasizes liberty and equality, advocating a laissez-faire pursuit among equals for the possession and maximization of resources. The rationale for this utilitarian pursuit is that the inherent competition will result in the maximization of resources for the benefit of all (an idiosyncratic - perhaps even paternalistic - way of expressing 'fraternity'). This vision is, however, also related to the philosophy of social Darwinism, in which the survival of the fittest humans (and human groups) is thought to result in the eventual bettering of the human species. The fallacy inherent in both of these visions is that, once an allotment of resources has been achieved, the players are no longer equal: those who have achieved more (and their progeny as well) possess a greater advantage in further pursuits to maximize those resources. 
The human species is not necessarily 'bettered' by the survival of these more successful members since their 'advantage' may not be attributable to inherent (i.e.  Reitz 21  hereditary) individual qualities, but simply, as Rawls notes, to the resources possessed or inherited: . . . the institutions of society favor certain starting places over others. These are especially deep inequalities. Not only are they pervasive, but they affect men's initial chances in life; yet they cannot possibly be justified by an appeal to the notions of merit or desert. It is these inequalities, presumably inevitable in the basic structure of any society, to which the principles of social justice must in the first instance apply (Rawls 7).  2.2  Rawls' Principles  2.2.1  General Conception of Justice  Rawls, then, advocates providing a 'handicap' to at least partially redress the above inequality, thus enabling those who possess significantly fewer resources to compete on a more equal footing. These varied resources are considered by Rawls to be 'goods' (sometimes he refers to them as 'values'), as in his General Conception of Justice which states that: All social primary goods - liberty and opportunity, income and wealth, and the bases of self-respect - are to be distributed equally unless an unequal distribution of any or all of these goods is to the advantage of the least favoured (303). Rawls particularly emphasizes self-respect not only as a good in itself but also as a prime resource upon which the ability to acquire further resources is dependent (440). In Rawls' First Principle of Justice, he deals with liberty and equality: Each person is to have an equal right to the most extensive total system of equal basic liberties compatible with a similar system of liberty for all (302).  Reitz 22  2.2.2  The Second Principle of Justice (the 'Difference' Principle)  Fraternity or 'caring' has been described by Steven R. 
Covey (4) as a 'superordinate' or higher middle position transcending the divergent values of liberty and equality. Rawls, in a similar vein, promotes fraternity in a Second Principle of Justice whereby inequalities are only permissible when they are to the greatest benefit of the least advantaged: Social and economic inequalities are to be arranged so that they are both (a) to the greatest benefit of the least advantaged . . . and (b) attached to offices and positions open to all under conditions of fair equality (302). The rationale for social or economic inequality to the greatest benefit of the least advantaged is that it will eventually enrich the society as a whole (including those most advantaged). It is easy to misconstrue Rawls' meaning here. He does not mean that inequalities must benefit 'the least advantaged' more than other groups of people. Rather, he means that any inequality must benefit those who are the least advantaged (under this unequal distribution) more than they would be under a condition in which resources were distributed on a strictly equal basis (Coombs, email 10 Feb. 1997). For example, Rawls would probably see justification for an unequal distribution of educational resources which equipped disadvantaged students to participate more fully as equals within societal institutions; for an unequal distribution of such resources as higher grades, advanced degrees, or income which provided an incentive to those who better society (e.g. by creating jobs or healing the sick); or for an unequal distribution of rights to those who harm society (e.g. incarceration), so long as this inequality may be seen to benefit 'the least advantaged' groups to a greater degree than would a strictly equal distribution. 
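The comparison described above (the position of the least advantaged under an unequal arrangement versus their position under strictly equal shares) can be illustrated with a small numerical sketch. The Python below is an illustration only, not part of Rawls' or Coombs' argument: the function names and all of the figures are hypothetical, and it assumes, for the sake of the example, that each group's share of 'goods' can be reduced to a single number.

```python
def worst_off(distribution):
    """Return the smallest share received by any group (hypothetical units)."""
    return min(distribution.values())


def satisfies_difference_principle(unequal, equal_share):
    """Check whether the least-advantaged group under the unequal arrangement
    receives more than it would under a strictly equal distribution."""
    return worst_off(unequal) > equal_share


# Hypothetical figures: every group would receive 10 under strict equality.
equal_share = 10

# Arrangement A is unequal, but even the worst-off group (12) gains.
arrangement_a = {"group_1": 18, "group_2": 14, "group_3": 12}

# Arrangement B produces more in total, but the worst-off group (8) loses.
arrangement_b = {"group_1": 30, "group_2": 15, "group_3": 8}

print(satisfies_difference_principle(arrangement_a, equal_share))  # True
print(satisfies_difference_principle(arrangement_b, equal_share))  # False
```

In this sketch arrangement B has the larger total, yet it fails the test because its worst-off group falls below the equal share; the check looks only at the minimum, not at the sum or the mean.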
While the Second Principle of Justice does not explicitly acknowledge the inequalities students bring into the classroom, such as varying types of home, economic,  Reitz 23  and cultural backgrounds, intelligence, motivation, interests, and previously-attained skills, Rawls does acknowledge that in any society, some will naturally be more advantaged than others. He explains that this principle represents a social (contractual) "agreement to regard the distribution of natural talents as a common asset and to share in the benefits of this distribution . . . " (101). Rawls compares the Principle of Redress with the Second Principle of Justice, to which it is related. The Principle of Redress: is the principle that undeserved inequalities call for redress; and since inequalities of birth and natural endowment are undeserved, these inequalities are to be somehow compensated for . . . society must give more attention to those with fewer native assets and to those born into the less equal social positions. The idea is to redress the bias of contingencies in the direction of equality. In pursuit of this principle greater resources might be spent on the education of the less rather than the more intelligent, at least over a certain time of life, say the earlier years of school . . . (my emphases) (100-101). Rawls states very clearly that his Second Principle of Justice (the 'difference' principle) "is not . . . the principle of redress" (101) in that: It does not require society to try to even out handicaps as if all were expected to compete on a fair basis in the same race. But . . . (the difference principle) . . . would allocate resources in education, say, so as to improve the long-term expectations of the least favoured. If this end is attained by giving more attention to the better endowed, it is permissible; otherwise not. And in making this decision, the value of education should not be assessed solely in terms of economic efficiency and social welfare. 
Equally if not more important is the role of education in enabling a person to enjoy the culture of his society and to take part in its affairs (my note - thus democratizing society as well), and in this way to provide for each individual a secure sense of his own worth (my emphases) (101). Coombs warns against a possible misinterpretation of this principle which is "not about redressing disadvantages to presently disadvantaged groups, but rather about ensuring that whoever ends up disadvantaged by social arrangements would not be worse off than they would be if benefits were distributed on a strictly equal basis" (email, 7 Jan. 1997). Rawls adheres to the traditional convention of 'the veil of ignorance,' in  Reitz 24  which a social contract is agreed to in the beginning by people who are 'blind' as to which socio-economic position they might personally end up filling. Therefore, they would choose a contract in which all resources were distributed equally - unless it were clear that another arrangement maximizes "the minimum amount of goods anyone will receive, i.e. increase over what they would have under a policy that distributes goods equally" (Coombs, email 10 Feb. 1997). There is a limit, in other words, on redistribution - it should not end up replacing one disadvantaged group with another. Rawls' principles may be seen as a just compromise between the impulses to compete freely and to distribute fairly, tempering the excesses of each. In the material realm, of course, the extremes of 'pure' capitalism (free competition) and 'pure' socialism (fair distribution) come to mind. However, in promoting partial redress to the lesser-advantaged, Rawls is not advocating any particular economic system. He is not a Marxist; in his book (259) he considers Marxism only as an economic arrangement and as an ideal which, if carried out in its idealized form (fair distribution - a tautology for Marxism), could be 'beyond justice' (Rawls 281 refers to R.C. 
Tucker's The Marxian Revolutionary Idea chs. I and II). In addition, Rawls' principles are those of moral philosophy, not economics, and apply to the just distribution of both tangible and intangible resources. In short, to Rawls, economics should be guided by justice as well as utilitarianism, in that it should not be concerned only with the maximization of resources, but with their fair distribution as well. This is sometimes called a maximum/minimum (max-min) principle, in that "the minimum amount of primary goods any person will receive" is maximized (Coombs, email 10 Feb. 1997). Rawls' premise is that if participants in an institution are more equally endowed with its resources, the entire institution is more just than an institution exhibiting extreme variations of resource endowment among its participants. It should be noted that Rawls' prime aim is to promote justice rather than democracy; in this case the two happen to intersect.  Reitz 25  It is clear then, that Rawls, though neither a neo-Marxist nor a 'postmodernist' as such, is concerned with empowerment and the leveling of hierarchy. It is from a similar perspective that I wish to position myself and have attempted to conduct this project.  2.3  Two Questions About the Application of Rawls' Theories to Education  2.3.1  The Educationally Disadvantaged  One question came up: "Who, exactly, are 'the disadvantaged' in education?" James S. Coleman suggests that unequal ability, in addition to unequal (racial, cultural, familial, etc.) background, might be a natural inequality deserving of consideration when trying to equalize society (17). In the section of his theory where he advocates a partial application of the Principle of Redress, Rawls also deals with this question, suggesting the allotment of educational resources so as to improve "the long-term expectations of the least favoured . . . (which could include). . . those with fewer native assets . . . (such as). . . 
the less rather than the more intelligent" (100-101). Clearly, Rawls is suggesting here that ability is a valid criterion, along with race, culture, socio-economic class, and gender, to consider when assessing whether educational resources are being justly distributed. Kenneth A. Strike reaches the same point when he considers the goal of equal opportunity: "If we can assume that any social results are essentially a function of native ability plus opportunity (leaving aspirations out of consideration for the moment - see Strike's footnote #24), then when opportunity is equal the disadvantaged will be those who possess less native ability" ("Role" 7). He points out a problem with this, however: "If, then, we are to apply the difference principle to schooling, we will wish resources to be distributed and patterns of achievement to result (my emphasis) such that they are to the advantage of the less well endowed" (Strike "Role" 7). He proposes that only after  Reitz 26  inequalities in the achievement of minority and poor children disappear should this become a goal. (Interestingly, he modifies somewhat the manner in which these are priorized in 1983 - see thesis, page 47.) What Strike called 'patterns of achievement that are to the advantage of the less well endowed' may not mean they all have TVs. It is often to the benefit of the less advantaged not to limit the ways in which those naturally better endowed can advance themselves since the less advantaged, too, will share in positive ways with their achievements, regarding (as Rawls put it) "the distribution of natural talents as a common asset" (my emphases) (101). To this I would add the corollary that society, likewise, should regard the distribution of natural disabilities or intellectual inadequacies as a common problem of which all will share in the mitigation. 
We will revisit the issue of the potential achievement of 'the less-endowed' versus that of 'the better-endowed' academically in the section on mastery learning.  2.3.2  Generalizability of Rawls' Principles  Another question came up: "Just how globally can Rawls' criteria be applied?" Several educational evaluators (Coleman "Equality"; House, "Justice"; Strike, "Role") have debated this and could not agree on an answer. House was perhaps the first, in 1976 ("Justice"), to use Rawls' theories in a critique of modern (both qualitative and quantitative) curriculum evaluation practice, suggesting that evaluators should be concerned with their impact on individual subjects' self-esteem. Also, he advocated looking not only at mean benefits, but at their distribution among advantaged versus disadvantaged groups as well. (Note that in order for evaluators to do this, their experimental design will of necessity be quite different than it would if they were looking at the group as a whole - there must be at least one more variable such as family income, socio-economic status, a standardized test  Reitz 27  score, or some qualitative factor which could help categorize each subject according to how 'advantaged' they are.) Strike replied to House: "One takes these [Rawls'] rules intended for such global applications and applies them to specific situations only at one's peril and against the spirit of Rawls' views." Strike (incorrectly, I contend), interpreting Rawls' 'resources' as 'wealth', claimed that "Rawls' Second Principle governs basic institutions for distributing wealth (my emphasis) in a society. It does not per se govern the distribution of test scores" (Strike "Role" 5). Strike went on to temper this with the suggestion that test scores might have some relevance to Rawls' principles if one could show a causal connection between the distribution of test scores and "the distribution of social and economic benefits" (6). 
It should be noted that Rawls himself was first to point out the limitations of his theory: There is no reason to suppose ahead of time that the principles satisfactory for the basic structure hold for all cases. These principles may not work for the rules and practices of . . . less comprehensive social groups . . . Now admittedly the concept of the basic structure is somewhat vague. It is not always clear which institutions or features thereof should be included (Rawls 8-9). In my opinion, there are various levels on which an evaluator might use Rawls' theory. It seems logical to try to apply it in a variety of situations and see if it is of use or not. My particular research is at the level of a school, of course, and none of the students belong to a disadvantaged socio-economic group, so I am considering the 'academically' rather than the socio-economically disadvantaged. However, Rawls' theory does not particularly state the parameters of research involving his principles; it was left up to the researcher to determine whether the principles proved useful in the particular context.  Reitz 28  2.4  Summary of Use of the Notion of Justice-as-Fairness in Curriculum Evaluation  In conclusion, Rawls' theory of justice-as-fairness has given educational evaluators, who see the benefit of the new evaluation methods, but are troubled by the moral relativism that so often accompanies them, some new moral direction. Rawls' most relevant contributions to curriculum evaluation, I predict, will be this advice: (i.) to consider the effect of programmes and evaluation itself on individuals' self-esteem; (ii.) to look at the mean effects of programmes on each of the sub-groups affected, not only on the mean effect on the entire group; and (iii.) to ensure that the programme is of benefit to all, including the least-advantaged sub-group. 
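Points (ii.) and (iii.) above can be illustrated with a minimal sketch of what it means to report sub-group means alongside the overall mean. The Python below is an assumption-laden illustration rather than a method drawn from Rawls or House: the sub-group labels, the idea of a single numeric 'gain' per student, and all of the figures are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_gain_by_group(records):
    """Average the (hypothetical) gain scores separately for each sub-group."""
    groups = defaultdict(list)
    for group, gain in records:
        groups[group].append(gain)
    return {group: mean(gains) for group, gains in groups.items()}


# Hypothetical records: (sub-group label, gain attributed to a programme).
records = [
    ("advantaged", 12),
    ("advantaged", 10),
    ("disadvantaged", 2),
    ("disadvantaged", -2),
]

overall = mean(gain for _, gain in records)
by_group = mean_gain_by_group(records)

print(overall)   # 5.5
print(by_group)  # {'advantaged': 11, 'disadvantaged': 0}
```

Here the overall mean suggests a clear benefit, while the sub-group means show that the least-advantaged group, on average, gained nothing. Producing such a breakdown requires the additional categorizing variable for each subject that is noted earlier in this section.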
His theory, which some might perceive as an elaborate re-statement of the 'Golden Rule,' is by no means a final solution to the relativist dilemma. However, with luck, it may provide the mental scaffolding from which a future moral philosopher will be able to construct something which more closely approximates a solution.  Reitz 29  3.0  CHANGING METHODS OF CURRICULUM EVALUATION  3.1  Qualitative/Quantitative Dualism  3.1.1  Qualitative and Quantitative Defined  During the twentieth century, logical positivism, the social efficiency movement, and behaviourist psychology were to have a great influence on curriculum research and evaluation. In each of these movements, research done by the use of quantitative methods was strongly preferred. As noted previously, while quantitative methods continue to be widely used in curriculum research and evaluation, qualitative methods (largely as a reaction against the misuse of quantitative methods - in education as well as other disciplines) have been gaining in popularity. How exactly are 'qualitative' and 'quantitative' differentiated in a strictly dualistic sense? Perhaps the easiest explanation is that quantitative methods of evaluation use a generally-accepted standard of measurement and measure a phenomenon according to the standard in such a way that the measurement could feasibly be replicated by others who would, under the same conditions, obtain the same results. These results could then, if desired, be analysed statistically. Qualitative methods, quite simply, are those which do not conform to this pattern. Qualitative methods in curriculum evaluation range from simple 'goal-free' and 'instrument-free' observations to interviews and autobiographical and phenomenological exploration. They tend to look in-depth at one or a small number of subjects or phenomena and seek to construct meanings which are mutually-agreed-upon by researcher and subject. 
There is no sense in referring to the 'quantitative method' or the 'qualitative method'; each is a group of many different methods. Even more so, two studies which both involve 'a combination of quantitative and qualitative methods' can be extremely different, utilizing vastly differing methods. Lynne Miller and Ann Lieberman (12), in an otherwise excellent article, ignore this distinction when they state that "both the Rand and DESSI studies depend on a combination of qualitative and quantitative measures, so we cannot attribute the difference in findings to difference in method." On the contrary, the different results may be due to very divergent methods.

3.1.2  Strengths and Weaknesses of Each Group of Methods

Each group of methods has inherent weaknesses detracting from its usefulness to curriculum evaluators. Quantitative methods focus so completely on one factor that they often distract researchers from other important, but harder (or impossible) to measure elements (such as internal, personal experiences). They also prevent researchers from seeing unanticipated, and so unmeasured (and undetected), consequences. I have detailed many other problems with quantitative methods in the Introduction, but wish to dwell in more detail on the difficulties with qualitative methods at this point.

Qualitative methods can give such mixed results that it is difficult to know what use to make of the data. Qualitative methods also tend to lack replicability. In their focus on one or a small number of subjects in depth, they may end up with an atypical rather than a typical subject selection. With qualitative methods of curriculum (and teacher) evaluation, there are even more subtle pitfalls:

Autobiography is a form of 'self-reporting' and 'self-assessment' which is sometimes used as part of an accreditation or a curricular evaluation process. F. Michael Connelly and D. Jean Clandinin (141)
warn against narcissistic tendencies and Hollywood-style happy endings into which autobiography can fall if the work is not (intersubjectively) tempered by collaboration with others.

Peer-assessment also contains some potential pitfalls. William Pinar questions whether the attitude of empathy many postmodernists assume when assessing peers (and others) might not serve, at times, to conceal or rationalize more than it reveals. Pinar, et al. note that empathy involves mentally participating in the intentions of others, "intentions which can function as self-rationalizing, self-forgiving, indeed self-deceiving ideas. Empathizing with another . . . might lead to collusion" (583).

Finally, Madeline Grumet warns that 'teacher' narratives can lapse into an impotent moral relativism: a "failure to engage in some analysis . . . beyond celebration and recapitulation (which) leads to a patronizing sentimentality (consigning) the teacher's tale to myth, resonant but marginal because it is not part of the discourse that justifies real action" (324). Relativism, in its openness to many possible interpretations of 'truth' and 'the good', can open many previously-closed doors to the mind. However, it can also, in its inability to clearly show a 'correct' course, obscure routes which were previously open and led to action.

Each group of methods also has its strengths. Quantitative methods can reveal trends that casual observers would not detect, prove causation, and clearly confirm or refute hypotheses. Because they are replicable, they are not as easily subject to observers' (conscious or unconscious) personal whims, moods, and prejudices. Qualitative methods, however, can point out trends that the researcher had never known to exist nor thought to measure, or suggest questions that the researcher had never thought to ask. They help people to form hypotheses. They can reveal personal, internal points of view such as motivation and reasoning as well.
3.2  A Combination of Qualitative and Quantitative Methods in Curriculum Evaluation

3.2.1  Dualistic Nature Refuted

Until recently, these two groups of methods have been seen as antithetical dualisms, mutually unmixable, like water and oil, or (arguably) like science and religion, predicated on two such different world views that the conclusions using one perspective could not in any way be supportive of or comparable to those using the other perspective. One advocated either one or the other, never both.

There has been a regrettable tendency, as Kenneth R. Howe points out, to retain the "rigid epistemological distinctions between quantitative and qualitative methods" (10). He claims that this dualism, which forces researchers to choose between 'value-laden' qualitative and 'value-free' (descriptive) quantitative methods, is a legacy of positivist dogma which should be discarded. Howe and others claim that objectivity is a myth: bias exists in quantitative methods as well as qualitative ones, in that choices such as what to measure, what instruments and standards to use, and what statistical analyses to employ involve very value-laden decisions. To pretend otherwise only adds to the 'hidden' bias.

Many curriculum evaluation methods contain both quantitative and qualitative elements. For example, the results of questionnaires and textual analyses both describe and tabulate (often large) numbers of 'intersubjectivities'. These tabulations, though of 'subjective' inner experiences and personal opinions, can easily be replicated if the number of subjects is great enough. As in the Indian folk story The Blind Men and the Elephant (where 'the truth' was found by synthesizing a number of quite differing intersubjectivities), the 'objective' nature of 'multiple intersubjectivities' is becoming recognized.
In my mind, intersubjectivity can be seen as a synthesis of multiple personal perceptions (perceptions moving from the subjective and personal towards the objective and impersonal), an empirical position defying the dualism of subject/object. In other words, a description of the elephant issued by several blind men will be more accurate than that issued by one blind man, and is more accurate, in some ways, than that provided by an 'objective' (but only two-dimensional) photograph.

Elliot Eisner (Art 252) has suggested that standardized tests and other quantitative data can be used to supplement the qualitative methods he has developed. But how should one go about combining the results? The notion of 'triangulation' (collecting data from more than one source about the same event or behaviour) has often been applied to combining more than one type of qualitative approach or investigating a phenomenon from the point of view of more than one stakeholder group. However, Louise H. Kidder and Michelle Fine show how triangulation can also result from combining data from qualitative and quantitative studies. This is easier, they claim, if both studies are clearly trying to investigate the same hypothesis, and if the qualitative study is not so fluid that the questions the researcher asks vary from subject to subject or over time.

For a far more detailed discussion of these topics, see Worthen and Sanders; Mark and Shotland; Kidder and Fine; Howe; Miller and Lieberman; Madaus and Kellaghan; Cook and Reichardt; Madey; and Stone, all of whom conclude that the time has come to stop thinking of qualitative and quantitative methods as mutually antagonistic and instead recognize them as complementary, mutually supportive, and best used in combination with one another.

3.2.2  Benefits of Combining Methods

The three most widely-mentioned benefits of combining qualitative and quantitative methods are (i.)
that each is strong where the other is weak; thus, they fill in each other's gaps, complementing one another and strengthening the research, (ii.) that when they support one another, the results are strengthened as well, and (iii.) that when they contradict one another, both results are called into question; in this case an explanation for the contradiction (possibly requiring further research) is called for.

3.3  Application of this Perspective to my Thesis

In my thesis, I hope to follow the suggestion of Blaine R. Worthen and James R. Sanders that instead of expending energy on debating the relative merits of qualitative versus quantitative methods, scholars' and practitioners' energy would "be more productively channeled into conceptualizing and testing procedures for effective integration of quantitative and qualitative methodologies, an area in which there is still very little guidance" (53). I will conduct one primarily qualitative and one primarily quantitative study, each of which tries to answer the same general question about just distribution of educational resources. I will analyze the results each study contributes, alone and in combination with the other, to see how (and whether) they complement, mutually support, contradict, or inform one another. I hope to be able to verify or refute some of the claims others have made about combining qualitative and quantitative methods, and perhaps even contribute some additional observations as well.

Please note, however, that this is not primarily a methodological study. While I hope to demonstrate triangulation of qualitative and quantitative data in curriculum evaluation, my primary purpose is to reach the best determination of the justice of a particular distribution of educational resources, not to prove that qualitative/quantitative triangulation 'works.'
An assumption of my study is that, if conducted properly, this type of research can combine the best aspects of both quantitative and qualitative methods.

4.0  SUMMARY OF ISSUES RELATED SPECIFICALLY TO THE RESEARCH SITE

Instead of proceeding directly to the research itself, I have chosen to situate the research in a particular educational context. This is because the research site is a rather atypical Canadian liberal arts college, though those who teach adult-basic-education or English-as-a-Second-Language students may find frequent parallels with their concerns. I am writing this section not for my own institutional colleagues, to whom this context is intimately familiar, nor for those who simply wish to follow the philosophical logic of my argument. This is background information for readers who want to know more clearly what sort of school, curriculum, teachers, and students we will be looking at, and what issues concern someone evaluating whether educational resources are being justly distributed through the curriculum currently in use at this particular school.

Three issues specific to my institutional setting will be summarized. First, the curriculum and evaluation methods used at my school will be situated within a mastery-learning framework, along with a short descriptive and analytical review of mastery learning. Next, the very different purposes and results of 'ability-leveling' and 'tracking' will be clarified. Finally, Japanese curricular, evaluation, and justice concerns which affect programmes and their delivery at my college will be summarized.

4.1  Mastery Learning

Mastery learning, an outcome-based curriculum model, is used at our institution for the first four ('Foundation') levels of listening, speaking, reading, grammar, and composition.
The Foundation levels, while not the sole focus of my study, are the primary focus since all but the highest entry-level students spend most of their time in this programme during their first year at the college. Let's look briefly at mastery learning's history, philosophy, and experimental base and at how it is actually practised today. The various controversies surrounding mastery learning, particularly the issues which affect our institution, will be summarized.

4.1.1  Benjamin Bloom's Suggestion

The basic premise of mastery learning is that given enough time, practically anyone can learn practically anything. The idea has been traced back as far as John Comenius' seminal Pampaedia in the seventeenth century and probably even further (Block, Mastery; Guskey), but began its most recent incarnation in 1968 when Benjamin Bloom saw the implications of a conceptual model of learning propounded by John B. Carroll. Carroll's model posits the degree of learning to be a function of the time actually spent relative to the time needed. Time needed is seen as a function of (1) an individual's aptitude, (2) the quality of instruction (e.g. materials and methods), and (3) the individual's ability to understand the instruction (e.g. matches between student and teacher language and between learning and teaching style, plus affective factors).

Allocation of time (for learning and teaching) as well as (2) and (3) above could, reasoned Bloom, be manipulated by the school to ensure that all students succeeded in mastering the basic material to be learned. Summative tests would be given at frequent intervals. Those who didn't reach the mastery standard would re-study and then re-take an alternative form of the summative exam. They wouldn't be penalized for having taken additional time to learn, nor for mistakes made while learning.
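Carroll's relationship between time spent and time needed can be illustrated with a short numerical sketch. The text above says only that degree of learning is a function of time spent relative to time needed, and that time needed depends on aptitude, quality of instruction, and ability to understand the instruction. The particular functional forms below (a simple ratio capped at 1.0, and a multiplicative combination of the three factors, each scaled from 0 to 1) are assumptions made purely for illustration, as are the function and parameter names; they are not taken from Carroll or Bloom.

```python
def time_needed(base_hours, aptitude, instruction_quality, comprehension):
    """Hours a learner needs to master a unit.

    ASSUMPTION: each factor is scaled so that 1.0 is ideal and smaller
    values lengthen the time needed; combining them by multiplication
    is an illustrative choice only.
    """
    factors = {
        "aptitude": aptitude,
        "instruction_quality": instruction_quality,
        "comprehension": comprehension,
    }
    for name, value in factors.items():
        if not 0 < value <= 1:
            raise ValueError(f"{name} must be in the range (0, 1]")
    return base_hours / (aptitude * instruction_quality * comprehension)


def degree_of_learning(hours_spent, hours_needed):
    """Ratio of time spent to time needed, capped at 1.0 (full mastery).

    ASSUMPTION: the source specifies only 'a function of' this ratio;
    the capped ratio itself is the simplest such function.
    """
    return min(1.0, hours_spent / hours_needed)


# Two hypothetical learners studying the same ten-hour unit.
fast_learner_hours = time_needed(10, 1.0, 1.0, 1.0)
slow_learner_hours = time_needed(10, 0.5, 1.0, 1.0)

# With uniform time, the slower learner reaches only partial mastery.
partial = degree_of_learning(10, slow_learner_hours)
# With the additional time mastery learning allows, mastery is reached.
full = degree_of_learning(20, slow_learner_hours)
```

Under these assumptions, holding instructional time constant produces unequal outcomes, while letting time vary with need equalizes them, which is the trade-off Bloom proposed that schools manipulate.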
The goal of mastery learning advocates is not simply to maximize learning through raising test scores, a purely utilitarian goal, but to better the educational achievements of those, in particular, among the least advantaged in the educational hierarchy. Echoing John Rawls, Block, et al. (220) proclaim, ". . . we believe in equity in terms of student learning outcomes, not in terms of student learning opportunities (my note: inputs). Indeed, to attain outcome equity we are willing to provide unequal treatment in terms of learning opportunities and learning time for some students and especially for those who historically have been the 'have-nots' in the teaching-learning process." While few would argue with the sentiment, there is great debate as to how to define 'equity in learning outcomes,' whether it is (or to what degree it is) actually possible to attain, and what the costs would be (including detrimental effects it might have on the learning of those 'more academically endowed').

4.1.2  Organization of Time versus Outcome Distribution

In mastery classrooms, time may be organized in one of three basic ways: (1) Instruction can be completely individualized as in many adult basic and computer-assisted instructional programmes. (2) Alternatively, various levels of achievement may be offered in different (homogeneous) classrooms and students may repeat, advance, or skip levels every few weeks as deemed appropriate (used in many language institutes and skill-based courses such as ballet). (3) In the typical heterogeneous classroom, however, students are not segregated according to ability or achievement as in the former two cases. Rather, the students who reach the mastery standard earlier engage in enrichment activities while their slower classmates re-study the materials and eventually re-take the exam.
While Bloom agrees that aptitude (which he would call 'learning rate') is distributed normally among students and that given uniform instruction (including equal time), their achievement (outcome) is also distributed normally, he claims that given optimal instruction (including optimal time) based on individual needs, most students can achieve mastery of most desired learning outcomes.

While the mastery standard at our school is only 80%, Block, et al. consider 'mastery' to be a minimum achievement of 85% to 95%. This score must be attained on summative tests which are criterion-referenced to the objectives of the curriculum. Ideally, the standard is high enough to ensure that desired learning has occurred, but not so high that mastery is perceived as unattainable. In mastery learning's strictest form, students' achievement is evaluated on what, in the end, they have actually learned as shown on a summative test given about every two weeks or so, not by their classwork, marks on formative tests, effort, nor time required to master the material. (Note: in our school, marks on formative tests are now given some weight as well.)

For an exhaustive and research-based analysis of the relationship between time, ability, equality, achievement, and mastery learning see University of British Columbia's Dr. Marshall Arlin. Interestingly, while Arlin supports many of the claims of mastery learning advocates, he notes that their greatest weakness may be their tendency to 'hyperrationalize' (a term he credits to A. E. Wise): to "persist in rationalizing policy decisions that overlook means-ends relationships" (Arlin 81). He implies that the more zealous mastery learning advocates tend to minimize research which casts doubt on their claims that mastery learning does not result in slower learning for the 'faster' students in heterogeneous classrooms.
Though our college uses mastery learning only in its homogeneous (Foundation level) classrooms, this issue (also called the 'Robin Hood' effect, first by Arlin and later by Slavin) will surface in my research. The following two tables summarize the characteristics of mastery learning in general (Table 1) and of the three quite different ways it can be delivered (Table 2).

Table 1
Summary of the General Characteristics Defining Mastery Learning:

  - Instructional objectives are well-defined and appropriately sequenced
  - Student learning is checked regularly and frequently and immediate feedback given
  - Standards are criterion-referenced rather than norm-referenced
  - A criterion level of performance is held to represent 'mastery' of a given skill or concept
  - Corrective instruction is given to enable students who do not initially meet the mastery criteria to do so on later, parallel assessments
  - Time and resources are organized to ensure most students are able to master the instructional objectives
  - Students are not penalized for mistakes on 'formative' tests

Sources: Guskey; Slavin; and Block, et al.

Table 2
Characteristics Defining the Three Types of Mastery Learning:

TYPE | Instructional Basis | Who Paces Instruction | Instructional Time | Curriculum
Heterogeneous Classroom, Year-Long Programme | Group | Teacher | Relatively fixed | Relatively fixed
Individualized Programme | Individual | Student | Variable | Variable
Homogeneous Classroom, Modular Programme (Leveled) | Group | Teacher, within module; Student, at module end | Fixed, within module; Variable, at module end | Fixed

(NOTE: Student can repeat, advance or skip a level at module end)

4.1.3  Grades and Mastery Learning

Bloom, while questioning the premise that grades and their sorting function are a necessary component of learning, concedes that they are socially expected and are an integral part of our culture.
He advises that if grades are to be assigned in a mastery learning school, they should reflect students' actual learning rather than their standing in relationship to others and should not penalize some students for taking longer than others to learn. In mastery learning, 'failure' is not an option. Because school policies differ so greatly, several possible ways of assigning grades in mastery-learning classes have been devised:

Many individualized and 'leveled-group' mastery-learning situations grade simply with two grades: 'A' or 'Incomplete'. The 'Incomplete' changes to an 'A' when the mastery standard is eventually met. A variation of this is to distinguish among 'passing' grades in order to give an extrinsic incentive to do more than the minimum required (e.g. 85% = C, 90% = B, 95% = A, etc.). The grading system used at our school is based on this variation, and is described in more detail in Appendix 8.3.

Another possibility, a form of 'criterion referencing', is to assign grades based on the number of course goals mastered at a particular point in time (e.g. 70% of course goals mastered = C, 80% of course goals mastered = B, etc.). Yet another possibility, advocated by Champlin, is the 'open transcript' concept, whereby students are allowed to demonstrate and receive credit for achievement of specific units whenever they are achieved in the student's academic career.

Often a series of summative tests is given during a reporting period (all but the last of these might be thought of as 'formative', depending on how often they are given). Block, et al., advocate giving more weight to the final summative test, which is to include questions from the previous units, because it is more representative of the students' holistic learning and retention. In our school, formative ('progress') tests are given every one to two weeks, and a summative 'mastery' exam every five to six weeks at the end of a 'learning module'.
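The first two grading schemes described above can be sketched in a few lines of code. This is only an illustration: the cut-offs are the example percentages quoted in the text (85% = C, 90% = B, 95% = A), not the values actually used at our school (for those, see Appendix 8.3), and the function names and the default two-grade standard of 85% are hypothetical choices made for the sketch.

```python
# Example cut-offs quoted in the text, highest first. ASSUMPTION: these
# are illustrative values only, not any school's actual grading policy.
EXAMPLE_CUTOFFS = [(95, "A"), (90, "B"), (85, "C")]


def two_grade(percent, mastery_standard=85):
    """Two-grade scheme: 'A' once the mastery standard is met, otherwise
    'Incomplete'. The default standard of 85 is a hypothetical value."""
    return "A" if percent >= mastery_standard else "Incomplete"


def graded_pass(percent, cutoffs=EXAMPLE_CUTOFFS):
    """Variation that distinguishes among passing grades. A score below
    the lowest cut-off is 'Incomplete' rather than a failing grade."""
    for minimum, letter in cutoffs:
        if percent >= minimum:
            return letter
    return "Incomplete"


# A hypothetical student who scores 78% and then 92% on a parallel exam.
first_attempt = graded_pass(78)
second_attempt = graded_pass(92)
```

In both schemes the lowest possible outcome is 'Incomplete', which reflects the principle that students are not penalized for needing more time.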
The end of each module is the point at which students can change classes, depending on whether they repeat, advance, or skip a level. The bottom line, of course, is that in mastery learning when the student leaves a particular classroom, grade, or institution, the transcript should indicate what the student has learned. With typing, it could show 'words per minute' and percentage accuracy; in language learning, it could show the highest 'level' achieved. (In some other subjects, writing descriptions of student learning could be far more challenging, or anecdotal, for instructors.) In our institution this is not a big problem because only the first four levels ('Foundation') involve the mastery grading system. Students spend from one to three-and-a-half years beyond this at our institution, being graded under a more traditional system. The registrar has devised a means of integrating grades from the two systems into one final grade-point-average (see Appendix 8.3).

4.1.4  Mastery Learning and the Slower Student

If mastery learning simply resulted in moving the normal curve of achievement in a heterogeneous class intact, but to the right on a percentage scale, the benefits to the less-advantaged students would be a sense of greater accomplishment and greater self-esteem, which could enhance self-efficacy and therefore lead to greater future achievement: 85% 'feels better' than 50%. However, Block, et al., claim that the curve does not move intact. Rather, mastery learners' rate and achievement of learning become less spread out and bulge far to the right rather than in the middle. Far more students become high-achievers; those few remaining on the left, while recognizing that others may be achieving more than they, still benefit through mastery learning's greater allocation of teacher time plus more time to meet learning goals, and resulting enhanced achievement and self-esteem. The cycle of failure for them becomes 'short-circuited'. While some possible questions remain (e.g.
'What about the students who simply cannot keep up with the group, even with extensive extra time?'), advocates and critics alike tend to agree mastery learning is generally beneficial to disadvantaged (low-ability) students' achievement and self-esteem.

The claim that achievement under a mastery system becomes less spread out as slower students 'catch up' to their faster peers is more likely to hold for heterogeneous classrooms. In individualized mastery instruction, as Arlin (72) points out, the spread among students tends to become greater, as 'fast' students are allowed to advance far more quickly than they would have been able in a heterogeneous classroom and 'slower' students are allowed to take the time they need to truly master the materials. In our five-module 'levels' system, we start the year with three levels (one to three), but usually end the year with five levels (three to seven), resembling the phenomenon found by Arlin.

4.1.5  Mastery Learning and the Faster Student

Here is where opinion diverges greatly, especially in regard to mastery learning's effect on high-ability students in a heterogeneous classroom. Some claim mastery learning results in teachers lowering the test ceiling (and with it, standards of success), simplifying the curriculum (neglecting critical-thinking and problem-solving, while emphasizing 'basic-skills') and spending more time with lower-ability students so that they can pass. Meanwhile, they fear, higher-ability students are being held back from progressing and deprived of challenge while waiting for their slower peers to 'catch up'. This is referred to by Arlin (68) as mastery learning's 'Robin Hood' effect which robs high-ability students of teacher time while giving it to the low-ability students. Nevertheless, others such as Block, et al.
cite research (Chan; Conner, et al.; Fitzpatrick and Charters) which seems to substantiate Bloom's proposal that mastery learning would lead to maximized learning for all without having a detrimental effect on faster-learning students. They noted no negative effects on critical thinking, problem-solving, or retention over time. The verdict is not yet in, however.

The use of mastery learning is less controversial with individualized instruction or with homogeneous, leveled classrooms such as are found at my institution (Slavin 206). In these situations, low-ability students would be expected to benefit from the high mastery-standard and the additional time provided for them to master course goals without penalty. On the other hand, the progress of higher-ability students should not, theoretically, be affected as they are not expected to wait for their slower peers to catch up. These two assumptions will be somewhat challenged at points during my research.

4.1.6  Mastery Learning and the Language Teacher

Some critics of mastery learning and 'outcomes-based-education' (or OBE, which some people prefer as a name for the various curricular reforms that grew out of mastery learning) claim that its behaviourist-inspired emphasis on breaking topics down into small skills can backfire on the teacher, especially the foreign language teacher. Gretchen Schwarz and Lee Ann Cavener (336) claim that in particular, "English . . . is a discipline that is not organized in a cumulative, sequential, linear fashion . . . The behaviourist idea of breaking down learning into bits that must be mastered before a student can go on does not work well in English. The discipline itself is more holistic, recursive, and process-oriented, to say nothing of the various ways in which students learn." The sentiment expressed here is one which is hotly debated at our college, as will be seen in the qualitative research section particularly.
They also criticise outcomes-based-education for its bureaucratic emphasis, ignoring what Dewey would call 'teacher professionalism': "Although OBE advocates claim that OBE liberates teachers, the emphasis on standardization and accountability . . . keeps teachers voiceless, yet responsible for the results . . . " (my emphasis) (Schwarz and Cavener 325). While standardization and accountability are not intrinsic to mastery learning philosophy, mastery learning does fit neatly into the 'vertical' methods described in the Introduction and, as will be shown, was in part selected for use at our school in order to ensure some standardization and accountability.

4.2  Use of Ability-Grouping ('Leveling') versus 'Tracking'

4.2.1  Rationale for 'Tracking'

This is a most important issue at many foreign language institutions, including ours, which are philosophically opposed to 'tracking' but find it pedagogically useful to place students in at least some of their classes according to ability levels. James S. Coleman and Kenneth A. Strike have both grappled with this distinction.

Coleman ("Concept") showed how, over time, North American secondary schools diversified from providing only one (academic) track to offering a second, vocational track. This diversification was thought to promote a greater choice of opportunities to young people as a group. Free secondary academic or vocational education would then, it was reasoned, be available and relevant to the children of all classes. However, as Coleman pointed out, it has come to present quite the opposite dilemma on the level of an individual student, who, through achievement testing or personal choice, is 'tracked' or assigned "to a curriculum . . . (which) . . . closes off for that child the opportunity to attend college" (7). "(This) . . . assignment of a child to a specific curriculum implies acceptance of the concept of equality which takes futures as given" (10).
Coleman, therefore, was quite concerned with the inequality of opportunity tracking can confer on individuals. I will refer to this practice of placing an individual student onto a curriculum which forecloses significant future possibilities as 'tracking'.

4.2.2  A Sample Case: Elementary School Reading Groups - Tracking or Ability Grouping?

Reading groups at the elementary school level have often been criticised because they are assumed to be the first stage of tracking in an educational 'meritocracy' which continues to significantly influence one's life chances beyond graduation. Kenneth A. Strike ("Fairness") attempted to analyse this phenomenon from a justice point of view. As 'meritocratic', Strike characterized practices such as adherence to strict medical school entrance standards, which result "in the distribution of some desired but scarce benefit to those who deserve it. Meritocratic selection is often thought to be justified in that it results in an efficient distribution of scarce resources to the benefit of all" (my emphases) (127). Critics of ability grouping have questioned "whether what is ostensibly a meritocratic decision is, in fact, based on merit or . . . whether, once students are grouped, instructional time is equally divided" (134). Strike, on the other hand, contends that these questions are irrelevant since ability-grouping, at least at the elementary level, should not be concerned with merit, but with such non-meritocratic criteria as the child's personal needs and ability to profit.

Strike takes issue with those who claim being placed in a particular ability group can negatively affect one's self-respect. He contends that if so, this is an unfortunate "consequence of the fact that ability grouping is so often assumed to be part of a meritocratic selection process . . .
" (133) and that it can, in fact, enhance esteem, by giving students a greater opportunity to excel, since they are placed in what is truly the best situation for them personally to learn. Although Strike is specifically speaking of elementary school ability-grouping, his rationale for it is the same as that used at my school for 'leveling' students into ability-grouped classes.  4.2.3  Equality of Opportunity and Tracking  Strike implies he sees no great problem with a meritocracy or tracking ires secondary school when he states, "I do not believe that the fair value of liberty requires substantial equality beyond the point of minimal competence . . . (It) does not require that  Reitz 47  everyone be held to a lowest common denominator of competence. It does, however, require that expertise be equitably distributed. What threatens the value of liberty is a monopoly (my note: by one societal group) on expertise" (133). Strike supports inequality which most benefits first, those individuals naturally less-endowed intellectually and second, the less advantaged (cultural/racial/socioeconomic/ gender/religious/etc.) groups. The goal in each case is different, though. The first case focuses on minimal competence on an individual level. The second focuses on equal distribution of competence (outcome distribution) among groups. In summary, Strike supports ability grouping at the elementary level as long as the purpose is to further the two above goals, and providing that it does not serve, nor is it perceived as, the first stage of a meritocratic selection process. Adding a prerequisite of minimal competence on an individual level to the goal of equal outcome distribution among groups may be seen as a useful caveat protecting the ideal of equality against its nemesis, mediocrity. I am reminded^here of an observation by Dr. Martin Luther King, Jr. 
(147) who stated that compensatory programs were necessary because "It is obvious that if a man is entered at the starting line in a race three hundred years after another man, the first would have to perform some impossible feat in order to catch up with his fellow runner." King is obviously not referring to an individual African-American, because individual African-Americans have certainly 'caught up with' and surpassed most Euro-Americans in every way. He is referring here to African-Americans as a group, whose outcome distribution he wants to approximate that of other Americans. However, Strike would include a goal of minimal competence on the individual level in addition to that of equal outcome distribution among (in this case, racial) groups.

4.2.4  Leveling and Tracking

In describing the program we use at our school, a former Japanese board member also uses a racetrack analogy:

One of the characteristics is that there are two categories among the levels. Levels One to Four are called 'Foundation' and Levels Five to Seven are called 'Transition'. There are huge differences in concept between Foundation and Transition. Foundation is like, for example, going to driver's training school. If students have developed a skill, they can move on to the next level. If not, they repeat the level. . . Students only can repeat levels up to Level Four. Because each student has a different speed of skill learning, one student will learn the skill of Level Three within two modules when the other student learns it in one module. This 'learning speed' is different for each individual, and 'faster' does not necessarily mean 'better'. This difference is like some people can naturally run faster than others; however, if the slower runners have proper training, they can run faster. This training is like Levels One to Four. Once students are in Level Five, 'learning speed' is not an issue anymore.
After they complete Level Four, all students can run faster than a certain speed (my note: Strike's 'minimal competence'). Of course there are faster runners and slower runners, but every single student can run at least 100 meters for 15 seconds or faster, and that is the minimum they will need to pass classes in Transition.

In my opinion, one of the more interesting issues at my institution has been this use of 'leveling' according to ability. I contend that tracking (which is often called 'specialization' when the various tracks are valued equally by society), even at the post-secondary level, should be avoided as long as possible. Any leveling or ability-grouping should be along the lines of going through a series of 'pre-requisites' which advance one to a desired goal rather than foreclosing advancement to higher levels. Everyone is seen as climbing the same ladder, though people climb it at different speeds and, if they are tall enough, might even be able to start climbing it several rungs above the bottom. This is what one sees in ballet or music lessons, or in schools such as ours. Unless it forecloses future choices, this is not tracking, but ability-grouping. Of course, there inevitably comes a point at which a person must decide their future course - be that in choosing a 'major', deciding to study language 'X' instead of language 'Y', or in deciding to pursue a particular trades certificate - all of which do, in fact, take one 'out of the GENERAL race' and put one on a SPECIFIC 'track'. The point is to put this 'fork in the road' as far off as possible so as to provide the greatest possible selection of opportunities to the greatest number of young people - to come as close as possible to the mirage we call 'equality of opportunity'.

4.3  Relevant Japanese Curricular, Evaluation, and Justice Issues

In Japan, too, there are both horizontal and vertical controls guiding people's behaviour.
However, both types of control are far better established and taken far more for granted than in Canada. Japan sees itself as an egalitarian, homogeneous society in many ways and takes great pains to discourage individuality in the name of group harmony. Yet, its strictly-regulated meritocracy determines to a large degree what individual Japanese people do and how they relate to each other. While my western mind-set automatically sees this as a dichotomy, the Confucian mind-set sees these as complementary aspects of the harmonious way in which people are meant to live. Perhaps harmony can be seen as a 'superordinate' position uniting the two divergent Confucian values of merit and equality, much as fraternity (thesis, page 22) can be seen as bridging the gulf between liberty and equality in democratic thought. Accordingly, the first two philosophies, egalitarianism and meritocracy, hold sway respectively at the elementary and secondary levels. By the time students reach the post-secondary level, they have successfully learned how to meld the two; they will have a strong sense of loyalty to whatever group they join, while showing an equally high degree of respect for those in positions of authority.

4.3.1  Horizontal, Egalitarian Controls

'Horizontal' controls stress Japan's group mentality and egalitarianism. Japan's schools, both public and private, must teach the various curricula dictated by the national Education Ministry, using only the textbooks it prescribes, thus contributing to the perception of Japanese schools as egalitarian agents, unifying the classes and the different parts of the country. Elementary schools do not differentiate among pupils according to ability (except for severe disabilities). Students pass on to the next grade with their cohort group, regardless of their performance or mastery of skills. To fail a student is not an option; in fact it would be an admission of poor teaching.
Peer controls, occasionally including bullying, are very strong, rapidly teaching those who deviate from the norm that 'the nail that stands out will get hammered down,' as the famous Japanese proverb goes. Individuality and creativity are not nurtured in Japan as they are in Canadian schools. Unlike their peers in Japan, students in North America are asked from an early age not only to have personal opinions, but to be able to express, defend, and justify them. The individual's opinion is sought after and his right to hold it respected, but it is open to challenge at any time. Recognition of individual performance is very important, and by secondary school, 'copying' out of a book without acknowledging it, or copying another student's work is considered 'theft of an idea'. In contrast, Japanese students may not see individual ideas as having such value, and may see copying as a sign of respect for the author, artist, etc. There are some vocal Japanese critics of this cultural value, however, as in the 1985 Provisional Council on Educational Reform's First Report on Education Reform which was commissioned by the Government of Japan to deal with a perceived 'crisis in Japanese education'. Because of their cultural ideal of homogeneity, Japanese experience discomfort in discussing individual differences, especially in ability. In elementary school, everyone of the same age is treated as being of the same ability and generally expected to be able to perform at the same level. If they do not, they are exhorted to try harder, as poor performance is interpreted as a lack of will or effort, not lack of ability or 'readiness'. Parents and schools generally frown on 'ability' testing, assuming that innate ability or lack thereof is seldom a reason for success or failure; rather, they have a deep belief that hard work will lead to success. In turn, 'lack of ability' is an unacceptable explanation for substandard achievement.
However, at puberty, once the Japanese child is thoroughly imbued with the notion of group identity, her/his group undergoes great change, and the child is placed in a new position of vertical competition with her/his peers.

4.3.2  Vertical, Meritocratic Controls

This is because after elementary school (or junior high school at the latest), merit, not equality, is the driving force, as students are sifted and divided according to their ability to pass school entrance exams. Torstein Husen refers to this system as "the Great Sieve that sorts and certifies people for their slot in society" (Husen 411). This 'Great Sieve' is also referred to as the 'Examination Hell' which determines future placement on the Japanese meritocratic ladder. The idea of meritocratic civil service exams originated in China in the sixth century and was justified on the grounds that "this method would allow those with natural ability to enjoy equal opportunity with the aristocracy" (Pinar, et al. 797). It acknowledged that human talents of value to the state were to be found among commoners as frequently as among aristocrats, a 'horizontal', egalitarian notion. On the other hand, the goal of this method was to place those who possessed these talents within the control of a 'vertical' hierarchy, with merit the criterion of initial placement and age the primary criterion of advancement. This idea was to take firm root in Japan, where it now affects practically every person who attends school or gets a job in the country. Pencil-and-paper tests, which are primarily short-answer (but recently have required an occasional essay-answer), determine a student's progress throughout every stage of this system, from entry into even some kindergartens and elementary schools, to junior high, to high school, and finally, to college or university (Costniuk 147; Unks 35).
From the top down, it is not the student's school record that counts; rather, performance on the entrance test determines to which school s/he can proceed, with the reputation of the preceding school determining which succeeding school (or eventually, employer) will even consider the student as a possible candidate, worthy to take the exam. Many large businesses and government ministries hire only from among the graduates of a particular university. On-the-job education is provided by employers, who expect even university graduates to have only a good general education and very little differentiation or specific job-oriented practical knowledge (Leclerq). Ironically, the hardest and most significant part of university is getting admitted. Once admitted, the student is virtually assured of both graduating and gaining employment based not on her/his university record, but on the prestige of the university itself. Although there are many serious university students in Japan, the university experience in Japan is sometimes referred to as the 'Four Wasted Years' or 'Leisureland' (Chapman). Private juku (after-hours) schools function to assist individuals to attain a higher slot in this meritocracy. Attended by the majority of Japanese secondary pupils, they provide remediation to those with learning problems, enrichment to those who need extra challenge, and review to those students who simply want to do better on exams. Public school teachers often lecture to large classes with little concern as to whether the majority of students understand them; this is because it is understood students will cover the subject matter again in the juku school. Juku, at which many students study until midnight, are blamed for the common problem of Japanese students sleeping in their daytime classes.
This is exacerbated by the fact that Japanese teachers do not regularly interact verbally with students during lectures, so students do not feel pressure to engage and can 'doze off' without penalty. Also, 'the group,' because it doesn't want the failure of any of its members to bring it public embarrassment, often carries with it students who depend on their more diligent peers to take notes and help them with homework. Though Japanese students are highly competitive, then, they also feel significant responsibility for the success of their group. Similarly, juku can be seen to assist both horizontal and vertical power structures. Thanks to juku schools which compensate for their shortcomings, says Kazuyuki Kitamura (161), public schools "can function according to the two principles of egalitarianism and uniformity," acting on the premise that all students have equal ability. Ironically, however, juku are available only to those in society whose parents can afford them, thus widening the gulf between parents with high and low incomes into even greater inequalities between their children's educational credentials - and the respective children's future earning power. At every level of education, then, the focus of teacher, parent, and student is on performance at the next crucial entrance exam. Junior high schools teach to the tests of the target high schools, and senior secondaries teach to the university entrance exams. At the private juku, the entrance exams are the prime agenda and most learning is by the traditional rote method. Ability to demonstrate knowledge of specific facts on a specific date, then, is valued far more than ability to perform well in day-to-day classroom tasks or to organize and compose one's thoughts in either written or oral discourse. Vertical power encompasses both the respect that one feels towards those above one in the hierarchy and the desire to rise within the hierarchy oneself.
Throughout the high school years, culminating in the university entrance exam, the young Japanese person establishes her/his general position in the meritocracy, one which will probably determine much of the rest of her/his life chances. Given the importance of this position, it is no surprise that Japanese students are very anxious to excel, and fear making mistakes.

4.3.3  Amalgamation of Horizontal and Vertical Systems to Assure Harmony

Japan reflects the melding of the Confucian ideals of respect for hierarchy (including desire to rise within it) and submersion of self into the group. Strict hierarchies exist, yet equality is imperative within the group. These ideals can be seen in Japanese attitudes towards ways of showing respect to the teacher, towards ambiguity in the curriculum, towards tracking, and towards the written word. All of these attitudes are commonly exhibited in classrooms at our college.

'Respect'

The ways that Japanese students show respect to the teacher are completely the opposite of what is expected in Canada and, unfortunately, tend to stifle oral language learning. First, one normally shows respect by silence, certainly not by chattering. Second, in Japan, pupils often show respect by trying to faithfully copy their sensei. In language learning, however, the end goal is not to simply 'parrot', but to generate unique, context-appropriate utterances or writing. In a non-interactive classroom such as is the norm in Japanese secondary schools, at least (interaction is far more common at the elementary school level), this rarely occurs.

'Ambiguity'

Japanese generally feel that respect for authority is necessary to ensure harmony, whether or not that authority is 'justified'. This abnegation of self and peers can lead to a belief that there is only one right answer, that of 'the authority'.
Ironically, this way of thinking somewhat parallels that of western thinking which sees reality as unitary and more accurately (objectively) perceived from 'outside self'. The difference is that for Japanese, 'outside self' can imply another person, while in western thinking it means 'objective' science. [This 'vertical', authoritarian kind of thinking is culturally, I feel, tempered by another, more 'horizontal' Japanese thought pattern which teaches that 'the truth', instead of being 'out there', resides in multiple intersubjectivities in which reality is an "intersubjective construct to be formulated and negotiated intersubjectively," as Pinar, et al. (412) paraphrase Tetsuo Aoki.] Nevertheless, many young Japanese students come to assume that answers are either correct or not, and that the teacher is the judge. If a student gives an incorrect answer publicly, s/he shames (embarrasses) the entire group. As a result, if a student is unsure of the answer, s/he remains silent (also a sign of respect for the teacher) or tries to consult with fellow-students rather than say the wrong thing. Japanese students are incredulous when western teachers claim, in the Socratic tradition, not to know 'the answer,' insist that there are many acceptable answers to a particular question, or encourage students 'not to worry about mistakes - just talk (or write) as much as you can!' Japanese teachers, particularly in the secondary schools, are expected to follow a very explicit, prescribed curriculum which will prepare students for university exams that do not tolerate ambiguity. Students passively take in knowledge from the teacher through listening, reading, and silent observation, do practice drills to memorize it, and reproduce it on an exam.

'Tracking'

Japanese secondary schools are mostly untracked. However, as Susan Goya (128) points out, "Japanese students are tracked, not into different programs within one school, but into entirely different schools.
Moreover, this tracking rigidly determines a student's future career possibilities." Therefore, once the student finds her/his place within the hierarchy of secondary schools, s/he is able to find her/his place within the egalitarian, homogeneous group within that school. The placement of students into levels at our college, though certainly not intended as 'tracking' (see thesis, pages 45-49), could easily lead to student perceptions of a meritocracy. Students might consider their 'group' to be their classmates in the same level, rather than seeing the school as one egalitarian, homogeneous group.

'The Written Word'

The Chinese civil-service exams emphasized the written word. This may have been because the same written (ideographic) language was successfully used by people speaking many different spoken languages of China. Obviously, their common language was the written one, not their various oral ones. In Japan, too, written language forms are more respected than spoken language. However, tacit, mutual understanding is considered superior to either one: "Japanese tend to distrust verbal facility in communicating personal opinion as being glib and superficial. . . simplicity of expression . . . is valued more highly than elaborately reasoned explanations" (Naotsuka and Sakamoto, et al. 173-4). As S. Nakayama notes, these values clearly contrast with the Greek and Judeo-Hebraic oral traditions of rhetoric - dialogue, reasoned argument, and debate. Presumably, proficiency in both oral and written language is valued by modern language learners. However, Japanese cultural traditions encouraging silence, and valuing the written word over the oral, tend to make this kind of learning very difficult for many Japanese students in our college.

4.3.4  Summary

The most important thing I hope to convey here is that this is the general background of our students.
However, students who come to Canada to study English are generally not typical Japanese young people, but those who want to try a different kind of post-secondary education. They know that learning conditions will be quite different here. They expect to have their Japanese ideas about education challenged. They hope they will like and be successful with the new teaching styles. They are given extensive written and oral translations of promotional materials which explicitly describe the teaching styles of Canadian teachers, and during orientation sessions in Japan experience sample lessons taught by teachers from the college either in person or via video. Nevertheless, for most of these students the reality of Canadian teaching assumptions and methods comes as a real shock and it is extremely difficult for many of them to overcome the classroom habits of twelve or more years. The adventure - both for them and for us as teachers - is the daily attempt to bridge those cross-cultural (not to mention linguistic) gaps. The fact that we are able to do so attests to the tremendous mutual effort of both students and teachers which makes our college a very exciting and satisfying place in which to grow.

5.0  THE RESEARCH PROJECTS

I have completed two research projects in my own institution, one using a primarily qualitative approach and the other using a primarily quantitative approach. The primary goal of these projects was to determine, from two different perspectives, the degree to which the present strictly-leveled, modular, discrete-skills-based, mastery-learning programme used by the college promotes a just distribution of educational resources, according to the criteria set forth by John Rawls. The secondary goal was to demonstrate the type of research advocated in my thesis, which uses a combination of qualitative and quantitative methods to evaluate fairness, in particular when determining whether a curricular programme is of benefit to all students.
As I worked into the project, I realized I had acquired three additional goals: First, as I started to wrestle with some of the implications of John Rawls' theory of 'justice as fairness,' I saw one shouldn't, as Rawls himself warned (8-9), assume that his theory would be directly applicable in any situation. Therefore, I saw that I must also determine whether Rawls' criteria for evaluating the justice of institutions would be appropriate for the type of research I was attempting to do. Also, I wanted to see what kinds of answers the two respective (qualitative and quantitative) research methods would give me, and how the data could be synthesized. At the beginning, I did not have a clear notion of how the data would 'fit together.' It should be noted that if the programme cannot be demonstrated to be equal or superior in effectiveness to others (e.g. the one it replaced), it cannot be considered to benefit the group as a whole, and would therefore automatically be considered an unjust innovation. Therefore, my final goal was to compare the effectiveness of the present programme to that of the programme it replaced. Obviously, it cannot be compared to any other, possibly superior, programmes which haven't been tried at this institution, since these research methods (post-hoc test data and teacher interviews about their experience of the two programmes) require that both programmes have been tried and tested on students at the same institution. Note that to determine the justice of the programme, one quantitative and one qualitative project were chosen from among myriad possibilities. Obviously, the greater the number of perspectives from which a programme is viewed, the more accurately it can be seen and the more fairly it can be judged.
If time and funding permit two (or preferably, even more) research projects, it would seem logical to try to look at something from at least one quantitative and at least one qualitative perspective, much as a doctor not only takes patients' vital signs, but also asks them how they feel. Each perspective can be seen to inform and support the other as well as to help confirm or reinforce notions formed by using the other perspectives. Thus does the ancient wisdom of the fable, The Blind Men and the Elephant, continue to remind us of our personal limitations and our collective wisdom. Not only does the fable teach that a more complete notion of a whole comes from viewing it from different perspectives; it also teaches that the view from each perspective is misleading, if the viewer naively supposes that what is seen is the whole. In short, neither of the projects I have chosen gives the 'final word' on the justice of this programme change; the two projects put together are better than one, but they are still only two out of countless possible perspectives that could be taken to provide a more accurate determination. John Rawls' principles state that an institution, policy, programme, etc. can be deemed 'just' if it benefits and distributes resources equally among all participants or, if not equally, that any unequal distribution can be shown to eventually benefit the group as a whole and benefits the least advantaged more than would a strictly equal distribution. Therefore, the distribution of resources among the lower-, middle-, and higher-entry groups will be considered separately. It is possible that if a programme can be shown to benefit one subgroup, while not benefitting the group as a whole, its use within that subgroup only might be justified providing this limited inequality can be shown to benefit the group as a whole including the least advantaged.
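As an illustration only, the criterion described above might be sketched as a simple check in Python. This is not part of the research itself: the function name `is_just_distribution`, the reduction of 'benefit' to a single number per group, and all of the figures are hypothetical. In particular, reading 'benefits the group as a whole' as 'no group is worse off than under an equal distribution' is my simplifying assumption, not Rawls' own formulation.

```python
def is_just_distribution(unequal, equal, least_advantaged):
    """Simplified check of the criterion described above.

    unequal, equal: dicts mapping a group name to its benefit under the
        unequal and the strictly equal distribution (hypothetical units).
    least_advantaged: name of the least advantaged group.
    """
    # Assumption: "benefits the group as a whole" is read here as
    # "no group is worse off than under the equal distribution".
    nobody_worse_off = all(unequal[g] >= equal[g] for g in equal)
    # The least advantaged must do better than under strict equality.
    least_advantaged_gain = unequal[least_advantaged] > equal[least_advantaged]
    return nobody_worse_off and least_advantaged_gain


# Invented numbers for the three entry groups, for illustration only.
equal = {"lower": 10, "middle": 10, "higher": 10}
helps_lower = {"lower": 14, "middle": 11, "higher": 10}
hurts_lower = {"lower": 8, "middle": 12, "higher": 15}

print(is_just_distribution(helps_lower, equal, "lower"))
print(is_just_distribution(hurts_lower, equal, "lower"))
```

A check of this kind can only summarize a judgment; it cannot replace the separate consideration of each entry group that the criterion calls for.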
As a bit of background, I noted a fair bit of questioning among my fellow first-year (of a two-to-four year programme) faculty members about the wisdom of retaining an 80% mastery pass standard, and some dissatisfaction as well with the need for summative exams after every five-to-six-week module. To some teachers, the 80% standard seemed artificially high and the modules too short. I therefore thought (and my institution agreed) that it might be useful to have some formal feedback on the present programme at this time. The 80% mastery standard and the five-to-six-week modules are an integral part of the programme (but impossible to evaluate separately from one another, or from the associated contribution of other components such as 'leveling' students). If the research were to show the programme to be equally or more effective than its predecessor, its continuation would be supported. If it were to show the programme is not as effective, there would be good reason to question it and justification for experimenting with alternatives. If it were to show the programme is justly administered to all students (provides a just distribution of learning outcomes among all ability levels), its continuation would also be supported. Finally, if the programme were shown not to benefit some segment of the student population, the institution might want to consider alternatives to it for that segment of the population. I don't claim to be providing all the data relevant to making 'the right' decision. For example, the SLEP test which will be used to assess 'learning outcomes' (see Appendix 8.5 for a description of this test) does not test language 'output' (speaking and writing) skills, only the more easily measurable language input skills, listening and reading.
It is entirely possible that the output language skills are developed quite differently in the new programme, but the data to show this are not as readily available nor as clearly 'objective' as the SLEP scores. My intent as a researcher was to create discussion, not dissension. The data provided will, hopefully, help inform and clarify this discussion in a coherent, organized fashion.

5.1  Quantitative Project - Description and Results

This was a post-hoc study comparing matched pairs of students before and after the institution's initiation, in the 1991-1992 school year, of a strictly-leveled, modular, primarily discrete-skills-based, mastery-learning programme. The study took place from May 1996 through February 1997. Listening and Reading SLEP scores of 824 students (207 in 1990, 255 in 1992, 196 in 1994, and 166 in 1996) were considered. I started out with these questions and hypotheses:

5.1.1  The Research Questions:

When Japanese students of English who are studying in a high-medium-low track (as distinguished from 'level' or 'ability group' in thesis, page 45), non-modular, non-mastery, and primarily content-based classroom situation (hereinafter referred to as 'the previous' programme) are compared with those studying under a leveled, modular, mastery-learning and primarily discrete-skills-based classroom situation (hereinafter referred to as 'the present' programme):

(1)  Is there a difference in the way the previous and the present programmes' mean SLEP (Secondary Level English Proficiency Test) scores increase over the year, and if so, what is it? (Note: separate scores are given for the Reading and Listening components of this test.)

(2)  Is there a difference in the way the two programmes' respective mean Reading SLEP scores increase versus the way their respective mean Listening SLEP scores increase, and if so, what is it?
(3)  Is there a difference in the way the two programmes' respective lower-entry SLEP students' mean scores (in both Listening and Reading) increase versus the way their higher-entry SLEP students' mean scores increase, and if so, how?

5.1.2  Hypotheses:

Under a leveled, modular, skills-based, mastery-learning situation:

(1)  There will be a generally positive change in the mean increase in SLEP scores when the present programme is compared to the previous programme. Rationale: This is because of the supposed superiority of a leveled, discrete-skills-based, modular, mastery programme in teaching basic skills and because the SLEP test in particular measures basic English listening and reading skills. As well, the teachers' growing expertise (through two to six more years' experience) in teaching the same clientele should have some positive effect on student achievement, regardless of the particular programme used.

(2)  The positive mean increase in total SLEP hypothesized in (1) is projected to have a proportionately greater impact on listening than on reading. Rationale: Both reading and listening skills were being equally and specifically targeted in the new programme. There has traditionally been a difference between the way Reading versus Listening SLEP scores change. In general, students improve more dramatically in listening than in reading the first year. Since listening skills always improve more dramatically, this difference will presumably continue to hold in the present programme.

(3)  The lower entry SLEP students in the present programme will exhibit a greater positive change over the previous programme in both Reading and Listening mean SLEP score increase than the higher entry SLEP students. Rationale: This is because mastery programmes are thought to benefit students with lower ability and/or achievement even more than students of higher ability and/or achievement.
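The comparisons named in the research questions and hypotheses above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis actually performed: the record layout, the function names `mean_gain` and `gain_difference`, and every score are invented, and the sketch ignores both the matching of pairs and any test of statistical significance.

```python
from statistics import mean

# Hypothetical records: (programme, entry_group, skill, entry_score, exit_score).
# All scores are invented for illustration; they are not the study's data.
records = [
    ("previous", "lower", "listening", 10, 14),
    ("previous", "lower", "listening", 12, 16),
    ("previous", "higher", "listening", 20, 23),
    ("previous", "higher", "listening", 22, 25),
    ("present", "lower", "listening", 10, 17),
    ("present", "lower", "listening", 12, 19),
    ("present", "higher", "listening", 20, 24),
    ("present", "higher", "listening", 22, 26),
    ("previous", "lower", "reading", 10, 12),
    ("previous", "lower", "reading", 11, 13),
    ("previous", "higher", "reading", 20, 22),
    ("previous", "higher", "reading", 21, 23),
    ("present", "lower", "reading", 10, 14),
    ("present", "lower", "reading", 11, 15),
    ("present", "higher", "reading", 20, 23),
    ("present", "higher", "reading", 21, 24),
]


def mean_gain(records, programme, skill, entry_group=None):
    """Mean of (exit - entry) for one programme and skill.

    If entry_group is None, students of all entry groups are included.
    """
    gains = [
        exit_score - entry_score
        for prog, group, sk, entry_score, exit_score in records
        if prog == programme
        and sk == skill
        and (entry_group is None or group == entry_group)
    ]
    return mean(gains)


def gain_difference(records, skill, entry_group=None):
    """Present-programme mean gain minus previous-programme mean gain."""
    return (mean_gain(records, "present", skill, entry_group)
            - mean_gain(records, "previous", skill, entry_group))


# Question (1): overall change in mean gain, per skill.
# Question (2): compare the listening figure with the reading figure.
# Question (3): compare the lower-entry figure with the higher-entry figure.
for skill in ("listening", "reading"):
    print(skill, gain_difference(records, skill),
          gain_difference(records, skill, "lower"),
          gain_difference(records, skill, "higher"))
```

A positive value from `gain_difference` would be in the direction of hypothesis (1); a larger value for the lower-entry group than for the higher-entry group would be in the direction of hypothesis (3).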
5.1.3  Defining Factors:

Population: All students are recent graduates of Japanese high schools studying English in Canada at a Japanese and Canadian joint-venture institution recently granted accreditation by the Private Post-Secondary Education Commission of the Province of British Columbia. While extremely high intelligence or socio-economic status may be found occasionally among the students, the opposite extreme is never found - the former because of entrance requirements and the latter because of the sponsor income level required to send a child to an overseas private school.

Independent Variable: This is the initiation of strictly-leveled, modular, discrete-skills-based mastery-learning instruction in Year 1991-1992 (fourth year of the school's operation). (Note that a detailed description of the programme used in the first three years of the school's operation will be provided through teacher interviews in the qualitative study.) There were four levels of the independent variable: previous programme (control) Reading vs. present programme Reading and previous programme (control) Listening vs. present programme Listening. I assumed that data from the final year of the previous programme (1990-1991) would be useful for the 'baseline' since the programme was well-established by that point; as well, it was the only year in the previous programme in which SLEP was administered at entry. In addition, I felt that data from 1991-1992, the initial year of the present programme, should not be used as teachers were just becoming familiar with it. SLEP scores from the years 1992-1993, 1994-1995, and 1996-1997, therefore, were chosen to represent student achievement within the present programme. It seemed likely that a two-year interval would give a fair indication of student progress without unduly introducing teacher and student variables (such as teacher personnel changes and changing student cohort attributes).
Dependent Variables: Change in individuals' Reading SLEP and in individuals' Listening SLEP between entry and exit.

Control Variables:

Instructor and Content Factors: Except for the first two years of its operation, the institution has had a very low faculty turnover, so it may be claimed that students in each of the four years in question (1990-1991, 1992-1993, 1994-1995, and 1996-1997) had the same school, the same general staff, and, in general, the same subjects. There were some significant departures: more 'content' teaching characterized the previous programme, while the present programme is more 'discrete-skills'-based, at least in the first four levels. Also, in the 1990-1991 year, many students studied pronunciation skills using computer-assisted learning ('MacEnglish'), a program which was slowly phased out over the following two to three years.

There is also the above-noted aspect of improved teacher expertise over the period of the study. The present programme's skills-based nature and this greater collective teaching experience would presumably favour the present programme. Another possible teacher factor, this one favouring the previous programme, might be the greater energy and enthusiasm with which people confront new tasks and novel challenges such as the opening of a new college. On the other hand, the same factor could also have been at play during the implementation period of the present programme.

Subject Factors: Factors of gender (approximately 50% each sex) and age (primarily ages 18 and 19) were approximately the same for all four groups. Students are of the same culture and race. It is assumed that personality and motivational factors were constant for all the years (although the economic downswing in recent years in Japan may have had a motivational effect, possibly positive on some students and negative on others). Variations in initial ability were controlled for through matching like pairs.
Note: Students in the lowest entry levels were predominantly male and those in the highest entry levels were predominantly female, in all years in question.

Test Factors: Venue and tape quality can affect SLEP Listening scores. An attempt was made to ascertain whether this could have been a factor. The entry SLEP test is now given to students in Japan, shortly before they arrive in Canada. In 1992, 1994, and 1996 SLEP was administered in large group settings in Japan. The listening conditions and the tape quality at this venue were approximately the same in all three years, according to those who administered it.

However, the entry test in the 1990 baseline year was given at the college, in small classrooms. This (possibly) more favourable testing condition in the baseline year entry SLEP test could have made the initial scores (especially Listening) higher and thus any gain smaller. Note that some of the students may have actually suffered the reverse effect, since they may have been suffering from 'jet lag', having arrived just a few days earlier. This would have made the initial scores lower and thus any gain larger. It is impossible to say how much these different conditions may have affected the baseline data. I can affirm that the conditions under which the exit exam was given on campus were approximately the same in all four years.

Textbook Factors: Different texts were also introduced at the same time as the new programme. In some subjects, new texts were again introduced in the second and fifth years of the new programme. This experimentation with different textbooks and other materials characterizes the institution, in both the previous and present programmes, and is a factor to consider when interpreting the results. The texts in the present programme's second year (1992-1993) and fourth year (1994-1995) were, with minor exceptions, the same. The reading text used in 1996-1997 is different from that used in 1992-1993 or 1994-1995.
5.1.4  Kinds of Analysis Used:

Standard statistical analysis was used. 'Matched Pair t-tests' were performed to determine the means of the differences between exit and entry SLEP scores (net increase) under various conditions, and the significance of these. I used the 'SPSS 6.1 for Windows Student Version' statistics programme on a 486 Packard Bell computer to analyze the data and to make the original charts. For each of the four years, 1990, 1992, 1994, and 1996, I entered each student's student number, entry and exit Listening SLEP score, and entry and exit Reading SLEP score. From these lists, I was able to set up three Matched-Pair groups (1990 vs. 1992, 1990 vs. 1994, and 1990 vs. 1996) in which the two students in each pair had the same entry Listening and the same entry Reading SLEP score. Theoretically, the two students in each pair are considered to be of the same initial skill level; statistically, they are treated as one person undergoing two different treatments. For a sample of the way these pairs were described statistically, see Appendix 8.1.

In order to investigate the third hypothesis, I had to create three subgroups from the data: low-, medium-, and high-entry groups based on each pair's 'total' entry SLEP score, the sum of the Reading and Listening SLEP scores. The three groups were numbered '1' to '3', from low to high. (Note that the institution does, in fact, do its initial leveling of students according to this same 'total' SLEP. While a student's actual 'assigned' entry level at the college may often correspond exactly to the level to which this statistical procedure assigns them, the two must be treated as entirely different ideas.) Because the same cutoff criteria were applied to every set of years, the percentage of students in each level differs from one set of years to the next. For example, 20% of 1990/1992 students were in Level One, compared to 24% of 1990/1994 students and 25% of 1990/1996 students.
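The matching and testing procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the SPSS procedure actually used: the record layout, the student identifiers, and every score are invented, and the rule for choosing a partner when several candidates share the same entry scores (here, simple list order) is an assumption, since the text does not specify one. The level cutoffs are those listed in Table 3. The sketch computes only the paired t statistic and its degrees of freedom; a p-value would additionally require the t distribution (for instance from a statistics package).

```python
import math
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical record layout (all scores invented for illustration):
# (student_id, entry_listening, entry_reading, exit_listening, exit_reading)
baseline_1990 = [
    ("a1", 12, 15, 20, 22),
    ("a2", 14, 18, 21, 24),
    ("a3", 18, 20, 24, 25),
    ("a4", 16, 17, 25, 23),  # no 1992 student shares these entry scores
]
comparison_1992 = [
    ("b1", 12, 15, 21, 21),
    ("b2", 14, 18, 20, 23),
    ("b3", 18, 20, 22, 26),
    ("b4", 19, 21, 26, 27),  # no 1990 student shares these entry scores
]


def match_pairs(baseline, comparison):
    """Pair students one-to-one when both entry scores are identical.

    Assumption: ties are resolved in list order; the thesis does not say
    how a partner was chosen when several candidates were available.
    """
    pool = defaultdict(list)
    for student in comparison:
        pool[(student[1], student[2])].append(student)
    pairs = []
    for student in baseline:
        candidates = pool[(student[1], student[2])]
        if candidates:
            pairs.append((student, candidates.pop(0)))
    return pairs


def entry_level(student):
    """Level 1, 2 or 3 from total entry SLEP (cutoffs as listed in Table 3)."""
    total = student[1] + student[2]
    if total <= 28:
        return 1
    if total <= 36:
        return 2
    return 3


def listening_gain(student):
    """Exit minus entry Listening score for one student."""
    return student[3] - student[1]


def paired_t(pairs, gain):
    """Mean of (comparison gain - baseline gain), its t statistic, and df."""
    diffs = [gain(new) - gain(old) for old, new in pairs]
    n = len(diffs)
    mean_diff = mean(diffs)
    t_stat = mean_diff / (stdev(diffs) / math.sqrt(n))
    return mean_diff, t_stat, n - 1


pairs = match_pairs(baseline_1990, comparison_1992)
levels = [entry_level(old) for old, _ in pairs]
mean_diff, t_stat, df = paired_t(pairs, listening_gain)
print(f"pairs={len(pairs)} levels={levels} "
      f"mean_diff={mean_diff:.2f} t={t_stat:.2f} df={df}")
```

Unmatched students (a4 and b4 in this invented data) simply drop out of the comparison, which is why each matched group is smaller than either whole class.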
Though the cutoff points between levels seemed arbitrary, I was attempting to create a 'low' and a 'high' range of approximately 20% each year. This was as close as it was possible to approximate that goal:

Table 3  Proportion of Subjects in the Three Levels, by Matched-Pair Groupings

Entry Level   SLEP Score   1990/1992       1990/1994       1990/1996
                           %      n*       %      n*       %      n*
'1'           20-28        20%    18       24%    24       25%    26
'2'           29-36        54%    72       55%    55       59%    63
'3'           37+          26%    32       21%    21       16%    17

* Note: 'n' is the number of matched pairs, so [n = 18] = 36 subjects, 18 from each year.

As is apparent, the numbers and percentages of pairs in each level vary greatly. However, this does not affect the means, only their variability. It is more difficult to find significance when comparing two small groups than when comparing two large groups. In general, the larger the sample, the smaller the range of possible means (i.e., the Level Two group is the largest and has the smallest range of possible means).

The statistics programme enabled me to define the three above groups. This was all the information the programme needed. With minimal direction on my part, it did the rest. The results are shown graphically in Figures 1, 2.1, 2.2, 3.1, and 3.2. The more detailed sample programme inputs and transformations plus the actual programme outputs are included in Appendices 8.1 and 8.2.

5.1.5  Quantitative Study Results according to Hypotheses

Hypothesis #1 - Figure 1: "There will be a generally positive change in the mean increase in SLEP scores when the present programme is compared to the previous programme."

[Figure 1]

The full-group results (illustrated in Figure 1) uniformly surprised me.
Comparing 1990/1992, 1990/1994, or 1990/1996, the students' mean improvement in total SLEP was greater in the previous programme. However, the 't'-tests showed that these differences were not significant. (1990/1992 significance was p = .328, 1990/1994 significance was p = .618, and 1990/1996 significance was p = .153. Generally only a 'p' equal to or less than .05 is considered significant.) The first hypothesis, then, was not supported. The lack of significance of the difference means that neither the previous nor the present programme can clearly claim general superiority over the other in terms of student improvement in basic English (reading and listening) skills over the year.

Figure 1 also illustrates an anomaly created by repeating a Matched-Pairs Design year after year using the same baseline database. Almost inevitably, the actual number of students in the baseline year is going to be greater than the number of individual cases that are found to be 'matchable' with the students in the year with which they are being 'twinned'. Therefore, three unique sets of 1990 students were created: those matched with 1992 students (n = 122), those matched with 1994 students (n = 100), and those matched with 1996 students (n = 106). To see the effects of this, compare the mean increase in SLEP scores for the 1990 group matched with 1992 (mean = 11.6) with that for the 1990 group matched with 1994 (mean = 12.8) and that for the group matched with 1996 (mean = 12.7). Note that none of the matched-pair groups is representative of the population it is taken from. The reason for matching the pairs is not to compare the students, or even the years, but to compare the programmes. For example, if one wanted to compare 1992 with 1996, one would have to match 1992 with 1996 rather than comparing 1990/1992 with 1990/1996. I have not done this, as I am only comparing the two (previous and present) programmes.
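The anomaly described above, in which one baseline class yields a different matched subset (and so a different baseline mean) for each comparison year, can be illustrated with a small sketch. All figures here are invented and deliberately tiny; the cohort names and the list-order matching rule are assumptions made only for the illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical baseline class: (entry_listening, entry_reading, total_gain)
baseline = [(12, 15, 14), (14, 18, 12), (18, 20, 10), (20, 22, 8)]

# Entry scores of two hypothetical later cohorts to be 'twinned' with it.
cohort_a_entries = [(14, 18), (18, 20), (20, 22)]  # a higher-entry cohort
cohort_b_entries = [(12, 15), (14, 18)]            # a lower-entry cohort


def matched_baseline(baseline_students, cohort_entries):
    """Keep only baseline students who can be paired with a cohort member."""
    available = Counter(cohort_entries)
    kept = []
    for listening, reading, gain in baseline_students:
        if available[(listening, reading)] > 0:
            available[(listening, reading)] -= 1
            kept.append((listening, reading, gain))
    return kept


subset_a = matched_baseline(baseline, cohort_a_entries)
subset_b = matched_baseline(baseline, cohort_b_entries)

whole_class_gain = mean(gain for _, _, gain in baseline)
mean_gain_a = mean(gain for _, _, gain in subset_a)
mean_gain_b = mean(gain for _, _, gain in subset_b)
print(whole_class_gain, mean_gain_a, mean_gain_b)
```

Because each later cohort selects a different slice of the same baseline class, the baseline mean gain shifts with the comparison year, which is why only the within-pair comparison, and not a comparison across matched groups, is meaningful.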
I was interested in this phenomenon and decided to compare the mean entry SLEP and mean gain of each whole class with those of the smaller matched groups to see how representative they were. Here are the results:

Table 4  Mean SLEP Entry Score and Gain (per Whole-Class and Matched-Pair Groupings)

Year   Whole-class or Matched pair?   n     Mean Entry SLEP   Mean SLEP Gain
1990   WHOLE-class                    207   32.9              12.1
1990   Matched with 1992              122   33.7              11.6
1990   Matched with 1994              100   32.7              12.8
1990   Matched with 1996              106   32.3              12.7
1992   WHOLE-class                    255   35.6              10.7
1992   Matched with 1990              122   33.7              11.1
1994   WHOLE-class                    196   33.9              12.1
1994   Matched with 1990              100   32.7              12.4
1996   WHOLE-class                    166   32.7              11.3
1996   Matched with 1990              106   32.3              11.8

Note that the higher mean entry SLEP scores generally show less gain. Students in 1992 entered with listening and reading skills that were on average higher than in previous years, and, predictably, their average SLEP gain (10.7) was lower. Also, in order to accommodate this difference, the 1992 Matched-Pair group had a slightly lower mean entry SLEP (and higher gain) than the whole class, and the 1990 group it was matched with had a slightly higher mean entry SLEP (and lower gain) than the whole class. However, as shown in 1996, the reverse is not always true; the 1996 whole class had the lowest mean entry SLEP, but did not show the highest mean increase. These ideas will be further explored in the next section.

Hypothesis #2 - Figures 2.1 and 2.2: "The positive mean increase (from Hypothesis #1) is projected to have a proportionately greater impact on listening than on reading."

[Figures 2.1 and 2.2]

Again, in all six cases (1990/1992, 1990/1994, and 1990/1996 for both Reading SLEP score change and Listening SLEP score change), the students' mean improvement was greater in the previous programme. However, the 't'-tests again showed that these differences were not significant. A summary of these results is found in Figures 2.1 and 2.2.

These figures also show that the second hypothesis appears to hold, for the group overall, at any rate. Both Listening SLEP and Reading SLEP scores held to the same pattern through all four years in question. In all cases the mean Listening SLEP score improvement was two to three points greater than the mean Reading SLEP score improvement. It should be noted that the mean entry Listening SLEP is always lower than the mean entry Reading SLEP. An explanation for this is that in the Japanese secondary school system, English literacy tends to be valued, or at least emphasized, over oral English. Therefore, some of the improved Listening SLEP may be thought of as the students simply realizing what words, previously learned from books, actually sound like: a case of listening knowledge 'catching up' to related reading knowledge.

It is interesting to note that sometimes the three groupings of 1990 students exhibited greater differences from one another than from the other years' students. You can see the results of having three quite different groups of 1990 students if you compare the mean change in Listening SLEP scores in Figure 2.1 for the 1990 group matched with 1992 (6.85) with that for the 1990 group matched with 1994 (7.63). In this case, the difference between the two 1990 groups was much greater than the difference between each and its matched-pair year.
The important thing is to compare the trends shown in each graph with one another, not the specific means; as noted before, the two sets (1990/1992 and 1990/1994) are quite different in composition from one another.

The first significant result of the study (p ≤ .05) is evident in Figure 2.2. In the 1990/1996 matched-pair group, the 1996 students' SLEP Reading improvement was significantly (p = .025) less than that of their 1990 'identical twins.' Since the 1996 students are using a different reading text from the 1992 and 1994 students, it is possibly a text effect rather than a 'present programme versus past programme' effect. This is probably something we should investigate in the next few months.

One thing this illustrated to me is the benefit of being able to chart the same standardized test over a period of years. Anomalous scores such as this can be identified as such instead of being taken to mean more than they should (i.e., 'the general trend'), incorrectly influencing important decision-making. On the other hand, tests come in and out of vogue, and there may be compelling reasons to change standardized tests over the years. Also, sometimes there is a well-meaning rush to change curriculum perceived as inadequate. It would seem illogical to wait for several years, proving through standardized tests that there was indeed something wrong with the curriculum, before doing something about it. Using the same rationale, doctors often must treat extremely ill patients before all tests confirm their initial diagnoses. This is not to say only qualitative methods are to be used, only that standardized tests may not be practical or appropriate indicators in all cases; non-standardized quantitative data might be more accessible and appropriate. Each of these provides 'one more piece of the puzzle; one more blind man trying to describe the elephant'.
Hypothesis #3 - Figures 3.1 and 3.2: "The lower entry SLEP students in the present programme will exhibit a greater positive change over the previous programme in both Reading and Listening mean SLEP score increase than the higher entry SLEP students."

[Figures 3.1 and 3.2]

The third hypothesis, in which the performance of high-entry versus low-entry students in the two programmes was compared, met with interesting and mixed results [see Figures 3.1 and 3.2]. As mentioned, matched pairs were classified as Level One, Two, or Three according to their entry total SLEP scores. To test the third hypothesis, the changes in both Listening and Reading SLEP scores were compared, giving a mean change in both Listening and Reading SLEP for 1990/1992, 1990/1994, and 1990/1996 according to each matched pair's entry SLEP Level. While the individual comparisons are statistically of little significance, an intriguing pattern emerges which bears considering as being more than the sum of its parts.

Whether we look at Figure 3.1 from the point of view of 1992, 1994, or 1996, we see the present programme's Level One students achieving a higher mean Listening SLEP improvement than the previous programme's Level One students.
The present programme's Level Two students, on the other hand, had almost the same (or lower) mean Listening SLEP improvement as the 1990 students in all years. When one examines Level Three, a mirror image of the Level One performance in listening is seen; 1992's Level Three students (matched with 1990 students) show a significant, though not great, decrease in listening improvement from 1990 (p = .022). Note that this is the second of only two statistically significant differences found in the study. Succeeding years 1994 and 1996 also showed Level Three students making less progress than they did in the previous programme, but the differences are insignificant and get successively smaller each year. (Note that the insignificance may be due partially to the far lower numbers of matchable students in this level in 1994 and 1996.)

Reading [Figure 3.2] follows a different pattern. Reading progress (except for an insignificant anomaly in 1992, Level Two) is consistently lower in the present programme for all three levels. While none of these differences exhibits p ≤ .05, one (1996, Level Two) approaches it (p = .07).

Note that while the only statistically significant results described here are the general drop in Reading in 1996 and the Level Three drop in Listening in 1992, other results approached significance. While the patterns are of interest and potentially significant, they should not be misconstrued or used alone to justify any decision-making.

5.1.6  Quantitative Study Results: Summary

Except for one year (1996) in Reading, there is no statistically significant difference in the annual increase in Listening or Reading SLEP scores between students in the previous programme and those in the present programme. There appears to be no significant difference between programmes in the way Listening and Reading improve. Listening consistently improves more dramatically than Reading.
There is a possibility that students respond to the present programme differently according to their entry level. Of those who have entered the school with a low SLEP score (28 or less), those in the present programme tend to exhibit greater mean Listening SLEP gain than those in the previous programme. Conversely, of those who have entered the school with a high SLEP score (37 or more), those in the present programme exhibited a smaller mean Listening and Reading SLEP gain than those in the previous programme, though except for one year (1992) in Listening, these differences are not significant. The pattern is intriguing in its persistence, however.

I will attempt to triangulate the data (to combine my conclusions and discussion of these results with those of the qualitative project). I feel the two quite different projects, each seeking to answer the same question, shed interesting light on one another.

5.2  Qualitative Project - Description and Results

5.2.1  Design of the Study

Introduction

This is a descriptive and comparative study using qualitative methods. The study involved semi-structured interviews of teachers. The aim was to examine, from an instructor's point of view, their perceptions of both the effectiveness (see thesis, page 58) and justice of the previous programme versus that of the present programme. However, teachers were also asked what they perceived as student responses to the two programmes; results of this must be analyzed with the understanding that this data is of a hearsay nature, and so quite subjective. Students were not interviewed, as they have no basis upon which to compare the two programmes, having only experienced one. Comparative student perceptions, then, were noted through instructors' imperfect interpretations and memories. Note that for this study, unlike the quantitative one, I had no clear hypotheses.
However, it should be noted that the question to be answered is the same in both studies: How justly (and secondarily, how effectively) were these two programmes distributing educational resources, according to Rawls' principles of justice?

There were several standard questions of an open-ended nature. Digressions from these were encouraged, though all questions were asked of each instructor. There were also several very specific questions formulated in order to ensure specific points were covered by all interviews.

The purpose of the college in encouraging this project was not the justice issue, nor curiosity about what kind of information one could get from qualitative versus quantitative studies; rather, it was interested in my project of seeking out teacher views in an anonymous forum, with the purpose of (possibly) giving direction to future plans. Therefore some results I will present to the college (included in Appendix 8.4) will be more detailed and site-specific than those in the body of the thesis. The results section of the thesis, though including less detail, will include an analysis of the kind of information about the distribution of educational resources gained through this methodology. Similarly, the conclusions I will present at the end of the thesis will include an analysis of the kind of conclusions about the distribution of educational resources that could be drawn through this methodology.

The campus seems ready to re-examine itself and consider the possibility of another major curriculum shift. It is hoped that this narrative, an amalgam of about twenty hours of interviews with thirteen teachers, will enrich a thoughtful reappraisal which will contribute an even more exciting and successful episode to this unique institution's curriculum history.

There were a total of thirteen subjects. Sixteen instructors taught at least two years in both the previous and the present programme (at least one full year in each).
Of these, eleven (ten, excluding myself) are currently teaching at the institution. Nine of the current instructors agreed to participate. As well, four out of the five instructors meeting the criteria for inclusion but not currently teaching at the institution agreed to participate. Of these four, two are on temporary leave, one quit voluntarily to pursue further education, and one was laid off but hopes to return to the college.

Since I have also taught in both programmes, I had to be exceedingly careful not to interject my own biases into the interviews. Fortunately, I have not been outspoken in either attacking or defending the present programme, so my views were not generally known, nor (I think) would they have been perceived as threatening to either point of view. While a questionnaire would present less interviewer bias, the instructors are so busy that a questionnaire would probably have been answered in a cursory fashion, if at all. However, it should be noted that one instructor, presently in Japan on leave, was willing to give a long interview (in written, questionnaire fashion) via e-mail!

The Questions

Part One: The Previous Programme

1)  Describe (the institution's) previous entry-level or beginning ESL programme (before the Foundation programme started in Year Four: 1991-1992).

2)  What did you perceive as the strengths of the previous programme? weaknesses?

3)  How was the previous programme perceived by most teachers? by most students? by high-entry SLEP students? by low-entry SLEP students?

4)  Why do you think (the institution) decided to change to the present programme?

Part Two: The present 'Foundation' programme

1)  What do you perceive as the strengths of (the institution's) present Foundation programme? weaknesses?

2)  How is the present programme perceived by most teachers? by most students? by high-entry SLEP students? by low-entry SLEP students?
3)  How well, if at all, does the present programme address the weaknesses of the previous programme?

4)  How do you personally feel about the following? (I wanted very short, specific answers here)
      leveling students according to ability
      five-to-six-week modular system
      standardized curricula
      standardized testing in general
      80% pass mark
      2 (formative) progress tests counting for 10-30% of final grade
      final (summative) exam counting for 50-70% of final grade
      the term 'mastery'
      the term 'competency-based'

Part Three: Evaluation

1)  What sort of evaluation is needed by
      a) (the campus to which our students transfer in their second year) to determine student placement in Year Two programmes there?
      b) (this campus) to determine student placement and advancement in its first year programmes?
      c) students?
      d) their parents?

2)  How well were these needs met by evaluation practices used in the previous programme?

3)  How well are these needs met by evaluation practices used in the present programme?

4)  Were the previous programme's evaluation methods fair (or perceived as fair) to the students?

5)  Are the present programme's evaluation methods fair (or perceived as fair) to the students?

6)  Did the previous programme's evaluation methods motivate or discourage students in general? high-entry SLEP students? low-entry SLEP students?

7)  Do the present programme's evaluation methods motivate or discourage students in general? high-entry SLEP students? low-entry SLEP students?

Part Four: Future

1)  Would you like to return to the previous programme (at this institution)? If so, in what ways? If not, why not? What would be the implications and impacts of this change?

2)  Are there any changes you would like to make in the present programme? What are they? Why would you make these changes? What would be the implications and impacts of this change?
3)  Is there some other kind of programme (neither the previous nor the present programme) that you would like (this institution) to use? What would be the implications and impacts of this change?

Part Five: Closing

1)  Do you have anything you would like to add?

2)  Do you think the questions were fair and represent the questions that (this institution) should be asking about its programme?

Kinds of analysis used

Types of responses were loosely tabulated and frequently-given responses were noted. Idiosyncratic responses, patterns, and relationships among responses were particularly noted. This was in many ways a 'fishing expedition' in that both the questions and the interviewer's attitude during the interviews were very 'open-ended.' I truly didn't know what would come out in the one-on-one interviews. During faculty meetings, the more vocal instructors' views were made quite clear, but the more taciturn, those inhibited by the size of the group, or those unwilling to engage in controversy hadn't made their views public.

The results of the qualitative research will be presented in the form of an historical narrative, because each interview was basically a retelling of the story of the college, and in particular of the development of its curriculum, from a different perspective. Kidder and Fine (69) refer to this practice as 'Research as Story Telling,' noting that all research, quantitative as well as qualitative, tells a story, and that in the analysis of fieldwork (the authors are referring here especially to ethnographic methods), the researcher often is constructing "a narrative pertaining to more than one actor."

As stated previously, it is imperative to realize that teachers (and only teachers from the first-year campus) were the only stakeholders consulted; consequently the story is told from their point of view only. Their perspective is only one of several possible versions of the truth and must not be misconstrued as 'the' truth.
I have tried to weave all thirteen stories into one coherent narrative, while retaining some of the contradictions and inconsistencies, the humour and enthusiasm that made hearing it 'once again' always a new and delightful experience. Hopefully some of the flavour of these interviews will be retained.

My voice, however, is quite evident: in which statements I choose to include, in which I quote directly, and in how I choose to paraphrase those I do not quote. In the same vein, if given the same set of data, ten researchers would probably choose ten different ways to organize, interpret, and present it. However, I am also convinced that their ten interpretations, though 'different,' would not be contradictory; rather, they would be supportive of one another.

It should be noted that I began teaching at the institution at the beginning of Year Three, the final year of the previous programme, so I was not present during the difficult start-up period, but was present during the development and early years of the present programme. Because I was not able to witness the first two years personally, my interpretation of the teacher descriptions of these years is perhaps least coloured by my own personal feelings. However, I am also unable to verify any of these descriptions from my personal experience.

As well as the voices you will hear, note the missing voices that are sometimes conjectured, sometimes paraphrased, and frequently maligned, particularly those of the students, the administration, and the second-year teachers. Ideally, all of these stakeholders' points of view would be included. These are the missing perspectives which could help define far more clearly the dimensions of the elephant. Without them, we are still only thirteen blind men groping about in the dark, sharing what insights we can collectively gain.

Again, what is the nature of truth and reality?
I do not present these narratives as 'reality', only as the intersubjective reality of thirteen teachers. Whether 'true' or not, it continues to influence, and explain, the way in which they have chosen (and continue to choose) the curriculum and how they teach it.

My initial feeling was that because the events took place so long ago, and occurred during a naturally experimental 'start-up' period, no one would feel hurt by the frequent 'hindsight is the best sight' criticism. Part of what I have learned from this research is that one should not assume this. For one thing, unlike teachers, of whom there are so many that no one individual need feel singled out for criticism, only a few people were responsible for management decisions; individuals could therefore be identified and unnecessarily embarrassed. For another, the reader must understand that this was a very groundbreaking cross-cultural venture. During the first two years in particular, a relationship of trust had to be established among the Japanese and Canadian board members, administrators, and staff in three cities in two different countries. This did not come about overnight. Administrators, caught in the middle, were often powerless to make changes they realized must be made until approval came down from slowly-developing trans-Pacific channels of authority.

Therefore, since my purpose was to examine a curriculum, not to present the definitive 'true' history of the institution, and certainly not to spread gossip or criticise individuals, I deleted many of these negative comments, summarizing only those teacher attitudes towards Year One and Two administration which affected curricular decisions. What follows, then, is an amalgam of the teachers' voices unless otherwise noted.
5.2.2  Qualitative Research Results - Introduction to the College's Story

Once upon a time, a group of educators and businesspeople from Japan and Canada got together to develop a private college in British Columbia for recent Japanese high-school graduates. The school had high ideals of producing graduates (after a two-year course) of 'independent spirit' who were prepared for world-citizenship in their understanding of, and in their ability to communicate (through English) with people of other cultures. As well, they would have an easy familiarity with computers and with at least one other specialized subject area such as business, interpretation/translation, teaching Japanese as a foreign language, or environmental or multicultural studies. Amazingly enough, given these high ideals, they succeeded in their endeavour, even expanding to offer both three- and four-year programs. Over fifteen hundred graduates of this college are now working in Japan and internationally today. However, development of the curriculum at the college, in particular that for beginning (entry) students, has had a turbulent history. In the first years of operation, the college used a 'content-based' curriculum largely based on the theories of a highly-respected educator I will refer to as Dr. V. (my note: this is not her real name. I use a pseudonym for two reasons. First, many teachers in the interviews rather vehemently malign her theories, and in reporting them, I would risk libeling her. Secondly, knowledge of the details of her theory is irrelevant to the purposes of this thesis). Heterogeneous (non-leveled) classes were to be taught on a term or year-long basis. Teachers, though bound by Dr. V.'s theory in that they had to show how every lesson met her specific criteria, were free to develop their own curricula, materials, tests, and grading schemes. Each year, in response to student and teacher demands, the curriculum changed somewhat.
By the fourth year, it had changed to providing a discrete-skills-based curriculum for its beginning (entry) students with homogeneous (leveled) classes in Reading, Writing/Grammar, and Listening/Speaking taught in five modules, each five-to-seven weeks long. For each level in each of the above three subjects, teachers developed completely standardized objectives, materials, tests, and grading schemes (which included an 80% 'mastery' pass standard in the first four levels). The strictly-leveled, skills-based component was balanced by other required but heterogeneous (non-leveled) classes delivered to students in a more content-based style (computers, presentation/study skills, experiential studies, plus a cross-cultural survey course taught in Japanese). As well, once students had progressed through the first four levels ('Foundation'), they encountered 'Transition' courses: more challenging Reading (with several choices of content), Writing, and Listening/Speaking content coupled with a content 'elective' course, while being freed from the 80% mastery standard (moving to a 50% 'pass' standard). The college has continued to refine this system over the last five years. (My note: That's the basic story, but what was really happening in those classrooms? Why was attendance such a problem? Why did 20% of the students drop out the first year? Why did so many teachers in Years One and Two quit? Why did the teachers, in the middle of Year Three, decide to make a radical curricular change in Year Four? I continue to let them address these questions in their own voices).

5.2.3  The Previous Programme: Introduction

In the beginning . . . it clearly wasn't to be an English as a Second Language school.
In the beginning of each year, went the plan, the students would be given 70% 'Bridge' classes in which English skills were taught within the context of a compelling content area such as Writing/Sociology, Reading/Newspapers, or Conversation/Communication Theory. They would also be given 30% 'elective' courses. The proportion of 'Bridge' classes was to decrease as the year progressed. Electives included a selection of business (and computer) courses, the Forest Industry in B.C., Environmental Studies, Study of Language (simple linguistics), the History of English, Canadian History, and Human Geography. In courses taught by teams, teachers agreed upon joint objectives, though each teacher was given free rein in deciding how to implement and assess them. In the first year, students were placed at random in classes regardless of ability. Classes were either on a term (there were three terms) or year-long basis with the same teacher. By the second year, a Japanese entrance exam (not SLEP) was used to create three tracks (called 'Levels'): A, B, and C. In general, these were cohort groups which moved through the year together in the same class. At the end of Terms I and II, teachers had a big meeting in which a few students were chosen to move up to the next level. The criterion was whether the teachers agreed the student could 'handle the challenge.' No one recalled any students ever having been moved downwards. In the third year (1990-1991), though SLEP was given at entry for the first time (in the students' first week in Canada), it was not used to create the three entry tracks (now called 'Levels' One, Two, and Three). Instead, the Japanese entrance exam continued as the criterion. Towards the middle of the third year, a decision was made to start a Level Four for a group of Level Three students who needed even more challenge.

The Previous Programme: Strengths

Most teachers would agree that a lot of great things were happening.
Instructors were hired from several different countries; each had her/his 'own style', and they were 'academically exciting.' They were not hired because of their teacher-training or experience; in fact, some had neither. Instructors had been hired because of their knowledge of a content area, and they wanted to teach here primarily because it was not 'an ESL school.' Free to experiment, teachers did more or less what they wanted, using their own resources, making their own tests (which were often very challenging, and custom tailored to exactly what they had taught), innovating constantly, collaborating when they felt like it, but allowed to go their own way. One teacher successfully used elementary-school whole-language methods with the students, while another taught a 'university' course using a Canadian sociology textbook. The two things tying these courses together were Dr. V.'s theories and the College's Mission Statement:

. . . to advance students towards global citizenship as well as making them into culturally informed citizens of their home country. (College 'X') provides for the students a comprehensive learning environment designed to promote: Independence of spirit; Understanding of other peoples and cultures, and Coexistence, developing from a sense of world community.

Initially, several teachers noted, high expectations of the first-year students were generally held, "so teachers really pushed students to succeed" (which some, but not all, students did). Another positive aspect noted by more than one teacher was that over the year or term, teachers got to know students well and so got to tailor what they taught to individual student needs. There was time to really teach the material and to 'spiral' it with previous learning. As in most Japanese post-secondary institutions, student 'failure' was extremely rare, and the lack of leveling the first year gave students a sense of equality.
In many ways, teachers recalled less stress (than there is now), with more continuity (fewer changes of instructor or class) and a stronger sense of teacher-student rapport. Some electives had good, strong, challenging content. Some of the content, such as a rather sophisticated cross-cultural communication class, was "very relevant to both the school's philosophy and students' interests." Having electives start at the first of the year ensured that "all students got introduced to critical-thinking and literary-type questions right away." Several instructors noted that this content was more fun to teach and more interesting to learn than the present curriculum. All teachers mentioned the enthusiasm of the faculty; for example: "inspired staff - always busy! . . . core faculty strong, dedicated, committed, forward-looking, cooperative, had a sense of purpose and 'pulling together' . . . sincerity to make this thing work." By the third year, besides the three or four 'tracks' in use, some standardization and 'basic-skills' had been added to the curriculum, which was moving away from the initial open-ended and content-based directive. For example, in classes taught by more than one teacher, those teaching it had to have at least 50% of their final exam questions 'in common.' As well, conversation classes had strong grammar (language structures) and pronunciation components. In recognition of differing student abilities, some curriculum materials now included suggestions as to how to adjust learning activities to a particular ability level.
Though moving in the third year towards a more standardized, skills-based curriculum, as three teachers put it (and several others echoed), "a strength of the conversation class was that it recognized the need for some basic skills; however, the movement was towards communicative competence, not just language," and "a strength of the school was the recognition that it wasn't just language in the curriculum, but a recognition of the value of the subject areas in the globalist realm," and finally, "I do not think it is a weakness that we started with the concept of content, even though we misapplied it." (My note: The general philosophy upon which the college was built is still supported by most of my subjects, then, though they regret the naïveté with which it was initially applied.)

The Previous Programme: Weaknesses

On the other hand, teachers came up with twice as many weaknesses as strengths and expressed more emotion as they described the extremely challenging circumstances they encountered during the first three years. It is important to note that the first year of almost any programme will have negative 'startup' effects. One could well ask if the negativity - towards, for example, Dr. V.'s theory - might have been misdirected; perhaps if it had been introduced to teachers after they had gained some experience and confidence with students and the programme, it would have been very differently received by them. However, while teachers became accustomed to, or learned to ameliorate, many negative programme aspects by the second and third years, core weaknesses did not go away . . . Teachers continue to describe their problems: The most glaring weakness, apparent on the first day, was a tremendous misfit between most students' ability-levels and the curriculum which the teachers had developed.
In the beginning, teachers, as they recalled it, received little or no documentation about student abilities; many recalled completely rewriting their curriculum once the low English-language level of most students became apparent. As one teacher put it, "Because the curriculum was inaccessible to students (my note: due to their lack of reading skills, vocabulary, and basic idiomatic/cultural understandings), teachers often 'chucked' the official curriculum and re-wrote it on a daily basis, at least for lower-level students." Students were equally stunned to discover how little they understood the classes, and how poorly-prepared their teachers were for them. In general, many of the materials were of far too advanced a nature; many students were barely able to learn even the 'key vocabulary,' much less the 'content.' On the other hand, when teachers tried to adapt the curriculum so that lower-ability students could understand it, they frequently felt intimidated by the school's non-ESL philosophy. Several reported using elementary-school literature in lieu of 'ESL-ed' adult materials. In this case, while the ability-level was appropriate, there was another misfit, this time between student interest and the subject matter. This situation was more intense the first year, but continued on through the third year to a lesser degree. Several noted that in the first year, teachers often felt alienated from an administration which they perceived as overwhelmed with startup duties. Many expressed concern that administrators appeared not to have a clear concept of the students' abilities, or of the curriculum teachers were using, or of the extent to which the two 'matched' one another.

Related to this problem was the first year's total lack of student leveling, though this changed when a form of 'tracking' (which was referred to as 'leveling') was introduced in the second and third years.
Sadly, the track on which one was placed often took on inordinate social importance among status-conscious students, and the previously-noted sense of equality disappeared. This was perhaps exacerbated by the fact that there was no way for most students to change tracks once they'd been placed on one. There was no way to either pass (out of), repeat, or challenge a 'level', and the fact that everyone knew the subject matter in the lower tracks would never reach the same level of sophistication as that found in the upper tracks created a self-esteem (and a possible justice) issue for lower-track students. However, all teachers acknowledged the need for some form of tracking or leveling at least in the beginning of the year since many students hadn't yet acquired the language to access a 'language-based' (i.e., content) curriculum. In heterogeneous classes, teachers found upper-ability students bored or lower-ability students hopelessly confused (often simultaneously). Seldom would either end of the spectrum be satisfied. As an example of teacher-adaptation to this problem, two teachers sometimes leveled elective classes "behind administrators' backs," as one reported it, by trading students in order to form one 'high' and one 'low' ability group. For one teacher, as well, the school, other than being "a vague, philosophical undertaking, really hadn't discovered (or developed) its true identity yet." A consensus was that there seemed to be no overall plan or coordination. Teachers noted that guidance, consistency, clarity and leadership were lacking in many areas. In the realm of curriculum, a lack of goals or a year-long scope and sequence of student learning meant that courses were planned independently of one another and "no attempt to spiral, integrate, or reinforce prior learning could be made." Skills were taught on a 'hit-and-miss' basis.
"Depending on what teachers a student got, some skills could be taught several times while others were not taught at all; there was no way to ensure that all students would be taught anything. New teachers had no idea how to proceed as nothing concrete was 'in place' to direct them." Ironically, term- and year-end evaluation at that time centred on teachers and students, not on the courses or programme itself. A general teacher misgiving was that "students were getting insufficient training," or as others put it, "teachers felt the courses weren't helping our students" and were "random, ill thought-out". They sympathized with the many first year students who, as one teacher put it, "felt they had been lied to" in that their actual educational experience was apparently quite different from the perception they'd formed from promotional materials. The most common complaints, however, related to a lack of consistency and standards in such areas as what was taught (even in the same subject in the same track), texts, tests, criteria for grades, numbers of field trips and guest speakers, rules and expectations, etc. This was confusing, demoralizing, and seemed unjust to students and damaged the school's credibility with them. A teacher noted that in Japanese education there is a high degree of consistency between classes, materials, and tests at the same grade level among all Japanese schools. Students who come to North America from Japan "want to feel they are receiving the same education (my note - this phrase may mean entirely different things to Japanese and Canadian educators), no matter who their teacher was, and they wanted their grades to 'mean something' - to be tied to some meaningful 'scale.'" However, there was no way of comparing grades one got from different teachers, levels (tracks), and courses. Tests and grades were "all over the map." Teachers acknowledged this, but without clear leadership they were unable to solve this dilemma on their own.
As well, standardization of objectives, materials, tests, and grading practices would mean a big trade-off with the independence enjoyed by so many of the faculty.

Probably the most demoralizing aspect for faculty members, however, was the factionalization that characterized their own ranks. Three areas of dissension arose: use of Dr. V.'s theories, heterogeneous versus homogeneous [non-leveled (or multileveled) versus leveled] classes, and the teaching of content versus the teaching of language (skills): "The factionalization and splits among teachers was emotionally draining - issues such as 'language versus content' and 'homogeneous versus heterogeneous group' drove people apart . . . however, mutual resentment at being forced to utilize V.'s theories in first year ESL classrooms became a source of cohesion." To this day, a lingering bitterness is revealed in such terms used in the interviews as 'rabid V.-ism' and 'V.-ism to the extreme degree.' As one teacher noted, "some of the best 'content' teachers left the institution because of their frustration at being forced to make everything they taught meet (V.'s) criteria." Another teacher noted that, "Some very good teachers ended up quitting their jobs for some very good reasons." [My note: Ironically, while Dr. V.'s theory was meant to unite the curriculum, opposition to it seems to have ended up uniting the teachers, so inappropriate did every one of the teachers I interviewed deem its use with the first year students. Yet this also resulted in creating an uncomfortable difference between the two faculties of the first and of the second (and later) years' students as Dr. V.'s theory was - and remains - a useful and appropriate organizer for the second (and later) years' curriculum.] Teachers who had a background in teaching basic language skills more readily supported the idea of homogeneous grouping, and were often upset when they had to teach a heterogeneous class.
Some of them felt personally threatened by the idea of having to teach content, often unfamiliar to them, to a multileveled class (either because of feelings of inadequacy or feeling that it was inappropriate for students, or both). One of their complaints was that the content courses too often used the lecture-and-memorization style that students were familiar with from Japan. However, their basic complaint was that the curriculum didn't address students' lack of basic skills in a coherent, systematic manner.

At first, heterogeneous grouping was advocated by the 'content' teachers, but many of them came to the conclusion that some students just 'weren't getting it,' not because of inadequate intelligence or lack of effort, but because they lacked basic language skills. As a result, these teachers often became strong advocates of homogeneous groups, at least in the beginning of the year, and for the extreme high and low ability students in particular. Some of them, however, felt personally threatened by the idea of becoming ESL instructors (again, either because of feelings of inadequacy or feeling that it was inappropriate for students, or both). In some cases their philosophical transformation (towards favouring first year homogeneous, skills-based courses) took place over a couple of years' time. Meanwhile there was much argument and controversy. The day-to-day reality for teachers was constant revision and creation of materials, lesson designs, and tests, 'fumbling around' to get through each day, daily (required) lunch-time meetings, 70-hour work weeks (several people noted this), struggling with constant and rapid curriculum changes, lots of developing 'by the seat of one's pants,' and "everyone 'reinventing the wheel.'" [My note: What struck me was how on one hand teachers said they were free to do as they wished, but on the other hand there were a lot of directives (i.e., to use Dr. V.'s criteria) from administration.
Perhaps the directives were so frequent that, over time, overwhelmed teachers generally came to ignore them.] For example, teachers said they "were struggling with constant, rapid curriculum changes;" "There were lots of meetings!" but "There was no coordination in the overall plan." Meanwhile, a more immediate concern was how to prevent more student dropouts, as teachers realized their jobs were dependent on retaining as many students as possible. This pressure was difficult for teachers to bear considering they were working so hard and still 'things weren't right.' Many teachers, exhausted and discouraged, simply 'dropped out' (quit) as well.

By the middle of the third year, V.'s criteria were no longer required and, in fact, rarely used at all. Those who hadn't quit came to realize that even though they felt a well-deserved sense of 'ownership' over curriculum they had developed under such trying conditions, it still needed work. The consensus was that they liked the freedom, but the curriculum was simply too difficult to teach, demanding an unrealistic amount of their time, effort, and creativity.

The Previous Programme: Teachers' Perceptions of Students' Responses to it

As the interview progressed, teachers were asked specifically to mentally reconstruct how the previous programme was, according to their memories, perceived by students - by students in general, by the high-entry students, and by the low-entry students. In general, they said, most students seemed to enjoy their time in Canada about as much as they do at present; they had a good experience learning in a new way, they improved their English, and they expanded their view of the world. Each year was better than the one previous insofar as producing student satisfaction. However, recurring themes, echoed by many teachers, were that even at the end of Year Three, the programme lacked cohesiveness, purpose, regularity (consistency), and sequence.
Depending on the course and teacher they had, students said they had 'too much homework and it was way too hard' or they had 'too little homework and it was way too easy!' Students complained of having little idea or sense of their own progress. Each year a significant group advanced to their second year with a sense of not having gotten quite what they had expected - a vague sense of disappointment, though by Year Three, this was far less pronounced than in the first year. The high-entry students found the programme either exciting and challenging or boring and too easy, depending on their teacher and whether they had been placed in the upper track classes. In general they liked the fact that electives (unlike now) started in Term I. They sometimes felt held back by the slow pace of the non-leveled (heterogeneous) courses. Some complained that teachers 'facilitated' courses instead of 'teaching' them [Socratic-style dialogue (between teacher and students) and small group discussion - instead of lecturing]. Others expressed dislike of any skills-components in their classes (i.e., a grammar component in Conversation class), saying they had already learned it in junior high school, while others were very appreciative of specific skill instruction, particularly if they felt it was an area in which they were weak. In the first year, a large percentage of these upper-entry students left at the end of Term I (this was complicated by age and gender factors: they were mostly females, who were significantly older than the balance of the students). In the second and third years, a better effort was made to be sensitive to these various problems - in more explicit promotional literature, in admission practices, and in actual orientation of students. Teachers gave very mixed answers as to how the low-entry level students perceived the previous programme. On the plus side, overall, most of them seemed happy with the programme. They worked hard and had no major complaints.
They benefitted from upper-ability students' 'modeling' in their multileveled classes. They enjoyed the elective class activities and being introduced to exciting and interesting concepts, even though they realized others were understanding the subjects more thoroughly than they. Most tended to have fun with the recreation programme while basically (and uncritically) ignoring the academic programme. They knew they were all going to pass anyway. They had an experimental, playful, fun-loving attitude. Unlike most of our present students, a significant portion of our early students, particularly in Year One, had 'a distinctly separate agenda': many had a lot of spending money, were fairly 'wild,' and were often absent from classes. They, like so many of their cohorts in Japanese colleges, considered this a well-deserved 'leisureland' between 'the examination hell' of high school and the lifetime of serious employment that would await them upon graduation. However, on the minus side, a significant number of low-entry students were not blissfully ignoring the fact that they were struggling with the academic content. They were, as teachers described, "lost . . . confused . . . overwhelmed . . . floundering . . . just 'here.'" Because they were unable to fail and then repeat a level, they were consistently dealing with new and challenging material which was 'over their head,' especially in the non-leveled electives. This experience was demoralizing for many, especially considering there was no academic support system (tutors, student-at-risk reporting and counselling, learning resource centre, etc.) like there is now. These students generally did not respond well to the lack of structure and open-endedness of the previous programme, and usually left the first-year campus dissatisfied with what they had learned.
The system did not deal with the problem of how to help these students succeed; the best it could do was to put them onto a 'low' track and keep them there all year.

5.2.4  The Present Programme: Introduction

In the middle of the third year, the administration invited all the teachers to a weekend retreat at a resort to deal with all of the noted problems by developing a new programme. When asked why the institution changed to the new programme, teachers cited these as the main reasons:

1)  Out of recognition of the students' need for fairness and of teachers' need for a curriculum which was easier-to-teach, there was a need for some standardization of the curriculum, of objectives, of materials, of testing, and of grading. [My note: fairness through standardization or 'justice as regularity' (Rawls 504)]

2)  They were unhappy with the orientation in students' first year to content rather than the skills they seemed to need in order to access that content. They saw a need for curriculum which would be more accessible to students because it is more attuned to their ability levels. (My note: justice as equal access to resources)

3)  There was a need to change how 'Levels' One to Four were being taught and delivered; a "recognition of the need to give different curriculum and content to different levels - not just an 'enriched', a 'regular', and a 'watered-down' (track) programme using the same basic content." There was also a need for students to be able to repeat (without penalty) and skip levels. (My note: justice as equal distribution of educational outcomes and of self-esteem)

4)  Out of recognition for the need (of administration and of teachers in particular) to know and evaluate what was being taught at the college more accurately.

5)  New blood: 'Burnt-out' faculty had been replaced by a new administrator and four new teachers who were ready to experiment, and unattached to old ways of doing things.
One of the teachers in particular had knowledge of a programme which sounded like an attractive alternative. (My note: This was a leveled, modular, discrete-skills-based mastery learning programme successfully used to teach ESL students elsewhere).

6)  Exit surveys showing information gaps and dissatisfaction among students and teachers convinced administration a change was necessary. The school is 'market-driven'; consequently it is imperative to keep the student retention rate high, while at the same time acquiring an ever more prestigious reputation among the highly competitive Japanese post-secondary education 'marketplace.' (My note: the rate of student retention had greatly improved by 1990, but fears of a return to the previous low retention rate were still strong)

7)  Evolution: A natural desire to improve each year. The present programme was a logical outgrowth of the changes made in Year Three.

8)  A desire to get away from the Japanese higher-education model of 'leisureland' in which students cannot fail.

9)  Administrators and instructors in the second year of the programme wanted students to meet minimal entry standards into their programme. This was impossible if the grades with which the first-year teachers provided them had no real meaning.

10)  Finally, out of recognition that "students and teachers come to this institution because they want more than skills-based education," the two diverse 'language-skills' and 'content-teaching' camps came together to create a novel compromise, the Foundation and Transition programmes which effectively straddle both camps. (My note: First students are taught primarily discrete language skills in Foundation's 'mastery' programme; later they are taught primarily content in Transition's non-mastery classes. In 'Transition', Reading is no longer leveled, but Listening/Speaking and Composition remain leveled all year (though they change to a 50% pass standard after Level Four).
Students can 'fail' and repeat Foundation classes without penalty, while students in Transition are penalized for failure. As well, students take heavily content-driven, non-leveled 'elective' courses in Transition).

Much curriculum development time and interminable meetings later, the Foundation and Transition programmes were in place. Students and teachers generally seemed to be doing quite well with these programmes, though occasionally someone would comment on a glitch, ambiguity, or philosophical inconsistency - not whether the programme itself was good, but whether its dictates were being consistently followed. Over the years, the problems and discontent seemed to become more defined and to come ever more prominently into the foreground of teachers' attention. Therefore, it seemed to many that now might be the right time to make up a balance sheet of the strengths and weaknesses of the present programme itself. Several teachers mentioned that if, indeed, the institution is to create a new programme once again, it would seem expeditious to try to retain the strengths of both (previous and present) programmes while addressing their various weaknesses. [My note: One important factor I think should be considered is that in any programme, every aspect of innovation will have its inherent strengths and weaknesses (tradeoffs). The challenge is to build a balanced programme which acknowledges, minimizes, and mitigates the negative impacts whenever possible, while enhancing and enabling the positive ones].

The Present Programme: Strengths

This section combines the answers to several questions about the strengths of the programme, how teachers personally felt about leveling, standardized curricula, etc., and 'what they would like to add.' This is because there was so much overlap between the answers to these somewhat related questions.

The new programme definitely addressed the noted weaknesses of the previous programme.
As well, teachers commented a lot on the extensive formative and summative testing used in this programme. They said it gave teachers and students lots of steady, valuable feedback, allowing teachers to locate student problems and help rectify them before they became 'fossilized errors,' and modular exams in particular "help students deal with what was previously 'down-time' in the middle of a term," or as others put it, "They're Japanese - they need and want tests!!!" . . . (though) . . . "now students can't simply (my note: in Japanese style) 'cram' at the end of the eleven-week term, then pass." The 'Great Compromise' (between content and skills), resulting in the creation of the Transition programme - which retained the 'content' of electives and other advanced (beyond 'Level Four') courses - is still highly supported by most teachers. The sequential nature of the skills-teaching, the standardization and consistency of objectives, materials, tests, and grading, and the clearly defined levels which have explicit mobility built in are generally acknowledged to be successfully countering problems of the previous programme: The specific needs of individual students are now measured and addressed. Students get a sense of their progression and can work pretty much at their own pace, repeating, progressing, or challenging (skipping) levels every five-to-six weeks as needed. Now, students don't take first-year elective courses or transfer into second-year courses until they have met minimal standards. Teachers noted that there is a yearlong scope and sequence chart and a common teacher understanding of what is to be taught in each course at each level, and how it is to be done and evaluated. Some lauded the institution of 'Curriculum Heads,' teachers given extra time to help oversee that the curriculum in a given subject area is kept up-to-date and standards agreed-upon (and followed) by the entire team of teachers.
New teachers can step into the programme with minimal preparation. As well, teachers have developed and now share a fairly large body of quality supplemental materials. Courses taught many times by many people can be compared and slowly improved over time. As one teacher noted, the curriculum is seen as "a work in progress - not written in stone."

The Present Programme: Weaknesses

Here again I combined answers to the specific query about present-programme weaknesses with answers to other questions which addressed areas teachers perceived as weak, plus the additional questions about what other programmes they would like to try or changes they would like to make. Here are the major complaints and concerns:

Too Much Standardization: While there was 100% support for the standardization of course objectives, several teachers felt standardization of materials and tests might have gotten carried too far; they feared that there is too much 'teaching to the test,' and decried the 'loss of creative juices' amongst faculty who had become lost in the 'safe mediocrity' of the modular system. Students, one claimed, were being led on an educational 'forced march.' The 'lockstep' system is perceived to be so inflexible that teachers can neither take advantage of 'teachable moments' nor address individual students' needs. A good question was asked: "Does 'standardized' always mean higher standards?" At times teachers confused 'standardization' with 'mastery' (i.e., "Mastery tests must be standardized") since standardized objectives, materials, tests, and grading standards were introduced at the same time as mastery learning. (My note: However, the two are unrelated issues as mastery tests in other institutions are not necessarily standardized among teachers). There is a problem with 'fossilization' of tests and materials.
Teachers noted that it is very difficult to make needed changes to courses since all the standardized materials and tests have to be changed (this is particularly difficult with listening exams, for which scripts must be written and cassettes made); if there were not so much standardization, it would be much easier. Or, as one teacher stated, "the system tries to maintain itself rather than addressing students' needs."

80% Mastery Pass Mark: The 80% mastery standard and the term 'mastery' itself (with the false expectations of 'perfection' it connotes) came in for the teachers' toughest criticism. They seemed to object more to the term than to the actual philosophy and practice of 'mastery' learning such as giving people extra time to learn the material without penalizing them; in this institution, this means allowing them to repeat a level without penalty. In our system, only the grade they receive when they pass - greater than or equal to 80% - goes on their final transcript. Teachers rarely criticised other 'mastery' learning practices such as using frequent formative testing; using closely parallel course objectives, materials, and test items; or grading according to how well criteria are met instead of according to a bell curve. Giving a summative grade based primarily on a final exam came in for some criticism; but requiring a fairly high (80+ %) pass standard was definitely questioned by many. A large number of teachers made statements like "an 80% standard implies they've learned 80% of what they're supposed to know (but in reality haven't)," or the "concept of 'mastery' of something within six weeks is not practical nor is it educationally sound . . . often things they're taught in Module One don't really get learned until Module Three."
Problems noted were that with such a high pass standard and with lots of outside pressure to pass the majority of students, teachers were inclined to 'teach to the test,' to scale marks (or "tailor marking so that there are not too many failing or getting A's"), and, over time, to remove difficult items from the tests "so that most of the students who complete the level in Module Five (note, these are the 'lowest entry' students to take any given level) can pass." As one teacher mused, "Unless we change the administrative posture of the college, an 80% will never be a real 80%. We forgot who we and our students were. We signed up for the Guided Tour to El Dorado . . . but does it really matter?" Students, on the other hand, are, as one teacher put it, "forced by the mastery concept to memorize rather than to internalize." Others noted that with an 80% pass standard, there aren't "many numbers to play with" - it seems strange to call 79% 'unsatisfactory.' Others noted that the changeover from an 80% pass in Foundation to 50% pass in Transition is awkward for teachers and confusing for students. In one week, a good essay is given 85%; in the next it is given 70%, a 'Fail' the week before (See Appendix 8.3).

Not enough Levels, but Modules are Too Short: A few teachers suggested there should be one or two more entry levels - four to five levels minimum - to accommodate the extremely low- and (possibly) the extremely high-entry students, and one more added at the end of Term One for ambitious upper level students to challenge into. (My note: This would require the development of curricula for two or more additional levels for the first and last module of the year. Also, any of these changes would be rather dependent on the number of students. There would have to be a minimum of one class of students at each level in order for this to be a viable option).
While most teachers felt that the 'challenge' option was positive, one warned that in some cases it can be very damaging; students who skip a level can miss out on important information, developing incomplete schemata. [My note: In order to 'challenge' (or skip) a level, students must receive a 95% final mark in their present level, score 80% on the final exam of the level they wish to skip, and get their teacher's recommendation.] One of the biggest complaints was that the modules are too short. For one, teachers complained that it is very difficult to complete a full evaluation procedure with summative exams and reports in such a short time (some modules are only separated by three days). A teacher allowed that the modular system ensures that teachers keep 'a tight ship', but feared that "sometimes the ship is 'too tight!'" and went on to complain that "continual evaluation takes time from teaching." [My note - by 'teaching,' I surmise the interviewee meant instructing the class, as evaluation (particularly the 'formative' testing used in mastery learning) is certainly a function of teaching and is generally assumed to have pedagogical value]. Another complaint many noted was that the modules are not long enough for teachers to adequately cover (or for students to synthesize) all the objectives in each level, especially considering how many progress (formative) tests must be given in the class time allotted. There isn't enough time for "experimentation, individualization, enrichment, and creative activities!" . . . and another: "No room to manoeuvre, no room for creativity or teacher strengths . . . No time!" Various teachers recommended deleting objectives, particularly in grammar and composition, such as recommending less sophisticated rhetorical foci in lower levels and "less grammar - period!" - this was followed by the comment that "All students, even especially-low-level students, should have at least one (some?) content courses . . .
[and in an only half-joking vein:] If grammar has failed them for so long, why not try something else . . . electricity? carpentry?" This leads appropriately into the next topic.

Content Issues: Content issues were raised as well. Many teachers felt that listening and speaking classes should be allotted more time per week and that pronunciation instruction was being neglected. Others felt that vocabulary wasn't being specifically targeted; for example, some students leave the institution "without knowing the numbers, months, or days of the week." Various teachers wanted to add more 'content' courses (electives) to the curriculum (my note: presumably this would entail changing the current elective requirements and/or minimal criteria for taking electives). One found the grammar programme "boring - students have already had six years of grammar . . . they should be ready to have it applied in another way. There are good programmes out . . . we haven't looked far enough." One claimed that upper level students should be given more challenge at the beginning of the year, not the 'review' they are given now (a review, nevertheless, as several teachers noted, of what they've often been inadequately taught and which they have never been asked to apply in contextual, genuine, extemporaneous, oral/aural or interactive situations). Various individuals noted the need for a more interesting reading text, a better language lab, more discussion groups with Canadians, more field trips, more 'interactive activities' in general, longer classes in computer studies, and 'values clarification' and 'intercultural competency' taught across the curriculum. Finally, one teacher criticized the lack of a Curriculum Head for the electives offered in Transition, a further tribute to the effectiveness of Curriculum Heads.
Miscellaneous Doubts: Some doubt and ambiguity came to light, such as "our purported coherence and sequentiality are in many ways only apparent; they are insubstantial, focussed only on how the institution appears to students." Placement and testing are areas of great concern: "We claim to and appear to be competently placing and advancing students into appropriate levels, but are we?" Several teachers would like to have more input into initial placement of students (such as administering a speaking test and seeing a short writing sample before students are initially placed), and for students to be able to begin in different levels in different subjects according to their specific abilities in each area (a 'finer-tuning' of our present practice in which each student starts all his/her Foundation classes at the same level). One teacher criticized some tests as having poor questions: "Tests need to be analyzed item by item." Another subject questioned how objective teacher-produced tests really are. S/he worried that scaling and adjusting marks was a sign of 'fudging' and 'dishonesty.' Derogatory words like 'bogus,' 'arbitrary,' 'not legitimate,' 'incompetent,' and 'inflated' came up many times, especially when teachers were discussing teacher-developed tests and standards. (My note - it struck me that perhaps these teachers had too much faith in 'professionals' and too little in themselves; they didn't consider that commercial standardization also takes time and that even commercially-made standardized tests are regularly scaled). However, one teacher, while not denying that scaling goes on, suggested that "no matter what the standard was, we would still have 'borderline' cases and some scaling." One teacher suggested that final exams should be the only ones standardized; progress (formative) tests should be made by individual teachers.
Examination practices advocated by individual teachers included more one-on-one interview-testing of students' actual 'communicative competence' (especially in listening/speaking and grammar) and a "comprehensive exam at the end of Foundation covering Levels One through Four - a more holistic measure, not simply looking at discrete skills - to keep students out of Transition that don't really belong there" (My note - The implication here is that the Level Four exit standards are too low, allowing students into Transition who are unable to succeed at that level). Another wasn't happy with the way the 'core body' of (Levels One to Four) skills was defined, and suggested that this area be re-examined. Another, concerned that upper-level students aren't getting enough intellectual stimulation, suggested that electives in the final two modules be leveled, to enable teachers to present more highly-challenging content to these students. This teacher recommended that some Reading electives such as Anne of Green Gables might be best 'reserved' for these higher level students as well.

Another wasn't sure if the present programme is any more successful for teaching English than the previous one, claiming, "Nobody knows if it is more successful . . ." In the same vein, a teacher wondered, "How much of our current success is due to 'student-at-risk' protocol (my note: tracking student progress, regular meetings among teachers and interviews with students regarding 'at risk' status) and the Learning Resource Centre (free tutoring service), and how much is due to the modular, leveled programme itself? It's hard to determine which factor is helping students more."

The Present Programme: Teachers' Perceptions of Students' Responses to it

In general, teachers said that students seem pretty happy with the curriculum.
Some aren't content with the speed with which they are learning English, nor with what they consider to be unnecessary review of high school grammar in the beginning of the year. Dropout rates have plummeted since the first year (though Year Three was very low as well), and exit surveys show students have a high degree of respect for the programme. The high-entry students generally like the programme because they perceive Foundation as a challenging but short route to the more interesting Transition courses. They seem happy in their own 'prestigious' group. As one teacher noted, "They need to work with successful peers. They don't want to be 'teachers' (peer tutors, i.e., in a multileveled class); they want to be learners." However, some see the Foundation courses as too easy - the grammar is perceived as 'the same' as what they learned in junior and senior high, though, noted several teachers, most of them can't see their own weakness: that they have only learned how to pass grammar exams, not how to use good grammar in their writing or speech. Several teachers said students would like another level into which the most ambitious and hard-working students in the highest level could challenge at about mid-year. The lowest of the low-entry students, much as in the previous programme, seem to have to work pretty hard to succeed. However, they appreciate being able to work at their own speed, repeating a level if necessary, but also having the opportunity to 'challenge.' Two teachers noted some concern over their lower social status, though others also noted many of these students used other opportunities to gain status through sports, music, etc. In fact, some of these students may be in this category because of their 'other agendas.' A few teachers have heard more than one low-level student grumbling that some courses are too difficult, especially Foundation Listening/Speaking and Reading.
One teacher felt that these students are "'plugged through' a lockstep system which doesn't adequately address their learning problems;" as a result, they are "frustrated with their learning experience, which is actually very Japanese in its way of testing and sorting students." However, another teacher, while acknowledging that these students "find it very difficult to move at the pace we've established," claims that, "Most are happy to use all the extra help and personal attention we've provided (learning resource centre, tutorials, etc.). They perceive a lot of extra effort is being put forth on their behalf." The consensus among teachers would seem to be that most low-entry students perceive the present programme to be challenging but satisfying.

5.2.5 Qualitative Research Results: Teachers Compare Present Versus Previous Programme Evaluation

First, I asked teachers what they perceived as the evaluation needs of various entities (information about students from the first-year teachers needed by the second-year campus, by the first-year administrators, by students themselves, and by their parents) and how well each programme met their needs. Here are their answers.

Second-Year Campus Needs: Teachers noted that the second-year teachers expect us not only to prepare students for the next year, but to 'sort' students for them. They want a general sense of students' oral and written language competencies, of their ability to research and to meet Canadian classroom expectations (e.g., are they 'active' learners?), general social skills, and of any notable character traits. Teachers noted that the second-year teachers expect consistency and standardization from us in our evaluation practices, but some noted that they were unsure if the second-year teachers agree on minimum criteria for their programme; another said it would be "very helpful if lots more faculty [from the second year programme] could visit here."
The consensus was that we are doing a very good job now in providing them with information they requested, but that the inconsistency of evaluation practices in the previous programme had made transcripts they received from us worthless. However, one teacher proposed that the second year programme should develop and have us administer a standardized 'Exit Year One/Entrance Year Two' exam which would more accurately reflect what they were to be doing in the second year. (My note - All students 'advance' to the second year campus; however, some of them go into an alternative programme if they are judged ill-equipped language-wise to handle the regular programme; generally, students who do not complete Level Four by the end of the first year go into this programme.)

First-Year Faculty Needs: Teachers felt the evaluation information needed at this institution for placement and advancement within the first year programme is, as well, pretty much what we are giving ourselves now: Since 1991 we have used SLEP for placing students in levels initially. We give marks five times a year and supplementary anecdotal comments at least twice a year (end of modules two and four) and more if students are having difficulties. With this information, students advance within the levels in what teachers judge is a fairly satisfactory manner. One teacher, however, would like to see initial placement improved with a formal 'test of motivation' administered in Japanese. (My note: Motivation is informally tested in the entrance evaluation procedure in Japan. It is unclear if a valid test of motivation exists in any language). There is also a concern that too much of our testing is written; several advocated more oral testing based on 'genuine communicative competence.' Some teachers mentioned the need for greater standardization among writing and speaking tests, and in electives.
Again, compared to the previous programme, teachers felt we have significantly improved our placement and advancement evaluation practices.

Students' Needs: What do students need to know about their academic achievement and progress? Teachers thought that their particular needs were for frequent (weekly was often recommended), clear feedback on how they are meeting specific course objectives - formative information, in other words. They need to know if they are in danger of failing, and how they can improve. They need to know how the evaluation system 'works' and that it is not biased. One teacher felt we gave too much evaluation to students; others recommended more self-evaluation and 'communicative' testing. One spoke of the motivating effect of evaluation: "There is a fine line between criticism and encouragement . . . we should set a standard high enough that they have to reach." Feeling was mixed somewhat here, as some teachers thought that students really don't care all that much about evaluation: all they want to know is if they are going to pass and graduate.

Parents' Needs: To teachers, parents were assumed to want to know basic information such as if their child is having severe academic, life-style, social, or health problems; they need warning that their child may not pass a level, whenever possible. They definitely need warning if their child may be placed in the alternative programme for the second year. They need to have a clear understanding of what criteria are being used to make decisions. Other than that, they want enough information to feel secure that they've "turned their child over to an institution that will take good care of him/her - because they're so far away." Culturally, said one, they "can accept poor behaviour as a reason for failure, but not 'inability to learn.'" (My note: effort, consequently, is emphasized far more in our anecdotal comments than ability).
They need to have anecdotal comments translated by the Japanese staff. The consensus was that over time, we have improved a lot in reporting to parents, but there are still some problems. To say your son/daughter has 'successfully completed' or 'successfully mastered' something is a little inaccurate (and redundant) in English - how does it translate into Japanese? "This term - successfully completed," said one teacher, "doesn't address underlying issues of communicative competence and personal growth."

Fairness: Next, I asked some questions about whether evaluation methods were fair or perceived as fair, and then how the evaluation methods used affected students in each of the two programmes - specifically, whether the methods motivated or discouraged students - students in general, the high-level students, and the low-level students. The answers, and the reasons teachers gave for them, were rather interesting. Predictably, nine out of thirteen teachers clearly rated the previous programme as unfair in that lots of students complained about inconsistencies from teacher to teacher, and class to class. Students, remembered several teachers, felt they were graded quite subjectively, and often didn't understand teachers' evaluation methods. Of two 'maybe/not sure' answers, one remembered students as realizing and accepting that some teachers graded more strictly than others, that the system was imperfect but 'not bad all in all.' Another pointed out the internal consistency of each teacher - that each teacher evaluated in a fair manner according to her/his own individual criteria, objectives, and tests. One also noted that "all students were part of 'the same system' and equally subject to its whims," (in my opinion, a convoluted form of fairness!). Of the two who felt the previous programme evaluated fairly, two didn't remember any complaints. One said, "Students only cared if they passed and got a diploma. If so, they felt it was fair."
In another part of the interview, one teacher also pointed out that the evaluation methods used in the initial year, in particular, were very close to those found on most North American university campuses, where the professor is given a large measure of autonomy. However, the general verdict was 'Unfair, and perceived as such.'

Also, quite predictably, ten out of thirteen teachers felt the present programme is both fair and is perceived as fair. They cite well-stated goals and objectives, and standardization as the basis, though note that in addition "team meetings help build consensus [and hence, greater consistency] about marking." One teacher observed: "Errors are usually that students who shouldn't, do move up (pass), not the other way around." Critics, however (those who saw bad points as well as good), saw some inconsistencies, such as in how items on the same test are marked, or the amount of time one class versus another might be allotted to spend on a progress test. They did note, though, that these were pretty minor compared to student complaints about the previous programme. Several said that complaints about inconsistent evaluation were far more common, understandably, in the least standardized classes, experiential studies and the electives. One teacher said that the present evaluation methods are unfair because too much of the final grade is based on 'testing.' Others, however, thought present methods are unfair because not enough of the final grade is based on the final exam: too much of the grade is based on doing homework, going to conversation groups, and taking progress tests. (This is clearly not an area of teacher agreement!)

Motivation vs. Discouragement: When asked whether evaluation practices motivated or discouraged students, teachers said that in general the present programme's practices are far more motivating to students, though slightly less motivating for the lower-entry than for the higher-entry students.
The teachers gave very mixed reviews to the previous programme; they tended to say it discouraged more than it motivated, but many were unsure or said that it had done both or neither, or that they didn't remember. They also, however, gave the previous programme a slightly higher rating for motivating higher-entry than lower-entry students. A reason one teacher gave for students in general being unmotivated in the previous programme was that there were "tremendous 'lulls' in the middle of the first and second terms which allowed students to become lazy for longer - it didn't 'keep them on their toes' . . . feedback, even when it is negative, can be encouraging." The present modular system, on the other hand, provides a module-end exam during what used to be mid-term One and mid-term Two. Some previous-programme students, a teacher said, also suspected that "grades were probably inflated and didn't reflect achievement." High-entry students, some claimed, were motivated because they were quite happy and challenged, and not as apathetic as lower levels. However, others said they were discouraged because of inconsistencies in grading and because of a sense that "they had 'arrived' and had no motivation to knock themselves out." Many dropped out, complaining, noted their teachers, that standards weren't very high and that "they wanted their excellence recognized." Previous-programme low-entry students "had more problems with the curriculum; whether this was motivating or discouraging depended on the student." One said there was lots of apathy in lower levels; another referred to a sense that they (the low-entry students) would always be 'at the bottom.' In multileveled courses they knew that "they'd get low marks anyway, no matter what they did . . . (and) that if you failed, it didn't matter." (My note: Students apparently perceived few, if any, consequences should they not meet course expectations.)
General reasons given for the present programme's motivating students included desire to move up to the next level (there are now more levels than previously, and students have four opportunities to move up a level, plus chances to 'challenge'). Students who want to take more 'interesting' courses (e.g. electives) realize they must go through Level Four first. 'Status' was cited as another motive for advancement. Some teachers said the present programme is more motivating because it is easier; some said it is more motivating because it is more difficult; some said it is less motivating because it is easier. (Obviously this is quite a subjective area as well!) One said in the first part of the year the high-entry students are motivated, but s/he is not sure if this continues into the latter part because there are no levels for them to challenge into. Lower-entry students in the present programme were characterized as "motivated - they work their tails off!" by one teacher, but another said, "while those moderately low are encouraged because they know they can 'do it' with hard work, those very low are discouraged." Another teacher claimed the low-entry students are "terrorized, not motivated, and fear does not promote language learning." An interesting point was a conjecture that "middle-entry students are less motivated, perhaps, than the higher or lower students because (unlike the lowest levels) they can 'fall behind' and . . . (unlike the highest levels) it's not such a 'fall from grace.'" Another interesting question was posed: "Does the present programme motivate the lower-level students negatively (through avoidance of failure) or positively (through attraction to success)?"

A Balance Sheet

When asked how well the present programme addresses the weaknesses of the previous programme, or whether they would like to return to the previous programme, most teachers reacted strongly in favour of the present programme.
They said the present programme addresses lack of standardization: "There is a comfortable balance between standardization and creativity . . . scope and sequence enables us to know what students have been taught so we don't have to start at 'square one' all the time." It also addresses students' lack of basic English skills: "We're acknowledging that ESL is an important component of the first year. More students are participating actively; there's more discussion and less lecture-style." Finally, it addresses the need for content-learning: "Content is being taught, but in a much more logical way . . . the programme clearly defines the parameters of language and content so that teachers and students know better what to expect from course to course." One teacher noted that students seem to take on more responsibility for their own learning progress when they see it laid out so clearly. Another claimed the materials are more respecting of students' maturity (no more infantile reading materials) and diverse ability levels. Also, students are given more chances (to succeed, or to repeat levels without penalty) than before.

However, the present programme is seen as flawed as well. As one teacher noted, "in the process of addressing the previous programme's weaknesses, we also created some new problems and needs." (Examples of these follow.) Another noted that there is a difference between 'addressing skills' and demonstrably improving them. This subject isn't really sure if or how students' skills, especially grammar, have improved: "Some Level Five students still cannot write a paragraph." Another questioned if the very high-entry-level students are being truly challenged, noting that "we have a greater spread of ability levels and rate of learning than we account for or admit." This is related to another comment that, "Our clientele may have changed somewhat" (since initiation of the present programme). "Would I want to return to the previous programme? . .
. not a chance . . . not even in my dreams . . . definitely not." Ten out of thirteen teachers were adamant on this, but one wanted to return:

    I would like to get back to three terms rather than five modules, I'd like to explore different ways of evaluating students that aren't so fear-producing, and I'd like to see all students getting into content courses earlier than they are now;

one wanted to combine the two:

    Ditch the module system, put less weight on the final exam, give teachers a little more flexibility, and include more 'content' learning (but not 'V.-ism!'); however, retain some of the present skill expectations;

and one was nostalgic for certain aspects of the past:

    I don't miss the lack of direction and goals. (V.) structures are great for organizing, but you need goals. However, we had some wonderful themes that we've lost - deemed 'unreachable' for our students, and perhaps we went overboard, simplifying too much. Perhaps the 80% pass standard caused us to be a little too 'bare-bones', boring, and simplistic.

5.2.6  Qualitative Research Results: Proposed Changes

This leads us to the question of what changes teachers would like to make in the present programme. When trying to visualize an improved programme, the areas they most frequently cited are listed below in the same order in which teachers prioritized them. For possible justice implications, see Appendix 8.4.

1)  Avoiding using the term 'mastery' incorrectly: Most teachers object to the term 'mastery', which creates unrealistic expectations and claims. 'Mastery' implies control of a skill or comprehensive knowledge of a subject, neither of which, they claim, our students can realistically be said to have attained by achieving 80% on a fairly simple exam at the end of a five-to-six week module.
A proposal was made by one teacher which would enable students to truly 'master' the subjects, enabling students to progress 'in their own time' as the proponents of mastery learning advocate. This was the idea of 'continuous intake' of students and allowing students to take more than one year to complete the 'entry' programme. (My note - Without this freedom to take as long as necessary to master the objectives, we are following a mastery system 'imperfectly' - even, perhaps, as many teachers noted, 'dishonestly'.) One alternative to this rather drastic step is to use another term for what we do. 'Competency-based' learning is sometimes used to denote evaluation according to how well a student has achieved course objectives (instead of according to how they stand in relation to other class members on a 'normal' curve). However, most teachers were unfamiliar with the term. One said it "more accurately describes what we're doing" while another said that "though we attempt to do it to some degree, it is not an accurate description." Another said it's good because it implies an 'application' of skill: "I can do it!"

2)  Lowering the 80% pass standard for Foundation: Interestingly, several teachers' rationales for lowering the pass standard were similar to those used for creating it: consistency and raising standards. They would like to see a consistent pass standard (60%, perhaps) used in both Foundation and Transition, and they feel the 80% standard, instead of motivating students to achieve a high standard, has resulted in teachers actually lowering their standards by over-simplifying exam questions and 'teaching to the test' to enable 'an acceptable' (i.e., acceptable to the administration) number of students to pass instead of addressing actual student needs. They also feel the 'double standard' (separate grading scales for Foundation and Transition; see the GPA table in Appendix 8.3) is confusing to students and parents as well.
3)  More opportunities for listening, speaking, instruction in pronunciation, and (structured) interaction with Canadians: Particularly advocated were longer Listening/Speaking classes, possibly with specific pronunciation and grammar skills built into the course objectives; this is related to the next suggestion . . .

4)  Combining listening/speaking class with grammar class and combining experiential studies class with the study/presentation skills class: This is partially in answer to the need many teachers have expressed to decrease the number of classes and increase the number of hours for oral and interactive activities. The major reasoning here, however, is that these are pairs of related subjects which should be integrated for their mutual enhancement, to reinforce skills taught in one that are directly applicable to the other. Advocates especially wanted verb tenses and articles to be in listening/speaking (instead of writing) class in order to make grammar more contextual and less like it was taught (often unsuccessfully) in Japan.

5)  Decreasing the number of objectives for each level (particularly grammar), having fewer and longer modules, and increasing the number of levels: These are all related to the same problem of having too many objectives to teach properly in one five-to-six week module.

6)  More challenge for upper level students: Teachers advocated more challenge for upper level students such as adding an additional level for them to challenge into at mid-year, the opportunity to audit university-level courses, a year-long voluntary Honours Seminar (noted on the transcript in some way), the option to take more than one elective course in a term, and, possibly, leveled electives.

5.2.7  Summary: Back to the Future?

Teachers often waxed philosophical towards the end of the interview. The theme of 'getting back to what we have lost' seemed to surface for a lot of teachers as we neared the end.
When asked if s/he had anything to add, one teacher said, "It was interesting - it made me think, especially about the past," and another said, "I realized doing this interview how much I favour, support, and enjoy content learning and how much students benefit from it." One teacher claims, "I don't believe Japanese students are nearly as committed to sameness for everyone as we have presumed that they are. I believe we can be more creative and do more with students than modularize them 100%." Another questions whether teachers who are 'jacks of all trades' (rather than content specialists) are really what students want, noting that a truly fine school "should aim for a team of specialists." This same teacher wants "more freedom to teach outside a team at the upper levels." Another says:

    I would like to get into more depth, content, and academic material. Students want new information: we should bring more research and issues to them rather than just slipping along with a few almost 'stereotypical' assumptions . . . people aren't interested in reading about what they already know. They lose the spark of motivation. In our 'compromise,' perhaps we swung a little away from the junior college and a little too much towards the junior high school in terms of content!

Thinking about the past and the future opens up the theme of change - of our students, of ourselves, of our world. Several teachers noted that, "We should never rest and be totally complacent. As our clientele and (their) employment requirements change, we must keep our eyes open and change to meet their needs," and "It's not just a matter of the curriculum 'then and now;' we - teachers and the college - have changed and matured, too." Another noted that the present programme "isn't written in stone; since we've instituted it, we've actually made substantial changes such as initiating and standardizing 'progress' tests (my note - as compared to more summative final exams). . .
and generally (and incrementally) improving most of our individual courses." Others talked about how working at the college has deeply affected them personally and emotionally, in both positive and negative ways, such as one teacher: "We should focus on actual student learning rather than how we appear to others. [On the other hand], it is important for students and parents, especially of four-year students, to keep 'a core' of very good, contented, committed, trusting and trusted faculty who feel they have a personal stake in the institution's success - this is Japanese style." Another teacher's bottom line was that compared to the past, "life is a lot easier as a teacher now!" This contrasted with the thoughts of a teacher whom contentment has clearly eluded:

    I think the teamwork we have been put through has been necessary as we have broken new ground in a teaching area for which we had no role model, and that it has trained and helped both us as teachers and the programme in general. However, it has also been difficult. It is difficult to create curriculum for teachers with different teaching styles and requirements; it is equally difficult (no, more difficult) to follow another's half-completed or experimental curriculum, especially when that teacher has a different style. Having been through that treadmill, I believe it would be good now to try something free-er. Also, I do believe that our system of teaching - so many hours, so many preps, so many meetings, so many tests, so much curriculum planning, teaching new areas, teaching in areas that are not one's strengths - does lead to burnout and maybe not doing one's best. I would like to see more time, and encouragement to present, publish, and generally take part in the wider world of teaching - all requiring time (which is nibbled away by the factors mentioned above) - but this is professional development.

Another echoes these concerns with, "This is not a system for encouraging people to grow.
There are structural weaknesses." As if conversing with an unseen comrade, another teacher contributes an additional perspective to the same phenomena:

    Are these good questions? Yes, but perhaps we should be asking how the resources have changed and been developed. I was hired to develop a computer course ten days before the first 280 students arrived. The first keyboarding programme was Shareware, the computers hadn't arrived, and the computer courses were taught from 6:00 - 10:00 pm. In addition, there was only a half-hour lunch, plus a half-hour faculty meeting every day! I would like to express a sense of gratitude to the dedication of my fellow faculty members, for their high expectations and high achievement. From Day One and continuing to now, never has there been ample time to get what needs to be done, accomplished. Faculty have developed curriculum with minimum hours, taken on leadership with no recognition, jumped in wherever a need was perceived, and evidenced tremendous teamwork consistently.

With all of its weaknesses, when asked how the present programme is perceived by most teachers, the answer was usually a mildly qualified 'good;' for example: "A big improvement - but not perfect!" or "O.K., but could stand improvement," or "Generally - a fair degree of confidence that 'it works' for most students at a pace that helps them develop," but some said feedback from students is "Mixed: some seem happy; others are wondering why students aren't happy and want to re-assess it." Another ventured that "There is general dissatisfaction that it might not be doing what we thought it would . . . administrators like it, but teachers are getting disillusioned."
One teacher summarized it thus:

    Teachers can't do many creative things (in 'Foundation'), but realize when they put the whole programme into perspective (including classes in computers, presentation/study skills, experiential studies, Transition and elective courses plus a cross-cultural 'survey' course taught in Japanese) the students' needs are being quite well-met. All in all, it's meeting students' needs and preparing them to get successfully into content issues. One strength of our faculty is that we are flexible and always looking for ways to improve.

I think in the final sentence above this teacher has defined our two common denominators, flexibility and the desire to improve. This is probably one of the few possible statements with which I am convinced every teacher at the college could agree.

6.0  CONCLUSIONS

6.1  Quantitative Research

6.1.1  Site-Specific Conclusions of the Quantitative Research

Within the parameters of the quantitative study, no consistently significant differences were demonstrated between the two programmes. In other words, no difference reached significance for more than one matched-pair year, contributing to the impression that the isolated significant results may be anomalies, due possibly to factors other than the difference between the two programmes. Arguably, the most important weakness of this research is that it did not include other important skills that are taught at the college such as speaking or composition. Without this information, the results are somewhat narrow in implication.
6.1.2  Site-Specific Justice Implications of the Quantitative Research

As noted previously, we will consider learning outcomes as a special form of potentially distributable 'resources' like self-esteem, because they are indicative of what knowledge the individual has incorporated through an educational experience, and because they (both the knowledge itself and its demonstration through testing) help to determine an individual's future access to other resources.

The first question is whether, according to John Rawls' criteria and the quantitative data alone, learning outcomes (mean increase in SLEP Listening and Reading scores) in the previous programme were justly distributed among students. The distribution was just in that it was (1) of benefit to the group as a whole (at least as much as the present programme), and (2) of benefit to the least advantaged subgroup, in that those with lower entry SLEP scores made the greatest gains (not significantly different, however, from the present programme) of the cohort group.

According to the same criteria and data, the present programme would also be considered just. That is, it was (1) of benefit (or not significantly worse, at any rate) to the group as a whole, and (2) of benefit to the least advantaged subgroup - again, those with lower entry SLEP scores made the greatest gains of their cohort group.

One question from a justice point of view is: Is there any way of reversing the negative trend in mean listening SLEP gain for the upper-entry students without affecting the gains being made by the lower-entry students? SLEP gain is not a scarce resource in that one group's gain is not necessarily another's loss. Therefore, this should be possible. Note that this trend may be reversing anyway, as the difference between the two programmes has been decreasing each year.
It is possible that if the programme for upper-entry students were changed to give them significantly greater enrichment and challenge, this might have detrimental effects on the lower-entry students such as fewer educational resources (i.e., teacher time and materials), including decreased self-esteem. These justice implications will be explored more fully when qualitative data are added to the discussion. (See also Appendix 8.4.)

6.1.3  Conclusions about the Kind of Information Gained from this Kind of Quantitative Research

This kind of research can give very specific answers to very specific questions. It is not concerned with questions of equality of access to education by different ability groups (or even to 'knowledge'), but of how learning outcomes are actually distributed among different ability groups by different programmes. It assumes that the difference between achievement on a pretest and a post-test is a function of the distribution of learning outcomes (which may, arguably, be indicative of access to knowledge itself or to education). It also assumes that in a matched-pair group of students from two different programmes (starting with the same pretest scores), any difference in their mean post-test scores is a function of the difference between the two programmes. Given those assumptions, the research can give:

1)  general information about differences in student achievement in the two programmes.

2)  specific information about differences in the achievement of students in different ability groups in the two programmes.

3)  accurate estimates of the significance of differences found in achievement in the two programmes.

4)  (if comparing more than two years) evidence of patterns which can further confirm the significance of the above differences.
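For teachers who want to try this kind of comparison as action research, the matched-pair logic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used in this study: the function names (`match_pairs`, `paired_t`), the exact-score matching rule, and the scores themselves are all hypothetical, and it assumes each student record is simply a (pretest, post-test) pair from the same test.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def match_pairs(previous, present):
    """Pair students from two programmes who share the same pretest score.

    Each argument is a list of (pretest, posttest) tuples. Returns a list of
    (gain_previous, gain_present) tuples, one per matched pair. Students
    with no partner at the same pretest score are left out.
    """
    gains_by_pretest = defaultdict(list)
    for pre, post in present:
        gains_by_pretest[pre].append(post - pre)

    pairs = []
    for pre, post in previous:
        candidates = gains_by_pretest.get(pre)
        if candidates:  # a present-programme student with the same pretest
            pairs.append((post - pre, candidates.pop(0)))
    return pairs


def paired_t(pairs):
    """Return (t statistic, degrees of freedom) for the paired gain differences.

    Needs at least two pairs whose differences are not all identical.
    """
    diffs = [present_gain - previous_gain for previous_gain, present_gain in pairs]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Hypothetical (pretest, post-test) scores, invented for illustration only.
previous = [(30, 38), (30, 36), (42, 47), (42, 45), (55, 57)]
present = [(30, 40), (30, 39), (42, 46), (55, 56), (60, 62)]

pairs = match_pairs(previous, present)
t_stat, df = paired_t(pairs)
print(pairs)             # matched (previous gain, present gain) pairs
print(round(t_stat, 3), df)
```

The t statistic would then be looked up against a t table (or passed to a statistics package) at the given degrees of freedom to judge significance. With only a handful of matched pairs, as here, a difference would have to be very large to count as significant, which is one reason larger groups are preferable.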
While some very specific useful information can be gained, there are some important limitations that must be understood by researchers using post-hoc testing and a matched-pair design to compare students' mean improvement in two different programmes. For example:

1)  You need pre- and post-test scores for the same test, given during the same time of year (if comparing full-year programmes), under the same conditions. It needn't be a commercial standardized test. It could be made by a single teacher (or group of teachers) for her/his/their own action research, comparing different teaching methods or materials with different classes.

2)  You cannot extrapolate that because a particular programme demonstrably improves Skill 'A' (e.g., Listening) that it also improves Skill 'B' (e.g., Speaking). You would need separate tests to demonstrate this. Therefore . . .

3)  The greater the variety of skills being tested, the more generalizable are your results to the entire programme; the fewer the skills, the less generalizable are your results.

4)  The more years you can compare with the baseline year, the better.

5)  The baseline year is very important: The fewer anomalous conditions that year which can get confused with the programme effects, the better.

6)  The fewer the changes in factors other than programme changes (e.g., student or teacher characteristics) among the baseline and other comparison years, the better.

7)  The larger the groups you have from which to draw the matched pairs, the more likely you are to have large enough matched-groups to give you significant results.

8)  You won't be able to tell which aspects of a programme are responsible for any observed differences, especially if the skills being measured are those general skills which (like 'listening' and 'reading') are developed and reinforced by many different aspects of the programme.

In summary, the strength and the weakness of quantitative data is its specificity.
It cannot answer a lot of questions, but it can answer a specific question or set of questions quite well. In fact, it can even estimate how likely a difference as large as the one observed would be if chance alone were at work.

I have one more piece of advice, particularly for teachers who might want to do this as action research and are uncomfortable with statistics. They can test for significance and make graphs with one of the new computer statistics programmes (for Windows or Mac) made especially for the social sciences or education. These are relatively inexpensive and easy to learn and use, without an extensive knowledge of the mathematics. Be sure to input your data into these from the very beginning; importing data from a spreadsheet into one of these programmes can be very challenging, as I will attest.

6.2  Qualitative Research

6.2.1  Site-Specific Conclusions of the Qualitative Research

Here are the most important findings of the qualitative research: While the final judgment was 'mixed', teachers generally characterized the previous programme as interesting, exciting, and ambitious in intent, but lacking consistency and standards, boring to some high-entry students, marginally accessible to many lower-entry students, and damaging to some of the latter students' self-esteem. Teachers (and, by conjecture, students) find the present programme generally (though not 100%) effective in its distribution of educational resources to students. Most teachers felt it teaches language skills very effectively, but perhaps it doesn't teach enough content. Most teachers indicated they were ready to consider making some major changes in the present programme such as changing the 80% mastery standard or giving more hours to the teaching and practice of oral/aural skills.
Table 5
Summary of Student Responses to Programmes (Remembered by Teachers), by Entry Level

               PREVIOUS PROGRAMME                    PRESENT PROGRAMME

In general:    feeling upset by lack of standards    trying to 'level up' or challenge
               and consistency                       - motivated!

Level 1        either 'having fun' or                most working very hard, but a few
               demoralized by 'track' system         demoralized by how difficult it is
               and 'lost' in difficult content       to achieve 80% pass

Level 2        either lost or challenged -           system is basically good, but
               depending on teacher, individual      sometimes boring

Level 3        either challenged or 'coasting'       many challenged, a few 'coasting',
               some bored, many upset by             a few bored because there is
               lack of direction                     nowhere to challenge to

6.2.2  Site-Specific Justice Implications of the Qualitative Research

According to the qualitative data alone, the previous programme was unjust because it lacked standards, was delivered inconsistently, and was somewhat inaccessible to the lower-entry students. Inequalities that occurred were generally seen to benefit no one, and lower-entry students were especially noted as suffering from curricular practices; the programme did not provide them, in other words, with the educational resources they needed to take advantage of the knowledge that was being presented or offered to them. Tracking practices, inaccessibility of the curriculum, and the perception of inflated grades for the lower tracks all could have resulted in decreased motivation and decreased self-esteem for the lower-entry students. Also, lack of consistency in what was taught from class to class and from teacher to teacher meant that, in general, resources were not being distributed equitably.
Teachers seemed to accept that standardization and consistency can be carried to an extreme, bypassing teacher professionalism and stifling creativity, resulting in widespread boredom and the elevation of mediocrity to the norm. However, they generally felt that in the first programme, there was so little standardization and consistency that many students were actually being treated inequitably. This kind of unfairness is described by John Rawls as failure to meet the criteria of the concept of equality.

    The first of three levels of the concept of equality applies to the administration of institutions as public systems of rules. In this case equality is essentially justice as regularity. It implies the impartial application and consistent interpretation of rules according to such precepts as to treat similar cases similarly (my emphasis) (504).

While the inconsistency in evaluation practices and in what was taught from teacher to teacher and class to class could have adversely affected students from all entry-levels, all these practices considered together probably resulted in an unjust distribution of resources in the previous programme favouring the upper-entry students.

According to the qualitative data alone, the present programme justly distributes educational resources to students. This is because teachers perceive that (1) it is of benefit to the group as a whole, (2) it is of benefit to the least advantaged subgroup (the lower-entry students), and (3) the unequal distribution of resources (in this case extra tutoring, opportunity to repeat levels without penalty, and the balance between skills-based versus content learning) is to the benefit of the least advantaged subgroup.
In other words, while those teachers interviewed tended to question whether upper-entry students in the present programme are getting enough challenge (particularly towards the latter part of the year), they are convinced that the lower-entry students need the skills-based approach. The programme is considered generally more just for the entire group, as well, because of its consistent standards (from class to class, teacher to teacher, and level to level) and because of its leveling practices which, given the ease and regularity with which students move on to upper levels, were felt to be less destructive of self-esteem than previous tracking practices. Therefore, the programme was thought to be generally just, but perhaps slightly in favour of the lower-entry students because of a possible decrease in challenge and interesting content offered to upper-entry students. Again, we ask ourselves if it might not be possible to retain the positive effects on the lower-entry students while increasing the challenge and giving more interesting content to the upper-entry students. However, let us postpone this discussion until we amalgamate these data with the quantitative data.

6.2.3  Conclusions about the Kind of Information Gained from this Kind of Qualitative Research

The aim of these interviews was to examine from an instructor's perspective: (1) how effectively the previous and present programmes have distributed learning resources to students, (2) how justly each has distributed learning resources to students, and (3) what teachers perceived as student impressions of (1) and (2). The results to all three were to be in terms of 'students in general,' 'upper-entry students,' and 'lower-entry students.' While the interviews were 'open-ended' in many ways, I was also looking for (and asked for) some very specific information from all teachers, knowing that a lot of the answers would overlap with others.
Yet, I found the variety and scope of the answers was much greater than I had anticipated.

In addition to the specific information I wanted, I got these additional kinds of information:

a)  historical context - of events, of thoughts, and of feelings

b)  causality and order of events

c)  reasons for curricular decisions

d)  'behind the scenes' actions, including evasive and compensatory ones

e)  opinions that are/were controversial, especially from people who generally avoid public controversy - the normally-outspoken people tended to be rather low-key in these interviews

f)  'the flavour' of the past - and of 'behind the classroom/office door' in the present as well

g)  unresolved contradictions and unanswered questions - a very large portion of the information was in this category

h)  guesses and conjecture, though interviewees generally labeled these as such, I surmise

Were the questions answered clearly; are they neatly packaged and graphed? No, this was qualitative research: "You can't always get what you want, but you'll surely get more than you need . . . " I had intended for this to be a 'fishing expedition' with open-ended questions the bait, intending to 'catch' the unexpected but valuable new idea or insight, allowing it to 'surface' so that others might encounter it as well. Just because an idea is good doesn't mean everyone has encountered it; just because everyone has an idea doesn't mean it's the best. However, teachers (especially before the interview) tended to look upon what I was doing as a poll - that the view that was stated the most often 'won.' Several teachers wanted to participate, but hadn't taught in the previous programme. When told they weren't eligible to participate, two of the teachers expressed a feeling of disenfranchisement.

In reality, I ended up putting almost every idea, even if it was stated only once, into the 'results' section.
This was because I didn't think my role was to be the judge of the worth of particular ideas at this point in the project, and because part of the value of this kind of research is in seeing the range of opinion within the subject group. However, if I saw an inconsistency, a negative consequence or implication, I may have pointed it out in a note or in the 'results' section - not to the interviewee - but tried to leave it to the reader to make the final decision. In this vein of wanting to include all ideas, whenever I put prevalent ideas into the 'results,' I noted them as such. However, if a single person's idea was unique or represented an extreme point of view, shed interesting light on the topic, was very well-stated, or was what I judged to be a positive contribution to 'the dialogue' - one which others might 'pick up on' and incorporate into their own points of view - I added these ideas with the caveat that 'one teacher said . . . . '

People sometimes gave several different results of one phenomenon, or several different causes of another. One person said 'A' was bad because of 'B,' while another person said 'A' was good because of 'B.' It would be interesting to see how they would react to inconsistencies such as this which cannot be pointed out during an interview. Because of thoughts like this, and acknowledging that the interviewer's role is very subjective, I thought it would be a good idea to take the results I had written up back to the teachers to ask for their written comments as to where the results 'rang true' and where they didn't. When I was told this is a technique often used in interviewing, I decided I would definitely try it. However, there was no time to re-interview teachers; instead, I made several copies of the 'results' available in the teachers' lounge and asked teachers who had participated in the interviews to read them, then write notes and comments onto these copies over a two-week period.
After that time, I incorporated their 'second opinions' into the results as well. Unfortunately, only two teachers responded to this format.

In contrast, teachers from the second-year (and beyond) campus, when shown the qualitative data, felt that their voices and those of other stakeholders should have been included because the limited point of view presented herein is not 'correct'. While I sympathize with this point of view, acknowledging that it would have been very interesting and that the reality this enhanced perspective would have provided would be closer to 'objective truth' (if I may use this term), I feel that the (albeit limited) view of the stakeholders I have interviewed is every bit as valid as that of any of the other stakeholders. Their insider point of view certainly helps explain why various decisions were made. Often, as has been shown, justice issues such as concerns about equality, consistency, standards, accessibility, and self-esteem were at the root of their decision-making.

A similar response to interview results has been described by Kidder and Fine. Trying to answer the question, 'Whose story shall prevail?' they note that quite often, those who observe others and the actors themselves have very different notions of the causes of the actors' behaviour: "Observers are likely to locate causes within the actor . . . and actors are likely to locate causes in their surroundings" (71). Note that teachers and administrators can each, at various times, be either actor or observer. From my teacher interviews, Kidder and Fine's claim is often (but not always) borne out - teachers frequently justify their own actions in terms of the situation in which they found themselves, but explain administrators' actions in terms of their respective personal strengths or weaknesses. These personal attacks, while interesting, can be very hurtful to administrators who, unlike teachers, can be easily identified since there were so few of them.
I have attempted to delete most of these statements - they do not help evaluate the curriculum and they cause unnecessary pain to people who were working under great duress.

Teachers generally described the previous programme with consistency, logic, and clarity, in sharp contrast to the way in which they describe the present one. Teachers generally spoke 'with one voice' describing the previous programme; only a couple of teachers were enthusiastic about it very often. As a consequence, it was easy to say, "The consensus was . . . " (and very tempting to say, "The programme was . . . !"). However, when teachers started describing the present programme, their answers started to seem more and more subjective, expressing many different points of view and shades of opinion. Teachers had various complaints about the present programme, but there seemed to be no clear or coherent statement, either of the problems or of possible solutions to them. Here I found myself quoting more, in order to let the variety of voices be heard - I couldn't label something as a 'consensus' which clearly was not. Now, there is no clear battlecry such as was heard last time the programme changed: "Consistency! Standards! Levels! Modules!" However, I hope that the process of participating in my research may enable teachers to start formulating a more coherent statement of current problems, possible solutions, and a clear, collective vision of the future course of curriculum at the institution.

In summary, anonymous interviews of people who taught in both a previous and a present programme can be very helpful in programme evaluation through comparison. They can give collective (not necessarily 'correct') answers to specific questions. In open-ended questions, many new and interesting points of view emerge, as in a 'group brainstorm.'
The interview process gives participants a chance to examine (and explain) the present in terms of the past, and to re-examine the past in terms of the future; indeed, the process itself may lead to problem-definition and problem-solving. The narrative form in which people so often choose to give their answers naturally includes causal information. However, as noted in the introduction, these narratives each told a somewhat different story. This leads us to the notion that...

Inherent weaknesses of this research technique include the subjective nature of both (1) the data and (2) its interpretation by the interviewer. Returning a transcript (of the notes taken) to the interviewee for corroboration could help correct the latter, but not the former, subjectivity. On the other hand, the multiple perspectives it provides (the multiple subjectivities, if you will) are also its greatest inherent strength. The more blind men describing that elephant, the better.

6.3  A Stereoscopic View (Getting Three Dimensions from Two Perspectives)

I have contended in this thesis that in order to make a more accurate determination of whether knowledge is being justly distributed in an educational programme, both quantitative and qualitative research should be pursued. Therefore, let us briefly examine what happens when we put the results of these two projects together.

Table 6
Summary of Quantitative and Qualitative Results
[Quantitative Results Labelled (Quant.), Qualitative Results Labelled (Qual.)]

1. (Quant.) Almost no significant differences were noted between the two programmes, though assessment was confined to listening and reading proficiency.
   (Qual.) The previous programme had positive features (it was probably of greater interest to upper-entry students), but all students suffered from the lack of standards and consistency. Lower-entry students in the previous programme in particular experienced low self-esteem and low motivation due to the inaccessibility of the content and/or to the limited opportunities for advancement available within the three 'track' system.

2. (Quant.) There is a possibility that the present programme may be slightly more effective in Listening for the lower-entry students than the previous programme, and slightly less so for the higher-entry students.
   (Qual.) The present programme is more effective in general (for both Listening and Reading), particularly for lower-entry students, who are given extra time and resources, if necessary, to complete it; perhaps it isn't challenging enough for higher-entry students.

3. (Quant.) Both programmes seem to be 'just' according to Rawls' criteria. Both programmes seemed to be equally effective in their distribution of educational resources.
   (Qual.) The present programme seems to be more 'just' according to Rawls' criteria. It is generally more effective as well.

How does the amalgamation of these two perspectives alter the view of the programmes we would get by using only one?

6.3.1  Stereoscopic Conclusion #1

Let us begin by looking at how the combined results affect how just and effective the two programmes are in general.

IF WE ONLY USED THE QUALITATIVE RESEARCH, we would conclude that the present programme is far more just and effective than the previous one, though the previous programme contained elements some teachers would like to re-incorporate.

IF WE ONLY USED THE QUANTITATIVE RESEARCH, we would conclude that there was little difference in either the justice or effectiveness of the two programmes.

STEREOSCOPIC VIEW: The quantitative research informs the qualitative research that the previous programme was far more effective (or the present one far less effective) than widely supposed by teachers.
On the other hand, the qualitative research informs the quantitative research about the high dropout rate of both students and teachers in the previous programme (a factor which could be looked at quantitatively, but which was brought to light by the qualitative interviews), the general unhappiness, lack of motivation, low self-esteem, and particularly the lack of consistency and standards, all of which resulted in the creation of the present programme.

STEREOSCOPIC CONCLUSION #1: The two programmes are very similar in the effectiveness with which they distribute Listening and Reading knowledge to students, but the present programme does it more justly.

To arrive at this conclusion, we had to reject a finding (actually, an extrapolation of a finding) of the quantitative research, namely that there was little difference in justice, as based on incomplete information, since the quantitative research did not look at factors (such as 'consistency' and 'self-esteem') which also determine justice, but may have no discernible effect on test outcomes.

6.3.2  Stereoscopic Conclusion #2

Now let us look at how the combined results affect how justly and effectively the two programmes distribute educational resources to low-entry and high-entry students.

IF WE ONLY USED THE QUALITATIVE RESEARCH: we would conclude that the present programme is probably slightly more effective and just for the lower-entry than for the higher-entry students (for whom it provides less challenging content, at least in the first part of the year, than did the previous programme). However, given Rawls' criteria, we could conclude that this inequality was just (though there are some problems using Rawls' criteria here).

IF WE ONLY USED THE QUANTITATIVE RESEARCH: we would conclude that the present programme is probably slightly more effective for lower-entry, and slightly less effective for higher-entry, students in its distribution of Listening knowledge.
However, given Rawls' criteria, we could conclude that this inequality was a just one, though again, there are problems with this conclusion. We would also conclude that there was no apparent difference in the two programmes' distribution of Reading knowledge to the three entry levels.

STEREOSCOPIC VIEW: The tentative conclusions of the quantitative research support those of the qualitative research as far as Listening knowledge is concerned, but do not support them for Reading knowledge. In both cases, Rawls' criteria for justice are probably met.

STEREOSCOPIC CONCLUSION #2: Exactly the same as the quantitative conclusion above. In this case, we must stick with the more specific results of the quantitative research, which are supported in part by the qualitative research. We must reject the qualitative results as applied to Reading because the quantitative research clearly and convincingly refutes them.

To arrive at this conclusion, we had to reject a finding of the qualitative research as based on teachers' overgeneralization to Reading of what was basically a valid observation about lower-entry versus upper-entry students' Listening progress.

6.4  Conclusions about the Kind of Information Gained by Combining these Two Types of Research

Hopefully this protracted exercise has served to demonstrate the usefulness of combining the two approaches to research, emphasizing their complementarity rather than their mutual exclusiveness. In Section Three (thesis, p. 33) I listed the three most widely-mentioned benefits of combining qualitative and quantitative methods: (i.) that each is strong where the other is weak; thus they fill in each other's gaps, complementing one another and strengthening the research, (ii.)
that when they support one another, the results are strengthened as well, and (iii.) that when they contradict one another, both results are called into question; in this case an explanation for the contradiction (possibly requiring further research) is called for.

My research supports (i.) above. The qualitative research definitely complemented the quantitative in that it brought issues to light and allowed programme justice issues to surface which were not detected through quantitative methods. However, the quantitative research was particularly good at verifying programme effectiveness and demonstrating how educational resources (as shown via learning outcomes) are actually distributed among the various ability levels. That these two perspectives can act as checks on one another should now be apparent.

In addition, my research supports (ii.) in that when results overlap, conclusions are clearly corroborated. A good example of this is when both the qualitative and quantitative conclusions supported the idea that lower-entry students' mean Listening improvement may be superior in the present programme (though the difference was not statistically significant).

However, my research gives particular support to (iii.), giving three examples of it: (a) by pointing out when logical but incorrect conclusions have been made by using incomplete information, e.g. that there was no difference between the two programmes in how justly resources were distributed; (b) by pointing out when overgeneralizations have been made from essentially correct observations, e.g. that lower-entry students' mean Listening and Reading improvement are both superior in the present programme; and (c) by clarifying whether differences in achievement are real or imagined, e.g. that the present programme is superior in its distribution of Listening and Reading knowledge.
6.5  Conclusions about Use of John Rawls' Criteria when Addressing Issues of Educational Justice

I found the notion of educational 'values' or 'resources' which included access to teachers, social rewards (such as grades, diplomas, etc.) and self-esteem to be very useful in determining the various dimensions of justice in education. It isn't enough to look at 'how much money is spent', certainly. I also found very interesting and helpful the idea of comparing the distribution of resources to 'the least advantaged' (person, group) versus more advantaged (people, groups). The idea of justifying a certain amount of inequality if it can be shown to eventually better the whole of society seems to make a lot of sense. If not for John Rawls' theory, I would not have deemed it so important (in either the quantitative or qualitative study) to see if the two programmes affected students in the various levels differently, and so would have been unaware of one of the two significant findings of the research.

However, dealing with students who are Japanese nationals studying in Canada calls into question the notion of 'the whole of society'. Should we consider it to be 'the college as a whole'? Canadian society? Japanese society? Or 'global society'? The answer here is unclear, though my natural tendency would be to try to determine the eventual effect on global society (of having more Japanese young people who, being able to speak English, can communicate with 'foreigners').

My most serious misgiving when trying to apply John Rawls' theories to evaluating the justice of educational resource distribution pertains to his notion that no one should be worse off in an unequal distribution than they would have been had the resources been distributed equally.
How to determine (a) what an 'equal' distribution of educational resources would actually look like and (b) what the results of this equal distribution on the various groups of students would be is a big question for me still; I have no clear solution for this and hope that those who argue for and against, for example, affirmative action programmes will eventually help clarify these issues.

6.6  Summary of Research Findings and Conclusions

6.6.1  Summary of Research Findings and Conclusions - For the Site (in-house)

Getting back to the theoretical base from which we began this journey, the research taught me several interesting new things about my institution. First, it showed me that what teachers perceived as a big general increase in student learning due to initiation of a modular, skills-based mastery learning programme was only imagined. The previous programme had glaring faults from fairness and 'equality of access' points of view, but it certainly distributed knowledge (of listening and reading at any rate) no worse than the present programme.

Secondly, the phenomenon of the greater mean increase in Level One Listening SLEP, along with that of the smaller mean increase in Level Three Listening SLEP, was reminiscent of the 'Robin Hood' effect, sometimes hypothesized for Mastery Learning, in which low-ability students make proportionally more progress, but high-ability students make proportionally less progress, than they did previously. Of course, the effect was small, it only held for Listening, and it was only significant for Level Three, and that only for one year. Nevertheless, having it corroborated independently by teacher interviews as well was quite exciting.

Next, there was the issue of V.'s theories. They were almost totally out of use by the time I started teaching at the college, so I was unprepared for the vitriolic memories of them which my interviews brought up.
My conclusion is that because of the timing and the way her theories were presented, no sense of teacher-ownership of the theories had a chance to develop. In the end, V.'s theory may be seen to have served three unintended functions: first, as a symbol and convenient scapegoat for teacher frustration with the entire programme; next, as a unifying factor, in that opposition to it served to enable the new and disparate teaching staff to form common cause; and finally, as a catalyst for change.

Finally, and perhaps most interesting: in the beginning, I defined two sets of attributes of power as 'horizontal' and 'vertical'. Horizontal power I saw as diffused, egalitarian, unpredictable, individualistic, tolerating of many 'correct' answers and ambiguity, soliciting active cooperation from those within its influence, encouraging creativity, and evaluating primarily by qualitative means. Vertical power, on the other hand, I saw as centralized, authoritarian, predictable, standardized, assuming that there is only one true 'reality', soliciting passive receptivity from those within its influence, encouraging uniformity, and evaluating primarily by quantitative means. In apparently typical Cartesian fashion, I had set up a series of dualisms. Yet, I hope it is clear to the reader that I see each as representing an extreme on a continuum rather than a binary choice.

Using these sets of attributes, it struck me that teachers in the first three years of this institution's history, though they were being asked to conform to some extent by use of V.'s theories, were basically in an extremely 'horizontal' situation. They were quite free to develop their own curricula including their own materials, methods, tests, and evaluation systems.
Perceiving this extreme situation's effects to be a lack of standards, of coordination, of logical sequence, and of monitoring; confused and demoralized students; and inadequate materials, teachers voluntarily resituated themselves in a strongly 'vertical' configuration: the leveled, modular, skills-based, mastery learning programme we now have. Ironically, the power did not come from above; rather, the teachers imposed a high degree of standardization upon themselves. When asked why they were doing this, they often cited a justice rationale: fairness (consistency and equity) in the teaching and evaluation of students. A few years later, when the negative effects of extreme vertical power (teaching to the test, overstandardization of materials and tests, individuality and creativity stifled) started becoming apparent, teachers began questioning if they had perhaps gone too far in their pursuit of fairness through standardization. They are at this crossroads now, seeking to strike an acceptable balance between these two extremes.

6.6.2  Summary of Research Findings and Conclusions - Of The Thesis

While my primary intent at the beginning was to use the criteria for justice formulated by John Rawls to determine the justice of the distribution of educational resources at my institution, I think it is clear that I ran into some important limitations which made it impossible to apply these criteria fully in this context. While justice was my initial interest, it soon became obvious to me that effectiveness was requisite to a just programme, so effectiveness became a secondary focus. The idea of looking at how a programme affects both those students with low- and high-entry skills was inspired by Rawls, however; it was useful in helping determine both effectiveness and justice, and was equally appropriate for the quantitative and the qualitative studies.
I hope this thesis has also succeeded in its goal of demonstrating use of a combination of quantitative and qualitative research methods in the determination of the justice of educational programmes.

6.7  Suggestions for Further Research

6.7.1  Suggestions for Further Research - In-House

I will only give three of numerous suggestions that come to mind. First, as stated, the implications of the quantitative research are really limited to reading and listening. I would suggest collecting random samples of student writing and student speech (through audio or video cassette recording) at the beginning and end of each year. This would be more difficult, and the samples would, by practical necessity, be smaller. Each student's identification number would have to be attached to the sample for correlation with SLEP (so that a 'level' could be assigned to the student). Through these samples, a data bank would be created, allowing researchers to match pairs and compare achievement in these 'output' forms of English over the years and over different programmes at the institution.

Secondly, the institution should continue with beginning- and end-of-year SLEP testing, if at all possible, in order to compare the listening and reading achievement of students among the various years and programmes.

Lastly, while the institution does administer anonymous year-end programme evaluation questionnaires to first-year students (to which they must reply in English in writing) and does have informal, one-on-one interviews between Japanese staff and students towards the end of their first year, it might consider making the latter into more standardized exit interviews to give students a voice in an ongoing qualitative analysis of faculty and student perceptions of programme effectiveness and justice. Care should be taken to note responses according to entry-level.
These could also be correlated with various qualitative and quantitative data the college already collects, but has not, to my knowledge, tried to combine, such as achievement in Year Two, graduation exit interviews, and eventual employment data.

6.7.2  Suggestions for Further Research - Theoretical

There are three big questions remaining in my mind after this thesis. The first is the aforementioned exploration of John Rawls' idea: how to define 'the whole of society' in the late twentieth century. While Rawls himself recommended applying it on the level of country (or below), in this rapidly forming 'global society' I wonder if at some point Rawls' theory could be seen as applicable at this level.

The second is also related to Rawls' theories: how to define what an equal distribution of educational resources would be, and how to conjecture, without actually putting it into effect, what the effects of this would be on each group, in order to determine whether any proposed unequal distribution would distribute to any group fewer resources than they would have had under this hypothetical situation of strict equality.

The third is an extrapolation of a criterion of equality proposed by Jerrold R. Coombs (thesis, page 5): that any two groups could have two very different programmes, both of which could be considered 'just' if "neither group has reason to envy the resources given to the other (Coombs 291)." In the case of extra resources being given to low-entry level students (for example, more time to complete their study of Levels One to Four), one could look at a possible tradeoff available for upper-entry level students who get to study in more depth and pursue a greater variety of topics. However, how are we to know if one group envies the resources given to the other? One possibility would be to specifically address fairness issues in the term-end 'student satisfaction' surveys.
A researcher would have to decide what percentage of students would have to be unhappy, and how unhappy they would have to be, before programme changes would be made. S/he would also have to decide what to do if lower-level students were happy, but upper-level students weren't. This is an interesting proposal which does not throw extra weight towards the well-being of the least-advantaged groups as does Rawls'. Thus it is more egalitarian than Rawls' proposal, though arguably not as 'just'. It poses many unanswered questions, but is intriguing and, like Rawls' proposals, should be explored.

6.8  A Postscript

Three major developments of interest have recently occurred. In December of 1996, the teachers decided to remove grammar from the writing course and put it in the listening/speaking course, adjusting hours accordingly. On January 2, 1997, the teachers of the college, in a typically 'horizontal' act, took a vote on whether to continue the 80% mastery standard for Foundation courses. They voted to make the new standard 50% starting in April, 1997, the beginning of the new year. The two most prevalent rationales for this were (1) unhappiness with two parallel grading schemes (a desire for consistency) and (2) a desire to raise standards, in that, out of a desire to enable most students to gain 80% on final exams, teachers had created exams that were too easy. Levels, modules, and the use of (presumably more difficult) standardized exams by a team of teachers will continue.

Interestingly, a desire for consistency (among teachers, not among grading schemes) and a desire to raise standards had been two of the major rationales for the change from the previous to the present programme. Consistency and high standards, then, seem to be very important values to our teaching staff. Hopefully the consistency and higher standards that were gained by changing to the present programme will not be lost by the change to a 50% pass standard.
Finally, though the school has made no plans to discontinue SLEP testing, it will be administering as well an instrument more popular in the Japanese market: the TOEIC exam. If the SLEP testing were to be discontinued as 'redundant,' a valuable means of comparing our programmes over time would be lost.

My concern here is mainly with justice, particularly, as John Rawls has taught me, with that of the least-advantaged students. Hopefully, teachers and others who evaluate these new programmes will look specifically at how each programme affects these students. Hopefully, as well, they will not depend solely on either teacher intuition or test scores. Rather, they will thoughtfully combine these two kinds of results. If they do this, not only the least-advantaged, but the entire college, will benefit.

WORKS CITED

Aoki, Tetsuo. "Toward a dialectic between the conceptual world and the lived world." Contemporary Curriculum Discourses. Ed. William Pinar. Scottsdale, AZ: Gorsuch Scarisbrick, 1988. 402-16.
Apple, Michael, ed. Cultural and economic reproduction in education: Essays on class, ideology, and the state. NY: Routledge, 1982.
---. "Curriculum and reproduction." Curriculum Inquiry 9 (1979): 231-52.
Arlin, Marshall. "Time, Equality, and Mastery Learning." Review of Educational Research 54 (1984): 65-86.
Bakhtin, Mikel. Speech genres and other late essays. Trans. and Ed. V. McGee, et al. Austin, TX: U Texas Press, 1986.
Block, J.H., H.E. Efthium and R.B. Burns. Building Effective Mastery-Learning Schools. NY: Longman, 1989.
Block, J.H., ed. Mastery Learning: Theory and Practice. NY: Holt, 1971.
Bloom, Benjamin S. "Learning for mastery." Evaluation Comment 1 (1989): 1-12.
Books, Sue. Critical authority as a terrain of struggle: What is gained and what lost in the struggle on this terrain? Proc. of Bergamo Conference on Curriculum Theory and Practice. Dayton, OH: October, 1992.
Bowles, Samuel and Herbert Gintis.
Schooling in Capitalist America: Educational Reform and the Contradictions of Economic Life. 1976. NY: Basic Books Paperback, 1977.
Brecher, Janice I. "Secondary Level English Proficiency Test." Reviews of English Language Proficiency Tests. Ed. J. Charles Alderson, Karl J. Krahnke, and Charles W. Stansfield. Washington, D.C.: TESOL, 1987. 68-70.
Carlson, Dennis. Teacher and crisis: Urban school reform and teachers' work culture. NY: Routledge, 1992.
Carnoy, Martin. "School improvement: Is privatization the answer?" Decentralization and school improvement: Can we fulfill the promise? Eds. J. Hannaway and M. Carnoy. San Francisco, CA: Jossey-Bass, 1993. 163-201.
Carroll, J.B. "A model of school learning." Teachers College Record 64 (1963): 723-33.
Champlin, J.R. "Is creating an outcome-based program worth the extra effort? A superintendent's perspective." Outcomes 1.2 (1981): 4-8.
Chan, K.S. "The interaction of aptitude with mastery versus non-mastery instruction: Effects on reading comprehension of grade three students." Diss. U. Western Australia, 1981.
Chapman, William. Inventing Japan: The Making of a Postwar Civilization. NY: Prentice Hall, 1991.
Coleman, James S. "The Concept of Equality of Educational Opportunity." Harvard Educational Review 38 (1968): 7-22.
---. "Equality of Opportunity and Equality of Results." Harvard Educational Review 3 (1973): 129-37.
Coleman, James S., E. Campbell, C. Hobson, J. McPartland, A. Mood, F. Weinfield and R. York. Equality of educational opportunity. Washington, D.C.: Government Printing Office, 1966.
Comenius, John, a.k.a. Johann Komensky. "Pampaedia." c. 1630. Comenius' Pampaedia or Universal Education. Trans. (f. Latin) A.M.O. Dobbie. Dover: Buckland, 1986.
Connelly, F. Michael and D. Jean Clandinin. "Narrative inquiry: Storied experience." Forms of curriculum enquiry. Ed. E. Short. Albany, NY: SUNY Press, 1991. 121-154.
Conner, K., I. Hill, H. Kopple, J. Marshall, K. Scholnick and M. Shulman. "Using formative testing at the classroom, school, and district levels." Educational Leadership 43 (1985): 63-67.
Cook, T. and C. Reichardt, eds. Qualitative and quantitative methods in evaluation research. Beverly Hills, CA: Sage, 1979.
Coombs, Jerrold R. E-mail to the author. 7 Jan. 1997.
---. E-mail to the author. 10 Feb. 1997.
---. "Equal access to education: the ideal and the issues." Journal of Curriculum Studies 26 (1994): 281-295.
Corwin, R. Militant professionalism: A study of organizational craft in high schools. NY: Appleton-Century-Crofts, 1970.
Costniuk, Bill. "Education in Japan." History and Social Science Teacher 23 (1988): 147-50.
Covey, Steven R. "Whole New Ball Game." Executive Excellence August 1996: 3-4.
Deleuze, G. and F. Guttari. Anti-oedipus: Capitalism and schizophrenia. NY: Viking, 1977.
Derrida, Jacques. Of grammatology. 1967. Trans. (f. Fr.) G. Spivak. Baltimore, MD: Johns Hopkins UP, 1976.
"Distribution." Concise Oxford Dictionary. NY: Oxford UP, 1990.
Educational Testing Service. SLEP Test Manual. Princeton, NJ: ETS, 1988.
Eisner, Elliot. The art of educational evaluation: A personal view. London, UK: Falmer, 1985.
---. The enlightened eye: Qualitative inquiry and the enhancement of educational practice. NY: Macmillan, 1991.
Fitzpatrick, K.A. and W.W. Charters. A study of staff development practices and organizational conditions related to instructional improvement in secondary schools. Eugene, OR: U of Oregon - College of Education's Center for Educational Policy and Mgmt., 1986.
Freire, Paulo. "Conscientizing as a way of liberating." Contacto. March, 1971. Ed. A. Hennelly. Liberation Theology. Maryknoll, NY: Orbit Books, 1990.
---. Pedagogy of the oppressed. 1968. NY: Seabury, 1970.
Giroux, Henry A. Theory and resistance in education: A pedagogy for the opposition. South Hadley, MA: Bergin and Garvey, 1983.
Gitlin, A. "Educative school change: Lived experiences in horizontal evaluation." Journal of Curriculum and Supervision 4 (1989): 322-39.
Gitlin, A. and S. Goldstein. "A dialogical approach to understanding: Horizontal evaluation." Educational Theory 37 (1987): 17-27.
Glanz, J. "Beyond bureaucracy: Notes on the professionalization of public school supervision in the early twentieth century." Journal of Curriculum and Supervision 5 (1990): 150-70.
Glickman, C., ed. Supervision in transition: 1992 Yearbook of the Association for Supervision and Curriculum Development. Alexandria, VA: ASCD, 1992.
Goodlad, John. "Access to knowledge." Teachers College Record 84 (1983): 787-800.
Goya, Susan. "The Secret of Japanese Education." Phi Delta Kappan, October 1993: 126-9.
Grumet, Madeline. "Retrospective: Autobiography and the analysis of educational experience." Cambridge Journal of Education 20 (1990): 321-6.
Guskey, T.R. "Defining the essential elements of mastery learning." Outcomes 1 (1987): 30-34.
House, Ernest. "Justice in Evaluation." Evaluation Studies Review Annual, Volume One. Ed. G.V. Glass. Beverly Hills: Sage, 1976. 75-100.
Howe, Kenneth R. "Two Dogmas of Educational Research." Educational Researcher. Oct. 1985: 10-18.
Huber, M. "The renewal of curriculum theory in the 1970s: An historical study." JCT 3 (1992): 14-84.
Husen, Torstein. "Problems of Securing Equal Access to Higher Education: The Dilemma between Equality and Excellence." Higher Education 5 (1976): 402-22.
Jackson, Phillip. Life in Classrooms. NY: Holt, 1968.
Japan. Provisional Council on Educational Reform. First Report on Education Reform. June 26, 1985.
"Justice." Fontana Dictionary of Modern Thought. London, UK: Fontana-Harper Collins, 1988.
Kellaghan, T. and G. Madaus. "National testing: Lessons for Americans from Europe." Educational Leadership (1991): 87-90.
Kidder, Louise H. and Michelle Fine. "Qualitative and Quantitative Methods: When Stories Converge." Multiple Methods in Program Evaluation. New Directions for Program Evaluation 35 (Fall, 1987). Ed. Mark W. Lipsey. San Francisco, CA: American Evaluation Assn., Jossey-Bass, 1987. 57-75.
King, Martin Luther (Jr.). Why We Can't Wait. NY: Harper, 1963.
Kitamura, Kazayuki. "The Decline and Reform of Education in Japan: A Comparative Perspective." Educational Policies in Crisis. Eds. William K. Cummings, et al. NY: Praeger, 1986. 153-170.
Leclerq, Jean-Michel. "The Japanese Model: School-Based Education and Firm-Based Vocational Education." European Journal of Education 24 (1989): 133-49.
Madaus, G. and T. Kellaghan. "Curriculum evaluation and assessment." Handbook of research on curriculum. Ed. Phillip Jackson. NY: MacMillan, 1992. 119-54.
Madey, D.L. "Some benefits of integrating qualitative and quantitative methods in program evaluation, with illustrations." Educational Evaluation and Policy Analysis 4 (1982): 223-36.
Mark, Melvin M. and R. Lance Shotland. "Alternative Models for the Use of Multiple Methods." Multiple Methods in Program Evaluation. New Directions for Program Evaluation 35 (Fall, 1987). Ed. Mark W. Lipsey. San Francisco, CA: American Evaluation Assn., Jossey-Bass, 1987. 95-100.
Miller, Lynne and Ann Lieberman. "School improvement in the United States: nuance and numbers." Qualitative Studies in Education 1 (1988): 3-19.
Nakayama, S. Science in Japan, China, and the west. Trans. J. Dusenbury. New Haven, CT: Yale UP, 1984.
Naotsuka, Reiko and Nancy Sakamoto, et al. Mutual Understanding of Different Cultures. Tokyo: Science Education Institute of Osaka Prefecture, 1981.
Pinar, William F. "Whole, bright, deep with understanding: Issues in qualitative research and autobiographical method." Contemporary Curriculum Discourses. Ed. William Pinar. Scottsdale, AZ: Gorsuch Scarisbrick, 1988. 134-53.
Pinar, William F., William M. Reynolds, Patrick Slattery, and Peter M. Taubman. Understanding Curriculum: An Introduction to the Study of Historical and Contemporary Curriculum Discourses. NY: Counterpoints-Peter Lang, 1995.
Rawls, John. A Theory of Justice. Cambridge, MA: Belknap-Harvard UP, 1971.
Slavin, Robert E. "Mastery Learning Reconsidered." Review of Educational Research 57 (1987): 175-213.
Stansfield, Charles. "Reliability and Validity of the Secondary Level English Proficiency Test." System 12 (1984): 1-12.
Stone, L. "Results from a global curriculum project evaluation: Practical problems theoretical questions." Paper presented at the annual meeting of the American Educational Research Assn., 1984, New Orleans, LA.
Strike, Kenneth A. "Education, Justice and Self-Respect: A School for Rodney Dangerfield." Philosophy of Education, 1979: Proceedings of the 35th Annual Meeting of the Philosophy of Education Society. Ed. Jerrold R. Coombs. Normal, IL: PES, 1980. 41-49.
---. "Fairness and Ability Grouping." Educational Theory 33 (1983): 125-34.
---. "The Role of Theories of Justice in Evaluation: Why a House is not a Home." Educational Theory 29 (1980): 1-9.
Tucker, R.C. The Marxian Revolutionary Idea. NY: Norton, 1969.
Tyack, David. "School governance in the United States: Historical puzzles and anomalies." Decentralization and school improvement: Can we fulfill the promise? Eds. J. Hannaway and M. Carnoy. San Francisco, CA: Jossey-Bass, 1993. 1-32.
Unks, Gerald. "Three Nations' Curricula: What Can We Learn from Them?" NASSP Bulletin 76 (1992): 30-46.
Whitson, James A. "The politics of 'non-political' curriculum: Heteroglossia and the discourse of 'choice' and 'effectiveness'." Contemporary Curriculum Discourses. Ed. William Pinar. Scottsdale, AZ: Gorsuch Scarisbrick, 1988. 279-331.
Willis, Paul. Learning to Labour. Farnborough, UK: Saxon House, 1977. Hampshire, UK: Gower, 1981.
Wise, A.E. "Minimum competency testing: Another Case of Hyper-Rationalization." Phi Delta Kappan 59 (1978): 596-8.
Worthen, Blaine R. and James R. Sanders. Educational evaluation, alternative approaches and practical guidelines. White Plains, NY: Longman, 1987.
APPENDIX 8.1
Samples of SPSS 6.1 Matched-Pairs Input and Transformed Data

Table 7
Key to SPSS Input Data (1990 matched with 1996)

id96     =  Student ID# (1996)
list196  =  Listening SLEP score, beginning of year (both members)
read196  =  Reading SLEP score, beginning of year (both members)
list296  =  Listening SLEP score, end of year (1996 student)
read296  =  Reading SLEP score, end of year (1996 student)
id90     =  Student ID# (1990)
list290  =  Listening SLEP score, end of year (1990 student)
read290  =  Reading SLEP score, end of year (1990 student)

Table 8
Sample of SPSS Input Data (1990 matched with 1996)

id96     list196  read196  list296  read296  id90     list290  read290
962097   17       21       26       23       902163   20       24
962136   17       22       26       25       902034   22       23

Table 9
Key to SPSS Transformed Data (1990 matched with 1996)

chslep90  =  [list290 + read290] - [list196 + read196]
chslep96  =  [list296 + read296] - [list196 + read196]
chlist90  =  list290 - list196
chread90  =  read290 - read196
chlist96  =  list296 - list196
chread96  =  read296 - read196
slep1     =  list196 + read196
slep1lev  =  1 IF slep1<29; 2 IF 28<slep1<37; 3 IF slep1>36

[The rotated table on the following page is illegible in the source.]

APPENDIX 8.2
Sample of SPSS 6.1 Output Showing Means and Significance
Partial Data Used for Figure 1

t-tests for Paired Samples

Variable   Number of pairs   Corr   2-tail Sig   Mean      SD      SE of Mean
chslep90   122               .235   .009         11.6393   4.835   .438
chslep92                                         11.1148   4.694   .425

Paired Differences
Mean    SD      SE of Mean   t-value   df    2-tail Sig
.5246   5.896   .534         .98       121   .328
95% CI (-.532, 1.581)

Variable   Number of pairs   Corr   2-tail Sig   Mean      SD      SE of Mean
chslep90   100               .071   .480         12.7600   5.131   .513
chslep94                                         12.4400   4.205   .421

Paired Differences
Mean    SD      SE of Mean   t-value   df    2-tail Sig
.3200   6.397   .640         .50       99    .618
95% CI (-.949, 1.589)

APPENDIX 8.3
Parallel 80% / 50% Pass Mark Grading Schemes Used in Students' First Year at College 'X'
(Initiated April, 1991 and to be Discontinued March, 1997)

Letter Grade   80% Pass 'Foundation'   50% Pass 'Transition'   Grade Points
               % Equivalent            % Equivalent            per Credit
A+             98-100                  95-100                  4.33
A              95-97                   90-94                   4.0
A-             92-94                   85-89                   3.67
B+             89-91                   80-84                   3.33
B              87-88                   75-79                   3.0
B-             85-86                   70-74                   2.67
C+             83-84                   65-69                   2.33
C              80-82                   60-64                   2.0
NC             0-79                    -                       Incomplete
C-             -                       55-59                   1.67
D              -                       50-54                   1.0
F              -                       0-49                    0

APPENDIX 8.4
Application: A Look at Some Site-Specific Justice Issues

Given the 'Stereoscopic Conclusion' that the present listening program may well be better for low-entry (and worse for high-entry) students than the previous program, the next logical step would be to continue the program 'as is' for the low-entry students and formulate one with more challenge for the high-entry students. A number of intriguing suggestions were made along these lines by teachers in the qualitative interviews.
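The mean-gain comparison behind this conclusion rests on the change scores and entry bands defined in Table 9 (Appendix 8.1) and on the paired-samples output in Appendix 8.2. As a minimal sketch only, and not the SPSS procedure actually run, the following Python applies the Table 9 definitions to the two sample pairs from Table 8 and re-derives a paired t-value from the reported summary figures. The function names `transform` and `paired_t` are illustrative assumptions; the variable names follow Tables 7 and 9.

```python
import math

# Two matched pairs copied from Table 8 (Appendix 8.1).
SAMPLE_ROWS = [
    {"id96": 962097, "list196": 17, "read196": 21, "list296": 26, "read296": 23,
     "id90": 902163, "list290": 20, "read290": 24},
    {"id96": 962136, "list196": 17, "read196": 22, "list296": 26, "read296": 25,
     "id90": 902034, "list290": 22, "read290": 23},
]


def transform(row):
    """Apply the Table 9 definitions to one matched pair (hypothetical helper)."""
    slep1 = row["list196"] + row["read196"]  # shared beginning-of-year total
    # Entry bands from Table 9: 1 if slep1 < 29; 2 if 28 < slep1 < 37; 3 if slep1 > 36.
    if slep1 < 29:
        level = 1
    elif slep1 < 37:
        level = 2
    else:
        level = 3
    return {
        "chslep90": row["list290"] + row["read290"] - slep1,
        "chslep96": row["list296"] + row["read296"] - slep1,
        "chlist90": row["list290"] - row["list196"],
        "chread90": row["read290"] - row["read196"],
        "chlist96": row["list296"] - row["list196"],
        "chread96": row["read296"] - row["read196"],
        "slep1": slep1,
        "slep1lev": level,
    }


def paired_t(mean_diff, sd_diff, n_pairs):
    """Paired-samples t statistic from summary figures: mean / (SD / sqrt(n))."""
    standard_error = sd_diff / math.sqrt(n_pairs)
    return mean_diff / standard_error


if __name__ == "__main__":
    for row in SAMPLE_ROWS:
        print(row["id96"], transform(row))
    # Summary figures for chslep90 vs chslep92 as reported in Appendix 8.2.
    print(round(paired_t(0.5246, 5.896, 122), 2))
```

For the first sample pair, `transform` gives a 1996 total change of 11 against a 1990 total change of 6, and `paired_t(0.5246, 5.896, 122)` rounds to the reported t-value of .98.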
However, there are justice issues to consider when changing a program: in particular, possible unintended negative effects on lower-entry students. The following principle, based on John Rawls' theory, should be kept in mind when considering changing a program:

An institution would not normally want to reverse an increased mean gain for its lower-entry students as the cost of increasing a mean gain for its higher-entry students.

Consider the following possible unintended effects that adding enrichment and challenge for higher-entry students might have on lower-entry students:

(a) Funds that might have been spent on remedial tutoring and materials might be shifted to an enrichment program.

(b) If Levels Three through Five (or elective) requirements or standards were made more challenging, this could cause hardship to students who began in Level One when they reached those levels (in the latter part of the year). [On the other hand, changing Levels Six or Seven (Listening/Speaking and/or Writing) would probably have no effect on lower-entry students; neither would adding an additional level for upper-entry students to challenge into at mid-year.]

(c) If electives and Transition Reading became leveled (homogeneous), there would be further separation of students by ability-levels even at year end, in contrast to the present, where most 'Transition' classes (except Listening/Speaking and Writing) are multileveled (heterogeneous). Lower-entry students would not experience the positive effects (such as positive peer modeling and feelings of equality) they now receive towards the end of the year from being in the same reading and elective courses as higher-entry students. Note that in students' second year, there is only rarely leveling of students.

(d) In addition to the obvious detrimental effects of the above, all three could also feasibly result in lower self-esteem for entry-Level One students.
As for the creation of an additional 'pre-Level One' entry level, there seem to be three problems to consider.

(a) There would be no chance for a student to fail and repeat a level. If they failed, they would be unable to complete Level Four, and would automatically end up in the Second Year's alternative program. This might result in a greatly increased number of students in the alternative program, an impact possibly unacceptable to parents, students, and the second year faculty.

(b) Besides the lowered self-esteem they might experience on being put in the alternative program, the students (the 'lowest' of the low) placed in this level at entry would experience, from the first day of their first year, even lower self-esteem than do entry-Level One students now. (What happens now is that students who fail Level One in Module One in effect create this class, the 'Level One Repeat class', though it doesn't start until Module Two.) It is perhaps debatable whether there is more loss of self-esteem for a student who is initially placed in a 'Pre-Level One' class than for a student who is initially placed in a Level One class and proceeds to fail it.

(c) Another associated problem is that it is still quite difficult to predict from student Listening (or total) SLEP scores which students will fail Level One. Therefore, to place a student in a pre-Level One class at entry might be to prematurely judge that student as less capable. Of course, this is exactly what one does by leveling students. However, this addition of a further ability-demarcation at entry might be excessive.

Solutions to the problem of 'not enough time in a five-to-six week module to teach all the objectives' included (1) decreasing the number of objectives taught at each level or (2) increasing the number of weeks in each module [thus decreasing the number of modules, presumably from five to four]. What would be the implications of these for lower-entry students?
(a) To allot fewer objectives to each level would limit how many total objectives would be presented to each student by year's end, thus altering the present distribution of knowledge in all levels and lowering the entrance criteria for the second year program. It would also require wholesale reorganization of the first year curriculum. Note: lowering the minimum Exit Year One/Entry Year Two requirements would particularly impact upon the lower-entry students; it wouldn't have much effect on upper-entry students, who only spend two modules in Foundation courses.

(b) Lengthening the modules creates fewer of them in a year, limiting how many levels could be taught to any one student. At present, if a student enters at Level One and fails one module, s/he is only able to complete Level Four, the minimum for entry into second year programs, by the end of the year. If there were fewer modules, a low-entry student who failed once would be unable to enter the second year program. A possible solution to this would be to decrease the required number of levels to be passed in order to enter second year programs. It is important to note that this option, which limits the number of levels any student can take, would have a more significant impact on high-entry students than would option (a).

(c) The idea of a continuous intake and exit of students would enable increasing the number of times students could repeat levels without penalty. However, as the teacher who suggested this also noted, there are social and logistical advantages to having cohort groups. The lack of a cohort 'support' group might have more negative impact upon lower-entry students.

Any of these options would conflict with the present requirements of the second year administration and/or faculty as well. There are, then, no obvious satisfactory solutions to this problem.
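The module arithmetic in option (b), and in problem (a) of the 'pre-Level One' proposal, can be made explicit. The following is a minimal sketch under stated assumptions: each module either completes exactly one level or is a failed repeat, Level Four is the minimum for second-year entry (as stated above), and a 'pre-Level One' class is modelled as entry level 0. The function names are hypothetical, and the sketch ignores any ceiling on the highest level available, so it is only meaningful for low-entry students.

```python
# Minimum level that must be completed to enter second year (stated in the text).
MIN_LEVEL_FOR_SECOND_YEAR = 4


def highest_level_completed(entry_level, modules_per_year, failed_modules):
    """Highest level finished by year end, assuming one level per passed module.

    entry_level      -- level at which the student starts (0 models 'pre-Level One')
    modules_per_year -- number of modules in the year (five at present)
    failed_modules   -- modules spent failing (and so repeating) a level
    """
    levels_passed = modules_per_year - failed_modules
    return entry_level + levels_passed - 1


def can_enter_second_year(entry_level, modules_per_year, failed_modules):
    """True if the student completes at least the minimum required level."""
    completed = highest_level_completed(entry_level, modules_per_year, failed_modules)
    return completed >= MIN_LEVEL_FOR_SECOND_YEAR


if __name__ == "__main__":
    # Present scheme: Level One entry, five modules, one failure.
    print(can_enter_second_year(1, 5, 1))
    # Longer modules: Level One entry, four modules, one failure.
    print(can_enter_second_year(1, 4, 1))
    # 'Pre-Level One' entry, five modules, one failure.
    print(can_enter_second_year(0, 5, 1))
```

Under these assumptions a Level One entrant with one failure just reaches Level Four in a five-module year, but only Level Three in a four-module year; a 'pre-Level One' entrant reaches Level Four only with no failures at all.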
Clearly this is a complex issue, and any change would have to be implemented very carefully, considering all potential implications, particularly for the low-entry level students.

Finally, what would be the effect on lower-entry students of lowering the 80% pass standard? The effects of this are not so obvious. As noted in the qualitative interviews, some of the reasons for doing this are compelling. The only caution I would have is to think about the possibility of a program 'being more than the sum of its parts'. Proponents of various programs often say that in order to get the optimal effect of their program, it must be taken as a whole. Mastery learning proponents tend to say this, certainly, though some argue that the major cause of the success of mastery learning is its use of frequent formative testing. As noted earlier, the quantitative research done here does not explain what parts of a program are responsible for what effects; similarly it cannot predict what effect piecemeal changes (such as lowering the pass standard) would have. One would have to seriously consider all of the ramifications of this sort of change and arrange to monitor its effects, particularly the impact it had upon the distribution of educational resources to lower-entry students.

APPENDIX 8.5
Description of SLEP Test

The Secondary Level English Proficiency (SLEP) Test was developed for use with high-school or junior college students by Educational Testing Service, the makers of the more famous TOEFL (Test of English as a Foreign Language). It is standardized and norm-referenced. It is a multiple-choice test with four options for each of its 150 questions, 75 each in listening and reading comprehension. Brecher (69) states that "validity studies indicate that SLEP is a valid test of English language proficiency . . . (and) . . . largely due to its multiple choice format, the reliability of the test is quite high" (see also Stansfield).
It comes in three forms, which we administer at the beginning of the year in April, at mid-year in October, and at year-end in February. We give it for the purposes of initial placement, to provide feedback on student progress to supplement that given by teachers, and to help evaluate our programmes over the various years. However, the SLEP test does not measure proficiency in the output skills of speaking and writing. As well, notes the Educational Testing Service (41), "The test is not designed to provide information about scholastic aptitude, motivation, language-learning aptitude, and cultural adaptability." SLEP can be a predictor of TOEFL achievement. The Educational Testing Service (Table 4, p. 39) provides this table of score equivalency:

SLEP Total Scaled Score   Expected TOEFL Total Scaled Score
64                        600
58                        550
53                        500
47                        450
42                        400
37                        350
31                        300
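The equivalency table can be used as a simple lookup. The sketch below is illustrative only: the seven score pairs are those given above, but the linear interpolation between tabulated points is an assumption of this example and is not something the Educational Testing Service table itself provides. The function name `expected_toefl` is hypothetical, and scores outside the tabulated range return `None` rather than being extrapolated.

```python
# (SLEP total scaled score, expected TOEFL total scaled score), from the table above.
SLEP_TO_TOEFL = [
    (31, 300),
    (37, 350),
    (42, 400),
    (47, 450),
    (53, 500),
    (58, 550),
    (64, 600),
]


def expected_toefl(slep_total):
    """Estimate an expected TOEFL score from a SLEP total scaled score.

    Tabulated scores return the tabulated value. Scores between two tabulated
    points are linearly interpolated (an assumption of this sketch). Scores
    outside the range 31-64 return None.
    """
    lowest, highest = SLEP_TO_TOEFL[0][0], SLEP_TO_TOEFL[-1][0]
    if slep_total < lowest or slep_total > highest:
        return None
    for (slep_lo, toefl_lo), (slep_hi, toefl_hi) in zip(SLEP_TO_TOEFL, SLEP_TO_TOEFL[1:]):
        if slep_lo <= slep_total <= slep_hi:
            fraction = (slep_total - slep_lo) / (slep_hi - slep_lo)
            return toefl_lo + (toefl_hi - toefl_lo) * fraction
    return None


if __name__ == "__main__":
    for score in (31, 40, 53, 70):
        print(score, expected_toefl(score))
```

A tabulated score such as 53 maps to 500, while an untabulated score such as 40 yields an interpolated estimate that should be read as a rough guide only.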

