UBC Theses and Dissertations
Acceptability and Authority in Chinese and Non-Chinese English Teachers' Judgment of Language Use in… Heng Hartse, Joel 2015



ACCEPTABILITY AND AUTHORITY IN CHINESE AND NON-CHINESE ENGLISH TEACHERS’ JUDGMENTS OF LANGUAGE USE IN ENGLISH WRITING BY CHINESE UNIVERSITY STUDENTS

by

Joel Heng Hartse

B.A., Seattle Pacific University, 2003
M.A., Humboldt State University, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Language and Literacy Education)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

July 2015

© Joel Heng Hartse, 2015

Abstract

This study solicits Chinese and non-Chinese English teachers’ judgments of linguistic (un)acceptability in writing by presenting teachers with essays by Chinese university students and asking them to comment on unacceptable features. Studies of error and variation in first and second language writing have often focused on errors in writers’ texts (see Bitchener & Ferris, 2012), but the recent sociolinguistic perspectives used in this study take a broader view, considering variation from standard written English in light of the globalization of English. These perspectives, including world Englishes (Canagarajah, 2006; Matsuda & Matsuda, 2010), English as a Lingua Franca (Horner, 2011; Jenkins, 2014), and translingual (Canagarajah, 2013; Horner, Lu, Royster, & Trimbur, 2011) approaches to L2 writing, are applicable to academic English writing in international contexts. This study thus adopts a non-error-based approach to teachers’ reactions to nonstandard language use in Chinese students’ English writing, using the construct of “acceptability” (Greenbaum, 1977).

The study includes two parts. First, it solicits a group (n = 46) of Chinese (n = 30) and non-Chinese (n = 16) English language teachers’ judgments of (un)acceptability by presenting teachers with seven essays by Chinese university students and asking them to comment on unacceptable features.
Second, in follow-up interviews (n = 20), the study examines teachers’ explanations for accepting or rejecting features of students’ writing and the ways in which they claim the authority to make these judgments. Using these methods, the study is able to determine which lexical and grammatical features of the texts the Chinese and non-Chinese participants judge to be unacceptable, how participants react when they encounter putative features of Chinese English and English as a Lingua Franca, and how they describe their authority to make judgments of linguistic unacceptability. The study finds wide variation in the features of the texts that participants judge as unacceptable, and identifies some possible differing priorities in the Chinese and non-Chinese teachers’ judgments. It also describes how participants from both groups claim authority in judgments, variously positioning themselves as mediators, educators, and language users. The study adds to a body of scholarship which suggests that the identification of “errors” in writing is highly variable and contextual.

Preface

This dissertation is the intellectual property of its author, Joel Heng Hartse. All research design, data collection, and analysis were carried out by the author, and the research was approved by the University of British Columbia’s Research Ethics Board (certificate H11-01966).

Several sections of the dissertation have been published in slightly different form in other venues, as described below. All portions of co-authored publications also appearing in the dissertation were solely written by Joel Heng Hartse.

Parts of Chapter 2 appear in:
Heng Hartse, J., & Kubota, R. (2014). Pluralizing English? Variation in high-stakes academic writing. Journal of Second Language Writing, 24, 71-82.

A modified version of the second half of Chapter 4 is forthcoming in:
Heng Hartse, J. (forthcoming).
Chinese and non-Chinese English teachers’ reactions to Chinese English in Chinese university students’ academic writing. In Z. Xu & D. He (Eds.), Researching Chinese English: The state of the art. Springer.

Another publication involving data from this study appears in:
Heng Hartse, J., & Shi, L. (2012). Investigating acceptability of Chinese English in academic writing. Contemporary Foreign Languages, 384(12), 110-122.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Chapter 1: Introduction
1.1 Context of the study
1.1.1 China English
1.1.2 English as a Lingua Franca
1.2 Gaps
1.3 Research questions
1.4 Outline of the dissertation
1.5 Definitions and acronyms
Chapter 2: Sociolinguistic and ideological approaches to variation from standard written English in a globalized context
2.1 Introduction
2.2 Standard English as an ideological construct
2.2.1 Language varieties
2.2.2 Standard(ized) language
2.3 Writing and standard English in a globalized context
2.3.1 Written English as standard English
2.3.2 Standard written English and globalized academic writing
2.3.3 The globalization of English and the possibility of competing standards
2.4 Approaches to variation from standard English in academic L2 writing
2.4.1 Traditional error-based approach
2.4.1.1 The shift from error-based approaches
2.4.2 World Englishes approach
2.4.3 English as a Lingua Franca approach
2.4.4 Translingual approach
2.4.5 Summary: A variation-based approach to L2 writing
2.5 Theorizing acceptability as an approach to language difference
2.5.1 Distinguishing acceptability from grammaticality
2.5.2 Acceptability in world Englishes and related fields
2.5.3 Re-theorizing acceptability judgment research
2.6 Acceptability, ideology, and authority
2.6.1 Language ideology
2.6.2 Metalinguistic judgments as a linguistic practice
2.6.3 Authority: Claiming the right to make judgments about language use
2.7 Conclusion
Chapter 3: Literature review of empirical studies of reactions to variation from standard English
3.1 Error studies in composition
3.1.1 Post-hoc error analysis studies
3.1.2 Error gravity studies
3.1.3 Summary
3.2 Comparisons of NES/NNES responses to L2 writers’ errors
3.2.1 NES/NNES error gravity studies in L2 writing
3.2.2 NES/NNES reactions to texts
3.2.3 Summary
3.3 Acceptability studies in world Englishes and ELF
3.3.1 Micro-level AJTs
3.3.2 Macro-level AJTs
3.3.3 Discursive AJTs
3.3.4 Open-Ended Textual AJTs
3.3.5 Summary
3.4 Conclusion
Chapter 4: Methodology
4.1 Introduction
4.2 Overview of the study and research design
4.3 Recruiting participants
4.4 About the participants
4.5 Research sites
4.5.1 SIC (Small Independent College)
4.5.2 NKU (National Key University)
4.5.3 ATC (A Technical College)
4.5.4 JVU (Joint Venture University)
4.6 Creating the AJT
4.7 Collecting the data
4.7.1 AJT
4.7.2 Interviews
4.8 Managing the AJT data
4.9 Data analysis procedures for the research questions
4.9.1 RQ1
4.9.2 RQ2
4.9.2.1 Features of CE
4.9.2.2 Features of ELF
4.9.2.3 Analysis procedures
4.9.3 RQ3
4.10 Validity, limitations, and related concerns
4.11 My position as a researcher
4.12 Conclusion
Chapter 5: What gets marked: Differences and similarities between Chinese and non-Chinese participants in the AJT
5.1 Introduction
5.2 Overview of chunks
5.3 Comparison of groups’ responses to chunks
5.3.1 Introduction
5.3.2 High consensus: Chunks marked by 50% of both groups
5.3.2.1 Rule violations
5.3.2.2 Possible Typos
5.3.2.3 Summary
5.3.3 Differing priorities for the Chinese group
5.3.3.1 Perceived Chinglish
5.3.3.1.1 Pronoun shift: Is it Chinglish?
5.3.3.2 Collocations marked due to narrow semantic interpretations of words
5.3.3.2.1 “Find job”
5.3.3.2.2 “Have a limit” and “have a limitation”
5.3.3.2.3 “Feel alone”
5.3.3.2.4 “Feel sense of inferiority”
5.3.3.2.5 “Calculating some questions”
5.3.3.3 Summary
5.3.4 Differing priorities for the non-Chinese group
5.3.4.1 Discourse markers
5.3.4.1.1 “It is known to all”
5.3.4.1.2 “Last but not least”
5.3.4.1.3 “Besides” (x2)
5.3.4.2 Dictionary words
5.3.4.2.1 “Cocker”
5.3.4.2.2 “Discretionarily”
5.3.4.2.3 “Improvident”
5.3.4.3 Summary
5.4 Conclusion
Chapter 6: Participants’ reactions to Chinese English and ELF
6.1 Overview of CE/ELF features in the AJT texts
6.2 Participants’ reactions to ELF
6.2.1 “Unnecessary” articles
6.2.1.1 “(The) society”: A special case?
6.2.2 “Missing” articles
6.2.3 “Overuse” of semantically general verbs
6.2.4 Countability
6.2.5 Overexplicitness
6.2.6 “Redundant” prepositions
6.2.7 That/who distinction
6.2.8 Summary
6.3 Participants’ reactions to CE
6.3.1 Chinese loanwords and loan translations
6.3.1.1 Harmonious society (loan translation)
6.3.1.2 Socialist market economy (loan translation)
6.3.1.3 Yuan (2x) (standing loanword/borrowing) and Gulou (ad hoc loanword/borrowing)
6.3.2 Semantic shift
6.3.2.1 “Alarm clock is open”
6.3.2.2 “Living outside”
6.3.3 Null subject
6.3.4 Adjacent default tense
6.3.5 Summary
6.4 Conclusion
Chapter 7: Thematic analysis of participants’ claims to authority
7.1 Mediator theme
7.1.1 Native speaker surrogacy (Chinese participants)
7.1.1.1 SIC2: “From my own point of view, and standing at your place”
7.1.1.2 ATC4: “The native speaker’s habit in my mind”
7.1.1.3 ATC1: “Whether the foreigners can make sense of the writing”
7.1.1.4 ATC2: “It will be terrible for foreigners”
7.1.1.5 SIC5: “There is not a native speaker sitting next to me”
7.1.2 Sympathetic readership (non-Chinese participants)
7.1.2.1 JVU5: “If a non-sympathetic person is not going to understand”
7.1.2.2 CAN9: “And that’s just the way it is”
7.1.2.3 CAN5: “I’m afraid that you are going to be marked on a system which is going to trash you”
7.2 Language user theme
7.2.1 Bilingual expertise (Chinese participants)
7.2.1.1 SIC2: “I know what it means because I translated it word for word from Chinese!”
7.2.1.2 ATC1: “I can understand some Chinese style of English”
7.2.1.3 ATC2: “We have it, the Chinese way of thinking”
7.2.1.5 SIC4: “I have to guess what is the Chinese meaning of this word”
7.2.1.6 SIC5: “I can understand her and all her classmates can understand her”
7.2.1.7 SIC6: “I felt lost in the two systems”
7.2.2 Bilingual expertise (non-Chinese participants)
7.2.2.1 JVU3: “They have used the Chinese version of something”
7.2.2.2 CAN2: “If I can come up with an obscure Arabic word”
7.3 Educator theme
7.3.1 Representative of best (Chinese) pedagogical practices (Chinese participants)
7.3.1.1 NKU9: “Make yourself love English so much”
7.3.1.2 ATC3: “They seldom read outside class”
7.3.1.3 ATC4 and SIC7: “When I was learning English…”
7.3.2 Representative of western academia (non-Chinese group)
7.3.2.1 JVU4: “This is a high school kind of EFL composition”
7.3.2.2 JVU3: “This word isn’t academic – it’s very much informal”
7.3.2.3 JVU6: “We should be pushing hard the… western model”
7.4 Conclusion
Chapter 8: Discussion and conclusion
8.1 Discussion
8.2 Implications of the study
8.2.1 Theoretical implications
8.2.2 Methodological implications
8.2.3 Implications for practice
8.3 Future directions for research
References
Appendices
Appendix A All “priority” chunks for the non-Chinese group
Appendix B All “priority” chunks for the Chinese group
Appendix C Acceptability Judgment Task
C.1 Prompt A
C.2 Prompt B
C.3 Prompt C
C.4 Prompt D
C.5 Prompt E
C.6 Prompt F
C.7 Prompt G
Appendix D Consent Form (English, for non-Chinese teachers)
Appendix E Consent form (Chinese/English, for Chinese teachers)
Appendix F Sample questions for semi-structured interviews
Appendix G Text of recruitment email to ATC department head
Appendix H Text of recruitment posting on professional association website

List of Tables

Table 2.1 Comparison of linguistic/cognitive view and social/ideological view of AJTs
Table 3.1 Error gravity studies reviewed
Table 3.2 Post-hoc error analysis studies reviewed
Table 3.3 Error gravity studies comparing NES/NNES responses
Table 3.4 Comparisons of NES/NNES reactions to L2 writers’ texts
Table 3.5 Micro-level AJTs reviewed
Table 3.6 Macro-level AJTs reviewed
Table 3.7 Discursive AJTs reviewed
Table 3.8 Open-ended textual AJTs reviewed
Table 4.1 Profiles of the Chinese participants (n = 30)
Table 4.2 Profiles of the non-Chinese participants (n = 16)
Table 4.3 Information about the AJT essays
Table 4.4 Comments on “heartless” (B064)
Table 4.5 Proposed lexical features of China/ese English
Table 4.6 Proposed grammatical/syntactic features of CE by Xu (2010) based on spoken data
Table 4.7 Proposed grammatical features of ELF (Seidlhofer, 2004)
Table 5.1 “Rule violation” chunks
Table 5.2 “Rule violation” chunks and representative comments
Table 5.3 Possible typos
Table 5.4 Perceived Chinglish
Table 5.5 Collocations
Table 5.6 Comments on “find job” (A068)
Table 5.7 Comments on “have a limit” (B061)
Table 5.8 Comments on “have a limitation” (B068)
Table 5.9 Comments on “feel alone” (G042)
Table 5.10 Comments on “feel sense of inferiority” (F082)
Table 5.11 Comments on “calculating some questions” (D085)
Table 5.12 Discourse markers
Table 5.13 Comments on “it is known to all” (A003)
Table 5.14 Comments on “last but not least” (C071)
Table 5.15 Comments on “besides” (D049)
Table 5.16 Comments on “besides” (E020)
Table 5.17 Dictionary words
Table 5.18 Comments on “cocker” (B058)
Table 5.19 Comments on “discretionarily” (B029)
Table 5.20 Comments on “improvident” (A059)
Table 6.1 ELF-like chunks occurring in the essays
Table 6.2 “Unnecessary” articles
Table 6.3 Chinese participants’ comments on “the college students” (F019)
Table 6.4 Comments on “a good schooling” (A002)
Table 6.5 Comments on “click a send” (G024)
Table 6.6 Comments on “the nature” (B071)
Table 6.7 Other chunks involving “society” and participants’ comments
Table 6.8 “Missing” articles
Table 6.9 Comments on “such competitive society” (E047)
Table 6.10 “Overused” verbs of high semantic generality
Table 6.11 Unconventional countability
Table 6.12 Comments on “woods” (G070)
Table 6.13 Overexplicitness
Table 6.14 Possible “redundant” preposition
Table 6.15 Lack of that/who distinction
Table 6.16 CE-like chunks occurring in the essays
Table 6.17 “Harmonious society”
Table 6.18 “Socialist market economy”
Table 6.19 “Yuan” and “Gulou”
Table 6.20 Semantic shift
Table 6.21 Comments on “alarm clock is open” (D071)
Table 6.22 Comments on “outside” across the three chunks (F000, F002, F007)
Table 6.23 Null subject
Table 6.24 Adjacent default tense
Table 7.1 Participants’ positions of authority

List of Figures

Figure 4.1 Example of AJT

Acknowledgements

I have read the acknowledgements section of many dissertations and theses with pleasure, if only because it is often the only place where something warm and human emerges from what can be an otherwise dry and academic document. So: I owe so much to so many people.

I would very much like to thank my PhD supervisor Ling Shi, who, from the first time I met her at a coffee shop in Shanghai, expressed interest in my research and encouraged me to find my own path, assuring me that there is always a way around the inevitable difficulties that emerge in the course of research. She is unfailingly supportive, kind, and wise, and has been behind this occasionally quirky little study from the beginning.
I also thank the other members of my committee: Ryuko Kubota, whose willingness to treat me as a colleague and collaborator is humbling and inspiring, and Patricia Duff, whose vast knowledge, warm collegiality, and careful eye have been a great boon to me. I’m grateful to Anthony Paré and Janet Giltrow for agreeing to serve as university examiners, and to Xiaoye You for serving as external examiner, and I thank them all for their thought-provoking questions, comments, and suggestions.

The participants who agreed to take part in this study are, to a person, very dedicated and conscientious teachers. I cannot thank them enough for taking time out of their busy schedules to participate.

For funding, I am grateful to the Faculty of Graduate and Postdoctoral Studies, the Office of the Vice President for Research & International, Go Global, the Faculty of Education, the UBC-Ritsumeikan Academic Exchange Program, and the Department of Language and Literacy Education at UBC for various grants, scholarships, and travel funds provided along the way, as well as to the TESOL International Association and the American Association for Applied Linguistics for professional development and travel funds.

I have benefitted enormously from the support and kindness of staff members in LLED past and present, including Anne Eastham, Chris Fernandez, Anne White, Lia Cosco, Angela McDonald, Laurie Reynolds, and Teresa O'Shea, and those who have served as head of the department during my time here: Geoff Williams, Annette Henry, Lee Gunderson, and Anthony Paré.

There are many people outside of UBC I must thank for their role in helping me carry out this research. It has been so long since I started that many of them must have forgotten me; I have certainly not forgotten them. There are many people who helped me get access to participants in ways that were extraordinarily kind, but whose names I cannot mention in order to preserve confidentiality. I thank them deeply and sincerely.
Those I can thank by name include Chen Shanshan, Louise Green, and Damien Donnelly (who took me to the hospital when I broke my arm during data collection).

I am so glad I met Cesar Velandia, Mukda Pratheepwatanawong, Anna Du, and Yin Kam Hoe, who made my lonely months of data collection bearable with their friendship and dinners at the local Sichuan restaurant.

In Vancouver, I am thankful for my classmates with whom I have suffered and rejoiced, especially Kim Meredith, Sara Schroeter, Ai Mizuta, Won Kim, Ryan Deschambault, and Dennis Murphy Odo, and the past and present members of the TESL writing group, especially Nasrin Kowkabi, Ismaeil Fazel, Junghyun Hwag, Rae-Ping Lin, Bong-gi Sohn, Klara Abdi, Ting Ting Zhou, Natalia Balyasnikova, Ava Becker-Zayas, and Tomoyo Okuda. Special thanks go to Junghyun Hwag and Paulina Sierra for helping me think through how to deal with the enormous piles of data this study generated, and I have appreciated regular dissertation support group meetings with Andrée Gacoin and Autumn Knowlton.

I would also like to thank Steven Talmy, Margaret Early, Bonny Norton, and Lisa Loutsenheizer for their encouragement both in and out of class, and Nikola Hobbel, Suzanne Scott, David Stacey, and (especially) Terry Santos, whose past support buoys me up even now. I also thank Dianne Fouladi, Sandra Zappa-Holman, and Reg D’Silva for their leadership and wisdom.

Nick Bansback wrote VBA scripts for Excel and supplied me with Sunday night beers for nearly the entire duration of my PhD. Ta very much.

For moral support and friendship here in Vancouver, my family and I have been blessed by many friends, including but not limited to Nick and Becky Bansback, Matt and Becky Smith, Josh and Danielle Blond Wauthy, Gavin and Elsie Chin, Heather and David Dunn, Jeff and Nadine Kapetyn, Rachel Jantzen, Romi Ranasinghe, Maureen Mogambi, Brad and Naja Eachus, and others from Tenth Avenue Church Kitsilano.
I wrote this dissertation while listening to the following albums, each dozens if not hundreds of times: Japancakes’ Loveless, The Arcade Fire’s Reflektor, Stars of the Lid’s And Their Refinement of the Decline, The Album Leaf’s In A Safe Place, Five Iron Frenzy’s Engine of A Million Plots, Frodus’ Conglomerate International, and Clams Casino’s Instrumentals (Vols. 1-3).

Finally, I would like to thank my parents, Jane Thurlow and Kevin Hartse, and my sister, Kendal Hartse, for their unflagging love and encouragement, and my parents-in-law, Tom and Joune Heng, for their support and interest in my career. Were it not for them I probably never would have set foot in China.

There will never be enough time or space for me to thank Sarah and Oliver Heng Hartse for the hope, love, and joy they bring to my life. I’m so looking forward to whatever Providence has in store for our little family, and to meeting its newest member. I promise never to write a dissertation again.

Chapter 1: Introduction

This dissertation is in large part about the ways in which people in the position to judge academic writing make judgments about it, especially when they judge it to be in some way non-standard, and how this is impacted by the transnational contexts in which teachers and students live and work. I cannot speculate as to whether I would have been able to write it more quickly, efficiently, or painlessly if the subject matter had not made me suspicious of many linguistic, rhetorical, and editorial decisions I have made. At various points during the writing of what would become this dissertation, I have been told both that my writing is good and that it is difficult to follow; I have been told that I have a clear academic voice and that I am trying too hard to write like someone I am not; I have been told that language I have used is too technical or not technical enough.
When I submitted one of the dissertation chapters as an article to a major journal in my field (it was rejected), one of the reviewers wrote “izzat a word?” in response to a lexical item I took to be the only sensible choice for my meaning. During my doctoral studies, I published one book and co-wrote another, and had several articles accepted by prominent academic journals. Yet because of these contradictory comments, I have often questioned my ability as an academic writer and asked myself whether I can write for a scholarly audience at all. As one of my mentors is fond of saying, academic writing is a second language for everyone. And it seems we are all judged, often harshly and in contradictory ways, by a variety of judges, when writing in that language. The contradictory responses I have had to my own writing have led me to the conclusion that the only reader I can hope to satisfy is myself — but they have also given me more insight into and interest in the ways that the readers of academic texts make these judgments. Of course, as an English language writing teacher myself, I am frequently called upon to make judgments of my students’ writing, and while I try to do so to the best of my ability (as those who made the contradictory comments about my own writing did), I am finally aware that making judgments about good writing, even — and especially — at the level of “small and common words” (Chen & Lee, 2009), is far from a simple, black-and-white matter.

This dissertation takes that reality as its starting point, and investigates it in a specific context. The context is academic English writing in higher education in the People’s Republic of China, a context very much embedded in the internationalization both of the English language and of English-medium higher education.
(My understanding of “academic English writing” is broad; I use the term to describe any English writing done at the postsecondary or scholarly level, but this study focuses on writing produced by Chinese college students in English writing courses or on exams.) The specific focus is how Chinese and non-Chinese English teachers make judgments about the unacceptability of language use that they view as non-standard in texts written by Chinese university students.

The spread of English in the contemporary world has been widely acknowledged, and many scholars have, especially in the last thirty years, explored the consequences and implications of this spread in many contexts from a number of perspectives. One area in which English is particularly important is the continually internationalizing domain of higher education. English writing in particular is a “practical tool” (You, 2006, p. 200) and a necessary resource for a large number of college and university students around the world. “English writing,” however, is not a monolithic skill set transferable from one social context to another; writing pedagogies and practices differ across various boundaries, and international students know too well that a well-written essay in an English writing class in their home country is not the same as a well-written TOEFL essay, nor a well-written term paper at an English-medium university, nor a well-written refereed journal article in a particular discipline. As the transnational movement of students from one English-using context to another continues, and the reality of differences in standards, norms, and models of English is better understood, a reevaluation of second language (L2) writers’ practices and their texts is in order.
This study contributes to that reevaluation by investigating questions about how academic writing which deviates from a presumed standard written English (SWE) is received by those who read it – questions which are difficult to answer without crossing disciplinary, theoretical, and geographical boundaries.

Traditionally, a distinction has been made between English as a Foreign Language (EFL) writing and English as a Second Language (ESL) writing, and both have been seen as distinct from first-language writing in English (or L1 composition). Globalization in education and other domains has blurred these distinctions, however, and while decades of research on second language (L2) writing have produced useful results for educators, it is important to consider how “non-error-based” or “variation-based” approaches to second language writing can result in new knowledge and insights (see Heng Hartse & Kubota, 2014), and in fact to look beyond the disciplinary boundaries of “second language writing” for new ways of understanding texts and the people who interact with them (see Chapter 2 for a discussion of translingual writing, for example). Several such approaches have been advocated in recent years (described in Chapter 3 of this dissertation), and in this study I look at data about academic writing through the lenses of world Englishes (WE), English as a Lingua Franca (ELF), translingual writing, and language ideology.

1.1 Context of the study

The People’s Republic of China, the context which this study investigates, offers challenges to current understandings of “English in the world” (Rubdy & Saraceni, 2005). There have been many studies of English in China in recent years, ranging from explorations of English and identity (Lo Bianco, Orton, & Gao, 2010) and the history of the language in the region (Bolton, 2003) to ideologies in English textbooks (Adamson, 2002) and surveys of English education in Chinese-speaking areas (Feng, 2013).
While China has been described as an EFL context or, to use Kachru’s (1986) terminology from his model of WEs, an “Expanding Circle” context (in which English has limited usage in restricted domains, mainly for communicating with foreigners, and with reliance on the norms of “inner circle” standard English), the endorsement and appropriation of the language by the country’s educational policymakers has blurred the lines between ESL and EFL, and the growing English-knowing, middle-class population of China blurs the lines between the “Expanding Circle” and the “Outer Circle” (that is, countries in which English has been widely appropriated and is used intranationally in many domains, such as Singapore and Nigeria). For China, the question of “what… English literacy means” remains open (You, 2006, p. 200), as educational policy now advocates the learning of English from grade three through university (Chao, Xue, & Xu, 2014; Nunan, 2003, p. 595) and English is increasingly used for a variety of purposes by young middle-class professionals (You, 2011).

I became interested in this research because of my experience working as a university English instructor in Mainland China. I did this for two years (2007-2009), and it was my first full-time, ‘real’ English teaching position. This experience had a profound impact on the way I see teaching, language, and the role of internationalization in education. (In fact, I had no previous opinion on some of these issues.) During my time in China, I began reading the literature on world Englishes, and China English in particular, so my reading of the literature and my professional life informed each other as I began to think about possible PhD research topics. China’s increased emphasis on English in recent years has coincided with the rise of alternative theoretical approaches to non-native Englishes.
Rather than viewing English use by Chinese (and other) second language speakers as potentially being error-laden or deviant from standard English, these perspectives – particularly WE and ELF – aim to objectively describe and advocate for varieties of English that differ from American or British norms. The two areas of most immediate relevance to this study are China English (CE) (from the WE paradigm) and ELF. CE and ELF were selected because research suggests that features of these types of English are likely to be present in the texts used in this study, and studies of reactions to CE and ELF in the written texts of Chinese English users have been almost nonexistent. These two research traditions are described below.

1.1.1 China English

English has long been regarded as an important foreign language in China, and educators have recognized the need to contextualize it in a way that accommodates local needs. Ge (1980) is credited with proposing the notion that China could have its own variety of English. He argues that “China English” should be recognized as necessary in Chinese-to-English translation; he cites terms such as “eight-legged essay,” “four modernizations,” and baihua (a vernacular Chinese literary movement) which do not exist in American or British English. He refers to these terms and other unique features of English developed in China as 中国英语 or “China English” rather than 中式英语 (“Chinese English,” also called “Chinese-Style English” or “Chinglish”), the latter being a mostly pejorative term that more often refers to learner error or, in recent years, inaccurately translated signage. Jiang (2003) suggests that Ge’s motivation was in part to replace the pejorative “Chinglish” with a positive term which matched the reality of the need for uniquely Chinese words in English. Although the notion of CE was developed independently of Kachru’s (1986) Three Circles model, it eventually found its home in the WE framework.
The distinction between “China English” and “Chinglish” has been crucial to Chinese scholars. Starting in the mid-1990s, papers distinguishing the two appeared in international journals, beginning with Jiang (1995), who argues that Chinglish is “a pidgin” or an interlanguage, which is “an unavoidable yet necessary stage on the way to learning English as a second or foreign language” for a Chinese student (p. 51). “China English,” on the other hand, is characterized by “a near-native yet Chinese accent,” words unique to the Chinese context, old-fashioned forms resulting from outdated materials, and a mixture of American and British norms (pp. 51-52). This distinction is also taken up by Dong (1995), Zhang (1997), Wei and Fei (2003), Jiang (2003), Yang (2006), and Cui (2006), and in recent years the existence of a legitimate “China English” as opposed to a stigmatized “Chinglish” has been largely accepted by many scholars and educators of English in China. More recently, He and Li (2009) and Xu (2010) have offered reviews of their efforts to clarify and define the meaning of CE. After synthesizing previous definitions, He and Li (2009) propose to define CE as a

    performance variety of English which has the standard Englishes as its core but is colored with characteristic features of Chinese phonology, lexis, syntax and discourse pragmatics, and which is particularly suited for expressing content ideas specific to Chinese culture through such means as transliteration and loan translation. (p. 83)

Similarly, Xu (2010) refers to CE as

    a developing variety of English, which is subject to ongoing codification and normalization processes. It is based largely on the two major varieties of English, namely British and American English. It is characterized by the transfer of Chinese linguistic and cultural norms at varying levels of language, and it is used primarily by Chinese for intra- and international communication. (p.
1)

Scholars have studied social attitudes about CE (e.g., Hu, 2005; Kirkpatrick & Xu, 2002; Pan & Block, 2011) and made provisional attempts to define its linguistic features; the place of CE in relation to American, British, and other world Englishes (Hu, 2004) remains an area ripe for research. Studies of written CE have tended to emphasize its use in media (Cheng, 1992; Gao, 2001; Yang, 2005) and occasionally literature (Xu, 2010; Zhang, 2002) and the Internet (Fang, 2008; You, 2008, 2011), but have rarely been extended to academic writing. While the “Chinese style of English writing…has begun to be conceptually appreciated” by some scholars (You, 2004a, p. 256), studies of academic L2 writing in China have largely not adopted a CE focus. This study thus aims to address that gap by investigating participants’ reactions to putative features of CE (these features are described in Chapter 4).

1.1.2 English as a Lingua Franca

English as a Lingua Franca (ELF) is another important research tradition which has implications for the use of English in China; it draws on WE scholarship but distinguishes itself from WE in several ways. While understandings of what the term ELF actually refers to may differ (see Friedrich & Matsuda, 2010), the classic definition of ELF is “a ‘contact language’ between persons who share neither a common native tongue nor a common (national) culture, and for whom English is the chosen foreign language of communication” (Firth, 1996, p. 240).
As a concept, ELF has much in common with more generic references to “English as an international language” in that it attempts to describe the way in which English is used as a language of wider communication among many people of differing linguistic backgrounds across national contexts; in practice, however, ELF refers to a specific research tradition which focuses on empirical descriptions of the type of communication described in Firth’s definition, both in terms of linguistic features and communicative strategies.

ELF researchers have posited, in addition to pronunciation and pragmatic features of ELF, a set of grammatical features that may be typical of communication between nonnative speakers (first developed by Seidlhofer, 2004, based on preliminary analysis of the Vienna-Oxford International Corpus of English, and taken up by other ELF scholars such as Cogo & Dewey, 2012). How people react to these features has not been widely studied, though Y. Wang’s (2013) study involving acceptability judgments (discussed further in Chapter 3) and her 2012 dissertation examined what she has called CHELF, or Chinese speakers’ English as a Lingua Franca. From an ELF perspective, it can be argued that most (if not all) Chinese speakers use English as a lingua franca, and thus it makes sense to investigate their language use from this perspective. It is only recently that ELF scholars have extended their research to academic writing (see a discussion of this in Chapter 2), and in general, discussions of ELF are beginning to move from simple “features” to more expansive inquiries into multimodal language practices (see Matsumoto, 2015), but for a study like this one, it is prudent not only to examine readers’ reactions to potential CE but also to potential ELF usages in Chinese university students’ written texts, since both of these traditions have important implications for Chinese English users’ language and reactions to it.
The features of ELF that have been described by ELF scholars are covered in Chapter 4.

1.2 Gaps

Broadly speaking, this study is situated in the gap between descriptive sociolinguistic studies in world Englishes, English as a Lingua Franca, and related approaches on the one hand, and second language writing (or simply writing studies) on the other. While sociolinguistics and contemporary writing research share similar concerns – particularly when it comes to forms, functions, and contexts of language use – there has been little fruitful cross-pollination between these fields. Recent work by Lillis (2013) and Coulmas (2013) shows that this has a historical basis in sociolinguistics’ privileging of spoken language over writing, but in the last few decades, the influence of sociolinguistic approaches on scholars who work with writing has been growing considerably (see McKay & Hornberger, 1996, and Hornberger & McKay, 2010). Specifically, this study merges the concerns of descriptive sociolinguistic work in areas like world Englishes and English as a Lingua Franca with those of second language writing scholars interested in the unique features of L2 writers’ texts. While the former usually describes such texts in terms of variation or difference from standard English, with the goal of showing how non-standard (or non-Inner Circle) varieties are coherent and legitimate languages, the latter has traditionally approached L2 writers’ texts as distinct from first language writers’ texts in negative terms, such as being less complex or more error-laden (see Silva, 1993, and Hinkel, 2002). The present study thus borrows both theoretically and methodologically from both approaches.
This is described in more detail in Chapters 2 and 3; both sociolinguistics and writing studies have traditions of researching readers’ reactions to nonstandard language use and the consequences of those reactions – acceptability judgments in sociolinguistics and error gravity studies in writing – and this study merges those two by adopting a sociolinguistic variation-based approach to readers’ reactions to L2 writing. That is, in this study, I examine readers’ reactions to putatively non-standard language use in academic English writing by Chinese university students, but without a preconceived notion of what counts as an error.

This allows the study to fill two important knowledge gaps: first, it allows us to examine, on a large scale, what different readers prioritize when they encounter language use they deem nonstandard or unacceptable in academic writing. This is done in a bottom-up fashion, without predetermining “error” categories for research participants or showing them obviously unconventional sentences in a decontextualized questionnaire (a technique that has been common in both linguistic studies of acceptability – see Chapter 2 – and studies of error gravity in writing – see Chapter 3); instead, they read full essays and make judgments of lexical and grammatical use in context. Secondly, this more contextual approach allows an investigation of how and why readers react to features of non-standard varieties of English (specifically, Chinese English and English as a Lingua Franca) when they encounter them in their “natural habitat” of an authentic text. Finally, when it comes to the question of both what judgments people make about language use and why they make them, studies of language ideology in sociolinguistics (e.g., Cameron, 2012; Milroy & Milroy, 2012) have rarely been applied to studies of reactions to written language.
This study addresses that gap by looking at participants’ perceptions of how they exercise the authority to make judgments about (written) language use; I view both of these what and why questions as being embedded in language ideologies (described more in Chapter 2).

1.3 Research questions

In light of the existing gaps identified, this study focuses on acceptability judgments and participants’ accounts of how and why they make those judgments. These, then, are the research questions:

1. What lexical and syntactic features in Chinese student writing do Chinese and non-Chinese English teachers identify as unacceptable, and why?
2. How do participants react to chunks which evince features of either Chinese English or English as a Lingua Franca, and why?
3. By what authority do participants make judgments about the acceptability of English usage in writing?

To answer these questions, this study uses as its primary method of inquiry an acceptability judgment task (AJT) (e.g., Greenbaum, 1977; Higgins, 2003; Schütze, 2011; see Chapter 2 for a detailed discussion of AJTs) with follow-up interviews. Critically, unlike most “acceptability” studies in linguistics and related areas, which tend to use decontextualized sentences of stereotyped or unusual usage rather than language in context (e.g., a whole essay), this study asks participants to make their acceptability judgments of linguistic features in the context of an entire piece of writing (see Gupta, 1988, and Parasher, 1983). This is an innovation which broadens the concept of an acceptability judgment task for sociolinguistic research, allowing for participants’ subjective judgments to be analyzed both quantitatively, in the sense that the most frequently “marked” usages can be represented via descriptive statistics, and qualitatively, in that their judgments of acceptability will be accompanied by language data expressing their reasons for making the judgments.
This study thus makes a contribution not only to knowledge about reactions to second language writers’ language use, but also to the methodology of studying these reactions in a variety of contexts.

1.4 Outline of the dissertation

This dissertation is organized into eight chapters. Chapter 1 has provided an introduction to some of the disciplinary, theoretical, and methodological issues involved in a study of this nature, touching on pressing issues involving sociolinguistics and second language writing and positioning the study as inspired by acceptability studies in sociolinguistics but moving toward a more ideologically-inflected perspective.

Chapter 2 provides a review of relevant theoretical literature related to the topic, including approaches to language difference and error in first and second language writing studies; the primary guiding assumptions of approaches to writing in the areas of WE, ELF, and translingual writing; the history and development of acceptability judgment research; and the relationship of language ideology and authority in language to acceptability judgments. Ultimately, I argue that there is a need to investigate reactions to nonstandard usages in writing from readers’ perspectives in light of the recent turn toward sociolinguistics in writing studies inspired in large part by globalization.

Chapter 3 is a review of empirical studies drawn from three different disciplines – composition, second language writing, and sociolinguistic studies in WE and ELF – which deal with readers’ reactions to deviations from standard English, summarizing both their findings and their methodological approaches, and their relevance to the present study.
This chapter shows an evolution in these studies from a decontextualized, experimental approach to creating hierarchies of errors by category to a more qualitative, open-ended approach in which researchers investigate participants’ reactions to language in context via not only surveys but also interviews and other forms of talk.

Chapter 4 describes in detail the research design, methodology, and data collection and analysis procedures for this study, as well as reflections on my own position as a researcher on this topic in the context of China. In this chapter I describe the need to make the large amount of data (nearly 3,000 separate comments on chunks made by participants – for a definition of “chunks” in this study, see below) manageable. I also discuss my decision to narrow the scope of the analysis to certain “focal chunks” in order to investigate participants’ particular judgments of certain types of chunks in greater detail.

There are three data analysis chapters. Chapter 5 examines results from the AJT, describing the chunks of text that were most often marked by both Chinese and non-Chinese participants in the study, and discussing and analyzing differences between Chinese and non-Chinese participants’ responses to the AJT. It also includes explanations and analysis of interview excerpts in which participants discussed their reasoning for rejecting or not rejecting particular chunks. Chapter 6 also looks at the results of the AJT, but with a specific focus on how participants reacted to putative features of CE and ELF when they encountered them in the texts.

Chapter 7 focuses primarily on the follow-up interviews, offering a thematic analysis of the ways in which participants position themselves as people with the authority to make judgments of linguistic acceptability; these are divided into three themes of authority as a mediator, as a language user, and as an educator, each of which is expressed differently by members of the two groups.
Chapter 8, the conclusion, summarizes and synthesizes the data, and discusses implications of the study for theory, method, and practice in second language writing research, including implications for pedagogy and other practical applications for those who work with L2 writers and texts.

1.5 Definitions and acronyms

Below, I list some terms and acronyms that are used frequently in this dissertation. Some are relatively straightforward definitions of terms that may be unfamiliar, while others may be more ideologically contested; I present those to offer the reader my own interpretation of these contentious terms. I further discuss my own language ideology in section 4.11.

AJT (acceptability judgment task): An AJT is a research procedure originally used by linguists (e.g. Quirk & Svartvik, 1966) to elicit research participants' knowledge of whether certain sentences are legitimately part of their language. The AJT is distinguished from the grammaticality judgment task (GJT) in its acknowledgement that the research instrument cannot necessarily be assumed to be accessing only a participant's internal grammatical knowledge and that participants' responses may be influenced by other factors. This distinction is further discussed in Chapter 2.

CE (Chinese English or China English): A term describing a variety of English unique to the Chinese context, which is argued by some to be currently developing (Xu, 2010). It is generally distinguished in the literature from "Chinglish," which is characterized as the error-riddled language of Chinese learners of English, whereas CE is seen as a legitimate and grammatical variety of English expressing Chinese culture and identity (see, for example, Hu, 2004).

Chunk: The word "chunk" is used in this dissertation to refer to a sequence of words selected by a participant as unacceptable in the AJT. Chunks range from a single word to a phrase to a whole sentence to multiple sentences.
These should be understood as distinct from any notions of "chunking" or "formulaic chunks" referred to in SLA or other literature on language learning. In this study "chunk" has no particular linguistic definition and is understood in relation to the text a participant highlighted in Microsoft Word during the AJT.

ELF (English as a lingua franca): While this term can potentially be used to describe any interaction between two interlocutors for whom English is not a shared first language (see Firth, 1996), in practice it has come to refer to a particular theoretical approach to studying the usage of English across geographical and linguistic contexts. ELF as a field is associated with the investigation of features and pragmatic norms that govern interactions between (mainly) non-native speakers of English who do not share the same first language and culture. ELF is discussed more in Chapter 3, and its putative features are discussed in more detail in Chapters 4 and 6.

ES (English speaker): Anyone who is a speaker of English, regardless of when he/she learned it.

ESL (English as a second language): This term is traditionally used to describe the learning of English by those who have not learned it from birth, in contexts where English is the dominant language in the society where it is being learned (as opposed to EFL). In recent years, the word "second" has been seen as problematic (sometimes replaced by "additional," as in EAL) for learners who are multilingual, but I use this term due to its prominence in the field.

EFL (English as a foreign language): This term is traditionally used to describe the learning of English in a context where English is not a dominant language in the society where it is being learned. Globalization has, however, made this term sometimes seem ill-suited to contexts where it has traditionally been used (is English foreign in Denmark to the same degree that it is foreign in North Korea?).
NES(T)/NNES(T) (Native English speaker (teacher)/Nonnative English speaker (teacher)): These terms are used to distinguish speakers/teachers of English who have learned the language since birth from those who learned it later in life.

SWE (Standard written English): This term refers to a loosely-defined understanding of "educated" English usage used in written domains such as higher education, the media, and government.

Chapter 2: Sociolinguistic and ideological approaches to variation from standard written English in a globalized context

2.1 Introduction

This chapter looks at traditional second language writing approaches and sociolinguistically oriented approaches to variation from standard English, with an emphasis on writing. It aims to connect recent theoretical approaches to writing influenced by sociolinguistics to standard language ideology, and to argue that a conceptual shift from viewing L2 texts as containing clearly defined errors to viewing them as including variations from ideologically-inflected notions about standard written English (SWE) will produce fruitful results in research about L2 writing and readers' reactions to it.

The structure of this chapter is as follows: first I review the notion of language standardization, particularly focusing on English, and discuss the role of written English in the construction of standard English, and the difficulties in applying a single understanding of standard written English, given its global spread. Next, I describe theoretical approaches to variation from standard written English, ranging from traditional 'error-based' approaches to L2 writing to recent variation-based approaches (namely world Englishes, English as a Lingua Franca, and translingual writing), arguing for the need to move from an error-based approach to a variation-based approach.
Because a variation-based approach calls for a more flexible construct than error, I describe the construct of linguistic acceptability, arguing that the more subjective and sociocultural aspects of acceptability which linguists have tended to view as weaknesses are actually strengths for investigating people's reactions to non-standard language usage. Finally, I look at the place of acceptability judgments in the study of language ideology, arguing for the importance of studying authority in language judgments not (only) as something inherent in social institutions or powerful individuals but as a claim that is produced by judges of language use in their encounter(s) with texts.

2.2 Standard English as an ideological construct

One of the main concerns of the current study is variation from standard (written) English in academic writing, and how that variation is viewed by readers. It is important, then, to first understand what is meant by "standard English." Drawing on sociolinguistic understandings of language varieties and standard language, standard English can be seen as an ideological construction predicated on "the suppression of optional variation" (Milroy & Milroy, 2012, p. 30), which itself is ideological and based on (inter)subjective factors and social tensions. As Halliday (1978) argues, "although (language) attitudes are explicitly formulated in connection with immediately accessible matters of pronunciation and word formation, what is actually being reacted to is something much deeper. People are reacting to the fact that others mean differently from themselves" (p. 162). While Halliday goes on to argue that some people may make negative judgments about others' language use because "they feel threatened by it," it is not necessary to assume bad faith or intentional animosity on the part of those who actively defend "standard English," even in an explicitly ideological way.
After all, the more "progressive" sociolinguistic understandings of language difference I describe below are themselves ideological as well, as Cameron (2012) points out. However, it is necessary to establish a theoretical position showing that standard English, rather than being a clearly defined variety of the language, is less a linguistic than an ideological construct. Below, I discuss the meaning of a "variety" of a language, how a standard (written) variety of English comes to be recognized, and the social nature of this process. I then discuss why the delimiting of standard written English is more complex in an era of globalization and the use of English in a number of contexts outside of traditionally (or ideologically) monolingual settings of English and the "first diaspora" of English (Bolton, 2006b, p. 293).

2.2.1 Language varieties

Variation in language use is one of the central concerns of sociolinguistics, unlike much mainstream linguistic theory, which "dismiss(es)…variation as an accidental feature of language" (Coulmas, 1997, p. 7). For sociolinguistics, the locus of research is not the language itself as only a "structured set of forms" (Linell, 2005, p. 4) per se, but rather how the differences between "meaningful actions and cultural practices" (p. 4) cause languages to differ. Thus, the "speech community," a group of people who "agree on the social meanings and evaluations in the variants used," is of utmost concern in determining which variations are used and accepted (Milroy & Milroy, 2012, p. 51). Speech communities do, in some way, "agree" on appropriate norms, but the amount of variation which causes a separation between, for example, dialects, is not strictly quantifiable, and indeed recent work describing "translanguaging" – "the act performed by bilinguals of accessing different linguistic features or various modes of what are described as autonomous languages" (García, 2009, p.
128) – complicates notions of separating varieties altogether. It does seem, however, that the particular agreed-upon variations of one community "who believe themselves to be speakers of the same language do indeed cluster enough for the belief to be highly plausible" (Pateman, 1984, n.p.). The term variety, while inexact, accommodates the fuzziness of how variations in language usage lead to people making distinctions between languages. Trudgill and Hannah (1994), for example, refer to "the two main standard varieties" of English ("English English" and "North American English"), outlining differences of "phonetics, phonology, grammar, and vocabulary" between these varieties (pp. 2-3), but recognizing that while people describe them "as if they were two entirely homogeneous and separate varieties," there is also considerable variation within each variety (p. 56). Historically, the definition of "a variety" of a language has always been vague: the term can denote "any identifiable kind of language" (Spolsky, 1998, p. 126), and has been used synonymously with other metalinguistic terms like dialect, register, medium, field, style, accent, sociolect, idiolect, and so on (Bolton, 2006b, p. 290). On one hand, variety is not a technical term, yet there is a sense in which its broadness allows it to elude both the more theoretically conflicted term language and the more ideologically inflected term dialect. If we want a serviceable definition of "a variety of a language," it might be called a collection of variations in language use which cause it to be defined as separate from other varieties of the language, but which is considered to share enough in common with other varieties to be related to them. It is, of course, people who "define" and "consider" varieties to be what they are; ultimately, what makes a variety of a language is people's belief that it is one.
This should not be taken to mean that varieties are not "real," but that they are constituted in large part by what we say they are. The labeling of varieties of English, for example, has had important social, political, and identity implications for their users, both negative and positive.

2.2.2 Standard(ized) language

Trudgill and Hannah's (1994) reference to "Standard Varieties of English" elucidates another important point about the differences between varieties of a language: the most socially relevant division made in a given society (and even, perhaps, internationally) is that between standard and non-standard varieties. To be conversant with the standard variety of language in a given society is, in a sense, to sociolinguistically belong in a way others do not. Yet a "standard variety" retains all the definitional fuzziness of a simple "variety." The British Committee for Linguistics in Education, for example, refers to standard English as "one variety of modern English, alongside a wide range of non-standard varieties" which "may be distinguished from non-standard varieties according to a relatively small number of linguistic features" (Committee for Linguistics in Education, n.p.). In fact, even if its contours are often delimited via prescription (shoulds) and proscription (should nots), standard English has no essential linguistic definition. Crystal (1994) refers to standard English as "a minority variety (identified chiefly by its vocabulary, grammar, and orthography) which carries the most prestige and is most widely understood" within a given "English-speaking country" (p. 24). To other scholars, standard English is "the variety of the English language which is normally employed in writing and normally spoken by 'educated' speakers of the language" (Trudgill & Hannah, 1994, p. 1), or simply "the language of the educated" (Lippi-Green, 1997, p.
54), defined by a set of abstract norms which advocate the "suppression of optional variation at all levels of language," ostensibly in the service of less misunderstanding and more efficiency in communication, but also motivated by social, political, and commercial factors (Milroy & Milroy, 2012, p. 30). The concept of standard language itself, according to Crowley (2003), emerged in Britain "from the difficulties and problems found by nineteenth century linguists and in particular the lexicographers of the late 19th century" (p. 104). This is not to suggest that there is no agreement among groups about which variations should be used; rather, it confirms that standard language is not a linguistic concept, but an ideological one.

Because there is no linguistic definition of "standard English" as an entity, a more salient concept is the standardization of English, or the social and ideological process by which a variety comes to be called standard and promoted as the ideal variety to be used in a given society. This has implications in "monolingual" Inner Circle English societies as well as those described as Outer Circle and Expanding Circle contexts, though the development of a reified standard Inner Circle English (American or British) necessarily precedes the development in the latter. Milroy and Milroy (2012), in their analysis of standard English in (ostensibly) monolingual contexts, discuss the standardization of a language as a process comprising the following steps (presented here in slightly different order than the original):

1. A need for uniformity is felt by an influential group in society, ostensibly for purposes of efficiency and intelligibility. Note that even at this early stage, the way the process is described is necessarily subjective and ideological: a "need" is "felt" by what the Milroys describe as "influential portions of society" (p. 22).
While the goal may be ease of communication, the process is inevitably one carried out by an elite, educated group.

2. A variety is selected to fulfill that need. In fact "competing varieties might no doubt be selected by different parts of the community" (p. 22), so the "selection" of a variety is likely to be a jockeying for position between individuals and social groups of various social status and influence in society.

3. That variety is accepted by influential people in society. When eventually one variety emerges as the most influential, it is thus again because of its acceptance by the influential (powerful, educated) people in society.

4. That variety is diffused (i.e., imposed) through social institutions such as the media, government, publishing, and, especially, education.

5. The standardized variety is maintained via a number of processes, including:
   a. "Elaboration of function," or the spreading, specialized use of the standard variety in certain high-status domains.
   b. The imbuing of the language with prestige, for example via its usage in respected institutions or by influential people.
   c. The codification of the language in pedagogical and reference works (dictionaries, grammars, textbooks).
   d. Prescription and proscription in a number of contexts (promotion of the standard variety in popular books about language use, rejection of alternate usages and new variations).

Thus, what is meant by "standard English," while it is still frequently appealed to as a static entity in disputes over usage, is actually created and sustained through an ideological process; as Lippi-Green (1997) argues, the "ideology of standardization… empowers certain individuals and institutions" to "control and limit spoken language variation" (p. 59).
Far from being a neutral code, standard English is an ideological discourse; a community cannot really ever have more than one "standard variety" of English because "the standard" exists more as a hegemonic ideal than as a variety among varieties. In the traditionally English-speaking countries, the ideology of standard English promotes educated, written usage and suppresses variation, particularly variation associated with socially marginalized groups (e.g., the poor, African Americans, rural residents, and so on). The many definitions of standard English which refer to it as the language of "the educated" are a clue to one of the neglected areas of examination in standard English, which is the role of writing and literacy in its maintenance. This is explored in the next section.

2.3 Writing and standard English in a globalized context

As a field, sociolinguistics has long been focused on speech as "authentic" language-in-use, with writing viewed as secondary or merely a way of "recording" language. Coulmas (2013) offers an account of why linguistics resisted "the tyranny of writing" (p. 1) when establishing itself as a field. While there may have been reason to separate writing and speech in the history of linguistics and sociolinguistics, it does not make sense to limit the discussion of standardization to spoken English, since writing plays a key role in most people's understanding of standard English and in its continued maintenance.

In this section, I argue that because written language and literacy education arguably play the most influential role in the development and maintenance of standard English in any given society, more attention should be paid to writing (particularly academic writing, where standardness is most actively enforced) as a site where standard language ideologies are enacted.
I then describe the impact of globalization on the importance of standard written English and the related stigmatization of written English use that is perceived as non-standard, and the possibility of differing standards for written English across different global contexts.

2.3.1 Written English as standard English

Lurking behind the term "standard English" is nearly always (except in the case of pronunciation) standard written English. It is written norms and written models which tend to be followed as standard; as Linell (2005) points out, "models and theories of language have been developed that are strongly dependent on long-time traditions of dealing with writing and written language," and these models "are ultimately derived from concerns with cultivating, standardizing, and teaching forms of written language" (p. i). Despite this, however, most scholarly and public debates about standard varieties center on the speakers of a language or disputes over how it should be spoken (see, for example, Lippi-Green, 1997). This uncritical conflation of writing and speaking should prompt more investigation into how writing is viewed and taught, and how ideas about standardness are reproduced in discussions of writing and writing pedagogy.

Written language has had a disproportionate role in the standardization process in most English-speaking countries; indeed, the standardizing of writing has been more successful than that of speech, and many complaints about spoken English are seemingly due to spoken usage not matching the formal norms of writing. Milroy and Milroy (2012) argue that historically, English "standardization has first affected the writing system, and literacy has subsequently become the main influence in promoting the consciousness of the standard ideology. The norms of written and formal English have then been codified in dictionaries, grammar, and handbooks of usage" (p. 30).
In fact, the term "standard English" has almost always been synonymous with "the medium of writing in the English language, grammatically stable and codified" (Crowley, 1999, p. 271). To be familiar with standard English is, then, to know how to read and write it.

Halliday (2006) also argues that when a language "becomes… written" and standardized, the change is not merely a social one, but is actually "systemic" (p. 357). The standard (written) language expands into more domains so that it becomes the preferred mode for doing certain things: "the semogenic power of the language is significantly increased" (p. 357). Writing and standardization – which go hand in hand – lead to change and growth in meaning potential, according to Halliday, so that it becomes possible to mean and ultimately do more with a standard (written) language. Thus, in a society which places the highest premium on standard written language, not to avail oneself of standard written English may lead to the inability to perform certain socially beneficial actions. Written language, as a primary channel of standardization, suggests a settling of difference, a literal marking down of norms regarding what is accepted or valued by the influential and educated in society. The effects of being proficient at reading and writing standardized forms are undeniable in terms of access to information, education, prestige, and other cultural capital.

The idea of writing as a necessary skill in order to "do more" is well attested, especially in educational and academic settings.
Lea and Street (1998) discuss how writing, and being able to write in a particular way, is positioned as a necessary skill: the process of writing, especially student writing, is "defined by what constitutes valid knowledge within a particular context," and the social significance of this knowledge "can be identified through both formal, linguistic features of the writing involved and in the social and institutional relationships associated with it" (Lea & Street, 1998, p. 172). Standard language ideology is embedded in norms of academic writing; to write well academically is to join the educated community of standard-bearers. This becomes more complicated when the community of English users spans various national contexts, as in the case of higher education.

2.3.2 Standard written English and globalized academic writing

So far, the theoretical positions on standard English described above have been developed by theorists describing "monolingual" English contexts, for whom the implications for understanding the meaning of standard English in "non-English-speaking" countries are rarely a concern. However, it is important to adopt a conceptualization of standard(ized) English that can be applied to the globalized, internationalized contexts applied linguists are used to working in today. Not only is written language the primary channel of standard(ized) language, as we have seen above, but the emphasis on writing and standard written English in international academic culture makes the stakes even higher: English literacy implies the need for facility in both reading and writing standard (British or American) English.
Li (2007) argues that for non-native users of English, "being equipped with Standard English" is an indispensable "prerequisite for life-long learning," and he advocates a pedagogy in China and Hong Kong which will allow students "to be literate in, and conversant with, lexico-grammatical features of the written standard variety in order to absorb all kinds of information in print or on the Internet" (p. 14). Canagarajah (2002) further argues that the mainstream view holds that "knowledge is writing," and that therefore "rhetorical, linguistic, and genre conventions of writing are not simply matters of textual form" (p. 101), but they have consequences for whether writing that is viewed as non-standard is even afforded the status of legitimate knowledge. Being able to write and publish in standard written English, even for those in non-English speaking countries, means meeting an "international standard" (Flowerdew & Li, 2009, p. 13) in order to participate in a global flow of information, ideas, and knowledge. However, writing in a way that is judged unconventional by "center" scholars, educators, publishers, and others in relative positions of power can lead to writers being marginalized in significant ways. The tension is not only between native norms and non-native local norms, but also perceptions of "acceptability," both considered locally ("Do I – or does my teacher – accept this as standard written English?") and internationally ("Would a native speaker accept this as standard written English?"). It can be difficult, if not impossible, to separate these issues from each other when considering the ways in which ideologies of standard (written) English are constructed in particular contexts. Standard English was originally conceived of by 19th-century linguists as a "certain and uniform literary form around which were grouped distinct sub-varieties" (Crowley, 2003, p.
87); it was considered "the form common to all literary pieces not tainted by the merely provincial" (p. 88). Today, however, the question of being "tainted by the provincial" is much more complicated by the spread and development of English around the world. The sociolinguistic realities described by world Englishes research (see below) suggest that not only different varieties, but different standard varieties of English can emerge. This has implications not only for pedagogy in monolingual English institutions which enroll students from other countries, but also and especially for Outer Circle and Expanding Circle contexts where the issues of local models in education (Li, 2007), large-scale language testing (Davies, Hamp-Lyons, & Kemp, 2003; Hamp-Lyons & Davies, 2008) and many other facets of English are complicated by both the multilingual realities of the country and the powerful influences of globalization and internationalization.

2.3.3 The globalization of English and the possibility of competing standards

Appeals linking standard English to literary traditions and national identity are commonplace and accepted in the United Kingdom and the United States. However, given the internationalization of education and academic writing, and the reality of globalization in general, there is a need to examine the ways in which standard English becomes a pervasive construct in education and other domains beyond "English-speaking" countries. It is worth examining the "language effects" (Pennycook, 2007) of the export of standard English ideology from monolingual English writing contexts to a variety of multilingual countries where English usage has developed in its own way.

From the above understandings of standardization, we have seen the influence of "the educated" (including, of course, linguists, who created the concept) and influential people in a society as being central to the maintenance of standard language.
Standardization is not a democratic process by which certain forms happen to emerge and become accepted by a speech community. While there is no reason for bottom-up changes in language not to occur—vernacular language change certainly cannot be controlled—standardization is more a political and social project than it is a linguistic one. And if we broaden this concept to examine the global spread of English, the question of standardization becomes more complicated, because the spread of English has brought with it the dominance of standard American or British English. The seemingly necessary social advantages of being conversant with these varieties, especially in their written forms, seem, for many, to outweigh the potential costs associated with acquiescing to politically powerful interests. One does not need to wholeheartedly adopt Phillipson's (1992) view that this is deliberate "linguistic imperialism" on the part of the United States and United Kingdom (although purposeful spreading of the language has, historically, been advocated by some in those countries) to see that British and American English do represent a kind of hegemonic standard, especially ideologically, for many teachers and learners of English. For example, 100% of the 1,261 Chinese students surveyed by Hu (2004) chose both British and American English in response to the multiple-choice question "What kind of English do you think is standard?" No students chose Indian, Singaporean, or even Australian, Canadian, or New Zealand English.

Phillipson (1992), who divides the English(es) of the world into the "core" and "periphery," describes the dilemma faced in non-Inner Circle countries: whether to acknowledge and accept local norms (which may be in the process of being established), or to stick with the core standard English(es) that have historically been viewed as preferable.
As Phillipson puts it, "The essential question then is the nature of the relationship between the standard English of core English-speaking countries and periphery English variants…The political, social, and pedagogical implications of any declaration of linguistic independence by periphery English variants are considerable" (1992, p. 26). He cites the debate between Quirk and Kachru (1991) on the legitimacy of non-native varieties of English as one which has important implications for "linguistic and pedagogical standards, language variation, the status of indigenized varieties of English, and the norms that should hold for learners of English in a variety of contexts," all of which are underscored by the question of "who has the power to impose a particular norm and why" (p. 26). The question of whether to uphold local or center standards has real consequences for teachers, learners, and many others in countries which have chosen or been forced to "import" English and have been developing local norms which sometimes deviate from the center standard English(es). The tension between acknowledging (a) rhetorical and linguistic practices in many non-native contexts which differ from center/standard English, and (b) the reality of the dominance of a particular (US/UK) standard variety of English has led to many debates about the proper role of non-native norms and models, and whether a non-center standard can or should be established. That written texts differ across contexts seems indisputable, but whether the features of those texts – and thus the knowledge they produce – are seen as aberrant or legitimate (by, crucially, a potentially diverse pool of readers) is a question to which too little thought has been given. It is true that different communities, whether in monolingual or multilingual English settings, may have developed standards which differ from the dominant standard promoted in the educational system.
Second language writing scholars are uniquely poised to deal with globalized academic writing as a site where standards are maintained and struggled with; to better address this tension, L2 writing scholars will need a more robust theoretical approach to "non-standard" English, which is the subject of the next section.

2.4 Approaches to variation from standard English in academic L2 writing

We have established above that standard English is not a linguistic, but an ideological construct, and that the role of writing in the understanding of standard English is preeminent. We have also seen that the application of theories of standardization to standard (written) English in globalized or international contexts is further complicated by the notion of local standards when considered in opposition to a kind of hegemonic Inner Circle English norm that predates the development of local norms. Given these tensions, it does not seem prudent to approach standard written English simply from a "common-sense" perspective which is able to identify some usages as objectively belonging to a monolithic "standard (written) English" while other usages do not. However, sometimes in traditional L2 writing scholarship, the notion of standard English and deviation from it have not been sufficiently problematized. In this section, then, I consider several related currents of research and thinking that can be brought to bear on the issue of variation/deviation from standard written English in NNES writers' texts. Here I am specifically interested in exploring the implications these approaches have for lexicogrammatical variation, because though this area is of paramount importance to many stakeholders involved in L2 writing (students, instructors, test-takers, test creators, editors, and so on), we have seen little research or discussion of how it is dealt with in practice.
In most contexts involving academic writing, lexicogrammatical variation from standard English is easily noticed by native and nonnative English speakers alike, and it can quickly lead to sentences, texts, or authors being perceived as deficient. While some variations are readily identifiable as linguistic errors, others may not be so straightforward, posing a challenge for us to draw the line between “acceptable” and “unacceptable” variations in certain contexts. Indeed, it is possible to view the acceptability of variations as socially constructed and highly unpredictable, as Williams (1981) demonstrated by deliberately inserting one hundred “errors” into an article in a prestigious academic publication. In scholarly inquiries focusing on second language writing, issues of linguistic variation have been discussed from several perspectives. While not all scholarly approaches to variations in written language discussed below have historically made writing and/or lexicogrammar an explicit focus, each offers important implications in this area. Below, I highlight what I consider to be a more or less “traditional” approach to errors in L2 writing, followed by three related theoretical positions which are more likely to treat deviation from standard written English as potentially legitimate variation rather than errors. Briefly, the four approaches discussed are:

1. The traditional error-based approach, which investigates errors in L2 writing and/or the unique characteristics of L2 writers’ texts as distinct from L1 writers’.
2. The WE approach, which advocates accepting multiple varieties of English.
3. The ELF approach, which investigates written language usage in terms of features common to nonnative English speakers’ discourse.
4. The translingual approach, which views texts as hybrid constructions influenced by multiple linguistic and rhetorical factors.
In each section, I will describe the approach in general as well as its applications to writing and lexicogrammatical variation in writing in particular.

2.4.1 Traditional error-based approach

The field of L2 writing in North America emerged, at least in part, as a “reaction to immediate pedagogical concerns in U.S. higher education” (Matsuda, 2003, p. 34), and during the early decades of its development as a field, was heavily influenced by the North American academic context. Perhaps unintentionally, this led to a monolithic conception of good writing based on practices of NES American students and instructors. NNES writers’ texts are thus usually read with an eye to how they differ from a presumed native speaker standard, often at the word and sentence level, and this has been a natural focus of L2 writing research. For example, Silva’s (1993) synthesis of early studies in ESL writing found that L2 writers made more errors than L1 writers in the areas of morphosyntax, lexicosemantics, verbs, prepositions, and nouns. While great advances have been made in examining and explicating features of NNES writers’ texts at the global level and arguments in favor of rhetorical structures other than narrowly defined traditional Anglo-American ones have been made (Kubota & Lehner, 2004; Matsuda, 1997; Ventola & Mauranen, 1996), examinations of L2 writers’ lexicogrammar have primarily been influenced by traditions of error analysis in L1 and L2 composition. What could be termed “error studies” became influential in L1 English composition toward the end of the twentieth century, with studies by Shaughnessy (1977), Connors and Lunsford (1988) (replicated by Lunsford & Lunsford, 2008), and Williams (1981) (see Chapter 3 for a detailed review of empirical studies in this area).
L1 composition instructors recognized the disconnect between theory (the process approach placed less emphasis on correcting errors than on developing whole compositions) and practice (most instructors felt a need to address students’ lexical, grammatical, and other local errors). L2 writing scholars found error studies more immediately relevant to their work (see Bitchener & Ferris, 2012, for further discussion of this phenomenon and a history of the treatment of error in L1 and L2 composition). Studies of teachers’ reactions to writers’ errors, measuring constructs such as acceptability, comprehensibility, and irritation, also became influential in L2 writing (Janopoulos, 1992; Kobayashi, 1992; Rinnert & Kobayashi, 2001; Roberts & Cimasko, 2008; Santos, 1988; Shi, 2001). Hinkel (2002) notes that studies in L2 writing have shown that faculty members reading L2 writers’ texts are “consistently focused on lexicogrammatical features of text, such as sentence structure, vocabulary, the syntactic word order, morphology/inflections, verb tenses and voice, and pronoun use, as well as spelling and punctuation” (p. 29). Lee and Chen (2009) highlight the pedagogical implications of this focus, suggesting more training for L2 writers “at the lexico-grammatical level” (p. 281); they argue that “while these may be small and common words, instructors may need to make a bigger deal out of them, as they connect with larger issues in academic writing” (p. 292). It is clear that lexicogrammatical variation from mainstream norms in L2 writers’ texts is a concern of most stakeholders in L2 writing (writing teachers, content faculty, and of course, students) and has been a major concern in L2 writing scholarship and pedagogy.
This is demonstrated in the large body of research on written corrective feedback (see Bitchener & Ferris, 2012), which suggests that research on L2 writing remains concerned with variation from standard written English, usually with the goal of allowing NNES writers to acquire the norms of academic written English in monolingual contexts.

2.4.1.1 The shift from error-based approaches

In recent decades, two relevant developments are opening new areas of inquiry for L2 writing: first, the growth of sociolinguistics-influenced research and theory in applied linguistics has given rise to a number of perspectives on difference (rarely referred to as error) from standard English relevant to writing, and second, an increased interest in sociopolitical concerns has focused on L2 writing for publication (rather than solely in the classroom). Studies on NNES graduate students’ and scholars’ difficulties publishing in English, often related to language issues (e.g., Cargill, O’Connor, & Li, 2012; Flowerdew, 1999, 2000, 2007; Y. Li, 2002; Li & Flowerdew, 2007), and larger studies of linguistic inequality between NES and NNES writers in academic publication (Ammon, 2001; Canagarajah, 2002; Lillis & Curry, 2010) should encourage us to look more closely at how non-error-based approaches could be applied to the issue of NNES writers’ written English. Below I describe three such approaches that are becoming influential.

2.4.2 World Englishes approach

While research on L2 writing has focused on variation in the form of errors, more explicitly political calls for recognizing variation as legitimate, or “pluralizing” academic writing, have recently been made from the world Englishes (WE) perspective. The WE paradigm grew out of Braj B. Kachru’s research on English in Asia, which broke new ground in the understanding of the global spread of English.
One of Kachru’s innovations was to distinguish between non-native English for international and intranational purposes: “there is a need to distinguish between (a) those countries (e.g. Japan) whose requirements focus upon international comprehensibility and (b) those countries (e.g. India) which in addition must take account of English as it is used for their own intranational purposes” (Kachru & Smith, 1985, p. 219). This theoretical division between second language and foreign language contexts is reflected in Kachru’s most well-known concept, the “Three Concentric Circles” model of English (1985), which was briefly described in Chapter 1. This model suggests that while there are significant differences in usage across the three circles, each variety should still be considered English. The language is pluricentric; the Inner Circle is not the “center,” but simply the result of the first historical diaspora of English. In second language writing, proponents of WEs have argued for recognition of non-Inner Circle features in academic texts as legitimate. These scholars hold that different national or social contexts are home to particular, localized (and legitimate) varieties of English whose norms and standards have uniquely emerged. Canagarajah (2006) made a call for “pluralizing academic writing by extending it to the controversial terrain of grammar” (p. 613) and has been active in promoting the acceptance of different varieties of English in academic publishing (Canagarajah, 2012). Y. Kachru (1995, 1997) suggests that L2 writers ought not simply be taught to write in the dominant and allegedly “direct, linear pattern of Western academic writing” (1995, p. 28). She points out that the Western norm limits the acceptance of variation in style, genre, and individual creativity for any writer of English; all varieties of world Englishes, and their related rhetorics and styles, she argues, should be acceptable.
Matsuda and Matsuda (2010) pinpoint the relevance of WEs to the lexicogrammatical level of academic writing, highlighting several important pedagogical implications, including the teaching of both dominant and nondominant forms and functions of English, examining “what works and what does not” in different writing contexts, and strategies for (as well as risks involving) deliberately using non-dominant or nonstandard varieties of English in writing (p. 372). Knowing that many writing teachers “would lower the student’s grade for a writing assignment if it includes features that deviate markedly from the perceived norm” (p. 371), Matsuda and Matsuda (2010) argue for the need to “embrace the complexity of English and facilitate the development of global literacy” (p. 373).

2.4.3 English as a Lingua Franca approach

The basic definition of ELF, as communication between two nonnative speakers of English, was briefly covered in Chapter 1. Research on ELF traditionally focused on a “common core” of the features of English used in spoken communication between nonnative speakers (Erling, 2005; Jenkins, Modiano, & Seidlhofer, 2001) and on pragmatic strategies (Seidlhofer, 2004), and has begun to explore implications for academic spoken English (Mauranen, 2012) and, more recently, academic writing (Carey, 2013). Although the field has mainly been concerned with the highly contextual nature of ELF communication in spoken English, Horner (2011) argues that the “attitudes and strategies” identified in ELF, such as “tolerance for language variation, patience, humility, and strategies of accommodation and negotiation,” should be employed in writing and reading as well (p. 302).
Since many of the features identified in ELF, such as non-marking of the third person verb, unconventional article usage, and so on, are likely to be those that most easily “mark” texts as non-standard, a broadening of the ELF approach to writing could potentially identify worldwide trends in how academic language is actually used, which could represent a break from traditional idealized ‘standard written’ English. Jenkins (2011), commenting on the implications of ELF for internationally-minded English-medium universities and academics, suggests that ELF norms could be embraced by the worldwide scholarly community, arguing against “polishing” by native speakers and asserting that there is “no principled justification for the norms of written academic English throughout the world to be those of Britain and North America” (p. 932). In her study on academic ELF in higher education, Jenkins (2014) reports a tendency among university instructors, especially those in Anglophone countries, to enforce conventional expectations for academic writing despite the actual practices of English writers in other contexts. Empirical research on textual features from an ELF perspective is in its infancy; a corpus of Written English as a Lingua Franca in Academic settings (WrELFA) is currently being assembled at the University of Helsinki (The ELFA Project, 2012). The researchers describe the justification for their project:

The current world of academic writing and publishing is far more globalised than it was a decade or two ago. Yet we have no research evidence on the determinants of effectiveness in academic rhetoric in a world that is permeated by English as a lingua franca, and a constant flow of cultural influences from a variety of sources…. Project WrELFA collects and analyses academic texts written in English as a lingua franca. The texts cover high-stakes genres in different fields, both published and unpublished. (para. 2-3)

The results of this project are expected to provide insights into actual textual practice in ELF written academic discourse.

2.4.4 Translingual approach

A more recent approach clearly shares affinities with both the WE and ELF approaches, though it is more postmodern and fluid in its application. This is a “translingual approach” to “language difference in writing,” proposed by Horner, Lu, Royster, and Trimbur (2011). (This should be understood as an approach to conceptualizing writing rather than a new term describing writing done by second language or other writers; see Atkinson et al., 2015). Rather than only focusing on discrete varieties of English or features of NNES communication, proponents of the translingual approach view “difference in language not as a barrier to overcome or as a problem to manage, but as a resource for producing meaning in writing, speaking, reading, and listening” (p. 303). Here, difference becomes the main reality with which to contend in academic writing, regardless of its source. Specifically, the authors state their aims as:

1. Honoring the power of all language users to shape language to specific ends
2. Recognizing the linguistic heterogeneity of all users of language both within the United States and globally
3. Directly confronting English monolingualist expectations by researching and teaching how writers can work with and against, not simply within, those expectations (p. 304).

These three points have important implications for a variety of issues in second language writing, but of particular concern here is the authors’ situating L2 writing in the framework of “language difference,” a broad term which encompasses varieties of English as well as differences in proficiency, dialect, register, and other factors.
In addition, lexicogrammatical variation is important: there is a need to look closely and empirically at “what constitutes a mistake (and about what constitutes correctness)… in matters of spelling, punctuation, and syntax” (p. 312). Canagarajah (2013) advocates the “translingual practice” of what he refers to as “codemeshing,” or “the mixing of languages,” “novel idiomatic expressions,” and “grammatical deviations from standard written English” (p. 1) in both student and scholarly writing. Canagarajah argues that some well-known scholars have successfully implemented codemeshing in their published texts and encourages students and novice scholars to embrace it (p. 125). A translingual approach to writing, as a more ideological examination of language differences, could potentially lead to wider acceptance of non-standard lexis and grammar in published texts.  You’s (forthcoming) concept of Cosmopolitan English (CE) is another translingual (or transliterate) approach. You calls CE the “English actually used by individuals across the globe, each with accents reflected in his or her pronunciation, vocabulary, syntax, and/or discourse structures” (pp. 9-10). According to You, CE “intends to capture the multiplicity of ways that individuals speak and write English within and across communities.…[I]t requires that we understand the characteristics and functions of English beyond any single cultural category” (p. 10). You, like the authors above, acknowledges language difference as the main issue to contend with in conceptualizing English use, and describes a definition of CE which “bypasses linguistic differences defined by birthright, nation, region, race, and ethnicity” (p. 9). 
Rather than defining English based on geographical location, native/non-native status, or other factors, CE acknowledges both the “nativeness” (“In a sense, every English speaker is a native speaker, native to one or multiple speech communities or to certain established norms”) and “accentedness” (“At the same time, every speaker would sound different, or accented, to interlocutors outside his or her communities”) of every individual English speaker, and, by extension, every instance of English use. In fact, while You uses “accent” – a speaking-related metaphor – to introduce CE, his goal is “calling for a cosmopolitan turn in writing studies” (p. 14).

2.4.5 Summary: A variation-based approach to L2 writing

All of these variation-based perspectives have important implications for second language writing research. While WE focuses more on local characteristics of English based on local (national) social contexts, ELF on a more general look at the emergent properties of NNES-NNES communication, and the translingual approach on a reappraisal of language difference by withholding judgment and honoring writers’ textual choices, what they all have in common is a refusal to simply view difference from idealized standard written English as error, and a commitment to further investigate the contexts in which the writing takes place. A variation-based theory (whether it draws primarily on any one of these three areas) that is agnostic about “error” is a natural fit for a more sociolinguistic approach to L2 writing, and to what Lillis (2013) calls “uptake,” or broadly, what readers “do” with texts, especially when they deem the language use nonstandard. What is necessary, I argue, is to investigate what actually happens when readers are faced with texts in which they encounter instances of “difference,” however they define it. As You argues, the “accentedness” of any (written) English will be noticed by some readers, and it is necessary to explore why and how this is.
The discussion above has established that standard English is an ideological rather than a linguistic reality, and that variations from it can be approached from an understanding other than one of “error” for many reasons. If standard English is maintained by judgments against nonstandard usage – and if nonstandard usage is just another kind of “language difference” that can be investigated in its own terms, rather than strictly in terms of “error” in (L2) writing – then a sound theoretical approach to readers’ judgments of variation is necessary. Below, I describe how the construct of acceptability, originally described by Chomsky (1965) as a more usage-oriented corollary of grammaticality, is suitable for this purpose.

2.5 Theorizing acceptability as an approach to language difference

As I have mentioned, a position I wish to advance in this dissertation is the use of “acceptability” as a construct in studies of writing, rather than traditional notions of “error.” This is not because “errors” have not been recognized as socially constructed and highly variable – they have, as mentioned above – but because shifting the focus from predefined, reified “errors” in language use or writing to a more general, bottom-up emphasis on language variation and how people react to it allows for a broader investigation of readers’ reactions to language use. While part of the next chapter will review studies of acceptability which focus primarily on variation rather than error, in the following section of this chapter I describe the concept of acceptability in detail, emphasizing its social and ideological nature.

2.5.1 Distinguishing acceptability from grammaticality

Below, I discuss the genesis of the construct of acceptability in theoretical linguistics and its applicability to other areas of language studies.
My purpose here is to show that while studies of acceptability in language grew out of a cognitive and quantitative tradition, the very shortcomings for which linguists have criticized the construct are a boon to sociolinguistic investigation because of the ways in which it can be applied to variation in language. Chomsky’s (1965) influential theory of language has driven the use of acceptability judgment tasks in language research. His distinction between “competence” (linguistic knowledge) and “performance” (linguistic behavior) has long endured, and his assertion that “linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogenous speech community” (p. 3) has arguably had an important influence on the assumed primacy of the “native speaker” in linguistics and applied linguistics, though Chomsky does not believe he is uniquely responsible for promoting this concept and has rarely commented on applied linguistics (Wu, 2008). Nevertheless, the competence/performance distinction is the main reason the grammaticality judgment task (GJT), and later the acceptability judgment task (AJT), became and remains one of the most important methods of linguistic research. In its most basic form, the GJT simply involves “explicitly asking speakers whether a particular string of words is a well-formed utterance of their language, with an intended interpretation stated or implied” (Schütze, 2011a, p. 349). These “strings,” usually presented to research participants as single sentences or lists of sentences, are “arbitrary situations for adults to deal with, which tap the structural properties of language without having any real function” (Schütze, 1996, p. 2). This avoidance of “real function” is purposeful; while the study of language-in-use can to some extent allow linguists to discern the structure of a language, naturalistic data does not allow the distinguishing of “possible from impossible utterances” (p. 350).
Schütze mentions four unique affordances of GJTs: examining sentence types that rarely occur, examining “negative information” about “strings that are not part of the language,” distinguishing grammatical knowledge from accidents of speech, and minimizing other factors in the study of the mental nature of grammar (p. 2). This final point is controversial, as the extent to which other factors can be minimized is difficult to discern. Theoretically, Chomsky’s generative grammar is an important catalyst for researchers’ interest in gathering data on the grammaticality of certain sentences. For Chomsky, competence, “the speaker-hearer’s knowledge of his language,” and performance, “the actual use of language in concrete situations,” are distinct, and the linguist, if he or she wishes to determine the grammaticality of a particular sentence, needs a way to access that part of a person’s cognitive grammar faculties rather than “grammatically irrelevant conditions” such as “memory limitations, distractions, shift of attention and interest, and errors” (Chomsky, 1965, p. 4). This distinction has encouraged researchers to design tasks which allow access to their research participants’ “intuitions” (see Schütze, 1996), presumably guided by the cognitive mechanism of universal grammar (UG), an “innate endowment...of principles which are common to all languages” that allows humans to learn language (Lightbown & Spada, 1999, p. 18), rather than other aspects of cognition or allegedly confounding factors like those Chomsky lists above. The use of the term grammar here is twofold: linguists hope to gather reliable data from GJTs about whether certain syntactic structures are grammatical according to participants’ internal grammar (as a “component of the speaker’s mind”), which will allow them to construct a formal linguistic theory of the grammar of the language in the sense of what is and is not part of the language (ten Hacken, 2011, p. 349).
The distinction between the terms “grammaticality” and “acceptability” is instructive, as it points the way toward the use of acceptability as a concept outside of theoretical linguistics and in more socioculturally oriented fields of language study. “Grammaticality” is a theoretical term which refers to the acceptance of a sentence or linguistic feature as a confirmed part of a language’s grammar (i.e., the UG in an individual’s mind). Thus, “a sentence that is well-formed according to a given grammar” (grammar here in either sense of the term) can be considered grammatical (ten Hacken, 2011, p. 349). However, the construct validity – i.e., “the degree to which inferences may be made about specific theoretical constructs on the basis of the measured outcomes” (Babbie, 1998, p. 298) – of “grammaticality” remains questionable for linguists, since one can never be sure how the UG knowledge interacts with other factors. Thus, it seems clear that most researchers actually elicit judgments of “acceptability,” which “is a concept that belongs to the study of performance,” and that “grammaticalness is only one of many factors that interact to determine acceptability” (Chomsky, 1965, p. 11). This admission that “acceptability” is only a partial way to access a research participant’s grammar instinct, or competence, is problematic to linguists who wish to develop theories of syntax, but the definition of acceptability is actually a great boon to those who approach variation in language – and judgments of variation in language – under the rubric of “language difference” as proposed, for example, by the translingual approach. Already in Chomsky’s introduction of the term “acceptability,” we can see a flexibility that will serve a variation-based approach to L2 writing well.
The main difference between acceptability and grammaticality is that acceptability is taken to include almost any other factor influencing a judgment about language use beyond (though also including) mental grammatical intuition. Greenbaum’s (1992) summary of acceptability calls it a judgment made by “native speakers as to whether they would use a sentence or would consider it correct if they met it” (p. 451). To Chomsky, acceptability refers to sentences (grammatical or not) which are “more likely to be produced, more easily understood, less clumsy, and in some sense more natural” (Chomsky, 1965, p. 10). Greenbaum (1977) adds that “a given piece of language need not be inherently acceptable or unacceptable,” but that “its acceptability may depend on sociological or psychological factors” (p. 6), and Schütze (1996) notes that “acceptability” can connote “contextual appropriateness rather than structural well-formedness” (p. 27). Greenbaum (1992) mentions a number of other influences, including “views that a sentence is nonsensical, implausible, illogical, stylistically inappropriate, or socially objectionable” (p. 451). While the difficulty in isolating grammaticality indeed presents a barrier to studies in (cognitive) linguistics, these “distracting” factors can and do, in fact, yield useful data for socially-oriented studies of acceptability.

2.5.2 Acceptability in world Englishes and related fields

World Englishes, as a research area motivated by sociolinguistic realities of the global spread of English, is a prime example of a field which is concerned with the construct of acceptability in a social and cultural sense rather than simply “grammaticality.” WE looks at acceptability as indexed by both individual and wider social attitudes at the micro (features of language usage) and macro (the status of language varieties) levels.
In the micro sense, WE is concerned with the “acceptability” of variations or particular “features” of a language (which may be innovations, if accepted) in use; in the macro sense, WE considers the “acceptability” of a variety of English itself, based on the attitudes of English speakers toward its features, speakers, or functions and status in a given society. (This type of work has also been done by sociolinguists not necessarily associated with the WE tradition; see the edited collection by Greenbaum, 1977, for several examples involving English in Nigeria, Guyanese Creole, and modern Hebrew). Kachru (1986a) lays out two senses of “acceptability” (or “acceptance”). For Kachru, “acceptability” “expresses a language attitude, and implies various types of appropriateness” (p. 16); this is “an external matter, educational or social” (p. 30). At the micro level, acceptability is an attitude toward an innovation in a non-native variety of English: Kachru describes the stereotypical reaction of a native speaker to a feature of a non-native variety as a judgment rendered by the statement “as a native speaker I would not use it” (p. 16). This acceptability judgment is therefore taken to be indicative of social norms, cultural knowledge of the form and functions of linguistic items across contexts, educational models, and other “external” factors. It is not only native speakers who act as arbiters of acceptability, however; Kachru’s notions of endonormativity and exonormativity in WEs suggest that non-native speakers can either willingly adopt an exonormative standard, based on “a native model (e.g., American or British)” or an endonormative one, based on “a local educated variety as the model for teaching and learning” (Kachru, 1986a, p. 21).
A non-native speaker, therefore, can easily have a reaction similar to that of the hypothetical native speaker mentioned above, and depending on the social context, may make an acceptability judgment based on a preference for his or her own established non-native variety or a native variety. Bamgbose (1998) further develops the concept of acceptability in WE when he names it as one of the five “factors necessary for deciding on the status of an innovation” (p. 3). For Bamgbose, innovations are key to the development and recognition of a variety of English: “an innovation is seen as an acceptable variant while an error is simply a mistake or uneducated usage. If innovations are seen as errors, a non-native variety can never receive any recognition” (p. 2). Thus, Bamgbose also connects the micro-level acceptability of variations to the macro-level acceptability of non-native varieties. In order to determine whether a given usage is an innovation, the questions “How many people use the innovation? How widely dispersed is it? Who uses it? Where is the usage sanctioned? What is the attitude of users and non-users to it?” must be answered; Bamgbose calls these (respectively) demographic, geographical, authoritative, codification, and acceptability factors (p. 3), noting that acceptability is “the ultimate test of admission of an innovation” (p. 5). Li (2010) further argues for the continuing importance of acceptability in determining the status of innovations, specifically in an era when English speakers use “the Internet as a catalyst of acceptance” (p. 627).
At the macro level, Kachru’s (1992) explanation of the development of institutionalized non-native varieties of English is analogous to the micro-level question of the acceptability of a variation: a variety of English can either be accepted or rejected by users; it can move from “non-recognition” to eventual “institutionalization,” which is “the sociolinguistic acceptance of new norms” or local models for English usage (Mollin, 2007, p. 34). Acceptability or acceptance by non-native (and perhaps native) speakers is therefore crucial in the question of a variety’s legitimacy and continued use. Kachru lays out other criteria for the ontological status of institutionalized varieties, including consideration of a variety’s “sociolinguistic status” (p. 39); the gradual acceptance of a new non-native variety is called an “attitudinal process,” which, along with the “linguistic process” of a variety’s development, is a key factor in the establishment of the variety. In order for a variety to have sufficient sociolinguistic status, users should accept the local norm. Kachru argues that “a variety may exist, but unless it is recognized and accepted as a model it does not acquire a status” (p. 57), and that non-native speakers (to say nothing of native speakers) are often “hesitant to accept” a local variety of English. Finally, in his discussion of the development of a local “model” (essentially defined as an accepted standard), Kachru again places social acceptability at the heart of non-native Englishes, which, he says, move from a status of non-recognition to sometimes mocking recognition (“English labeled Indian was an ego-cracking linguistic insult”) to eventual recognition or acceptance, when there is “linguistic realism and attitudinal identification with the variety” (p. 57). Taking these two intimately related senses of acceptability together, we can begin to see the importance of the concept for studies of “non-standard” uses of English.
At the micro level, AJTs can be used to investigate how a particular variation which is attested in non-native usage is perceived by a participant, and it is assumed that this judgment will be based on conscious attitudes, beliefs, and so on, influenced by a participant's life experience. A judgment of "unacceptability" will be of interest to researchers, as might a comparison of AJT results to actual usage in the particular country or context under investigation. The actual judgment of acceptability or unacceptability may be less important than the reasons participants give or the discourse they produce as they make the judgment, which can be analyzed to discern broader orientations to norms, standards, attitudes, and beliefs about English (see, for example, Higgins, 2003, discussed in more detail in Chapter 3). In addition, macro-level acceptability, or research investigating whether a given variety (or variant) is accepted by an individual or a speech community, can also be indirectly investigated by means of AJTs used in conjunction with surveys which ask direct questions about language attitudes. One does not need to be working exclusively in the WE paradigm in order to apply AJTs in this way; rather, the way that WE theorizes acceptability shows that it is relevant to any scholar working in a variation-based paradigm.

2.5.3 Re-theorizing acceptability judgment research

There is obvious resonance between the concept of acceptability and the variation-based or non-error-based approaches to language difference in L2 writing described in the previous section. Shifting from a construct of "error" to one of "acceptability," in which "language difference" is confronted, discussed, and analyzed, will open up many new possibilities for research on readers' uptake of L2 writers' (and indeed L1 writers') texts, and this has important real-world consequences for writers.
In fact, most social theories of language which have risen to prominence in applied linguistics are amenable to views of "acceptability" which would interpret individual judgments of language use as being embedded in a social milieu rather than as products of cognitive phenomena. While many linguists take a view of one's native language as a mental structure, social theories focus on the cultural forces which shape our conceptions of what language is or should be. For example, Bourdieu (1991) argues that a social study of language should "take as its object the relationship between the structured systems of sociologically pertinent linguistic differences and the equally structured systems of social differences" (p. 54, emphasis in original). Similarly, Bakhtin's theories suggest that languages comprise "dynamic constellations of sociocultural resources that are fundamentally tied to their social and historical contexts" (Hall, Vitanova & Marchenkova, 2005, p. 2). In short, a social view of language, while not denying the worthiness of investigating the mental structures of grammar, instead emphasizes the multiple social and linguistic influences on language (and again, by extension, judgments about language) by all who use it. By now, it should be clear that acceptability is a useful construct for those who wish to investigate language difference and its consequences. Table 2.1 below proposes a "social/ideological model" of AJTs which differs from the "cognitive/linguistic" model.
Table 2.1 Comparison of linguistic/cognitive view and social/ideological view of AJTs

Academic heritage
  Cognitive/linguistic: Generative linguistics, second language acquisition, cognitive psychology
  Social/ideological: Sociolinguistics, sociology of language, world Englishes, English as a Lingua Franca

Prominent goals of research
  Cognitive/linguistic: Build theoretical models of language (linguistics) or describe learner language (SLA)
  Social/ideological: Investigate perceptions about variations in usage

Understanding of acceptability
  Cognitive/linguistic: A judgment (at least partially) determined by an individual's internal, mental grammar(s)
  Social/ideological: A judgment (mostly) produced by an individual's encounter with norms, standards, attitudes

Purpose of AJT
  Cognitive/linguistic: Allowing (indirect) access to learner's internal cognitive process of making a judgment
  Social/ideological: Producing discourse about speaker's judgments based on social factors

Texts used in AJT
  Cognitive/linguistic: Context-free "unusual" sentences
  Social/ideological: Samples of authentic language from media, literature, corpora, etc.

Methods used for AJT
  Cognitive/linguistic: Questionnaire and/or interviews
  Social/ideological: Questionnaire and/or interviews

Type of analysis
  Cognitive/linguistic: Often quantitative
  Social/ideological: Often qualitative

View of non-native speakers and their judgments
  Cognitive/linguistic: Learners of target language whose judgments describe their cognitive knowledge of the target language
  Social/ideological: (May be) competent speakers of non-native varieties whose judgments index norms, attitudes

Status of "unaccepted" instantiations of language
  Cognitive/linguistic: Not part of a language's grammar; may be errors
  Social/ideological: Further source of investigation (to ask "why"); potential innovation in language use

My purpose in providing this table is to reiterate the ways in which a variation-based approach to judgments of acceptability goes beyond the theoretical linguistics approach and opens up new questions and possibilities for research.
Rather than being limited to judgments about decontextualized sentences and whether or not they are a part of an individual's intuitive mental grammar, the social/ideological AJT looks at naturally occurring samples of language and investigates all possible reasons for judging language use as unacceptable. Kachru referred to acceptability as an "external" matter, and a social or ideological view of acceptability acknowledges that making judgments about language – like the use of language itself – is an inherently social activity, and one that is frequently engaged in by all users of language. The section below explores this concept in more detail, looking at the everyday, commonplace use of judgments of acceptability and what this means for investigations of acceptability judgments.

2.6 Acceptability, ideology, and authority

While "acceptability" has its roots as a theoretical construct for investigating grammar, judgments of acceptability (or correctness, standardness, appropriateness, or any number of other concepts) are actually a commonplace reality in everyday language use. Research by sociolinguists and linguistic anthropologists on language ideologies has shown that judgments of language use are a part of normal linguistic behavior, and that language ideologies emerge, as it were, in interaction (Liebscher & Dailey-O'Cain, 2009) from metalinguistic talk. In this section I undertake a brief description of the role of acceptability judgments in talk about language from a 'language ideology' perspective, arguing that judgments of language are an indispensable part of any language user's linguistic behavior. I look at perspectives from Milroy and Milroy (2012) and Cameron (2012) on what these judgments are, what they do, and how authority is implicated.
Finally, I take up the question of authority in judgments, arguing that authority is not (only) an institutional force, but a kind of social resource which individuals are able to claim in order to position themselves as credible judges of acceptability in language use.

2.6.1 Language ideology

Metalinguistic discourse, or talk about language, is a common everyday practice, and language ideologies are commonly produced through this kind of talk (Cameron, 2012; Laihonen, 2008; Liebscher & Dailey-O'Cain, 2009; Milroy & Milroy, 2012). I use the term "ideology" or "ideological" in more or less its most basic meaning in order to distinguish a phenomenon like standard English or a judgment of acceptability in language as "ideological" (that is, wholly or partially constituted by beliefs, attitudes, or other subjective ideas) rather than, say, "objective" or "scientific." I adopt the perspective that language ideologies are simply "explicit metalinguistic…talk about language" with a "social character" (Laihonen, 2008, p. 669); they are a "set of beliefs about language shared by a community" (Bex & Watts, 1999, p. 169). The notion of language ideology thus provides a useful way to examine how people "take up" language use, since it recognizes the socially constructed nature of (un)acceptability of particular usages as well as the perceived legitimacy of language varieties.

2.6.2 Metalinguistic judgments as a linguistic practice

It has been argued that language ideologies, and specifically judgments of language use, are as much a part of language as its functional use is: language ideologies and metalinguistic judgments are "an integral part of using" language (Cameron, 2012, p. 3).
Silverstein (1979) refers to linguistic or language ideologies as "any sets of beliefs about language articulated by the users as a rationalization or justification of perceived language structure and use," and argues that these ideologies are influential in shaping language use itself; "to rationalize it (language use), to 'understand' one's own linguistic usage is potentially to change it" (p. 233). The influence of language ideology on language use can take place at the individual but more importantly the social level, as beliefs about language use and structure become reified and stable, influencing behavior. Thus language ideologies, often manifested in explicit judgments, play a key role in the ongoing maintenance of a standard language; as Cameron argues, "how people understand and evaluate language, and what they do with it…may not be so easily separated" (p. 32). There are two useful models for looking at the role of ideological judgments of language use: Milroy and Milroy's "complaint tradition" and Cameron's "verbal hygiene." Milroy and Milroy (2012) argue that the primary methods of maintaining standard English are first the codification of formal written English in "dictionaries, grammars, and handbooks of usage," and subsequently "prescription through the educational system" (p. 30). The other important public method of maintaining standard English is what they refer to as the "complaint tradition." The "complaints" are mostly associated with public intellectuals, journalists, writers, politicians, and people who write letters to the editors of newspapers, and Milroy and Milroy separate them into two types: Type 1 complaints are "concerned with correctness" and attack "mis-use" and "errors" in English, while Type 2 complaints are "moralistic" arguments in favor of "clarity in writing" and attack "abuses of language that may mislead and confuse the public" (pp. 30-31).
The two types of complaints have a slightly different focus – Type 1 is focused on usages which are taken to be a violation of the structure of standard English, and is associated with the idea that a language is "degenerating" or that schools are no longer properly teaching grammar. These types of comments, while they sometimes refer to written English (as in the popular website known as "The 'Blog' of Unnecessary Quotation Marks"), often take language users to task for not following the norms of standard written English in their speech, as in the complaint cited by Milroy and Milroy that a professional used the form "I seen" rather than "I saw" (p. 31). Type 2 focuses specifically on standard written English; these complaints do not argue that the questionable usages are nonstandard, but unclear, dishonest, or irresponsible. The many complaints made in contemporary popular discourse about the use of the "passive voice" are a pervasive example of this type of complaint (see Pullum, 2014). The Milroys refer to many of the complaints made about language as "irrational" and as resulting from a confusion between the norms of standard written English (which, they argue, are more stable) and spoken, colloquial English. They also associate language complaints with social and political attitudes and "authoritarianism." We will return to the question of authority in the next section. Cameron (2012) proposes another model for describing judgments about language, which she refers to as "verbal hygiene." Unlike Milroy and Milroy, Cameron does not dismiss ideological judgments of correctness and morality in language as "irrational," but considers them "a discourse with a moral dimension that goes far beyond its overt subject to touch on deep desires and fears" (p. xiii). Verbal hygiene, she argues, is part of a "general impulse to regulate language, to control it, to make it 'better'" (p. 9).
It is something that all language users, not only pundits, engage in, and happens "whenever people reflect on language in a critical (in the sense of 'evaluative') way" (p. 9). This includes judgments which would fall under both Type 1 and Type 2 complaints in Milroy and Milroy's schema: in addition to complaints about nonstandard usage, Cameron's wide-ranging examples of verbal hygiene include practices like copyediting, campaigning for spelling reform, debating the use of non-sexist language, and mocking people's accents. Unlike Milroy and Milroy, Cameron rejects the dichotomy between professional linguists' efforts to be "descriptive" of actual language structure and usage practices and linguists' condemnation of "prescriptivism," which usually means laypeople's judgments of correctness in language according to traditional grammar or other non-professional or "folk" rules of English. She refers to linguists' tendency to condemn prescriptivism as itself a prescriptive stance, and argues that linguists are involved in practices which lend themselves to prescription, including language planning and lexicography. While description of language is often treated as scientific and rational and prescriptivism as ideological and irrational, Cameron asks "a real and serious question as to whether it is inherently irrational to care about the way language is used" (p. 222). She argues that every language user (including linguists) agrees in practice that "some kinds of language really are more worthwhile than others" (p. 224). Rather than isolating non-experts' judgments as ideological, she proposes some guidelines for discussing language judgments made by anyone. One can legitimately make value judgments on the use of language, she argues, but not all judgments are equally valid. Debates about language use should be subject to the same rigorous standards of argument as other topics are, including "reasoned argument, logic, the marshaling of evidence," and so on (p.
225), and objective facts about language should not be ignored, even if judgments are often subjective. She also argues for further inquiry into how and why specific “rules” of language have come to be promoted, and whose interests and agendas are served by judgments about language use. Though they differ in their emphases, both models of judgments about language use described above offer a useful framework for analyzing judgments of acceptability in language. Rather than regarding acceptability judgments as inherently irrational, non-scientific, and irrelevant to understanding what constitutes language, they should be understood as an important component of language use and language users’ linguistic repertoires, and as contributing in no small measure to the ongoing maintenance of standard English in both its spoken and – too often ignored – its written form. Considering all judgments of correctness, appropriateness, rightness or wrongness as being connected to questions of social, political, and linguistic values levels the playing field, as it were, for researchers to explore judgments from many different types of people. While traditional notions of grammaticality may privilege professional linguists’ judgments and may regard laypeople’s judgments with suspicion as to whether they are truly based on grammatical intuition, sociolinguistically and ideologically oriented studies are free to look at judgments from linguists, non-linguists, and a variety of others in between (e.g., language professionals such as teachers, writers, editors, and so on) and investigate the ways in which they make judgments and the reasons they give for doing so.    
The reasons for judgments – that is, the "why" questions Cameron refers to – are tied to the notion of authority, or the basis on which judgments are made, because in reality, judgments of language have two parts: first, the actual judgment, and then the reason or basis for the judgment, or the authority on which a person is able to make it. Below, I briefly describe language ideology approaches to authority, and then argue that a more individual or intersubjective notion of authority is more appropriate in investigations of authority in judgments of linguistic acceptability.

2.6.3 Authority: Claiming the right to make judgments about language use

Traditionally, the concept of authority has been approached in studies of language ideology from an institutional perspective, referring to authority as coming from the state (Errington, 1995), the public (Gal & Woolard, 1992), language academies such as those in France and Spain (Villa, 2013), academic or educational institutions (Bermel, 2006), or "writers, teachers, media practitioners, examination bodies, publishers, and influential opinion leaders" (Bamgbose, 1998, p. 4). In this view, "authority" is understood in a political or otherwise institutional sense, as something external to an individual language user which exerts power or control and which can be appealed to as a reason for making a judgment or for support in a debate about acceptability. Milroy and Milroy (2012) operate with a similar external understanding of authority. Though the Milroys' book is subtitled "Authority in Language," they do not explicitly define their understanding of authority; it seems, however, to be linked to their understanding of "prescription," or "imposition of norms of usage by authority" (p. 2). In turn, they argue that prescriptivism, in the views of many who complain about language, creates a connection between "good grammar" and "obedience to authority" in a general sense (p. 134).
In addition to this general understanding of authority as a person or institution with power or control, they also refer to people's tendency to refer to dictionaries and other books as "authorities," which, although it is less institutional, still conceives of authority as something external to the individual language user. Cameron's (2012) view of authority differs from a strictly institutional one. She argues that "it is not always so easy to identify the relevant authority, or to know whence it derives its legitimacy" in matters of language judgments (p. 6). This suggests that while there can be external sources and institutions for authority in language (most notably, linguists and other scientists), there is also a sense in which authority is a less clearly-defined abstract concept; Cameron refers to authority in language as "the respect people have for custom and practice, for traditional ways of doing things" (p. 13), and argues that "linguistic conventions are routinely felt to be of a different order from many other social rules and norms. Their authority is not just an external imposition, but is experienced as coming from deep inside" (p. 14). The way Cameron describes authority in language judgments here suggests that convention in language use itself has authority ("their authority"), but also that authority is something people feel "inside." While acknowledging this highly subjective, personal notion of authority, however, Cameron notes that various people in positions of power operate as authorities in judging language, such as editors, corporations, lexicographers, and so on. While authority is most often conceived of in studies of language judgments as something external, imposed by institutions, people in power, or publishers of prescriptive language materials, Cameron's assertion that authority is "experienced as coming from deep inside" is worth more consideration.
Studies of acceptability have analyzed participants' metalinguistic discourse as a way to measure what has been referred to as "ownership," a concept which has been used to describe this internal disposition Cameron refers to above – sometimes referred to as legitimacy, indigenization, or the degree to which speakers "project themselves as legitimate speakers with authority over the language" (Higgins, 2003, p. 615). While the metaphor of "ownership" to describe the legitimacy of English speakers as people with the right to use, shape, and judge English has been prominent in applied linguistics for twenty years, I suggest that a subjective and language-user-focused definition of "authority" is more useful for discussing an individual English user's relationship to the language and to judgments of linguistic acceptability. I advocate the concept of authority rather than ownership (e.g., Higgins, 2003) for several reasons. First, the concept of ownership – even though it has been used in order to argue that native speakers are not the sole "owners" of English but that non-native speakers are also legitimate owners (see Widdowson, 1994) – encourages a short-sighted view of English and language in general that treats it as a static entity that can be "owned" by a person or group. Secondly, actual users of English, even when they are involved in making judgments or engaging in ideological debates about it, are always already sidestepping the ownership question, because the reality of the language and its use are immediately relevant in their lives whether or not they feel "ownership." Regardless of whether people perceive themselves to be "owners" of English, they must face real-world language problems in the same way any other speaker would: by drawing on their own knowledge of, proficiency in, and beliefs about English – and in this and similar studies, they are given the task of making a judgment regardless of whether they feel ownership of the language.
Thus, the pertinent question becomes not whether they talk about language in a way that positions them as owners of English, but how they talk about how they are able to claim the authority, or in a sense ethos – their own credibility and authority – in order to describe how or on what grounds they make judgments of (un)acceptability. I therefore posit a notion of authority as a symbolic resource to be accessed by language users, and one which, like language ideology itself, emerges in the normal course of language use (which itself includes judgments). Just as Henry (2010) argues that labeling an English use as "Chinglish" is not an objective linguistic categorization but is "produced in the intersubjective engagements between language learners and native speakers" (p. 669), authority needs to be conceived in a more intersubjective way; when we make judgments about language, we actually position ourselves as people with authority based on our own knowledge, experience, and self-conception. Rather than viewing authority as an external force, I view it as something that individual language users claim for themselves. Authority is produced by individuals' drawing on ideologically-inflected understandings of language, language learning, institutions, and the relationship between readers and writers. This will be further explored in the data analysis in Chapter 7.

2.7 Conclusion

This chapter has examined four intimately interrelated concepts: the ideological construction of standard (written) English, non-error-based approaches to variation from standard written English, theorizing acceptability as an approach to people's reactions to variation, and the role of authority in acceptability judgments. Based on the framework laid out above, we might expect to see complex and contradictory understandings of what constitutes acceptable usage for English writing in different contexts.
Writing is likely to be a site of struggle, and the various theoretical positions outlined above suggest that there may be push and pull between favoring a standard written English based on Inner Circle norms and favoring local standards in which English is written in accordance with norms that have developed in a particular context. Different people may make different arguments for the (un)acceptability of particular non-standard usages and make different claims to authority based on their own contexts, backgrounds, and dispositions. Surprisingly, very little research has been carried out on these topics, especially in the area of writing. By asking people what they think about actual written language, we can probe their attitudes about the "deviant" features, and see whether these can point us toward understanding how local context makes a difference in terms of how acceptability is perceived. We know that "standard English" is an ideological construct which writers – especially academic writers, if they wish to have the benefit of being considered worth the attention of readers – must aspire to. Yet because standard written English is a kind of chimera, there is no simple account of how writers follow norms of standardness nor how readers determine what they deem acceptable usage, and on what grounds, and with what authority. It seems clear that a number of diverse factors play a role in any English user's understanding of acceptability: individual proficiency, exposure to written texts, concerns of audience, rhetorical purpose, and institutional and individual understandings of appropriate style and register, to name a few. An ideological approach to acceptability judgments of putatively non-standard usage in texts offers fruitful opportunities for research in second language writing.
Chapter 3: Literature review of empirical studies of reactions to variation from standard English

The previous chapter discussed the theoretical underpinnings of an approach to (written) English which does not assume a stable, clearly defined category of "error," but assumes that reactions to variations from a presumed standard (written) English are ideological, subjective, and contextual, and that the globalization of academic writing and of English itself further calls into question traditional notions of standards and authority in written English usage. It also proposed "acceptability" and the use of acceptability judgment tasks as a potentially useful method of studying reactions to "nonstandard" English usage in texts. With this in mind, I review in the current chapter empirical studies from several different research traditions – L1 composition, L2 composition, and world Englishes (WE) and English as a Lingua Franca (ELF) – in order to examine the methods they use and the results they obtain when investigating reactions to deviations from presumed standards in English. While they were carried out in different fields, what the studies reviewed below all have in common is that they involve providing written texts (whether full-length compositions or lists of sentences) to research participants to solicit their judgments of language use. Studies in each of the three areas are reviewed below, with discussions of the unique characteristics of the approach taken by scholars in each field, and finally a conclusion discussing strengths and weaknesses of each approach in light of the present study.

3.1 Error studies in composition

While error has been one of the most important facets of composition since the field's inception in the late 19th century, it was not until the 1980s that an understanding of error as a social construction, or a result of reader response to a text, came to prominence.
The contemporary tradition of L1 composition error studies begins with Shaughnessy's pioneering 1977 work Errors and Expectations. Shaughnessy's work signaled the end of the era Bitchener and Ferris (2012) refer to as one of "error as character flaw" (p. 30). Santa (2005), in a similar classification, refers to the history of L1 composition until the 1960s as an era of "attention to mechanical correctness" (p. 25); during this period the predominant view was of "error as deficit and… the writer as the source of the deficiency" (Olinger, 2011, p. 419). In the last thirty years – particularly in the 1980s and 1990s – a number of conceptual essays (Anson, 2000; Bartholomae, 1980; Horner, 1992; Lu, 1994; Williams, 1981) and empirical studies (especially Connors & Lunsford, 1988; Hairston, 1981; Lunsford & Lunsford, 2008; and Wall & Hull, 1989) were instrumental in recasting the notion of error as a highly subjective and variable social construction. In a sense, the translingual perspective, championed by Horner, Lu, Royster, and Trimbur (2011) and used in an empirical study by Zawacki and Habib (2014), represents the logical theoretical and methodological culmination of this line of thinking in composition studies. One can easily trace Williams' (1981) notion that error is a "flawed verbal transaction between a writer and a reader" to Lu's (1994) discussion of error as a "conflict between the codes of standard English and other discourses" (p. 455), then to Canagarajah's (2006) assertion that nonstandard usage can be "an active choice motivated by important cultural and ideological considerations" (p. 609), and recently to Zawacki and Habib's (2014) description of error as a negotiation not only "between the student and the instructor," but "a whole host of interested others who populate the contact zone where error is negotiated" (p. 187).
Errors have thus been placed in a social context where the issue is less one of absolute adherence to unchanging rules of correctness in usage and more one of how writers and readers deal with the many changing contexts and standards for language use in writing. Though error studies in L1 composition have had less prominence in the discipline since the 1990s, the empirical studies reviewed below show the methods that composition researchers have used to determine both the 'how' and 'why' of writing teachers' perceptions of and reactions to 'errors' in student writing. The studies I review here represent two streams of research: the first, "error gravity," was a prominent type of study in both L1 and L2 writing from the late 1970s until the early 2000s, and it still exists as an approach. Although this research has been more thoroughly developed in L2 writing (see, for example, Rifkin & Roberts, 1995, and Roberts & Cimasko, 2009, for reviews of error gravity research), error gravity has been the most common empirical method for advancing error studies in L1 composition; interestingly, all of the classic error gravity studies (with the traditional method of presenting participants with a list of decontextualized sentences containing errors, and asking them to rate the egregiousness of the errors) use businesspeople as their participants, since a sense of how the average "layperson" views written English errors is seen as an important indication of how composition instructors should shape their pedagogy. The second stream of research is what I refer to here as "post-hoc error analysis". This category involves looking at texts that have already been written and/or marked by instructors in order to determine which errors are the most commonly noticed or marked by instructors.
Two studies are less classifiable, with one in each category: Wall and Hull (1989) is similar to an error gravity study but uses a full composition rather than a list of sentences (which is an important difference, described below), and Zawacki and Habib (2014) involves no texts, only interviews about faculty members' attitudes toward their L2 students' errors – a kind of post-hoc error analysis without the use of actual texts by the researchers. (Though this study technically involves L2 writers, it is included here because the work is situated in the area of writing across the curriculum (WAC) and adopts a translingual perspective, and WAC and translingual writing are largely situated within the L1 composition disciplinary community.) Table 3.1 and Table 3.2 below outline the major empirical error studies which involve readers' judgments of texts in the tradition of L1 composition research.

Table 3.1 Error gravity studies reviewed (participants/sources; instrument/text)

Hairston (1981): 84 nonacademic professionals; 65 sentences based on common student errors
Wall & Hull (1989): 55 English instructors (primary/secondary/college); one college admissions essay
Gilsdorf & Leonard (1990): 200 business academics, 133 business executives; 58 sentences with errors typical of business students
Beason (2001): 14 business professionals in two cities; five versions of a business document, each with specific types of errors deliberately inserted
Leonard & Gilsdorf (2001): 130 business academics, 64 business executives; 50 sentences, similar to 1990 study
Gray & Heuser (2003): 84 nonacademic professionals; 88 sentences, similar to Hairston (1981)

Table 3.2 Post-hoc error analysis studies reviewed (participants/sources; instrument/text)

Shaughnessy (1977): 4,000 student writers; 4,000 texts by community college writers
Connors & Lunsford (1988): 300 college writing instructors; 3,000 academic texts marked by college writing instructors
Lunsford & Lunsford (2008): Unspecified number of college writing instructors; 827 academic texts marked by college writing instructors
Zawacki & Habib (2014): 18 university faculty members; interviews about "disciplinary genres and the performance of L2 writers"

3.1.1 Post-hoc error analysis studies

Shaughnessy's (1977) study, for which she gathered "some 4,000" placement essays by incoming first-year university students, outlines the most common errors that "basic writers" make in their compositions, in the categories of handwriting and punctuation, syntax, spelling, vocabulary, and what she calls "common errors," or those which "have the power, when they occur frequently, to hinder or even halt the average reader" (p. 158). Shaughnessy's central point is that error is a developmental stage for basic writers, and composition instructors need to adjust their expectations in order to see them as novices who will improve, rather than individuals whose linguistic and intellectual skills are so flawed that they have no hope of producing readable prose. Though she did not explicitly question the notion of error from a sociolinguistic perspective, Shaughnessy's view of error as part of a developmental stage that basic writers pass through was influential in launching further studies of errors and how they were perceived by teachers and non-teachers alike.

Connors and Lunsford's 1988 study is perhaps the most well-known empirical error study in L1 composition; it was influential enough to be replicated by Lunsford and Lunsford in 2008 and is frequently cited as evidence of how patterns in students' composition errors have changed (or not changed) over time. Connors and Lunsford's goal was in part historical: they wanted to compare composition instructors' responses to student errors in the 1980s with previous taxonomies and/or lists of the most common college writing errors created by researchers in the early 20th century.
The authors collected essays that had already been written and marked by teachers; they received over 20,000 papers and analyzed a random sample of 3,000. Connors and Lunsford created a taxonomy of errors by reading 300 papers and attempting to identify every error in the papers, including those which were not marked by the teachers who had corrected them. This produced a taxonomy of about 50 different types of errors. They used the top 20 errors from this taxonomy to train the raters of the 3,000 papers, and were able to produce a table of which errors occurred most frequently in all the essays (as determined by the raters), how often those were marked by the teachers, and a comparison of those errors identified by raters with those marked by the teachers.

Connors and Lunsford describe several findings: first, "teachers' ideas about what constitutes a serious, markable error vary widely" (p. 402). Second, teachers on average marked only about 43 percent of the top 20 errors, and even the most frequently marked errors were only marked "two-thirds of the time" (p. 402). Third, a teacher's choice to mark any given error is complex, and depends on "how serious or annoying the error is perceived to be at a given time for both teacher and student, and how difficult it is to mark or explain" (p. 404). Finally, the authors claim that although patterns of error may have changed since the early 20th century, students appear to be making roughly the same number of errors per composition as they have for the last century. Connors and Lunsford conclude by stating that the study "has raised more questions than it has answered" – chief amongst them being "Where…do our specific notions of error come from" (p. 407)?

Lunsford and Lunsford (2008) replicated Connors and Lunsford's (1988) study twenty years later.
Their procedures for collecting and coding the texts were very similar to the original study (though they noted that the process of recruiting participating teachers and collecting texts was much more laborious due to increased strictness in research ethics review processes). The authors noted two "major shifts" in the papers since the 1988 study: first, the 2008 papers were on average more than twice as long as those collected in 1988, and second, the 2008 papers were more focused on "argument and research" where the 1988 papers had been mainly "personal narrative" (p. 793). They also note the presence of errors related to problems with documenting or integrating source texts, which may have been related to the shift in genres being written.

Overall, Lunsford and Lunsford (2008) noted "how little some things have changed in terms of teacher comments" in the two studies; in general the errors marked in both studies were "the highly visible and easy-to-circle mistakes" (p. 794). The number of errors per paper barely changed (2.26 in 1988 to 2.45 in 2008), and only slightly fewer of the errors were marked (38% in 2008 vs. 43% in 1988), which they attributed to the longer texts in 2008. While their focus, like that of the original study, was not qualitative description of how teachers reacted to error but a quantitative tally of what types of errors occurred in student essays and which were most frequently marked by teachers, their study reveals that types of errors shift depending on the genres being written, and that this requires composition scholars to "continue to work toward a more nuanced and context-based definition of error" (p. 801).

Zawacki and Habib (2014) is the only study to take an explicitly translingual approach to error.
Though they do not examine specific texts, their research involved interviewing faculty members about their experiences of L2 writers in their courses, and for this paper they focused on the theme of error and how the instructors "negotiated" error. They found that for the faculty members they interviewed, "decisions about whether to ignore errors, correct them, take off points, or fail the paper became much more complicated when the errors involved lexical choices that raised worrisome questions about comprehension" (p. 192). That is, the reason most of the instructors were concerned about student errors was that they felt some lexical and grammatical errors betrayed students' lack of content knowledge. The study also addresses questions of "fairness" to L2 writers in education and the workplace – not in the sense of equitable grading (i.e., "equal" treatment of L1 and L2 writers) – but in the sense of adequately preparing them to meet readers' expectations in other areas of their lives. Finally, the authors discuss the participants' changes in "readerly disposition" depending on the genre of writing they assign their students; the authors found that assigning reflective writing allowed the instructors to shift their focus from perceived errors, "to stop worrying about perceived external pressures and expectations, and to focus on how the students are learning the material and on their processes for writing about that learning" (p. 200). The authors conclude that faculty members' willingness to "negotiate" errors "derives from a complex mix of motives, including their learning and writing goals for students, their sense of what's fair to L2 students along with the other students, and their understandings – and misunderstandings – of L2 error" (p. 202).

3.1.2 Error gravity studies

Hairston (1981) was interested in the views of the "administrators and executives and business people" who "care about standard usage or at least some features of it" (p. 794).
Her research asks of these people: "Do all mistakes matter? If not, which ones do? Do they [non-writing teachers] have the same priorities for writing that we do" (p. 794)? Her study was a sixty-five item questionnaire; each item was an English sentence with "one error in standard English usage" and had three options for a response: "Does not bother me; Bothers me a little; Bothers me a lot" (p. 795). This research design has much in common with linguistic acceptability judgment task (AJT) studies (see, for example, Ross, 1979) and some error gravity studies in L2 writing. Hairston categorized the errors (based on the number of responses each got) into "Outrageous, Very Serious, Serious, Moderately Serious, Minor, or Unimportant" (p. 796).

In the "Outrageous" category were what Hairston refers to as "status markers," like the case of using "brung" instead of "brought," which 79 of her 80 participants said bothered them a lot. This category also included other verb use associated with nonstandard dialects, such as "we was," "Jones don't think," and so on, as well as double negatives. The Very Serious category included sentence fragments, comma splices, and parallelism problems. A number of different types of grammatical deviances from standard English were in the serious to fairly serious categories, and the minor and unimportant categories were mostly for stereotypically unacceptable usages like qualifying "unique" (e.g., "very unique"), "different than" vs. "different from," and treating "data" as a singular rather than a plural noun. Hairston's study inspired a number of other error gravity studies; it was explicitly replicated by Gray and Heuser in 2003, and was cited as the primary inspiration for Leonard and Gilsdorf (1990), which itself was replicated by Gilsdorf and Leonard in 2001.
Gray and Heuser (2003), in their replication of Hairston (1981), identify what they take to be a number of weaknesses in Hairston's original study, such as a lack of consistency in the number of errors from each category, a lack of explanation about why categories were included, some sentences that contained multiple or no errors, and a lack of clear reasoning for how she ranked the seriousness of errors in her results. The authors attempted to rectify this by, in part, removing some sentences, purposefully including some correct sentences (and including a "no error" response option), and re-ranking Hairston's results in order to compare them to their own. Their instrument, based on Hairston's, included 88 items, and was completed by 84 nonacademic professionals.

Gray and Heuser found that, in general, the number of "bothers me a lot" responses decreased from the original study, while the number of "bothers me a little" responses increased. They also found that the ranking of most bothersome errors was similar between the two studies, with the most bothersome errors for both being nonstandard verb forms, double negatives, object pronouns used as subjects, and subject-verb disagreement. There were some differences: the 2003 study included tense switching and spelling among the most bothersome errors, which the earlier study did not, while the 1981 study included sentence fragments and capitalization errors, which the later study did not. Overall, the authors believe that their results suggest a "trend toward tolerance," which they encourage (p. 62). Notably, Gray and Heuser found variability in responses to two errors of the same type; for example, of two nearly identical sentences containing subject-verb agreement errors, 60% of respondents found one very bothersome, while only 28% said the same of the other. The authors suggest that this may be due to the content or meaning of the sentences.
In general, they note that error gravity studies such as theirs are "weakened by the impossibility of researchers knowing for sure which part of the sentence participants are judging," and that the decontextualization of sentences may influence the way errors are perceived (p. 61).

Leonard and Gilsdorf (1990) also based their study on Hairston's, though their interest was more explicitly in the business world; their participants were drawn from members of an academic association for business communication (200) and executive vice presidents of top American companies (133). Their instrument was a 58-sentence questionnaire, using sentences that would appear "in a business context" and containing usages "seen frequently…in business students' writing" (p. 142). They used Hairston's three-point scale of bothersomeness. They ranked the ten most and ten least distracting "questionable elements" (notably, Leonard and Gilsdorf reserve judgment on whether the usages are errors, as they include some elements which are proscribed by traditional writing handbooks but may be common in contemporary usage), finding that most (7/10) of the most distracting items involved "basic sentence-structure errors" like run-on sentences, fragments, faulty parallelism, and so on. The least distracting elements tended to be related to word choices deemed improper by tradition, such as "feel" rather than "believe," "quote" for "quotation," "data" for "datum," and so on. Like Gray and Heuser, Leonard and Gilsdorf found inconsistency in the ways that pairs of errors of the same type were received by participants: of the 19 error types which had two sentences, 11 pairs had inconsistent reactions from participants. In their 2001 replication of the study, Gilsdorf and Leonard (using a new 50-sentence questionnaire and a 5-point rather than a 3-point scale) had similar results.
The 2001 study involved 130 academic business communication specialists and 64 business executives, and again the most distracting usages were those involving sentence structure errors which made an "impact on readability" (p. 449). Similarly, the least distracting errors were those which involve traditional "rules" of usage rather than ungrammaticality, such as "very unique," ending a sentence with a preposition, or starting a sentence with "but." The authors also found, once again, that the same errors were not equally bothersome when they occurred in different sentences. They also found that academics were less tolerant overall than executives, which had also been suggested by the results of their earlier study. In general, Leonard and Gilsdorf's work suggests that changes in acceptable usage are not particularly troublesome for business professionals, and that writing teachers should be willing to be flexible as the language changes.

Wall and Hull's (1989) study falls somewhere between the error gravity studies (which mostly examine the reactions of non-academics) and the post-hoc error studies focused on writing teachers' reactions to errors. They ask: "How do readers—in particular, teachers—label and interpret errors in a text" (p. 261)? Unlike Hairston, they did not use decontextualized sentences, but a full essay; they gave participants an essay they had not previously read, and asked them to a) mark all errors, b) choose the three most significant errors and explain their significance, and c) "comment on the overall strengths and weaknesses" of the writer (p. 265). The 55 participants included elementary, secondary, and university instructors.
The overarching question the authors had in mind was whether all of these teachers constitute a more or less cohesive "interpretive community" when it came to identifying and marking errors; the hypothesis was that they would share a "common vocabulary for labeling the 'errors' they see in texts" as well as a "common sense of when it is appropriate to do so" (p. 266). The authors looked for areas in which there was high, medium, or low consensus in terms of errors marked by teachers.

The authors identified 35 errors in the paper; the average number marked by the participants was 32.73. The highest-consensus errors were those involving punctuation: 20 of the 25 highest-consensus errors involved punctuation, usually in the form of missing commas, although there was little agreement about how to remedy these errors. The "medium consensus errors" involved what the authors call "style/structure" and "logic/clarity," often having to do with the participants being unclear about the writer's intended meaning and specific terms the writer used; the authors argue that this shows "how much readers depend on culturally established textual cues to guide them in constructing meaning" (p. 271). The lower-consensus errors were the majority of those marked: around 75% of the places marked as errors in the text were marked by fewer than 20% of participants. Many of these responses were idiosyncratic; the authors cite the comment "Not an error exactly, but makes no sense" as a representative example (p. 273). Overall, the authors describe many of the teachers' comments as "ambiguous," frequently including vague descriptions of the errors as 'wrong,' or needing to 'change,' without specifying what they perceive to be the nature of the error.
The authors' conclusions are specifically intended to provide implications for how teachers should reconsider how they teach about and respond to error so as to better serve their students; they especially urge better and more extensive training in terminology for identifying and labeling errors. One part of their conclusion, addressed to students, is instructive regarding their conclusions about teachers' responses to errors:

• For particular teachers errors may have several different and interchangeable names
• Different teachers may name the same kinds of errors differently
• Certain jargon terms like "usage" have different meanings depending upon the teacher
• What teachers consider an error in writing may vary considerably
• A teacher's labels may be imprecise or even missing
• Suggested revisions or comments like "wrong word" imply a version of the text that the teacher is constructing, not exactly the one a student wrote or intended
• Even if all errors are identified with labels, some require consulting a rule or convention to be corrected while others involve revisions that are more negotiable (pp. 286-287)

Beason's (2001) study of business people's reactions to errors is one of the most methodologically sophisticated of the L1 error studies; it is a two-part questionnaire and interview study of 14 business people's reactions to errors in a text. Beason uses a full composition (written for a business context rather than an educational one) with errors deliberately inserted and represented in bold text in order to draw participants' attention to the errors being investigated. The participants respond to each error on an error gravity scale of 1-4 (not bothersome at all, somewhat bothersome, definitely bothersome, and extremely bothersome).
In a quantitative analysis of the average ratings for each error, Beason found "widespread inconsistencies—from type of error to type of error, from person to person, and even among the responses of an individual person to errors of the same kind" (p. 46). In his analysis of the qualitative interview data, Beason found greater consistency in participants' reactions due to what he calls "extra-textual issues" involving the "needs, biases, and intentions of readers" (p. 46); even if different errors may elicit different qualitative reactions from the participants, they are similar enough to group together into several categories relating to how the participants construct their image of the writer and the writer's ethos based on errors. These categories are: writer as writer (hasty writer, careless writer, uncaring writer, and uninformed writer), writer as a business person (faulty thinker, not a detail person, poor oral communicator, poorly educated person, and sarcastic, pretentious, aggressive writer), and writer as a representative of the company (representing the company to customers and representing the company in court). The names of Beason's categories show the many ways the participants constructed a negative image of the writers; he concludes that it is necessary to understand "the myriad ways in which a writer's ethos can be unnecessarily endangered by errors" (p. 59).

3.1.3 Summary

The studies reviewed above vary in their focus. Most use some combination of quantitative methods, such as counting the number of errors marked by participants or the ranking of errors in terms of how bothersome they are to readers, and qualitative methods, such as interviews with participants about their attitudes toward texts and writers vis-à-vis errors.
They also had different purposes, with some aiming to discern which types of errors are most common and/or most commonly marked by college writing teachers (e.g., Connors & Lunsford, 1988; Lunsford & Lunsford, 2008; Shaughnessy, 1977), with conclusions about participants' attitudes or judgments of errors being a byproduct of the larger study, while others sought specifically to understand how and why readers reacted to errors the way that they did (Beason, 2001; Gilsdorf & Leonard, 2001; Hairston, 1981; Leonard & Gilsdorf, 1990; Zawacki & Habib, 2014). While all of the studies suggest there is rarely agreement on which errors are most serious, or even on which usages in a specific piece of writing are errors, there are ways of talking about error which seem to be common across the participants: perceived errors are seen as indicating writers' lack of content knowledge or familiarity with genre expectations (Zawacki & Habib, 2014) and are regarded as reflecting poorly on the writer (as a writer or in other identities, as in Beason, 2001); in general, participants are willing to acknowledge grey areas when it comes to their perception of error, even as they often express strong feelings about the unacceptability of obvious "surface" errors (Gilsdorf & Leonard, 2001; Leonard & Gilsdorf, 1990; Wall & Hull, 1989).

3.2 Comparisons of NES/NNES responses to L2 writers' errors

L2 writing (and applied linguistics more generally) embraced studies of error gravity to a greater extent than L1 composition. Like L1 composition error studies, L2 writing error studies rose to prominence in the 1980s, and have gradually tapered off, with only a few studies in recent years (e.g., Hyland & Anan, 2006; Roberts & Cimasko, 2009). Perhaps not coincidentally, error gravity studies emerged in TESOL and L2 writing around the same time Shaughnessy's (1977) work began to influence first language composition.
Error gravity studies occurred in L2 writing with more regularity than in L1 composition in the early 1980s; Rifkin and Roberts (1995) reviewed over two dozen studies of error gravity in second language studies.

The purpose of error gravity studies, which can be seen in L2 writing as part of a broader tradition of error analysis in second language acquisition (e.g., Corder, 1967), is generally to understand how readers (often faculty in specific disciplines) view lexical and grammatical errors in L2 writing (or oral production, though I will focus on writing here), and specifically which errors they regard as more (or less) serious. This information, in turn, is meant to be useful for ESL writing instructors in prioritizing their own efforts in both pedagogy and assessment.

In addition to error gravity studies, L2 writing research also has a tradition of comparing native and non-native speakers' ratings of L2 writers' texts – what we might call, to borrow the title of Kobayashi's (1992) article, "native and nonnative reactions to ESL compositions." Unlike error gravity studies, these studies are usually concerned with the reader's reaction to a whole composition, focusing on a variety of linguistic and rhetorical features of texts rather than being limited to reactions to lexical and/or grammatical issues; often an analytic or holistic rating for a whole piece of writing is used (as in Shi, 2001). While some of these studies deal primarily with assessment and testing, and are geared toward developing validity in language tests and other assessment procedures, some of them directly or indirectly address differences in native and non-native speakers' reactions to errors.
The studies outlined below are drawn from these two streams of research – error gravity and comparisons of NES/NNES reactions to ESL writers' texts – and were selected because they met both the criteria of (a) addressing in some way the reaction of readers to errors in L2 writers' texts and (b) specifically comparing NES and NNES participants' reactions to those errors. Table 3.3 and Table 3.4 below provide brief synopses of relevant details of these studies.

Table 3.3 Error gravity studies comparing NES/NNES responses

Author (Year) | Participants | Text/Instrument
James (1977) | 17 NESTs, 17 NNESTs | 50 sentences w/ typical ESL errors
Hughes & Lascaratou (1982) | 10 Greek NNESTs, 10 British NESTs, 10 British non-teachers | 32 sentences w/ typical Greek errors
Davies (1983) | 34 Moroccan NNESTs, 34 British non-teachers | 82 sentences w/ typical Moroccan errors
Sheory (1986) | 62 US NESTs, 34 Indian NNESTs | 20 sentences with errors found in 97 compositions by ESL writers
Santos (1988) | 144 NES content professors, 34 NNES content professors | 2 compositions by Chinese and Korean undergrads
Schmitt (1993) | 18 Japanese NNESTs, 20 NESTs in Japan | 30 sentences w/ errors collected from Japanese student compositions
Porte (1999) | 16 Spanish NNES professors, 14 NES professors working in Spain | 20 sentences w/ errors collected from Spanish undergrad compositions

Table 3.4 Comparisons of NES/NNES reactions to L2 writers' texts

Author (Year) | Participants | Text/Instrument
Takashima (1987) | 1 Japanese NNEST, 2 NESTs | 1 composition by Japanese English major graduate
Hyland & Anan (2006) | 16 Japanese NNESTs, 16 British NESTs, 16 British non-teachers | 150-word composition written by Japanese undergraduate
Rinnert & Kobayashi (1996, 2001) | 127 inexperienced Japanese ESL writers, 128 experienced Japanese ESL writers, 104 Japanese NNESTs, 106 NESTs in Japan | 2 compositions by Japanese undergrads
Kobayashi (1992) | 145 US NESs, 126 Japanese NNESs (various academic levels) | 2 compositions by Japanese undergrads

3.2.1 NES/NNES error gravity studies in L2 writing

The earliest prominent study of error gravity in L2 writing was James (1977); the provisional nature of James' study is clear from its informal tone and the author's self-deprecating statement that the research is "crude" and "not 'scientific'" (p. 124). However, James' methods, and specifically his comparison of NES and NNES reactions to errors, were influential and led to a number of studies in the same vein (Hughes & Lascaratou in 1982, Davies in 1983, and Sheory in 1986). James "collected, from numerous sources, one hundred errors committed by foreign English learners" and pared this list down to 50 errors (in ten categories), each of which was "recognisable in no further context than the sentence containing it" (p. 116). These 50 sentences were presented to a group of 20 NESTs and 20 NNESTs (3 of each misunderstood the directions, leading to a final n of 17 for each group), who were asked to "(i) underline the mistake if you think there is one, (ii) write a correction in column two, and (iii) show how serious you consider the mistake by writing a number – 0, 1, 2, 3, 4, 5 – which says how many marks you think should be 'lost' for that mistake," with 5 being the most serious (i.e., -5 marks) and 0 being the least (i.e., there is no mistake) (p. 118). This method would prove to be common to many error gravity studies comparing NEST and NNEST reactions.

James' findings showed that the NNES participants were more strict in their marking, averaging -138 marks total to the NESs' -123. He found most participants in each group to be mostly consistent in their ratings, but that the two groups had different ranges, means, and distributions of marks. He suggests that, based on the distribution, the NNEST group could possibly be split into two groups, one relatively intolerant of error and the other relatively tolerant, though he admits this is mostly speculation.
Finally, James looked at particular items and types of errors which showed similarities and differences between the two groups; he found that they tended to be mostly in agreement about errors involving articles, tense, and lexical choices, but that the NESs appeared more strict on errors of "case" (prepositions) and lexis, while the NNESTs were more strict on errors of tense and concord. By combining both gravity ratings and frequency of errors in the samples, James proposes a general hierarchy of "error types considered most serious," which he ranks as follows: "transformations, tense, concord, case, negation, articles, order," and finally lexical errors.

Hughes and Lascaratou (1982) conceived of their study as a response to James', with three important differences: all errors came from a homogeneous group (secondary school-level Greek EFL learners) of which the NNEST participants were also members, a third group of NES non-teachers was involved, and participants were asked not only to rate error gravity but, if they gave an error a five (the most serious rating), to explain why. Their instrument involved 32 sentences from Greek students' English compositions involving eight error categories, with some correct sentences as distractors, and it was administered to ten participants each in groups of a) Greek NNESTs, b) British NESTs, and c) British NESs who were not teachers (the latter group was included to represent some sense of how non-specialist native speakers viewed non-native speakers' errors). The authors found, confirming James' (1977) finding, that NNESTs were stricter than both groups of NESs, although they also found NNESTs more lenient in their view of spelling errors. Their results also revealed an important distinction that would be shown in other similar studies – NNESTs focusing on rule violation as a reason for marking an error as serious, and NESTs focusing on understanding or intelligibility.
The authors explain that the Greek teachers often felt that the students "should know better" or should have mastered some "simple" rules of English earlier in their studies. Finally, Hughes and Lascaratou touch on a finding which they find "amusing" but which reveals the enormous amount of individual variation in the interpretation of error (more fully revealed in Hyland & Anan, 2006, below): many participants marked some "correct sentences" as containing errors. The sentence "The boy went off in a faint," which was taken from a dictionary, was considered an error by 2/3 of the total participants, including 9/10 of both NES groups (teachers and non-teachers). The authors conclude by suggesting that NNESTs should perhaps be more aware of NES priorities regarding intelligibility.

Davies (1983) administered a very similar questionnaire, involving 82 sentences with "invented examples…typical of…Moroccan secondary school learners of English" (p. 304). Comments on the errors in this study were optional, and participants were 34 Moroccan NNESTs and 34 British NESs who were not teachers. Once again, the NNES group was seen to be more strict, with an average of 140.3 marks deducted, as opposed to the average of 97.28 for the NES group. Davies describes the differences as in part having to do with differences in "attitudes toward the test and towards errors in general" (p. 306). The non-teacher NES participants tended to view the task as fun and interesting, and were enthusiastic to participate, while the NNESTs tended to view the task as tedious, depressing, and anxiety-provoking, perhaps a "reminder of their failure" as teachers (p. 306). In terms of differences on specific types of errors, NNESTs were more strict in their judgments of errors of morphology and tense, while NESs were more strict with errors involving subordinate clauses and the order of words in sentences.
Davies describes the Moroccan NNESTs' reactions to morphology and tense errors as "syllabus"-related concerns which suggest "inexcusable carelessness or… a failure to grasp an elementary and frequently practised point" by students (p. 307), though these errors were seen as relatively unproblematic by the NES group. Finally, Davies found that for most of the sentences which contained errors that could be seen as relating to transfer from French or Moroccan (the students' primary languages), the NES participants gave a harsher rating than the NNESTs (in 24 of 28 such sentences). Davies suggests, then, that if a reader shares the writer's language background, he or she is more likely to be lenient with errors involving possible interference. However, she also notes that the opposite effect may sometimes be possible, as the NNESTs were much harsher with a few of the French-like sentences, including "He interests himself in horses." Davies perceptively concludes that error evaluation and assessment by teachers is never an objective measure, but is always likely to be influenced by the teacher's "competence in both the target language and the learners' other languages, familiarity with the learners and their background, teaching priorities, the syllabus being used, in short, by the whole teaching and learning context against which he or she will inevitably view the errors" (p. 310).

The final study reviewed here which follows the NES/NNES error gravity comparison model inspired by James (1977) is Sheory (1986), who sought to investigate how NES and NNES teachers react to ESL writers' errors, whether they operate with an implicit "scale" of error gravity, and whether those scales are the same or different for NES and NNES teachers. To compile his questionnaire, Sheory used 97 randomly selected compositions written by ESL students (of various language backgrounds) at a U.S. university.
He arranged the errors into eight categories and created a 20-item questionnaire based on them (the sentences in the questionnaire were invented rather than taken directly from the compositions). The participants included 62 NESTs from the U.S. and 34 NNESTs from India. Echoing previous studies, Sheory found that the NNESTs were stricter (with an average of -59.82 marks) than the NESTs (with an average of -50.19 marks). He also found that the groups seemed to operate with different implicit scales of error gravity: while both groups rated errors involving verbs the most seriously, there were statistically significant differences in how the groups rated errors involving tense, agreement, indirect question formation, prepositions, and spelling, with NNESTs rating each category more strictly. However, NESTs were more strict with lexical errors. Sheory concludes with a call for NNESTs to change their priorities in error correction to more closely mirror NES preferences, assuming that the goal of ESL instruction is communication with native speakers, but also asks whether NESTs are too lenient and may be “short-changing” learners who desire more correction.
3.2.2 NES/NNES reactions to texts
Takashima’s (1987) study is notable in that it was one of the first to compare native and non-native speakers’ reactions to L2 writing errors in the context of a whole piece of writing (or “free composition”) rather than attempting to statistically measure error gravity for different types of errors. In this small study, one Japanese NNEST and two NESTs (of unspecified backgrounds), all university instructors, read a composition by a graduate of a Japanese university and were asked simply to “correct it” (p. 4).
Takashima’s article reports the results of the corrections for three of the six paragraphs originally written, showing each of the three participants’ corrections (or, in some cases, non-corrections), and offering a short commentary on each. He then compares the NNEST’s corrections to those of the NESTs overall, showing that while both groups corrected almost all the same errors, the NNEST did not correct certain spelling, verb, and article errors, and “badly made” several corrections related to word choice and transitions. (For the most part, the author refers to NNEST corrections that differed from those made by the NESTs as “badly made,” a claim which should not necessarily be taken at face value. For example, where the NESTs wrote “I think I owe a great debt of gratitude to…” the NNEST wrote “Since I came here I have learned a lot from…”, which Takashima considered a “bad” correction.) Takashima concludes that Japanese EFL teachers need more training and need to consider the “content or discourse level” more seriously, in addition to the “grammatical or sentence level” (p. 48). While Takashima admits that his small study is “just a beginning” (p. 48), its use of a whole composition as an instrument to compare NEST and NNEST reactions to errors was innovative.
Like Takashima (1987), Santos (1988) used as her instrument two actual compositions written by L2 writers rather than the decontextualized, invented sentences used by the previous studies reviewed. Santos’ was a large-scale error gravity study which, although one of its findings is important for comparing NES/NNES reactions to ESL writers’ errors, differs in several ways from the other error gravity studies discussed here. Santos’ main goal was not comparing native and non-native reactions, but rather comparing the reactions of professors in different disciplines to ESL compositions, in particular the “hard sciences” and the social sciences/humanities.
In addition to looking at the disciplinary differences, Santos aimed to compare the professors’ ratings of content and language in the essays, as well as how they rated the comprehensibility (ability to understand), acceptability (whether they viewed it as approximating native-like ability), and irritation (the degree to which they found the error bothersome) of errors in the essays, and what factors influenced their ratings, such as age, language background, and so on. The participants were 178 UCLA professors: 144 NESs and 34 NNESs. Participants read the two essays and were asked to rank them on six different scales regarding language and content. They were then asked to go back through the compositions and “correct everything that seemed incorrect,” and to list the “most serious” errors, which they were to rate in terms of comprehensibility, acceptability, and irritation. These procedures are notable in that, while Santos had already identified the errors in the compositions, her instrument did not single them out for participants’ attention. Briefly, Santos found that participants rated the essays’ language more highly than their content; in terms of errors, they found sentences with errors to be, in general, “highly comprehensible, reasonably unirritating, but linguistically unacceptable” (p. 76), with lexical errors being judged the most serious in both essays. She also found that the older professors tended to rate errors as less irritating than the younger ones did, and, most importantly for this review, that NNES participants rated the acceptability of language used “significantly lower” than the NESs did (p. 81). Thus, while Santos’ main focus was not a direct comparison of NES and NNES reactions, this echoes previous findings that NNESs tend to judge language use more harshly.
Santos attributes this to the fact that NNES professors “have attained an extremely high level of proficiency” in English and judge more harshly “because of their investment of effort in the language” (p. 85).
Kobayashi (1992), like Santos, used two actual compositions written by L2 writers in her study of NES and NNES reactions to ESL compositions. While Kobayashi’s study does not involve error gravity ratings, it does involve a comparison of various holistic ratings of the compositions (grammaticality, clarity, naturalness, and organization), as well as reactions to specific errors, between NES and NNES readers of varying academic levels (undergraduates, graduate students, and professors in the USA and Japan). Kobayashi sought to compare reactions not only across language background but also academic status, and to identify reasons for different patterns in reactions in terms of both evaluation (holistic ratings) and correction (individual errors). There were 269 participants: 145 American NESs and 126 Japanese NNESs, all involved in language-related academic fields. The procedures were nearly identical to those described in Santos’ study above. Contrary to many of the error gravity studies, this study found that NESs were “more strict about grammaticality” and made “far more corrections” than the NNESs, though the NNESs were more strict with ratings for clarity of meaning and naturalness. Kobayashi also found that the NESs were more likely to detect and correct the “unambiguous errors” that she and three NES informants identified in the essays, while the NNESs left many article, preposition, and number errors uncorrected.
She found that NESs tended to provide a greater variety of possible corrections to word choice errors, and that they were more willing than the Japanese participants to offer “more than mere mechanical corrections” – so much so, in fact, that NESs occasionally made corrections which directly contradicted the writers’ communicative intentions, as determined by follow-up interviews with the two essay writers themselves. Finally, she found that the Japanese participants seldom corrected usages which could be traced to the influence of English loanwords in Japanese, such as Japan-specific meanings of words like “manners” (meaning unofficial social expectations) and “master” (meaning a work supervisor), supporting Davies’ (1983) finding that usages related to L1 background are less likely to be judged harshly by those who share the same background. Kobayashi concludes by noting a certain “superiority” of NES editors of L2 writers’ texts, but warns of their inability to understand when “an English lexical item is used to express what is essentially a Japanese concept” (p. 105). She also notes that within the groups, “there were significant interaction effects between language and academic rank” (p. 106), and that the higher-status Japanese participants detected errors more “accurately.” Thus, while the NESs were seen as both stricter and more “accurate” judges, the study also suggests that those with higher academic status (and presumably more experience with academic reading, writing, and editing) tend to react to errors similarly, regardless of language background.
Schmitt (1993) undertook an error gravity study which looked at sentences involving multiple error categories, using Lennon’s (1991) “extent/domain” classification: the “extent” of an error is the actual “linguistic unit which the error permeates,” such as “morpheme, word, phrase, clause, sentence, or discourse,” while the “domain” of the error concerns how much of the text “the reader/listener must examine to determine if an error has occurred” (p. 183). The study in part sought to examine whether the extent/domain classification was “a viable way to describe errors,” but its primary focus was whether Japanese teachers judged errors more harshly than AETs (assistant English teachers, the term for the NESTs the Japanese government recruits to work in Japanese English classrooms alongside Japanese teachers), and if so, which categories of errors they judged more harshly. Schmitt collected 14 compositions from Japanese students in a pre-college English program in Japan, from which he gathered 60 sentences containing errors (of ten different extent/domain categories), which were pared down to 30 for the instrument. Participants were asked to judge the seriousness of the errors on a seven-point Likert scale, and then to answer the question “On what basis did you judge the seriousness of the errors?” (p. 185). Eighteen Japanese NNESTs participated, as did 20 AETs. Schmitt found that the mean of the error seriousness ratings was higher for the Japanese teachers in every error category except “word/discourse” (i.e., word choice) and “sentence/sentence,” with four of the categories showing statistically significant differences. The categories with significant differences tended to involve more “local” errors, such as morpheme or subject-verb agreement errors.
Although the Japanese teachers judged language accuracy more harshly, both groups of teachers reported that their main criterion for judging error seriousness was comprehensibility (although three of the Japanese teachers also cited grammaticality). This is in contrast to Hughes and Lascaratou (1982) and Hyland and Anan (2006), both of which found that NNESTs tended to focus much more on grammaticality than comprehensibility.
Kobayashi and Rinnert undertook research similar to Kobayashi (1992) in 1996 and 2001 (Kobayashi & Rinnert, 1996; Rinnert & Kobayashi, 2001), which again involved groups of NNES Japanese participants (127 inexperienced Japanese ESL student writers, 128 experienced Japanese ESL student writers, and 104 Japanese EFL teachers) as well as NESTs (106 native-speaking EFL teachers working in Japan). For these studies, the researchers again used two essays by L2 writers, but edited them to create 16 different essays (eight versions of each) to investigate different variables, including various combinations of American or Japanese rhetorical patterns, syntactic and lexical errors, disrupted sequences of ideas, and error-free prose. While these studies focus less on errors (and not at all on specific errors) than on participants’ holistic ratings, both articles touch on findings relevant to the groups’ reactions to errors. In the 1996 study, the authors found that while both the NES and NNES teachers judged essays with language use errors more harshly than the Japanese students did, the NESTs and NNESTs “did not differ in the overall severity of their judgments of language use errors” (p. 422), which the authors note contradicts both previous error gravity studies that showed NNESTs to be harsher and Kobayashi’s 1992 study showing NESTs to be harsher.
In the 2001 study, the authors again note that the teachers were much more likely to comment on language use errors than the students: while the NESTs and Japanese NNESTs differed only slightly in how many of them commented on language use (57% of Japanese NNESTs and 61% of NESTs), both groups of students were much less likely to do so (32% of the inexperienced writers and 39% of the experienced writers). The error-related results of these studies suggest that there may be less of a quantitative difference in NEST and NNEST reactions to errors in ESL compositions than proposed by the authors of the error gravity studies involving decontextualized sentences.
While many of the L2 writing studies on error in the 1990s involved Japanese participants, Porte’s (1999) study was carried out in Spain. Using an error gravity approach similar to many of the other studies reviewed here, Porte surveyed 16 Spanish NNES professors and 14 NES professors using an instrument of 20 sentences drawn from a group of 79 compositions collected by the researcher (notably, Porte used the actual sentences from the compositions rather than creating new sentences with the types of errors found in them, as Sheory (1986) had done). The study used a scale of 0-5, with no error correction or explanations on the part of the participants. As in many previous studies, the NNEST group deducted more points than the NEST group (an average of -55.12 vs. -48.07, respectively, a statistically significant difference). In particular, the groups’ differences in judgments of tense and spelling errors were statistically significant. However, Porte ran a strength of association test which suggested that “relatively little variability between groups could be accounted for by the NS/NNS variable, and it may well be that we have to look for less obvious contrasts to explain some of these findings” (p. 431).
Porte also noted that certain sentences had a surprisingly (to him) low number of marks deducted, and speculates that this may have to do with familiarity with the students’ L1, which could lead to teachers becoming “desensitized to error” (p. 432). Finally, Porte suggests that future research should “consider what might be gained from introspective or retrospective data from subjects” (p. 432).
The most recent prominent study investigating NES and NNES reactions to ESL compositions, Hyland and Anan (2006), combines some innovative features of previous studies. It uses a composition written by an ESL university student (in this case, a 150-word text by a Japanese student) rather than decontextualized sentences, and asks participants to identify and correct all errors, including identifying which are the “most serious”; like Hughes and Lascaratou (1982), it involves not only teachers but also non-teacher NESs and asks for explanations of why certain errors are ranked as the most serious. Like both Santos (1988) and Kobayashi (1992), the researchers identified 11 “target errors” in the original composition (in nine different error categories). The essay was given to the participants (16 Japanese NNESTs, 16 British NESTs, and 16 British non-teacher NESs), who were asked to evaluate it holistically, identify and correct all errors, select and rank the three most serious errors, and give the reasons for identifying those three as the most serious. The authors found that both the NEST and NNEST groups recognized over 80% of the target errors, with the Japanese group finding just slightly fewer than the NEST group. They also confirm Hughes and Lascaratou’s (1982) finding that NNESTs are more likely to rely on (in Hyland and Anan’s words) “infringement of rules” when judging error gravity, while NESTs are more likely to cite “unintelligibility” in their judgments (p. 512).
For all three groups, four types of errors in particular were judged the most serious: agreement, word form, tense, and word order. The most notable finding of Hyland and Anan’s study is that, in addition to their 11 target errors, a total of 42 other “errors” were noted by participants, with the Japanese group in total identifying 38 errors, the non-teacher NES group identifying 22, and the NEST group identifying 16. The authors point out that there was much less inter- and intra-group agreement about the “non-target” errors than the target errors, and identify three categories into which the non-target errors found by participants fell: style (e.g., formality and appropriacy), which was more notable to the NES groups; discourse (e.g., cohesion and organization), which was more notable to the Japanese NNEST group; and semantics (e.g., meaning and clarity), which was common to all groups.
3.2.3 Summary
All of the L2 writing studies reviewed here have the common property of comparing native English speakers’ reactions to L2 writers’ errors with those of non-native English speakers. While many of the error gravity studies seem to show clearly that NNESTs regard writers’ errors with greater overall strictness or harshness than NESTs (and non-teacher NESs) do, several of the findings point to a more nuanced picture: NESTs actually found more errors when they were given a whole composition (Kobayashi, 1992; Takashima, 1987), and sometimes there was no real difference in the extent to which native and non-native speakers noticed or commented on errors (Kobayashi & Rinnert, 1996). There was great variability both between and within groups of NEST and NNEST participants when they were not reacting to errors that were already “targeted” by researchers (Hyland & Anan, 2006).
Taken together, the findings of the studies above suggest that while NNESTs may, on average, judge errors more harshly, NESTs may be likely to notice and correct more errors. Perhaps most importantly, the various methodologies of these studies reveal gaps in what kinds of knowledge can actually be obtained from such studies, and the use of whole texts with no a priori determination of errors by Takashima, Kobayashi, and Hyland and Anan (and the wide variety of responses they received) further sheds light on the limitations of error gravity studies if the researcher desires a picture of how both NES and NNES readers react to what they perceive as errors when they read L2 writers’ texts. It seems clear that a fuller and more complex picture of how readers react to language usage in texts can be obtained by giving readers the opportunity to read whole compositions and identify particular instances they find unacceptable.
3.3 Acceptability studies in world Englishes and ELF
The final category of studies reviewed here, acceptability studies in World Englishes and ELF, has roots in sociolinguistics, and is more directly related to traditional (socio)linguistic studies of acceptability than the studies in the previous two sections. Although these studies seldom claim an explicit link to acceptability studies in linguistics (e.g., Quirk & Svartvik, 1966) and sociolinguistics (e.g., Greenbaum, 1977), their use of AJTs (a term discussed in detail in the previous chapter) and their authors’ participation in the tradition of WE scholarship place them firmly in the descriptive tradition in linguistics from which AJTs first emerged as a method. Crucially, unlike the studies in L1 and L2 composition mentioned above, AJT studies in WE and ELF do not operate with a notion of error; the authors are more interested in determining participants’ reactions to features that are assumed to be in some way typical of (potentially) legitimate varieties of English.
While the methodologies and participants’ reactions can be similar to those of error studies, these studies focus on differences between English varieties and on what people’s reactions to these differences can tell us about the participants’ attitudes toward particular usages, or toward the varieties themselves. It is important to note that these studies have not been specifically oriented toward written language (except in the cases of Gupta, 1988; Ivankova, 2008; and Parasher, 1983), but tend to use written texts as a proxy for “language” in general – a potentially problematic conflation, but one that is by no means unique in the domain of linguistics, as discussed in Chapter 2. Where they differ from the error studies reviewed above is that the studies below are much more likely to use a two-part, mixed-methods approach to probe the participants’ reasons for their judgments and their attitudes toward the usages in order to answer a variety of different research questions. I have divided the AJT studies into four basic types:
1) Micro-level, which includes studies that use traditional AJTs (i.e., ones with decontextualized sentences, similar to error gravity studies) for the purpose of investigating a particular feature of a variety;
2) Macro-level, which includes studies that use traditional AJTs in order to investigate to what extent a particular variety of English is emerging as acceptable to speakers or non-speakers of the variety;
3) Discursive, which includes studies that use AJTs primarily as a tool to produce participant discourse which is analyzed to investigate their orientation toward or “ownership” of English (an application of AJTs unique to WE, though it was also used in a sociocultural SLA study by Goss, Zhang, & Lantolf in 1994);
4) Open-ended textual, which comprises studies done using whole texts rather than decontextualized sentences (but without “target errors” as in Hyland & Anan, 2006, above).
The tables below outline the studies reviewed.
Table 3.5 Micro-level AJTs reviewed
Author | Participants | Texts
Bautista (2004) | 205 Philippine university students | 20 sentences (11 common to Philippine students; 9 ‘correct’)

Table 3.6 Macro-level AJTs reviewed
Author | Participants | Texts
Murray (2003) | 253 English teachers in Switzerland (138 NES, 104 NNES, 11 ‘bilingual’) | 11 “typical Euro-English sentences”
Chen & Hu (2006) | 21 international business people working in China | 11 sentences typical of Chinese English; 6 Chinese idioms in English
Mollin (2005, 2007) | 435 European professors | 20 sentences “alleged to be typically Euro-English”
Y. Wang (2013) | 769 Chinese ESs (502 university students, 267 professionals) | 10 sentences with “deviations” from ENL, typical of CE use
Yang & Zhang (2015) | 14 Chinese NNESTs | 20 sentences (10 with putative features of CE, 10 distractors from literature or student essays)

Table 3.7 Discursive AJTs reviewed
Author | Participants | Texts
Higgins (2003) | 12 Outer Circle ESs (India, Malaysia, Singapore), 4 Inner Circle ESs (US) | 24 sentences attested in various WE varieties
Bokhorst-Heng, Alsagoff, McKay, and Rubdy (2007) | 8 Singaporean Malay ESs | Same as Higgins
Rubdy, McKay, Alsagoff, and Bokhorst-Heng (2008) | 12 Singaporean ESs (4 Chinese, 4 Malay, 4 Indian) | Same as Higgins
Wiebesiek, Rudwick, and Zeller (2011) | 20 South African Indian ESs | 3 sentences (2 typical of SAIE, 1 not)

Table 3.8 Open-ended textual AJTs reviewed
Author | Participants | Texts
Parasher (1983) | 2 British NESs, 2 American NESs, 2 Indian NNESTs | 188 professional letters from India
Gupta (1988) (reports two mini-studies) | Researcher & her students; 5 NESTs (2 British, 2 US, 1 Australian) | 89 texts from Singapore; 1 text from Singapore
Ivankova (2008) | Russian ESs (n not specified) | “passages of written texts in China English produced by educated Chinese speakers of English”

3.3.1 Micro-level AJT
Bautista (2004) makes acceptability the focus of her study on the modal verb would in Philippine English.
She presented an AJT focused on would to 205 college freshmen, who were given the options “correct, not correct, and can’t decide” and told to make corrections when necessary (p. 116). The author examined the corrections and analyzed the results with descriptive statistics and some discussion of individual items of interest. Results showed that higher-proficiency students tended to judge “standard American English”-style sentences as correct and to “correct the deviant sentences better than the students with low proficiency” (p. 122). Bautista suggests that proficiency, as well as the fact that “modals have been taught poorly” in the Philippines, might account for nonstandard usage, but also notes that ideas of “uncertain future, especially hopes, are expressed with would,” which is attested in a corpus of Philippine English and was considered acceptable by most participants (p. 124). Bautista concludes that the “non-standard” use of would “could have come from a convergence of imperfect learning, non-assertiveness, and simplification” (p. 126). She recommends that future research use AJTs “made up of items appearing in discourse, not in isolated sentences” (p. 126). Overall, this study shows the usefulness of AJTs when presented alongside other data: Bautista considers her results in light of an informal analysis of local textbooks, other studies of Asian Englishes, and the Philippine section of the International Corpus of English. This suggests that sociocultural context should play an important part in WEs studies involving AJTs, as it does in the studies reviewed below.
3.3.2 Macro-level AJTs
Another area in which AJTs have been used in WEs research is to investigate whether “developing” varieties of English are becoming institutionalized or gradually accepted.
Since acceptability (by a variety’s own users) is one of the major factors in the development of a variety, AJTs administered either to putative users of that variety or to other speakers can help researchers to discern whether particular innovations, or indeed the variety as a whole, are accepted by them. The two main varieties in which these studies have been used are Euro-English (Mollin, 2005, 2007; Murray, 2003) and China English or Chinese English (Chen & Hu, 2006; Y. Wang, 2013). While these studies are clearly important in establishing the status of these varieties, their theory and methodology of AJTs have not always been clearly stated. Murray (2003) carried out a survey of Swiss English teachers in order both to “find out about teachers’ attitudes to changes which Euro-English might conceivably bring to ELT” and to “explore the acceptability of certain types of Euro-English formulations” (p. 154). Her participants were 253 English teachers in Switzerland of varying language backgrounds, though nearly 55% identified as native speakers of English (p. 154). The results of the attitude portion suggest the teachers held a fairly “tolerant” view of non-native English usage, though native speakers expressed more favorable views of Euro-English than did non-native speakers. The AJT portion of the survey found respondents tending to reject “violations of rules” (see similar findings in the L2 writing studies reviewed above) such as “the film who I saw,” but tending to accept “possible but unusual structures” such as “that is the car of my dentist” (p. 160). Murray believes that the general acceptability of the “unusual” structures is an indication of Euro-English’s possible future development in education: “non-rule breaking Euro-English usage will increasingly find its way into listening and reading materials, which will serve as indirect models for learners’ speaking and writing” (p. 160).
While her conclusions are intriguing, Murray’s AJT itself is problematic for several reasons: first, for her purposes, authentic language is necessary, yet the source of the sentences is not provided, nor is a rationale for their inclusion in the task. In order for the participants’ acceptance or rejection of the sentences to be valid, the widespread usage of the linguistic features given needs to be attested. Similarly, she lists a “standard ENL version” in contrast to each of the “Euro-English” items, but in some cases it is unclear why the ENL version is considered standard. Mollin (2005, 2007), in a comprehensive three-part investigation into whether Euro-English can be considered a legitimate variety, adapts Murray’s AJT, referring to the items as examples of “what the literature has alleged to be typically Euro-English” (2007, p. 180). She also states that the use of items from a corpus of Euro-English would have been preferable (2005, p. 164). Mollin’s explanation of the AJT’s purpose seems to conflate a cognitive/linguistic view with a social/ideological view: “to reflect the norm of English that respondents follow, the standard in their mind” (p. 165). Mollin compares the results of the AJT to the participants’ (435 European professors from varying countries and disciplines) self-reported competence in English; she reports that more “native-like” responses on the AJT correlated with self-reports of higher competence, and that more acceptance was correlated with lower competence, which “shows that speakers do not accept the allegedly Euro-English forms because this is the standard they wish to adhere to, but because they do not know any better – were they told that native speakers consider this an error, they would in all likelihood try to avoid these forms” (p. 182).
Mollin also includes an attitude section on macro-level acceptability of Euro-English in her survey, in which only five percent of participants reported aiming to speak “English as it is spoken in mainland Europe” (p. 182). The prominent place of an AJT in Mollin’s study, and its explanatory power when combined with other measures of attitude and competence, suggests the AJT is particularly suited to studies of controversial varieties of English. Another such variety, Chinese English, has also been investigated with the use of AJTs. Chen and Hu (2006), in a study of non-Chinese English speakers, administer two AJTs as part of a larger attitude-based questionnaire which is similar to Hu’s work with Chinese students (2004) and teachers (2005). While those studies focus on macro-level acceptability, asking questions about whether participants have heard of Chinese English and whether they accept it, Chen and Hu’s study adds AJTs with samples of putative Chinese English, eliciting native speakers’ judgments of these sentences as another measure of macro-level acceptability. This section of the questionnaire, however, suffers from the same problem as Murray’s: the choice of sentences is not attested or explained, which is particularly problematic in the context of China, where a distinction between “Chinglish” (as learner error) and “China English” (as standard English with some Chinese characteristics) has been posited by Hu (2004). In the first AJT section, sentences like “Is this seat empty?” and “I’m a public servant” seem puzzling choices for a study about Chinese English (p. 235). A second task on “Chinese sayings,” which asks participants to guess the meaning of a Chinese idiom rendered in English, seems more suitable.
Overall, although the authors convincingly claim that the results of all their questionnaire items point to the acceptance of Chinese English by the participants, the lack of rationale for including the AJT items makes that part of the study less credible, and underscores the unique importance of authentic language for AJTs in WEs studies. Y. Wang (2013) fills this gap in her own China-based acceptability study by specifically choosing sentences for her AJT which have been attested in various sources (e.g., online publications and corpora). While Wang situates her work in the ELF rather than the WE tradition, and frames it as investigating “non-conformity to ENL norms” rather than strictly “China English,” I include her study here because its methodology and concerns are very similar to those of Chen and Hu (2006). The study involved administering an AJT to 769 Chinese participants (502 university students and 267 professionals), followed by interviews with 35 of them (12 English majors, 12 non-English majors, and 11 professionals). Wang found that participants’ average rating for all the items on the AJT (on a 5-point scale) was slightly closer to 3 (mildly acceptable) than to 2 (mildly unacceptable), and that the mode of all responses to all items was firmly within the “mildly acceptable” range, suggesting a “positive orientation” to the sentences (p. 265). In the qualitative data (which included comments on the AJT and interviews), however, Wang found a greater ambivalence, with participants sometimes favoring a more native-like standard, and sometimes a more local one. In interview data, she found participants had views of ENL as “the essence of English” (p. 269), as being more “norm-based” (p. 270), and “socially preferred” (p. 272), but that they also felt more ELF-like (or endonormative) usage was suitable for communicative purposes and the expression of “Chinese cultural identity,” which was viewed as positive (p. 276).
The ambivalence of the results again shows the usefulness of following a quantitative AJT with a qualitative data collection procedure. The final study in this section, Yang and Zhang (2015), was in part inspired by a preliminary analysis of some findings from my own doctoral study (published in Heng Hartse & Shi, 2012), and is both a macro-level AJT study (aiming to see whether features of CE index acceptance of CE as a variety) and a discursive AJT (using a method similar to that of Higgins, 2003, described in detail in the section directly below). Seeking to find out how the participants reacted to features of CE and whether they actually associated them with CE, the authors presented 14 Chinese English teachers with a 20-item AJT (including putative features of CE and distractors, with response options OK, Not OK, and Not Sure), which the teachers discussed in dyads. Interestingly, the phrase “China English” was never mentioned, and “Chinese English” was mentioned by only one dyad, while there were numerous mentions of “Chinglish.” The authors found that participants were likely to accept lexical features of CE but reject most syntactic features, and that in general their acceptance of putative CE was more likely when the feature resembled standard (inner circle) English. Yang and Zhang conclude that “the notion of CE is esoteric” to participants, and its putative features are paradoxically either very similar to standard English or too stigmatized to be considered legitimate by participants, seeming to leave no room for actual features of CE. These findings differ from earlier studies which implied a growing acceptance of CE among teachers.

3.3.3 Discursive AJTs

The first study to fully appropriate the idea of AJTs in WE research, representing perhaps the first instance of the social/ideological AJT I described in Chapter 2, was Higgins (2003), which employed the “concept of ownership to see how speakers’ talk enacts identities that carry legitimacy as English speakers” (p.
623). Higgins administered an AJT to 16 speakers of English from both Inner and Outer Circle countries; participants, who were international students from India, Malaysia, and Singapore in an advanced ESL composition course at a U.S. university (n=12) as well as white middle-class Americans (n=4), were paired in dyads by their country of origin. Higgins used conversation analysis to examine the talk produced by the dyads’ discussion of how to respond to each sentence on the AJT. Her “purpose…was not to see whether participants accepted specific forms, but to elicit and record talk that might contain within it their stances toward English” (p. 625). In her conversation analysis, Higgins examined “the language of the actions the participants took as they shifted their footing from receptor to interpreter to evaluate the sentences,” focusing on “references to the speakers’ own English usage” as well as the use of modals and human subject pronouns (p. 629). Higgins found that all groups “displayed similar indicators of authority over English,” such as using the phrase “you can say…,” although Outer Circle speakers “displayed less certainty” (p. 640). She suggests that this may be due to “multiple and conflicting norms” which circulate in some contexts, and calls for a more nuanced alternative to simply seeing English users as native or non-native speakers (p. 641). Higgins’ innovative use of AJTs to investigate “ownership” of and “orientation” to English suggests that the process of making acceptability judgments can be used to investigate a variety of attitudinal aspects of WEs. Two related studies have taken up Higgins’ methods: Bokhorst-Heng, Alsagoff, McKay, and Rubdy (2007) and Rubdy, McKay, Alsagoff, and Bokhorst-Heng (2008) both adapt Higgins’ framework and research questions for the specific context of Singapore: the first study examines the ownership of English among Singaporean Malays, while the second does the same with Singaporean Indians.
The studies use Higgins’ AJT and are likewise more interested in the process of decision-making than in the judgments themselves. In both studies, the dyads are grouped by social class (upper-middle or lower-middle) and age (older or younger) for a total of four groups. After a discussion about the uses of English for each ethnic group (Malays tend to have less cause to use English in Singapore, while Indians are the only group which uses it as an intraethnic lingua franca), the authors report on each dyad, offering excerpts of their discussion followed by conversation analysis, attending to the same features as Higgins (2003). The authors found that the Malay groups tended to rely more on exonormative rules than the Indians, and that generally the younger dyads in both groups seemed more confident about their own judgments. While the scope of both studies is limited, they do suggest that local English norms tend to be more accepted by the young in Singapore, and that the Malays may have been more influenced by their classroom learning of English while the Indians may have been more influenced by their non-school usage of English. Like Higgins, the authors suggest that their work calls the native/nonnative dichotomy into question, and, additionally, that the officially promoted government view of English relies too much on this and other stereotypical views of English which ignore the more nuanced reality of usage and ownership of English among Singaporeans. Wiebeseck, Rudwick, and Zeller’s (2011) exploration of South African Indian English (SAIE) is also an innovative social/ideological AJT study, in that the researchers use the acceptability task as part of an interview study. Like the studies above, discourse about the participants’ judgments is analyzed, this time in the context of semi-structured interviews, beginning with an AJT which “provided an entry point into a discussion about SAIE and what they perceived as good/proper English” (p. 258).
The study involved individual interviews with twenty South African Indian university students, each of which began with an AJT in which students read or were read three sentences (two SAIE and one “standard” English) and were asked whether they or anyone they knew would use such a grammatical construction. The participants could then choose one of four possible responses to each sentence:

A) I use this kind of grammatical construction myself.
B) I don’t use this grammatical construction, but other English speakers do.
C) I’ve never heard anyone use a construction like this, but I would guess that some native speakers do use it.
D) Nobody would say this. (p. 257)

The AJT here worked as a way to index attitudes toward SAIE and the “standard” white South African English. The students’ choices showed that some acknowledged the existence of SAIE while distancing themselves from it, while others both recognized it and admitted to using it. The comparison of the thematically analyzed interview data with the results of the AJT revealed, according to the authors, the ambivalence of the participants toward SAIE; while they generally seemed not to want to be associated with SAIE, their attitudes were “fluid” and could be “re-negotiated” (p. 11). In this and the other discursive studies, the AJT proves an important catalyst in prompting participants to reflect on their attitudes and judgments not only about discrete grammatical items, but also about their relationship to a particular variety of English.

3.3.4 Open-Ended Textual AJTs

The final category of studies reviewed here, the open-ended textual AJTs, contains the few WE-related acceptability studies which specifically concern themselves with written usage. Parasher (1983), using a corpus of 188 business letters written in India, asked British, American, and Indian speakers of English (two each) to offer “acceptability judgments” on the language usage.
Parasher tallied the total number of all unacceptable usages by all participants, finding that syntax was the most unacceptable category (48.24% of all unacceptable forms), with lexis and style accounting for 23.47% and 28.29%, respectively. Within syntax, the highest percentages of unacceptable items were determiners and modifiers (10.84% of the syntax group), verb tenses (8.31%), and prepositions (7.41%). For particular items, Parasher compared the reactions of the British and American participants to the Indians’ in order to determine whether the usage was established as Indian English. After discussing some 86 different items from the corpus, Parasher concludes that although syntax was the most frequently unacceptable category, there was wide agreement among both the NES and Indian readers about which items were unacceptable, meaning that most of the syntactic features rejected are not part of Indian English, but that there was considerably more disagreement on matters of lexis and style. He confirms his hypothesis that “educated IndE conforms to the major syntactic rules of the language and has peripheral differences in syntax and marked differences in lexis and style as compared with native educated varieties” (p. 163). While Labov (1972) has warned that metalinguistic judgments cannot necessarily be used as proof of the features of the research participants’ own varieties of English, Parasher’s study is innovative in its methods and appears to be sound in its claims. Gupta (1988) undertakes a similar approach to written WE usage in a paper which appears to report on two small-scale studies; her goal was to investigate whether a standard written Singapore English can be said to exist. The first study involved assembling a corpus of 89 texts of Singapore English and analyzing it (along with her students) for “nonstandardisms,” which she grouped into 30 categories.
She found that these “occurred most frequently in verb group choice, preposition choice, vocabulary, use of articles, clause linkage, punctuation, number, word order, anaphora and pronoun choice, use of idioms, and subject-verb concord” (p. 35). The second study employed five NES linguists (two British, two American, one Australian) working in Singapore, who read a single Singaporean English text. The participants identified 48 nonstandardisms in total (with each identifying at least 15), but only three of the nonstandardisms were identified by all five participants, and only another three were identified by four participants – a very high level of disagreement about what constituted a nonstandardism in written Singapore English. After reporting on these two studies, Gupta offers a list of putative features of standard written Singapore English (seemingly selected by her) and concludes by noting that there exists in Singapore a “de facto local standard side by side with a climate of opinion which would reject an official endonormative standard” (p. 45). While the presentation of empirical data and the author’s own speculation are conflated in a confusing way in this paper, it is notable for its use of authentic texts and its finding that there is very little agreement among NESs as to what constitutes an unacceptable written usage. Finally, Ivankova’s study of China English (presented at the International Association for World Englishes conference in Hong Kong in 2008 and eventually published in Russian – references here are to Ivankova’s 2008 PowerPoint slides) positions itself as one which examines the intelligibility, comprehensibility, and interpretability of Chinese English, and looked at “native speakers’ and non-native speakers’ perception of and tolerance towards non-standard or deviant linguistic features of China English” and their reasons for their judgments.
Ivankova used “passages of written texts in China English produced by educated Chinese speakers of English, such as students majoring in English, journalists, translators, and scholars… from collections of students’ essays, books translated from Chinese into English, English language newspapers,” and so on. This innovative use of whole published texts in a variety of genres is noteworthy. Ivankova instructed participants to “underline words and/or phrases unfamiliar to [them],” and “highlight words or word combinations whose meaning is unclear to [them] in these contexts.” Her results suggest that some “non-standard” linguistic features “will be unproblematic in case they are found in a context which provides more information on the purpose of the utterances.” This study, though not all of its results were presented, again underscores the need to look at acceptability judgments in “real-world” rhetorical contexts in order to obtain more meaningful judgments from participants.

3.3.5 Summary

Because the studies reviewed in this section had a variety of different purposes, the findings are not always comparable. In general, it seems that non-speakers of WEs were fairly likely to accept non-standard varieties of English (Chen & Hu, 2006; Ivankova, 2008; Murray, 2003), but speakers of those varieties were more likely to have an ambivalent relationship with those varieties, vacillating between acceptance and preference for native speaker standards (Higgins, 2003; Y. Wang, 2013; Wiebeseck, Rudwick, & Zeller, 2011), which may be related to their own proficiency (Bautista, 2004; Mollin, 2007) or their sociolinguistic background (Bokhorst-Heng, Alsagoff, McKay, & Rubdy, 2007; Rubdy, McKay, Alsagoff, & Bokhorst-Heng, 2008). While obvious violations of grammatical rules were often rejected (Murray, 2003; Parasher, 1983), there were also disagreements among participants about what constituted unacceptable usage in writing (Gupta, 1988; Parasher, 1983).
Since WE research places importance not only on describing the “features” of particular Englishes but also on the ideological or attitudinal “acceptance” of both particular linguistic variations and varieties of English as a whole, AJTs are a useful method for studies adopting a WE perspective on local variations from standard Inner Circle English and readers’ attitudes about these variations. The different uses of AJTs in the studies above show that AJTs can be useful when combined with other methods, particularly interviews, and that this combination can reveal complexity and ambivalence where a traditional AJT focused only on specific usages may present a more limited picture. It is also evident that AJTs are particularly useful when administered with authentic language in its original context, rather than merely with decontextualized sentences. The last group of studies (involving whole texts) seems especially promising for sociolinguistic studies of WEs, since they truly offer a “level playing field”: with no external source guiding the participant to regard certain types of usages as more worth scrutiny than others, researchers have a better chance of learning about participants’ own subjective judgments.

3.4 Conclusion

Taken together, the studies reviewed above suggest a move away from a straightforward error gravity or AJT research design (in which participants rank usages exemplified by a list of sentences) toward a more complex design involving qualitative data such as open-ended judgment tasks, interviews, group discussions, and research questions beyond the basic “Is this OK or not?” instrument. In addition, there is a trend in the results of the studies from the fairly unambiguous results of some studies (i.e., NNESTs judge errors more severely than NESTs) to more nuanced results in recent studies which suggest ambivalence, contradiction, and wide variability in participants’ judgments of language use.
The trend toward more qualitative, open-ended studies makes it difficult to state with certainty any common core to the findings in all of the studies reviewed in this chapter. In general, it appears that NNESs are harsher judges of nonstandard usage, especially when it comes to speakers who share their own linguistic background, but widespread agreement about the types of nonstandard uses that are rejected is rare. “Surface errors” (Gilsdorf & Leonard, 1990) or more obvious “rule violations” (Murray, 2003) do seem more likely to be rejected by participants than unconventional lexis or turns of phrase, but it is difficult to discern any consistent pattern in which of the more “obvious” grammatical violations are rejected. The current study is inspired in part by many of the different types of studies above: it could be described as an open-ended textual AJT with an NES/NNES comparison element and an error gravity-inspired concern with which usages are most frequently selected as unacceptable by participants. It also takes up the useful addition of interviews to the basic judgment task as a way to generate richer, fuller data regarding reasons for particular judgments (e.g., Hughes & Lascaratou, 1982), orientations toward varieties and usages (e.g., Y. Wang, 2013; Wiebeseck, Rudwick, & Zeller, 2011), and notions of authority (e.g., Higgins, 2003). In the following chapter, I describe in detail the methods and methodology for this study.

Chapter 4: Methodology

4.1 Introduction

This chapter describes the methods and methodology of this study, which involved two phases of data collection – an acceptability judgment task (AJT) and follow-up interviews.
Since the current study is inspired by concerns similar to those of both L1 and L2 composition error studies and the more variation-based acceptability judgment studies in world Englishes and ELF, both the writing research (L1 and L2) and the sociolinguistic research (WE and ELF) described in the previous chapter influenced the methodological decisions made for this study. Below, I describe the research design in relation to the research questions, as well as the recruitment of participants, the places they worked, and their demographic information. I then describe the creation of the AJT and how it was administered, as well as the follow-up interviews and how they were conducted, and describe the data analysis procedures. Finally, I briefly discuss issues of validity and the limitations of this type of study, and discuss my own position as a researcher working on a project involving English writing in China.

4.2 Overview of the study and research design

This study, like many of those described in the previous chapter, involved collecting data from two different groups of English language teachers: Chinese-speaking English teachers from China (n=30) and non-Chinese English teachers from other countries (n=16). The Chinese participants were recruited from three Chinese universities, and the non-Chinese from one joint Sino-foreign university in China and from my personal contacts and a Canadian professional association. Both groups completed an AJT in which they responded to seven essays written by Chinese university students; for each essay, they were asked to identify no more than ten instances of language use they found unacceptable for any reason. (For the instructions, please see Appendix C; there is also a more detailed description of the process below.)
There were then follow-up interviews with volunteer participants (n=20) discussing their responses to the AJT as well as English writing, criteria for judging language as unacceptable, English in China, and other relevant issues. Before discussing the details of the research design, it is useful to revisit the three research questions this study investigates:

1. What lexical and syntactic features in Chinese student writing do Chinese and non-Chinese English teachers identify as unacceptable, and why?
2. How do participants react to chunks which evince features of either Chinese English or English as a Lingua Franca, and why?
3. By what authority do participants make judgments about the acceptability of English usage in writing?

As I have alluded to earlier, the first two questions are related in that they approach issues of acceptability – first in a bottom-up way, with no assumptions about what features of writing or types of deviations from presumed standards of written English are likely to be noticed by participants, and second in a top-down way, looking at specific features that have been considered typical of Chinese English or English as a Lingua Franca. These two questions are complementary because they both deal with how participants judge deviations from standard written English in the unique sociolinguistic context of China. The first question investigates what might be missed by looking only at CE and ELF when it comes to readers’ reactions to written English in the Chinese context, while the second question looks at English in China based on existing theory and research on CE and ELF. If the first two questions ask what and why about the judgments, the third question asks how, in the specific area of authority. I have described in Chapter 2 why authority is an important question in language ideology (of which acceptability is a facet).
There were two methods of data collection, the AJT and the interview; research questions one and two involved both, while the third question involved primarily the interviews, though the interviews themselves were based on participants’ responses to the AJT (they are, in a sense, text-based interviews). Thus the study roughly follows a cross-sectional mixed methods design with a survey (AJT) component and an interview component, which has been identified as one of the most common types of mixed-methods designs in education (Bryman, 2006) and applied linguistics (Dörnyei, 2007). However, it should be noted that while the study includes two different methods of gathering data – the survey-like AJT and the semi-structured interview – unlike most “mixed methods” studies, the survey portion is only partially quantitative; the AJT, rather than being analyzed with inferential statistics as in some linguistic acceptability studies, is approached through both descriptive statistics and a thematic analysis (see Braun & Clarke, 2006). The relationship of the AJT and interview data is similar to Bryman’s (2006) categories of “explanation,” where one method (the interviews) “is used to help explain findings generated by the other” (the AJT), and “enhancement,” whereby the more in-depth interview data is used in “making more of or augmenting” the AJT data (pp. 106-107). I take a view of the “research interview as social practice” in which “data cannot…be contaminated” (Talmy, 2010, p. 3). Thus, while the study involves participants giving accounts of their judgments via two different methods – the self-administered AJT and the interview – both can be analyzed in terms of their being “situationally contingent and discursively co-constructed” (p. 3).
The purpose of the interviews is not for interviewees to access their previous psychological states when they were completing the AJT, but rather to discuss the general themes and issues brought up by the study and by their participation in the AJT in particular. My approach to interviews is one which “uses interviews primarily to collect data about the insights or perspectives of research participants, with less attention paid to the actual linguistic or textual features of the discourse,” and thus “content or thematic analysis, rather than a linguistic or interactional analysis, is primary” (Duff, 2008, p. 133). The way I choose to analyze and handle the data is ultimately my own synthesis; I make decisions about what to include from the interview transcripts, and I draw conclusions based on my own perspective and analysis. I hope that the finished product will be received by the participants in the spirit with which it is intended: I interviewed them in order to learn more about their perceptions involving language use and English writing in the Chinese context, and thereby to enrich my own and others’ understanding of how teachers make judgments of acceptability in writing. The interviews were conducted with a spirit of curiosity, collegiality, and openness to the participants’ perspectives; the analysis of the data and the writing of the dissertation, though they filter the interview and AJT data solely through my own lens, are undertaken with that same spirit.

4.3 Recruiting participants

My goal, initially, was to recruit a more or less equal number of Chinese and non-Chinese English language teachers to facilitate comparing their responses on the AJT – perhaps 40 in each group. The main interest I have in separating the two groups is looking at the reactions of Chinese NNESTs vs. non-Chinese NESTs.
Whether all the non-Chinese participants are “native speakers” in the traditional sense is debatable; not all are monolingual English speakers, but all are from Inner or Outer Circle countries. For this reason I prefer “Chinese” and “non-Chinese” to refer to the groups, rather than NNEST and NEST. This is not to essentialize groups based on nationality or origin, but to emphasize the difference between “insider” and (relative) “outsider” status vis-à-vis English education in China. Recruiting participants proved more difficult than I had anticipated, and I ended up with 30 Chinese participants (around 10 from each of the three Chinese universities I contacted) and 16 non-Chinese (from a joint Sino-foreign university and from personal connections in Canada). Since this is not a quantitative study aiming to make widely generalizable claims, but a primarily qualitative study which aims at a kind of “snapshot” through examining participants’ reactions to texts, having equal numbers of participants in each group was less important than being able to look for general patterns in the participants’ subjective responses to the AJTs, which were then used for further qualitative analysis. Sites for recruiting participants were selected based on my previous experience and/or personal connections. Participants from the Chinese institutions were recruited initially via an email message to department heads at the respective sites (see Appendix G for an example), and later by attending department meetings to introduce the project. After some initial participants expressed interest, snowball sampling was used, and earlier participants assisted in recruiting others in their departments. Chinese participants who completed only the AJT were given a 100 yuan (approx. $20 CAD) gift card to a local supermarket, while those who also volunteered to participate in interviews were given 150 yuan gift cards.
I had initially intended to separate the non-Chinese participants into two groups: those working in China (from the joint-venture university) and those who had never worked in China (from my connections in Canada), and those were the eligibility criteria I submitted to my university’s ethics review board. As the study progressed, this became a less important concern, and the two groups were collapsed into one. The non-Chinese participants who worked at the joint-venture university in China were recruited in a similar manner to those at the Chinese universities; I approached the director of their department, and emailed all the instructors after obtaining permission to do so. The Canadian participants were recruited through personal networks, described below. The non-Chinese participants were not compensated for their participation; although the time that both sets of participants invested in the project was short (1-2 hours at most), I made the decision to compensate the Chinese participants because they were, on the whole, sacrificing more by giving up their time, due to their busier schedules and lower salaries compared to many of the non-Chinese participants. Often personal networks were the only way in which I could proceed fruitfully. For example, at one research site I had trouble recruiting participants because my only contact was busy with other commitments. Though my contact was able to help me recruit several participants, I was frustrated by my inability to find others and resorted to emailing professors whose profiles on the university website revealed research interests similar to mine. One of the professors I emailed had spent a previous summer at a course given by one of my dissertation committee members, and was happy to help, recruiting nearly ten more participants.
This type of convenience sampling also happened when a mentor of mine working at another site casually mentioned that he knew the dean of an English department at a neighboring university. He contacted the dean, and within about a week, another nine participants were recruited from that department. (See Appendix G for an example of an email to a department head.) Recruitment of Canadian participants proved to be slightly more difficult. My original plan had been to recruit through a local professional association, but they considered my call for participants to be an advertisement, which I did not have the funds to place. I spoke to some colleagues in person at the association’s local conference, which led to the recruitment of a few participants. Eventually, I posted a call on their website’s social network (see Appendix H), which led to the recruitment of several more; a colleague at a university where I had been teaching part-time also saw the call and forwarded it to her MA students. This allowed me to recruit several more Canadian teachers.

4.4 About the participants

In total, there were 46 participants in this study – 30 Chinese and 16 non-Chinese – recruited from five different sites. Table 4.1 and Table 4.2 below describe the participants’ general demographic information and other details. Participants’ ID numbers refer to their institutions, described in more detail in section 4.5.

Table 4.1 Profiles of the Chinese participants (n = 30)

ID     Interview?  Sex  Age  Years teaching  Highest degree  Degree field
SIC1               M    30   5               MA              Translation
SIC2   Y           F    30   8               BA              English
SIC3               F    36   11              BA              English
SIC4   Y           M    30   5               MA              Translation
SIC5   Y           M    30   3               MA              English
SIC6   Y           M    35   10              MA              English
SIC7   Y           F    32   9               MA              (not provided)
SIC8               F    28   2               MA              (not provided)
NKU1               F    33   10              MA              Applied linguistics
NKU2               F    47   26              MA              Applied linguistics
NKU3               F    39   15              MA              Education
NKU4               F    45   22              MA              Applied linguistics
NKU5               M    37   15              MA              Applied linguistics
NKU6               F    39   13              MA              English
NKU7               F    50   27              BA              English
NKU8               F    45   27              MA              Applied linguistics
NKU9   Y           F    38   12              PhD             Education
NKU10              F    38   14              MA              Applied linguistics
NKU11  Y           F    38   17              MA              Applied linguistics
ATC1   Y           F    30   9               MA              English & Applied linguistics
ATC2   Y           F    45   23              PhD             (not provided)
ATC3   Y           M    30   7               MA              (not provided)
ATC4   Y           F    31   8               MA              Applied linguistics
ATC5   Y           F    38   16              MA              Applied linguistics
ATC6               F    31   9               MA              Applied linguistics
ATC7               F    32   9               MA              (not provided)
ATC8               F    31   9               MA              Applied linguistics
ATC9               F    35   15              MA              (not provided)
ATC10              M    35   10              MA              (not provided)
ATC11              F    32   6               MA              Foreign Languages

Table 4.2 Profiles of the non-Chinese participants (n = 16)

ID    Interview?  Sex  Age  Years teaching  Highest degree  Degree field                  Country of origin
JVU1              F    58   7               BA              TESOL                         Scotland
JVU2              F    42   20              MA              TESOL                         Fiji
JVU3  Y           M    54   15              MA              TESOL                         Ireland/UK
JVU4  Y           M    34   12              MA              English language & education  UK
JVU5  Y           M    39   16              MA              TESOL                         USA
JVU6  Y           M    54   20              MA              Applied linguistics           UK
JVU7              F    41   18              MA              English literature            India
CAN1  Y           F    23   1               MA              TESOL                         Canada
CAN2  Y           F    46   17              MA              TESOL                         Canada
CAN3              F    58   4               BA              General studies               Canada
CAN4              F    55   24              MA              TEFL                          USA
CAN5  Y           M    41   5               MA              TESOL                         Canada
CAN6              F    49   23              BA              Linguistics / Spanish         Canada
CAN7              F    (Participant did not provide this information)                     Canada
CAN8              M    34   10              MA              Education                     Canada
CAN9  Y           F    60   32              PhD             English                       USA

The typical participant in this study is a woman in her mid-30s with a master’s degree in applied linguistics or another field related to English language teaching, with around ten years of teaching experience.
On average, the Chinese participants were younger (average age 36) than the non-Chinese (average age 46), and the non-Chinese group had a slightly higher average number of years of teaching experience (almost 15, compared to the Chinese group’s 12). Most participants were female (seven Chinese and seven non-Chinese men participated in total, so less than a third of the participants were male), and most had a master’s degree (though there were three BAs in each of the two groups, with two PhDs in the Chinese group and one in the non-Chinese group). Two of the non-Chinese group reported being in the process of working on PhDs. Each group included both younger and more experienced teachers.

4.5 Research sites

The institutions where I recruited Chinese participants for this study are all in the same province. Although the province where I collected data in China may not be representative of the whole country, it is on the more affluent, economically developed, and “progressive” East coast of China, which seems to be leading the way toward the middle-class society the Chinese government is currently promoting. It is one of the provinces that are increasingly urban, educated, white-collar, and foreign-language-speaking compared to the rest of the country. I have divided the participants among the different institutions they work for, more for purposes of identifying them than out of any methodological need to keep their institutional affiliations separate – though the institutions are different, and in some cases their individual characteristics are salient in the participants’ responses. Below I enumerate the pseudonyms for the different sites and offer a brief description of each. The first three are the sites the Chinese participants are affiliated with, while the latter two are those the non-Chinese teachers are drawn from.
4.5.1 SIC (Small Independent College)

SIC belongs to a relatively new type of higher education institution in China that is technically privately funded. These are referred to as minban (non-state) institutions: because the “notion of ‘private education’ remains politically incorrect in socialist China, scholars and government officials tend to use the term minban to refer to institutions not run by the state actors” (Ong & Chan, 2012, p. 168). At the time of the data collection, SIC was located on the campus of a local provincial university, though it has since moved to a suburban university district of the type that some Chinese provinces are currently promoting. SIC enrolls approximately 8,000 students, and the foreign language department enrolls nearly 1,000 students, many of whom are English majors. I had previously worked at SIC as a foreign teacher and had collegial relationships with the administrators and several teachers in the department.

4.5.2 NKU (National Key University)

NKU is one of China’s larger universities (it enrolls around 50,000 undergraduate and graduate students), and is one of a select group of universities under the direct supervision of the Chinese Ministry of Education (formerly known as “National Key Universities”). It is consistently ranked among the top universities in the country, and all of its faculty members have master’s or PhD degrees. I had also previously worked at NKU, though I was less intimately familiar with its foreign language department(s), since it was much bigger than SIC’s and I rarely had contact with my Chinese colleagues.

4.5.3 ATC (A Technical College)

The third participating institution, ATC, is a technical college affiliated with NKU, enrolling approximately 10,000 students. All deans (and a few professors) at ATC are drawn from the faculty of NKU, and graduating students are awarded NKU degrees.
In practice, however, ATC is a wholly separate institution, located in a different city and largely employing its own academic staff. I was put in touch with the English department at ATC by a professor at JVU (see below). SIC and ATC were very similar institutions, and what I think of as “typical Chinese universities,” at least of the type I am familiar with: many large, grey buildings arranged in quadrangles, and large rooms where a department’s entire English teaching staff shares a single office, with stacks of yellow composition books piled on every teacher’s desk.

4.5.4 JVU (Joint Venture University)

This is one of a growing group of what I call “joint-venture universities” (also called CFCRS, or “Chinese-Foreign Cooperation in Running Schools”; see Ong & Chan, 2012): partnerships between Chinese and non-Chinese universities. While there are hundreds of agreements between Chinese and foreign higher education institutions, ranging from exchange programs to jointly issued diplomas, there is only a small number of CFCRS institutions that themselves issue degrees. JVU has been operating for a little less than a decade; its faculty are largely international, while its student body is overwhelmingly Chinese. It is part of a larger university system based in a western country. The department where I recruited participants was the university’s English Language Center, which is tasked with teaching preparatory English for Academic Purposes to all of the university’s students. Academically, I was based in the English department of JVU during the data collection for this study, and participated in various research and teaching activities there.

4.5.5 CAN (Canadian Participants)

The final category is not a “site” but a group of Canadian participants recruited in the way described above.
I was interested in recruiting teachers working at the postsecondary level who did not have experience working in China, because I was considering a comparison between the JVU and CAN participants; in the end, due to the relatively small numbers in each group, I elected not to make this comparison. None of the CAN participants had any experience teaching in China, though one (CAN1) had briefly taught in Hong Kong.

4.6 Creating the AJT

The AJT was created using essays taken from the Written English Corpus of Chinese Learners (WECCL), published by Foreign Language Teaching & Research Press in Beijing (Wen, Liang, & Yan, 2008). This is a corpus of some 4,000 essays written by Chinese university students at various stages in their college careers, in several different genres, some for exams (timed) and others in class (untimed). Because I was interested in seeing participants’ reactions to compositions written by actual students (as opposed to decontextualized sentences or essays specially prepared to display certain errors), the selection of particular essays was less important than it might have been in similar studies with a stronger focus on particular linguistic forms. After skimming some of the essays, I selected seven with similar characteristics: all were argumentative essays written by fourth-year English majors at Chinese universities. (See Appendix C for the actual essays used.) The essays I selected were, in my opinion, readable, due to a relative lack of major spelling errors or extremely ungrammatical sentences, which might have been too distracting to readers. In a sense, I was aiming to present participants with texts that are representative of the abilities of the type of English major most Chinese university English departments would hope to graduate; that is, texts written by young people with a fairly sophisticated grasp of English writing in an EFL context.
Whether the texts are “good” is, essentially, beside the point: the important thing for this study is that they are actual writing samples from fourth-year Chinese university English majors. The topics of the essays and the letters I used to identify them in the study, along with their word counts and whether they were timed or untimed, are shown in Table 4.3 below.

Table 4.3 Information about the AJT essays
Prompt | Topic | Word count | Timed/Untimed
A | Cost of education | 387 | Timed
B | Animals: pets or resources? | 402 | Timed
C | Government spending | 389 | Timed
D | E-dictionaries | 379 | Untimed
E | Competition vs. cooperation | 413 | Timed
F | Living off campus | 340 | Timed
G | Electronic greeting cards | 421 | Untimed

The intent of the AJT was to encourage participants to focus more specifically on the language use issues that were the focus of the study: namely, lexical and grammatical usage that the participants deemed to be nonstandard or “unacceptable.” Nevertheless, many participants did point out features of the texts related not only to lexis and grammar but also to discourse, content, and even, in some cases, the appropriateness of the writing prompts or what they perceived as a disconnect between the essays and the very nature of academic writing. Administering an AJT, rather than studying teachers’ reactions to their own students’ writing in a “natural” way, as in the studies by Connors and Lunsford (1988) and Lunsford and Lunsford (2008), creates something like “lab conditions.” By removing the writer from the equation (or at least removing the relationship between the teacher and writer), I hoped to tap into the participants’ judgments of linguistic acceptability rather than their pedagogical diagnosis of any one particular student. Predictably, some participants did report completing the task in the way that they would have if they had been reviewing their own students’ papers, as indeed often seems to be the case in this type of study.
4.7 Collecting the data

4.7.1 AJT

Participants were sent the AJT via email and asked to complete it within a two-week period. I asked participants to spend no more than 90 minutes on the task, but I could not strictly enforce this, since the task was completed independently. The instructions emphasized spending as little time as possible on each essay by limiting the number of comments to only those the participants felt were the most salient – i.e., the ten “most unacceptable” instances of language use (referred to in the analysis as “chunks,” as discussed in Chapter 1). Participants completed the AJT using Microsoft Word’s comment function to identify perceived unacceptable usage of lexis and syntax; the instructions they were sent (see Appendix C for the full document) asked them to “read each essay, and using Microsoft Word’s comments features, select any word, phrase, or arrangement of words which [they] consider[ed] unacceptable” and to briefly explain why in the comments. Participants were not initially told that the essays were written by Chinese undergraduates; they were told the essays came from “English majors in their fourth year of university in a non-English-speaking country” and that the purpose of the study was to investigate their “perception of acceptable features of written English.” It was probably clear to most of the participants that the essays were written by students from China due to their content, and in many of the interviews I explained that the essays were written by Chinese students. My intention in not revealing at the outset that the writers were Chinese was not to “hide” this fact, but to allow the participants to begin making judgments without expecting features they may have believed to be stereotypical of Chinese students’ writing.
The term “unacceptable,” used to describe the chunks that participants rejected, was not chosen lightly, nor was it necessarily well received by all participants. Because of my theoretical approach to variation from standard written English, and my desire to collect more bottom-up data with no a priori definition of errors, I purposefully avoided using words like “error,” “incorrect,” or “wrong” in both the instructions on the written AJT and my face-to-face or email discussions with individual participants when explaining the focus of the AJT. (In interviews, I did sometimes use these words, since they often naturally come up in conversations between English teachers.) In many cases I was asked to clarify what I meant by unacceptable, and I often replied by saying something like “just identify anything that, for any reason, seems unacceptable, unusual, or ‘not OK’ – it could be for lexical or syntactic reasons, or spelling, or grammar, or style, or appropriateness in the context, or your own personal pet peeves – really any reason.” I remember repeating sentences like this one many times during the course of data collection. While some participants still felt uncomfortable with the label “unacceptable,” it was clear to me after looking at the AJTs that almost all of the participants had in fact marked uses of language they disliked for the variety of reasons mentioned above.

4.7.2 Interviews

Those participants who were willing to participate in a follow-up interview checked a box on the AJT stating their availability. A total of 24 participants agreed to be interviewed. However, one short interview was with a participant who had not properly completed the AJT (there were no comments), and the discussion in that interview was mostly irrelevant to the study, so it was not used.
In addition, three interview recordings (two from NKU and one from JVU) were inadvertently lost during the process of transferring files from the recorder’s SD card to my laptop. Therefore, a total of 20 interviews were included in the project: 5 SIC, 2 NKU, 5 ATC, 4 JVU, and 5 CAN. The primary topics of discussion were participants’ responses, their reasoning, and their reactions to the texts in the AJT, as well as more general discussion of their attitudes and beliefs about standard English, English in China, and the like. (See Appendix F for the interview guidelines used.) There was often a short turn-around time between the AJT and the interview, so in many cases I briefly read through the participants’ responses before the interview, highlighting comments I thought were interesting or unique in order to ask about them. Interviews were open-ended and the topics discussed varied, depending on the context and my relationship with the participant. Sometimes participants chose to bring up comments I had not previously deemed notable. While I aimed to be open to allowing the participant’s responses to “lead” the interview, I also recognize that, as an interviewer, I played a major role in shaping responses. When I quote excerpts from interviews in the data analysis chapters, I often include my own half of the conversation. Each participant was interviewed in person for approximately one hour, with the exception of one of the Canadian participants, who was interviewed via Skype. Most interviews took place in the participants’ own offices or classrooms on their campuses; some took place at coffee shops. Interviews were audio-recorded using a Zoom H2 Handy recorder. All interviews were conducted in English, which has been a somewhat controversial practice in applied linguistics research; Jin and Cortazzi (2011) advocate being clear about which languages were chosen, and why, for interviews with participants who are Chinese English language teachers.
In this case, I chose English both for convenience, as my Chinese ability is intermediate at best, and because I view the Chinese participants as competent English users and teachers. Most appeared, from my perspective, to be comfortable conversing in English, though it was also clear that their proficiency varied. Interviews were transcribed in a more or less “naturalized transcription” (Bucholtz, 2000, p. 1461), in which oral speech is recorded in accordance with written discourse conventions, such as conventional spelling and punctuation. However, I attempted to preserve what I perceived to be the general sense of “flow” of both my and the interviewees’ speech, including false starts, laughter, some nonverbal sounds, and so on.

4.8 Managing the AJT data

Before I describe more specific data analysis procedures for each of the research questions, some description of how the AJT data was managed and analyzed is necessary to give a better idea of the size and scope of the project. There were 46 participants each reading and commenting on the same seven essays, for a total of 322 unique commented-upon documents (46 × 7). Each participant was asked to limit himself or herself to ten comments per essay, prioritizing based on what he or she felt were the most important ‘unacceptable’ features (though some gave fewer and some went over this number). In the end, 2,843 comments were coded (this number is slightly higher than the actual number of comments in the original documents, because if a comment seemed to be dealing with multiple “issues” in the text, I duplicated it and coded each duplicate comment as being connected to a different chunk). As mentioned above, the essays were emailed to participants as Microsoft Word documents which included instructions on the AJT procedure as well as the seven essays in randomized order.
Participants were to use Word’s “Comment” feature to select chunks of text they deemed unacceptable, and to provide comments giving their reasons. Figure 4.1 below gives an example of what each selected chunk and comment looked like.

Figure 4.1 Example of AJT

The comments, along with their related chunks of text, were exported from Word to Excel through the use of a Visual Basic macro (a short, simple computer program designed to automate a task) written by a colleague. Using data such as which prompt each essay belonged to (the seven prompts were labeled A through G) and which line of the page each chunk began on, I was able to arrange all of the comments in roughly linear order in Excel, from the beginning to the end of the essay, for each prompt. The next step was to determine in which cases participants were selecting the same “issue” in the text. First, I printed out the Excel sheets for each prompt, showing data for prompt, participant, chunk, and comment. Then, looking at each chunk/comment, I judged which of them were describing the same basic ‘complaint.’ Some cases were easier to judge than others. For example, one sentence in Prompt B about the treatment of animals reads: “if you think that they [animals] are heartless, dirty and ugly, you can treat them just as animals.” Five participants selected the same word (“heartless”), and all of them made comments clearly related to questioning whether the word appropriately conveyed the meaning intended by the writer. The chunk, labeled B064 based on its prompt (B) and position in the text (it was the 64th chunk selected), was selected and given the following comments by the five participants:

Table 4.4 Comments on “heartless” (B064)
Participant | Comment
CAN6 | Wrong word. Used for people. “without feeling/have no feeling”. Check parallel structure: BE without feeling vs. HAVE no feeling.
ATC8 | Word choice
CAN4 | What is meant by this?
SIC6 | soulless?
NKU11 | “Heartless” means cruel, not appropriate here.

Thus, the common selection of the same chunk, combined with the fact that all the participants made comments related to questioning the appropriateness of this adjective in this context, made it easy to confidently label all of these as belonging to the same chunk, B064. Each comment, then, got a “chunk ID” which labeled it as belonging to the same chunk as other related comments, beginning with A-G depending on the prompt. (CE and ELF chunks that were not selected by any participants, discussed in Chapter 6, are given a chunk ID beginning with the prompt letter but with “XX” instead of a number, e.g., FXX.) Some of the chunks and comments from participants were more difficult to interpret, as when participants selected only one word but their comment indicated they were referring to an entire phrase or sentence, or when participants selected chunks of varying lengths. In addition, participants sometimes referred to multiple unacceptable ‘issues’ in the same comment. For example, they might have selected the verb phrase “feel sense of inferiority” and mentioned both a missing article (e.g., “it should be ‘feel a sense’”) and their preference for “have” rather than “feel” in this expression. In this case, the comment was copied and pasted again into the spreadsheet, so that it appeared twice. These were color-coded (for my reference) to ensure that they were noted as two different comments, and then each was coded as belonging to a different chunk.

4.9 Data analysis procedures for the research questions

This section will describe the procedures I undertook to analyze data for each of the three research questions.

4.9.1 RQ1

The first part of analysis for the first research question involved arranging the chunks in an Excel document in such a way that the frequency with which each chunk was marked by each group of participants could be discerned.
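The actual tabulation was carried out in Excel; as a rough equivalent, the per-group frequency counting could be sketched in Python as follows (the chunk IDs and counts below are hypothetical, for illustration only):

```python
from collections import Counter

# Hypothetical AJT data: one (chunk_id, group) pair for each participant
# who marked that chunk as unacceptable.
selections = [
    ("B064", "Chinese"), ("B064", "Chinese"), ("B064", "non-Chinese"),
    ("A012", "non-Chinese"),
]

# Tally how many participants in each group marked each chunk.
freq = Counter(selections)
print(freq[("B064", "Chinese")])      # 2
print(freq[("A012", "non-Chinese")])  # 1
```

Arranged as a chunk-by-group table, counts of this kind are what make it possible to see where the two groups converge or diverge.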
I experimented with a number of different ways of looking at this data: seeing which chunks were the most selected overall and for each of the seven essays, looking at those chunks which had a high degree of “disagreement” (marked by a fairly large percentage of one group but not the other), and so on. This also yielded some general data about how much consensus there was overall. Because there were so many chunks, it was necessary to find productive ways to determine which chunks or groups of chunks would be useful in answering the research question. After careful examination of the data, I elected to focus on the following groups of chunks, which I have named “high consensus” and “differing priorities” in order to provide a convenient shorthand for the contents of each group:

1. High Consensus: Those chunks which were marked unacceptable by at least 50% of each group; that is, chunks that were marked unacceptable both by at least 50% of Chinese participants (15 or more of the 30) and by at least 50% of non-Chinese participants (8 or more of the 16).

2. Differing Priorities: A chunk was considered noteworthy if it fit one of two criteria:
a. It was marked unacceptable by at least 50% of one group, but less than 50% of the other group, or
b. It was marked unacceptable by at least 20% of one group but by 0% of the other group. (The idea here was to look for chunks which were selected by at least a sizable minority of one group while seemingly “ignored” by the other.)

I began to look through chunks and comments that fit these criteria for patterns. Certain types of chunks or types of complaints of unacceptability were noticeable as either common to both groups or unique to one of them, and these were selected for further qualitative analysis. Using participants’ comments and relevant interview data, I interpreted the reasons for the similarities and differences in the groups’ priorities.
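These criteria are mechanical enough to render in code. The following is a minimal, hypothetical Python sketch (reading the 50% thresholds as at-least-50%, consistent with the parenthetical counts of 15 of 30 and 8 of 16); it illustrates the classification logic rather than the actual spreadsheet-based procedure:

```python
def classify_chunk(n_chinese, n_non_chinese,
                   total_chinese=30, total_non_chinese=16):
    """Classify a chunk by how often each group marked it unacceptable.

    Returns "high consensus", "differing priorities", or None."""
    p_c = n_chinese / total_chinese
    p_nc = n_non_chinese / total_non_chinese
    if p_c >= 0.5 and p_nc >= 0.5:
        return "high consensus"        # criterion 1: both groups at 50%+
    if (p_c >= 0.5) != (p_nc >= 0.5):
        return "differing priorities"  # criterion 2a: one group at 50%+, other below
    if (p_c >= 0.2 and n_non_chinese == 0) or (p_nc >= 0.2 and n_chinese == 0):
        return "differing priorities"  # criterion 2b: 20%+ of one group, 0% of the other
    return None

print(classify_chunk(15, 8))   # high consensus
print(classify_chunk(20, 4))   # differing priorities (2a)
print(classify_chunk(7, 0))    # differing priorities (2b)
print(classify_chunk(3, 1))    # None
```

Chunks falling into neither category were set aside, which is how the large pool of chunks was narrowed to the two groups analyzed qualitatively.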
4.9.2 RQ2

The second research question involved identifying existing features of Chinese English (CE) and English as a Lingua Franca (ELF) to look for in the AJT texts, and then analyzing participants’ reactions to them. After extensive reading in the CE and ELF literature, I identified a list of features that have been identified or proposed for each. Below, I detail these features.

4.9.2.1 Features of CE

There are roughly four categories of lexical features of CE mentioned in previous literature: semantic shifts, in which words’ meanings change due to recontextualization in the Chinese cultural setting; translations of Chinese idioms, proverbs, or slogans; loanwords; and loan translations. A number of CE studies mention these features, sometimes with different but overlapping emphases. The table below describes those scholars’ categorizations of those features.

Table 4.5 Proposed lexical features of China/ese English
Feature | Example(s) | Author(s)
Semantic shift | Propaganda (with a positive connotation); “open” to mean “turn on” | Cheng, 1992; Gao, 2001; Xu, 2010
Idioms and slogans | Long time no see; good good study day day up | Cheng, 1992; Fang, 2008
Loanwords/borrowings | Baozi (steamed bun), mantou (steamed bread), kung fu, ginseng | Cheng, 1992; Xu, 2010; Yang, 2005; Yang, 2009
Loan translations | Special economic zone, red envelope, paper tiger | Gao, 2001; Xu, 2010; Yang, 2009

In terms of grammatical or syntactic features, I rely on Xu’s (2010) detailed list of features of CE based on both spoken and written data. Table 4.6 and Table 4.7 below show those features. (Note that for the written features, the latter two were found in short story dialogue, while the others were found in news writing.) Not all of these features were found in the texts used in this study; those which were are described in more detail in Chapter 6.
Table 4.6 Proposed grammatical/syntactic features of CE by Xu (2010) based on spoken data
Feature | Example
Adjacent default tense | Yesterday I write a letter.
Null subject/object utterances | Sometimes just play basketball.
Co-occurrence of connective pairs | Because I X, so I Y.
Subject pronoun copying | My mother she likes to do that.
Yes-no response | “You don't like sports?” “Yeah.” [Meaning “I don’t.”]
Topic-comment | Cigars, the president never smokes them.
Unmarked object-subject-verb (OSV) | Both languages I can't speak well.
Inversion in subordinate finite wh-clauses | I don't know what should I learn.

Table 4.7 Proposed grammatical/syntactic features of CE by Xu (2010) based on written data
Feature | Example
Nominalization | Many types of nominalized noun phrases (see pp. 224-228)
Multiple-coordinate construction (done with “Chinese pragmatic motivations” and tending to “come in threes” (p. 91)) | “The ministry will maintain the principle of supporting overseas studies, encouraging the return of overseas Chinese students, and lifting restrictions on their coming and going” (p. 91)
Modifying-modified sequence (preference for forward-linking, subordinate clauses first) | “If she goes home, she cannot bear the sorrow of coming back to work” (p. 96)
Use of imperatives (as opposed to questions) to express commands or requests | “Go buy a carp. Stew it tomorrow afternoon and take it to my office” (p. 101)
Tag variation | Wide variety of tag questions

4.9.2.2 Features of ELF

As mentioned in Chapter 1, a number of features have been claimed to be part of ELF since scholarship on this topic began. According to Jenkins (2014), in Seidlhofer’s original publication outlining the probable features of ELF (2004), the publisher omitted the “scare quotes” or quotation marks indicating Seidlhofer’s “skepticism towards the pejorative terms” used (p.
33), such as “‘dropping’ the third person present tense -s” or “inserting ‘redundant’ prepositions.” Below I summarize Seidlhofer’s list, using non-pejorative descriptions as far as I am able.

Table 4.8 Proposed grammatical features of ELF (Seidlhofer, 2004)
Feature | Example
Unmarked third person present verbs | He go to the store.
Interchanging who/which | The man which I know…
Omitting and adding articles | I live in the China.
Invariant or “incorrect” tag questions | You got a new job, is it?
Redundant prepositions | Let’s discuss about that.
Overuse of verbs of “high semantic generality” | Today we will do basketball.
Replacing infinitive constructions with that-clauses | I want that we go swimming.
Over-explicitness | The car was black colour.
Making traditionally non-countable nouns countable (see Jenkins, 2014) | Informations, researches

4.9.2.3 Analysis procedures

After compiling these lists of features from my reading of the literature, I read through the seven AJT essays and identified chunks which corresponded to the features of CE and ELF. In some cases, these chunks corresponded to those which had been identified by participants as unacceptable; in others, they had not previously been mentioned by any participants. I then looked at each category of features of CE and ELF that occurred in the texts and noted how many members of each group (Chinese and non-Chinese) had marked them as unacceptable. Using the participants’ AJT comments, relevant interview data, previous literature on CE and ELF, and knowledge of the context of English in Chinese higher education, I qualitatively interpreted and analyzed the possible reasons that some features of CE and ELF seemed more likely to be marked than others.

4.9.3 RQ3

For the third research question, I undertook a thematic analysis of the interview data (interviews were discussed in section 4.7.2 above, and a full list of interview questions is in Appendix F).
This is in contrast to the micro-level conversational cues, such as pronouns or modals, used in other qualitative studies which use AJTs as a basis for soliciting interview data about “ownership” of English from participants (as in Bokhorst-Heng, Alsagoff, McKay, & Rubdy, 2007; Higgins, 2003; Rubdy, McKay, Alsagoff, & Bokhorst-Heng, 2008). I decided to approach coding in a way that combined “descriptive coding,” which describes “the basic topic of a passage,” with the spirit of “initial” or “open” coding, which encourages researchers to “code quickly and spontaneously” but also “pay meticulous attention,” coding line-by-line, sentence-by-sentence, or paragraph-by-paragraph (Saldaña, 2009, p. 70). I first coded a sub-section of the data (the four JVU interviews) by this method, and began to take note of codes that struck me as unexpected. I had been expecting most of the data to deal with metalinguistic descriptions of language and writing, but I found I had generated codes like “how English is taught at my institution,” “students as novices,” “telling a story to make a point,” “my own education,” and others which were not related to the more expected categories such as “errors,” “Chinglish,” “grammar,” and so on. What the “unexpected” categories had in common was that they often involved the participant drawing on his or her own experience to show why he or she was someone who was credible or had the authority to make an acceptability judgment.
After thinking about these codes, and after reading the language ideology literature and finding resonant concepts in Cameron (2012) and Milroy and Milroy (2012), described in Chapter 3, I settled on the notion of “authority.” I then began reading through each of the transcripts looking for examples of ways in which participants described how or why they were able to make a judgment, especially noting instances where they seemed to be describing things like their own credibility, the source of their judgment, or their educational, pedagogical, or linguistic ability or proficiency. I coded these by hand at first, and then began to cut and paste relevant sections of text into a Word document. I labeled the relevant utterances as relating to “authority,” because they all contained some description of the participant’s own knowledge, identity, ability, or ethos, especially in relation to the text, the (real or imagined) writer, and the judgment itself. Comparing these excerpts allowed me to identify several ways in which participants claimed the authority to make judgments, and to explain similarities and differences in how the Chinese and non-Chinese participants did this.

4.10 Validity, limitations, and related concerns

The study is methodologically unique among studies of writing for the way it combines a survey-like AJT with follow-up interviews, and the methods of analysis described are appropriate for the data and the context in which they were collected. The research traditions from which the study primarily draws methodological inspiration – L1/L2 composition error studies and acceptability judgments in world Englishes/ELF – have not had a particularly robust methodological footing. I have endeavored, in the preceding chapters, to tease out some of the theoretical and methodological implications of using these approaches. One final area on which it is necessary to comment, however, is the question of “validity” in this type of research.
It has been observed that “qualitative research has come of age in applied linguistics, where it continues to flourish” (Lazaraton, 2003, p. 1) and that there is a “growing enthusiasm” for many different kinds of qualitative applied linguistics research (Duff, 2010, p. 50). The goal of this research is usually “to produce an in-depth exploration of one or more sociocultural, educational, or linguistic phenomena” and/or of “participants’ and researchers’ own positionality and perceptions with respect to the phenomena” (Duff, 2006, pp. 73-74). As such, traditionally quantitative, positivist criteria for evaluating research are often not applicable to this type of research – the highly contextual nature of qualitative research involving, for example, multilingual learners of various languages in various cultural, geographical, and educational settings is unlikely to produce the type of research that will be, or can be, “replicated.” The notion of “validity,” which “is concerned with the integrity of the conclusions that are generated from a piece of research” (Bryman, 2001, p. 30), is relevant to qualitative research, though the traditional areas in which the validity of research is assessed tend to be focused on work which seeks generalizability. Alternate terms for similar constructs have been proposed for qualitative research, such as trustworthiness, credibility, and rigor (see, for example, Rolfe, 2006), to avoid the theoretical entanglements with positivism that validity implies; for qualitative research, the question of whether a research project can be judged as trustworthy or credible has much more to do with the internal coherence and transparency of the work.
This is why, in studies such as this one, it is important to make explicit things like theoretical frameworks, detailed methods of data collection and analysis, and descriptions of decision-making processes and the various pathways which are followed (or created) by the researcher at every turn, from conceiving the project to collecting the data to analyzing it and drawing conclusions. This explicitness is something I aim for in this chapter and this dissertation. On a related note, and as I have mentioned, this study does not aim for generalizability, but seeks to offer one particular and contextualized account of an important area in studies of L2 writing: a non-error-oriented approach to readers’ reactions to texts’ perceived variations from standard written English. Similar methods can be used for similar studies in similar – or indeed different – contexts. The current findings will only reveal information about this particular set of participants, who are, I believe, individuals who can be seen as “representative” of others in similar positions, in the sense that they are practicing teachers of English who deal with L2 students’ writing on a regular basis. Though I identify them as “Chinese” and “non-Chinese” for convenience, their perceptions and responses should not be taken as prototypical for others who share their national or linguistic backgrounds.

4.11 My position as a researcher

When I began this project, I very much saw myself as a “foreign teacher” in China, a native English speaking teacher (NEST) working in an English as a Foreign Language (EFL) context, since I had just spent two years in that job. However, I felt and feel affinity with a number of different communities, and this has changed and expanded as I have gone through my doctoral studies, taught in Canada, and carried out research and data collection in China.
A number of academic identities resonate with me: I am an applied linguist, an English language teacher, a (former) “foreign teacher,” a NEST who has worked in an EFL context, a PhD candidate, and an emerging scholar anxiously thinking about job prospects. I also see myself as having affinities with Chinese English teachers and other scholars who do research about English in China, as well as with people who do research on world Englishes, ELF, and L2 writing. In terms of the identities and affinities I bring to this research project, I have a kind of insider/outsider status that has been both useful and limiting. I am not Chinese, nor am I fluent in the Chinese language, and so I remain an obvious outsider when I am in China. I will never be a “Chinese English teacher” in the way that my Chinese friends and colleagues are. This outsider status has led me to make certain decisions in this project: choosing to conduct interviews in English, for example, and choosing to be based at the western-style JVU while I carried out my data collection. However, I have experienced living and working in several different settings in China and am familiar with scholarship and research on language in China. This insider status has helped me to build rapport with research participants, to find participating institutions and individuals in the first place, and to find my place in a scholarly conversation. Above all, my insider/outsider status has perhaps given me a unique perspective from which to observe sociolinguistic issues of English in China. Another area that may be of interest is my understanding of my own ideological position vis-à-vis variation from standard written English.
Admittedly, I took pains to distance myself from the concept of “error” to an almost unrealistic degree during the study; in the instructions to the AJT I even wrote that the task was “not necessarily” to find and correct “errors” in the text, even though I wanted the participants to select language that was “unacceptable,” which begins to sound like a politically correct euphemism for “error.” Despite the occasional awkwardness of describing language use this way, I think it captures the tension in which I, and perhaps other scholars and teachers of writing, find ourselves: I have training in a perspective which views language use as a fluid, changing, social practice, and I am very sympathetic to the translingual and other perspectives influenced by sociolinguistics. However, I also teach second language writing, and I frequently tell students they are using language incorrectly, pointing out “mistakes” and “errors” like any other teacher who does not sympathize with allegedly “progressive” perspectives. (This is dealt with in the conclusion of Heng Hartse & Kubota, 2014.) I am interested in pursuing the possibilities that emerge when teachers and researchers alike reserve judgment on unconventional language use, though I freely acknowledge that most of us, myself included, are likely to revert to “common sense” notions of error when the subject comes up in everyday practice, and this seems natural and unavoidable to me.

4.12 Conclusion

In this chapter, I have discussed the research design, recruitment, participants, and methods of data collection and analysis for the study. Although each research question approaches the data from a slightly different angle, the questions are related, and by answering them, this study will explore and illuminate the ways in which both Chinese and non-Chinese participants make their judgments of (un)acceptability, as well as some general differences in their approaches to the task.
The following three chapters detail the findings for each of the three research questions described above.

Chapter 5: What gets marked: Differences and similarities between Chinese and non-Chinese participants in the AJT

5.1 Introduction

This chapter is an analysis and discussion of the results of the acceptability judgment task (AJT) from a “bottom-up” perspective (Research Question 1); that is, rather than starting with a priori theoretical notions to guide the investigation of which types of “errors” in written English are noticed by the participants, or which “features” of Chinese English (Xu, 2010) or English as a Lingua Franca (Seidlhofer, 2004) are noticed by them (see Chapter 6), it simply examines all the “chunks” that any participant designated as unacceptable and describes the chunks on which there is a high degree of consensus (or lack thereof) between the Chinese and the non-Chinese groups. The main goal of this chapter is to identify both similarities and differences in the chunks (and types of chunks) that Chinese and non-Chinese participants mark as unacceptable, as well as the reasons they chose those chunks. I begin with a description of the “big picture”: an overview of the results, quantitative and qualitative, briefly describing the total number of chunks selected, explaining notable features of the overall data set, and explaining the procedures used to reduce the data to a more manageable size for analysis. Next, I look at chunks which were marked by a majority of both the Chinese and non-Chinese groups and describe them. Finally, I look more specifically at those chunks which were marked more by one group than the other and describe unique trends in the pattern of each group’s most marked chunks.

5.2 Overview of chunks

The total number of chunks identified as unacceptable by participants in this study was 748.
From the beginning of the analysis, it was clear that there was not a high level of consensus between the two groups: of the 748 chunks, 304 (41%) were unacceptable only to one or more participants from the Chinese group, while 202 (27%) were unacceptable only to one or more participants from the non-Chinese group, leaving a total of 241 (32%) which were common to at least one participant from each group.

Perhaps the most notable initial finding is that nearly half of the chunks – 359 of the 748, or 48% – are instances in which only one of the 46 participants labeled a particular chunk as unacceptable. Even if we add the next lowest level of consensus – 112 chunks rejected by only 2 participants – we get a total of 471/748, or 63% of chunks, which only one or two participants deemed unacceptable. Of course, it is important to remember that participants were not asked to mark every possible use of language they deemed unacceptable, but only those they deemed the “most unacceptable” – i.e., the “top ten” language uses they objected to in each essay – a limit intended to make the data more manageable and to help me identify patterns or trends.

In contrast, only 18 chunks, or less than 2.5% of all chunks, were marked as unacceptable by over half (at least 23/46) of all participants. Even if we look for a more modest level of consensus – for example, chunks which at least 20% of all participants (10/46) agreed were unacceptable – this is still only 73/748, or a little less than 10% of the total number of chunks.

These preliminary findings already point to the deeply subjective nature of acceptability judgments in writing, which should give us pause when we think about traditional monolithic definitions of good writing, error, well-formed sentences, proper word choice, and related concepts.
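The overlap and consensus figures reported above are simple frequency counts over which participants marked each chunk. A minimal sketch of that tally, using a tiny hypothetical marking table (the chunk IDs, participant IDs, and counts below are invented for illustration and are not the study's data):

```python
from collections import Counter

# Hypothetical data: for each chunk, the set of participants who marked it
# as unacceptable ("C..." = Chinese group, "N..." = non-Chinese group).
marks = {
    "X001": {"C1", "C2", "N1"},   # marked by members of both groups
    "X002": {"C3"},               # marked by one Chinese participant only
    "X003": {"N2", "N3"},         # marked by non-Chinese participants only
}

def group_overlap(participants):
    """Classify a chunk by which group(s) marked it."""
    chinese = any(p.startswith("C") for p in participants)
    non_chinese = any(p.startswith("N") for p in participants)
    if chinese and non_chinese:
        return "both"
    return "chinese_only" if chinese else "non_chinese_only"

# How many chunks fall into each overlap category (cf. the 304/202/241 split).
overlap = Counter(group_overlap(p) for p in marks.values())

# Consensus distribution: how many chunks were marked by exactly 1
# participant, exactly 2, and so on (cf. the 359 single-marker chunks).
consensus = Counter(len(p) for p in marks.values())

print(overlap)
print(consensus)
```

Scaling this up to the full data set simply means a larger `marks` dictionary; the reported percentages are each count divided by the 748 total chunks.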
If nearly half of the “unacceptable” chunks are unacceptable to only one in forty-six participants, it seems clear that the sources of unacceptability are not black and white; rather, unacceptability is produced in the reader’s encounter with the text, in relation to the task they are given (for example, an AJT like this one, or the marking of a student’s paper in an instructional setting). While some previous studies (Hyland & Anan, 2006; Wall & Hull, 1989) showed that the process of making judgments of language use is idiosyncratic, they did so with a single text. By looking at how chunks were marked as unacceptable across seven texts, the present study confirms even more decidedly that there is likely to be little agreement between readers of a text on what constitutes unacceptable language use.

5.3 Comparison of groups’ responses to chunks

5.3.1 Introduction

The rest of this chapter focuses on a comparison of how the participants responded to specific chunks. There are three sections. The first, which I refer to as “high consensus” chunks, includes those chunks which were marked by more than 50% of the Chinese group and more than 50% of the non-Chinese group. The next two sections cover what I call “differing priorities” for the Chinese and non-Chinese groups: those chunks that stood out as more clearly marked by one group than the other, each including 1) chunks which were chosen by over 50% of one group but less than 50% of the other, and 2) chunks which were chosen by at least 20% of one group but by no one in the other group. Each of these categories is analyzed below, using descriptive statistical data from the AJT, as well as the participants’ comments from the AJT and relevant excerpts of follow-up interviews. (Because there are more chunks in the “differing priorities” categories than can easily fit on a page and remain readable, these can be found in Appendices A and B.)
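The three comparison categories described above amount to threshold rules applied to each group's marking rate for a chunk. A hedged sketch of those rules (the group sizes, 30 Chinese and 16 non-Chinese participants, are from the study; the function name, category labels, and example counts are invented for illustration):

```python
N_CHINESE, N_NON_CHINESE = 30, 16  # group sizes in the study

def categorize(chinese_marks, non_chinese_marks):
    """Assign a chunk to one of the chapter's comparison categories,
    given how many participants in each group marked it unacceptable."""
    c_rate = chinese_marks / N_CHINESE
    nc_rate = non_chinese_marks / N_NON_CHINESE
    # More than 50% of BOTH groups marked the chunk.
    if c_rate > 0.5 and nc_rate > 0.5:
        return "high consensus"
    # Over 50% of exactly one group (differing priorities, type 1).
    if (c_rate > 0.5) != (nc_rate > 0.5):
        return "differing priorities (majority in one group only)"
    # At least 20% of one group but no one in the other (type 2).
    if (c_rate >= 0.2 and non_chinese_marks == 0) or \
       (nc_rate >= 0.2 and chinese_marks == 0):
        return "differing priorities (one group only)"
    return "low consensus"

print(categorize(22, 11))  # e.g. D066's counts: 22/30 and 11/16
print(categorize(19, 3))
print(categorize(7, 0))
```

This is only a restatement of the definitions in the text, not a reconstruction of the analysis procedure actually used; chunks near the 50% boundary would need the same tie-breaking decisions the study itself made.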
Previous studies of error gravity and comparisons of NES/NNES reactions to L2 writing have suggested that NNESs tend to focus more on “rule violations” while NESs focus on “meaning” or “intelligibility” when choosing what to reject (e.g., Hughes & Lascaratou, 1982; Hyland & Anan, 2006; Sheorey, 1986). The findings below, however, show that in this open-ended task, both groups appeared to place a high priority on rule violations (including violations of syntax and/or collocation, as well as potential typographical mistakes). In terms of “differing priorities” for determining unacceptability, I highlight two for each group. For the Chinese group, these were 1) rejecting what they perceived as Chinese-influenced syntax, and 2) rejecting collocations that conflicted with a narrow interpretation of words’ semantic content. For the non-Chinese group, these were 1) a dislike of “dictionary words,” or writers’ use of unfamiliar or obscure lexical items, and 2) rejection of the use of certain discourse markers, sometimes labeled by the non-Chinese participants as clichéd or incorrectly used.

5.3.2 High consensus: Chunks marked by 50% of both groups

Nine chunks were marked by more than 50% of both groups (that is, by over 15 participants in the Chinese group and over 8 in the non-Chinese group). These included six issues related to rule violations and three potential or probable typographical errors. (I should note that I did not see these as “typos” when I first read the essays, but the consensus of the participants suggested many of them viewed them this way; see below.) These two categories are described below using representative examples of each.

5.3.2.1 Rule violations

Table 5.1 shows the chunks in what I call the “rule violation” category and how often they were chosen by each group, while Table 5.2 shows the same chunks along with representative comments from the participants who selected them. Below the tables, I highlight several cases in this category.
Table 5.1 “Rule violation” chunks

D066  Sometimes we can’t assure ourselves won’t forget everything.
      Chinese: 22 (73%)    Non-Chinese: 11 (69%)    Total: 33 (72%)

C007  For the purpose of satisfying the people's need, with developing quickly in construction, the government offers lots of funds to build theaters, sports stadiums, etc.
      Chinese: 22 (73%)    Non-Chinese: 9 (56%)    Total: 31 (67%)

E028  Besides, China is a large family with more than 1.3 billion people, and the growth of the service sector, the increasing demand for skilled workers which adds pressure to the intensifying competition.
      Chinese: 19 (63%)    Non-Chinese: 12 (75%)    Total: 31 (67%)

A074  So degree certificates are a most factor to find job.
      Chinese: 18 (60%)    Non-Chinese: 9 (56%)    Total: 27 (59%)

D005  However, there are still some arguments thinking that the overuse of E-dictionaries might have more disadvantages than advantages for our English learning.
      Chinese: 19 (63%)    Non-Chinese: 8 (50%)    Total: 27 (59%)

G006  Nowadays, we are using computers more often, computers are becoming more and more important in our daily life, and some people would say that their life and work can not going on without computers.
      Chinese: 16 (53%)    Non-Chinese: 10 (63%)    Total: 26 (57%)

Table 5.2 “Rule violation” chunks and representative comments

D066  Sometimes we can’t assure ourselves won’t forget everything.
      Rule violated: Reflexive pronoun used as subject.
      Representative comments: “Misuse, sounds like Chinese” (SIC1); “Wrong word: maybe she means ..we can’t be sure we won’t…” (CAN7); “Missing something – interfering with meaning – we can’t ensure we won’t forget things?” (JVU2); “Grammatical mistake: two verbs in the sentence. Beside, ‘assure’ cannot be used this way. Should be ‘we cannot ensure that we remember everything.’” (NKU11)

C007  For the purpose of satisfying the people's need, with developing quickly in construction, the government offers lots of funds to build theaters, sports stadiums, etc.
      Rule violated: Use of a verb phrase in a prepositional phrase where a noun phrase is preferred (i.e., “with the quick development of construction”).
      Representative comments: “Chinese students’ common mistake of preposition phrase.” (ATC7); “Confusing as written, I suggest ‘by quickly developing construction projects’” (CAN2); “this would be better as a noun phrase” (JVU6)

E028  Besides, China is a large family with more than 1.3 billion people, and the growth of the service sector, the increasing demand for skilled workers which adds pressure to the intensifying competition.
      Rule violated: Compound-complex sentence with no main verb in the second clause.
      Representative comments: “this is not a sentence.” (SIC5); “The sentence is not grammatically correct. It’s better to say: the growth of the service sector and the increasing demand for skilled workers add pressure on…” (NKU6); “awkward and unclear” (CAN8); “sentence structure” (JVU1)

A074  So degree certificates are a most factor to find job.
      Rule violated: Misuse of “most” as an adverb – it should be modifying an adjective, not a noun.
      Representative comments: “Most what?” (NKU1); “Awkward: word missing? ‘important’ maybe?” (CAN8); “Meaning is unclear – the most important factor in finding a job?” (JVU2)

D005  However, there are still some arguments thinking that the overuse of E-dictionaries might have more disadvantages than advantages for our English learning.
      Rule violated: Collocation, or attributing agency to something that has no agency.
      Representative comments: “Arguments do not think.” (JVU7); “‘Arguments’ cannot ‘think’!” (ATC9); “Anthropomorphism” (CAN8); “we never say arguments thinking what, but I frequently saw Chinese students tend to use this, because they are not used to English” (NKU9)

G006  Nowadays, we are using computers more often, computers are becoming more and more important in our daily life, and some people would say that their life and work can not going on without computers.
      Rule violated: The verb form following a modal (“can”) would normally be the base form of the verb, not the present progressive.
      Representative comments: “can + base form” (CAN3); “We use the original form of a verb after the modal verb ‘can.’” (ATC5); “Grammatically it should be ‘can not go.’” (NKU3); “wrong verb form, though meaning still clear” (JVU6)

As Table 5.1 shows, most of the chunks marked by a majority of both groups are identifiable as violations of grammatical rules of English – usually syntax – as opposed to, for example, informal “violations” which do not actually violate the linguistic structure of the language (the latter of which are often complained about in public discourse, according to Milroy & Milroy, 2012). Not surprisingly, it seems the English language teachers in this study are more apt to recognize uses which actually violate the rules of English syntax as they understand them, as in the case of Leonard and Gilsdorf (1990), whose participants were more likely to be distracted by “basic sentence structure errors” (p. 145) than by usages which contradicted writing handbooks’ advice but were not strictly ungrammatical. (In fact, these types of basic “rule violation” comments are also common in the “differing priorities” groups discussed below.) It is to be expected that many of the most frequently marked chunks would be violations of grammatical rules familiar to both native and non-native expert users of English, since “acceptability” is easily interpreted as a judgment of correctness at the level of grammaticality, especially in this study, which asked for judgments at the word or sentence level. Most participants seemed to be in accord regarding the specific reasons each of these chunks was marked.
The types of comments presented in Table 5.2 are representative, ranging from simple declarations like “wrong verb form” or “sentence structure” (implying a problematic sentence structure) to more detailed suggestions of what would make the chunk or sentence better (such as many of the comments on D066, the first chunk in the table). The high rate of rejection of these rule violation chunks, and the comments relating to grammar, show the priority that both groups placed on rejecting usages that violate rules of English syntax. However, several of the Chinese commenters directly referenced Chineseness, whether this had to do with the Chinese language (“sounds like Chinese,” SIC1) or Chinese students (“Chinese students tend to use this,” NKU9). While references to Chinese are not plentiful for these chunks, the fact that some Chinese commenters mention them suggests that some of these participants, at least, are “on the lookout” for Chinese influence in the essays. This is also suggested by some comments on those chunks which belong to the “differing priorities” section for the Chinese participants, discussed below. Two chunks above stand out for further scrutiny; the first is the “with developing quickly in construction” chunk (C007). While both groups reject the chunk for similar (grammatical) reasons, it seems likely that the Chinese teachers are more familiar with this type of clause in general. Here is one perceptive comment made by NKU8 in his original comment on the AJT:

The first problem is the meaning the phrase wants to convey. Here it might mean that “the infrastructure of the country has been developing quickly”, but this structure is very confusing. It can be changed to “with infrastructural construction developing quickly”. The second problem is even with this correction, the phrase seems to be incoherent with the preceding sentence.
It may be changed into “and with the rapid development in the construction of infrastructure.”

NKU8’s suggestion that the student intends to write about the infrastructure of the country and its rapid development suggests that he is familiar with this kind of political/socio-economic discourse in student writing. It is common in contemporary Chinese academic writing at the university level to begin an essay with a reference to China’s rapid economic development; for example, 407 of the essays in the WECCL include the phrase “with the development,” and 52 include “with the rapid development.” So while the reactions to this chunk may not be substantially different between the two groups, the Chinese teachers may be better able to recognize it as a variation of a common phrase rather than (only) a simple grammatical error. The second case is that of “arguments thinking.” This is the only “rule violation” which may not be primarily a grammatical or syntactic rule violation; participants’ objections to “arguments thinking” can be seen as being of two types: first, as an inappropriate or uncommon collocation (“arguments thinking”), and second, as a semantic problem, an inappropriate “anthropomorphism” which attributes agency to a conceptual entity. Two of the commenters specifically referenced collocation, but the majority included comments along the lines of the first quoted in the table: “arguments cannot think.” Thus, “arguments thinking” is clearly a “violation” felt strongly by both Chinese and non-Chinese commenters, not unlike a sense of grammatical rule violation.

5.3.2.2 Possible typos

The possibility of student writers making typographical errors seems familiar enough to most of the participants that many of them assumed some chunks were not cases of choosing the wrong word or misusing a word, but accidental typing mistakes. The three “possible typo” chunks are shown in Table 5.3 and are described below.
Table 5.3 Possible typos

E002  Nowadays, with the rapid development of our society and economy, competition is becoming more and fierce, while cooperation, as a traditional idea, is losing its position in our society.
      Chinese: 21 (70%)    Non-Chinese: 11 (69%)    Total: 32 (70%)

F067  This is quite a heavy burden to ordinary families and also will cause some students to compare living conditions with each other, which will lead to bad earning environment for a university.
      Chinese: 21 (70%)    Non-Chinese: 10 (63%)    Total: 31 (67%)

C051  According to the official statistics, nearby half of children in rural regions are unable to finish their elementary education and some of them fail to go to school because of lack of money.
      Chinese: 17 (57%)    Non-Chinese: 9 (56%)    Total: 26 (57%)

“More and fierce” (E002) was one of the most widely marked chunks in the study, and many of the comments point to readers interpreting this as a typo or a forgotten word, assuming that the writer intended to write either “more fierce” or “more and more fierce.” (In fact, I later discovered that this may have been a typo on my part, as the original essay from the WECCL corpus reads “more and fiercer,” so the loss of the “r” must have occurred when I cut and pasted the text into my own AJT document.)
As ATC4 wrote, “This writer probably forgot to write the second ‘more’ due to his or her carelessness.” While only one participant specifically called this chunk a typo, eight of the commenters assumed that the writer intended “more and more fierce,” and four assumed he or she meant “more fierce.” While one other commenter (CAN8) speculated as to whether “more” suggested that the writer was using “competition is becoming more” to mean “competition is increasing,” the majority of the remaining comments were either simple declarations of confusion or incorrectness (e.g., CAN1: “awkward phrase”; SIC2: “more cannot be used this way.”) or suggestions to rewrite the sentence with a meaning similar to “more fierce.” Another of the most widely marked “high consensus” chunks, “bad earning environment” (F067), is also a candidate for a likely typo. Of the 31 participants who marked it, 12 commented either that it was likely a typo (SIC2: “Clerical error here I think”; ATC3: “Maybe a typo here?”) or that the writer probably intended to write “learning” (NKU11: “Should be learning”; CAN1: “Letter deletion. Should be ‘learning.’”). Twelve more participants simply stated that “earning” or “earning environment” was confusing or hard to understand (CAN6: “Meaning unclear”; NKU3: “It does not make sense”; NKU10: “Inappropriate use of earning.”). Finally, a few participants suggested changing the word to express a different meaning, or tried to come up with a way to change the sentence to be more related to the university’s financial situation. It seems, though, that the majority of participants interpreted “bad earning environment” as a typo or possible typo. For the “nearby half” chunk (C051), 17 of the 26 participants marking it assumed that this was a spelling mistake and/or that the writer had intended to write “nearly,” or simply mentioned that “nearly” was the preferred word here.
The remaining commenters mentioned word choice or grammatical inaccuracy, but taken together, the vast majority seemed to assume this was a matter of simply confusing two similar words, possibly as the result of a typo. As JVU2 wrote: “Easy mistake to make as [Microsoft] Word does not pick it up – she means ‘nearly’. One could still bypass it and get her meaning as ‘nearly.’”

5.3.2.3 Summary

There were relatively few chunks which were widely marked by both the Chinese and non-Chinese participants in this study; of the nine that were marked by at least 50% of both groups, three were primarily interpreted as typing mistakes, while five were obvious violations of the rules of English grammar. The one exception was the “arguments thinking” chunk, which many participants rejected because “arguments cannot think.” The findings – that the most widely marked chunks both groups agree on are rule violations and typos – are not surprising, as these are perhaps the most “obvious” types of errors or variations from accepted standard written English, the types that tend to be noticed by teachers or other readers. In the case of the apparent typos, although they are sometimes dismissed as simple “carelessness,” the fact that they were among the most marked chunks does suggest that both groups of participants are on the lookout for language that obscures meaning by being grammatically or semantically anomalous in context. Rule violations are among the most widely marked chunks in the “differing priorities” sections below as well, but new and different categories do emerge there.

5.3.3 Differing priorities for the Chinese group

There were a total of twenty-one “priority” chunks for the Chinese group. The majority were also rule violations, though two of these were perceived to be “Chinglish” and will be described in more detail below. The next largest category was collocations marked due to semantic interpretation of lexical items (six), also described below.
The remaining chunks are not described in detail, but can be viewed in Appendix B. These included some which involved participants disagreeing with the writer’s word choice, one perceived typo, and one chunk perceived as “wordy.”

5.3.3.1 Perceived Chinglish

As mentioned above, many of the priority chunks for the Chinese group were related primarily to conventional grammar or syntax rule violations. The apparent reasons for marking these chunks were similar to those in the “high consensus” category; we might e