UBC Theses and Dissertations


Commitment and engagement : the role of intonation in deriving speech acts Heim, Johannes M. 2019



Full Text

COMMITMENT AND ENGAGEMENT: THE ROLE OF INTONATION IN DERIVING SPEECH ACTS

by

Johannes M. Heim

1st State Exam, Eberhard-Karls-Universität Tübingen, 2012

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Linguistics)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

August 2019

© Johannes M. Heim, 2019

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled:

Commitment and Engagement: The Role of Intonation in Deriving Speech Acts

submitted by Johannes M. Heim in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Linguistics

Examining Committee:
Martina Wiltschko, Supervisor
Lisa Matthewson, Supervisory Committee Member
Hotze Rullmann, University Examiner
Roberta Bellerin, University Examiner

Additional Supervisory Committee Members:
Michael Rochemont, Supervisory Committee Member
Carla Hudson Kam, Supervisory Committee Member

Abstract

This dissertation provides the ingredients for solving a long-standing problem in linguistics: what is the relation between the form of an utterance (clause type) and its function (speech act)? I argue that intonation is key to solving this Speech Act Problem. Speech acts need to be decomposed into two conversational variables, which are encoded by the shape of the sentence-final contour in North American English, specifically its excursion and its duration. Speaker Commitment, the first variable, captures the degree to which the speaker publicly commits to the truth of a proposition. Addressee Engagement, the second variable, captures the degree to which the speaker engages the addressee to resolve any issue under negotiation.
My analysis overcomes a traditional divide between accounts that focus on propositional attitudes and accounts that focus on signaling in/completeness with sentence-final intonation. Both functions are incorporated in my analysis. Furthermore, my account can model similarities of speech acts across different clause types.

Chapter 1 introduces the Speech Act Problem and surveys existing solutions. Chapter 2 reviews problems created by analyzing intonation as a modifier of a clause-type-based notion of speech acts and by neglecting the rich variation in form and function of sentence-final intonation. Chapter 3 lays out my own proposal by motivating and explicating the variables of Commitment and Engagement. Chapter 4 provides empirical evidence for an intonational encoding of these variables. Chapter 5 uses the ingredients of this proposal to model several speech acts and their intonational variation. Chapter 6 concludes and points to areas of future research.

Overall, this dissertation provides a new typology of speech acts which is grounded in how the speaker and the addressee relate to the propositional content of an utterance. Empirically, I demonstrate that speech acts express fine-grained attitudes and intentions by the shape of the sentence-final contour. Analytically, I demonstrate that intonation encodes two independent variables that capture the conversational properties of several different speech acts. Theoretically, I demonstrate that speech acts are epiphenomenal at the most basic level: even questions and assertions need to be decomposed into their degrees of Commitment and Engagement.

Lay Summary

Questions and statements seem intuitive labels for sentences. Linguists have often relied on word order to discriminate them. This dissertation points to the limitations of such an approach and deconstructs both questions and statements. For this, I use intonation as a window into the functions of questions and statements independent of their word order.
Sentences are decomposed into two variables which describe the speakers’ attitudes towards what they are saying and their expectations about how the addressee should respond. I refer to the former as Speaker Commitment and to the latter as Addressee Engagement. Changes in either of these variables lead to a change in the shape of the final part of the intonational contour. I demonstrate this with a complex perception experiment. This dissertation therefore invites us to rethink the traditional concepts of questions and statements by involving the discourse participants in their meaning.

Preface

The work presented in this thesis is original research conducted by the author, Johannes M. Heim. It is based on a framework developed in a number of co-authored papers (see below) and builds on some of the key insights about interactional language in Wiltschko (2014; 2017). The focus on intonation and the conception of Commitment and Engagement as variables that decompose speech acts is my own but builds on previous work of Dr. Martina Wiltschko’s ‘Eh Lab’ (http://syntaxofspeechacts.linguistics.ubc.ca).

The design of the experiments reported in Chapters 4 and 6 was developed by Johannes M. Heim with guidance from Drs. Carla Hudson Kam and Molly Babel. Data collection was performed by Johannes M. Heim and a research assistant under his supervision, Gagan Cheema. All data analyses were conducted by Johannes M. Heim, with advice from Drs. Carla Hudson Kam and Alexis Black.

The experimental research undertaken for this dissertation is covered under ethics approval for the project “Canadian ‘eh?’” (H12-01864) granted to the supervisor, Dr. Martina Wiltschko.

Research presented in this dissertation was first introduced or published at the following venues:

The Dialogical Speech Act Model developed in Chapter 3 was first introduced to the audience of a workshop at the 39th DGfS meeting in Saarbrücken, Germany, in March 2017.
It is currently in print in a special issue of Linguistische Arbeiten with the title Prosody in syntactic encoding, edited by G. Kentner & J. Kremers. The main focus of the presentation was to model a syntactic integration of intonation, which is not discussed in the present thesis. An earlier version of this model is published in Wiltschko & Heim (2016).

The experimental results reported in Chapter 4 were first introduced to the audience of the 4th Intonation Workshop at the University of Toronto in February 2019.

The expansion of the Dialogical Speech Act Model as presented in Chapter 5 was first introduced to the audience of the Given! workshop in honor of the late Dr. Michael Rochemont at the University of British Columbia in December 2018.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Symbols
List of Abbreviations
Acknowledgements
Dedication
Chapter 1: Introduction
  1.1 Speech acts are acts of negotiation
  1.2 Introducing the Dialogical Speech Act Model
  1.3 Naming conventions in this thesis
  1.4 Anchoring the proposal
  1.5 Data and methods
  1.6 Roadmap
Chapter 2: Previous Solutions to the Speech Act Problem
  2.1 Conventional mappings of forms and functions
  2.2 The classic Speech Act Problem: direct mapping of clause type and SA
  2.3 Previous solutions to the Speech Act Problem
    2.3.1 Primary speech acts have homogeneous form-function mappings
    2.3.2 Derived speech acts have heterogeneous form-function mappings
    2.3.3 Division of labor between syntax and prosody
  2.4 Previous descriptions of intonation
    2.4.1 Describing intonation: targets vs. configurations
    2.4.2 The role of the final part of the contour
    2.4.3 Summary of the existing descriptions of prosodic form
  2.5 Previous accounts of intonational meaning
    2.5.1 Tune-based meanings
      2.5.1.1 Diversity in the British school
      2.5.1.2 The Holistic approach by Liberman and Sag
      2.5.1.3 Functional Meaning: Gussenhoven’s universal codes
      2.5.1.4 Summary of tune-based approaches to intonational meaning
    2.5.2 Meanings of Tones
      2.5.2.1 Diversity in the American tradition
      2.5.2.2 Pierrehumbert & Hirschberg (1990): coherence and in/complete beliefs
      2.5.2.3 Bartels (1997) and Truckenbrodt (2012): primary speech acts
      2.5.2.4 Pragmatic inferencing in Steedman’s work on intonation
      2.5.2.5 Westera’s account of intonational compliance marking
      2.5.2.6 Summary of tone-based approaches to intonational meaning
  2.6 Speech act ontologies build on epiphenomena
  2.7 Conclusion
Chapter 3: Decomposing Speech Acts by Commitment and Engagement
  3.1 Acts of negotiation
    3.1.1 Revising the table analogy to reflect the dialogical character of speech acts
    3.1.2 Epistemic development as a key to the use conventions of four speech acts
    3.1.3 Modelling use conventions and conversational effects with the table analogy
  3.2 Negotiating with Commitment and Engagement
    3.2.1 Motivating Commitment and Engagement
      3.2.1.1 Commitment reflects the speaker’s propositional attitude
      3.2.1.2 Engagement reflects the speaker’s intended effects
    3.2.2 Formalizing Commitment and Engagement
  3.3 Commitment and Engagement compose the Dialogical Speech Act Model
  3.4 Comparison with earlier speech act models
  3.5 Conclusion
Chapter 4: Encoding Commitment & Engagement
  4.1 Prosodic correlates of Commitment and Engagement in declaratives
  4.2 Previous findings on meaningful prosodic variation
    4.2.1 Variation within the temporal dimension: pitch height and excursion
    4.2.2 Duration
    4.2.3 (In)dependence of duration and pitch
    4.2.4 Hypothesis and Predictions
  4.3 Methods for investigating question 1 and question 2
    4.3.1 Participants
    4.3.2 Materials
    4.3.3 Procedure
  4.4 Q1: Correlations of duration and excursion with Commitment
    4.4.1 Instructions
    4.4.2 Results
    4.4.3 Discussion
  4.5 Q2: Correlations of duration and excursion with Engagement
    4.5.1 Instructions
    4.5.2 Results
    4.5.3 Discussion
    4.5.4 Prosody and SAs
  4.6 What about interrogatives?
  4.7 Conclusion
Chapter 5: Expanding the Dialogical Speech Act Model
  5.1 Wh-interrogatives and (wh)-echoes
    5.1.1 Falling wh-interrogatives
    5.1.2 Rising (wh)-echoes
    5.1.3 Commitment & Engagement in wh-interrogatives and (wh)-echoes
  5.2 Choices in fall-rising declaratives and disjunctive interrogatives
    5.2.1 Disjunctive interrogatives
    5.2.2 (Rise-)fall-rising declaratives
    5.2.3 Commitment & Engagement in disjunctive interrogatives and fall-rising declaratives
  5.3 Modified rise and plateau contour
  5.4 Prosodic variation
  5.5 Conclusion
Chapter 6: Conclusion
  6.1 Summary
  6.2 Future research
    6.2.1 Additional cues that contribute to the encoding of speech acts
    6.2.2 Relating the role of intonation in common ground management and content marking
    6.2.3 The relation of sentence-final intonation to earlier parts of the contour
References

List of Tables

Table 1.1: Naming conventions for the SAs discussed in this dissertation
Table 2.1: Direct mapping of clause type to SA
Table 2.2: Contextual model of intonational meaning per Bartels (1997) [notations adapted]
Table 2.3: Correspondence of nuclear tunes (British tradition) and tone combinations (AM)
Table 2.4: Tune meanings in Palmer (1922)
Table 2.5: Marked and unmarked tones in Halliday & Matthiessen (2004)
Table 2.6: Attitudinal meaning in O’Connor & Arnold (1973)
Table 2.7: Intonational Codes in Gussenhoven (2004)
Table 2.8: Tune meanings in Gussenhoven (2002)
Table 2.9: Tune meanings in Pike (1945)
Table 2.10: Intonational meaning in Pierrehumbert & Hirschberg (1990)
Table 2.11: Overview of tune- and tone-meaning of Pierrehumbert & Hirschberg (1990) in Büring (2016)
Table 2.12: Intonational meaning in Truckenbrodt (2012) and Bartels (1997)
Table 2.13: Intonational meaning as a result of combining Bartels (1997) and Truckenbrodt (2012)
Table 2.14: Intonational meaning in Steedman (2000; 2008; 2014)
Table 2.15: Tonal morphemes in Steedman (2000; 2008; 2014)
Table 2.16: Tonal morphemes in Westera (2017)
Table 2.17: I- and A-Maxims in Westera (2017)
Table 2.18: Semantic approaches to variable, polarity, and alternative questions
Table 2.19: Clause typing based on semantics in Portner (2004)
Table 3.1: Epistemic development in the context of falling declaratives
Table 3.2: Epistemic development in the context of rising interrogatives
Table 3.3: Summary of conventions of use
Table 3.4: Overview of conversational effects
Table 3.5: Diagnosing a knowledge asymmetry as a basis of Commitment
Table 3.6: Diagnosing a knowledge asymmetry as a basis of Commitment
Table 3.7: Commitment and CoA according to Beyssade & Marandin (2007)
Table 3.8: CoA and Commitment of High-rising declaratives
Table 3.9: Summary of Engagement properties
Table 3.10: Conversational properties of falling, high-rising, rising declaratives and rising interrogatives
Table 3.11: Configurations of Commitment and Engagement (preliminary)
Table 3.12: Configurations of Commitment and Engagement (still preliminary)
Table 3.13: Conversational effects of Commitment and Engagement
Table 3.14: Degrees and effects of Commitment and Engagement
Table 3.15: Ingredients of different conversational models
Table 3.16: Issues and moves in the present account of primary and derived SAs
Table 4.1: Mean duration and excursion of rises
Table 4.2: Distribution of stimuli (by type, excursion and duration)
Table 4.3: Q1 model of speaker confidence by pitch and excursion for critical items
Table 4.4: Q1 model of speaker confidence by pitch and excursion for controls
Table 4.5: Q2 model of response expectation ratings by pitch and excursion for critical items
Table 4.6: Q2 model of response expectation ratings by pitch and excursion for controls
Table 4.7: Significant main effects in Q1 & Q2

List of Figures

Figure 1.1: Interaction of the Clause Type Convention and Fall/Rise Convention
Figure 1.2: Syntactico-prosodic division of labor
Figure 1.3: Remapping conversational effects (simplified)
Figure 1.4: Pragmatico-prosodic division of labor
Figure 1.5: Prosodic variation between and within SAs
Figure 1.6: Syntactic integration of Commitment and Engagement (Heim & Wiltschko to appear)
Figure 1.7: Architecture of universal categories (Wiltschko 2014)
Figure 1.8: Extended universal spine
Figure 1.9: Presentation and acceptance phase of a proposition in Wiltschko & Heim (2016)
Figure 1.10: Epistemic development of a conversation in the context of a falling declarative
Figure 2.1: Conventionalized mappings of SAs
Figure 2.2: Negotiating the cg in Farkas & Bruce (2010)
Figure 2.3: Splitting the negotiation table in Ettinger & Malamud (2013)
Figure 2.4: Projection at (almost) every level in Malamud & Stephenson (2014)
Figure 2.5: Different frameworks on a tonal configuration-to-target continuum
Figure 2.6: Tones and contours in Trager & Smith (1951)
Figure 2.7: Possible tonal configurations in Pierrehumbert (1980) [notation simplified by JH]
Figure 2.8: Form-function mappings according to the Clause Type Convention and the Fall/Rise Convention
Figure 2.9: Form and functions of intonation according to Bauman & Grice (2007: 14)
Figure 2.10: Categorizing intonational meaning
Figure 2.11: Contradiction contour (in black) with optional contrast (in grey)
Figure 2.12: Tilde contour
Figure 2.13: (a) Hat contour (with optional rise) and (b) surprise/redundancy contour
Figure 3.1: Interaction of the Clause Type Convention and the Fall/Rise Convention
Figure 3.2: Revising the table analogy (preliminary)
Figure 3.3: Revising the table analogy (still preliminary)
Figure 3.4: Building a conversational model (final)
Figure 3.5: Epistemic development in the context of high-rising declaratives
Figure 3.6: Epistemic development in the context of rising declaratives
Figure 3.7: Conversational effects of falling declaratives
Figure 3.8: Conversational effects of rising interrogatives
Figure 3.9: Conversational effects of a high-rising declarative
Figure 3.10: Conversational effects of a rising declarative
Figure 3.11: Commitment and Engagement at the table of negotiation
Figure 4.1: Encoding Commitment and Engagement
Figure 4.2: Encoding of Commitment and Engagement for rises (left) and falls (right)
Figure 4.3: Manipulations of the critical stimuli
Figure 4.4: Sample item with low and high excursion for short (left), medium (mid), and long (right) duration
Figure 4.5: Mean response ratings for speaker confidence for critical items (error bars represent ±SE)
Figure 4.6: Mean response ratings for speaker confidence for controls (error bars represent ±SE)
Figure 4.7: Mean response ratings for response expectation for critical items (error bars represent ±SE)
Figure 4.8: Mean response ratings for response expectation for controls (error bars represent ±SE)
Figure 4.9: Encoding of Commitment and Engagement for rises (left) and falls (right)
Figure 4.10: Duration of high-rising declaratives, rising declaratives, and rising interrogatives
Figure 4.11: Excursion of rising declaratives, high-rising declaratives, and rising interrogatives
Figure 4.12: Encoding of Commitment and Engagement (preliminary)
Figure 5.1: Conventionalization and variation of Commitment and Engagement
Figure 5.2: Epistemic development in the context of falling wh-interrogatives
Figure 5.3: Epistemic development in the context of wh-echoes
Figure 5.4: Epistemic development in the context of non-wh-echoes
Figure 5.5: Epistemic development in the context of non-contrastive echoes
Figure 5.6: Epistemic development in the context of rising declaratives
Figure 5.7: Epistemic development in the context of wh-echoes following an interrogative
Figure 5.8: Epistemic development in the context of non-wh-echoes following an interrogative
Figure 5.9: Negotiating missing information with falling wh-interrogatives
Figure 5.10: Negotiating missing information with echoes
Figure 5.11: Growing the Dialogical SA Model to incorporate wh-interrogatives and echoes
Figure 5.12: Epistemic development in the context of disjunctive interrogatives
Figure 5.13: Epistemic development in the context of fall-rising declaratives
Figure 5.14: Conversational effects of disjunctive interrogatives
Figure 5.15: Conversational effects of fall-rising declaratives
Figure 5.16: Growing the Dialogical SA Model to incorporate falling interrogatives and fall-rising declaratives
Figure 5.17: Epistemic development in the context of a modified rise inside a falling declarative
Figure 5.18: Epistemic development in the context of a continuation in alternative questions
Figure 5.19: Epistemic development in the context of a continuation in alternative questions
Figure 5.20: Dialogical SA Model including conversational effects
Figure 5.21: Dialogical model including variation in form and function
Figure 6.1: From a form-based model of SAs (left) to a function-based model of SAs (right)
Figure 6.2: The revised table analogy including aspects of presentation and acceptance
Figure 6.3: Syntactic integration of intonation encoding Commitment and Engagement
Figure 6.4: Elicited intonational profiles of rising and falling declaratives

List of Symbols

a rise with high pitch excursion
a rise with low pitch excursion
a fall
Δ  the difference between two measures
an acceptable sentence within a given context
an unacceptable sentence within a given context
{…}  a context provided to illustrate the natural use of a given sentence

List of Abbreviations

AM   Autosegmental-metrical
cg   Common Ground
CI   Confidence Interval
CoA  Call on Addressee
DEC  Declarative
INT  Interrogative
M    Mean
p    proposition
SA   Speech act
SAI  Subject-auxiliary-inversion
SD   Standard deviation
SE   Standard error
SFI  Sentence-final intonation
SSB  Set of shared beliefs
SSP  Set of salient propositions
x    a variable of any semantic type
XP   any phrase

Acknowledgements

Let me tell you a secret: Reading acknowledgements was among the things that kept me going. Acknowledgements put a face to a thesis. They share a story of how it came about. They give some insight into the losses and gains accrued in the process of writing a dissertation. If your motivation for reading these acknowledgements is the same, I’d like to tell you: keep going – it’s worth it!

Three other things kept me going: my faith, my family, and my supervisor. I mention them first and then continue with the people inside and outside of academia who contributed to this thesis.

I begin with my faith in Jesus: it grounds me, it inspires me, and it fulfills me like nothing else ever will. It was tested during this time for sure. But writing this thesis has shown me that fulfillment doesn’t come with accomplishments.
It comes with how we fill our time in their pursuit.
My family has grown with the progress of my work. I started the PhD program only five days after marrying my wife. When I completed the revisions, my daughter had just started kindergarten and my son was still turning nights into days. My family, too, was tested during this time. My wife sacrificed a lot. There were countless times when she kept our family going while I was away at conferences or completely immersed in my writing. I am infinitely grateful for everything she did to support me. I feel immensely blessed to have you in my life, Sara, Leila and Nehemia!
I am deeply indebted to my supervisor, Martina Wiltschko, who has been a source of support and inspiration well beyond the realm of this thesis. When I asked a fellow student to describe her mentoring style, she said that she would perfectly tailor it to her students’ needs. I couldn’t agree more: Martina, you were the Doktormutter I needed at every stage of this journey. You knew when to push me and when to go easy on me. You believed in my work even when I didn’t. Thank you!
I want to acknowledge the invaluable support I received from my advisory committee, Carla Hudson Kam, Lisa Matthewson, and Michael Rochemont. You all were willing to work with me on this ambitious project and to help me contain the problem, refine the proposal, and communicate the potential of my solution. I am grateful for Carla’s eye for coherence and logic in presenting experimental data, Lisa’s detailed comments that have shaped much of the conception of my proposal, and Michael’s thoughtful suggestions that will accompany me beyond the submission of this thesis. Michael’s passing in Summer 2018 was by far my greatest loss during these years.
I would like to thank Hotze Rullmann and Roberta Bellerin for serving as university examiners and Kenneth Reader for chairing my defense.
You all contributed to a fair and engaged discussion of my thesis and provided me with many thoughtful questions. I also want to thank Nancy Hedberg for serving as an external examiner, for her detailed review and for her wonderful feedback.
I am thankful for the training and support from various faculty at UBC’s Department of Linguistics, including Henry Davis, Bryan Gick, Rose-Marie Déchaine, Kathleen Currie Hall, and Murray Schellenberg. I also want to name Laurel Brinton from the English Department for her support in my teaching. Special thanks go to Gunnar Hanson and Molly Babel for their support as grad advisors. Molly also contributed to this thesis with helpful advice on experimental design and by providing me with the necessary equipment – my experiments would have been impossible to run without her support. The same applies to Gagan Cheema, who diligently supported me in conducting them. I also thank Claudia, Edna, Vicky and Will for making my concerns their own.
I am grateful for the stimulating conversations related to this project with scholars outside of UBC. I want to explicitly name Alex D’Arcy, Peter Culicover, Emily Elfner, Julia Hirschberg, Wolfram Hinzen, Manfred Krifka, Sophia Malamud, Hubert Truckenbrodt, Michael Wagner, Matthijs Westera, and Lavi Wolf for their helpful feedback. I also want to express my gratitude to Susanne Winkler, who inspired my earliest research in linguistics and set me on my path to make it a career. I would like to thank my fellow graduate students, including those from Martina’s eh lab, Thesis Anonymous, and my locus support group - especially Adriana, Anne, Emily, Elise, Hermann, Megan, Sonja, Oksana, Valentina and Yifang. You all gave this department a special place. I also want to acknowledge the importance of Trinity Central during the first two years in Canada. This church helped us settle in Vancouver and provided a wonderful community in which we learned so much.
I am particularly thankful for the friendships that survived our departure. Vancouver would have been a lonely place without Alex & Nathan, Benny, Gelareh, Andy & Sue. Special thanks go to our landlords during the last stretch of our time in Vancouver, Gita and Richard. You have become family not only to our children, but also to Sara and me. You gave us a home that became more than a roof over our heads. Your support has been invaluable! I want to conclude these acknowledgements by giving thanks to my parents and my in-laws. Your prayers, thoughts, gifts, visits, and homes have been our safety-net through all these years.

Dedication
To my Maker.

Chapter 1: Introduction
1.1 Speech acts are acts of negotiation
What’s the point of talking to each other? It’s certainly not only to exchange information. Oftentimes, a conversation is more like a negotiation. This negotiation comes with an agenda: namely to enlarge our set of shared beliefs. This is commonly referred to as the Common Ground (Stalnaker 1978; cg). To realize this agenda, we not only communicate informative content, we also communicate how we relate to this content. Human conversation is interactional and dynamic. This is true for the most basic phenomena in human conversations: questions and assertions. They are basic because with them, the course of the negotiation is straightforward. We have economized the conversational protocol by defaulting to accepting assertions and resolving questions. I refer to them as primary speech acts (SAs). Other SAs, which I refer to as derived, have a more complex protocol. Their function is derived from a more complex mapping of form and function. Due to their complex conversational effects, derived speech acts require more negotiation effort.
In this thesis, I develop an account of the conversational effects of questions, assertions, and SAs that fall in between them.
A key function for negotiating shared beliefs in North-American English is ascribed to sentence-final intonation (SFI). SFI serves to communicate the speaker’s attitude toward the content of an utterance and their intention of how the conversation should continue. Negotiating shared beliefs is nuanced, and so my account of conversational effects needs to span a large set of scenarios. This thesis incorporates nine different functions of SFI for negotiating shared beliefs. These functions can all be characterized by two parameters, Speaker Commitment and Addressee Engagement, which in turn have prosodic correlates. For rising SFI, the degree of a public commitment of the speaker to the truth of a proposition correlates with the duration of the SFI; the degree of the Engagement of the addressee to resolve this issue correlates with the pitch excursion. Under this proposal, every contribution to the negotiation is dialogical: The Dialogical SA Model presented here includes a speaker- and an addressee-message in every turn.
This chapter begins with a brief overview of the complexity of the conversational effects of different types of questions and assertions and how I propose to revise previous treatments of their effects. In Section 1.2, I present all the essential ingredients of this dissertation. In Section 1.3, I anchor my proposal in a framework of analysis developed over recent years as an extension of the Universal Spine Hypothesis (Wiltschko 2014). In Section 1.4, I review several models that attempt to capture similar conversational effects. In Section 1.5, I present the methodology employed. In Section 1.6, I explain how the subsequent chapters unfold my proposal.

1.2 Introducing the Dialogical Speech Act Model
Questions and assertions are primary means of human conversation: they are universally attested (Sadock & Zwicky 1985), largely grammaticalized (e.g. Huddleston 1984), and uncontroversially accepted as basic ingredients of human conversation (e.g.
Stalnaker 1978; Roberts 1996; Ginzburg 1996). Our understanding of how conversations unfold rests on our understanding of these primary SAs. In English, their encoding is conventionally characterized by clause types and sentence-final intonation (SFI).

(1) Clause Type Convention:
    declarative (DEC) = assertion
    interrogative (INT) = question
(2) Fall/Rise Convention:
    falling intonation () = assertion
    rising intonation () = question

Despite numerous counter-examples, the Clause Type Convention and Fall/Rise Convention are ubiquitous in discussions of questions and assertions (see e.g. Stalnaker 1978; Huddleston 1984; Pierrehumbert & Hirschberg 1990; Truckenbrodt 2012). Depending on how these conventions combine or interact, the interpretation changes. I refer to any deviation from one of the conventions in (1) and (2) as the Speech Act (SA) Problem. The SA Problem arises whenever we have a heterogeneous configuration of prosody and clause type, i.e. when prosodic form and clause type map onto different SAs according to the Clause Type Convention and the Fall/Rise Convention. As an illustration, consider the conversational effects of the following utterances and their contexts of use. The examples in (3) show how the Clause Type Convention and Fall/Rise Convention hold for falling declaratives (DEC) in (3a), and for rising interrogatives (INT) in (3b). This is a homogeneous configuration of SFI and clause type. The examples in (4) show that mixing these conventions – which yields a heterogeneous configuration – results in different effects.

(3) a. It is raining {after a glance out of the window}
    b. Is it raining {after the addressee reported that he checked the weather report}
(4) a. It is raining {after the entrance of a wet coworker into a windowless office}
    b. Is it raining {asking the same question again after no response}

Figure 1.1 summarizes the possible configurations arising from both clause-type and intonation and their resulting conversational effects. Homogeneous configurations lead to a question/assertion interpretation; heterogeneous configurations lead to interpretations that do not fit this distinction.

Figure 1.1: Interaction of the Clause Type Convention and Fall/Rise Convention

In contrast to primary SAs with homogeneous conventions (3), I claim that SAs arising from heterogeneous conventions (4) are decomposable, and hence derived. I assume a correlation between the nature of the encoding and the complexity of the interpretation. Derived SAs are typically described as exemplifying properties of both questions and assertions (e.g. Gunlogson 2003 for (4a) and Bartels 1997 for (4b)). The standard analysis associates these effects with a complementary division of labor between syntax and prosody (e.g. Farkas & Roelofsen 2017), schematically represented in Figure 1.2. The effect of SFI is added to the effect of clause type scoping over the sentence radical, which carries the information content (Stenius 1967; Lewis 1972).

Figure 1.2: Syntactico-prosodic division of labor

In this thesis, I argue that any SA account working with the ingredients of conventions in (1) and (2) and a complementary division of labor between clause type and SFI faces four problems.
• Problem I: Clause types lack unambiguous markers in English.
• Problem II: A Fall/Rise distinction ignores meaningful variation in SFI.
• Problem III: A question/assertion distinction is only one possible function of SFI.
• Problem IV: Primary and derived SAs are epiphenomenal.

Problem I is inherent to the notion of clause type. While English has several candidates for marking clause type, none of them does so unambiguously: inversion, question words, and even rising intonation are also present in constructions not associated with interrogativity.
This is problematic because the notion of interrogativity exists to capture those grammatical forms encoding questions. Problem II simply states that a fall/rise distinction is not enough. Regardless of the phonological framework, this inventory needs to be expanded by at least one further contour, a modified rise (). This modified rise has a smaller excursion than the rise found in polar questions and can be almost level in so-called plateau contours (Halliday 1967). Problem III arises in light of other functions associated with SFI. One uncontroversial function which is difficult to reconcile with the question/assertion distinction is that between complete and incomplete turns. Consider the use of the turn-medial rise in the declarative in (5a) and the interrogative in (5b), which contrasts with the turn-final fall in both utterances.

(5) a. It is raining so I will take a bus home
    b. Will you bike or take a bus

One possible objection to Problem III may be that a questioning function could map onto the rise and a continuation function onto the modified rise. While this mapping diversification is correct, it does not resolve Problem III entirely. The modified rise can also be found in so-called high-rise questions, such as example (6), where it does not signal continuation, but an uncertainty about the relevance (Hirschberg & Ward 1995; the example is from Pierrehumbert & Hirschberg 1990).

(6) My name is Mark Liberman {after walking up to a receptionist}

Finally, Problem IV results from the sum of the previous problems: several conversational effects, which also incorporate propositional meaning, are relevant to the SA Problem but cannot be modeled on the basis of the Clause Type Convention and the Fall/Rise Convention. The example in (6) is relevant here as well. In the literature, this type of utterance is referred to as a high-rise question (Hirschberg & Ward 1995). This suggests that it is interpreted as a question despite its declarative word order.
Semantically, however, it cannot target the proposition or a propositional choice, since we can safely assume that the speaker knows his name is indeed Mark Liberman in this context. The other half of the name of the phenomenon (high-rise) refers to its intonational properties, suggesting that this is not the typical question rise. So, neither by form nor by function is the example in (6) a primary SA. Consider further examples to see how this problem extends to other types of questions. Both examples in (7) contain a variable. The wh-question in (7a) exhibits a sentence-initial variable and a fall; the echo-question in (7b) exhibits a sentence-final variable and a rise.

(7) a. Who won the award
    b. He won WHAT

Both questions in (7) have (at least) one form associated with an assertive interpretation by convention. They also both contain a question word. Yet, neither example in (7) can be captured by assuming both assertive and questioning effects, in the same way it has been suggested for rising declaratives (Gunlogson 2003). In comparison with a rising declarative, as in (4a), and a high-rise question, as in (6), the issue then is: to what extent can the examples in (7) be characterized as questions? Is it the wh-pronoun that is responsible for their questionhood?
I propose that a single conceptual shift will resolve all four problems: the central parameters for interpreting SAs cannot be conceived of as binary concepts. They need to allow for some middle ground. This is possible if we (momentarily) depart from the question of encoding primary and derived SAs and focus on their conventions of use instead. These use conventions can be described by two variables which capture the speaker’s attitude and their expectation of the addressee’s response. Both variables relate to how an issue is relevant to the development of the cg:
• Speaker Commitment: Degree to which the speaker publicly commits to the issue currently negotiated for entering the cg.
• Addressee Engagement: Degree to which the speaker engages the addressee to resolve the issue currently negotiated for entering the cg.
Due to the non-binary conception of Commitment and Engagement, the resulting model of conversation makes room for some middle ground in each dimension, namely the middle ground between the originally binary distinctions of Rise/Fall and Dec/Int. Consequently, conversational effects are no longer defined by a binary distinction either. Questions and assertions are merely the combination of the extremes on a range of conversational effects, with a middle ground reserved for negotiating the cg.

Figure 1.3: Remapping conversational effects (simplified)

Figure 1.3 also captures how these conversational variables find their expression in prosodic form. The degree of Speaker Commitment correlates with the duration of SFI. The degree of Addressee Engagement correlates with the pitch excursion of SFI. Duration and excursion determine the temporal and amplitudinal scaling properties of the shape of the SFI, which in turn encode the speaker’s propositional attitude and their intention toward the addressee. Naturally, we cannot inspect a given contour and determine the degree of Commitment and Engagement of its host clause. We can only assume that between two utterances – everything else being equal – an increase of Commitment or Engagement corresponds to an increase in the duration or excursion of the SFI. The resulting division of labor is therefore independent of the notion of clause type (see Figure 1.4). Commitment and Engagement are two units of language scoping over the proposition expressed by the sentence radical. They are in a hierarchical relationship because the projected Engagement relates the presented Commitment. To include the necessary middle ground, I assume that any degree of Commitment or Engagement that lies in between their endpoints (i.e.
Full or No Commitment/Engagement) is unmarked by the speaker and therefore automatically leads to a negotiation of a propositional issue or the SA itself.

Figure 1.4: Pragmatico-prosodic division of labor

Where does this leave us with the above noted conventions about the interpretation of prosodic and syntactic form? After all, they were assumed to be means of encoding primary SAs. I propose that SAs can be conceived independently of clause types considering the ambiguity of morphosyntactic forms for encoding questionhood. I turn to semantics instead, where the Clause Type Convention is recast in terms of semantic contents whose denotations correspond to closed propositions for Full Commitment and propositional choices for No Commitment. Anything in between can be subsumed under the notion of faulty propositions, which encompasses open propositions, incomplete propositions, and propositions lacking a truth value. For the Fall/Rise Convention, I propose that what falls between a rise and a fall is best captured as a modified rise.
With this shift, I can describe the conversational effects of a wide range of questions and assertions and capture other uses of SFI, including the modified rise in (5) and the high rise in (6). Moreover, it allows me to incorporate attitudinal meaning, which is often linked to prosodic variation (e.g. Bolinger 1964). We thereby need to distinguish two aspects of attitudinal meaning: conversational and emotional aspects. The amount of prosodic variation is larger for the former than for the latter. Conversational aspects, such as keeping the addressee to the point of a question (Schubiger 1958), can reduce the rise in the polar interrogative in (8) to a fall in (8). Emotional aspects, such as expressing surprise (Ward & Hirschberg 1992), can cancel the modification of the final rise in a rise-fall-rise (RFR) contour – effectively changing it from a modified rise in (9) to a rise in (10).

(8) a. It is raining {after the entrance of a wet coworker in a windowless office}
    b. Is it raining {after asking the same question twice before without a response}
(9) A: Have you ever been West of the Mississippi?
    B: I've been to Missouri... ()     (Ward & Hirschberg 1985)
(10) A: I'd like you here tomorrow morning at eleven.
     B: Eleven in the morning?! ()    (Ward & Hirschberg 1986)

Attitudinal meaning is therefore also incorporated in the notion of Engagement. The default degree of engaging the addressee for resolving an issue can be modified for rhetorical purposes or to express an emotion. A change in Engagement results in a change in the pitch excursion. Correspondingly, a change in pitch duration correlates with a change in Commitment. The different degrees of Engagement and Commitment therefore not only serve to categorize different SAs; they are also at the root of any prosodic variation within each of these SAs. This combination of prosodic variation between and within SAs is represented in Figure 1.5. That same figure also shows how my proposal can account for the conversational effects of the full range of phenomena discussed above: rising polar interrogatives (INT), rising declaratives (DEC), echoes (XP), (rise)-fall-rising declaratives (DEC), the modified rise (XP), high-rise declaratives (DEC), disjunctive interrogatives (INT), falling wh-interrogatives (WH-INT), and falling declaratives (DEC). A choice marker hereby encompasses both SAI and the fall-rising contour since both encode the presence of (at least) two propositional alternatives (see Chapter 5).

Figure 1.5: Prosodic variation between and within SAs

Figure 1.5 illustrates how my proposal resolves each of the four problems of proposals based on the Clause Type Convention, the Fall/Rise Convention and a complementary division of labor between clause type and prosody.
Problem I (clause types lack unambiguous markers in English) is solved by disentangling the relationship of clause types and SAs by decomposing the latter into their clause-type-independent degrees of Commitment and Engagement. Since interrogativity – a purely morphosyntactic notion – cannot be motivated independently on grounds of unambiguous encoding, this is a welcome result: it allows us to reconceptualize SAs across different word order configurations. Problem II (a Fall/Rise distinction ignores meaningful variation in SFI) is solved by breaking up the distinction between rises and falls. As we will see in Subsections 2.4 and 4.1, not a single description of intonational form can be faithfully reduced to this binary distinction. We need to assume at least one other type of SFI, a modified rise, in addition to a considerable amount of prosodic variation. Problem III (a question/assertion distinction is only one possible function of SFI) is solved by incorporating attitudinal meaning and turn-taking into a proposition-based account of intonational meaning. Signalling continuation is just a way of putting the negotiation on hold, and emotional or rhetorical attitudes can alter the degrees of Engagement and Commitment. Finally, Problem IV (primary and derived SAs are epiphenomenal) is solved by shifting from a form-based to a function-based account of the conversational effects of different questions and assertions. This allows us to model heterogeneous form pairings whose effects are not the sum of assertive and questioning properties. Consequently, we can recategorize different types of questions and assertions, including SAs with wh-pronouns.

1.3 Naming conventions in this thesis
The terminology used in the literature dealing with the SA Problem is problematic: more often than not, the terms capturing different phenomena draw on more than one domain.
For example, the term for so-called high-rise questions draws on a prosodic detail (a rise starting with a high tone) and a pragmatic interpretation (i.e. question). Rising declaratives are sometimes also referred to as queclaratives or declarative questions, which is another case in point: the form (declarative) is mixed with the function (question). To streamline the discussion in this dissertation and to avoid an impact of terminology on the analysis, I will restrict the naming of SAs to a combination of forms (clause-type and prosodic pattern) rather than function (SA type). Table 1.1 summarizes the phenomena I discuss here with my terminological conventions and those found elsewhere:

Forms | Terms used here | Alternative terms
DEC | falling declarative | Canonical declarative
DEC | fall-rising declarative | (rise) fall-rise contour, incredulity contour, surprise contour
DEC | high-rising declarative | High-rise questions, uptalk, upspeak, high-rising terminals
DEC | rising declarative | Declarative questions, queclaratives, biased questions
XP | modified rise | List intonation, plateau contour, level intonation
XP | (wh-) echo | Echo questions, in-situ interrogatives
WH-INT | falling wh-interrogative | Variable questions, information questions, wh-questions, wh-interrogatives
INT | disjunctive interrogative | Alternative questions, rhetorical questions
INT | rising interrogative | Polar interrogatives, polarity questions

Table 1.1: Naming conventions for the SAs discussed in this dissertation

1.4 Anchoring the proposal
My proposal builds on Wiltschko’s (2014) Universal Spine Hypothesis (USH) and its extension to include SA modifiers across different languages and dialects (Heim et al. 2016; Wiltschko & Heim 2016; Thoma 2017; Wiltschko 2017; Wiltschko forthcoming). The extended USH allows for a formal analysis of SA modifiers by integrating them into the syntactic spine dominating CP.
Since this thesis is built on the related idea that intonation plays a key role in deriving the conversational effects of SAs, I adopt some of the assumptions of this framework. Within this framework, the division of labor between Commitment and Engagement (as illustrated in Figure 1.4) can be modelled syntactically as in Figure 1.6 (cf. Heim & Wiltschko to appear).

Figure 1.6: Syntactic integration of Commitment and Engagement (Heim & Wiltschko to appear)

A syntactic integration of SFI is possible within the architecture of the USH and allows us to model how syntactic and prosodic information interact. However, it is not a necessary assumption for the purposes of this dissertation. An advantage of anchoring my proposal within the framework of USH is its conceptual architecture. In what follows, I provide a brief overview of its core assumptions. Wiltschko (2014) assumes that every functional category relates two (sometimes abstract) arguments to each other. This relation is mediated by an unvalued coincidence feature (as in Figure 1.7). If the coincidence feature is valued positively, the arguments are asserted to coincide; if it is valued negatively, the arguments are asserted not to coincide.

Figure 1.7: Architecture of universal categories (Wiltschko 2014)

This architecture is at the core of all clausal projections, in both the verbal and the nominal domain. Each projection has a dedicated function with content that varies cross-linguistically. Within the propositional structure, these functions include classification, the introduction of perspective, and the anchoring of any event or entity. To account for SA management, we can expand this architecture by two further categories: one that serves to ground the proposition, and one that captures its response properties.
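The coincidence architecture described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not part of the USH or of any published formalization: the names `Coin`, `Category`, `value`, and `asserts_coincidence` are hypothetical, and the two arguments are reduced to plain labels. The sketch shows a category that starts out with an unvalued coincidence feature and asserts coincidence or non-coincidence only once the feature has been valued.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class Coin(Enum):
    """States of the coincidence feature (hypothetical encoding)."""
    UNVALUED = "u"
    PLUS = "+"
    MINUS = "-"


@dataclass(frozen=True)
class Category:
    """A functional category relating two (possibly abstract) arguments."""
    name: str
    arg1: str
    arg2: str
    coin: Coin = Coin.UNVALUED  # the feature enters the structure unvalued

    def value(self, coin: Coin) -> "Category":
        """Return a copy of this category with the feature valued."""
        return replace(self, coin=coin)

    def asserts_coincidence(self) -> Optional[bool]:
        """True: arguments asserted to coincide; False: asserted not to
        coincide; None: feature unvalued, so nothing is asserted."""
        if self.coin is Coin.PLUS:
            return True
        if self.coin is Coin.MINUS:
            return False
        return None


# Placeholder arguments; the labels carry no theoretical weight.
category = Category("illustrative-layer", "argument-A", "argument-B")
positive = category.value(Coin.PLUS)
negative = category.value(Coin.MINUS)
```

Returning `None` for the unvalued state, rather than forcing a Boolean, is a design choice of this sketch: it keeps the three configurations distinct instead of collapsing them into a binary opposition.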
Figure 1.8: Extended universal spine (layer labels: linking, anchoring, point-of-view, classification, grounding, responding)

What is important for our purposes here is that each configuration predicted by the underlying architecture (Figure 1.7) – i.e. positive, negative, and unvalued – is attested in the Grounding and Responding layer. In both layers, there is a discourse particle that overtly encodes each of the three configurations (see Wiltschko forthcoming). My dissertation reports that these configurations can also be instantiated via intonation. That is, intonational tunes, just like particles, can value the coincidence feature (Heim et al. 2016). For Commitment, which is hosted in the Grounding layer, the positive valuation [+coin] asserts that what is being said coincides with the speaker’s ground. Hence, this configuration is compatible with a context of use in which the speaker fully commits to the truth of the proposition. In contrast, a negative valuation ([-coin]) marks that the speaker cannot commit to the proposition because the proposition and the speaker ground are asserted not to coincide. The same logic applies to Engagement. Here [+coin] encodes that what is being said is placed into the response set, hence it requires full Engagement of the Addressee. In contrast, [-coin] encodes that what is said is not placed into the response set and hence does not require Engagement to resolve the issue under negotiation.
Specific to the categories of Grounding and Responding (see Figure 1.6), the coincidence feature can also remain unvalued. I argue that there is a specific form which instantiates this configuration. This is a crucial assumption in the overall argument of my thesis: it expands the otherwise binary architecture by a third possibility. In categories of the propositional domain, unvalued features lead to ungrammaticality since truth conditions cannot remain unassigned. For categories of grounding, however, unvalued coincidence features are interpretable.
That is, if the coincidence feature remains unvalued, the interpretation is that the speaker is unsure whether they should or should not commit to the proposition, and whether they should or should not engage the addressee. It is possible to accept a proposition for the purpose of a conversation; hence interlocutors must have means to leave their attitude unmarked. While [+/- coin] clearly marks the degree of Commitment or Engagement, [ucoin], the unvalued feature, corresponds to the decision to leave their attitude (Commitment) or intention (Engagement) unmarked. Commitment and Engagement are relational parameters. They relate the speaker’s attitude (in Grounding) to the propositional content (in CP) and to the addressee’s projected response (in Response). If either of these categories is unvalued, we expect to see a renegotiation before the expansion of the cg.
A consequence of the architecture of the USH is that propositions are evaluated in their relation to the interlocutors (Wiltschko & Heim 2016). This follows from the coincidence of Grounding with the sentence-radical and the coincidence of Grounding with Response. Different languages have particles dedicated to referring to either of those grounds (Heim et al. 2016). Speaker attitudes are beliefs about the truth of a proposition, i.e. they capture the degree of Commitment. Speaker intentions are expectations toward the addressee on how they should relate to the expressed degree of Commitment. Consequently, grounding captures the relation of the speaker to the propositional content and the addressee’s expected response to that.
It follows that negotiating the cg involves both interlocutors. This idea lies at the heart of Brennan & Clark’s (1991) model of conversation. Updating the cg is a process of negotiation that requires the active participation of both speaker and addressee.
Their model includes a presentation and an acceptance phase, which describe the contribution of different turns over the course of a conversation. Both phases can extend over several turns if the addressee requires clarification of what is asserted. Likewise, the acceptance phase can be an exchange going back and forth since negotiating the Common Ground is not a static procedure.

Figure 1.9: Presentation and acceptance phase of a proposition in Wiltschko & Heim (2016) (panels: a. Presentation phase, b. Acceptance phase)

This interactive conception of cg management lays the foundation on which I build my proposal. It is composed of the key ingredients of successful conversations: grounding and responding. These relate the issue under negotiation to the speaker’s and to the addressee’s ground, which capture the beliefs of each interlocutor separately. In Figure 1.9, these beliefs exist independently (Bel (p)) and with reference to the interlocutor (Bel (S,p)). The latter type is included to represent that interlocutors not only monitor their own beliefs, but also those of the other interlocutor. I refer to the pragmatic variables corresponding to grounding and responding as Speaker Commitment and Addressee Engagement. Figure 1.9 also includes a third space besides the speaker and the addressee ground. This space represents the negotiation table (Farkas & Bruce 2010), which allows for propositions to be rejected or amended before they can enter the addressee ground. This additional space therefore reflects the fact that SAs are more like proposals for common ground updates: there is no direct way of making a belief a mutually-shared belief. In the Dialogical SA Model, negotiation always involves both interlocutors.

1.5 Data and methods

The data analyzed in this thesis consist of a mix of English examples from the existing literature and the experimental stimuli used for the perception study reported on in Chapter 4.
Consider (11) as an illustration of the presentation of some adapted data from the literature.

(11) {A is sitting in a windowless office. B enters from outside.}
     T1: B: It’s raining (H* L-L%)
     T2: ✓A1: Oh, I didn’t know that.
         ✓A2: says nothing.
         #A3: Yes, that’s right.

The context for the example in (11) – set off by a set of curly brackets – sketches a possible scenario in which this conversation could naturally occur. The conversation is comprised of several turns (Tn). The conversation participants are indicated by the letters A or B. Typically, a turn is linguistic in nature, but sometimes it can also consist of a non-verbal action. Because this example is relevant for the discussion of the response properties of T1, different response options are listed for the final turn in T2. These response possibilities are evaluated as either acceptable (✓) or unacceptable (#) relative to a given context. The contexts and their modifications throughout the argument in this thesis bring out the nuanced differences in the contexts of use of SFI. All examples and contexts were crosschecked with the intuitions of native speakers of English with training in linguistics.

Support for the Dialogical SA Model is based on experimental evidence as well as careful analysis of the conventions of use of SFI and its conversational effects as exemplified in (11). This methodology has its precedent in the existing research on derived SAs and has been largely influenced by Gunlogson’s (2003) treatment of rising declaratives in comparison with falling declaratives and rising interrogatives. For the analysis of their use conventions, I expand Thoma’s (2016) epistemicity matrices to keep track of the epistemic development of each interlocutor during a conversation. Epistemic development refers to the changes in the interlocutor’s epistemic state. Figure 1.10, for instance, tracks the epistemic development of the falling declarative from example (11).
It is a schematic representation of the knowledge state of both interlocutors with a shift from acceptance to belief in Speaker A. The initial state in the conversation is characterized by an asymmetry of knowledge, which is the catalyst of the following negotiation, whose purpose is to reduce the asymmetry – ideally to a state of knowledge congruence (Osa forthcoming). The initial asymmetry is rooted in an absence of belief in Speaker A, which is represented in Figure 1.10 by a dash in Speaker A’s ground before T1. This dash changes into a {p}, which marks the presence of the proposition in the context of the conversation right after Speaker B utters the falling declarative in T1. If Speaker A acknowledges the truth of the proposition in T2, it is likely that it enters the ground of Speaker A, which then results in a knowledge congruence.

      Asymmetry      Negotiation      Congruence
A     -              {p}              Bel(p)
              T1               T2
B     Bel(p)         Bel(p)           Bel(p)

Figure 1.10: Epistemic development of a conversation in the context of a falling declarative

My expansion therefore applies the logic of the epistemicity matrix to the dialogical exchange and crucially includes the addressee’s response. It also resembles Malamud & Stephenson’s (2014) table analogy by including different types of propositional attitudes.

The experimental evidence is composed of two parts of a perception study in which participants had to rate the speaker’s confidence and their response expectation for several audible stimuli. The results presented in Chapter 4 speak to the core assumptions about the encoding of Commitment and Engagement. For the experimental data, I relied on native speakers of Western Canadian English only. Prosodic variation is language- and often dialect-specific, which is why I constrained the sampling of participants according to geographic criteria.

1.6 Roadmap

This dissertation is organized as follows. In Chapter 2, I explicate Problems I-IV and contextualize them in the existing literature.
Problem I (clause types lack unambiguous markers in English) is a problem pertaining to linguistic form as it is about the notion of clause types. Interrogatives and declaratives are often presented as basic elements of English grammar despite the fact that there is not a single morphosyntactic cue that can motivate either category. This is a problem because clause type is a notion that should only describe the sentence form. At the same time, interrogatives are taken to be the encoding of questions. If we cannot rely on form to define interrogativity, but have to rely on questionhood (which pertains to function) to determine which markers contribute to interrogativity, we have introduced a circular definition of both questionhood and interrogativity: It takes an interrogative to encode a question and a question to identify an interrogative (see also Gazdar 1983, Huddleston 1984). Declaratives, on the other hand, are only defined in opposition to interrogatives (or at least some of the markers we associate with interrogativity). If clause types cannot be encoded unambiguously, however, clause types cannot serve as a basis for encoding SAs. Consequently, we cannot take clause types to be the determining factor for our typology of different SAs. Problem II (a Fall/Rise distinction ignores meaningful variation in SFI) is another problem pertaining to linguistic form as it points to the inadequacy of a rise/fall distinction. In a survey of different descriptions of SFI, I show that there is too much variation in SFI to allow for a binary distinction. At the very least, we need to allow for a third category of intonational contours, which is a modified rise. Although not addressed explicitly as such, even the most reductionist description of intonation, the Autosegmental-metrical approach (Pierrehumbert 1980), has this ternary distinction based on different combinations of edge tones.
Hence, the SA literature needs to incorporate the findings of the intonational literature and expand the inventory of intonational contours. Problem III (a question/assertion distinction is only one possible function of SFI) targets the vast literature on intonational meaning of which only a subset relies on a question/assertion distinction. Deriving SAs is only one out of many functions ascribed to intonation. This propositional aspect of intonational meaning draws on completely different concepts than the incompleteness function of intonation or the emotional encoding via intonation. The problem that arises from this multi-functionality of intonation is that these aspects are often irreconcilable even though the different aspects of meaning all rely on the same inventory of forms. Finally, Problem IV (primary and derived SAs are epiphenomena) is a problem pertaining to function in that it takes questions and assertions as basic elements of conversation. There are several derived SAs that share properties of both categories, and it would be an oversimplification to reduce them to just one. A brief overview of the conversational effects of several SAs shows that none of them are easy to categorize as either a question or an assertion.

In Chapter 3, I motivate my proposal on the basis of the conventions of use of two primary SAs and two derived SAs, namely falling declaratives, rising interrogatives, rising declaratives and high-rising declaratives. I argue that a solution to the mapping problem requires a revision of the description of both form and function of SAs and a linking between prosodic and pragmatic information. As an alternative to relying on the Clause Type Convention and the Fall/Rise Convention, I propose that SFI marks the Commitment of a speaker and their expectation toward the addressee about how to engage with the utterance. The former is encoded by the duration of SFI, the latter by the pitch excursion of SFI.
I refer to this proposal as the dialogical meaning hypothesis. A decomposition of SAs into their degrees of Commitment and Engagement unlocks new possibilities of categorizing them across traditional distinctions and beyond the phenomena captured by a distinction between questions and assertions. For the four constructions under scrutiny, I show how this accounts for the similarities and differences between SAs independent of their clause type. I then compare this proposal with existing approaches to constitutive and derived SAs. Chapter 3 therefore provides solutions to Problems I and IV. In Chapter 4, I provide empirical support for the conversational parameters of Commitment and Engagement and their prosodic correlates. This support consists of the results of a complex perception study, which conceptually targets the core assumptions of my proposal. In two separate rating tasks, native speakers of Canadian English had to rate propositional attitudes of speakers on a five-point scale. In the first part of the experiment, participants had to rate the speaker’s confidence based on intonation alone. Stimuli were manipulated by pitch excursion and duration of the SFI. In a second experiment, the same participants had to rate the response expectation on a five-point scale. Results confirm a strong correlation of pitch excursion with an expectation for a response. This is in line with the assumption that pitch excursion encodes the degree of Addressee Engagement. Results also confirm a strong correlation of pitch duration and speaker confidence, which by extension confirms that duration encodes Speaker Commitment.

In Chapter 5, I expand my proposal to incorporate five additional SAs, thereby solving Problems II and III. I begin with a comparison of the conventions of use of wh-interrogatives and echoes. This comparison is particularly interesting because both exhibit forms that are associated by convention with questions and assertions.
Although both are usually considered to be questions, I argue that they differ distinctly from rising interrogatives. A decomposition into their degrees of Commitment and Engagement will help to model their conversational effects and explain their prosodic variation. I then continue with a comparison of disjunctive interrogatives and fall-rising declaratives. Both of these phenomena address more than one proposition and therefore resemble polarity questions in their semantic properties. I then use the remaining logical combination of the different degrees of Commitment and Engagement to model the conversational effects of the modified rise. This completes the list of logical possibilities of combining the degrees of Commitment and Engagement and shows how a continuation function can be incorporated into a proposition-based account of intonational meaning. Finally, I demonstrate how my account can integrate other aspects of intonational meaning (see Section 2.4) and explain prosodic variation found within all SAs that corresponds to changes in Commitment and Engagement. Chapter 6 concludes and points to areas of future research.

Chapter 2: Previous Solutions to the Speech Act Problem

2.1 Conventional mappings of forms and functions

Intrinsic to most proposals on intonational meaning, and those of propositional meaning in general, is the idea that – by convention – there is a direct mapping between form and function. For propositional meaning, we find two types of forms discussed in the literature: the syntactic and the prosodic form. An inversion of Subject and Auxiliary (henceforth: SAI), for instance, is assumed to encode a question. This mapping corresponds to the Clause Type Convention, which is a morphosyntactic convention. An analogous prosodic convention, the Fall/Rise Convention, is the mapping of a sentence-final rise onto a question. The mapping of these conventions and their interaction is visualized in Figure 2.1 (repeated here from Chapter 1).
Figure 2.1: Conventionalized mappings of SAs

Obviously, there are other functions associated with both forms, which I review in subsections 2.2.1 and 2.3.2, but both conventions in Table 1 are frequently found in the literature. The convention of associating Speech Acts (SAs) like questions and assertions with particular clause types has been around since Protagoras (490-420 BC; Allan 2006). Sadock and Zwicky (1985) show in a survey of over thirty languages that clause types like declaratives, interrogatives, and imperatives as well as SAs like assertions, questions, and requests are widely attested. While there is considerable variation for other SAs and clause types, the cross-linguistic stability of this small selection of clause types and SAs suggests a close relation that we can consider ‘primary’ (Allan 2006). One description of this relation, which is at the heart of the Clause Type Convention, is the so-called literal force hypothesis (Sadock 1974; Levinson 1983) summarized in Table 2.1. The Fall/Rise Convention is typically only related to questions and assertions.

Clause type      SA
Declarative      Assertion
Interrogative    Question
Imperative       Request
Exclamative      Exclamation

Table 2.1: Direct mapping of clause type to SA

The overall goal of this chapter is to explicate the SA Problem in detail and to explore the limitations of previous approaches. This will allow us to develop a proposal that aims to solve it. Special attention will be given to the role of intonation; I argue that it plays a key role for the solution. Specifically, I introduce what is at the core of the SA Problem in Section 2.2 through a discussion of the mapping of clause types onto speech acts. In Section 2.3, I review previous solutions to the SA Problem by investigating how they analyze primary SAs (questions and assertions), derived SAs, such as rising declaratives and falling interrogatives, and the division of labor between syntax and prosody.
Thus, in this section I address the relation between the Clause Type Convention and the Fall/Rise Convention. Since prosody plays a prominent role in modeling derived SAs, I dedicate Section 2.4 to the description of its form and Section 2.5 to the conception of its function. Regarding the form of intonation, I discuss the decompositionality of intonation and the relevance of SFI in tone- and tune-based models of intonation. Regarding the function of intonation, I compare the prosodic distinction of SAs with other functions proposed in the literature about intonational meaning, such as signaling incompleteness or any paralinguistic function. By drawing on the conversational properties of non-canonical questions and assertions, such as uptalk and declaratives with a (rise-) fall-rise contour, in Section 2.5, I show that SAs like questions and assertions are themselves epiphenomenal. In sum, in this chapter I show that the existing mappings of prosodic and morphosyntactic forms onto propositional meaning need to be characterized as indirect unless we revise our descriptions of forms and conceptualizations of functions substantially.

2.2 The classic Speech Act Problem: direct mapping of clause type and SA

In this section, I show that all the morphosyntactic cues associated with a particular clause type in English can also occur elsewhere. I also show that cues other than those associated with particular clause types serve to encode speech acts. This raises doubts about a clause-type-based conception of SAs. I focus on the morphosyntactic features of interrogatives in my discussion since interrogatives are considered the marked clause-type, while declaratives are considered the unmarked clause-type, or default case (Huddleston 1984). If none of the characteristic properties of an interrogative are restricted to questions, then interrogativity may be a poor choice for determining what a question is.
The same logic applies to the second argument: If more than the cues associated with interrogativity are important for encoding a question, there is no direct mapping of clause types onto SAs. I begin my survey by reviewing cross-linguistic and theoretical motivations for postulating the notion of a clause type. I then discuss clause-type markers such as a distinct word order in matrix clauses and the absence of these markers in embedded clauses. I conclude with a discussion of those units of language that serve to encode SAs in English.

Many languages have dedicated forms to mark clause type, including word-order, particles, and verbal morphology (Sadock & Zwicky 1985), but English is void of any forms that can unambiguously identify a clause-type. This contrasts with languages, such as Japanese and Swahili, that possess overt clause-typing morphemes (Krifka 2011). The closest candidate for marking interrogatives in English is an auxiliary-initial word order. Word order alone cannot mark clause-types, however, as some clauses interpreted as questions have the same word order as clauses interpreted as assertions (see below). The postulation of a one-to-one correspondence between clause type and SA therefore fails empirically. The underlying assumption of theory-driven accounts of clause-typing is that word orders diverging from a subject-verb-object order constitute marked cases and need to be explained by operations of movement. SAI has been upheld as a defining feature of interrogativity in the Generative tradition (Chomsky 1995). Examples of interrogatives that do not display SAI are therefore assumed to undergo a covert operation of inversion or miss the interrogative operator triggering the movement (Radford 2013).

SAI also motivated the postulation of Force as an abstract operator dedicated to clause-typing (Cheng 1997; Chomsky 1995; Reis 1999). Following Rizzi (1997), Force is situated in the left-most projection of the clause, i.e.
the Complementizer Phrase (CP). If Force is not overtly encoded, a viable alternative is an abstract operator (Katz & Postal 1964; Baker 1970). The appeal of a Force operator is that it can explain the difference in word order between interrogatives and declaratives and that it can motivate the attraction of wh-expressions to sentence-initial position in movement-based accounts (Cheng 1997; Chomsky 1995). So even in cases where Force does not have any overt effect on the morphosyntactic form, it has theory-internal effects on the modelling of the clausal architecture. In the following, I refrain from theory-internal motivations of clause-typing and focus on properties of English alone to investigate whether there really is a connection between clause-typing and speech acts. I begin my review of the empirical facts with word order. The examples in (12) constitute distinct clause types, and the difference corresponds to word order. The order ‘subject-auxiliary-verb’ in (a) exemplifies a declarative. The order ‘auxiliary-subject-verb’ in (b) exemplifies an interrogative.

(12) a. It is raining. [word order: subject – auxiliary – verb]
     b. Is it raining? [word order: auxiliary – subject – verb]

A sentence-initial auxiliary cannot be a defining feature of interrogativity, however, because inverted auxiliaries can also occur in other contexts. In (13), the position of the auxiliary is identical to the one found in interrogatives. In (14), the auxiliary is also inverted and preceded by a constituent.

(13) Had I known this, I would never have agreed.

(14) a. Under no circumstances would they accept our offer.
     b. Only then did I realize I made a mistake.
     c. Such a fuss would he make that we’d give him his money back.
     (examples from Huddleston 1984: 442)

Based on the examples in (14), an inverted auxiliary does not suffice to mark interrogativity. It seems that the inverted auxiliary must follow a subject for interrogativity. Yet, there are also interrogatives that do not display SAI.
This includes subject-wh-questions as in (15).

(15) a. Who did this to you?
     b. Whose pants are these?

While the absence of inversion in (15) can be explained by covert movement (where the subject is attracted to a higher position after undergoing an inversion), inversion is completely absent from embedded questions and therefore the notion of questionhood cannot be bound to interrogativity.

(16) a. He wanted to know whether it rained.
     b. She explained why the sun had disappeared.

Lehmann (1988) calls the effect of the absence of inversion in embedded clauses a form of “desentencialization” because the hallmark of interrogative force is missing in this context. Hence, the standard view is not to treat embedded questions as questions. Nevertheless, embedded questions often follow a verb that describes the act of questioning (16). So, one could argue that question verbs serve as a substitute for SAI, which would supply the required marker for interrogativity. Example (16), however, shows that this is not mandatory either.

Examples (15) and (16) have in common that they contain wh-expressions. Embedding of questions is only possible with wh-words and with wh-complementizers (whether and if). We could therefore postulate a rule that wh-expressions serve as markers of interrogativity whenever auxiliaries do not occur sentence-initially. Yet wh-expressions are neither restricted to clauses categorized as interrogatives nor to a position at the beginning of a clause. Example (17a) is an exclamative, example (17b) is an echo-question with the wh-expression in situ, and example (17c) contains a relative clause introduced by a wh-expression. So while all examples in (17) contain markers of interrogativity, none of them are interpreted as questions in the sense of eliciting missing information.

(17) a. What a jerk!
     b. You did what?
     c. I really like the guy who starred in Aviator.
On empirical grounds alone, it therefore seems impossible to identify a morphosyntactic cue that can reliably define an interrogative clause. SAI and wh-expressions – or at least one of them – are necessary, but not sufficient identifiers of interrogatives. It seems that it takes an interrogative to know an interrogative. We can conclude that for the distinction between questions and assertions, it is doubtful that clause type is a sufficient criterion to directly identify them.

If interrogatives cannot be reliably distinguished from declaratives, clause types cannot provide the key to encoding SA types. One possibility to supplement syntactic information is to also consider morphological cues. Geluykens (1987), for instance, looks into the relation of personal pronouns and questionhood as a confounding factor for interpretations of intonation. In a forced-response study, participants had to decide whether an utterance was a ‘definite question’, ‘more question than statement’, ‘more statement than question’, or a ‘definite statement’. The stimuli were comprised of declarative sentences manipulated by a three-fold variation in the use of pronouns (first/second/third person singular) and a five-fold variation of contour (a fall, two different rises, and two fall-rises). The sentence in (18) is one example of how pronoun variation is tested in this study.

(18) And I’m/you’re/he’s not feeling very well

Of the 75 declarative sentences, 53% of those containing a second person singular pronoun were interpreted as questions. Only 12% of the declaratives containing a first person singular pronoun were interpreted as questions. The intonational factor only had a significant effect when contrasting the falling contour with the rising contours for the declaratives containing a first person singular and a third person singular pronoun.
One conclusion that is safe to draw from the findings is that a second person singular is indicative of a question to the extent that this choice of pronoun correlates with the difference between rising and falling intonation. This suggests that questionhood may be associated with more than those cues associated with interrogativity. There are other cues that are relevant for encoding a question, such as contextual cues, discourse markers, and turn taking. Nilsenova (2006), for example, reports on a rating study based on scripts from the Santa Barbara corpus. Subjects had to mark sentences without punctuation as questions in case they considered them as such. Only one of 218 questions was interpreted as a question by all speakers; only five speakers had a success rate of higher than 60% for identifying those examples as questions that the author identified as questions. The majority of participants did considerably worse. Even the proportional agreement between participants varied extensively (M = .94, SD = .568). These results lead Nilsenova to conclude that there is little agreement between speakers on what a question is. Nilsenova identifies and ranks several predictors that significantly helped her participants to recognize a question. Note that only three of the following predictors are genuinely morphosyntactic (namely the presence of a wh-word, a tag, and SAI).

- uncertainty,
- you know tag,
- wh-word,
- subsequent yes/no answer,
- SAI,
- turn-final occurrence.

We see, therefore, that the notion of questionhood rests on a combination of morphosyntactic and pragmatic cues. Even though we cannot identify a sufficient marker of interrogativity, a direct mapping of clause type onto SA has been assumed within previous characterizations of SAs. As a consequence, the Clause Type Convention remains the starting point for proposals to solve the SA Problem. In the next section, I look at some of these proposals in more detail.
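The empirical argument of this section can be retraced mechanically. The following is a minimal sketch, not an analysis from the literature: it assumes two naive surface cues (a sentence-initial auxiliary and the presence of a wh-word), applies them to a subset of the examples in (12) to (17), and lists the sentences each cue gets wrong. The word lists, the function names, and the Boolean labels (whether the utterance serves to elicit missing information) are illustrative assumptions based on the discussion above.

```python
import re

# Hypothetical, non-exhaustive word lists used only for this illustration.
AUXILIARIES = {"is", "are", "was", "were", "do", "does", "did",
               "had", "has", "have", "would", "will", "can", "could"}
WH_WORDS = {"who", "whose", "what", "why", "when", "where",
            "which", "how", "whether"}

def tokens(sentence):
    """Lower-cased word tokens, punctuation stripped."""
    return re.findall(r"[a-z']+", sentence.lower())

def aux_initial(sentence):
    """Naive cue 1: the sentence starts with an auxiliary."""
    toks = tokens(sentence)
    return bool(toks) and toks[0] in AUXILIARIES

def has_wh(sentence):
    """Naive cue 2: the sentence contains a wh-word anywhere."""
    return any(t in WH_WORDS for t in tokens(sentence))

# (sentence, does it elicit missing information?) - labels are assumptions
# that follow the discussion of the examples in the text.
EXAMPLES = [
    ("It is raining.", False),                                 # (12a)
    ("Is it raining?", True),                                  # (12b)
    ("Had I known this, I would never have agreed.", False),   # (13)
    ("Who did this to you?", True),                            # (15a)
    ("He wanted to know whether it rained.", False),           # (16a)
    ("What a jerk!", False),                                   # (17a)
    ("I really like the guy who starred in Aviator.", False),  # (17c)
]

def misclassified(cue):
    """Sentences for which the cue's verdict differs from the label."""
    return [s for s, is_question in EXAMPLES if cue(s) != is_question]

for name, cue in [("aux-initial", aux_initial), ("wh-word", has_wh)]:
    print(name, "fails on:", misclassified(cue))
```

On this small sample, each cue fails on at least two sentences, and taking the disjunction of both cues does not remove the failures. This is only a restatement of the point that neither SAI nor wh-expressions suffice to identify questions; it says nothing about how often such cues succeed in corpus data.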
2.3 Previous solutions to the Speech Act Problem

In this section, I survey previous accounts of the conversational effects of different types of SAs to contextualize my own proposal. The idea of conceiving their conversational effects as acts of negotiation is grounded in Stalnaker’s (1978) seminal chapter on assertions. We can conceive of assertions as updates to the interlocutors’ shared set of beliefs, the Common Ground (henceforth: cg). These updates need to be characterized as dynamic processes that involve both the speaker and the addressee. I argue here that uttering an assertion entails a responsibility of the speaker to commit to its truth and comes with the expectation that the addressee will proceed by integrating it into their belief set or accept it for the sake of continuing the conversation. Following Roberts (1996), I argue that questions complement assertions in driving forward a conversation. A central analogy for the act of negotiating the cg is the (negotiation) table (Farkas & Bruce 2010). The process of negotiating the cg is more involved for derived SAs than for the primary SAs of questions and assertions. Correspondingly, the metaphorical table grows with the complexity of the SAs (Ettinger & Malamud 2013; Malamud & Stephenson 2014). In the following subsections, I will review different proposals of modelling primary SAs (Subsection 2.3.1) and derived SAs (Subsection 2.3.2) and how this translates into a division of labor between the different grammatical modules (Subsection 2.3.3).

2.3.1 Primary speech acts have homogenous form-function mappings

Assertions and polar questions are members of an exclusive set of conversational phenomena, which are essential ingredients of conversation within each language (Huddleston 1984). This has led me to categorize them as primary SAs (see also Farkas & Roelofsen 2017).
To be precise, questions and assertions did not enter the first inventories of SA theories, but they were treated as SAs soon after. Beginning with Austin (1962), these inventories center on use conventions rather than effects of their semantic content. Consider Austin’s original inventory in (19).

(19) a. Verdictives: delivering a finding
     b. Exercitives: giving a decision regarding a course of action
     c. Commissives: committing the speaker to a course of action
     d. Behabitives: attitudes toward behavior/attitudes of others
     e. Expositives: expounding of views, conducting of arguments, clarifying

We see from the list in (19) that questions do not fit in any of the five categories while assertions may come about by means of any of them. Over the decades of SA research, this inventory has grown to a size that is difficult to motivate independently. SAs that are not grounded in the basic acts of asserting or questioning seem to be as numerous as there are different conventions of use. For a form-based notion of SAs, then, it makes sense to focus on questions and assertions because these notions are not exclusively defined through their conventions of use. Questions and assertions are primary in that they serve as a basis from which other conversational effects are constructed. They are also primary in that they are conventionalized universally, albeit with different lexical means. Therefore, I assume that the illocutionary acts postulated by Austin (1962) all fall within a spectrum that is defined by primary SAs, such as questions and assertions.

Austin (1962) is important for another reason, which is central to the Clause Type Convention. He introduces the distinction between locutionary, illocutionary and perlocutionary acts. These terms correspond to what is said, how it is intended, and how what is said affects the addressee.
To explain the relation between a single form and its many functions, Allan (2006) proposes that one locutionary act can map onto many different illocutionary acts because the addressee determines the latter based on the former. In analogy to the literal force hypothesis (Sadock 1974; Levinson 1983), Allan (2006) therefore sees a direct correspondence between primary locutions (clause types) and primary illocutions (or SAs). For all those cases where non-primary illocutions arise, he proposes that context plays a central role: “Hearer hears the locution, recognizes its sense, looks to the context to figure out the apparent reference, and then seeks to infer Speaker’s illocutionary intention” (Allan 2006: 3). How contextual information contributes to establishing the intended illocutionary act remains to be systematically described. The point here is that the SA literature distinguishes between a direct and a less direct mapping of form to function (e.g. Farkas & Roelofsen 2017). Indirect mappings generally rely on context to complement the Clause Type and the Fall/Rise Convention. Here, I review different accounts of both direct and indirect mappings. Assertions have been the foundation of philosophical discussions about the mechanisms of language since Aristotle. Their denotation can be considered to be a singleton set of propositions. Stenius (1967) has termed this propositional content the sentence radical. Propositions need to be conceived of as a set of possible worlds in which they are true. For an assertion like It is raining, this means that it reduces the set of worlds which are doxastic possibilities for the participants to the worlds in which it is true that it is raining. The contribution of an assertion to a conversation is proposing to add a proposition to the set of shared beliefs of interlocutors (Lewis 1972; Stalnaker 1978). The process of expanding this set is captured by Stalnaker (1978) as the development of cg.
cg is defined as the set of worlds we agree are possible candidates for the actual world. The worlds relevant to the conversation form the context set.

The original proposal by Stalnaker (1978) lacks a dynamic component that captures the interaction between interlocutors. This dynamic aspect is necessary because cg is an inherently dialogical notion: each interlocutor has their own context set. cg is therefore by definition the intersection of those beliefs that both speaker and addressee publicly and mutually commit to. It therefore requires ingredients that allow the development of the cg as a consequence of each turn. Later additions to this model (Stalnaker 2002; 2014) distinguish two ways in which interlocutors can relate to a proposition that is publicly uttered. A proposition can enter an interlocutor’s beliefs or be accepted for the sake of the ongoing conversation (a third possibility being, of course, that it is rejected). In Stalnaker’s (2014: 46) words: “one may accept things, in the relevant sense, that one does not believe in cases where it facilitates the conversation to do so.” The distinction between what is believed and what is accepted reflects the fact that assertions are just that: proposals to add to the cg (Clark & Schaefer 1989; Clark 1992; Ginzburg 1996). That is, asserting is not itself adding the proposition to the cg. Regardless of whether the addressee agrees with the proposal, it is available as a commonly acknowledged belief of the person who stated it from that point forward in the conversation. The distinction between accepting and believing makes it necessary for speakers to communicate how they relate to the propositional content. Speaker attitudes are paired with intentions: inasmuch as the speaker expresses an attitude toward a belief, they also have an expectation of what the addressee will do with their utterance.
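The set-theoretic picture behind these claims can be made concrete in a few lines. The following Python sketch is only an illustration under simplifying assumptions: the two-feature world model and the names `WORLDS`, `proposition`, `common_ground`, and `context_set` are hypothetical and not part of Stalnaker's proposal. It treats a proposition as the set of worlds in which it is true, the cg as the propositions both interlocutors are publicly committed to, and an assertion as a proposal that reduces the context set only once the addressee accepts it.

```python
from itertools import product

# Hypothetical toy model: a world is a pair of truth values (raining, cold).
WORLDS = frozenset(product([True, False], repeat=2))


def proposition(predicate):
    """A proposition is the set of worlds in which it is true."""
    return frozenset(w for w in WORLDS if predicate(w))


def common_ground(speaker_commitments, addressee_commitments):
    """Propositions that both interlocutors are publicly committed to."""
    return speaker_commitments & addressee_commitments


def context_set(cg):
    """Worlds compatible with every proposition in the common ground."""
    worlds = WORLDS
    for p in cg:
        worlds = worlds & p
    return worlds


raining = proposition(lambda w: w[0])

# Asserting 'It is raining' commits only the speaker: it is a proposal.
speaker = {raining}
addressee = set()
before = context_set(common_ground(speaker, addressee))

# Only if the addressee accepts the proposal does the context set shrink.
addressee.add(raining)
after = context_set(common_ground(speaker, addressee))

print(len(before), len(after))  # 4 2
```

The sketch deliberately leaves out the difference between believing and accepting; modeling it would require separate belief and acceptance sets for each interlocutor.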
Because intentions are hard to predict, they require a broad knowledge of what is already known (Stalnaker 2014). But speakers can encode their intentions just as they can encode their attitudes. In doing so they communicate a metalinguistic message in addition to the propositional content. There is a range of views on whose responsibility it is to interpret this metalinguistic message (i.e. the intended response to the utterance). Stalnaker (2002) associates the responsibility with the addressee: the speaker asserts a proposition and it is up to the addressee to decide whether to believe it, reject it, or simply accept it for the sake of the conversation. Similarly, Truckenbrodt (2006) assumes that an assertion comes with an expectation toward the addressee that they believe it. His paraphrase “S wants (from A) (that it is cg) p” (265) suggests that the addressee is the one responsible for expanding the cg. In contrast, there are also proposals that associate the primary responsibility with the speaker (Searle 1969; Brandom 1983; Alston 2000; MacFarlane 2011). Here, the speaker is obliged to be a reliable source of truth. This is the basis for any public commitment (Gunlogson 2008). While the decision to add a belief to the cg may lie with the addressee, the speaker is responsible for contributing something worth believing. The consequence of committing to a proposition without a credible source is to lose face in the process of negotiating the cg (Krifka 2015). Rather than associating the responsibility with either the speaker or the addressee, I believe that it is their shared responsibility in the negotiation of propositions. While the speaker can make their attitudes and intentions known to the addressee, which may come at the risk of losing face if the speaker cannot back up their attitude, it is nevertheless the privilege of the addressee to decide whether they want to adhere to the speaker’s intentions.
We can conclude that adding a proposition to the cg is a dialogical act.

Questions have different conversational effects than assertions, yet their treatment in the SA literature is compatible with the treatment of assertions. Roberts (1996) proposes that the question under discussion (QUD) propels interlocutors from one topic to another. The most general QUD corresponds to the question ‘What is the way things are?’ Within this framework, the basic components of conversation can remain the same as before: context sets are reduced to a singleton set, which corresponds to true beliefs about the world. That is, the singleton set corresponds to the single proposition which conjoins all the propositions which are held as true by the interlocutors in this conversation. Assertions, however, are primarily understood as answers to the questions that drive a conversation. Questions are composed of a presupposed and a proffered component. The proffered component corresponds to the asserted or non-presupposed ingredients of an assertion and a question, respectively. Once accepted by the addressee, a question commits the addressee to answering it and determines the unfolding of the discussion. The immediate QUD relates to other questions that arise from it. This relationship is characterized by Roberts (1996) as one of entailment. Metaphorically speaking, this corresponds to a stack of ordered questions. A reasonable discourse strategy is therefore to work one’s way through the stack of questions (not necessarily in consecutive order). Questions and assertions are both important to resolve the goal of inquiry, which consists in choosing among alternatives. Different types of questions are characterized by different ways of determining alternatives. For polar questions, the alternatives correspond to a binary set of propositions. For information questions, they are determined by a variable.
Alternative questions provide a predetermined set of alternatives.

Farkas and Bruce (2010) describe this dialogical dimension of expanding the cg with a metaphorical space between speaker and addressee, which they call the table. This negotiation space is necessary because the speaker does not have the ability to directly access the cg. Items on the table are what is ‘at issue’ in a conversation and need to be resolved. Conversations are therefore structured by two different strategies: expanding the cg and resolving issues on the table. Naturally, the latter works toward achieving the former. Another addition in Farkas & Bruce (2010) is the idea of projected sets, which include the current and the future state of the cg. Assertions project the integration of a belief into the cg. The notion of projection points to a moment in a conversation, after a proposition has been proposed for the cg, at which that proposition does in fact enter it. The projected set (which contains a singleton set of propositions for assertions and a non-singleton set of propositions for questions) therefore refers to a future state of the conversation which the speaker anticipates.

Figure 2.2 visualizes Farkas & Bruce’s (2010) model using the terms I have introduced so far. The key difference here is between the cg and the projected set. The latter is the superset of current and future cgs. The projected set is visualized closer to the addressee because it reflects the canonical way of resolving the item being negotiated. An unmarked assertion is projected to enter the cg through conventional moves. Its transfer from the table into the cg is the “least marked” move following the utterance of an assertion. Derived SAs, however, require a more elaborate negotiation before their content can enter the cg. This negotiation is often accompanied by discourse markers that trace the propositional attitudes of the interlocutors (see Heim et al. 2016).
[Figure 2.2: Negotiating the cg in Farkas & Bruce (2010). Labels: Speaker (speaker beliefs); Table (‘at issue’ items); Addressee (addressee beliefs); cg; projected set.]

We have seen, then, that assertions are typically considered to be the core ingredients of SAs for expanding the set of beliefs shared by the speaker and addressee. Crucially, the inclusion of propositions into the set of shared beliefs (that inclusion corresponds to the ingredients of the projected set in Farkas & Bruce 2010) is not a conversational necessity. Propositions can be accepted just for the sake of a conversation (or they may require a process of negotiation before they can enter the cg). This negotiation is a responsibility shared by both interlocutors. Questions and assertions are both viewed as proposals. The difference lies in the size of the negotiated issue: a declarative denotes a singleton set; a polar interrogative a non-singleton set. In the terms of Farkas and Bruce (2010), only questions are inquisitive; assertions are informative. Elaborating on the proposal nature of questions, Ettinger & Malamud (2013) modify the table model by splitting the table into one part reserved for proposals and one reserved for choices. Conversational updates (i.e. utterances relevant to a conversation) can therefore consist of propositions or of choices between propositional alternatives (A and ¬A). Polar interrogatives, then, constitute a complex move in that the speaker adds both the information that a choice between alternatives is relevant (Table 1, choices) and the issue that this choice needs to be resolved (Table 2, proffer). The utterance therefore ends up on both parts of the table. For ease of comparison, I amend the model from Farkas & Bruce (2010) according to this split in the table.
[Figure 2.3: Splitting the negotiation table in Ettinger & Malamud (2013). Labels: Speaker (speaker beliefs); Table 1 (choices): add information that A or that ¬A to cg, add information that A to cg, add preference for A to cg; Table 2 (proffer): add information that A to cg, add preference for A to cg, add issue whether A or ¬A to cg; Addressee (addressee beliefs); Target cg: info, issues, preferences.]

To conclude this survey, then, what questions have in common with assertions is that they revolve around alternative propositions. Assertions reduce the set of possible worlds to those worlds where a proposition is true for both interlocutors. Questions are instructions to the addressee about what is required for that reduction: a choice between alternatives or a replacement of a variable (see Section 2.4 for the discussion of the semantics of different question types). Within the realm of propositional meaning, we can distinguish between primary SAs which are a singleton set of closed propositions for assertions ({p}) and a binary set for polar questions ({p, ¬p}).[1]

[1] I do not include information questions in the category of primary SAs since they violate the Fall/Rise Convention; instead I will include them in the category of derived SAs.

2.3.2 Derived speech acts have heterogeneous form-function mappings
Generally speaking, derived SAs are different from primary SAs in that they exhibit conversational effects of more than one SA. Derived SAs have given rise to a new type of literature focusing on the conversational effects of SAs with a heterogeneous pairing of the Clause Type Convention and the Fall/Rise Convention. Proposals that rely on these conventions typically associate the syntactic form with one conversational effect and the prosodic form with another. Before we get to the division of labor between syntax and prosody in previous SA models, I briefly sketch here how the effects of derived SAs differ from those of primary SAs.
To limit the scope of our discussion, I focus here on different ways of dealing with rising declaratives (20), a classic example where heterogeneous forms lead to a complex interpretation.

(20) It is raining {after the entrance of a wet coworker into a windowless office}

Gunlogson (2003; 2008) can be credited with establishing the significance of what I call derived SAs for the development of cg. In her dissertation, Gunlogson (2003) provides evidence that sentences like (20) have conversational effects similar to both unmarked questions and assertions. While the Speaker’s Commitment relates to a proposition, just as is the case for unmarked assertions, the speaker requires ratification from the addressee that this proposition is, in fact, true. Gunlogson (2008) refers to this as a contingent commitment. Contingency is the property that rising declaratives share with rising interrogatives. Rising declaratives can therefore neither be interpreted as determining the QUD nor as reducing the set of possible worlds for both interlocutors. They seem to align somewhere in between the effects of assertions and questions.

The mixed effects of rising declaratives make them principal candidates for being negotiated at the table. Malamud & Stephenson (2014) draw on Farkas & Bruce’s (2010) notion of the projected set and propose that commitments can either be actual or projected. Projections are based on assumptions about the normal course of a conversation. The key innovation in Malamud & Stephenson (2014) is that they split the speaker’s and the addressee’s ground into their current and future state. Projected commitments introduce a sense of tentativeness in the speaker, which adds a metalinguistic issue that corresponds to the bias in rising declaratives. In Figure 2.4, I expand the original table model to include Malamud & Stephenson’s (2014) projection at (almost) every level.
[Figure 2.4: Projection at (almost) every level in Malamud & Stephenson (2014). Labels: Speaker commitments (current, projected); Table (propositions, alternatives, metalinguistic issues); Addressee commitments (current, projected); current cg; projected cg.]

For rising declaratives, a proposition and metalinguistic issues (a set of salient propositions) are added to the table and hence need negotiation. However, the acceptance of the proposition is already projected in the speaker’s future ground. Since metalinguistic issues and propositions are stacked (in that order), the addressee needs to resolve the metalinguistic issue first. As a consequence, the projected cg differs from the present cg in that it contains possible resolutions of the metalinguistic issue. Breaking down the complex machinery, this means that a rising declarative requires settling a metalinguistic issue before a proposition can be added to the cg.

Farkas & Roelofsen (2017) approach the phenomenon of rising declaratives by assuming that they share their semantic content with polar questions: they are both non-informative, but inquisitive. Their primary conversational effect, which they share with both rising interrogatives and falling declaratives, is adding a proposition to the table and its informative content to the speaker’s set of commitments. It is the inquisitive/informative value that differs between declaratives and interrogatives. The marked SA status of rising declaratives arises from a secondary effect which expresses a bias toward one alternative and a (low) credence toward the complementary alternative. The key difference to Malamud & Stephenson (2014) is that negotiation is still at the propositional level rather than at the metalinguistic level. The notion of projection does not enter their model. Hence, the table in Farkas & Roelofsen (2017) does not capture the negotiation of the cg. Consequently, their model is far less complex than that of Malamud & Stephenson (2014).
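The ordering claim attributed to Malamud & Stephenson (2014), namely that the metalinguistic issue of a rising declarative must be settled before its proposition can enter the cg, can be pictured as a stack. The Python sketch below is a deliberately reduced illustration, not their formal system: the class `Conversation`, its field names, and the string labels are all hypothetical, and the addressee's commitments and the projected cg are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Toy conversational state; all field names are illustrative."""
    table: list = field(default_factory=list)      # stack: last item is on top
    cg: set = field(default_factory=set)           # current common ground
    speaker_projected: set = field(default_factory=set)

    def rising_declarative(self, p):
        # The speaker's acceptance of p is projected, not actual.
        self.speaker_projected.add(p)
        # The proposition goes on the table, with the metalinguistic
        # issue stacked on top of it.
        self.table.append(("proposition", p))
        self.table.append(("metalinguistic issue", p))

    def resolve_top(self):
        """Resolve the topmost item; only propositions enter the cg."""
        kind, p = self.table.pop()
        if kind == "proposition":
            self.cg.add(p)
        return kind


state = Conversation()
state.rising_declarative("it is raining")
first = state.resolve_top()      # the metalinguistic issue comes first
cg_after_first = set(state.cg)   # still empty at this point
second = state.resolve_top()     # now the proposition can enter the cg
```

In this simplified picture, a falling declarative would push only the proposition, so that a single resolution step moves it into the cg.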
2.3.3 Division of labor between syntax and prosody
In the preceding subsections, I reviewed how previous models analyze SAs with a homogeneous form-function mapping and SAs with a heterogeneous form-function mapping. The former type of SA adheres to both the Clause Type Convention and the Fall/Rise Convention, the latter type only to one of them. In this subsection, I review the division of labor between syntax and prosody assumed in the literature. I identify two strategies of dealing with the interactions of the Clause Type Convention and the Fall/Rise Convention. The first strategy is widely assumed: it suggests a complementing division of labor between syntax and prosody (Gunlogson 2003; 2008; Malamud & Stephenson 2014; Krifka 2014; Farkas & Roelofsen 2017). I refer to this as the complement strategy. It adds the function associated with the Fall/Rise Convention to that associated with the Clause Type Convention. If these form-function mappings contradict each other, the division of labor between syntax and prosody will determine which convention modifies the other. The second strategy, which I henceforth refer to as the operator strategy, is employed by Bartels (1997), who proposes that two attitudinal features, one overt and one abstract, serve to derive intonational meaning. In effect, it adds a third element, an operator, to the Clause Type and the Fall/Rise Convention to resolve the cases of heterogeneous pairing. Each strategy is reviewed here on the basis of how it deals with primary and derived SAs. While both strategies can handle some of the derived SAs that have a heterogeneous form-function mapping, neither of them can account for all types of clauses.

The complement strategy preserves the contribution of both clause type and intonation, but presents their interaction as the sum of both. Gunlogson (2003; 2008), for example, shows that rising declaratives share properties of questions and assertions.
She derives these properties by associating the declarative word order with the assertive function and the sentence-final rise with the questioning function. In her own terminology, rising declaratives express a commitment to the truth of a proposition that is contingent on the addressee’s ratification. Commitment comes with declarative form; contingency comes with a rise. Hence, a rising declarative and a polar interrogative are similar in that they are both contingent on the addressee. They are different in terms of their pragmatic contribution, however: only a rising declarative expresses a commitment; a polar interrogative does not. In Gunlogson’s approach, contingency is a result of lack of evidence and therefore the inability to commit to the truth of a proposition.

We find similar versions of this idea in Beyssade & Marandin (2007). They associate the rise with a call-on-addressee, which is typically found with questions, and they associate a declarative clause type with the assertion of a proposition. One advantage of their analysis over Gunlogson’s is that the call-on-addressee associated with questions can also be paired with interrogatives and imperatives. Each clause type can combine with a particular call-on-addressee, either matching the present clause type or another. Krifka (2015) and Malamud & Stephenson (2014) follow this complementary division of labor between the two SA cues. At first glance, this also holds for Farkas & Roelofsen (2017), since syntactic and prosodic form both combine with the sentence radical to derive the interpretation. Nevertheless, these authors insist that both declaratives and interrogatives have the same semantic type derived from an informative and an inquisitive component whereby the informative component is trivial for interrogatives and the inquisitive component is trivial for declaratives.
Despite the different assumptions about the semantic content associated with different clause types, however, they follow the complement strategy: syntactic and prosodic form determine individually whether a sentence is inquisitive or informative. Once combined, the two forms determine the overall interpretation of the utterance. Since both polar questions and rising declaratives come with a rise, which expresses inquisitiveness, their primary contribution to the conversation is considered to be inquisitive and non-informative. The one property in which Farkas & Roelofsen (2017) differ from other complement models, such as Krifka (2014) and Malamud & Stephenson (2014), is that they associate derived SAs with two conversational effects rather than one. In addition to the non-informative effect derived from their semantics (encoded by syntax and prosody), rising declaratives come with a bias toward the truth of a proposition and a low credence toward the complementary alternative (which is not encoded).

Accounts following the complement strategy all follow the same principle: for every heterogeneous mapping of prosodic and syntactic form, they motivate the precedence of one form. This precedence of one form over the other is exemplified in (21). The distinction between open and closed corresponds to inquisitive and non-inquisitive in Farkas & Roelofsen (2017).

(21) a. Falling declarative: assertive syntax + assertive prosody = closed assertion
     b. Rising declarative: assertive syntax + questioning prosody = biased question
     c. Falling interrogative: questioning syntax + assertive prosody = closed question
     d. Rising interrogative: questioning syntax + questioning prosody = open question

All accounts reviewed here arrive at a question interpretation for rising declaratives by relying more on the prosodic than on the syntactic form.
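The additive logic of the complement strategy in (21) amounts to a lookup over two independent form-function mappings. The following Python sketch restates (21) in that form; the dictionary and function names are illustrative assumptions, and the labels are taken directly from (21) rather than from any one of the cited accounts.

```python
# Function assumed for each form under the two conventions.
CLAUSE_TYPE = {"declarative": "assertive", "interrogative": "questioning"}
CONTOUR = {"fall": "assertive", "rise": "questioning"}

# Interpretations as listed in (21), keyed by (syntax, prosody).
INTERPRETATION = {
    ("assertive", "assertive"): "closed assertion",
    ("assertive", "questioning"): "biased question",
    ("questioning", "assertive"): "closed question",
    ("questioning", "questioning"): "open question",
}


def interpret(clause_type, contour):
    """Combine the contributions of clause type and contour as in (21)."""
    return INTERPRETATION[(CLAUSE_TYPE[clause_type], CONTOUR[contour])]


print(interpret("declarative", "rise"))  # biased question
```

Because the lookup has exactly four cells, any utterance type that does not reduce to one of these four pairings falls outside what such a mapping can express.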
None of these accounts, however, can model the conversational functions of phenomena where the contribution of the syntactic form takes precedence over that of the prosodic form. High rise questions and alternative questions are instances of this kind of SA, and wh-questions and echo questions, whose properties cannot be characterized with the labels in (21), are suspiciously absent from the discussion in this literature.

Bartels’ (1997) operator strategy differs in two ways from the complement strategy in deriving the conversational effects of different types of questions and assertions. This is largely because all of the phenomena not discussed by accounts subscribing to the complement strategy serve as motivation for the additional machinery in the operator strategy. Firstly, Bartels (1997) associates assertiveness (captured by the feature [ASS]) with a low boundary phrase accent (L*) rather than declarative form. Secondly, she postulates an abstract feature [WH], which signals the presence of alternatives. This feature largely corresponds to interrogativity, but does not serve to distinguish questions from assertions (or their semantic values). As a consequence, interrogatives can express assertiveness when their prosodic form includes a low phrase accent. Both “attitudinal features” (10), here assertiveness and signaling the presence of alternatives, are pragmatic rather than syntactic in nature. Together, [ASS] and [WH] suffice to characterize a significantly larger number of phenomena than any of the accounts subscribing to the complement strategy.
          [+ASS]                                 [-ASS]
[+WH]     Alternative questions                  Rising wh-questions
          Falling wh- and polarity questions     Wh-echo questions
[-WH]     Statements                             Rising polar questions
          Falling polarity questions             Rising non-wh echo questions

Table 2.2 Contextual model of intonational meaning per Bartels (1997) [notations adapted]

With a model based on [+/-WH] and [+/-ASS], interrogatives are not always interpreted as questions, and declaratives are not always interpreted as assertions. At first glance, this is a welcome result because Bartels’ primary concern is to model the contribution of intonation, which is introduced through the [ASS] feature in Table 2.2. For the division of labor problem, however, the [WH] feature is problematic, since it recreates the distinction between interrogatives and declaratives with two exceptions: rising polar questions and rising non-wh echo questions are both [-WH]. This is a consequence of Bartels’ idea to characterize alternative questions and falling polarity questions as whether-questions, which in turn allows her to characterize them with [+WH]. While all this might be compatible with a system that is defined by the number of asserted alternatives, it is incompatible with a grammatical encoding of the [WH] feature. The operator therefore maps neither onto a wh-pronoun nor onto an interrogative marker or operator.

In Bartels’ system, neither uncertainty (marked by [-ASS]) nor the presence of alternatives (marked by [+WH]) can identify what a question is. The system therefore fails to explain how prosody and morphosyntax (or pragmatic features for that matter) complement each other in distinguishing questions from statements. So, while Bartels (1997) can model the conversational effects of more derived SAs than those accounts with a strict complementation strategy, we are left with an unsatisfying consequence for primary SAs: the prosodic form alone distinguishes their conversational effects.
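The observation that no single feature value identifies questions can be checked mechanically against Table 2.2. The Python sketch below is a rough illustration under two assumptions that go beyond the text: the assignment of utterance types to cells follows one reading of the flattened table (two types per cell), and an utterance type counts as a question simply if its label contains the word "question". All names are hypothetical.

```python
# Assumed cell assignment for Table 2.2 (one reading of the flattened table).
TABLE_2_2 = {
    ("+WH", "+ASS"): ["alternative questions",
                      "falling wh- and polarity questions"],
    ("+WH", "-ASS"): ["rising wh-questions", "wh-echo questions"],
    ("-WH", "+ASS"): ["statements", "falling polarity questions"],
    ("-WH", "-ASS"): ["rising polar questions",
                      "rising non-wh echo questions"],
}


def is_question(label):
    # Crude assumption: the label itself says whether the type is a question.
    return "question" in label


all_types = [(wh, ass, label)
             for (wh, ass), labels in TABLE_2_2.items()
             for label in labels]
questions = {label for _, _, label in all_types if is_question(label)}


def picks_out_questions(value):
    """True if the types carrying this feature value are exactly the questions."""
    carriers = {label for wh, ass, label in all_types if value in (wh, ass)}
    return carriers == questions


result = {v: picks_out_questions(v) for v in ("+WH", "-WH", "+ASS", "-ASS")}
print(result)
```

Under these assumptions every entry of `result` is `False`: [+WH] and [-ASS] each cover only questions but miss some of them, while [-WH] and [+ASS] both include statements.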
Due to the clashes of [WH] and interrogativity (with the exceptions mentioned above), it is also unclear how to derive the semantics of questions and assertions. The operator strategy therefore introduces additional machinery without solving the SA Problem.

2.4 Previous descriptions of intonation
In light of the prominent role of intonation in the previous accounts of SAs, it is worth considering which aspects of intonation are relevant. In this Section, I address two specific questions in this regard: (i) How should we conceptualize intonation, as a sequence of tones or as tunes? (ii) Which aspects of the intonational contour are significant for solving the SA Problem? The first issue concerns a debate that shaped the discussion of intonational phonology for decades. Intonation is either conceptualized as a fixed configuration, such as a rise or a fall, or it is conceptualized as a combination of tonal targets, such as a sequence of a high and a low tone (which would be the equivalent of a fall). The second issue is about the prominence of the final part of the contour, typically called the nuclear tune/tone. Overlapping considerably with the first issue, models of intonational phonology differ on whether the nuclear tune should receive special attention. I review existing accounts based on both issues and argue that, contrary to what some accounts claim, every account relies on configurations of tonal targets and associates a special role with the final part of the contour. The assumption that SFI has a special role and that it is best conceived of as a configuration is not trivial. Considering the dispute regarding both issues in previous research will be crucial for deciding how to describe the encoding of Commitment and Engagement.

2.4.1 Describing intonation: targets vs. configurations
In this Subsection, I survey different approaches to describing intonation to provide a conceptual basis for the role I ascribe to intonation as a means of encoding Commitment and Engagement. The critical question for my analysis is whether to associate these variables with individual tones or with specific contours. I review four different approaches to describing intonational form: the British tradition (e.g. Palmer 1922; Kingdon 1959; O’Connor and Arnold 1973; Crystal 1969), the American tradition (e.g. Pike 1945; Wells 1945; Trager & Smith 1951), the Dutch tradition (e.g. Cohen & ’t Hart 1967; ’t Hart, Collier & Cohen 1990), and the Autosegmental-metrical framework (e.g. Pierrehumbert 1980; Beckman & Pierrehumbert 1986; henceforth: AM). These frameworks primarily differ in whether intonation is characterized as a sequence of individual tones or as a contour that cannot be fully decomposed. A helpful analogy for this distinction is the difference between targets and configurations (Bolinger 1951; Ladd 2008). The latter option is rooted in the British tradition, the former option in the American tradition. The Dutch tradition is configurational in nature, but also includes elements that can be considered to be target-based. This puts the Dutch tradition somewhere between the British and the American traditions. The AM framework is sometimes argued to have overcome the configuration vs. target schism (Ladd 2008). From a descriptive point of view, however, it is even more compositional than the American tradition. The number of targets is reduced from four to two, and these tones only vary in distribution. Figure 2.5 summarizes how the association of frameworks with the target vs. configuration distinction mirrors a trend of decomposing the contour into fewer abstract units.

Figure 2.5 Different frameworks on a tonal configuration-to-target continuum

It is worth noting that the distinction between targets vs.
configuration has consequences for the mapping of phonetic detail onto phonological units. For a configuration-based approach, variation in form is limitless; for a target-based approach, variation is limited by the inventory of targets. The onus on a configuration-based approach is therefore to reduce the amount of phonetic variation to a degree that allows an explanatorily adequate theory of intonational meaning. The onus on a target-based approach is to map the theoretically motivated inventory of targets onto the attested contours. The fewer targets assumed, the more important a theory of tone-to-contour mapping.

The prosodic descriptions within the British tradition are shaped by an increase of postulated configurations over time. Palmer (1922) distinguishes between four different types of nuclear tunes: a falling, a high rising, a low rising and a falling-rising tune. Falling contours stand out in that they are treated as invariable. They can occur with a preceding slight rise, in which case their meaning is intensified. But this does not lead to a categorical distinction between different types of falls. The opposite is true for rising contours. Palmer distinguishes between low rises, high rises, and rises preceded by a fall. Bolinger (1958; 1965; 1989) extends the range of possible contours to six, with three singular movements and three combinations of movements. Halliday (1967) follows a similar logic of individual contours and contour combinations. Halliday’s list of individual contours includes five tone groups and two compound tone groups. Only two movements are singular in shape, a falling (Tone 1) and a rising contour (Tone 3). Three tonal movements are combinations of falls and rises (Tone 2: falling-rising or rising, Tone 4: (rising)-falling-rising, Tone 5: (falling)-rising-falling). Two contours exist as combinations of movements (Tone 1 and 3, and Tone 5 and 3).
Halliday specifies three types of tune endings: contours can end at a low, medium or high level. Finally, O’Connor and Arnold (1973) further extend this inventory to ten different contours by specifying the onset as either high or low (e.g. low-fall and high-rise). The extension of contours follows a trend: contours are increasingly described as complex constellations that show a combination of falling and rising movements. Rises are grouped into low and high rises as early as Palmer (1922). In O’Connor and Arnold (1973) we also find the same distinction for falls. For modeling the encoding of Commitment and Engagement within the British tradition, I would associate different degrees of both variables with different types of contours. Given the range of contours available, I would need to defend my selection of contours that map onto the configurations predicted by my model.

The key difference between the British and the American tradition is that the latter does not conceive of contours as a continuum. Instead, the American tradition decomposes the contour into tonal targets of four different heights. Pike’s (1945) analysis is based on four tonal targets, where Tone 4 is the lowest tone, and Tone 1 the highest. Almost at the same time, Wells (1945) also published an account based on four pitch levels with one difference: the numbering of tone levels is reversed. Conceptually, a pitch contour is treated neither “as a continuum, nor as an atom” (Wells 1945:30). In Wells’ account, there are 19 different contours composed of 4 pitch tones. Despite Pike’s (1945) primacy in conceiving the target-based analysis (Bolinger 1972; Ladd 2015), it was Wells’ ordering of the tone levels that was adopted by subsequent publications in the American tradition. The primary source for consolidating this order is Trager and Smith (1951).
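Since the two numbering systems are easy to confuse, the relation between them can be stated as a one-line conversion. The Python sketch below assumes that Wells' numbering is the exact mirror image of Pike's (1 lowest, 4 highest), which is how the reversal is described above; the function name and the example contour are hypothetical.

```python
def pike_to_wells(level):
    """Convert a Pike (1945) level (1 = highest) to the reversed numbering."""
    if level not in (1, 2, 3, 4):
        raise ValueError("tone levels range from 1 to 4")
    return 5 - level


# A hypothetical fall from the highest level to a lower one.
pike_fall = [1, 3]
wells_fall = [pike_to_wells(t) for t in pike_fall]
print(wells_fall)  # [4, 2]
```

The same function converts in the opposite direction, since mirroring a four-level scale twice returns the original level.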
In their system, minor variation is indicated by four different diacritics; the relation to the preceding tone is marked by three additional diacritics. For each of the four tones, we therefore have the variation visualized in Figure 2.6.

Figure 2.6: Tones and contours in Trager & Smith (1951)

What is important for the following discussion of encoding Commitment and Engagement is that all three of the primary sources of the American tradition (Pike 1945; Wells 1945; Trager & Smith 1951) combine tonal targets into configurations. The traditional divide between approaches based on levels vs. targets is therefore to be taken with a grain of salt. Even though tones are regarded as the primitives in the American tradition, variation and transition receive more attention than is usually assumed. Tonal movements and falling vs. rising pitch are frequently discussed. Transitions are significant, particularly for the terminal pitch. Any consecutive occurrence of tones in the same category is represented as one tone with scope over several syllables. The tone only changes with the final pitch. For instance, a fall from tone 1 to tone 3 is not considered to involve an intermediate tone 2. In the American tradition, then, configurations do play a central role. If I were to model the encoding of Commitment and Engagement within the American tradition, any configurations associated with these variables would need to be decomposable into tonal targets. If Commitment and Engagement were encoded by targets, I would need to explain which adjacent phonological events determine their combination, and which of those were meaningful. The Dutch tradition (e.g. Cohen & ’t Hart 1967; ’t Hart, Collier & Cohen 1990) constitutes in some ways a middle ground between the American and the British tradition. It follows the British tradition in conceptualizing contours as configurations. 
Additionally, the Dutch tradition supplements the phonetic description with a phonological underpinning that distinguishes high from low pitch levels. High and low levels are targets in the sense that a speaker can willfully aim at arriving at that target. Nevertheless, the transition between targets is just as meaningful (’t Hart, Collier & Cohen 1990). Hence, while the focus is on the configuration rather than the target, the Dutch tradition at least recognizes the importance of the latter. The smallest unit of analysis is nevertheless a discrete pitch movement consisting of several tonal targets. ’t Hart, Collier & Cohen (1990) list 5 types of falls and rises. These movements can vary along the variables of timing (early, late, very late), rate of change (fast, slow), and size (half, full). Not all combinations are attested for each movement. The different falls and rises fall into three categories, prefix, root, and suffix, which mark the distributional properties with reference to the acoustic high-point, the root.

In the Dutch tradition, contours are configurations in the sense that the individual sequence of movements is grammaticalized. Rise 1, for instance, can only be followed by Fall A and B, but not by Fall C in both Dutch and British English. These sequences are typically represented by idealized lines linking the different movements, which reduce the infinite variation of contours to a small number of basic patterns, most notably the hat pattern, the valley pattern, and the cap pattern (Collier & ’t Hart 1983). Intonational analysis in the Dutch tradition includes both phonological and phonetic features. The prosodic movements constitute the building blocks of a limited number of configurations, which undergo phonetic variation including a general trend of declination (Cohen & Collier 1982). 
Ladd (2008) goes as far as characterizing the Dutch tradition as the first framework to adopt a phonological structure, since it distinguishes high and low pitch levels (cf. Collier and ’t Hart 1983). This interpretation is contested by proponents of the Dutch tradition, however. ’t Hart et al. (1990) insist that “there are no pitch levels” (75), just rises and falls. The different configurations are subject to a strict pattern of grammaticalization that can predict which contours are allowed in a given language. The key lesson to take away for our purposes is that the phonological underpinnings should translate into a faithful representation of the prosodic details of a contour.

With the emergence of the AM framework (Pierrehumbert 1980), the gap between target- and level-based accounts has arguably become smaller. Though fundamentally target-driven in its conception, this framework recognizes the importance of including phonetic detail in the analysis of prosody. Tones are considered separate from segmental information and are subject to a phonological hierarchy (hence the term Autosegmental-metrical (Gussenhoven 2002)). The inclusion of pitch accents is a further indication that the British and the American tradition both found their way into the AM model (see Ladd 2008; 2015). Pierrehumbert’s (1980) inventory of tones is minimalist: two tones, high (H) and low (L), occur as pitch accents, phrase accents or boundary tones. Pitch accents (T*) are aligned with the stressed syllable of a phonological word; phrase accents (T-) combine with the pitch accent; boundary tones (T%) occur at both ends of a phrase independently of the metric structure. Only pitch accents can occur in pairs; all other tones are monotonal. If the pitch accent is bitonal, the acoustically more dominant tone receives the *-marking. Through location- and context-specific rules, only the tone mappings in Figure 2.7 are allowed. 
Figure 2.7: Possible tonal configurations in Pierrehumbert (1980) [notation simplified by JH]

Beckman & Pierrehumbert (1986) further reduce the inventory of pitch accents to six by eliminating H*+H and that of initial boundary tones to an optional H*. Two theory-driven stipulations undermine the compositional character of Pierrehumbert’s (1980) account. For pitch accents, she reserves H*+L to trigger downstep, which is the AM equivalent of declination in the Dutch tradition. For boundary tones, L% represents the absence of a rise rather than a fall. If a low boundary tone follows the combination of a low pitch accent and a high phrase accent (L* H-L%), the boundary tone indicates that the pitch remains at that level instead of rising further (as in L* H-H%). The latter stipulation is more theoretical in nature since it is often not possible to locate the phrase accent when combined with a boundary tone. This has led researchers to abandon the notion of phrase accent in some language-specific notation rules (e.g. ToDi: Gussenhoven 2000; 2005). This is one example that helps to understand how the AM framework and the ToBI (Tone and Break Indices) transcription system (Silverman et al. 1992; Pitrelli et al. 1994; Brugos et al. 2006), a brainchild of AM researchers and linguists more interested in prosodic phrasing and boundaries (Price et al. 1991), relate to each other. While ToBI serves descriptive purposes only – using language-specific inventories based on the AM framework – transcription practices have had some impact on the theoretical discussions among linguists who adopted the AM framework. Other examples are the treatment of downstep and a reassignment of the H*+L accent to peaks followed by a fall (e.g. Ladd 1983; Féry 1993; Gussenhoven 2004; 2005). Finally, it should be noted that the co-existence of H* and L+H* introduces some practical and theoretical problems. 
Even for experienced analysts, their distinction is acoustically challenging (Pitrelli et al. 1994; Steedman 2014). The fact that the high tone is preceded by a low tone in both accents makes it less clear which tonal targets should be grouped together (Ladd 2008).

With the reduction of the phonetic form to two underlying targets, the matter of how high or low an individual accent is with respect to the speaker’s overall range and the neighboring tones is a matter of realization. Tonal scaling is independent of the architecture of the phonological framework. The same holds for the temporal dimension: tone alignment is primarily a matter of phonetic realization. While considerable effort has gone into investigating the consequences of alignment and scaling for the basic tenets of the theory (see esp. Bruce 1978), the inventory of tones has hardly changed over the years. In terms of the decomposition of a contour, the AM framework is therefore the most radical in its conception. Two tones and their combinations suffice to characterize the entire contour. Diacritics merely serve to indicate the location and prominence of a tone. For an encoding of Commitment and Engagement, this system is most easily compatible with a one-to-one mapping of tone to meaning. We find such proposals in Pierrehumbert & Hirschberg (1990), Bartels (1997), and Truckenbrodt (2011) (which I discuss in Section 2.5). The former distinguishes complete from incomplete meaning; the latter two distinguish assertive from questioning meaning. Couching the configurations of Commitment and Engagement in the AM framework will quickly exhaust the inventory of the phonology (especially if we stick with monotonal pitch accents). A way out of this problem is to combine tones into configurations, which goes against the compositional spirit of the AM framework. We have seen in this section, however, that a strict distinction between configurations and tonal targets is an artificial distinction.
In conclusion, the perception of a clear-cut distinction between targets vs. configurations is somewhat tainted by the current dominance of the target-based AM framework, which builds on the work of the American structuralists. Though that distinction plays a central role in decades of intonational research, the early treatments in the American tradition clearly show an awareness that tones are strongly influenced by their contexts. It is only with the arrival of the AM framework that tonal targets become the singular focus of the phonological treatment; the relation between individual tones is primarily a question of how these morphemes are realized phonetically. It is an empirical question whether listeners will be sensitive to any differences in realization. This question is the focus of the experimental investigations reported in Section 4.2.

2.4.2 The role of the final part of the contour

One additional feature that is relevant for the characterization of intonation is the so-called nuclear tune (alternative terms are nucleus or nuclear tone). The nuclear tune is a section late in the contour that centers around the perceptually most prominent accent. It is sometimes divided into onset, head, and tail. Consider the schematic representations of the intonational contours in (22). For both the declarative (22a) and the interrogative (22b), the nuclear tune is on the word raining, with the head of the tune consisting of the fall or rise toward the second syllable. The onset and primary stress fall on the first syllable. The tail captures the part of the contour that follows the rise/fall.

(22) a. It’s raining.
     b. Is it raining?

While theoretical implications might differ, every framework lends some special attention to the description of the nuclear tune. The only exceptions to this pattern are holistic models of intonation. 
Holistic models, such as Liberman and Sag (1974) and Sag and Liberman (1975), conceive of the entire contour as a singular prosodic form. Since these holistic descriptions of form have been successfully incorporated in other approaches to form (Bolinger 1982; 1986; Gussenhoven 2004), we can ignore them here. Nevertheless, some remnants of this approach keep resurfacing. Contours such as the hat contour, the surprise contour and the contradiction contour are frequently discussed in the intonational literature because of their specific meaning (e.g. Goodhue & Wagner 2018). In all accounts of the British tradition, the nuclear tune plays a central role. Yet, it is never described in isolation. Typically, it co-occurs with a pre-nuclear tune, sometimes also with a post-nuclear tune. In Palmer (1922), for instance, the nuclear tune falls on the most prominent syllable and can be preceded by several heads. It is followed by a tail which can extend over several syllables. In Bolinger (1985; 1989), all contours can occur either as part of the nuclear tune or before it (“pretonically”) and usually occur in combination with each other. The early accounts of the American tradition kept the notion of a nuclear tune very much alive. Pike’s (1945) focus is on so-called primary contours that start with an accented syllable. These primary contours make up the tonal movements, such as falls and rises. Six tonal configurations based on four pitch levels each make up the falling and the rising contour. A fall-rise incorporates nine different configurations. The rise-fall and the level contour have only one tonal configuration each. They can combine with three types of precontours, which gives us 69 meaningful combinations. Hence, both in the American and the British tradition, the nuclear tune receives a special role in the description of intonation. 
The Dutch tradition also distinguishes between different parts of the contour as a result of the alignment of pitch and stress, which “cause[s] the impression of prominence” (’t Hart et al. 1990: 96). Prominence-lending movements are distinguished from non-prominence-lending movements. Distributionally, the different falls and rises fall into three categories, prefix, root, and suffix. Only the root is mandatory and can be recursive. In a way, then, the Dutch tradition recasts the idea of a nuclear tune in terms of prominence. It is only the AM framework that claims to have abandoned the nuclear tune: A phrase accent has the same function independent of whether it occurs as part of the nuclear tune or before it (Pierrehumbert 1980). In reality, the nuclear tune is not entirely dispensed with. Final pitch accents mandatorily combine with the following phrase accent and – at the level of the intonational phrase – the boundary tone. Since there are no restrictions on how phrase accents, pitch accents, and boundary tones can combine, the AM framework allows 24 combinations of sentence-final tones. This makes it possible to map the configurations of the British tradition onto sequences of targets in the AM framework. Table 2.3 lists the different combinations of final pitch accents and edge tones that correspond to a singular pitch movement in the British tradition (adapted from Ladd 2008: 91). The ordering factor is the pitch height of the beginning of the fall and the ending of the rise, respectively. The distinction between high and low movements is based on the onset of the contour marked by the pitch accent, which has consequences for the pitch excursion. The distinction between low and high rises is based on the pitch accent that anchors the movement. Stylized contours are reserved for specific conventions of use, such as the ‘calling contour’ (Pike 1945; Bolinger 1951), with a sense of stereotype or predictability (Ladd 1978). 
Movement | Pitch height (British tradition) | AM framework
Fall | fall | H* L-L%
Fall | low fall (with high head) | H+L* L-L%
Fall | low fall | L* L-L%
Rise | stylized low rise | L* H-L%
Rise | stylized low rise | L*+H H-L%
Rise | stylized high rise (with low head) | L+H* H-L%
Rise | stylized high rise | H* H-L%
Rise | low rise (narrow pitch range) | L* L-H%
Rise | low rise | L* H-H%
Rise | low rise | L*+H H-H%
Rise | high rise (with low head) | L+H* H-H%
Rise | high rise | H* H-H%

Table 2.3 Correspondence of nuclear tunes (British tradition) and tone combinations (AM)

The existence of such correspondence mappings (see also Pierrehumbert 1980: 390ff.), as well as the scholarly effort to demonstrate the compatibility of the two frameworks (Roach 1994), suggests that the combination of final pitch accents, phrase accents, and boundary tones (involuntarily) reintroduced some of the configurational aspects of earlier frameworks. In effect, the nuclear tune is alive and well in the AM framework. It is simply no longer the focus of the architecture. Pitch accents receive notably more attention than edge tones and how the latter combine with the former: “The degree of real independence of pitch accent and edge tone has long been an unresolved issue in AM theory” (Ladd 2008: 101). The nuclear tune as a phenomenon is certainly a unit that can be separated from other parts of the contour. Consequently, it can be associated with a dedicated meaning.

2.4.3 Summary of the existing descriptions of prosodic form

What we see across the different characterizations of prosodic form are the following tensions:
i. a tension between under- and over-specifying the phonological components of the signal,
ii. a tension between location-specific and context-dependent descriptions,
iii. an ambivalent awareness of the role of the nuclear tune, i.e. the final part of a pitch contour.

Each tension results from the challenge of providing meaningful characterizations of tunes that come to terms with the unlimited amount of phonetic variation. Although the levels vs. 
targets distinction has been a helpful categorizer of different approaches, not a single approach completely ignores the relevance of the opposing view. Target-based approaches include context-sensitive rules, and level-based approaches recognize the need to compartmentalize the signal to arrive at a phonological theory. What remains a point of contention is the independence of tonal targets. Moreover, there is no denying that the nuclear tune has a special status in every description of the prosodic signal. The latter holds even for the AM theory, which assigns equal values to pre-nuclear and nuclear pitch accents, but still lends prominence to the nuclear tune due to its acoustic properties and its relation to edge tones.

The key question for any description should be its purpose: Is it to be maximally faithful in representing what is perceived? Is it to be maximally efficient in predicting changes in the signal? Or is it to be maximally helpful for describing its usage? At a phonological level, the AM approach may well be suited to providing the necessary ingredients to help characterize the main events in a contour. But for a description of what is pragmatically meaningful, we require further innovation. None of the existing descriptions lends itself as an optimal candidate for mapping Commitment and Engagement onto its ingredients. Their inventories are either too limited or too large. The key question therefore will be one of perceptual reality: which are the prosodic units that can be perceptually discriminated and are associated with differences in pragmatic meaning? The answer to this question will get us closer to understanding the role of prosody in the encoding of Commitment and Engagement.

2.5 Previous accounts of intonational meaning

In the majority of studies of intonation, the function of prosody receives less attention than its form. There is a noticeable hesitation to link individual contours with specific functions. 
At the same time, no study completely refrains from associating some of those contours with either a SA or a clause type. The underlying reason is that the mappings of the Clause Type and the Fall/Rise Convention as illustrated in Figure 2.8 are considered the default for the distribution and meaning of rises and falls.

Figure 2.8: Form-function mappings according to the Clause Type Convention and the Fall/Rise Convention

We saw several attempts to deal with the tension between direct and indirect mappings of form to function in Subsections 2.2 and 2.3. Although distinguishing SAs is only one function proposed for SFI, the Clause Type Convention and the Fall/Rise Convention are omnipresent, as evidenced by the terms used: e.g., declarative contour or question intonation (e.g. Pierrehumbert & Hirschberg 1990). In this section, I contextualize the propositional function of SFI in the literature on intonational meaning.

One prevalent distinction in the discussion of intonational meaning is that between linguistic and paralinguistic meaning. The latter captures emotive aspects of meaning, such as boredom, joy, or anger. In this context, researchers keep coming back to Bolinger’s (1964) analogy of the human voice and the surface of an ocean. He equates waves with accents, swells with phrasing, and tides with emotions; ripples are accidental, and thus negligible. By extension, we can equate the expression of emotions with paralinguistic meaning: it can modify linguistic meaning by modifying prosodic form (Ladd 2008). On this view, paralinguistic meaning is added to linguistic meaning. However, the line between linguistic and paralinguistic meaning is hard to draw, especially when it comes to intonation, because the same means, such as lengthening, intensity, or pitch excursion, are used to express them. Moreover, what some consider linguistic meaning, others consider paralinguistic meaning. 
The expression of emotions, sometimes referred to as emotive meaning or attitudinal meaning, has sometimes been used synonymously with paralinguistic meaning (O’Connor & Arnold 1961; Liberman & Sag 1974; Sag & Liberman 1975). Pierrehumbert & Hirschberg (1990), however, claim that speaker attitude towards propositions is equally dependent on context for the interpretation of intonation and hence resembles those functions associated with paralinguistic meaning: “Though speaker attitude may sometimes be inferred from choice of a particular tune, the many-to-one mapping between attitudes and tune suggests that attitude is better understood as derived from tune meaning interpreted in context than as representing that meaning itself” (Pierrehumbert & Hirschberg 1990: 284). However, we cannot equate the conversational effects of emotional and propositional attitude. Propositional attitude is often encoded lexically, while emotional attitude often is not. Of course, we can verbalize these emotions, but speakers frequently rely on the addressee to infer from their body language or the quality of their voice whether they are angry, sad, happy, excited or afraid.

An interesting proposal for categorizing the different approaches to intonational meaning is put forth in Grice & Baumann (2007). Figure 2.9 distinguishes between paralinguistic and linguistic meaning, which correlates with a distinction between categorical and gradient expression.

Figure 2.9: Form and functions of intonation according to Grice & Baumann (2007: 14)

The shift from linguistic to paralinguistic function starts with lexical tone languages and ends with emotional states and attitudes. Syntactic structure, information structure and SAs fall in between. Interestingly, the increase in paralinguistic flavor correlates with an increase in gradience. 
For our discussion, the scope of Grice & Baumann’s system exceeds the realm of SFI, but I would like to preserve its spirit for my own categorization of existing approaches. I separate tune-based accounts of meaning from tone-based accounts of meaning, since they come with a conceptual difference. The level of abstraction in tone-based accounts is too high to allow a direct link between form and function. All proposed functions must be abstract as well. Consequently, tune-based meanings tend to be more gradient than tone-based meanings. This trend corresponds with a trend of increasing grammaticalization from top to bottom in Figure 2.10. Where the tune-based approaches and the tone-based approaches converge is the propositional function, which is sentence-wide in scope and distinguishes questions from assertions. While the distinction between questions and assertions appears to be a categorical distinction, we see that at least one of the tune-based approaches also assumes some gradability there (Halliday and Matthiessen 2004). The distinction between grammatical and conventional (where the former cannot be violated for the sake of clarity, but the latter can) runs parallel to that of gradient and categorical notions of intonational meaning.

Figure 2.10: Categorizing intonational meaning

In the following survey of approaches to intonational meaning, I use the above scheme as a tool for categorization. It is organized first by form, then by function, and – as a direct consequence of function – by conversational effect. Intonational form is traditionally either described as a sequence of individual tones or as a tune that cannot be fully decomposed. The tone-based approach has its roots in the American tradition (e.g. Pike 1945; Wells 1945; Trager & Smith 1951); the tune-based approach has its roots in the British tradition (e.g. Palmer 1922; Kingdon 1959; O’Connor and Arnold 1973; Crystal 1969). 
Besides a separation of tune- and tone-based meanings, the presentation of the different accounts follows their order of publication.

2.5.1 Tune-based meanings

In this section, I give an overview of prosodic functions in the British tradition. Prime candidates of the tune-based approach to intonational meaning are the accounts of Liberman & Sag (1974) and Sag & Liberman (1975). In their approach the mapping between tune and function is direct. I also include Gussenhoven’s (2002; 2004) account of intonational meaning. Even though his view of prosody is tone-based, his approach to meaning is predominantly based on the phonetic realizations and the configuration of contours. In this way, it is closer to the tune-based approaches to meaning in spirit because meanings are associated with a specific combination of tones.

2.5.1.1 Diversity in the British school

I begin my survey of intonational meaning with Palmer (1922), a representative of the British school. This is a good starting point to introduce four tendencies common to many approaches that focus on form rather than function. His configuration-based account has four types of tunes. The degree of rise or fall correlates with the speaker’s animation. Table 2.4 summarizes the different functions and their distribution for each tune. The mapping of tunes onto SAs is incomplete.

Form | Function | Distribution
Falling | multiple (depending on the preceding head): fact, tangent, condition, knowledge, surprise | unrestricted
High rising | questioning, lack of finality | statements, commands, polar questions, echo questions
Falling-Rising | concession | statements, commands
Low-rising | reassuring | statements, commands

Table 2.4: Tune meanings in Palmer (1922)

Four issues are apparent. First, some of the contours are associated with more than one meaning. Also, all of the contours occur with more than one clause type. Palmer acknowledges that it is challenging to find one accurate core meaning for all four different contours. 
Secondly, Palmer refrains from providing a core meaning for falling intonation. He specifically dismisses finality, considering the many exceptions he would need to discuss. This differs significantly from tone-based approaches (see Section 2.4.2). Thirdly, meaning is defined in terms that are difficult to incorporate under one central category. Finality is primarily a notion of coherence; questioning refers to primary SAs; and concession and reassuring refer to attitudes and are not based on propositional meaning. While this is not problematic in isolation, it contrasts with other accounts that find equivalent distinctions within the same category of meaning. Finally, and importantly for our discussion, none of the proposed meanings or attested distributions qualify for establishing a direct link between contours and SAs. Statements occur with all types of contours and questions also cut across the distinction between rises and falls. These four issues are characteristic of accounts of intonational meaning in general. Specific to Palmer’s inventory of tunes is that it is too small to allow an association with specific functions in the first place. Yet, even larger inventories of tunes defy a direct link between tune and function. While Bolinger (1958) associates a falling accent (Profile A) with newness or assertiveness, he later avoids the mapping of form and function altogether: “any intonation that can occur with a statement, a command, or an exclamation can also occur with a question” (Bolinger 1989: 98). This might suggest that intonation is inherently autonomous, and hence is subject to different rules than segmental phonology (a conclusion that opposes the striving of the American tradition to relate segmental and suprasegmental form). We see a similar hesitation in the Dutch tradition, which relies on both tunes and tones (see Section 4.1). There is a noticeable reservation toward associating an individual contour with a meaning. 
’t Hart (1984) notes that there are “at least ten times” as many meanings, implications or interpretations as there are different intonational patterns. The Dutch tradition recognizes the importance of intonation for attitudinal meaning but considers it a many-to-many relation between form and function. If there is any link between SAs and intonation in Bolinger’s work, it is not encoded by the form of the contour. The one thing questions may have in common is that they are realized with higher pitch than non-questions. Anything that may be regarded as a default form for a specific function (e.g. a rising intonation as a default for questions) is best described in terms of strong correlations. Accordingly, each profile listed in Bolinger (1989) correlates with several clause types and interpretations. Bolinger therefore clearly avoids providing a tune-specific definition of intonational meaning. Halliday (1967) follows a different strategy: he associates each contour with several different functions. Statements, for instance, occur with each of the five postulated tones; only tone 4 (Rise-Fall-Rise contour) has a unique coloring in reversing the statement. In comparison with Bolinger’s work, Halliday and Matthiessen’s (2004) proposal makes a stronger claim about the relation of form and function. The claim that clause types have an unmarked contour introduces a reason for the strong correlation of the two. If a clause type occurs with a contour not associated with it by default (i.e. a marked contour), the interpretation is paralinguistic in nature.

 | Tone 1: fall | Tone 2: high rise | Tone 3: low rise | Tone 4: fall-rise | Tone 5: rise-fall
declarative | unmarked | reserved | insistent | tentative | protesting
Wh-interrogative | unmarked | tentative | | |
Polar interrogative | peremptory | unmarked | | |
Imperative | command | | invitation | plea |

Table 2.5 Marked and unmarked tones in Halliday & Matthiessen (2004)

The assumption that every clause type has an unmarked (i.e. 
default) contour has far-reaching implications, which is evident from the comparison with Bolinger’s work. For Bolinger, prosodic variation is so substantial that falling intonation cannot serve as a defining feature of declaratives, just as rising intonation cannot be a defining feature of interrogatives. Halliday’s conception of prosodic variation is much more restricted, and a marked contour only adds an attitudinal aspect rather than changing the interpretation of the clause type. While Halliday heavily relies on the grammatical function of the unmarked contour, O’Connor and Arnold (1973) predominantly rely on attitudinal aspects of intonational meaning. Depending on an interaction with clause types, the attitudinal function of intonation finds different forms of expression.

(Prenuclear) + nuclear tune | Meaning
C1: (low) + low fall | unsympathetic, uninterested
C2: (high) + low fall | considered, weighty, categorical
C3: (low) + high fall | interested, lively, surprised
C4: (high) + high fall | neutral, friendly
C5: (high) + rise-fall | impressed, challenging, shrugging off responsibility
C6: (low) + low rise | reserved, cautious
C7: (high) + low rise | reassuring, patronizing
C8: (high) + high rise | questioning
C9: (high) + fall-rise | implicative
C10: (high) + high fall + low rise | sympathetic, persuasive, plaintive

Table 2.6 Attitudinal meaning in O’Connor & Arnold (1973)

The contours listed in Table 2.6 are not restricted in their distribution. Their expressed attitudes can occur with statements, questions, commands and interjections. For each of these SAs, O’Connor and Arnold (1973) list an individual meaning for each contour, which suggests an interaction of SA and contour meaning. Several inconsistencies within their description stand out. The meaning of C8 is described in propositional terms (“questioning”) rather than attitudinal terms (e.g. uncertain). 
C8 is also the only contour that seems (grammatically) context-sensitive: If C8 occurs with statements, they become questions; if it occurs with commands and interjections, they question an utterance of the addressee. It seems doubtful that none of the other contours change their meaning if they occur with different clause types. Nevertheless, some contours are labeled as the “neutral” choice for particular environments: C1 is a neutral contour for the end of a list or in a series of short questions; C4 is neutral across the board; C6 is neutral in question tags and non-final elements of a list; C7 is neutral for polar questions and fronted subclauses; C8 is neutral for polar questions in American English. How these neutral contours go together with the ascribed meanings (e.g. C7 – reassuring, patronizing – with polar questions) is not addressed explicitly.

One thing is evident in this brief survey of the different approaches in the British school: increasing the inventory of meaningful tunes does not lead to a one-to-one mapping of contour and meaning. In every account, the list of ascribed meanings is longer than the list of tunes proposed. The variation between and within the different accounts is too great to allow a coherent and concise mapping of form onto function. Another conclusion to be drawn from this comparison is that a distinction between linguistic and paralinguistic meaning is difficult to maintain. Paralinguistic meaning has an impact on the shape of a contour independent of how many linguistic functions are associated with it. Halliday and Matthiessen’s (2004) association of marked mappings with paralinguistic meaning is an interesting proposal to explain how linguistic and paralinguistic meaning are related. But it ignores what other accounts assert repeatedly: marked mappings can transform the conversational effects. Intonation can change an assertive into a questioning interpretation independent of the clause type. 
2.5.1.2 The Holistic approach by Liberman and Sag

A somewhat extreme version of a tune-based approach to meaning is the work by Liberman & Sag (1974) and Sag & Liberman (1975). Beyond the assumption of a fixed configuration surrounding the nuclear pitch accent, they assume that the entire contour of a sentence can be meaningful. Numerous authors rely on such holistic analysis for isolated phenomena, such as the calling contour and the children’s chant (e.g. Bolinger 1989; Pierrehumbert & Hirschberg 1990; Ladd 2008). The inventory of Liberman & Sag (1974) and Sag & Liberman (1975) is not intended to be comprehensive; it merely consists of four different contours. The contradiction contour is characterized in its pragmatic function without relying on Jackendoff’s (1972) notion of contrast. Liberman & Sag (1974) show that contrastive and contradicting meaning can combine and are then encoded in sequence (with a contrastive rise-fall added to the contradiction contour).

Figure 2.11: Contradiction contour (in black) with optional contrast (in grey)

Sag & Liberman (1975) extend the approach of mapping the entire contour onto a function in their analysis of wh-questions. Similar in shape to the contradiction contour, the tilde contour forces a question interpretation. Hence, the SA interpretation is direct, rather than indirect (cf. Searle 1974). The non-questioning meaning of the contradiction (with or without the contrasting rise-fall) was referred to by Searle as an indirect interpretation. Such an indirect interpretation is unavailable for the tilde contour, which provides us with a one-to-one mapping of form and function here.

Figure 2.12: Tilde contour

Two additional contours can serve to encode an indirect SA: Both the hat contour and the surprise contour change the interpretation of a wh-question. The former adds a suggestive, rhetorical, or negative flavor; the latter adds a flavor of surprise or marks redundancy.
Figure 2.13: (a) Hat contour (with optional rise) and (b) surprise/redundancy contour

Strikingly, the last two contours, which can encode indirect SAs, can also occur with regular wh-questions. An indirect SA cannot be enforced. The only contour that can enforce a particular interpretation, and hence allows a direct mapping of form and function, is the tilde contour. For polar questions, Sag & Liberman (1975) propose that this function is fulfilled by a final rise.

2.5.1.3 Functional Meaning: Gussenhoven’s universal codes

Gussenhoven (2002; 2004) relates phonetic properties of prosody to both universal and language-specific functions by expanding on Ohala’s (1983; 1984; 1994) notion of the frequency code. His account combines functional, propositional and conversational meaning. Only language-specific features are considered as morphemes; universal features are phonetic and are considered to be reflexes of attitudes. This comprises a theory of paralinguistic meaning, which contrasts with linguistic elements, such as H% encoding interrogativity or non-finality. Paralinguistic meaning is expressed in three biologically determined variables: the frequency, the effort and the timing of production relative to breath. These codes can make the same contour sound very different depending on whether the speaker is male or female, calm or emotional, and at the beginning or at the end of a breath group. The frequency code reflects the articulatory effects of anatomy, the effort code reflects the manner of articulation (i.e. the amount of energy invested), and the production code reflects the effects of our egressive-pulmonic system as a whole, which in turn results in different breath groups and a default to start high and to end low in each contour. Just as in the wave analogy (Bolinger 1964), the effects of the three biological variables add variation to the tonal morphemes in intonation. Only these effects are universal. 
Importantly, they are controlled by the speaker, which means that they can also be suspended and/or substituted by lexical or phonetic forms (e.g. clefts and peak delay). Both frequency and effort have three types of effects: affective, informational and grammatical interpretations. Grammatical interpretations are (mostly) language-specific. The production code only has informational interpretations. Table 2.7 summarizes the effects of each code and its interpretation:

Frequency code (pitch height): affective: dominant vs. submissive/polite; informational: certain vs. uncertain; grammatical: e.g. declarative vs. interrogative
Effort code (pitch excursion): affective: surprise, helpful, authoritative, pleasant; informational: more or less prominent; grammatical: e.g. background vs. focus, obligedness
Production code (boundary tone): new vs. old, continuation vs. finality
Table 2.7: Intonational Codes in Gussenhoven (2004)

The underlying assumption in Gussenhoven’s approach is that physiological aspects of speech can be exploited for communicative purposes. For the production code, a reversal of the natural declination of pitch height (due to a decrease in subglottal air pressure) results in a continuation interpretation. Likewise, it would follow naturally to present new topics, which correlate with high accents, at the beginning of a breath group. The general trend to place new information late in a breath group and mark it with high accent goes against the natural trend of declination. The combination of the different codes leads to an interesting possibility to formally disambiguate intonational functions. An ambiguity of H% for questionhood (arising from uncertainty) and continuation in Germanic languages, for instance, can be explained by a difference in cause: the former arises from the frequency code, the latter from the production code. In Gussenhoven’s (2002; 2004) account, contours – i.e. the combination of individual tones – constitute forms that serve to negotiate shared beliefs. 
There are three grammaticalized meanings:
i) H*L serves to add and commit to a belief,
ii) H*L H% serves to select information that is considered shared beliefs, and
iii) L*H serves to request whether information belongs to a set of shared beliefs.

The latter two meanings are not derivable from the biological codes above. Interestingly, these functions can be oriented toward both interlocutors, which is summarized in Table 2.8. A falling contour (H* L L%), for instance, adds a proposition to the cg and therefore supplies information to the addressee. It can also serve as an inference when it relates to what is already in the cg.

Contour | Function | Speaker Effect | Addressee Effect
H*L L% | Adding | Inference | Supply info
H*L H% | Selecting | Realization | Reminder
L*H H% | Testing | Request info | Challenge to respond
Table 2.8: Tune meanings in Gussenhoven (2002)

In a way, these postulated meanings correspond to the question/assertion distinction (categorized as propositional meaning (i.e. referring to SAs) in my overview in Figure 2.10) with an extension of the effects to the addressee and the addition of a third effect that draws on the negotiation history (realization/reminder). The negotiation of belief sets, which is signaled by the contours listed in Table 2.8, is complemented by a classification of information provided, which is encoded through the accent distribution across a sentence. Gussenhoven distinguishes between eventive, definitional and contingent sentences, which are exemplified in (23). An absence of pitch accents on the predicate marks eventive sentences, which describe changes in the context of the interlocutors. Statements that do not lead to a change are labelled as definitional and have accented predicates. Finally, contingent utterances request a validation or the relevance of an update. They bear accents on verb, object and the negator. 
The three types of belief updates are exemplified by the three responses to Speaker A in (23), reprinted from Gussenhoven (2002). Capital letters indicate the presence of pitch accents whose distribution distinguishes the three types of updates.

(23) A: What’s that scuffle?
     B: Our CUSTomers aren’t admitted! (eventive)
     B’: CUSTomers aren’t adMITted. (definitional)
     B’’: Our CUSTomers AREN’T adMITted! (contingent)

In sum, Gussenhoven assumes that both pitch accents and nuclear tunes are meaningful. The pitch accents specify the type of cg management; the nuclear tunes indicate the agenda for speaker and addressee. Both aspects are grammatical uses of the available biological codes.

2.5.1.4 Summary of tune-based approaches to intonational meaning

Tune-based meanings can only be subsumed under one class of approaches when we consider the form to which meaning is ascribed. The postulated meanings vary considerably and include both linguistic and paralinguistic functions, often concurrently. The meanings are more often gradient than categorical, unless predominance is given to propositional meaning. An unambiguous mapping of form to function is virtually absent, which means that intonational meaning either lacks a clear definition or requires some pragmatic inferencing. In most of the approaches reviewed so far, the mechanisms responsible for such inferencing are not identified. We see in the following section that (in correlation with a smaller inventory of tonal units) these mechanisms are more prominently discussed in tone-based accounts of intonational meaning. One approach that stands out in the previous discussion is Gussenhoven’s proposal of grammaticalized meaning in English, in that he provides a clear description of how inferencing can be guided through intonation. 
2.5.2 Meanings of Tones

This section reviews tone-based approaches to intonational meaning, with a notable difference between publications in the American tradition and those in the later Autosegmental-Metrical (AM) framework. The AM framework reduces the inventory of the American tradition from four to two tones, which makes it impossible to assign specific meaning to the abstract forms. Intonational meaning therefore requires an act of inferencing, which receives considerably more attention in AM-based accounts than elsewhere. The American tradition, reviewed first, is much closer in its conception of intonational meaning to tune-based approaches than the work of linguists working in the AM framework, such as Pierrehumbert & Hirschberg (1990), Bartels (1997), Truckenbrodt (2012), Steedman (2004; 2008; 2014) and Westera (2014; 2017).

2.5.2.1 Diversity in the American tradition

Intonational meaning receives little attention in the American tradition. The existing accounts are mostly descriptive in nature. The primary motivation of the American tradition was to treat suprasegmentals with the same analysis as segmentals. According to this tradition, both segmentals and suprasegmentals form morphemes that combine into an utterance (see e.g. Wells 1945:28). The idea that segmentals and suprasegmentals are of different nature lends credit to a distinction between linguistic and paralinguistic meaning. Only Pike (1945) provides some suggestion about the meaning of the different combinations of tone levels; Wells (1945) focuses on the phonetic technicalities. A notable exception in Wells’ account is that he associates the highest tone in his inventory (level 4) with the notion of surprise. His examples cut across different clause types; they include constructions such as polar interrogatives, wh-interrogatives, conditionals, declaratives, exclamations, imperatives, and a number of fragments. 
In passing, Trager & Smith (1951) note that questions and assertions can have the identical distribution in tonal targets; association of tones and constructions is – rather interestingly – left to the study of syntax, not phonology. In Pike (1945), the direction of a tone movement determines the core meaning of a contour. The difference in tone levels determines the degree to which that meaning applies, although some level differences are expressed in terms of secondary meaning, too. For instance, a fall generally expresses finality. If the falling movement occurs from the extra-high level ‘1’ to the low level ‘4’, the speaker expresses more contrastiveness than if the movement occurs just from the high level ‘2’ to the low level ‘4’. The different contours postulated for a fall-rise combine the meanings associated with falls and rises. The meaning is therefore both compositional and gradable:

Movement | Meaning | Levels
fall | Degree of finality & contrastiveness | 2-4, 1-4, 3-4, 2-3, 1-3, 1-2
rise | Degree of incompleteness | 3-2, 3-1, 4-3, 4-2, 4-1, 2-1
fall-rise | Implication/deliberation + incompleteness | 2-3-2, 1-3-2, 2-4-2, 1-4-3, 3-4-3, 2-4-2, 1-4-2, 2-4-1, 1-4-1
rise-fall | Repudiation | 4-3-4
level contour | Strong implication | e.g. 3-3
Table 2.9 Tune meanings in Pike (1945)

We see from Table 2.9 that the mapping of form and function is similar to proposals in the British tradition. Although the form is broken down into levels, their meaning is provided for combinations, which correspond to the inventories of the early British proposals. Meaning is described in terms that conflate conversational and pragmatic meaning (see Figure 2.10), as for the meanings ascribed to the different combinations that capture a fall-rise: incompleteness is primarily a marker of coherence; implication requires inferencing by the addressee. 
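The compositional and gradable character of Pike's system can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function names (`classify_movement`, `describe`) are hypothetical, the meaning labels are copied from Table 2.9, and the "degree" is approximated as the distance between the highest and lowest level in the sequence, which is one possible reading of "difference in tone levels" rather than Pike's own measure.

```python
# Pike-style levels as described in the text: 1 = extra-high ... 4 = low.
# A larger level number therefore means a LOWER pitch.

# Core meanings per movement, copied from Table 2.9.
MOVEMENT_MEANINGS = {
    "fall": "finality & contrastiveness",
    "rise": "incompleteness",
    "fall-rise": "implication/deliberation + incompleteness",
    "rise-fall": "repudiation",
    "level": "strong implication",
}


def classify_movement(levels):
    """Reduce a level sequence such as [2, 4, 2] to a movement label."""
    if len(levels) < 2:
        raise ValueError("a contour needs at least two level targets")
    directions = []
    for previous, current in zip(levels, levels[1:]):
        if current > previous:      # level number goes up: pitch falls
            direction = "fall"
        elif current < previous:    # level number goes down: pitch rises
            direction = "rise"
        else:
            direction = "level"
        # Collapse repeated steps in the same direction into one movement.
        if not directions or directions[-1] != direction:
            directions.append(direction)
    return "-".join(directions)


def describe(levels):
    """Return (movement, core meaning or None, degree).

    The degree is an ASSUMPTION of this sketch: the span between the
    highest and lowest level touched by the contour.
    """
    movement = classify_movement(levels)
    degree = max(levels) - min(levels)
    return movement, MOVEMENT_MEANINGS.get(movement), degree


if __name__ == "__main__":
    for contour in ([1, 4], [2, 4], [2, 4, 2], [4, 3, 4], [3, 3]):
        print(contour, describe(contour))
```

On this reading, the 1-4 fall and the 2-4 fall share the same core meaning and differ only in degree (3 versus 2), which mirrors the contrastiveness example given above.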
The choice of four tones in the American school allows for an association of the same tones with additional layers of meaning depending on their tonal environment. Trager & Smith (1951), for instance, who follow Wells’ hierarchy of tones, claim that the shape of the contour in a wh-question corresponds to whether it is asked politely or insistently – the former starting with a higher tone ‘3’ and continuing with lower tone (‘3-1-1’) than the latter (‘2-2-3-1’). Other paralinguistic elements can be expressed (cross-linguistically) by duration and extension. The use of paralinguistic pitch is presented as clearly distinguishable from linguistic pitch; the former is added to the latter. The expressions of paralinguistic pitch in English include the so-called vocal qualifiers, such as pitch height, duration, and intensity, which are all conceived independently from tone levels. Intensity affects absolute values of stress, not their relative values. Pitch height, a paralinguistic measure, is distinguished from pitch range, a linguistic measure, and marks a range that is either higher or lower than what the authors consider to be normal pitch. Pitch duration is regarded as a universally relevant measure. In theory, this provides us with clear means to distinguish forms that encode paralinguistic meaning from forms that encode linguistic meaning. Yet, these means depend on a knowledge of the “normal” realizations of the tone levels. Factoring out within- and between-speaker variation has proven to be a very complex task (see Ladd 2008, chapter 5 for an overview).

2.5.2.2 Pierrehumbert & Hirschberg (1990): coherence and in/complete beliefs

Accounts of intonational meaning that are couched within the AM framework are predominantly compositional in nature. Pierrehumbert & Hirschberg (1990) take the idea of compositionality to an extreme: every tone, which is conceived of as a morpheme, is meaningful by itself. 
Table 2.10 summarizes the form-function mapping of the individual accents and tones:

Tone | T* | T- | T%
L | old | not part of larger unit | not forward-looking
H | new | part of larger unit | forward-looking
Table 2.10: Intonational meaning in Pierrehumbert & Hirschberg (1990)

While Pierrehumbert & Hirschberg (1990) describe the function of intonation as relating to shared beliefs, this only holds for the meanings they ascribe to pitch accents. The meanings of edge tones are described in dialogical terms: they signal whether or not the current unit relates to the following phrase (phrase accent) or utterance (boundary tone). For those pitch accents derived of two tones, compositionality breaks down: While the starred tone determines the primary meaning of the accent (discourse-new vs. discourse-old), the secondary meaning is determined by the combination of the starred and the unstarred tone. The individual meaning of the unstarred tone is lost. Correspondingly, bitonal pitch accents are better interpreted as configurations rather than target tones. If the pitch accent is a rise, it invokes a scale. Büring (2016) visualizes this mix of tone- and tune-meaning with Table 2.11.

Tonal anchor point | Movement: none | Movement: rise (L+H) | Movement: fall (H+L)
Meaning | – | scale | inference
H* (new, predication) | H* | L+H* | H*+L
L* (no predication) | L* | L*+H | H+L*
Table 2.11 Overview of tune- and tone-meaning of Pierrehumbert & Hirschberg (1990) in Büring (2016)

It is not only at this level where compositionality breaks down. Dainora (2001; 2002) shows that a number of sequences of pitch and phrase accents are fully grammaticalized. Similarly, Büring (2016) proposes that their combinations may be conditioned by construction (not clause type). Büring (2016) provides the following pair of rhetorical questions from German as an illustration for this construction-specificity. In example (24), the wh-pronoun occurs at the beginning of the sentence, which is the default word order for wh-questions. These typically occur with a fall. 
In example (25), it is the auxiliary that occurs sentence-initially, which is the default word order for polar interrogatives. These typically occur with a rise. Hence, with a similar meaning, the construction seems to dictate the shape of the contour.

(24) Was kann ich dafür?
     what can 1SG DEICT-for
     ‘How is that my fault?’

(25) Kann ich was dafür?
     can 1SG what DEICT-for
     ‘Is that my fault?’

An obligatory relation between constructions and contours was first proposed by Leben (1973). Support for such a principle comes from the two trends that a high boundary tone is typically preceded by a low pitch accent, while a low boundary tone is typically preceded by a high pitch accent. While we saw exceptions to these trends in Chapter 1 (e.g. high-rising declaratives), their frequent occurrence suggests that the combinations of pitch accents and edge tones follow certain conventions. These conventions cast doubt on a free combination of pitch accents and edge tones. Moreover, the existence of such conventions re-introduces the relevance of the nuclear tune for intonational meaning, which Pierrehumbert originally abandoned by giving nuclear and pre-nuclear accents the same phonological value.

2.5.2.3 Bartels (1997) and Truckenbrodt (2012): primary speech acts

Bartels (1997) and Truckenbrodt (2012) reduce the AM inventory of pragmatically meaningful tonal morphemes from six to two. That is, only two tones per account are meaningful. The core components of these accounts of intonational meaning are summarized in Table 2.12.

Tone | Truckenbrodt (2012): T* | Truckenbrodt (2012): T- | Bartels (1997): T- | Bartels (1997): T%
L | - | - | asserting | -
H | new | questioning | - | continuation dependence
Table 2.12: Intonational meaning in Truckenbrodt (2012) and Bartels (1997)

Aware of the exceptions to the Clause Type Convention, Bartels (1997) refrains from a direct mapping of clause type to SA. Both question and statement are defined in functional terms: questions are interpreted as speaker uncertainty. 
Statements are simply seen as the opposite of questions and therefore encode a lack of uncertainty. L- (encoding assertiveness) is a sufficient, but not an exclusive marker of statements. Following Stalnaker (1978), uttering an assertion is defined by Bartels (1997) as having the consequence of adding a proposition to the set of mutual beliefs (through a reduction of incompatible possible worlds in the belief set). With a comparison of the sequences H* L- L% and H* L- H%, which correspond to a fall and a fall-rise in the British tradition, Bartels argues that out of the boundary tones, only the high tone bears meaning. The finality effect in H* L- L% is triggered by the preceding phrasal accent (L-), which has an assertive function. In contrast, continuation dependency in H* L- H% is encoded by the boundary tone (H%), which overrides the finality effect of L-. Consider Bartels’ (1997: 98f.) examples below, where the dialogue has a sense of finality in (26), but not in (27).

(26) A: Why won't you come to Mary's house with me?
     B: I find Mary's dogs unbearable (H* L-L%).

(27) A: What's your opinion - can we leave the car parked here?
     B: I think it's alright (H* L-H%).

Although both responses in examples (26) and (27) are statements, only (27) invites continuation. Across interrogatives and declaratives, H% may invite pragmatic inferencing, such as scalarity or restrictiveness. By associating each tonal morpheme with an abstract function, the assertive meaning of L- can combine with the continuation dependency meaning of H%. Hence, Bartels’ (1997) account is one that bridges the gap between propositional and dialogical meaning.

Truckenbrodt (2012: 2049) proposes a function for one pitch accent and one phrase accent (28).

(28) a. H* marks a salient proposition as new in the sense of an instruction by the speaker to add the proposition to the cg of speaker and addressee.
     b. H- marks a salient proposition as put up for question by the speaker.
Since old information is the default from an information-structure point of view (its content is already grounded), L* can be assigned the opposite meaning of H*, which is associated with new information. L-, however, does not have the opposite meaning of H-. Truckenbrodt’s criticism of associating L- with an assertive meaning is mainly based on his observation that H* typically introduces assertiveness before the occurrence of the edge tone. Notice that this is the same logic that Bartels (1997) employed for refraining from associating a specific meaning to H- (see above). Whenever H* is not directly assertive and the sentence is still interpreted as a statement, Truckenbrodt questions whether L- must necessarily provide the assertive meaning; it could be encoded or it could be provided by contextual information.

If we combine the different functions, we have a full set of morphemes corresponding to each tone. The proposed functions combine the Clause Type Convention and the Fall/Rise Convention and overcome the construction-dependency problem inherent to Pierrehumbert & Hirschberg (1990):

Tone | T* | T- | T%
L | old | assertion | finality
H | new | question | continuation
Table 2.13: Intonational meaning as a result of combining Bartels (1997) and Truckenbrodt (2012)

Even a proposal like the one in Table 2.13 does not resolve the core problem of intonational meaning: because the different configurations map onto several phenomena, we still need to rely on pragmatic inferencing to disambiguate between the possible interpretations of the combination of tones. While the preceding discussion of intonational meaning shows that a one-to-one mapping of form and function seems impossible to accomplish, the interpretation of the tonal morphemes should depend on contextual information if they are defined in absolute terms, such as questioning. 
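The purely literal composition implied by Table 2.13 can be sketched as a simple lookup. The Python below is a minimal illustration, not an implementation of either author's account: the function name `literal_reading` and the string format of the contours are assumptions of this sketch, it handles only monotonal pitch accents, and it deliberately models no context or pragmatic inferencing.

```python
import re

# Tone-by-position meanings copied from Table 2.13.
# Keys are (tone, diacritic): "*" pitch accent, "-" phrase accent,
# "%" boundary tone.
TONE_MEANINGS = {
    ("L", "*"): "old",
    ("H", "*"): "new",
    ("L", "-"): "assertion",
    ("H", "-"): "question",
    ("L", "%"): "finality",
    ("H", "%"): "continuation",
}


def literal_reading(contour):
    """Return the context-free reading of a contour such as 'H* L-H%'.

    Only monotonal targets are recognized; the leading tone of a bitonal
    accent (e.g. the 'L+' of 'L+H*') is ignored by this sketch.
    """
    tones = re.findall(r"([LH])([*\-%])", contour)
    return [TONE_MEANINGS[tone] for tone in tones]


if __name__ == "__main__":
    # The two contours from Bartels' examples (26) and (27).
    print(literal_reading("H* L-L%"))  # ['new', 'assertion', 'finality']
    print(literal_reading("H* L-H%"))  # ['new', 'assertion', 'continuation']
```

The lookup returns exactly one label per tone, so any interpretation that departs from these labels has to come from outside the table, which is where pragmatic inferencing enters.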
The proposed mappings also face another problem: of all the contours marked by H-, only a polar interrogative is an uncontroversial case for a question interpretation. Equally problematic is the fact that L- occurs with both questions and answers. Finally, H-L% seems to completely fail the system: by stipulating that the boundary in this sequence marks the absence of a rise rather than a fall, Pierrehumbert (1980) has made it impossible to derive a coherent, compositional account of intonation in the AM framework. For instance, it is difficult to see how a plateau contour or a calling contour (H* H-L%) can be conceived of as a question that is marked as complete (which is their literal meaning based on their tonal configurations). Their conventions of use qualify neither for a question interpretation nor for a completeness interpretation. Considering these problems, it seems to make sense for Bartels (1997) and Truckenbrodt (2012) to limit the inventory of tonal morphemes and to supplement their functions with context-induced interpretations. There are both theory-internal (according to the AM framework) and empirical reasons (in light of the non-compositionality of constructions, such as the plateau contour) to do so. A reduction of the morpheme inventory is made possible in Bartels’ (1997) account by relying on salient propositions, which help to identify the propositional alternatives in non-canonical questions. Truckenbrodt (2012) adopts this concept and extends its use to assertions. On a theoretical level, salient propositions serve to maintain the core meaning attributed to their tonal morphemes in the context of mappings that violate the Clause Type Convention and the Fall/Rise Convention. Though promising as a concept, an unconstrained use of salient propositions for interpreting SFI will prove to be inadequate for characterizing intonational meaning. The exact mechanisms of pragmatic inferencing remain underspecified. 
Saliency is defined as being “part of [the listener’s] focus space” (Bartels 1997: 112), a notion Bartels adopts from Grosz and Sidner (1986), or as a possible presupposition of a question. Crucially, this notion of a salient proposition is linked to assertiveness. Salient propositions correspond to surface propositions in statements and to speaker-implied alternatives in questions: “Whereas in statements it is (generally) the surface proposition that is being asserted in this sense, in questions bearing L- it is a salient speaker-presupposed proposition derivable by the addressee” (Bartels 1997: viii). Consider the following falling interrogative from Bartels (1997), where the notion of saliency explains how an interrogative sentence form is compatible with an assertiveness operator (encoded by L-).

(29) a. Did you buy it? (H* L-L%)
     b. alternative proposition: 'You bought it, or you didn't (buy it).'
     c. alternative proposition: 'You bought it, or you rented it.'

Even though Bartels (1997) does not provide any context that helps determine what is salient here, the presence of falling intonation suggests that (29) is not a real question that seeks to assess the truth of a proposition. Her proposal makes clear that L- is assertive. Hence, the question points to alternatives to the surface proposition asserted. The key question, however, is how to determine the size of the set of propositional alternatives.

The difficulty of identifying the propositional alternatives is even more evident in Truckenbrodt’s use of the notion of salient propositions, which is disassociated from assertiveness. Consider now his discussion of Pierrehumbert’s (1980) example in (30), where Liberman approaches a receptionist. The example is taken to exemplify a high-rise declarative, which has multiple contexts of use.

(30) My name is Mark Liberman?

Truckenbrodt (2012) identifies “My name is Mark Liberman, and are you expecting me, or, am I in the right place?” 
as the salient proposition that is ‘put up for question’ by the rise here. Yet, the set of alternative propositions cannot be restricted to two salient propositions; in this case, the expected arrival and the correct location. Other salient propositions include those given in (31).

(31) My name is Mark Liberman, and…
     a. … did anybody leave a message for me?
     b. … do you recall our conversation on the phone earlier?
     c. … do you remember me from high school?
     d. … does this mean anything to you?

These alternative propositions can all be salient if we introduce – or accommodate – a few details to the context of the utterance. Yet, there is no way to restrict what can be accommodated here, which makes it virtually impossible to identify a uniquely salient proposition. Consequently, interpreting SFI depends on unconstrained contextual resolution. This is even more the case for Truckenbrodt than for Bartels because he extends the relevance of salience to assertions. Referring to salient alternatives supplies the dynamic element required for drawing on mutual knowledge based on contextual information. The difficulty of identifying a uniquely salient alternative, however, points back to the core problem of pragmatic inferencing: it is presently unclear how to describe its mechanisms. Identifying what can be inferred needs to be systematically described in order to understand how it contributes to the interpretation of a specific contour. While the notion of questioning is well-defended in Truckenbrodt’s proposal, and the proposals for asserting and questioning are well defined in Bartels (1997) and Truckenbrodt (2012), their accounts can only be as good as their definition of what is a salient proposition.

2.5.2.4 Pragmatic inferencing in Steedman’s work on intonation

A concrete proposal of how pragmatic information complements prosodic information is Steedman’s (2000; 2008; 2014) work on intonational meaning. 
Steedman assigns specific roles to a small inventory of tones that communicate who is responsible for the future or present state of the cg and how the current utterance relates to it. One important drawback of this account is that it conflates the notions of cg management and cg content. I demonstrate that this conflation has notable consequences for how we conceive of the role of pragmatic inferencing for intonational meaning. I also review Steedman’s choice to rely on perceptually small differences in tone sequences as the fundamental cue. This choice contrasts with many other accounts that attribute the meaning he postulates to tonal configurations rather than individual tones.

The basic assumptions of Steedman’s account are deceptively simple. Intonational meaning is defined in terms of a rheme vs. theme distinction (Halliday 1967). This distinction is defined in terms of cg updates and suppositions. Rhemes mark additions to the cg; themes mark beliefs that are already assumed to be in the cg. The anchoring tone in the pitch accent thereby indicates whether the speaker signals success or failure of the update or supposition.

Operation | success | failure
thematic suppositions | L+H* | L*+H
rhematic update | H*, H*+L | L*, H+L*
Table 2.14: Intonational meaning in Steedman (2000; 2008; 2014)

Furthermore, boundary tones determine whether it is the speaker (L%) or the addressee (H%) who is responsible for success or failure of these operations. Following the previous format of representing tonal morphemes, we can therefore summarize Steedman’s approach with Table 2.15.

Tone | T* | T-, T%
L | failure in supposition or update | speaker agency
H | success in update | addressee agency
Table 2.15: Tonal morphemes in Steedman (2000; 2008; 2014)

The basic idea is that this interaction of pitch accents and boundary tones renders functions such as commitment, uncertainty, politeness and even questioning epiphenomenal. They are all implicatures of rheme updates and theme suppositions. 
Hence, there are four elements that contribute to intonational meaning: accenting, which highlights alternatives; a theme/rheme distinction, which marks the current and future state of the cg; a claim of presence or absence of beliefs in the cg; and the agency of the interlocutors. Let us see how this process works for a canonical declarative.

(32) A: What’s new?
     B: (It’s RAINING)
              H* L-L%

In example (32), the entire proposition corresponds to a rheme, which makes the utterance an update to the cg. The accent falls on raining, which marks the alternative to other weather conditions. The combination of high pitch accent and low boundary tone tells us that the speaker is responsible for a successful cg update. Since example (32) is an all-rheme utterance, it is irrelevant what is already in the cg. If we depart from all-rheme utterances, the benefits of Steedman’s account are evident: his inventory can mark what is already assumed to be a shared belief (L+H*) and what constitutes the update (H*). H% in (33) signals that it is the addressee’s responsibility to update the cg with new information. It would be the speaker’s responsibility if the proposition had not been given already. In (33), the addressee needs to include the goalkeeper in a set of beliefs which already contains the information that someone scored a second goal.

(33) A: I know that RONALDO scored the FIRST goal, but who scored the SECOND?
     B: (The goalkeeper) (scored the second goal).
              H*                    L+H*    L-H%

Unfortunately for our purposes, Steedman does not provide an analysis of interrogatives. To make it work for interrogatives, some of the meanings need to be redefined. For one thing, it is difficult to determine the rheme/theme status of falling interrogatives, such as example (34). Falling interrogatives can be asked out of the blue. Hence, we cannot assume that (34) draws on shared beliefs.
Falling interrogatives cannot be updates either, unless we consider the question itself an update to the discussion. Furthermore, the combination of L* and H% in (34) predicts that the addressee is responsible for not updating the cg with some rhematic information.

(34) (Is it raining)?
             L* H-H%

Steedman’s account correctly identifies the addressee as the activated interlocutor. Yet rather than characterizing the question as a failure of a rhematic update, it should be characterized as an activation of the addressee to provide an update in the next turn. The possible out-of-the-blueness of (34) rules out any previous failure on the part of the addressee.

More promising is the application to falling wh-interrogatives. The low boundary in (35) signals that the speaker considers the information that somebody scored to be part of the cg already. This leaves the rhematic update identified by the high accent on who, which is the missing information.

(35) (Who)  (scored)?
      H*     H* L-L%

While Steedman’s account cannot specify the responsibility for the rhematic update – that is, filling in the missing information – it succeeds in marking the proposition as a theme. The latter matches the conversational effect of a wh-interrogative, which presupposes the truth of the proposition.

Besides the lack of integrating interrogatives, Steedman’s work on intonation faces three serious problems: i) the conflation of cg content and update, ii) a considerable overlap between the four inferencing mechanisms, and iii) a reliance on a tonal distinction that faces both perceptual and theoretical challenges. I will briefly comment on each of these problems. Firstly, a conflation of cg update and content fails to acknowledge that the former always applies at the propositional level, while the latter singles out individual phrases. As a consequence, it is impossible to predict where Steedman’s notion of speaker/addressee agency applies.
It is decided on an individual basis whether the agency applies to the rheme or the theme. Secondly, the failure to distinguish cg management and update leads to an overlap of different inferencing mechanisms. Steedman distinguishes between rhematic updates and thematic suppositions on the one hand and between the presence or absence of a belief in the cg on the other. A thematic supposition crucially depends on what is present in the cg, however. Similarly, highlighting the presence of alternatives typically correlates with the introduction of a rheme. Hence, these distinctions are claimed to fulfill a double duty that can actually be subsumed under one (the theme/rheme distinction). And lastly, Steedman primarily relies on a tonal distinction that is hard to perceive and often shows little agreement among trained annotators (Pitrelli et al. 1994). Steedman blames this on an ambiguity in the ToBI manual, but I still consider the controversy around the H* and L+H* distinction too risky to make it the basis for distinguishing between rheme and theme. This assessment receives support from the fact that Gussenhoven (1984) has associated this tonal distinction only with a difference in emphasis.

2.5.2.5 Westera’s account of intonational compliance marking

Westera (2013; 2017) presents an account of intonational meaning that is (almost) exclusively based on pragmatic inferencing. Intonation marks compliance with or violation of conversational maxims with respect to a main or a secondary theme of a sentence (depending on whether it is marked by a boundary or a trailing tone). The secondary theme is the basis for accenting.
     T*                 +T                                      T%
L    old information    compliance with secondary theme         compliance with main theme
H    new information    no compliance with secondary theme      no compliance with main theme

Table 2.16: Tonal morphemes in Westera (2017)

While (declarative) assertions can generally be analyzed within the domain of Westera’s informational maxims – which are modeled after Grice (1989) – (interrogative) questions are considered by Westera to lack the main informational content. Within Westera’s (2017) exhaustivity approach to conversational effects, this property of questions (as well as hints) requires a different kind of maxim, the so-called attentional maxims. These attentional maxims resolve some of the known issues with informational maxims, including granularity and some exhaustivity requirements. Table 2.17 lists the inventory of both types of maxims.

Quality
  I(nformation): Intend to share only information you take to be true.
  A(ttention):   Intend to draw attention only to states of affairs that you consider possible.
Relation
  I(nformation): Intend to share only information that is thematic.
  A(ttention):   Intend to draw attention only to thematic state of affairs.
Parsimony
  I(nformation): -
  A(ttention):   Intend to draw attention to a state of affairs only if, if you consider it possible, you consider it possible independently of any more specific thematic state(s) of affairs.
Quantity
  I(nformation): Intend to share all thematic information you take to be true.
  A(ttention):   Intend to draw attention to all thematic states of affairs you consider independently possible.
Clarity
  I(nformation): Make sure the intent is understood.
  A(ttention):   Make sure the intent is understood.
Manner
  Be clear (about content and compliance), concise, and orderly (in aligning prominence).

Table 2.17: I- and A-Maxims in Westera (2017)

It is important to understand that in Westera’s account intonational meaning is understood in terms of conversational implicatures. High tones serve as triggers of pragmatic inferencing. The addressee has to infer which of the maxims is violated.
The only cue intonation may give is the severity of the violation. Westera proposes that signaling the violation of the maxim of quality, for instance, is marked by a larger pitch excursion than signaling the violation of a maxim of manner. Violating the former is considered more severe than violating the latter. This strategy of violation severity and its prosodic encoding, however, is not comprehensively developed. In general, Westera (2017) aims to get by without “a grammaticalized version of the (natural) meaning of gradient, paralinguistic features” (173) – thereby relying on pragmatic inferencing only.

For our purposes, the analysis of falling declaratives can be neglected here. They comply with the relevant maxims and therefore end in a low boundary tone. Much more interesting are rising declaratives and interrogatives with a high boundary tone (for the latter, low boundary tones mark exhaustiveness). For the polar interrogative in (36a), the addressee has to infer non-compliance with the maxim of quantity; for the rising declarative in (36b), with the maxim of quality.

(36) a. Is it raining? (L* H-H%)
     b. It’s raining? (L* H-H%)

In example (36a), the H% signals non-exhaustivity because the speaker fails to draw attention to all possible states of affairs (i.e. “something else may be relevant” (Westera 2017: 304)). I take issue with this view in Subsection 3.1.3 because I consider the response set to be exhaustive. The H% in (36b) has a notably different meaning since it occurs with a declarative, and hence marks non-compliance with an information maxim, rather than an attention maxim. The non-compliance with Information-Quality (as opposed to Attention-Quality) is based on a lack of evidence to commit to its truth. In other words, it is thematic, but potentially false. A violation of manner (Westera 2013) or I-Clarity (Westera 2017) is exemplified in (37a); example (37b) shows the interaction of trailing and boundary tones in a rise-fall-rise contour.

(37) a.
I’d like... err... je veux... black coffee (H* H-H%)  (Westera 2014)
     b. A: Have you ever been West of the Mississippi?
        B: I’ve been to Missouri… (L*+H L-H%)  (Ward & Hirschberg 1985)

The disfluency in (37a) marks a lack of clarity regarding either the manner or the intent: it’s unclear whether the addressee will understand. The non-compliance marked by both trailing and boundary tone in (37b) signals uncertainty as to whether Missouri is a state West of the Mississippi and whether the speaker’s presence in Missouri is at all relevant, because it may be considered geographically too close. According to Westera (2017), the speaker strategically applies the RFR to signal the uncertainty about the link of the proposition to the conversation, but not its truth.

I will return to the details of Westera’s analysis of (36) and (37) in Chapters 3 and 4, but for now I want to point to three questionable assumptions therein. Firstly, the decision to require the addressee to know what maxims are violated shifts the responsibility for successful communication away from the speaker. How would an addressee decide whether the speaker withholds information or provides them with wrong information? Though the speaker signals non-compliance, the burden of inference is almost entirely the addressee’s. Consequently, prosody is given a rather small role in terms of encoding SA-related meaning. Secondly, the decision to analyze assertions and questions with different types of maxims inherently relies on the ability to distinguish the two SAs. The difficulty of identifying questions (see Section 2.2.1) raises the issue of whether derived SAs can be adequately assigned to one of the two types of maxims. Finally, the decision to assign different maxim violations (independent of the information/attention distinction) to polar interrogatives and rising declaratives may reflect their semantic difference, but cannot capture their pragmatic similarity. What Gunlogson (2008) refers to as contingency (i.e.
a dependency on the addressee for a ratification of beliefs) receives a different analysis for rising declaratives and interrogatives in terms of compliance marking.

2.5.2.6 Summary of tone-based approaches to intonational meaning

More than tune-based proposals, tone-based approaches to intonational meaning rely on categorical distinctions, which correspond to their smaller inventory of tonal morphemes. Because a direct association of contour and interpretation is unrealistic with a two- or four-tone inventory, a prominent role in deriving meaning is given to pragmatic inferencing. Ladd (2008: 150) identifies the under-specification of the mechanisms of pragmatic inferencing as the key problem of intonational meaning. He adds that we lack a theoretical framework to compare the success of the different proposals in predicting the meaning of an intonational contour. We saw in this section how different they are: While Bartels (1997) and Truckenbrodt (2012) still worked in the realm of pragmatic meaning, Pierrehumbert & Hirschberg (1990) refer to speaker beliefs that have no direct relation to SAs. Steedman (2001 et seq.) draws on classic notions of information structure and pairs them with speaker- or addressee-attribution, but the link to propositional meaning is unclear. Finally, Westera (2014; 2017) relies exclusively on inferencing and attributes only a markedness function to intonation: the principle marked by rising intonation can only be deduced from context.

2.6 Speech act ontologies build on epiphenomena

The claim that the Clause Type Convention and the Fall/Rise Convention are inadequate for an account of propositional meaning should at this point no longer come as a surprise. In Section 2.2, I argued that the distinction between assertions and interrogatives is problematic due to its ambiguous encoding. In Section 2.3, I reviewed previous SA accounts and their limitations due to a reliance on the Clause Type Convention and the Fall/Rise Convention.
In Section 2.4, I showed that a rise/fall distinction ignores at least a third type of contour, which I called the modified rise. Based on the British tradition, there are more contour types we could draw on. In Section 2.5, I showed that distinguishing SAs is not the only function of SFI – it also serves to encode paralinguistic aspects and to signal incompleteness. If SFI has a prominent role in solving the SA Problem, it is therefore necessary to understand its relation to other types of intonational meaning.

In this section, I discuss how the previous problems culminate in the SA Problem. The Clause Type Convention and the Fall/Rise Convention are not only problematic because they are grounded in inadequate descriptions of forms. They are also problematic because they work with an inadequate conception of their functions. Specifically, I argue that the SA Problem exists because it relies on the idea that questions and assertions are conversational primitives, when in reality they are epiphenomena. Support for this claim comes from the denotations of different types of questions, which suggest that these types do not form a natural class. I also review the conversational properties of other so-called questions, including echoes and fall-rising declaratives. I argue that their conversational effects as discussed in the literature allow only one conclusion: we need to decompose both primary and derived SAs.

Underlying many semantic accounts of questions and assertions is the idea that a function applied to the sentence radical determines how we should interpret that radical. Following Frege (1918) and Stenius (1967), we can summarize this idea as follows: for both questions and assertions, an operator scopes over the semantic content of the sentence (Krifka 2011).

(38) a. Who will come?
     b. QUEST (COME(X))

(39) a. Bill knows who will come.
     b.
ASSERT (KNOW (COME(X)) (BILL))

In essence, the semantics here directly corresponds to the pragmatics: it is the operator that determines the illocutionary act on the basis of the sentence radical, which corresponds to the locutionary act. If we conceive of the pragmatics of questions as being determined by such an operator, it may seem natural to include this operator in the syntactic derivation of a sentence. We therefore see that pragmatic treatments of questions and assertions still operate with a notion equivalent to the direct force hypothesis: the sentence mood (form) directly determines the sentence force (function).

Typically, information questions are considered to be open propositions. They specify with a wh-pronoun which part of the sentence radical is missing. The missing part in polarity questions is the truth value. For alternative questions, the missing part is specified to a choice of variables. Below is an example of each type of question in (a) with a schematic representation of their implementation in (b) and a description of the missing information in (c).

(40) a. Did he score?
     b. radical: He scored.
     c. missing: truth-value

(41) a. Who scored?
     b. radical: X scored.
     c. missing: X

(42) a. Did he miss or score?
     b. radical: He missed vs. scored
     c. missing: resolution of choice

Semantic approaches to these question types fall into four categories: the categorical (e.g. Cohen 1929; Hausser 1983), the propositional (e.g. Hamblin 1973; Karttunen 1977), the partitioning (e.g. Higginbotham & May 1981; Groenendijk & Stokhof 1982; 1984), and the inquisitive (e.g. Groenendijk 2008; Groenendijk & Roelofsen 2009) approach. Table 2.18 summarizes their approach to each question type, leaving the specifics in notational differences aside.
Question       Categorical        Propositional      Partitioning                               Inquisitive
Information    (p – x)            (p – x)            (p – x)_i ∨ (p – x)_j,                     (p ∨ q ∨ …)
Polarity       (+/– (p))          (p ∨ –p)           (p_i ∨ p_j)                                (p ∨ –p)
Alternative    (p – (x1 ∨ x2))    (p – (x1 ∨ x2))    (p_i – (x1 ∨ x2)) ∨ (p_j – (x1 ∨ x2))      (p ∨ q)

Table 2.18: Semantic approaches to variable, polarity, and alternative questions

Information questions are treated similarly in categorical and propositional approaches; the only difference is that the propositional approach specifies restrictions to propositions while the categorial approach specifies variables over partial functions. The partitioning approach differs because it relies on indices. These indices specify an exhaustive set of non-overlapping propositions. Inquisitive semantics treats variable questions as members of a set of different propositions.

The categorical and propositional approaches treat polarity questions differently: the former includes a truth operator, whereas the latter constitutes a set of opposing alternatives. The partitioning account is the equivalent using indices rather than truth values.

Finally, alternative questions are similarly handled in the categorical, the propositional, and the partitioning approaches. The approach to alternative questions in inquisitive semantics clearly stands out since it relies on propositional alternatives.

The strong similarity of the different approaches represented in Table 2.18 is no coincidence. Krifka (2011) shows that they can all be derived from the categorical approach. This makes it easier for us to get to the core of the typology: all approaches agree that the semantics of variable and polarity questions differ in that the former apply to open propositions while the latter apply to alternative, closed propositions.
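The contrast between open and closed propositions can be made concrete with a toy model in the spirit of the propositional approach, in which a question denotes a set of propositions and a proposition is a set of possible worlds. This is a sketch under simplifying assumptions, not a rendering of any of the cited formalizations. The player list and all function names are hypothetical, and a world is reduced to the set of players who scored.

```python
from itertools import combinations

# Hypothetical domain of individuals.
PLAYERS = ("Ronaldo", "the goalkeeper", "the striker")

# Assumption: a world is fully described by the set of players who scored.
WORLDS = frozenset(
    frozenset(scorers)
    for size in range(len(PLAYERS) + 1)
    for scorers in combinations(PLAYERS, size)
)


def scored(x):
    """Proposition 'x scored': the worlds in which x is among the scorers."""
    return frozenset(w for w in WORLDS if x in w)


def negate(p):
    """Complement of a proposition relative to all worlds."""
    return WORLDS - p


def polar_question(p):
    """Closed proposition plus its negation: only the truth value is missing."""
    return {p, negate(p)}


def wh_question(domain):
    """Open proposition 'X scored': one alternative per value of the variable."""
    return {scored(x) for x in domain}


def alternative_question(x1, x2):
    """Open proposition whose variable is restricted to two given values."""
    return {scored(x1), scored(x2)}


print(len(polar_question(scored("Ronaldo"))))  # 2
print(len(wh_question(PLAYERS)))               # 3
print(alternative_question("Ronaldo", "the goalkeeper") <= wh_question(PLAYERS))  # True
```

In this toy model, the alternative question is a two-membered subset of the corresponding variable question, while the polar question pairs one closed proposition with its complement.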
Alternative questions share properties with both variable questions and polarity questions: they are disjunctive between propositional alternatives, but it comes down to a choice of two propositions which only differ in a pre-specified variable. Interestingly, the semantic similarity between variable questions and alternative questions contrasts with the pragmatic similarity of polar questions and alternative questions. Pragmatically, the latter resemble each other in being restricted to two possible answers (Krifka 2015). Of the three question types, then, only variable and alternative questions miss a part at the propositional level: they both include a variable to be supplied by the addressee. Polarity questions lack a truth-valuation, but are in themselves complete (sets of) propositions. The only common trait in this typology of questions is that all questions lack information that renders them true or false.

The idea that questions are functions applied to the sentence radical typically finds its expression in assuming that sentence form contributes to the interpretation of SAs. There is an alternative view that breaks with this assumption. Following Hausser (1980), Portner (2004) proposes that including such a SA operator in a formal analysis is redundant since the conversational effects of questions, assertions, and imperatives directly arise from their truth-conditional properties. SAs can therefore be derived independently of a clause-typing operator.

Type              Denotation             Discourse Component     Force
Declaratives      propositions           cg                      Assertion
Interrogatives    Set of propositions    Question Set            Asking
Imperatives       Property (P)           To-Do List Function     Requiring

Table 2.19: Clause typing based on semantics in Portner (2004)

Similarly, Farkas & Roelofsen (2017) derive both primary and derived SAs without a dedicated operator. They argue that such an operator is redundant in light of the semantic content of questions and assertions.
However, they go one step further than Portner (2004). They assume that the content of both questions and assertions can be characterized with an informative and an inquisitive component (Groenendijk & Roelofsen 2009). The fact that some SAs are primary and others derived is the result of so-called secondary conversational effects. Rising declaratives, for instance, are used by a speaker who is biased toward the truth of a proposition and has low credence toward its complementary alternative. This bias and the corresponding credence level distinguish their context of use from falling declaratives.

What does this mean for the different questions we discussed? If we let semantic content determine our categorizations of questions, questions that violate the Clause Type Convention and the Fall/Rise Convention can be reconceptualized. We can see how this applies to the two variable questions from the introduction, here repeated as (43). Remember that the wh-question in (43a) violates the Fall/Rise Convention by occurring with a fall and that the echo-question in (43b) violates the Clause Type Convention by occurring with a declarative word order.

(43) a. Who scored↓
     b. He said what↑

Despite the violations of said conventions, both examples in (43) are treated as questions in the literature (e.g. Cheng 1997; Ginzburg & Sag 2001; Blakemore 1994; Sobin 2010). In line with this view is the fact that their semantic denotations can be modeled with the same ingredients as other types of questions (see Table 2.19): both contain a variable that identifies an issue under discussion. Yet, the variable appears to fulfill very different functions. In the wh-interrogative in (43a), the variable flags missing information. In the wh-echo in (43b), the variable points to a referent that is either controversial or surprising (see Section 5.1 for a detailed analysis of both constructions).
The juxtaposition of wh-echo and wh-interrogative in (43) also shows that we cannot make rising intonation responsible for questionhood. While that would explain how the Clause Type Convention can be violated in (43b), it errs on the side of the Fall/Rise Convention in (43a). Hence, if we categorize both examples in (43) as questions, these questions are different from primary questions in that they violate (at least) one of the postulated conventions. The semantic denotations specify their difference in factoring out a variable for wh-interrogatives and wh-echoes and in applying an operator that assigns a truth value to polarity questions. How we can account for their common analysis as questions and their different SFIs is a matter I will return to in Chapter 5.

The problem of associating prosodic or syntactic form with a specific SA extends to assertions. Consider the two utterances in (44) and (45), which both occur with a RFR contour where the final rise differs in pitch excursion (Ward & Hirschberg 1992). Their conversational effects go beyond those of an assertion by adding a metalinguistic layer to the propositional meaning, which I paraphrase with a follow-up question in the alternative responses (B’).

(44) A:  Have you ever been West of the Mississippi?
     B:  I've been to Missouri...    (Ward & Hirschberg 1985)
     B’: I've been to Missouri. Does that count as West of the Mississippi?

(45) A:  I'd like you here tomorrow morning at eleven.
     B:  Eleven in the morning?! (↑)    (Ward & Hirschberg 1986)
     B’: Eleven in the morning. Isn’t this a bit early?

In example (44), Speaker B appears to question the relevance of their utterance; in example (45), Speaker B echoes a fragment of Speaker A’s previous utterance and expresses their surprise. The added paraphrases in B’ suggest that these examples raise a question just as much as they assert a proposition. Again, it may be tempting to attribute the metalinguistic layer to the prosodic form (cf.
Truckenbrodt 2012) and thereby rely on a complementary division of labor to preserve the validity of the Clause Type Convention and the Fall/Rise Convention. But this would compromise the role of assertions in cg management. Neither of the responses from Speaker B in (44) and (45) constitutes a proposal for updating the cg. In fact, both utterances respond to such proposals on the part of Speaker A. The contributions of the metalinguistic layers, which I paraphrased as questions, do not correspond to those of conventional questions either. Instead of requesting a verification of truth, they request a confirmation of relevance or appropriateness. Hence, while primary assertions and questions target propositions and propositional choices respectively, the utterances in (44) and (45) target the previous SA and negotiate its future role for the cg. In contrast to variable questions, however, we cannot attribute this to their status as open propositions. Both utterances contain (at least) one closed proposition, namely that Speaker B has been to Missouri in (44) and – by assuming some reconstruction for the fragment – that Speaker B needs to appear at eleven in the morning in (45). Again, I postpone the analysis of their conversational effects and their prosodic variation to Chapter 4.

If all that remains of the Clause Type Convention and the Fall/Rise Convention is that they capture the forms of two specific types of questions and assertions, i.e. falling declaratives and rising polar interrogatives, we might wonder how much worth there is to making these conventions the foundation for solving the SA Problem. The preceding discussion of variable questions and utterances with a RFR contour suggests that the conversational effects of different assertions and questions differ significantly – either at the semantic or the pragmatic level.
Rather than taking two specific configurations of clause type and SFI to define what (primary) SAs are, it may be worth exploring an alternative solution to the SA Problem. If we assume that even primary SAs are epiphenomenal, we can explore the variables that condition their contexts of use and use these variables for composing a new typology of questions, assertions, and SAs that fall in between them. I employ this strategy in my own proposal for solving the SA Problem, which I develop in Chapter 3 and expand in Chapter 4.

2.7 Conclusion

In this chapter I explicated the four problems associated with a complementary division of labor between syntax and prosody to solving the SA Problem, which attempts to preserve the Clause Type Convention and the Fall/Rise Convention. The preceding discussion of the interplay of prosodic and syntactic form for encoding SAs showed that neither convention can be grounded in binary distinctions between declarative vs. interrogative clause type and falling vs. rising intonation, respectively. Furthermore, the role that is attributed to SFI in providing a propositional function that complements that of clause types contrasts with competing analyses of the function of SFI. When combined, these problems suggest that questions and assertions are interpretations that are difficult to motivate on the basis of prosodic and syntactic form. Because they also exhibit some variety in their semantic analysis, I suggested that SAs are epiphenomenal. In this context, it is worth pointing out that those proposals that rely on pragmatic inferencing come with their own problems because they often lack a clear description of the mechanisms that allow interlocutors to infer what is the intended and/or salient interpretation. At this point, we only know that the Clause Type Convention and the Fall/Rise Convention are insufficient for modeling the conversational effects of the phenomena discussed here.
Once we expand the scope of conversational phenomena, we see that what may suffice to model falling and rising declaratives and polar interrogatives cannot be expanded to other types of questions or other uses of SFI. What is more, these conventions reduce their contexts of use to a one-dimensional development of cg that ignores the dynamic and interactional elements of human conversation. In what follows, I develop a solution to the SA Problem that dispenses with the Clause Type Convention and the Fall/Rise Convention in favor of complex variables. These variables adequately characterize the contexts of use of the most prominent phenomena associated with cg management and have prosodic correlates in the shape of SFI.

Chapter 3: Decomposing Speech Acts by Commitment and Engagement

The Speech Act (SA) Problem is grounded in conventions that associate questions and assertions with a particular morphosyntactic (46) and prosodic (47) form, as repeated below.

(46) Clause Type Convention:
     declarative (DEC) = assertion
     interrogative (INT) = question

(47) Fall/Rise Convention:
     falling intonation = assertion
     rising intonation = question

Even combining the effects of these conventions (cf. Gunlogson 2003; Farkas & Roelofsen 2017), where one convention modifies the other, fails to solve the problem since several configurations remain ambiguous and display properties of both questions and assertions. In Chapter 2, we saw that these conventions are problematic because they rest on an oversimplification of the form and because they rely on epiphenomenal concepts, such as questions and assertions. In the present chapter, I introduce the Dialogical SA Model, which rests on the following assumptions:

i) The primary SAs of questions (i.e. rising interrogatives) and assertions (i.e. falling declaratives) are the endpoints on a scale of conversational phenomena. Everything that falls between these endpoints constitutes an act of negotiation.
ii) All items on this SA scale can be characterized by two variables: Speaker Commitment and Addressee Engagement. A speaker can signal that they publicly commit to an issue, or that they do not commit to it. Likewise, a speaker can engage an addressee to resolve an issue, or engage an addressee to accept it. For both variables, it is also possible for the speaker to leave them unmarked, which results in initiating a negotiation.

iii) The resulting deconstruction of SAs into configurations defined by Commitment and Engagement allows for a new perspective on the similarity of conversational phenomena across the form-function conventions in (46) and (47). For instance, declaratives with a (rise-)fall-rise contour have the same degree of Commitment as falling and rising interrogatives. Rising declaratives share their degree of Commitment with falling wh-interrogatives, and falling declaratives share their degree of Engagement with echoes wh-interrogatives and disjunctive interrogatives (so-called alternative questions). These similarities transcend the traditional classification of SAs based on (46) and (47).

In brief, I propose to solve the SA Problem by way of focusing on the use conventions and discourse functions of the different SAs and postponing the question of encoding until we have more clarity on their functions (see Chapter 4). This allows me to replace the existing binary conventions based on syntactic and prosodic form with non-binary variables that reflect the use conventions and intended effects of primary and derived SAs.

I develop this proposal as follows. In Section 3.1, I introduce a revised model of the negotiation table (Farkas & Bruce 2010) that reflects the use conventions and expected conversational effects of four different constructions: falling declaratives, high-rising declaratives, rising declaratives and rising interrogatives.
Conversational moves around the table of negotiation are defined by the variables of Speaker Commitment and Addressee Engagement. In Section 3.2, I review and expand heuristics from the previous literature for testing Commitment and Engagement. I also show that Commitment and Engagement adequately capture the relevant properties of SAs independently of any assumptions introduced by the table analogy. I then provide formal definitions of both variables and relate these variables to two constraints that regulate the economy of conversations and define their different phases. In Section 3.3, I discuss the relation of Commitment and Engagement and demonstrate that they are each other’s inverse for primary SAs: full Commitment maps onto no Engagement and vice versa. For derived SAs, the relation is more complex. In fact, English has a SA mapping for all of the logically possible configurations of Commitment and Engagement. An important outcome of the discussion of derived SAs will be that unmarked Commitment occurs in the context of what I call faulty propositions, which are those propositions that come with a bias, are incomplete, or miss some information. Unmarked Engagement occurs in the context where a speaker does not specify their expectation of how the addressee should respond to their utterance. In Section 3.4, I compare my proposal of a Dialogical SA Model with previous attempts to solve the SA Problem. In Section 3.5, I conclude and list the remaining aspects of the SA Problem, which I address in Chapters 4 and 5.

3.1 Acts of negotiation

In this section, I analyze the conventions of use and intended effects of four different phenomena exemplified below: falling declaratives (a), high-rising declaratives (b), rising declaratives (c), and rising interrogatives (d). The examples differ in the shape of the sentence-final contour: the fall corresponds to H* L-L%, the high rise to H* H-H%, and the rise to L* H-H% in the AM framework (Pierrehumbert 1980).
a. It’s raining (H* L-L%)
b. It’s raining (H* H-H%)
c. It’s raining (L* H-H%)
d. Is it raining (L* H-H%)

In the preceding chapters, I identified assertions and polarity questions as primary SAs because they seem to be universally attested and often conventionally encoded. Rising and high-rising declaratives were identified as derived in response to their treatment as complex SAs in complementarian approaches to SAs, which rely on a symmetrical division of labor between prosodic and morphosyntactic form. In Figure 3.1, falling declaratives and rising interrogatives fall in the bottom-right and top-left quadrant, respectively. This translates into an interpretation that aligns with the Clause Type Convention and the Fall/Rise Convention. Rising declaratives and high-rising declaratives fall into the top-right quadrant through the oversimplification of the rise/fall distinction (cf. Gunlogson 2003; Farkas & Roelofsen 2017; Westera 2017), which captures the fact that a heterogeneous combination of prosodic and syntactic forms can lead to two different interpretations: the former is typically interpreted as a question, the latter as an assertion.

Figure 3.1: Interaction of the Clause Type Convention and the Fall/Rise Convention

I claim here that conversations are best understood as acts of negotiation. What is standardly referred to as questions and assertions (i.e. rising interrogatives and falling declaratives, respectively) are special forms of negotiation because they correspond to the end points on a scale of negotiation. Primary questions and assertions stand out among SAs through their clear expectations on how the conversation continues. For SAs that fall in between those endpoints, the negotiations are more complex. They differ in how the speaker relates to the utterance and what they expect from the addressee as a response (see Geurts 2019 for the idea of a default response for some SAs). These aspects define the conventions of use and the intended effects of several discourse moves.
The following discussion models how these two aspects combine. To illustrate the conventions and effects, I adopt and refine the model of a negotiation table, which has been the basis of modelling conversational effects in several publications since it first appeared in Farkas & Bruce (2010; see Section 1.4).

3.1.1 Revising the table analogy to reflect the dialogical character of SAs

I assume that SAs fall into two distinct categories: primary and derived SAs. The former category comes with a default expectation of how the conversation continues, the latter always involves some negotiation. Due to the oversimplification of the Clause Type Convention and the Fall/Rise Convention, I refrain from defining the properties of these acts of negotiation based on their forms of encoding. In Chapter 2, I characterized the effects of primary SAs as requesting missing information for questions and as proposing to add a proposition to the cg (Stalnaker 2002) for assertions. The conversational effects of derived SAs cannot be defined exclusively by either of those options. Figuratively speaking, they end up on the table (Farkas & Bruce 2010), a negotiation space where the Questions Under Discussion (Ginzburg 1996; Roberts 1996) are stored. Figure 3.2 visualizes how these three conversational moves – requesting information, tabling an issue, and proposing a proposition – contribute to the development of the cg. With an assertion, the speaker proposes for a proposition to enter the ground of the addressee to make it a common belief. Here, the speaker places the proposition directly into the space of the addressee (indicated by an arrow extending to the addressee in Figure 3.2). With a derived SA, the speaker raises an issue for negotiation about whether it can enter the cg – hence, it is placed on the table (indicated by an arrow pointing to the table in Figure 3.2). With a question, the speaker can neither place a proposition on the table nor into the space of the addressee.
The propositional choices remain in the speaker’s space (indicated by the absence of an arrow in Figure 3.2).

Figure 3.2: Revising the table analogy (preliminary)

This metaphorical description of conversational properties raises two closely related questions:

i) Which SAs are acts of negotiation?
ii) What makes them acts of negotiation?

I answer both questions on a theoretical level before applying the model to specific examples. Firstly, all SAs are acts of negotiation, but for some of them, there are conventionalized ways – or shortcuts – of resolving them. For instance, there is a default assumption that an assertion will enter the belief set of the addressee (see Geurts 2019), which allows the speaker to place it in the space of the addressee and not on the table. An important insight by Brennan & Clark (1991) is, however, that interlocutors can engage with each other to an elaborate degree before a proposition can become a shared belief. When the speaker proposes for a proposition to enter the cg, this so-called presentation phase can extend over several turns. Likewise, it may take several turns before the interlocutors agree that the proposition is a shared belief; this phase is the so-called acceptance phase (Clark & Brennan 1989). The presentation phase can extend over several turns because the addressee may need some clarification about the content or the implications of the presented proposition. Likewise, the acceptance phase can extend over several turns either because the evidence is not strong enough for the addressee to accept the proposition or because the act of accepting is a collaborative effort. The key insight, then, is that grounding, i.e. the process of arriving at a common belief, is dynamic and complex. This understanding of cg development goes well beyond a distinction between adding and proposing for a proposition to become cg (cf. Farkas & Bruce 2010) and a distinction between accepting and believing a proposition (cf.
Stalnaker 2002). This is because grounding in Brennan & Clark (1991) is understood as a collaborative effort. Correspondingly, we need to expand our conversational model by a set of moves that mirror the moves of the presentation phase to show that speaker and addressee share the workload of arriving at a shared belief. Figure 3.3 visualizes this set of mirrored moves by reversing the set of black arrows going out from the speaker above the table as a set of grey arrows going out from the addressee below the table. As I explain below, each turn incorporates two moves, one involving the speaker, and another involving the addressee.

Figure 3.3: Revising the table analogy (still preliminary)

Questions (rising interrogatives) and assertions (falling declaratives) stand out among acts of negotiation because for these acts, interlocutors can be maximally economical in following the principle of least collaborative effort (Clark & Wilkes-Gibbs 1986: 33).

Principle of least collaborative effort: In conversation, the participants try to minimize their collaborative effort – the work that both do from the initiation of each contribution to its mutual acceptance.

Because the expectation toward the addressee is clear for both falling declaratives and rising interrogatives, the default is to move on in the conversation either by adding the proposition to the mutual belief set (in the case of a falling declarative) or by providing a propositional answer that can be added (in the case of a rising interrogative). So, to answer the first question, all SAs are acts of negotiation, but primary SAs do not require negotiation by default (see also Gunlogson 2008). It is nevertheless possible to turn them into negotiations, for example by asking for clarification or by rejecting the evidence that forms the basis of a rising interrogative or falling declarative.
In either case, SAs are acts of negotiation because they make two separate contributions: one in which the speaker expresses their attitude and one in which the speaker expresses their intention about how the addressee should engage with their utterance.

This brings us to the second question, which asks why interlocutors negotiate. For Brennan & Clark (1991), negotiation is primarily a matter of arriving at an understanding.

Presentation phase: [speaker] A presents utterance u for [speaker] B to consider. He does so on the assumption that, if B gives evidence e or stronger, he can believe that she understands what he means by u.

Acceptance phase: B accepts utterance u by giving evidence e that she believes she understands what A means by u. She does so on the assumption that, once A registers that evidence, he will also believe that she understands. (Brennan & Clark 1991: 223)

Presenting an utterance initiates a dialogue. Brennan & Clark (1991) show that a presentation phase can be complex or simple, but in any case, it determines the goal of the following conversational moves. A presentation phase therefore initiates a sequence with the target of establishing something as cg. The addressee has multiple ways to respond to the initial contribution, but ultimately both speaker and addressee need to collaborate to arrive at a cg. In a conversation, then, communication is not one-directional, but interactional. Building on Brennan & Clark (1991), I assume that the dialogical process has both local and global aspects. Globally, every conversation is structured by two alternating phases: presentation and acceptance. Locally, every utterance includes a speaker-oriented aspect (Commitment) and an addressee-oriented aspect (Engagement). Both local and global aspects are geared toward the same goal: speaker and addressee coordinate their efforts (Grice 1975) to establish a shared belief (Stalnaker 2002).
While signaling an understanding is part of the interactive nature of grounding, this does not explain why we need to negotiate beliefs in the first place. By way of the table analogy in Figure 3.3, we see that negotiations can arise in the presentation and in the acceptance phase. It is necessary to negotiate, for example, if the speaker does not have enough evidence to commit to the proposition. Negotiation may also be required if the addressee does not agree with the proposition put forth by the speaker. Both are motivations that differ from Brennan & Clark’s notion of evidence, which signals whether the interlocutors understand each other. Evidence for a belief is crucial for how interlocutors position themselves toward a belief. This positioning is captured by the notion of Commitment, which plays a central role in Gunlogson’s (2003) account of rising declaratives. Commitment captures the speaker’s public attitude toward an utterance. Gunlogson (2008) identifies knowledge of the truth of a proposition as the basis of this attitude. Without a source for this knowledge, a speaker cannot commit. Naturally, this also applies to the next turn. If the addressee has a source that allows them to commit to an alternative truth, they can object to what the speaker said in the previous turn. Hence, the notion of Commitment is central to the negotiation of belief. I define Speaker Commitment in reference to the three logical possibilities of where the speaker can place a proposition in our conversational model: in the ground of the speaker, on the table, and in the ground of the addressee (see Figure 3.3).

Commitment: Let Commitment be the degree to which the speaker publicly commits to the issue currently negotiated for entering the cg. These degrees include:
a. No Commitment: The speaker cannot publicly commit to a proposition because they have no evidence for its truth. Committing would come with the risk of losing face.
b. Unmarked Commitment: The speaker has some reservation to commit to a proposition because the evidence they have for its truth is insufficient or incomplete.
c. Full Commitment: The speaker publicly commits to a proposition because they have sufficient evidence for its truth.

One motivation for negotiating a belief is insufficient evidence for the truth of a proposition, which results in an unmarked Commitment of the speaker to what they present. A second motivation is the speaker’s Engagement of the addressee as a source that will resolve the issue under negotiation. This Engagement is grounded in the speaker’s expectation of how the conversation will continue.

Engagement: Let Engagement be the degree to which the speaker engages the addressee to resolve the issue currently negotiated for entering the cg. These include:
a. No Engagement: The speaker calls on the addressee not to engage because they expect the addressee to accept a proposition or a propositional choice.
b. Unmarked Engagement: The speaker refrains from engaging the addressee because the proposition is not under discussion. As a consequence, the SA needs to be negotiated at a metalinguistic level to decide on the future development of the conversation.
c. Full Engagement: The speaker calls on the addressee to engage with the propositional issue and to resolve any propositional choice in the next turn.

The speaker factors in the addressee’s belief during the presentation phase to anticipate the next turn. In Conversation Analysis, this type of anticipation is called projecting (Ochs, Schegloff & Thompson 1996; Ford & Thompson 1996). Projecting the following turn is a default in human conversation. Auer (2002) underscores the importance of projecting for the economics of an interaction. Hence, we can assume that the speaker’s call to engage must be based on a belief of the speaker about the addressee’s belief.
This belief, in turn, is based on an assumption about the evidence the addressee has available for the truth of the proposition.

For both Commitment and Engagement, the notion of evidence plays an important role in determining their degree. SAs are acts of negotiation because the speaker may lack sufficient evidence for committing to the truth of a proposition. As a consequence, the speaker may engage the addressee. Speakers may also engage the addressee because they have evidence contradicting the previous SA. Alternatively, the addressee can be engaged because an issue has been tabled. This can be done for lack of evidence, or because the speaker is uncertain about the appropriateness of the SA. My conversational model therefore separates the performed presentation phase, i.e. the degree of Commitment, and the projected acceptance phase, i.e. the degree of Engagement. The degrees of Commitment correspond to the placement options on the upper part of the model (visualized by black arrows). The degrees of Engagement correspond to the placement options of the lower part of the model (visualized by grey arrows). SAs are configurations of both variables.

Figure 3.4: Building a conversational model (final)

Each configuration of Speaker Commitment and Addressee Engagement is interactive because it expresses a message the speaker seeks to communicate and a request about what the addressee should do with this message. My model therefore distinguishes between speaker attitudes (which are captured by Commitment) and speaker intentions (which are captured by Engagement). Traditional SA models lack this decidedly interactive character; they conflate attitude and intention (cf. Pierrehumbert & Hirschberg 1990). These models focus on the contribution of the speaker and fail to model the expectation toward the addressee. A distinguishing feature of my proposal is that it presupposes that interlocutors are aware of their role as contributors to an interactive process.
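
The configuration space spanned by the two variables can be made concrete with a small sketch. The Python fragment below is purely illustrative and not part of the model's formal apparatus: the names Degree, Configuration, and is_endpoint are hypothetical, and the only facts taken from the text are the three degrees of each variable and the characterization of the two primary SAs as inverse pairings (full Commitment with no Engagement for falling declaratives, no Commitment with full Engagement for rising interrogatives).

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Degree(Enum):
    """The three degrees available to both variables."""
    NO = "no"
    UNMARKED = "unmarked"
    FULL = "full"


@dataclass(frozen=True)
class Configuration:
    """A speech act viewed as a pair of Commitment and Engagement degrees."""
    commitment: Degree
    engagement: Degree

    def is_endpoint(self) -> bool:
        # Assumption of this sketch: the endpoints of the scale are exactly
        # the pairings in which one variable is full and the other is absent.
        return {self.commitment, self.engagement} == {Degree.NO, Degree.FULL}


# The two primary SAs as inverse pairings.
FALLING_DECLARATIVE = Configuration(Degree.FULL, Degree.NO)
RISING_INTERROGATIVE = Configuration(Degree.NO, Degree.FULL)

# All logically possible configurations: 3 degrees x 3 degrees.
ALL_CONFIGURATIONS = [Configuration(c, e) for c, e in product(Degree, Degree)]

# Configurations that are not endpoints and hence involve negotiation.
NEGOTIATED = [cfg for cfg in ALL_CONFIGURATIONS if not cfg.is_endpoint()]
```

Treating a configuration as an immutable pair keeps the two variables independent of any clause type or contour, which mirrors the decision above to postpone the question of encoding.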
3.1.2 Epistemic development as a key to the use conventions of four speech acts

I motivate the different degrees of Commitment and Engagement on the basis of the contexts of use of four different constructions: falling declaratives, high-rising declaratives, rising declaratives, and rising interrogatives. I describe the contexts of use in terms of i) the evidence for the truth of the proposition available to the interlocutors, ii) the possibility for the utterance to occur out of the blue, and iii) the response requirements. Gunlogson (2003; 2008) uses these criteria for describing the different contexts of use of falling and rising declaratives as well as of rising interrogatives. She fails to control, however, for a number of confounding factors in the application of these criteria. To get to these factors, I draw on a distinction made in Stalnaker (2002) between what is accepted and what is believed by both interlocutors. This is a crucial distinction to reflect Stalnaker’s (2002) observation that propositions can be accepted for the sake of a conversation to make them accessible for future reference. Correspondingly, I decompose the notion of cg into the set of salient propositions and the set of shared beliefs. The former arises from the one-sided act of stating a belief, the latter arises from the collaborative act of sharing a belief. This terminological distinction expands on Stalnaker (2002), whose distinction between accepting and believing does not permeate to the notion of cg. I define the two different sets as in (54).

(54) Let conversations be built on two hierarchically ordered sets of shared information:
     a. the Set of Salient Propositions (SSP), which contains those propositions that are linguistically and extralinguistically accessible to both interlocutors;
     b. the Set of Shared Beliefs (SSB), which contains those propositions that are considered to be true by both interlocutors.
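
One way to read the two sets in (54) is as a pair of stores that are updated at different points of an exchange. The following Python sketch is a hypothetical illustration rather than a formalization proposed here: the class name SharedInformation and the methods utter, ground, and is_ordered are invented for the example, and the subset check encodes only one possible reading of "hierarchically ordered", namely that every shared belief is also salient.

```python
class SharedInformation:
    """Two sets of shared information for a two-party conversation."""

    def __init__(self):
        self.ssp = set()  # Set of Salient Propositions
        self.ssb = set()  # Set of Shared Beliefs

    def utter(self, proposition):
        # Stating a proposition is one-sided: it only makes it salient.
        self.ssp.add(proposition)

    def ground(self, proposition):
        # Sharing a belief is collaborative. Assumption of this sketch:
        # only propositions that are already salient can be grounded.
        if proposition not in self.ssp:
            raise ValueError("only salient propositions can be grounded")
        self.ssb.add(proposition)

    def is_ordered(self):
        # Assumed reading of the hierarchy: SSB is a subset of SSP.
        return self.ssb <= self.ssp


state = SharedInformation()
state.utter("it is raining")                   # the speaker states p
after_t1 = (set(state.ssp), set(state.ssb))    # p is salient, not yet shared
state.ground("it is raining")                  # the addressee adopts p
```

Keeping the two updates as separate operations reflects the difference between a proposition that has merely been stated and one that both interlocutors consider true.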
A first minimal pair of interest for the discussion of use conventions is that of falling declaratives (55) and rising interrogatives (56), whose effects mark the endpoints on the scale of negotiation.

(55) {A is sitting in a windowless office. B enters from outside.}
     T1: B: It’s raining (H* L-L%)
     T2: ✓A1: Oh, I didn’t know that.
         ✓A2: says nothing.
         #A3: Yes, that’s right.

(56) {A is sitting in a windowless office wondering about the weather conditions when B enters from outdoors:}
     T1: A: Is it raining (L* H-H%)
     T2: #B1: Oh, I didn’t know that.
         #B2: says nothing.
         ✓B3: Yes, that’s right.

The contexts of use of the falling declarative in (55) and the rising interrogative in (56) differ significantly. The interactional matrix in Table 3.1 reflects the development of the SSP and SSB of the exchange in (55). Before the turn of Speaker B (T1), there is a clear asymmetry in the evidence of truth: only the incoming person (B) knows that it is raining outside. As a result of T1, the proposition p enters SSP – simply because it has been uttered. The decisive turn for whether it becomes a belief shared by both interlocutors is T2. With response A1, the addressee accepts p, makes p their own belief, and is therefore in the position to commit to it publicly. The fact that it is raining is now believed by both interlocutors (Bel(p)) and thus enters the SSB.

     Asymmetry        Negotiation        Congruence
A    -                p                  Bel(p)
               T1                 T2
B    Bel(p)           Bel(p)             Bel(p)

Table 3.1: Epistemic development in the context of falling declaratives

The initial asymmetry in knowledge is also reflected by the response possibilities in T2. The presence of oh in A1 marks a change of cognitive state (James 1972; 1974; Heritage 1984; 1998; Schiffrin 1987) and has been used by Gunlogson (2003) as a heuristic for belief states. Due to the principle of least collaborative effort, it is also acceptable for A not to respond (A2). Response A3, however, is infelicitous since A does not have any (direct) evidence for asserting that p is true. Now consider the use conventions of the rising interrogative in (56), which I represent in Table 3.2. The development of the SSP and SSB is similar to that of falling declaratives in that it starts with a clear asymmetry in knowledge. The roles have changed, however: now it is B who knows that it is raining throughout the conversation, while A has no evidence to commit to p at the outset. This does not change until B resolves the issue of whether it is raining or not (p vs -p). With the rising interrogative in T1, A raises one of two possibilities. At this point, they cannot commit to either. Once B confirms one of them, as in B3, the proposition enters the set of beliefs.

     Asymmetry        Negotiation        Congruence
A    -                {p, -p}            Bel(p)
               T1                 T2
B    Bel(p)           Bel(p)             Bel(p)

Table 3.2: Epistemic development in the context of rising interrogatives

Alternative responses B1 and B2 are both infelicitous. B1 is in violation of the asymmetry of knowledge – as the incoming speaker, B has evidence for the weather conditions outside. Remaining silent in response to a rising interrogative, as in B2, is not an option under Grice’s (1989) cooperative principle. B3 identifies one of the two alternatives of the rising interrogative as correct. It is worth noting in this context that the distinction between accepting and believing does not apply to rising interrogatives if the addressee is trustworthy. Once an interlocutor has raised an issue for resolution, the default is to believe the proposition offered by the addressee rather than to accept it for the sake of a conversation. Due to the clear asymmetry in knowledge, once a propositional alternative is identified as correct, it can be expected to enter the SSB. Of course, this is only a default assumption.
If Speaker A turns out to have some evidence about a contrary belief, they can deviate from the expected course of conversation and initiate a negotiation in the following turn. The epistemic development in Table 3.2 represents the most economical path of the conversation to honor the Principle of least collaborative effort. Gunlogson (2003; 2008) identifies another property that distinguishes the conventions of use of falling declaratives and rising interrogatives: only the latter are felicitous in out-of-the-blue contexts. This is because the speaker only commits in falling declaratives, but makes any commitment contingent on the addressee in rising interrogatives. I argue, however, that this heuristic for Commitment requires some qualification. Specifically, I take issue with the claim that Gunlogson’s examples represent real cases of out-of-the-blueness. In each of her examples, the QUD is accommodatable, which is at odds with a strict out-of-the-blue reading. Consider first the difference between falling declaratives and rising interrogatives which serve in Gunlogson (2003) to establish the relevance of the heuristic. Gunlogson provides example (57) and claims that the rising interrogative is acceptable out of the blue, but the falling declarative is not. Yet, neither utterance in (57) occurs out of the blue since the fruit mentioned there is contextually salient. This contextual salience makes both utterances acceptable – contrary to what Gunlogson reports. The same holds for Gunlogson’s other example pertaining to the felicity of declaratives that occur out of the blue. In example (58), both rising interrogative and falling declarative are acceptable (contrary to her claim about the latter) because the QUD is accommodatable. The inclusion of supposed to allows the addressee to accommodate that there is a pretext to Speaker A’s utterances. Speaker A assumes that Speaker B is familiar with the weather forecast.
Hence, neither (57) nor (58) can be considered as strictly occurring out of the blue, i.e. without any relevant context.

(57) {to co-worker eating a piece of fruit}
     T1: #A1: That’s a persimmon
         ✓A2: Is that a persimmon
     (examples and judgments from Gunlogson 2003: 2)

(58) {A to their officemate B}
     T1: #A1: The weather’s supposed to be good this weekend
         ✓A2: Is the weather supposed to be good this weekend
     (Gunlogson 2008: 103)

Example (57) provides enough contextual information for the addressee to accommodate the topic of the falling declarative – which makes it acceptable. Stalnaker (2002) defines accommodation as “the process by which something becomes cg in virtue of one party recognizing that the other takes it to be common ground” (711). By recognizing that the persimmon is relevant for the conversation, the addressee can accommodate the question what kind of fruit is that? as the QUD in (57). Contrary to Gunlogson’s claim, the falling declarative is therefore acceptable. Chafe (1976) points to the role of extralinguistic context for accommodation: as long as speaker and addressee are conscious of the presence of an object, it is felicitous for the addressee to address it out of the blue, which explains the acceptability of example (57). In example (58), the context does not include any extralinguistic object that allows for accommodation. However, both utterances in (58) include the reportative modal supposed to that points to the relevance of an outside source. So even if it occurs strictly out of the blue, the falling declarative in (58) includes a linguistic device that invites the accommodation of the QUD. What is more, the modal reduces the responsibility of the speaker for the truth of the proposition through a lack of commitment (Matthewson & Truckenbrodt 2018; contra Gunlogson 2008). Hence, none of Gunlogson’s judgments are related to the issue of out-of-the-blueness.
Her examples all include some information that disqualifies out-of-the-blueness as a heuristic for Commitment. Falling declaratives can be used discourse-initially (just like rising declaratives) if the QUD is accommodatable. Out-of-the-blueness, however, cannot be a heuristic for Commitment, because accommodation is almost impossible to constrain. Discourse markers that signal initiation (59), such as well or you know (Shourup 1982), can mitigate the effects of out-of-the-blueness even if nothing invites accommodation.

(59) {A to their officemate B}
     T1: A1: Guess what, I won the lottery
         A2: You know, this time last year I was in Rome

Even thetic sentences (Marty 1918) where all information is new (60) are felicitous in the form of falling declaratives. Hence, Commitment must be a notion that is context-independent.

(60) {A to their officemate B}
     T1: A1: Your coat’s on fire
         A2: Your mother called
     (adapted from Rochemont 2016)

I therefore disagree with Gunlogson’s claim that only rising interrogatives are acceptable out of the blue, and falling declaratives are not. The former may require less accommodation discourse-initially. But this property does not make the latter less felicitous in strictly out-of-the-blue contexts.

As a counter-proposal, I suggest that all of the above is a consequence of an underlying asymmetry in knowledge among the interlocutors. This asymmetry determines the degree of the speaker’s Commitment. For rising interrogatives, a speaker assumes that the addressee knows more about the truth of the proposition than themselves. If this is not the case, the penalty is low: the addressee can simply respond that they do not know. For falling declaratives, however, the speaker assumes that the addressee does not know the truth of the proposition. If they do, the penalty is high: the utterance violates a maxim of quantity because the speaker provides more information than required (Grice 1978).
The different use conventions of falling declaratives and rising interrogatives reflect the development of SSP and SSB. In both SAs, there is an initial asymmetry in knowledge which is transformed to a knowledge congruence at the end of the exchange (for a detailed treatment of congruence, see Osa forthcoming and also Guntly forthcoming). This is because one interlocutor serves as a credible source that allows the addressee to add the proposition to their SSB. To reflect the initial state, the addressee responds by signaling whether they can commit to the truth. For rising interrogatives, this is a necessary step; for falling declaratives it is optional if the addressee did not have any previous evidence of the truth of the proposition. If they did, the exchange is not maximally informative. The out-of-the-blue criterion is a consequence of the degree of Commitment. It is infelicitous to commit to a truth whose relevance is uncertain. If accommodation of the relevance is possible, the public Commitment is valid. At this point, a binary distinction between the Commitment states of falling declaratives and rising interrogatives appears to be sufficient to capture the conventions of use of both constructions. This changes once we consider constructions with prosodic and syntactic forms of heterogeneous form-function mappings. By including rising declaratives in our analysis, we see that Commitment can vary. We see the full necessity of relying on more than just Commitment when we add a fourth phenomenon, namely high-rising declaratives. I describe each of their conventions of use in turn by relying on the properties listed above (knowledge state, out-of-the-blue felicity, and response requirements). The juxtaposition of their conventions of use shows that the complement strategy pursued by Gunlogson (2003) and others will not suffice to solve the SA Problem. This is where we see the necessity of including Engagement as another complex variable.
High-rising declaratives (Hirschberg & Ward 1995) belong to a family of declaratives that end in a rise. This family of declaratives comes in many different shapes (see Warren 2016). Their use can be associated with two functions: i) forward-looking or deferring, and ii) backward-looking or checking (MacNeil & Cran 2005; House 2006; Tomlinson 2009; Tomlinson & Fox-Tree 2011). Another common assumption is that they express uncertainty about the relevance of an utterance (Hirschberg & Ward 1995). High-rising declaratives are often discussed under the names of uptalk, upspeak, and high-rising terminals or high-rise questions. Gunlogson’s (2003; 2008) rising declaratives are sometimes included, sometimes they are not. I choose the term high-rising declarative to avoid a direct association of clause type and SA. The term high-rise question should be avoided since it is presently unclear for high-rising declaratives whether the speaker questions anything. Moreover, the term is derived from a so-called question rise, which is problematic in light of the prosodic variation of different types of questions.

To analyze the conversational properties of high-rising declaratives, consider the example in (61).

(61) {A is sitting in a windowless office when B enters from outdoors, completely wet. Hoping for a towel, B says to A:}
     T1: B: It’s raining (H* H-H%)
     T2: ✓A1: Oh, I didn’t know that. (Here’s a towel.)
         #A2: says nothing.
         #A3: Yes, that’s right. (Here’s a towel.)

For high-rising declaratives, there is an initial asymmetry in knowledge between speaker and addressee. As in the case of falling declaratives, this asymmetry changes to a shared belief over the course of the conversation, which leads to a knowledge congruence.
In the epistemicity matrix in Figure 3.5, this corresponds to the change in Speaker A from no belief (-) over an acceptance that a proposition has been uttered ({p}) to a belief that the proposition is true (Bel(p)), which allows Speaker A to also commit to it in the future. Specifically, only Speaker B seems to know about the weather conditions before T1. After T1, Speaker A has evidence for the weather conditions through B’s statement, and they can mark this as their own belief with T2.

       Asymmetry         Negotiation         Congruence
A      -                 {p}                 Bel(p)
                   T1                  T2
B      Bel(p)            {p}                 Bel(p)

Figure 3.5: Epistemic development in the context of high-rising declaratives

The development of Speaker A’s knowledge is therefore identical to the one we find in falling declaratives. This does not hold for Speaker B, however, albeit for reasons unrelated to evidence. In T1, the proposition is no longer presented as a belief. Instead, it is offered as a proposition whose relevance to the context of the ongoing conversation is to be negotiated. Hence, after T1, both interlocutors accept the proposition for the sake of the conversation because Speaker B asserted it, but only the course of the conversation will determine whether the proposition can enter the SSB. This brings us to the different response possibilities for Speaker A: A1 reflects the asymmetry in knowledge. The response would nevertheless be infelicitous if it were not accompanied by the handing over of a towel or a similar action. This is where we see that an addressee may be engaged for two different reasons: either to address the proposition or to respond to the SA itself. By handing over the towel, Speaker A confirms the relevance of the weather situation. Independently of that relevance, it is likely that the proposition enters the set of shared beliefs because Speaker A does not have any evidence that permits them to challenge Speaker B.
Response A2 is infelicitous because the relevance of the proposition of T1 is unclear until Speaker B decides about its relevance. A3 is also infelicitous, but crucially not because only B has evidence for the truth of the proposition, but because the truth itself is not at issue. It is the relevance of the SA that needs to be recognized. Even if Speaker A had independent evidence before T1, A3 would be odd because Speaker A never doubted the truth of the proposition – only its relevance was at stake. Hence, the change from a proposition that is accepted to one that is believed is also contextually conditioned. An irrelevant proposition does not enter the set of shared beliefs because it does not serve the purpose of growing the cg. Looking at the response properties, then, we see how falling declaratives and high-rising declaratives differ. They begin and end with the same state of knowledge about the truth of the proposition, but the negotiation is different in that Speaker A may abandon their Commitment for the sake of the conversation if their belief is irrelevant for its outcome. We see further support for a difference in the conventions of use of high-rising and falling declaratives by comparing their licensing conditions for occurring out of the blue. Consider the contexts from above, first without and then with an extralinguistic element present.

(62) {A to their co-worker B who shares the same office}:
     T1: ✓A1: It’s raining
         #A2: It’s raining
         #A3: Is it raining

For the strictly out-of-the-blue context, A1 is infelicitous because we expect both interlocutors to have the same evidential basis for the current weather conditions. A2 is infelicitous because it is impossible to question the relevance of a proposition if the QUD is unclear. A3 is also infelicitous because both speakers can be assumed to have the same evidence about the weather conditions, so it does not make sense to inquire about them from the interlocutor.
These judgments change if we introduce an extralinguistic, contextually salient entity, such as a fruit in example (63).

(63) {to co-worker eating a piece of fruit}
     T1: ✓A1: That’s a persimmon
         ✓A2: That’s a persimmon (why would you eat that?)
         #A3: Is that a persimmon

The falling declarative in A1 is felicitous if it is interpreted as an observation (see above). The high-rising declarative in A2 in (63) is felicitous in contrast to the one in (62) because the relevance of the statement is now clear, independent of whether or not it is true. A3 is felicitous if it reflects the asymmetry in knowledge before T1: If the co-worker knows what fruit they are eating, and the speaker does not, A3 is completely natural. On the basis of the different conversational properties of the three constructions above, it is safe to assume that they differ both in use conventions and conversational effects. A major component that distinguishes their conventions of use is the individual states of knowledge of each interlocutor. It may be tempting at this point to ascribe the different effects of each construction to the contribution of (the different shapes of) SFI, whereby a fall corresponds to proposing a proposition for entering the set of shared beliefs, a high rise to uncertainty about the relevance of the SA, and a rise to an uncertainty of the truth of the proposition. Once we include a fourth construction, namely rising declaratives, that direct link between SFI and interpretation breaks down. Their conventions of use resemble not only those of falling declaratives and rising interrogatives; they also show similarities to those of high-rising declaratives. As before, we begin by tracing the epistemic development of SSP and SSB for example (64), which is represented in Figure 3.6.

(64) {B is sitting in a windowless office.}
     T1: A: enters the office, taking off a wet jacket.
     T2: B: It’s raining
     T3: #A1: Oh, I didn’t know that.
         #A2: says nothing.
         ✓A3: Yes, that’s right.
The knowledge state of Speaker A is consistent: from the beginning to the end of the conversation they know the weather conditions. Speaker B has no belief about the current weather conditions since the windowless office does not allow them to tell. At the moment of Speaker A’s arrival, the knowledge state of Speaker B changes because they notice the wet jacket. Consequently, Speaker B suspects that it is raining as this seems the most likely explanation for Speaker A’s wet jacket. The arrival of Speaker A therefore introduces new, salient material to the context. Speaker B then voices the possible explanation, which adds the proposition to the SSP. Only after a confirmation in T3 does Speaker B publicly add the proposition to the SSB.

       Asymmetry         Negotiation                      Congruence
A      Bel(p)            Bel(p)           Bel(p)          Bel(p)
                   T1               T2              T3
B      -                 p > -p           {p}             Bel(p)

Figure 3.6: Epistemic development in the context of rising declaratives

The variation in felicity judgments among the response possibilities reflects the requirement of a confirmation for an integration of a proposition in the SSB. Note here that the only thing that conditions the response properties is the knowledge of the truth of the proposition. In contrast to high-rising declaratives, relevance does not play a role for rising declaratives. A1 is infelicitous because Speaker A has evidence of the truth of the proposition – uttering A1 would simply be lying. A2 is also infelicitous since it simply ignores the previous turn that comes with a call on the addressee to respond (cf. Beyssade & Marandin 2007). Finally, A3 is the only felicitous response since it provides clarifying evidence that allows Speaker B to confirm their suspicion. Of course, any alternative explanation of the wet jacket would have a similar effect. The alternative explanation would simply override Speaker B’s suspicion and replace it with another proposition.
Since Speaker A remains the only source of evidence in example (64), there is no reason for Speaker B to hold back and not make the proposition a belief when the truth is established.

The incompatibility with out-of-the-blue contexts is taken to be one of the defining features of rising declaratives in Gunlogson (2003; 2008). Consider the comparison of all four constructions.

(65) {A to their co-worker B who shares the same office}:
     T1: ✓A1: It’s raining
         #A2: It’s raining
         #A3: It’s raining
         ✓A4: Is it raining

In a strictly out-of-the-blue context, neither the rising nor the high-rising declarative in (65) is felicitous. A rising declarative requires some trigger for the bias; a high-rising declarative requires some evidence for the relevance of the proposition. A4 may seem a little marked because we have no explanation for why Speaker A would ask about the weather conditions completely out of the blue. This markedness is much stronger for a rising declarative: its conventions of use are dependent on some contextually salient issue that leads the speaker to speculate whether it is true. The presence of such an issue changes the felicity judgments of both rising and high-rising declaratives in (66).

(66) {to co-worker eating a piece of fruit}
     T1: ✓A1: That’s a persimmon
         ✓A2: That’s a persimmon
         ✓A3: That’s a persimmon
         ✓A4: Is that a persimmon

Specifically for the rising declarative, the felicity arises in (66) because the contextually salient fruit introduces a probability that the co-worker is eating a persimmon (this explanation is overlooked in Gunlogson 2003). Hence, Speaker A can voice their suspicion with the expectation that the co-worker confirms it. The salience of the fruit makes all four SAs in (66) felicitous. In Table 3.3, I summarize the properties of falling declaratives, rising and high-rising declaratives as well as rising interrogatives according to the three heuristics discussed above.
                           Falling       High-rising   Rising        Rising
                           declarative   declarative   declarative   interrogative
Sufficient evidence        ✓             ✓             ✗             ✗
Strictly out of the blue   ✓             ✗             ✗             ✓
Response required          ✗             ✓             ✓             ✓

Table 3.3: Summary of conventions of use

These three diagnostics can distinguish the conventions of use of these four SAs. It is important to note that there are different reasons for how these SAs score in the above table. Out-of-the-blueness, which is related by Gunlogson (2003) to the notion of Commitment, turned out to be a marker of a salient QUD for high-rising declaratives. For rising declaratives, out-of-the-blueness marks the presence of something that can give rise to a bias. This also explains why rising declaratives score differently than high-rising declaratives for the issue of sufficient evidence. All in all, we have seen that the notion of Commitment does not suffice to capture the conventions of use of the four SAs discussed here. The notion of Engagement, which became relevant in the discussion of the response properties, is equally important.

3.1.3 Modelling use conventions and conversational effects with the table analogy

In this subsection, I show that the conversational effects of falling declaratives, rising interrogatives, high-rising declaratives and rising declaratives can be captured by two contributions to the negotiation. I illustrate this with the table analogy developed in Subsection 3.1.1. These contributions correspond to the speaker’s Commitment and their Engagement of the addressee. I also show that both Commitment and Engagement can be motivated independently.

A falling declarative is a proposal for the addressee to adopt a belief held by the speaker. It therefore constitutes a conversational move that does not rely on the addressee for establishing its truth, according to the speaker. Because interlocutors anticipate the future development of the cg (Auer 2002), the speaker decides not to engage the addressee in a negotiation about that proposal.
The default is that the conversation can continue with the belief now either being accepted or believed by the addressee. Because the speaker has sufficient evidence for the truth of the proposition, they can commit fully without the risk of losing face (Krifka 2015). This outcome is modeled in Figure 3.7, which shows a belief that is presented to the addressee for adoption into their set of beliefs. There is no reason to present the proposition as an issue that requires negotiation. The speaker forgoes the table with their assertion (contra Farkas & Bruce 2010; see Geurts 2019 for support of the idea that some SAs have a default reaction, which in this case is to accept the truth of the proposition). Speaker Commitment and Addressee Engagement seem to be in an inverse relation: the speaker commits fully, and therefore, the addressee is not engaged. When we later look at derived SAs, however, we see that not all SAs show an inverse relation.

Figure 3.7: Conversational effects of falling declaratives

As a specific example of how the conversational effects play out, consider the previous example of an assertion, repeated here as (67).

(67) {B is sitting in a windowless office. A enters from outside.}
     A: It’s raining
     B: Oh, I didn’t know that.

Speaker A knows that it is true that it is raining because they just arrived from outside. Speaker B has no evidential basis for evaluating the truth since the office does not have any windows. By default, that is, according to the principle of least collaborative effort (Clark & Wilkes-Gibbs 1986), Speaker A can assume that they will not be challenged in their belief. Consequently, Speaker A fully commits to their belief and does not engage Speaker B for resolving the issue. Speaker A knows more than Speaker B, so A expects B to adopt their belief. With their reply, Speaker B signals by the use of oh that the belief is indeed new to them.
Because nothing contradicts the truth of it, the most economical way for Speaker B to continue the conversation is to adopt Speaker A’s belief. Naturally, this is not the only direction the conversation can take from there. If the addressee has contradicting evidence or if the speaker does not express themselves clearly, the negotiation can continue. To adhere to the principle of least collaborative effort, however, the speaker only projects the most economical direction of the conversation.

The conversational effects of a rising interrogative correspond to the opposite scenario: a rising interrogative is a binary choice between two opposing propositions differing only in their truth value (Hamblin 1958; Karttunen 1977). It is therefore not a contribution to the cg, but a proposal to update it with one of those two propositions (Malamud & Ettinger 2013). For determining the truth value, the speaker engages the addressee, whom they assume to have sufficient evidence to do so. This is modeled in Figure 3.8 where the speaker presents the choice between two alternatives they cannot commit to. In analogy to the falling declarative, the figure does not include a black arrow that indicates the tabling of or committing to a proposition. The addressee is expected to resolve this uncertainty by providing evidence that one of the alternatives is true. This is illustrated by the grey arrow in Figure 3.8 to mark the effect of the projected response. Just as before, then, Speaker Commitment and Addressee Engagement are in an inverse relationship; the negotiation table is not involved.

Figure 3.8: Conversational effects of rising interrogatives

The table analogy above represents the speaker’s attitude (Commitment) and their intention (Engagement): the speaker expresses a lack of Commitment and projects a resolution of the propositional choice through their call on the Addressee to engage with the content of the rising interrogative.
In the familiar weather scenario, this plays out as follows. Speaker A asks whether it is raining because they want to know “the way things are” (Roberts 1996: 2). Note that the question is more felicitous if it includes a conversation starter (tell me) to avoid any markedness associated with occurring strictly out-of-the-blue (i.e. nothing that links it to the context).

(68) {A is sitting in a windowless office wondering about the weather conditions when B enters from outdoors:}
     A: (Tell me), is it raining
     B: Yes, it is.

Because Speaker B enters from outside into a windowless office, they are a likely source of evidence for the truth of the opposing alternatives. Consequently, Speaker A requests that B resolve the issue of whether it is raining or not. Corresponding to the phrasing of the rising interrogative, Speaker B confirms the truth of the uttered alternative. Just as in (67), only the newcomer, Speaker B, has evidence about the truth of the proposition in (68), which means that the most economical way to continue the conversation for Speaker A is to add B’s belief to the set of shared beliefs.

I now turn to the conversational effects of rising and high-rising declaratives, which show that Commitment and Engagement do not have to be in an inverse relation. The two derived SAs also show that there are two different reasons for negotiating an issue: high-rising declaratives are negotiated because they do not mark the Engagement of the addressee. Rising declaratives are negotiated because they come with an unmarked Commitment. I discuss the conversational effects of the two derived SAs in turn. To begin with, high-rising declaratives are SAs where the truth value is not at stake. The speaker can present the proposition with full Commitment to the truth of the proposition because they have sufficient evidence to do so. The speaker is uncertain about the future development of the cg because they do not know how the addressee will respond to their proposal.
Uncertain of the relevance of the proposition, the speaker offers a way out of the default scenario where an asserted proposition is expected to be included into the SSB. The speaker refrains from engaging the addressee for a resolution of a propositional issue. As a consequence, the rise signals to the addressee to attend to the relevance of the SA instead. Hence, the proposition may remain part of the SSP, and not become part of the SSB if the addressee does not consider it relevant for the future development of the conversation.

Figure 3.9: Conversational effects of a high-rising declarative

In example (69), repeated from above, this translates to the following: Speaker B has sufficient evidence to commit to the truth of the proposition since they just came from outside. The reason Speaker B ends their utterance in a rise can therefore not be grounded in uncertainty about its truth. Rather, Speaker B is uncertain about the relevance of the utterance for the future course of the conversation. In this specific context, this choice may reflect strategic reasons: Speaker B may find it impolite to request a towel and therefore hopes for Speaker A to resolve the implicature following from their assertion that it is raining. Hence, the truth of the proposition is not at stake. Rather, Speaker B engages Speaker A to work with that proposal and negotiate its relevance for the future course of the conversation. With their reply, Speaker A acknowledges the truth of the proposition – which they had no evidence for – and hence adds it to the SSB. Crucially, this is not an important step: Speaker A may just as well respond by handing Speaker B a towel. Believing the truth of Speaker B’s proposition is not decisive for how the conversation progresses.
Accordingly, the table model in Figure 3.9 only includes a black arrow (representing Commitment) that extends to the Addressee and a grey arrow (representing Engagement) that only extends to the table, which is where metalinguistic issues, such as relevance, are being negotiated.

(69) {A is sitting in a windowless office when B enters from outdoors, completely wet. Hoping for a towel, B says to A:}
     B: It’s raining
     A: Oh, I didn’t know that. (Here’s a towel.)

Another reason to involve the addressee in a negotiation can be found in rising declaratives. Rising declaratives lack the evidential basis that licenses full Commitment. The speaker may be biased toward the truth of the proposition, but they require additional evidence to add the proposition to their belief set. In a conversation, a reliable source of that evidence can be the addressee. Hence, an unmarked Commitment grounded in insufficient evidence results in an Engagement of the addressee to resolve the issue. Just as in high-rising declaratives, the discourse moves inherent to rising declaratives lead to a negotiation. For rising declaratives, the negotiation is about truth values, not about relevance. In both SAs, Commitment and Engagement are not in complementary distribution. Because the speaker leaves the Commitment to the truth of the proposition unmarked (because of the insufficient evidence), they fully engage the addressee.

Figure 3.10: Conversational effects of a rising declarative

In the exchange in (70), the arrival of Speaker B is a crucial moment for the development of the cg – it introduces new extralinguistic content to the context, which changes the evidence at hand. For Speaker A, the wet jacket is enough evidence to speculate that it must be raining outside, but they have no direct evidence to confirm this assumption. A therefore turns to B, who has direct evidence and can serve as a credible source of the weather conditions.
Depending on their confidence in the assumption that the jacket is wet because of the weather, Speaker A can either ask for a confirmation of their belief with a rising declarative if they suspect that it is raining or use a rising interrogative if they want to signal that other explanations are also likely. For either choice, there is a clear asymmetry in knowledge between A and B: Speaker B is the only credible source for establishing the truth of the proposition.

(70) {A is sitting in a windowless office.}
     B: enters the office, taking off a wet jacket.
     A1: It’s raining
     A2: Is it raining
     B: Yes, that’s right.

The minimal pair between rising declarative and rising interrogative in (70) raises a question that has received considerable attention in the literature: How different are their conversational effects really? There is general agreement that only rising interrogatives can be neutral while rising declaratives are always biased (e.g. Gunlogson 2008). Farkas & Roelofsen (2017) take both to have the same semantic denotation; both constitute choices between two polar alternatives. The difference is that rising declaratives come with a bias toward one alternative while polar rising interrogatives do not. This translates to a lower credence with respect to one of the alternative propositions for rising declaratives compared to rising interrogatives where both have the same credence. Krifka (2015), however, departs from this divide, proposing that standard rising interrogatives have a monopolar (rather than a bipolar) interpretation. Krifka takes the negative bias in negated questions, such as Isn’t it raining?, to be evidence for his claim that even rising interrogatives express a bias. It is unclear to me how this extends to positive questions. The bipolar interpretation arises by way of disjunction. Bipolar rising interrogatives arise through a disjunction of monopolar SAs; hence disjunction does not occur at the propositional level, but at the SA level.
While Krifka’s monopolar approach may be an elegant way of incorporating biased questions, it assumes identical effects of rising declaratives and (standard) monopolar rising interrogatives. The only difference is in their syntax, but this does not affect their conversational effects. Treating rising declaratives as disjoint monopolar questions is also problematic because it puts them in the neighbourhood of alternative questions. The difference between bipolar questions and alternative questions lies in the responses they project: the former restrict the update to the set of shared beliefs to a proposition and its negation (in classic Hamblin style); the latter to the alternatives scoped out at the SA level, which results in two possible interpretations of the contribution to the conversation.

Westera (2017), following Biezma & Rawlins (2012), points out that rising interrogatives may not be restricted to two orthogonal alternatives. Westera claims that rising interrogatives allow any thematically relevant alternative as a response. As supporting evidence, he provides the examples in (71) where B1 is judged as marked and B2 as fine. This contrasts with (72) where B1 is judged as fine and B2 as marked. The explanation provided relies on the placement of the accent.

(71) A: Was JOHN at the party
     B1: ?Mary was.
     B2: He was at school.

(72) A: Was John at the PARTY
     B1: Mary was.
     B2: ?He was at school.

     (data and judgments from Westera 2017: 286, citing Bäuerle 1979)

The problem with example (71) is that the contours and stress patterns of the responses (B1 vs. B2) are not provided by Westera. I propose that these contours would reflect that Speaker B is deviating from the QUD.
Notice that Speaker A’s question in (71) only differs from Speaker A’s question in (72) by the possible interpretation of the focus: In (71), there is a contrastive stress on John; in (72) the stress on party is ambiguous between a default stress on the last NP (party) and a contrastive stress on party. Correspondingly, B1 in (71) would need to have a contrastive stress on Mary and ideally a fall-rising contour, which indicates the presence of a second topic (see also Büring 2007). B2, however, is perfectly acceptable without a special contour because Speaker B’s response is relevant to Speaker A’s question, and therefore an indirect response to the question (p v -p). In (72), a contrastive stress on party in A would render B1 completely infelicitous and would definitely require a contrastive stress on school in B2. Only if the stress on party in (72) was a default stress would B1 require a fall-rising contour, and B2 would be fine with a fall. All this is to say that a deviation from the restricted polar reading (i.e. {p, q} instead of {p, -p}) requires prosodic marking, which demonstrates that additional responses come with a penalty. The default response, which does not require special prosodic marking, is restricted to the polar alternatives. This concludes the discussion of the conversational effects of falling, rising and high-rising declaratives, as well as rising interrogatives. Table 3.4 summarizes the previous discussion by listing the degrees of Commitment and Engagement which capture their conversational effects.
Speech Acts   Construction              Commitment   Engagement   Effect
primary       falling declarative       full         none         Accept p
              rising interrogative      none         full         Resolve p v -p
derived       rising declarative        unmarked     full         Negotiate p/SA
              high-rising declarative   full         unmarked     Negotiate p/SA

Table 3.4: Overview of conversational effects

Note that primary SAs can be defined by only one of the pragmatic variables – (full) Commitment for falling declaratives and (full) Engagement for rising interrogatives. Derived SAs demonstrate, however, that we require both variables and that these variables are no longer conceptualized as binary. By default, adhering to the Clause Type and the Fall/Rise Conventions leads to a straightforward resolution of the issue – an adoption of a belief for falling declaratives, and a resolution of a binary choice of beliefs for rising interrogatives. Mappings that do not conform to the Clause Type and the Fall/Rise Conventions always lead to acts of negotiation – it is not possible to move on in the conversation by employing either of the default strategies of adopting or resolving. Rising declaratives require a confirmation of the truth of a proposition before it can be adopted by the speaker into their belief set. High-rising declaratives require a (verbal or non-verbal) confirmation that the SA is relevant, and that the conversation can proceed.

3.2 Negotiating with Commitment and Engagement

This section contextualizes (Section 3.2.1) and formalizes (Section 3.2.2) the notions of Commitment and Engagement established in the previous section. In Subsection 3.2.1.1, I begin by reviewing three diagnostics established in the previous literature for assessing the level of Commitment. I expand this selection of diagnostics by a fourth diagnostic to show that Commitment is a complex notion.
In Subsection 3.2.1.2, I review previous instantiations of the concept of Engagement, most notably in Beyssade & Marandin (2007), and show with a discussion of three further diagnostics that Engagement, too, is complex and comes in at least three different degrees. In Subsection 3.2.2, I formalize the notions and degrees of Commitment and Engagement by drawing on two principles of conversational economy that arise from a discussion of the use conventions and conversational effects described and modeled above.

3.2.1 Motivating Commitment and Engagement

In this subsection, I discuss previous instantiations of Commitment and Engagement in the literature on SAs and motivate my revisions of these (binary) notions. Sub-subsection 3.2.1.1 does this for Commitment; Sub-subsection 3.2.1.2 does it for Engagement. Both sub-subsections review a number of existing conceptions of Commitment and Engagement, including diagnostics associated with them. On this basis, I develop my own notions of these variables and expand the existing diagnostics.

3.2.1.1 Commitment reflects the speaker’s propositional attitude

This subsection motivates the complex nature of Commitment. This is an important step in my account of SAs to incorporate the different conventions of use of the combination of declarative form and rising intonation discussed in Subsection 3.1.2. In the following, Commitment will be treated as an independent conversational contribution. This can only be an intermediary step since Commitment stands in a close relation with Engagement (for their combination, see Section 3.4). I begin with an overview of three diagnostics that have been used to establish the relevance of Commitment in the previous literature. I offer a fourth diagnostic that affirms the minor variation within rising declaratives and interrogatives in the form of weak vs. strong evidentials.
Response behavior is identified by Gunlogson (2003) as an important factor for distinguishing the conventions of use for rising and falling declaratives. This response behavior reflects the asymmetry of knowledge that defines the initial status of the conversational examples discussed here and in Gunlogson’s account. Based on the assumption that interlocutors are cooperative (Grice 1967), SAs are expected to elicit a response from the addressee that reflects the initial asymmetry. In (73), we see that falling declaratives pattern with high-rising declaratives and rising declaratives pattern with rising interrogatives in this regard.

(73) a. {A is sitting in a windowless office. B enters from outside.}
        B: It’s raining
        A1: #Yes, it is.
        A2: ✓Oh, I didn’t know that.
     b. {A is sitting in a windowless office when B enters from outdoors, completely wet. Hoping for a towel, B says to A:}
        B: It’s raining
        A1: #Yes, it is.
        A2: ✓Oh, I didn’t know that.
     c. {A is sitting in a windowless office. B enters the office, taking off a wet jacket.}
        A: It’s raining
        B1: ✓Yes, it is.
        B2: #Oh, I didn’t know that.
     d. {A is sitting in a windowless office wondering about the weather conditions when B enters from outdoors:}
        A: Is it raining
        B1: ✓Yes, it is.
        B2: #Oh, I didn’t know that.

The responses in (73) show that the asymmetry in Commitment is skewed to the addressee in rising declaratives and rising interrogatives. In falling declaratives and high-rising declaratives it is skewed to the speaker. Independent evidence for this pattern is provided by a second diagnostic introduced by Malamud & Stephenson (2014), which is based on the compatibility of the constructions under discussion with taste predicates and vague scalar predicates. Both predicates require the speaker to be the interlocutor who knows more about p due to the subjective nature of these predicates.
We observe the same pattern for this set of diagnostics as for Gunlogson’s (2003) response behavior diagnostic: falling and high-rising declaratives behave alike; rising interrogatives and declaratives show the opposite behavior (74).

(74) a. {A is sitting in a windowless office. B enters from outside.}
        B: ✓It’s raining heavily/unpleasantly
     b. {A is sitting in a windowless office when B enters from outdoors, completely wet. Hoping for a towel, B says to A:}
        B: ✓It’s raining heavily/unpleasantly
     c. {A is sitting in a windowless office. B enters the office, taking off a wet jacket.}
        A: #It’s raining heavily/unpleasantly
     d. {A is sitting in a windowless office wondering about the weather conditions when B enters from outdoors:}
        A: #Is it raining heavily/unpleasantly

A third diagnostic for the knowledge state of the interlocutors is identified by Bonami & Godard (2006). Evaluative adverbs like unfortunately are only licensed for speaker-only Commitment, which is independent of the attitude of the addressee. Hence, it is impossible for the speaker to commit to the propositional attitude (expressed by the adverb) and simultaneously to request confirmation of the truth of the proposition. Again, the felicity judgments pattern as expected.

(75) a. {A is sitting in a windowless office. B enters from outside.}
        B: ✓Unfortunately, it’s raining
     b. {A is sitting in a windowless office when B enters from outdoors, completely wet. Hoping for a towel, B says to A:}
        B: ✓Unfortunately, it’s raining
     c. {A is sitting in a windowless office. B enters the office, taking off a wet jacket.}
        A: #Unfortunately, it’s raining
     d. {A is sitting in a windowless office wondering about the weather conditions when B enters from outdoors:}
        A: #Unfortunately, is it raining

The original context of this diagnostic is Beyssade & Marandin’s (2007) analysis of the scope of what they call the call-on-addressee.
The speaker expects the addressee to add the proposition to their belief set, but not the evaluative judgment. By extension, the data in (75) show that only a speaker who is in the position to commit to the truth of the proposition can also commit to an evaluative judgment. Just like the previous diagnostics, then, evaluative adverbs help to identify the nature of a discourse-initial knowledge asymmetry between the interlocutors. If that asymmetry is skewed to the addressee, the speaker cannot commit to the truth of the proposition. At best, the speaker can make a contingent commitment (i.e. a commitment that requires engaging the addressee to resolve a propositional choice), which captures said asymmetry. We therefore arrive at the following intermediate summary: falling and high-rising declaratives allow for a commitment to the truth of the proposition; rising declaratives and interrogatives do not. Table 3.5 summarizes the findings of the previous discussion with the first diagnostic split into two rows (confirming response with yes, it is and acknowledging news with oh, I didn’t know that).

Diagnostic | Falling declarative | High-rising declarative | Rising declarative | Rising interrogative
Confirming response | | | ✓ | ✓
Acknowledging news | ✓ | ✓ | |
Subjective predicates | ✓ | ✓ | |
Evaluative adverbs | ✓ | ✓ | |
Asymmetry | Knowledge (S>A) | Knowledge (S>A) | Knowledge (S<A) | Knowledge (S<A)
Table 3.5: Diagnosing a knowledge asymmetry as a basis of Commitment

Crucially, the results of the diagnostics summarized in Table 3.5 only reflect the grouping allowed by the binary options of the heuristics. The analysis of the conventions of use of the above constructions in Section 3.1 provided some evidence that requires us to expand our set of heuristics. After all, the constructions that pattern alike cannot be used interchangeably.
For instance, the evidence available to a speaker uttering a rising declarative is greater than that available to a speaker uttering a rising interrogative, which was the primary reason for allowing an unmarked degree of Commitment. Unmarked Commitment reflects the speaker’s bias toward one alternative, but is nevertheless contingent on external ratification (Gunlogson 2008), which translates into an Engagement of the addressee.

I propose a fourth diagnostic that cuts across the groupings in Table 3.5. Evidentials can reflect the gradability of the evidence available to the interlocutors. Consider the compatibility of the previous examples with the adverb apparently. This evidential marks an utterance as a proposition that has only an indirect source for its truth (Willett 1988; Glougie 2016). Since the evidence is considered unreliable, the speaker can neither commit to the proposition nor propose for it to be included in the SSB.

(76) a. {A is sitting in a windowless office. B enters from outside.}
        B: #Apparently, it’s raining
     b. {A is sitting in a windowless office when B enters from outdoors, completely wet. Hoping for a towel, B says to A:}
        B: #Apparently, it’s raining
     c. {A is sitting in a windowless office. B enters the office, taking off a wet jacket.}
        A: ✓Apparently, it’s raining
     d. {A is sitting in a windowless office wondering about the weather conditions when B enters from outdoors:}
        A: #Apparently, is it raining

In example (76), the rising declarative is felicitous with apparently, and a review of the knowledge distribution reveals why this is the case. The falling declarative is uttered in a scenario with a strong asymmetry in knowledge: The speaker has sufficient evidence to commit to the truth of the proposition. A qualification of the commitment with apparently is therefore infelicitous. The same holds for the high-rising declarative: The speaker has direct evidence for the truth of the proposition, which is at odds with the function of apparently.
For the rising interrogative, the asymmetry is reversed, but just as strong. The speaker does not have any evidence for committing to the truth of the proposition, which includes any indication that it may appear to be true that it is raining. Hence, there is no commitment that can be qualified. Finally, the rising declarative is perfectly suitable for occurring with apparently since the indirect evidence enables the speaker to be biased toward the truth of a proposition, but not to commit to it publicly. The use of apparently in (76) is felicitous because it qualifies the commitment as being contingent on indirect evidence.

Consider next how this contrasts with the compatibility of the four constructions with the adverb obviously, which expresses direct evidentiality (Hübler 1983; Aijmer 2008). For obviously, we can define direct evidentiality as resting on reliable, contextually available evidence. As before, the use of an evidential requires an asymmetry of knowledge at the outset of the conversation, but the compatibility with obviously will help us to better describe the nature of that asymmetry.

(77) a. {A is sitting in a windowless office. B enters from outside, but can only be heard.}
        B: #Obviously, it’s raining
     b. {A is sitting in a windowless office when B enters from outdoors, completely wet. Hoping for a towel, B says to A:}
        B: ✓Obviously, it’s raining
     c. {A is sitting in a windowless office. B enters the office, taking off a wet jacket.}
        A: #Obviously, it’s raining
     d. {A is sitting in a windowless office wondering about the weather conditions when B enters from outdoors:}
        A: #Obviously, is it raining

With obviously, it is the high-rising declarative that stands out as felicitous in example (77).
The speaker marks the strength of the evidence present through their wet attire: The speaker assumes that the truth of the propositional content (and the urgency of the implicature that they are hoping for a towel) is as obvious to the addressee as it is to the speaker. For all other examples in (77), the asymmetry in knowledge is too great to allow for a felicitous use of obviously. In the falling declarative in (a), the weather conditions are only obvious to the speaker; in the rising declarative in (c), they are more obvious to the addressee than to the speaker; and in the rising interrogative in (d), there is no evidential basis for the speaker to express a commitment to either alternative.

With the fourth diagnostic added, we have to reevaluate the grouping of the four constructions in the asymmetry table above. In the revised version, the binary grouping is broken up by the compatibility of high-rising declaratives with strong evidentials and of rising declaratives with indirect evidentials. At the same time, the felicity of direct evidentials for high-rising declaratives and the felicity of indirect evidentials for rising declaratives show that their use conventions differ from those of falling declaratives and rising interrogatives, respectively. The fact that only the variation in the compatibility with evidentials separates high-rising and falling declaratives requires further investigation. For now, we note that the binary split that emerges from the heuristics used in the literature is insufficient for characterizing their degrees of Commitment.
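The regrouping described above can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: it encodes the felicity judgments reported for (73) to (77) as truth values and groups the constructions by their diagnostic profile. The names `DIAGNOSTICS`, `BINARY_DIAGNOSTICS`, and `group_by_profile` are hypothetical labels chosen for this sketch, and reducing each judgment to a Boolean is a simplification.

```python
from collections import defaultdict

# Column order: the four constructions compared in this section.
CONSTRUCTIONS = (
    "falling declarative",
    "high-rising declarative",
    "rising declarative",
    "rising interrogative",
)

# Felicity judgments as reported for examples (73) to (77).
# True = felicitous, False = infelicitous (a simplifying assumption).
DIAGNOSTICS = {
    "confirming response":   (False, False, True,  True),
    "acknowledging news":    (True,  True,  False, False),
    "subjective predicates": (True,  True,  False, False),
    "evaluative adverbs":    (True,  True,  False, False),
    "indirect evidentials":  (False, False, True,  False),
    "direct evidentials":    (False, True,  False, False),
}

# The diagnostics taken from the previous literature.
BINARY_DIAGNOSTICS = (
    "confirming response",
    "acknowledging news",
    "subjective predicates",
    "evaluative adverbs",
)


def group_by_profile(diagnostic_names):
    """Group constructions that receive identical judgments on the given diagnostics."""
    groups = defaultdict(list)
    for column, construction in enumerate(CONSTRUCTIONS):
        profile = tuple(DIAGNOSTICS[name][column] for name in diagnostic_names)
        groups[profile].append(construction)
    return list(groups.values())


binary_groups = group_by_profile(BINARY_DIAGNOSTICS)
all_groups = group_by_profile(tuple(DIAGNOSTICS))

print(binary_groups)  # two groups of two constructions each
print(all_groups)     # each construction has its own profile
```

With only the earlier diagnostics, the sketch yields the two-way split; once the two evidential diagnostics are included, no two constructions share a profile, which is the sense in which the binary grouping is broken up.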
Diagnostic | Falling declarative | High-rising declarative | Rising declarative | Rising interrogative
Confirming response | | | ✓ | ✓
Acknowledging news | ✓ | ✓ | |
Subjective predicates | ✓ | ✓ | |
Evaluative adverbs | ✓ | ✓ | |
Indirect evidentials | | | ✓ |
Direct evidentials | | ✓ | |
Knowledge | S > A | S ~ A | S < A
Table 3.6: Diagnosing a knowledge asymmetry as a basis of Commitment

Together with the observations on the different contexts of use (see Subsection 3.3.2), Table 3.6 presents support for expanding the binary conception of Commitment found in previous accounts (Gunlogson 2003; 2008; Malamud & Stephenson 2014; Krifka 2015). Commitment relies on evidence for the truth of a proposition and constitutes an attempt to ameliorate the knowledge asymmetry common to all the discourse situations exemplified in the previous discussion. We know from Section 1.5 that the standard solution to the insufficiency of a binary notion of commitment is to make the commitment contingent on a confirmation from the addressee (Gunlogson 2003; 2008). As elegant as this move may seem, it conflates two variables that fulfill different functions. I propose that what Gunlogson calls contingent commitment is in fact unmarked Commitment. Gunlogson insists that any form of Commitment requires a source that provides the evidence for the truth of the proposition. In rising declaratives, this source is the addressee. Hence, the only person that can really commit is the addressee, not the speaker. To distinguish rising declaratives from rising interrogatives, Gunlogson admits that there is still some degree of commitment in the speaker, which requires ratification from said source. So even in Gunlogson’s account, Speaker Commitment is a complex (rather than binary) concept. The degree of Commitment hinges on the strength of the evidence available. Subjective predicates and evaluative adverbs point to a two-fold distinction. Yet, the compatibility test with evidentials suggests this binary distinction is artificial.
At the propositional level, we must distinguish between at least three degrees: full Commitment, unmarked Commitment, and no Commitment. The data including high-rising declaratives suggest that there may be another factor playing into the felicity judgments. I elaborate on this point in the next section.

3.2.1.2 Engagement reflects the speaker’s intended effects

In this subsection, I show that Engagement is a complex concept just as Commitment is. I begin with a review of existing notions that are similar to Engagement, most notably that of a call-on-addressee (Ginzburg & Sag 2001; Beyssade & Marandin 2007), and point to the limitations of these concepts based on their close relation to clause-typing. I then return to the response requirements of the different constructions and demonstrate that they mirror a difference in compatibility between calls and addresses (Zwicky 1974). To explain these patterns, I propose a complex notion of Engagement that is independent of clause type. Specifically, I propose that a speaker can request that the addressee resolve an issue at the propositional level or at the SA level. Alternatively, the speaker can decide not to engage the addressee at all since they take the future development of the SSB for granted. These three types of response requests can all be subsumed under one variable, i.e. Engagement, which reflects the speaker’s expectation toward the addressee about the current utterance.

In the existing literature, Engagement is less established than Commitment for the interpretation of SAs. The term engagement is sometimes used to describe an emotional involvement by the speaker (Brazil 1975; Bolinger 1982; Beaken 2011). However, I will use this term in reference to the involvement of the addressee. The closest predecessor of my use of the term Engagement is Beyssade & Marandin’s (2007) notion of a “call-on-addressee” (henceforth: CoA).
For each of the three primary clause types (declaratives, interrogatives, and imperatives), Beyssade & Marandin assume that there is a dedicated type of Commitment and a dedicated type of CoA. With a declarative, the speaker combines the Commitment to a proposition with a CoA that invites the addressee to add the proposition to the SSB. With an interrogative, the Commitment applies to the issue raised by the question, and the CoA invites the addressee to add a propositional abstract to the Question under Discussion (henceforth: QUD, Roberts 1996). Propositional abstracts are abstractions over variables (Ginzburg & Sag 2001). With an imperative, the speaker commits to the outcome of a future action and invites the addressee to add an outcome to a to-do list (see Portner 2004 for details). Exclamatives have neither a commitment (exclusive to the speaker) nor a CoA. For primary SAs, Beyssade & Marandin assume that Commitment and the CoA are in a symmetrical relation (i.e. a homogeneous combination of Commitment and CoA; see Table 3.7). Derived SAs can be modeled by a heterogeneous combination of the different types of Commitment and CoA. For the four constructions under scrutiny in this chapter, this translates as follows (high-rise questions are not discussed in Beyssade & Marandin 2007).

Construction | Commit to… | CoA to add… | Combination
Falling declarative | a proposition | a commitment to the CG | homogeneous
Rising interrogative | an issue | a propositional abstract in QUD | homogeneous
Rising declarative | a proposition | a propositional abstract in QUD | heterogeneous
Table 3.7: Commitment and CoA according to Beyssade & Marandin (2007)

Table 3.7 shows that the account can successfully model rising declaratives by combining a declarative Commitment type with an interrogative CoA type. Applying the logic of their system, we may speculate that high-rising declaratives would have the same combination of Commitment type and CoA as a rising declarative.
Just as with any SA account relying on the complement strategy (i.e. the additive effects of combining the Clause Type Convention and the Fall/Rise Convention), the mechanisms at its disposal are too limited to account for the interpretation of derived SAs that deviate from the forms considered by these accounts. In short, Beyssade & Marandin (2007) lack the means of modeling high-rising declaratives.

By dissociating the CoA from Commitment, Beyssade & Marandin (2007) seek to overcome a problem central to the traditional mapping between clause types and SAs. They nevertheless rely on clause types to motivate each type of Commitment and each type of CoA. For instance, a rising declarative is interpreted similarly to a rising interrogative because of their shared CoA. At the same time, a rising declarative shares the type of Commitment with a falling declarative. Beyssade & Marandin can explain the different interpretations of a falling and a rising declarative with their different CoAs. The contribution of the CoA does not suspend the contribution of Commitment. Hence, Gunlogson’s (2003) observation that rising declaratives share contextual properties of both questions and assertions is preserved. The innovation in Beyssade & Marandin (2007) lies in disentangling the relation between Commitment and CoA. Instead of shifting the Commitment to the addressee, which is what Gunlogson proposes, the Commitment remains that of the speaker. It is the CoA that engages the addressee to resolve a propositional abstract. This CoA can be encoded by tag questions, particles, and intonation. By introducing high-rising declaratives, we saw that Beyssade & Marandin’s (2007) proposal relies too heavily on the notion of clause type, which limits the different types of Commitment and CoA. To build on this observation, consider Table 3.8 for an overview of the CoAs we could associate with the Commitment of high-rising declaratives following from the declarative clause type.
Phenomenon | Commit to… | CoA to add…
High-rising declarative | a proposition | * a commitment to the speaker’s Ground
 | | * a propositional abstract in QUD
 | | * an outcome in to-do list
Table 3.8: CoA and Commitment of high-rising declaratives

None of the three CoAs discussed in Beyssade & Marandin (2007) captures the conversational effect of high-rising declaratives of engaging the addressee about the SA (see Subsection 3.1.2). Because their inventory of possible CoAs is exhausted by the three clause types they consider, it is not possible to add a CoA specific to the effects of high-rising declaratives. I hence depart from their clause-type-dependent notion of the CoA and associate my notion of Engagement with SFI.

To motivate a complex notion of Engagement independent of clause type, I propose another set of diagnostics. For the endpoints on the (categorical) scale of Engagement, I revisit the response properties discussed before and point to the fact that a lack of a response directly impacts our understanding of the relation of Commitment and Engagement. For an intermediate degree of Engagement, I rely on Zwicky’s (1974) distinction between two different types of engaging the addressee: calls and addresses. My rationale behind a complex notion of Engagement is rooted in the observation (evident in the different use conventions of rising and high-rising declaratives) that a speaker may engage the addressee for other reasons than to resolve a choice between propositional alternatives. The latter is assumed in SA accounts following the complement strategy (Beyssade & Marandin 2007; Malamud & Stephenson 2014; Farkas & Roelofsen 2017).

We already saw in Subsection 3.2.1.1 that the response behavior is not adequately captured by an acknowledgment/confirmation distinction. The fact that high-rising and falling declaratives exhibit different behavior toward a lack of a response suggests that the acknowledgment/confirmation heuristic for Commitment lacks some depth.
Compare the data in (78), which show that high-rising declaratives require a response while falling ones do not:

(78) a. {A is sitting in a windowless office. B enters from outside.}
        T1: B: It’s raining
        T2: ✓A1: Oh, I didn’t know that.
            ✓A2: says nothing.
            #A3: Yes, that’s right.
     b. {A is sitting in a windowless office when B enters from outdoors, completely wet. Hoping for a towel, B says to A:}
        T1: B: It’s raining
        T2: ✓A1: Oh, I didn’t know that. (Here’s a towel.)
            #A2: says nothing.
            #A3: Yes, that’s right. (Here’s a towel.)

The fact that the speaker engages the addressee with the high-rising declarative, although it comes with the same evidence as the falling declarative, shows that Engagement cannot be based exclusively on a knowledge asymmetry. By extension, we must assume that Engagement is also independent of the notion of Commitment. In example (78a), it is acceptable not to respond because the speaker did not engage the addressee to resolve an issue: the speaker already has sufficient evidence for the truth of the proposition. In example (78b), a response is required although the nature of the evidence is the same. The speaker’s Engagement of the addressee therefore cannot target the truth of the proposition. The meaning of the sentence-final contour thus exceeds a fall-rise distinction. The three different contour shapes discussed here (the fall, the rise, and the high-rise) support a more fine-grained distinction, which suggests that the notion of Engagement needs to include at least three degrees.

This observation is the basis of my argument: Engagement is complex and can occur for different reasons; a lack of Commitment is not the only motivation for a speaker to engage the addressee. Support for this assumption comes from different types of vocatives and their effects in a conversation. Zwicky (1974) distinguishes between calls and addresses. I assume that they differ in their degree of Engagement.
While my distinction is similar in spirit, its terminology is different. For Zwicky, a call functions as a device to “catch the addressee’s attention” (787); an address functions “to maintain or emphasize the contact between speaker and addressee” (787). Hence, Zwicky considers both types to fall on an attention spectrum. Zwicky’s (1974) original examples show that the call in (79) engages the addressee for a different purpose than the address in (80). The call initiates a conversation and comes with the expectation that something is done about an issue (in this particular instance, picking up the piano). The address points to an ongoing issue, which the addressee can, but does not have to, attend to (in (80), the action of the speaker’s coyote).

(79) Hey lady, you dropped your piano.
(80) I’m afraid, sir, that my coyote is nibbling on your leg.
     (Zwicky 1974:787)

Consider how these different purposes of engaging the addressee pattern with rising and high-rising declaratives, for which I assume a different degree of Engagement. I choose the call excuse me over hey since the latter can also function as an address (Zwicky 1974). While the rising declarative can only occur with a call, the high-rising declarative can occur with both a call and an address (both can combine with a vocative, as in the original examples in (79) and (80)).

(81) a. {A is sitting in a windowless office when B enters from outdoors, completely wet. Hoping for a towel, B says to A:}
        B1: ✓Excuse me, Brenda, it’s raining
        B2: ✓I’m afraid, Brenda, it’s raining
     b. {B is sitting in a windowless office. A enters the office, taking off a wet jacket.}
        B1: ✓Excuse me, Brenda, it’s raining
        B2: #I’m afraid, Brenda, it’s raining

A call is compatible with both rising and high-rising declaratives. With a call, the speaker overtly flags that their utterance is relevant to the addressee, so it does not come as a surprise that it is felicitous with both constructions in (81).
An address is only compatible with a high-rising declarative because it comes with a different expectation toward the addressee than the rising declarative does. The address is not compatible with a construction where the speaker requires information from the addressee. We see the full picture when we include falling declaratives and rising interrogatives. Both allow a call, but only the falling declarative allows an address.

(82) a. {A is sitting in a windowless office. B enters from outside.}
        B1: ✓Excuse me, Brenda, it’s raining↓
        B2: ✓I’m afraid, Brenda, it’s raining↓
     b. {A is sitting in a windowless office wondering about the weather conditions when B enters from outdoors:}
        A1: ✓Excuse me, Brenda, is it raining↑
        A2: #I’m afraid, Brenda, is it raining↑

We find some independent evidence for a ternary distinction between degrees of Engagement in the functional variation of the Canadian confirmational eh. Heim & Wiltschko (in press) report that this discourse particle comes with three types of sentence-final intonation. The combination of eh and rising intonation in (83) serves to request a confirmation of truth or belief from the addressee. Rising intonation signals that a response is mandatory for the future development of the SSB and therefore encodes full Engagement. The combination of eh and level intonation² in (84) occurs in narrative settings and serves to check agreement with the addressee. The default scenario is for the speaker to continue their turn. Level intonation therefore cannot function as a call to respond. It nevertheless allows the addressee to backchannel or to nod. I propose that it needs to be conceived of as an unmarked form of Engagement. Just as we saw with high-rising declaratives, the target of Engagement is the SA, rather than the proposition.
The combination of eh and falling intonation in (85), a less frequent combination, serves to mark that the speaker takes a belief for granted.³ Hence, the speaker does not expect the addressee to respond, which means that falling intonation encodes no Engagement.

(83) {A runs into his friend B who is walking her new dog around the block.}
     A: You have a new dog, eh?
     B: Yes, I just got him last week.
(84) {A and B catch up over a drink after the summer break.}
     A: So, I have a new dog, eh, and he just doesn’t listen!
(85) {A starts daydreaming about a trip to Hawaii, but she keeps coming back to the fact that this will be difficult with her latest addition to the household. B puts an end to A’s dreaming, and says:}
     B: You have a new dog, eh.

Table 3.9 summarizes the preceding discussion. A comparison of falling declarative, high-rising declarative, rising declarative, and rising interrogative shows that a speaker can engage the addressee for different reasons or abstain from engaging the addressee altogether.

2 This intonation is also represented by  because it has a low pitch excursion and is usually not completely flat. In the autosegmental-metrical framework (Pierrehumbert 1980), level intonation corresponds to H* H-L%. By the same rationale, I also represent H* H-H% by .
3 The intonational properties reported here are specific to Canadian eh. In New Zealand English, the falling intonation on eh still requires a response (p.c. Lisa Matthewson).

Property | Falling declarative | High-rising declarative | Rising declarative | Polar interrogative
Call | ✓ | ✓ | ✓ | ✓
Address | ✓ | ✓ | |
Response required | | (✓) | ✓ | ✓
Type of rise | | | |
Engagement | None | Unmarked | Full | Full
Table 3.9: Summary of Engagement properties

Response requirements are an important diagnostic for the nature of Engagement. A binary distinction between full and no Engagement is insufficient.
The felicity of an address, such as I’m afraid, X, with high-rising declaratives demonstrates that high-rising declaratives do not target the resolution of a propositional choice. It is important to note that responses can be required both to resolve such an issue and to resolve a metalinguistic issue, such as the relevance of a SA. In correspondence with the threefold distinction between the use conventions of the particle eh, I argue for a threefold distinction between degrees of Engagement. In contrast to Beyssade & Marandin’s (2007) three types of CoAs, the different degrees of Engagement are not linked to any clause type properties. Engagement is an independent pragmatic variable that expresses the speaker’s expectation toward the response behavior of the addressee.

3.2.2 Formalizing Commitment and Engagement

In this subsection, I formalize the notions of Engagement and Commitment and demonstrate how they relate to the different conversational moves I introduced through the table analogy (Subsection 3.1.1). I begin with the formalization of the conditions that govern the use conventions of the four constructions under discussion in this chapter. With these ingredients in place, I provide a preliminary formalization of Engagement and Commitment, which I revisit in Chapter 4 where I discuss their encoding. The interpretations of both variables scope over propositions. I begin with a brief summary of the factors conditioning the conventions of use of rising interrogatives as well as those of falling, rising, and high-rising declaratives:

(86) Factors that determine the conventions of use for different SAs:
(i) Knowledge asymmetry: Based on what the interlocutors know prior to, during, and after a conversation, the SSP and the SSB change from reflecting an asymmetry to reflecting a symmetry of knowledge and belief of the interlocutors, respectively.
(ii) Context linking: Depending on what elements are (extralinguistically) salient, the speaker can initiate a conversation by stating something strictly out of the blue.
(iii) Response requirement: Due to a knowledge asymmetry, an utterance can require a response from the addressee that reduces the asymmetry in knowledge explicitly or implicitly addressed by the speaker. This response requirement can also arise for other reasons independent of the knowledge asymmetry, such as the relevance of a SA.

These factors define the role of an individual utterance in the conversation that it is part of. They define the rhythm of conversation. At the beginning is the asymmetry of knowledge that inspires the question about the way things are (Roberts 1996). We can formalize the relation between the asymmetry of knowledge in a conversation and any discourse move as follows:

(87) Asymmetry condition: A discourse move µ serves to reduce an asymmetry in knowledge κ between two interlocutors A and B, expressed as {κA > κB}, iff
     a. {κA > κB} is evident to at least one interlocutor A or B,
     b. {κA > κB} can be reduced based on evidence available to interlocutor A,
     c. {κA > κB} cannot be reduced by the presence of an (extra)linguistic element e.

The condition in (87) puts the asymmetry of knowledge at the center of a conversation, relates a discourse move to the linguistic and extralinguistic context, and explains the response requirement for those discourse moves that do not contribute to reducing the initial asymmetry in knowledge. A prerequisite of the asymmetry reduction is an awareness of the asymmetry (87a), which allows the speaker to project an economic flow of conversation. In Gricean terms, the quantity of a conversational move is only preserved if there is an initial asymmetry; otherwise the move is redundant. Subcondition (b) draws on an insight by Gunlogson (2008) that commitment requires a source that provides evidence.
Sufficient evidence is the prerequisite of being a reliable source of truth. Subcondition (c) reflects the relevance of extralinguistic material to the question of whether or not a SA is felicitous out of the blue. It also guarantees an economic exchange between the interlocutors; alternative ways of reducing the asymmetry would be the introduction of evidence through an extralinguistic source or self-deduction (see Chafe 1976).

The asymmetry condition, however, only captures the preconditions of a conversational exchange. For defining the conversational effects of the four different constructions under discussion, as mentioned before, I distinguish between the SSP and the SSB, building on Stalnaker (2002). I further assume a default for continuation of a conversation for falling declaratives and rising interrogatives, building on the principle of least collaborative effort by Clark & Wilkes-Gibbs (1986). Beyond the factors determining their contexts of use, then, there are three factors that determine their effects.

(88) Factors determining the conversational effects of different SAs:
     i. Update of SSP: After any utterance, the SSP is updated by the propositional content of that utterance. Both interlocutors can refer to the proposition at any future point in the conversation independently of whether they believe it is true.
     ii. Update of SSB: From the content in the SSP, an addressee can include the propositional content of the speaker’s utterance in their set of beliefs.
     iii. Efficient updating: If the speaker is a credible source of the truth of a proposition, the most efficient conversational move is to accept it as true without a change in turn. If the addressee does not respond, this can be interpreted by the initial speaker that the proposition can be treated as part of the SSB until the addressee communicates otherwise.
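The three factors in (88) can be pictured as a small state update. The Python sketch below is a simplification under stated assumptions, not a definitive model: the class `Conversation` and the methods `utter` and `communicate_otherwise` are hypothetical names introduced for illustration, propositions are represented as plain strings, and credibility is reduced to a Boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    # SSP: propositions both interlocutors can refer to, whether or not
    # they believe them to be true.
    ssp: set = field(default_factory=set)
    # SSB: propositions treated as believed by both interlocutors.
    ssb: set = field(default_factory=set)

    def utter(self, proposition, speaker_is_credible):
        """Apply the effects of one utterance when the addressee stays silent."""
        # (88i) Update of SSP: any utterance puts its content on record.
        self.ssp.add(proposition)
        # (88ii) and (88iii): with a credible speaker and no response, the
        # content is treated as part of the SSB without a change in turn.
        if speaker_is_credible:
            self.ssb.add(proposition)
        return proposition in self.ssb

    def communicate_otherwise(self, proposition):
        """The addressee signals disagreement; the default update is withdrawn."""
        # The proposition stays on record in the SSP but leaves the SSB.
        self.ssb.discard(proposition)


talk = Conversation()
talk.utter("it is raining", speaker_is_credible=True)
print("it is raining" in talk.ssb)
talk.communicate_otherwise("it is raining")
print("it is raining" in talk.ssp, "it is raining" in talk.ssb)
```

The design choice worth noting is that the move into the SSB is a default that can be undone, which mirrors the wording "until the addressee communicates otherwise" in (88iii).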
These factors can be formalized with a notation parallel to (87) as follows:
(89) Belief condition: A proposition p asserted to be true by Speaker A moves from SSP, where p is treated as true by Speaker B for the sake of the conversation ({BelA (p), κB (p)}), to SSB, where p is believed by both interlocutors A and B ({BelA (p), BelB (p)}), without a change in turns, iff:
a. at least one interlocutor is a credible source for the truth of p,
b. BelA (p) is not contradicting BelB (p), and
c. (p) is not already included in {BelA (p) ∪ BelB (p)}.
For any case that does not fulfill condition (89), the interlocutors need to negotiate: (i) whether the interlocutors are a credible source for the truth of the proposition; (ii) whether the addressee has a different belief than the speaker assumed they had (e.g. when they have contradicting evidence); and (iii) whether the addressee already knows what the speaker reveals (to avoid redundancy). While the asymmetry condition in (87) and the belief condition in (89) capture the prerequisites and consequences of a conversational move, they leave the roles of the interlocutors for reducing the asymmetry of knowledge unspecified. We therefore need to characterize how interlocutors A and B relate to that asymmetry. This is where Commitment and Engagement become relevant: they capture the speaker’s attitude and their intentions for the conversational effects that lead to a reduction of the asymmetry of propositional knowledge.
Let English have the conversational variables COM and ENG. Let these variables scope over the content of an intonational phrase α, which has propositional content. Then,
a. [[COM (α)]]A,B ≈ A proposes [[α]] to be added to {BelA (p) ∪ BelB (p)}.
b. [[ENG (α)]]A,B ≈ A engages B to move [[α]] from {κA > κB} to {BelA (p) ∪ BelB (p)}.
Let me briefly point out how the asymmetry condition and the belief condition constrain Commitment and Engagement. For this, I return to the table analogy for purposes of illustration.
Figure 3.11: Commitment and Engagement at the table of negotiation.
The asymmetry condition sets the scene before any discourse move and the belief condition facilitates the flow of conversation. Unless there is a reason to negotiate, or one to anticipate a negotiation move by the addressee, the belief put forth by the speaker can enter the SSB. Hence, falling declaratives only communicate the speaker’s (full) Commitment, but no Engagement of the addressee to resolve the truth of the proposition or its alternative. Rising interrogatives have the opposite configuration since the speaker is not in the position (for lack of sufficient evidence) to commit to the truth of a proposition. For falling declaratives and rising interrogatives, the table of negotiation is irrelevant thanks to the belief condition that restricts its uses to guarantee an economic flow of conversation. Rising and high-rising declaratives, however, include both a form of Commitment and Engagement because they require some negotiation. For a rising declarative, this is a direct consequence of the asymmetry condition because the speaker lacks sufficient evidence for committing fully to the truth of a proposition in a rising declarative. To reduce the asymmetry in knowledge, the speaker engages the addressee, whom they suspect to be a credible source. The belief condition limits the number of turns until a belief can enter the SSB: once the addressee turns out to be a credible source, there is no need to further negotiate their response. For a high-rising declarative, however, the truth of the proposition is not at stake. The speaker does not engage the addressee to reduce the asymmetry of knowledge; full Engagement is therefore not expected. The Engagement must relate to something else, which, I suggest, is the relevance of the SA.
In Subsection 3.1.3 we saw that this results in a temporary downgrading of the belief status: the proposition is treated as knowledge in the SSP rather than a proposal for the proposition to enter the SSB. As a consequence of the belief condition, the interlocutors need to negotiate. We can now formalize the different degrees of Commitment and Engagement by resorting to the belief and asymmetry conditions for those conversational moves that cannot be defined by either full or no Commitment or Engagement. Recall that +/-COM and +/-ENG define the coordinates of the negotiation space. For primary SAs, it is possible to reduce the variables to one: if the speaker lacks sufficient evidence to commit to a proposition, they engage the addressee; if the speaker has sufficient evidence, they do not. In the previous sections we saw that high-rising and rising declaratives make use of the middle ground between the endpoints of full and no Engagement and of full and no Commitment, respectively. The use conventions of high-rising and rising declaratives, however, suggest that things are more complicated: the former come with unmarked Engagement despite sufficient evidence for the truth of the proposition, and the latter (fully) engage the addressee despite having some evidence for the truth of the proposition. If Commitment and Engagement are independent variables, their relation must be mediated by another factor. This can be done by including the asymmetry condition in the definition of the three different degrees of Commitment.
Let English have the conversational variable COM. Let this variable scope over the content of an intonational phrase α, which has propositional content. Then, [[COM (α)]]A,B is defined as:
a. [[-COM (α)]]A,B ≈ A cannot propose [[α]] to be added to {BelA (p) ∪ BelB (p)} because the initial state {κA > κB} cannot be reduced for lack of evidence available to A.
b. [[+COM (α)]]A,B ≈ A proposes [[α]] to be added to {BelA (p) ∪ BelB (p)} because the initial state {κA > κB} can be reduced based on evidence available to A.
c. [[uCOM (α)]]A,B occurs elsewhere.
Unmarked Commitment (uCOM) simply arises if Commitment is specified as neither +COM nor -COM. The speaker may leave Commitment unmarked because they only have partial knowledge of the truth of a proposition, because they are biased toward it, or because they would like to add important information. Note that the three degrees of Commitment formalized here correspond to a positive and a negative feature valuation for full and no Commitment, respectively, and to an unvalued feature for unmarked Commitment. Commitment captures the relation of the speaker to the propositional truth. Hence unmarked Commitment is characterized by neither an established relation to the truth nor an absence thereof; it is characterized by the insufficient nature of the evidence, which allows the speaker to commit neither to the proposition nor to its polar opposite.
A similar ternary distinction is evident in the three degrees of Engagement, which can be modeled by integrating the belief condition in their definition:
Let English have the conversational variable ENG. Let this variable scope over the content of an intonational phrase α, which has propositional content. Then, [[ENG (α)]]A,B is defined as:
a. [[-ENG (α)]]A,B ≈ A does not engage B to move [[α]] from {κA > κB} to {BelA (p) ∪ BelB (p)} because the speaker can reduce an initial state {κA > κB} as a credible source.
b. [[+ENG (α)]]A,B ≈ A engages B to move [[α]] from {κA > κB} to {BelA (p) ∪ BelB (p)} because B appears to be a credible source.
c. [[uENG (α)]]A,B occurs elsewhere.
Again, unmarked Engagement (uENG) simply arises if Engagement is specified as neither +ENG nor -ENG. The speaker may leave Engagement unmarked if the issue is not of propositional nature.
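The ternary definitions above can be read as a small decision procedure. The following Python sketch is only an illustration under stated assumptions: the names Degree, Evidence, commitment, CONFIGURATIONS and is_inverse are hypothetical, and the three-way coding of the speaker's evidence is a simplification of the conditions in the COM definition rather than part of the formal proposal. The configurations listed are those of the four constructions discussed in this chapter.

```python
from enum import Enum


class Degree(Enum):
    MINUS = "-"     # no Commitment / no Engagement
    UNMARKED = "u"  # the "elsewhere" case
    PLUS = "+"      # full Commitment / full Engagement


class Evidence(Enum):
    # Hypothetical three-way coding of the evidence available to speaker A.
    LACKING = "lacking"
    PARTIAL = "partial"
    SUFFICIENT = "sufficient"


def commitment(evidence: Evidence) -> Degree:
    """Map A's evidence to a degree of Commitment, following the COM definition."""
    if evidence is Evidence.SUFFICIENT:
        return Degree.PLUS      # the asymmetry can be reduced based on A's evidence
    if evidence is Evidence.LACKING:
        return Degree.MINUS     # the asymmetry cannot be reduced for lack of evidence
    return Degree.UNMARKED      # elsewhere (e.g. partial knowledge or bias)


# (Commitment, Engagement) for the four constructions discussed in this chapter.
CONFIGURATIONS = {
    "falling declarative": (Degree.PLUS, Degree.MINUS),
    "high-rising declarative": (Degree.PLUS, Degree.UNMARKED),
    "rising declarative": (Degree.UNMARKED, Degree.PLUS),
    "rising interrogative": (Degree.MINUS, Degree.PLUS),
}


def is_inverse(construction: str) -> bool:
    """True if Commitment and Engagement sit at opposite endpoints of their scales."""
    com, eng = CONFIGURATIONS[construction]
    return {com, eng} == {Degree.PLUS, Degree.MINUS}
```

In this sketch, is_inverse holds for falling declaratives and rising interrogatives but fails for rising and high-rising declaratives, which is one way of picturing why the degree of Engagement cannot simply be derived from the degree of Commitment.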
The belief condition, which ensures the economy of a conversation, is one of two factors that break up the dependency of Commitment and Engagement. It is possible to engage the addressee to whatever degree independently of the initial knowledge state because there can be a difference between what is accepted for the purposes of the conversation and what is believed, which corresponds to my distinction between SSP and SSB. Unmarked Engagement does not draw on a knowledge state, but on the metalinguistic notion of how to proceed with the conversation, which includes the question of whether or not a SA is relevant. The other factor, the asymmetry condition, points to the possibility that a knowledge asymmetry may not be defined by a clear-cut distinction between knowing and not-knowing the truth of the proposition. Leaving the Commitment unmarked reflects the insufficiency of the present evidence for publicly committing to the truth of the proposition.
3.3 Commitment and Engagement compose the Dialogical Speech Act Model
Following from the observations in Section 3.1 that Commitment and Engagement reflect the use conventions and intended effects of a range of different constructions, and from the observations in Section 3.2 that these variables determine the development of the conversation, it is clear that there is a very close relation between Commitment and Engagement. In this section, I argue that Commitment and Engagement are – contrary to what is often assumed – independent variables. While these variables appear to be in an inverse relation in primary SAs, the use conventions of rising and high-rising declaratives show that there are systematic deviations from this pattern. In other words, the degree of Engagement does not automatically follow from a particular degree of Commitment, or vice versa. Instead, I argue that a speaker can choose to engage the addressee to any degree at whatever degree of Commitment.
However, both unmarked Commitment and Engagement will prevent a direct proceeding in the discourse and will require some negotiation between the interlocutors of how to proceed. I showed earlier that the degrees of Commitment and Engagement fall into three discrete categories: full, none, or unmarked. Among the constructions discussed, falling declaratives and rising interrogatives exemplify configurations where the two pragmatic variables are at the opposite ends of each scale. The use conventions of rising interrogatives are defined by no Commitment and full Engagement of the addressee. Falling declaratives are characterized by full Commitment and no Engagement. High-rising declaratives also come with full Commitment, but only with an unmarked Engagement. For rising declaratives, the degree of Engagement corresponds to that of a rising interrogative; but for the rising declaratives, the speaker does not mark their Commitment. This leaves us with the configurations listed in Table 3.10 for our four conversational phenomena.
SA                       Commitment  Engagement
Falling declarative      full        none
High-rising declarative  full        unmarked
Rising declarative       unmarked    full
Rising interrogative     none        full
Table 3.10: Conversational properties of falling, high-rising, rising declaratives and rising interrogatives
The use conventions and intended effects of rising and high-rising declaratives – captured by the unmarked degrees of Commitment and Engagement, respectively – demonstrate that the two variables can be independent from each other. For rising declaratives, the full degree of Engagement is based on the fact that the speaker cannot commit fully to the proposition; for high-rising declaratives the unmarked degree of Engagement is not based on the fact that the speaker cannot commit fully to the proposition. Crucially, the configurations in Table 3.10 do not exhaust the logical possibilities of combining different degrees of Commitment and Engagement.
This is an important observation since we could otherwise reduce the different configurations to an erroneous rule that treats their relation as dependent, such as the rule that a speaker must engage the addressee if they lack evidence to commit to a proposition. While such a rule may capture the conversational effects of the four constructions in Table 3.10, there are other constructions whose conversational properties are captured by Commitment and Engagement that break with this pattern. To demonstrate how Commitment and Engagement relate to the interpretation of the different constructions, it is helpful to arrange their possible configurations on a two-dimensional plane (see Table 3.11). Each axis corresponds to one pragmatic variable. The origin is the point where no Commitment maps onto no Engagement. On each axis, we find the three degrees of Commitment and Engagement, respectively. As of now, we can only fill in the configurations of rising interrogatives (Int), rising declaratives (Dec), high-rising declaratives (Dec) and falling declaratives (Dec).
Table 3.11: Configurations of Commitment and Engagement (preliminary)
In Chapter 5, however, I demonstrate that English indeed has constructions that map onto each logically possible configuration of Commitment and Engagement. Disjunctive interrogatives (INT), wh-interrogatives (Wh-INT), echoes (XP), rise-fall-rise declaratives (DEC) and utterances with a modified rise (XP) complete the picture (see Table 3.12).
Table 3.12: Configurations of Commitment and Engagement (still preliminary)
Any assumption that stipulates a dependence between Commitment and Engagement cannot hold because a full degree of Commitment can still result in a full degree of Engagement in the case of echoes. It is also possible to both communicate no Commitment to a proposition and no Engagement to resolve a propositional issue in the case of alternative questions.
For echoes, the propositional truth is not at stake; instead the speaker engages the addressee about the truth claim. For alternative questions, the speaker cannot commit to any of the stated alternatives and engages the addressee not to resolve an issue, but to accept the propositional choice. The question interpretation only emerges from the presence of the alternatives being mentioned. Hence, the basis of the different degrees of Commitment and Engagement remains the same independent of what degree of Engagement the degree of Commitment is mapped to. This is made possible by disentangling the notions of Commitment and Engagement from any association with questionhood or the lack thereof. Consider again the effects of the different degrees of Commitment and Engagement as summarized in Table 3.13.
No
  Commitment: The speaker cannot commit to the truth of the proposition because they do not have sufficient evidence to do so.
  Engagement: The speaker does not engage the addressee to resolve an issue because they expect the addressee to accept the propositional content.
Unmarked
  Commitment: The speaker leaves the Commitment unmarked because they may lose face if they commit. Consequently, the propositional content needs to be negotiated.
  Engagement: The speaker leaves the Engagement of the addressee unmarked because the proposition is not at stake. Consequently, the SA needs to be negotiated.
Full
  Commitment: The speaker commits to the truth of the proposition at issue because they have sufficient evidence to do so.
  Engagement: The speaker engages the addressee to resolve a propositional issue.
Table 3.13: Conversational effects of Commitment and Engagement
Unmarked Engagement, then, stands out by the fact that it is not about the truth of the proposition. This is different from Malamud & Stephenson’s (2014) claim that rising declaratives are associated with a metalinguistic issue. I claim that a metalinguistic issue has its foundation in a lack of Engagement, and not in a lack of Commitment.
Unmarked Engagement points to an issue with the SA itself. We can therefore complete the overview of the individual configurations of Commitment and Engagement by relating their different degrees to conversational effects following from the contextual analysis in Subsection 3.1.2. On the one hand, no Commitment presents the addressee with a set of propositional alternatives; unmarked Commitment marks a proposal for negotiation (I adopt the term proffer from Ettinger & Malamud 2013); and full Commitment corresponds to the classic notion of asserting the truth of a proposition. No Engagement, on the other hand, communicates an expectation that the addressee should accept a proposition or a propositional choice; unmarked Engagement comes with the expectation that the addressee attends to the SA; and full Engagement is a call to respond to a propositional choice.
Table 3.14: Degrees and effects of Commitment and Engagement
The details of how the different configurations adequately capture the conversational properties of the constructions not discussed in this chapter can be found in Chapter 5. The insight to be taken away from this overview of the different configurations is that we can systematically account for when Engagement inversely maps onto Commitment, and when it does not. While the endpoints of Commitment and Engagement always relate to a set of propositions (singular or multiple, that is), unmarked variables relate the SA to metalinguistic issues. These issues can be characterized by a sense of incompleteness for Commitment, and a sense of attention-seeking for Engagement. The latter is possible if the addressee is asked to engage with the SA rather than the proposition.
3.4 Comparison with earlier speech act models
In this section, I compare the decomposition of SAs into configurations of Commitment and Engagement to previous models of primary and derived SAs. I begin with a comparison of the implications of my model for the negotiation table.
Between the conversational models discussed in Chapter 2 and my current proposal, there are only minor differences in what elements and mechanisms they contain. They all contain a version of the cg, a table of negotiation, a notion of projection and different ingredients that are being negotiated. The key difference is that my conversational model relies on two simultaneous conversational moves rather than one, which makes my model a dialogical one. I also compare my use of projection and metalinguistic issue for Engagement with those in Malamud & Stephenson (2014) and Krifka (2015), respectively. Finally, I show that the Dialogical SA Model does not depend on the notion of salient proposition for resolving ambiguous form-function mappings (cf. Bartels 1997; Truckenbrodt 2012).
In my account, the negotiation table is the space where conversational issues are negotiated that cannot be assumed to be resolved without an act of negotiation. To guarantee an economical exchange, propositional choices are resolved, and propositional truths are accepted without making use of the negotiation table if possible. The economy of conversation is captured by the asymmetry condition and the belief condition, which draw on the principle of least collaborative effort by Clark & Wilkes-Gibbs (1986) and Stalnaker’s (2002) distinction between accepting and believing. The SSP refers to any knowledge shared by the interlocutors – based on linguistic or extralinguistic knowledge. The SSB refers to any beliefs shared by the interlocutors – based on evidence or successful negotiation. Neither the SSP nor the SSB needs to be specified in the model. The SSB is the intersection of speaker- and addressee-beliefs; the SSP is the union of what is on the table and what can be accommodated by reference to any extralinguistically-salient entity. Hence, there are only three elements in my model: the speaker’s ground, the addressee’s ground, and the table.
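The reduction to three elements can be pictured with ordinary set operations. The Python sketch below is a minimal illustration, assuming that propositions can be stood in for by plain strings; the class name Conversation and its attribute and method names are hypothetical and serve only to show that the SSB and the SSB's counterpart, the SSP, are derived rather than stored.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    # The only three elements that are stored (propositions modeled as strings).
    speaker_ground: set = field(default_factory=set)
    addressee_ground: set = field(default_factory=set)
    table: set = field(default_factory=set)

    def ssb(self) -> set:
        """SSB: the intersection of speaker beliefs and addressee beliefs."""
        return self.speaker_ground & self.addressee_ground

    def ssp(self, accommodated=()) -> set:
        """SSP: what is on the table plus what can be accommodated
        by reference to an extralinguistically salient entity."""
        return self.table | set(accommodated)


talk = Conversation(
    speaker_ground={"it is raining", "the bus is late"},
    addressee_ground={"it is raining"},
    table={"the bus is late"},
)
print(talk.ssb())
print(talk.ssp(accommodated={"there is a wet umbrella"}))
```

Because ssb and ssp are computed on demand, neither has to be listed as a separate component, which mirrors the claim that the model needs only the two grounds and the table.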
These three elements of the model have corresponding elements in previous conversation models. Farkas & Bruce (2010) distinguish between the belief of the interlocutors and Stalnaker’s cg as an additional element. Issues that are being discussed are placed on the table. Malamud & Stephenson (2014) go one step further in assuming a projected version of each of the current elements, the speaker’s ground, the addressee’s ground, the cg, and the table. Ettinger & Malamud (2013) take all of these elements and additionally split the table into one for choices and one for proposals. And Krifka (2015) has all of these elements albeit under different terminology: his commitment state roughly corresponds to my notion of cg; his commitment space includes the cg as well as its projected continuation; and the beliefs of the interlocutors are represented by the propositions or propositional choices they publicly commit to and the alternatives to which they do not.  In the model proposed in this thesis, the notion of projection, which in previous models receives an additional space – or several – is captured by the notion of Engagement. The innovation in my model is to conceive of projection as being part of the same conversational move as Commitment is. Engagement communicates the speaker’s intention of how the addressee should continue; Commitment communicates the propositional attitude of the speaker based on the evidence they have available for making a public commitment. This leaves us with the content that is being negotiated. In Malamud & Stephenson (2014), this content corresponds to propositions or a metalinguistic issue. In Ettinger & Malamud (2013) the different issues correspond to different tables: one table for choices, such as those present in questions, and one for proposals – both corresponding to propositional sets of different sizes. In a way, here, a choice corresponds to the metalinguistic issue in other models.
In Krifka (2015), the metalinguistic issue is a meta-SA, which serves to deal with biases and denegations (i.e. negations at the SA level; Cohen & Krifka 2014). In my model, what is being negotiated is either a propositional set or a SA. So-called metalinguistic issues only arise if the SA itself is negotiated, which comes with unmarked Engagement. In my model, the SA itself is placed on the table to determine how the conversation continues. Table 3.15 provides a summary of the ingredients of the various models.
Model                              Grounds                     Tables                        Projection                       Content
Farkas & Bruce (2010)              Speaker, Addressee, Common                                Set of cgs                       Propositional sets
Malamud & Stephenson (2014)        Speaker, Addressee, Common                                One at every level of the model  Propositional sets, metalinguistic issues
Ettinger & Malamud (2013)          Speaker, Addressee, Common  Table_choices, Table_proffer  Toward a target cg               Propositional sets
Krifka (2015)                      Commitment states           Return to previous state      Commitment space                 Propositional sets, SA
Dialogical SA Model (this thesis)  Speaker, Addressee                                        via Engagement                   Propositional sets, SA
Table 3.15: Ingredients of different conversational models
While all of the models in Table 3.15 contain very similar elements, they vary in their complexity – mostly based on how they model the future development of the cg. The current model reduces some of that complexity by relying on two assumptions: Firstly, the speaker’s projection of the future development is independent of their attitude, but every SA is defined by both its degree of Commitment and its degree of Engagement. Both conversational variables can involve the interlocutors or the table. Secondly, a conversation is regulated by principles that streamline the conversation. For primary SAs, the speaker can assume that the addressee will not engage in a negotiation but resolve or accept an issue in the most economical way (Clark & Wilkes-Gibbs 1986).
If the addressee has reason to negotiate, they can simply move the issue back onto the table, but they will mark that explicitly (e.g. with a rise-fall-rise contour). Specifically, these assumptions are detailed in the belief condition, which governs discourse moves, and the asymmetry condition, which governs the preliminary assumptions of a conversation. The higher degree of parsimony of the current model therefore arises by assuming defaults and projecting responses. The table analogy is more constrained than previous versions by assuming a direct correspondence of discourse move and the issue being negotiated. Whenever Commitment and Engagement are marked, the negotiation is about the propositional content. If Commitment is unmarked, the negotiation is about a faulty proposition, which for rising declaratives translates to a proposition missing its truth value. If Engagement is unmarked, the relevance of the SA is negotiated. This is summarized in Table 3.16, which lists the contents associated with each variable configuration.
Degree    Commitment to         Move                       Engagement about      Move
Full      proposition           to Ground_addressee        propositional choice  from Ground_addressee to Ground_speaker
Unmarked  faulty proposition    onto Table                 SA                    onto Table
None      propositional choice  remains in Ground_speaker  proposition           added to Ground_addressee
Table 3.16: Issues and moves in the present account of primary and derived SAs
Finally, the current model of Commitment and Engagement does not need to turn to the notion of salient propositions for the interpretation of metalinguistic issues. In Sub-subsection 2.5.2.3, I discussed the use of the notion of salient proposition in Bartels (1997) and Truckenbrodt (2012). For Bartels (1997), salient propositions differ from the propositional content only for interrogatives; for Truckenbrodt (2011), they always apply in explaining how intonation contributes to meaning.
Salient propositions are those propositions that are not being asserted (Bartels 1997) or that are being put up for question (Truckenbrodt 2012). Yet how to narrow down which propositions are salient remains underspecified. My notions of Commitment and Engagement, however, make it possible to dispense with the notion of salient propositions for the interpretation of the contour: only if an issue is placed on the table – for lack of evidence or in anticipation of a non-propositional response – does the addressee need to draw on contextual information to resolve an issue. This contextual information can be characterized in clear terms: the former scenario is based on an extralinguistic entity or event, the latter on the SA mentioned in the previous turn. To see the technical implementation of my account, consider the example in (93), which is provided by Truckenbrodt (2012) to argue for the relevance of salient propositions.
(93) My name is Mark Liberman (H*H-H%)
For the high-rising declarative in (93), Truckenbrodt assumes that what is questioned with H- cannot be the propositional content, but a salient proposition, such as you are expecting me or I am in the right place. In my account, both of these salient propositions fall under the implicatures that arise from fully committing to the propositional content of My name is Mark Liberman and engaging the addressee about the relevance of the SA. Unmarked Engagement tasks the addressee to negotiate the SA itself, which in the case of (93) means negotiating how to proceed from there in the conversation after the speaker has committed to the truth of the proposition. If the SA is accepted, the addressee can resolve the issue by giving a reply based on their knowledge. If the addressee knew about an appointment of the speaker, they can respond by mentioning this appointment (e.g. Right, you have an appointment at 3pm).
Of course, this reply can be analyzed as a response to a contextually salient proposition, but there is no need to go through a process of deciding which proposition may be salient or not. The addressee only needs to consider the proposition and decide about its relevance for the conversation. What is relevant is the information the addressee had before the speaker’s utterance. Rejecting the SA would therefore lead to a reply (e.g. I see, how is that my concern?), which makes it impossible to draw on any of the salient propositions that one might propose based on (93) alone. To conclude then, I summarize the comparison of the different SA models with the following three propositions: (i) Table analogies can be reduced to the spaces of speaker ground, addressee ground and the Table; the notion of cg follows from the intersection of beliefs in speaker ground and addressee ground for the SSB and what is on the table for the SSP; (ii) Projected sets and projected spaces follow from the conversational move of projecting Engagement; Engagement takes care of the additionally-projected spaces in other models; (iii) Negotiated contents or table splits do not need to be explicitly mentioned in the model; they follow from the different degrees of Commitment and Engagement. This makes the present table analogy notably more parsimonious than its alternatives in the literature without compromising on explanatory adequacy. Beyond the table analogy, my account can also dispense with the notion of saliency. Finally, the metalinguistic contents can be reduced to faulty propositions (for unmarked Commitment) and SAs (for unmarked Engagement).
3.5 Conclusion
In this chapter, I argued for a SA model that is based on the two pragmatic variables of Speaker Commitment and Addressee Engagement, which reflect the use conventions and intended effects of both primary and derived SAs and cut across different clause types.
I introduced a revised model of the negotiation table (Farkas & Bruce 2010) that only relies on three elements (speaker ground, addressee ground, and table) and two conversational moves (Commitment and Engagement) to model these conventions and effects. I demonstrated that Commitment and Engagement adequately capture the relevant conversational properties of falling, rising, and high-rising declaratives, as well as of rising interrogatives, independently of any form-function mappings discussed in the previous literature. A crucial role in constraining my model is ascribed to the asymmetry condition and the belief condition, which secure the economy and capture the preliminaries of a conversational exchange. As a consequence of these conditions, Commitment and Engagement can be in an inverse relation for primary SAs. To capture the conventions and effects of derived SAs, I drew on the full paradigm of possible configurations of Commitment and Engagement. A crucial observation in this context is that unmarked Commitment corresponds to a commitment to a faulty proposition and that unmarked Engagement corresponds to an engagement of the addressee about the SA. The resulting relation of Commitment and Engagement has the potential of accounting for a wide range of SAs independent of a clause-type-based notion of SAs. There are three aspects of the SA Problem that require further attention. Firstly, so far I have suspended the issue of how Commitment and Engagement are encoded. This allowed for a reconceptualization of the use conventions and intended conversational effects independent of any constraints introduced by word order or question words. I return to this issue in Chapter 4, where I empirically show that intonation provides the means of distinguishing different degrees of Commitment and Engagement independently of clause types. In Chapter 5, I will then complete the picture by discussing the remaining configurations.
In the same chapter I will also address the other two issues: the intonational variation found among different constructions and the role of intonation independent of propositional meaning.
Chapter 4: Encoding Commitment and Engagement
In this chapter, I discuss possible means to encode Commitment and Engagement. Primary attention is given to the role of SFI (i.e. the so-called nuclear tune), but I also explore the relation between SFI and other factors that contribute to their encoding, such as word order and wh-pronouns. I argue that variation in Commitment and Engagement impacts the shape of the SFI, thereby indicating to what degree the speaker can commit to an issue and to what degree they engage the addressee to resolve this issue. The prosodic variables that correlate with Commitment and Engagement are pitch height, excursion and the duration of SFI. Pitch height uniformly encodes the degree of Commitment in declarative clauses. For rising SFI, Commitment negatively correlates with SFI duration. Engagement positively correlates with SFI excursion. In other words, the shorter and smaller the shape of the rise, the more confident the speaker; the greater the pitch excursion, the higher the response expectation. For falling SFI, we see the opposite pattern: Commitment negatively correlates with SFI excursion and Engagement positively correlates with SFI duration. Put differently, the greater the fall, the more confident the speaker; and the greater the pitch duration, the higher the response expectation. These correlations of Commitment and Engagement with the shape of SFI play a central role for different types of declaratives. For interrogatives, morphosyntax supplies a range of cues that make prosodic encoding less relevant: subject-auxiliary inversion indicates the presence of alternatives; wh-pronouns indicate missing information. This is captured in Figure 4.1.
Figure 4.1: Encoding Commitment and Engagement

In Section 4.1, I introduce my proposal for the encoding of Commitment and Engagement through the shape of the nuclear tune. In Section 4.2, I survey previous investigations of meaningful prosodic variation with a particular focus on the role of SFI. In Section 4.3, I discuss the methods of a two-part perception study that I conducted to provide quantitative evidence for the significance of SFI for encoding Commitment and Engagement. In Section 4.4, I report the results of that study that are relevant for the encoding of Commitment. In Section 4.5, I report the results of that study that are relevant for the encoding of Engagement. In Section 4.6, I discuss the possibility of expanding the findings onto interrogative sentences. In Section 4.7, I conclude and point to the relevance of these findings to the remainder of the development of the SA model.

4.1 Prosodic correlates of Commitment and Engagement in declaratives

In this section, I discuss the encoding of Commitment and Engagement by intonation in the absence of morphosyntactic variation, i.e. in plain declaratives, such as those exemplified below:

a. It’s raining
b. It’s raining
c. It’s raining

I argue that Commitment and Engagement can be encoded by intonation alone. Specifically, they are encoded by the manipulation of the shape of the SFI through variation in pitch excursion and duration. These manipulations have different effects for falling and rising SFI. For rising intonation, I show the following:

i) The more prominent the rise, the smaller the degree of Commitment.
ii) The greater the pitch excursion, the greater the degree of Engagement.

Prominence above is defined as the combination of both excursion and duration. For falling intonation, we (almost) see the opposite pattern of form-function mapping:

i) The greater the pitch excursion, the greater the degree of Commitment.
ii) The greater the duration, the greater the degree of Engagement.
Put differently, Commitment receives a penalty for any rise; so, the smaller the rise, the higher the Commitment. A stronger Engagement can be communicated by increasing the pitch excursion of a rise or the duration of a fall. Expressed in terms of the AM framework, it is the vertical and horizontal distance of the boundary tone (T%) from the anchoring tone (T*) that matters for the encoding, not the individual tones. Depending on the type of contour (rise or fall), those distances are interpreted differently. Since both anchoring and boundary tones are determined in relation to the preceding tone, meaning is impossible to express via a direct tone-to-function mapping. Figure 4.2 represents the encoding of Commitment and Engagement for rises (starting with a high or low pitch accent and ending in a high boundary tone) and falls (also starting with a high or low pitch accent but ending in a low boundary tone). The deciding factor for the degree of Engagement and Commitment is the vertical and horizontal distance between pitch accent and boundary tone.

Figure 4.2: Encoding of Commitment and Engagement for rises (left) and falls (right)

To link this mapping back to the preceding discussion of the conception of intonation as a combination of tonal targets or as set configurations and to the question of the relevance of the nuclear tune, the implications are clear: Commitment and Engagement are encoded by the shape of the nuclear tune. None of the existing frameworks makes predictions that are clear enough to allow me to specify the exact encoding of their degrees, but the transition between tonal targets is the essential component. Hence, if modelled within the AM framework, vertical and horizontal scaling are phonological, rather than phonetic, features in the sense that they are conventionalized.
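The two distances that the proposal relies on can be made concrete with a small sketch. The following Python fragment is an illustration only: the class names, the `PROPOSED_MAPPING` table and the numeric values are hypothetical and are not part of the study. The sketch assumes that a nuclear tune can be reduced to two tonal targets, the anchoring tone (T*) and the boundary tone (T%), each with a time and an f0 value, and it records the direction of the correlations proposed in Section 4.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TonalTarget:
    """A tonal target: time in seconds, fundamental frequency in Hz."""
    time_s: float
    f0_hz: float


@dataclass(frozen=True)
class NuclearTune:
    """Anchoring tone (T*) and boundary tone (T%) of a sentence-final contour."""
    anchor: TonalTarget
    boundary: TonalTarget

    @property
    def contour(self) -> str:
        # Direction of the pitch movement from T* to T%.
        if self.boundary.f0_hz > self.anchor.f0_hz:
            return "rise"
        if self.boundary.f0_hz < self.anchor.f0_hz:
            return "fall"
        return "level"

    @property
    def excursion_hz(self) -> float:
        # Vertical distance between T* and T% (unsigned).
        return abs(self.boundary.f0_hz - self.anchor.f0_hz)

    @property
    def duration_s(self) -> float:
        # Horizontal distance between T* and T%.
        return self.boundary.time_s - self.anchor.time_s


# Direction of each proposed correlation as stated in Section 4.1:
# +1 = the variable increases with the measure, -1 = it decreases.
# "prominence" stands for the combination of excursion and duration.
PROPOSED_MAPPING = {
    "rise": {"Commitment": ("prominence", -1), "Engagement": ("excursion", +1)},
    "fall": {"Commitment": ("excursion", +1), "Engagement": ("duration", +1)},
}

# Hypothetical values for illustration only (not measurements from the study).
tune = NuclearTune(TonalTarget(0.20, 200.0), TonalTarget(0.50, 320.0))
print(tune.contour, tune.excursion_hz, round(tune.duration_s, 2))
print(PROPOSED_MAPPING[tune.contour])
```

The sketch deliberately stores only the direction of each correlation, since the text does not commit to an exact function from excursion or duration to degrees of Commitment and Engagement.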
I motivate the proposed encoding by providing a survey of the (experimental) literature on intonational meaning (Subsection 4.2.1), spelling out my hypothesis and predictions for the perception of the above manipulations and how they are associated with a difference in confidence and response expectation (Subsection 4.2.2), outlining the methods for two parts of a perception study testing these predictions (Subsection 4.2.3), and reporting the results and discussing the findings for Commitment (Subsection 4.2.4) and Engagement (Subsection 4.2.5). I conclude by relating those findings to the issue of encoding Commitment and Engagement as variables composing SAs (Subsection 4.2.6).

4.2 Previous findings on meaningful prosodic variation

In this subsection, I provide an overview of experimentally supported interpretations of SFI. These align with the core meanings postulated in the theoretical literature (as discussed in Section 2.5):

i) Rising intonation is typically associated with uncertainty, incompleteness, or insecurity.
ii) Falling intonation is typically associated with finiteness, completeness, or confidence.

In the following, I focus on the phonetic aspects of intonation alone, which can be separated into aspects of duration (Sub-subsection 4.2.1.1) and pitch height or excursion (Sub-subsection 4.2.1.2). Meanings similar to Engagement and Commitment have been postulated for both measures. I conclude with a report on studies that looked into the interdependency of these dimensions.

4.2.1 Variation within the frequency dimension: pitch height and excursion

The empirical investigations of the interpretation of variation in pitch height and pitch excursion closely follow the predictions arising from the theoretical literature: the more prominent the rise, the more likely a question or continuation interpretation. Of particular interest in this respect is the experimental literature on uptalk.
Here, researchers pay close attention to prosodic variation because the combination of rising intonation and declarative morphosyntax does not lead to a question interpretation. I discuss several studies on rising intonation that consider pitch height or excursion as factors of variation.

In an overview of the experimental literature on uptalk, Warren (2016) concludes that there is not a single acoustic measure that can distinguish an uptalk rise from other types of rises across different dialects. Uptalk rises come in different shapes, with different pitch spans in Canadian and American English and different onsets (H* in Canadian and New Zealand English; L* in Australian, American and British English), which suggests that the encoding of uptalk is best described with L* H-H%, L* L-H% and H* L-H% for Canadian English, and L* H-H%, H* H-H%, and H+L* H-H% for Australian English. Wilhelm (2016) adds further support to the claim that uptalk is realized with different contours in a cross-dialectal corpus analysis. This suggests that any link between contour and function may be dialect-specific (and language-specific), even beyond the scope of uptalk. For individual dialects, it may be possible to identify individual contours dedicated to specific functions. Based on their findings from a map task study, Ritchart & Arvaniti (2014) associate low-rising declaratives with a small excursion (L* L-H%), low-rising declaratives with a question interpretation, level/plateau contours (H* H-L%) with a floor-holding function, and high-rising declaratives (H* H-H%) with a confirmation request. Nevertheless, there was great variation in their findings, with the whole range of rises occurring for floor holding and confirmation requests. The authors also speculate that pitch scaling may be important beyond an L/H distinction.
Prechtel & Clopper (2016) speculate that the low frequency of high-rising contours in Midwest American English may be an indication of a phonological distinction between question and uptalk rises. Fletcher & Loakes (2010) explore the degree to which rising intonation can turn the interpretation of an assertion into a question in Australian English. Contours were manipulated to arrive at a threefold variation in fundamental frequency (in Hz) of the onset and a sevenfold variation of the offset of the nuclear tune. Participants increasingly favored a question interpretation with higher sentence-final pitch values. Questionhood ratings increased incrementally with higher pitch values (240 – 480 Hz for a female speaker). Interestingly, rises with a high onset (240 Hz), which result in a reduced pitch excursion, were only interpreted as questions when the pitch height reached a high frequency. Translating this into AM notation, Fletcher & Loakes claim that H* H-H% was the tone most likely to be interpreted as a question, followed by L+H* H-H%. L* H-H% had the fewest question ratings. Hence, when ending in a high tone, contours with smaller excursion were more likely associated with questions than those with greater pitch excursion. This trend was mirrored in a second measure where participants were asked to rate the confidence in their judgment. Participants felt most confident with a high anchoring tone (H*) and a high boundary tone (H%). In the discussion of their findings, Fletcher & Loakes associate the three contours with different interpretations: L* H-H%, i.e. the contour with the lowest pitch excursion, is associated with some doubt or insecurity over a response; L+H* H-H% is associated with uncertainty about a scale the speaker evokes; and H* H-H% is associated with a request for confirmation.
The continuation function is addressed in Nilsenova (2006), where fifty-one native speakers of American English were asked to predict the continuation of a number of utterances. Seventeen of these participants were only presented with transcripts of utterances including their immediate contexts; another seventeen only heard the audio stimuli, and the remaining seventeen participants had both transcripts and audio stimuli. The response choices among which all participants had to choose were defined by the presence vs. the absence of turn-transitions: participants could choose between no response, a brief acknowledgement (i.e. backchanneling), and an evaluative response. This choice is motivated by Nilsenova’s definition of rising intonation in the context of declaratives as “evaluative response-seeking”. Independent of whether participants made their judgments on the basis of scripts, audible stimuli, or a combination of the two, fewer than half of the declaratives with a rising intonation were identified as response-seeking. Although types of rises were not discriminated in the analysis, Nilsenova remarks that some of the stimuli that were not associated with a response-seeking function came with a rise, including a high rise (H* H-H%).

In conclusion, existing attempts to isolate specific contours as being dedicated to uptalk or to associate a particular contour with a specific function face the challenge of great cross-dialectal differences. Even within the same dialect, there was considerable variation in those studies that looked for contour-to-function mappings. Some of the findings are outright contradictory. A small pitch excursion, for instance, is taken as an indicator of a question interpretation in Fletcher & Loakes (2010) and Prechtel & Clopper (2016), but as an indicator of a request in Ritchart & Arvaniti (2014) and of a non-response-seeking function in Safarova (2006).
Hence, pitch excursion may be a useful measure for distinguishing between different types of rises, but it may come with great variation. However, the varied findings across and within dialects may be an artifact due to a lack of replications across different studies and the low participant numbers within the individual studies. As for a means of encoding Commitment or Engagement, the existing findings loosely suggest that pitch height or excursion may be worth examining in accordance with the state of the art in the theoretical literature (i.e. a distinction between low and high rises for questioning and other functions).

4.2.2 Duration

There is only a handful of experimental studies that have explored the temporal dimension of pitch variation. These studies consider temporal aspects of the intonational contours for their interpretations (e.g. Kohler 2004; 2006; Ramus & Mehler 2006). In a survey of several acoustic measures, Pon-Barry (2008) and Pon-Barry & Shieber (2011) report that temporal aspects are the most reliable indicators of speaker confidence. They specifically list total silence, percent silence, total duration, and speaking duration as significant cues. F0 slope and range, i.e. the frequency aspects of pitch variation, had smaller correlations with speaker confidence. An interesting detail is that the timing features were significant both at the global and the local level: duration, for instance, significantly impacted the ratings of perceived confidence of the overall sentence and when speakers reached a target word (presumably the most prominent word).

The most elaborate investigation of duration to date is Tomlinson & Fox Tree (2010). I discuss this investigation in greater detail since it inspired, in part, the design of my own experiments. In a series of experiments, Tomlinson & Fox Tree investigate the effect of changes in duration (i.e.
the time over which the contour unfolds) and come to the tentative conclusion that duration negatively correlates with perceived expertise (rated in three different ways). Stimuli consisted of twenty-four rising declaratives (H* L-H% and H* H-H%) and twenty-four falling declaratives (H* H-L% and H* L-L%), which were elicited semi-naturally. Half of each type of declaratives had prolonged syllables (M = 501 ms, SD = 156 ms), half of them did not have prolonged syllables (M = 291 ms, SD = 133 ms). This results in a difference of 210 ms on average or an increase of 72% in duration for the prolonged syllables. Qualitatively, the difference in duration corresponded to an early vs. late peak of the contour. These stimuli were tested in three experiments.

In their first experiment, twenty native speakers of English had to judge on a 7-point Likert scale whether a speaker they listened to accurately recalled some facts about celebrities they had learned at an earlier point in time. Duration was a significant factor since stimuli with prolonged syllables were judged as less accurately recalled than those without prolongation. Contour type was not a significant factor. In other words, independent of whether the contour was a rise or a fall, stimuli with prolonged syllables were associated with less-knowledgeable speakers. The second experiment was an online replication of the first experiment with twenty speakers who listened to the same stimuli. Participants had to press a button after hearing a word they saw 500 ms before the onset of each audible stimulus. This target word occurred after a first sentence ending in a rise. Prolonged stimuli had faster reaction times than non-prolonged stimuli. Pitch (rising vs. falling intonation) was not an independent effect, but there was a significant interaction between rise and prolongation.
Put differently, the fastest reaction times occurred with prolonged rises, which came with the advantage of longer processing times of the following sentence. The authors interpret this finding as a confirmation of a link between long rises and a forward-looking function, i.e. an indicator of continued elaboration. Finally, in a third experiment, forty-two native speakers of English were split into two subgroups and responded to the same stimuli as before with the same procedure as in the second experiment. One subgroup was told they were rating experts, the other that they were rating non-experts. For non-experts, there was a significant interaction of rising intonation and prolongation. Again, prolonged rises had the shortest reaction times. This interaction was absent in the expert condition. The absence of main effects is worth noting here because it shows that neither duration nor contour type was a significant predictor of reaction times when participants were told that they would listen to experts.

Summarizing the findings of all three experiments, then, prolongations correlate with a decrease in sounding knowledgeable independent of the contour type (as per the first experiment) and interact with rising intonation in facilitating target word recognition (from non-experts) if that target word occurs after the rise. Tomlinson & Fox Tree therefore refer to prolonged rises as forward-looking. This interpretation of their findings competes with a proposal by Brinton & Brinton (2010), which associates prolongations with the opposite effect: rather than encoding a forward-looking function, long falls are here proposed to encode finality and long rises are proposed to encode questioning (which would be backward-looking in Tomlinson & Fox Tree (2010)). Likewise, it is the brevity of falls and rises that is associated with attenuation and reservation, respectively.
Again, these functions would be considered forward-looking in Tomlinson & Fox Tree (2010) and are associated there with prolongations. Brinton & Brinton (2010) base their claims about the effects of longer falls and rises on their observations about prosodic variation in response markers, such as yes and no. The problem with associating any specific effects with a change in duration is that phrase-final lengthening has been observed to naturally occur at the boundaries of any intonational phrase (Turk & Shattuck-Hufnagel 2007; Wightman et al. 1992; Scott 1982); hence we expect duration to vary independently of any pragmatic function. If Commitment is encoded via intonation, the matter of phrase-final lengthening could be a confounding factor. Furthermore, Boegels & Torreira (2015) report that phrase-final lengthening serves as a predictor of short vs. long questions. It has also been found in a regional dialect of Scottish English (Orkney English) that short duration (global) is an interrogativity marker (Van Heuven & Van Zanten 2005). Cross-linguistically, marking questions through final lengthening is well attested for many African languages (Rialland 2007 shows that all 18 Gur languages possess this feature). Even in North-American English, this phenomenon is well documented as an intermediate step in first-language acquisition. Patel & Grigos (2006) and Snow (1994) show that lengthening is acquired later than F0 manipulation but is employed for question marking before pitch manipulation is mastered.

Just as with pitch height and excursion, duration is associated with several different functions and displays a great amount of variation. It nevertheless seems worth exploring temporal aspects for correlations with Commitment, especially in light of the competing proposals of Tomlinson & Fox Tree (2010) and Brinton & Brinton (2010).
Two questions that arise from these accounts are (i) whether duration has the same effect for rises and falls, and (ii) whether duration and pitch height/excursion need to be conceptualized as interactions. These questions also reflect that uncertainty has been associated both with a change in pitch excursion and duration.

4.2.3 (In)dependence of duration and pitch

The fundamental frequency, which is the physical signal that corresponds to our perception of intonation, is typically given in hertz (Hz). One hertz is one cycle of the sinusoidal pattern per second (Johnson 2011). This is how we describe the rate of vibration of our vocal cords producing sound. By definition, then, pitch also includes a temporal dimension. However, the temporal dimension is relevant in two ways: since intonation captures the rise or fall of the fundamental frequency over time, the rate of vibration will change over time. Hence, it is possible that the same change in pitch over a shorter period of time might sound like a rise with a higher pitch excursion than that change over a longer period of time. Not surprisingly, then, syllable duration and the shape of the contour seem related (Lyberg 1981). The key question therefore is whether we can conceive of duration as an independent measure. Several studies have looked at this issue and agree it is worth looking at both dimensions independently. In their investigation of cross-dialectal differences of rising intonation in American English, Armstrong et al. (2015) tested the independence of change of duration and the slope of the contour. The two measures showed a negative correlation, i.e. longer duration was perceived to have a shallower slope. Based on this interaction, they argue that both measures matter. Rietveld & Gussenhoven (1987) show that the dependency may go both ways, at times. For instance, high monotonous sections in the contour were interpreted as spoken at a faster rate.
Snow & Balog (2002) argue for a dependence of pitch height and duration. They even argue that pitch height is not a sufficient cue to pragmatic and attitudinal meaning. Support beyond the physical domain comes from the observed path of acquisition: variation in duration is acquired in falls before it is acquired in rising intonation. Gussenhoven (2004) speculates that addressees have a tacit knowledge of the relation between peak delay and peak height given that higher peaks will occur later than lower peaks at a fixed rate of change in fundamental frequency. If this tacit knowledge exists, a speaker may employ peak delay to emphasize or substitute for an increase in pitch height. Ladd & Morton (1997) show that peak delay is associated with unusual prosodic marking, which Gussenhoven (2004) takes as support for the idea that delayed peaks are perceived as higher than non-delayed peaks. So, although the exact nature of the relation of pitch height/excursion to duration is unclear, this brief survey of investigations of their dependency suggests that an interaction of those factors is to be expected. Whether this is the case for both rises and falls is unclear since both developmental (Snow & Balog 2002) and experimental (Tomlinson & Fox Tree 2010) findings point to differences between these contours. Evidence for the interaction of excursion and duration is exclusive to rising intonation in the existing literature.

4.2.4 Hypothesis and Predictions

Based on the findings reported in the experimental literature, I hypothesize the following relationship between Commitment and Engagement and the shape of the nuclear tune:

(95) Commitment and Engagement affect the shape of the nuclear tune. Commitment negatively correlates with the duration of a sentence-final rise; Engagement positively correlates with the pitch excursion of a sentence-final rise. These measures are in a dependent relation.

Notice that this hypothesis is exclusive to rising intonation.
Previous findings do not allow for predictions about falling intonation. The offline results of Tomlinson & Fox Tree (2010) show no difference between falls and rises for duration; pitch excursion is exclusively discussed for rising intonation. As for the hypothesized interaction between pitch excursion and duration, it is plausible to expect that short, shallow rises have a higher rating of Commitment than long, high rises. For Engagement, we expect the opposite if addressees really have an implicit knowledge of their dependence (Gussenhoven 2004). I consciously decided to frame my hypothesis in reference to pitch excursion rather than pitch height to allow for a distinction between high-rising and rising declaratives.

To test the hypothesis in (95), I designed two perception experiments that separately investigated the correlations of duration and pitch excursion with Commitment and Engagement. Given the attested interaction of the two prosodic measures, it may well be that both are relevant for either variable. The hypothesis in (95) leads to the following predictions for the outcome of my experiment:

Predictions for rising declaratives:
a. Commitment: An increase in duration negatively correlates with the perceived degree of Commitment. An increase in excursion has the same effect. Both factors can interact.
b. Engagement: An increase in excursion positively correlates with the perceived degree of Engagement. Neither an effect of duration nor an interaction of both factors is expected.

Predictions for falling declaratives:
a. Commitment: An increase in duration negatively correlates with the perceived degree of Commitment. No effect for pitch excursion or any interaction is expected.
b. Engagement: Changes in excursion or duration are not expected to correlate with the perceived degree of Engagement – neither individually nor in combination.

Predictions for the comparison of contour types and associated effects:
a. 
Contour type: Effects of duration are identical for rises and falls, but not of excursion.
b. Effect: Duration correlates more with Commitment than with Engagement; Excursion correlates more with Engagement than with Commitment (based on effect sizes).

These predictions reflect the similar effects of duration in Tomlinson & Fox Tree’s (2010) first experiment for both falls and rises, but their online experiments as well as the focus in the experimental literature on variation in rising intonation (see Subsection 4.2.1) lend more substance to the predictions for rises than for falls. Hence, I focus on rising intonation in my experimental investigations (and include falling declaratives as controls). While later accounts of the British tradition allow for a distinction between different types of falls (see Subsection 4.1.1), I expect that pitch excursion only matters for rising intonation in accordance with the earlier accounts in this tradition and in following the threefold distinction between falling, rising, and high-rising declaratives.

4.3 Methods for investigating question 1 and question 2

In this subsection, I provide the details of my methods for testing for relations between pitch height and duration with Commitment and Engagement. They were identical for both parts of the experiments (one part addressing Commitment (Q1) and one addressing Engagement (Q2)).

4.3.1 Participants

For the analysis of both parts of the experiment, forty native speakers of Westcoast Canadian English were selected from a pool of 120 undergraduate students who participated as part of the Linguistics Outside the Classroom project at the University of British Columbia. The project does not allow for language-specific exclusion criteria; hence the experiments were run until the target number of forty native speakers was reached. Participants were compensated with course credit.
Selection criteria for the assessment as native speaker were place of birth, age of exposure to English, language spoken with parents and peers, and self-reported proficiency, in that order. All target participants needed to have spent the majority of their lives in British Columbia to guarantee a homogeneous dialectal sample. Of the forty native speakers (mean age: 20.85; 1 queer, 7 male, 32 female), twenty were monolingual speakers of Canadian English and twenty were bilingual speakers.

4.3.2 Materials

Materials comprised eighteen rising declarative sentences spoken by three linguists with a Westcoast Canadian accent. Speakers were instructed to sound surprised. The speakers also provided whispered and falling versions of each item for the filler material. From each speaker, six items were selected based on acoustic quality. The stimuli comprised three topics: six declaratives described a state of the weather, six described a cooking activity, and six described an accident. They all consisted of the pronoun it, a cliticized copula ‘s and a present participle to form the present progressive. A sample item of each content category is provided in (99).

(99) a. It’s raining
b. It’s baking
c. It’s sinking

These rising declaratives were manipulated using Praat (Boersma & Weenink 2017) to create 108 critical items, 108 controls and 108 fillers. The manipulation of the critical items consisted of multiplying the pitch excursion of the final word by a factor of .5 and multiplying the duration of the final word by .75 and 1.25 each. The onset of the manipulations was chosen to coincide with the beginning of the participle (rather than the onset of the rise) to avoid unnatural transitions after resynthesis. Manipulations and resynthesis were done with a single Praat script. The manipulations resulted in 108 rising declaratives of six different shapes (i.e. eighteen per shape).
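The count of critical items follows from crossing the base items with the manipulation factors. The Python sketch below is only a bookkeeping illustration and is not the Praat script used for resynthesis. The variable names and level labels are hypothetical, and it assumes that the unmanipulated recording supplies the high-excursion and mid-duration levels (factor 1.0).

```python
from itertools import product

N_BASE_ITEMS = 18  # elicited rising declaratives (six per speaker)

# Assumed labels: factor 1.0 is the unmanipulated recording.
EXCURSION_FACTORS = {"high": 1.0, "low": 0.5}
DURATION_FACTORS = {"short": 0.75, "mid": 1.0, "long": 1.25}

# Cross every base item with every excursion and duration level.
critical_items = [
    {
        "item": item,
        "excursion": exc,
        "excursion_factor": EXCURSION_FACTORS[exc],
        "duration": dur,
        "duration_factor": DURATION_FACTORS[dur],
    }
    for item, exc, dur in product(
        range(1, N_BASE_ITEMS + 1), EXCURSION_FACTORS, DURATION_FACTORS
    )
]

# A "shape" is one combination of excursion and duration level.
shapes = {(s["excursion"], s["duration"]) for s in critical_items}
per_shape = len(critical_items) // len(shapes)

print(len(critical_items), len(shapes), per_shape)
```

Under these assumptions the crossing yields 18 x 2 x 3 = 108 critical items in six shapes, which matches the numbers reported for the critical set.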
Figure 4.3 is a graphic representation of the manipulations where item constitutes the original contour.

Figure 4.3: Manipulations of the critical stimuli

Mean values and standard deviations for excursion and duration of the nuclear tunes for each rise shape are listed in Table 4.1. Pitch excursions differed by an average of 84.289 Hz (SD = 32.469 Hz); durations differed by 117 ms (SD = 27 ms) for the shorter rises and by 112 ms (SD = 36 ms) for the longer rises (with medium length fixed). This corresponds to an average decrease of 24.5% and an increase of 23.5% in duration, respectively. The difference between the multiplication factor and the resulting proportional increase is due to Praat’s pitch measure algorithms. The differences combined (i.e. the difference between shorter and longer rises) add up to 229 ms, which is similar to the difference of the two types of declaratives in Tomlinson & Fox Tree (2010).

Table 4.1: Mean duration and excursion of rises

A sample item with low and high excursions, ordered from short to long, is provided in Figure 4.4.

Figure 4.4: Sample item with low and high excursion for short (left), medium (mid), and long (right) duration

The control items were the identical set of declaratives with inverted contours to produce falling declaratives with the same changes in duration and excursion. Half of the filler items were created by flattening the contours of the elicited falling declaratives to their median and using the Praat Vocal Toolkit (Corretge 2012) for extending and reducing their duration by a factor of .25. The other half of the filler items were created by applying the whisper function of the Vocal Toolkit to the elicited whispered declaratives and manipulating the durations as before. Using the whisper function on the elicited whispered declaratives helped to normalize the variation found within the three speakers and between their individual whispered items.
Table 4.2 provides an overview of the 324 stimuli, which were evenly divided by the variables of excursion, duration and type of stimuli.

                                          Duration
Type                       Excursion    short   mid   long
critical (increasing f0)   high          18     18     18
                           low           18     18     18
control (decreasing f0)    high          18     18     18
                           low           18     18     18
filler (no change in f0)   whispered     18     18     18
                           monotone      18     18     18

Table 4.2: Distribution of stimuli (by type, excursion and duration)

4.3.3 Procedure

All participants participated in both parts of the experiment. One part (Q1) addressed the encoding of Commitment. The other part (Q2) addressed the encoding of Engagement. The order of the two parts was randomized. Of the participants selected for analysis, sixteen began with Q1 and twenty-four began with Q2. Each participant gave written consent to their participation. The two parts of the experiment were separated by a distractor block where participants had to decide whether they had seen red triangles on different slides presenting rectangular and triangular forms in black and red color for 500 ms. The cognitive load required by the task has been shown to be a reliable distractor in past experiments (Mattys & Wiget 2011). The entire distractor block lasted seven minutes. Participants were given the possibility to take a short break before and after the distractor block. The entire session lasted 45-60 minutes including an elaborate language questionnaire followed by the first study, the distractor, and then the second study. Both parts included 324 ratings with optional short breaks every twenty items. Each experiment started with a short training phase of ten items that did not occur in the actual experiments.
In this training phase, the participants received instructions identical to those of the following experimental condition. All instructions were presented on a computer screen using EPrime (Schneider, Eschman & Zuccolotto, 2007). Stimuli were presented only auditorily through headphones. Independent of the phase (training or experimental) and experiment (Q1 or Q2), participants were asked to respond on a five-point Likert scale via a button-press box placed in front of them with buttons labeled [1] to [5] (from left to right).

4.4 Q1: Correlations of duration and excursion with Commitment

In this Subsection, I provide the instructions (Subsection 4.2.4.1) and report the results (Subsection 4.2.4.2) of Q1, which investigated prosodic correlations of Speaker Commitment. I then contextualize my findings with reference to previous studies (Subsection 4.2.4.3).

4.4.1 Instructions

For Q1, participants were instructed to rate audio stimuli based on the instruction in (100).

(100) Do you think the speaker was CONFIDENT about her statement?

Instructions were given in full during the training block and abbreviated to “CONFIDENT?” during the trial block. Based on participants’ comments during the pilot phase, the instructions asked for evaluations of speaker confidence rather than for evaluations of Commitment. This is in line with previous investigations of effects of duration (Tomlinson & Fox Tree 2010; Pon-Barry 2008; Pon-Barry & Schiever 2011). Participants were advised that some of the contours may sound distorted (as in the monotonized and whispered fillers) and were invited to make use of the full scale. They were also told that they had five seconds for their response. Rating a stimulus or exceeding the time limit automatically triggered the next signal. Response options were labeled only at the endpoints, [1] “Not at all.” and [5] “Absolutely.”; the numbers [2], [3] and [4] appeared without labels, evenly spaced in between. The numbers [1] and [5] were also printed on the button-press box.
4.4.2 Results

My analysis is based on linear mixed effects regressions using the lme4 package (Bates et al. 2014) in R (R Core Team 2018). The model structures included all variables covered by the experiment design. Critical items and controls were analyzed separately. Initial model fitting followed a ‘keeping it maximal’ strategy for the random effects structure with subsequent pruning guided by experiment design (Barr et al. 2013). P-values and 95% Confidence Intervals (CI) were obtained via Wald-statistics approximation (using the sjstats package (Lüdecke 2017)). Planned posthoc tests only included pairwise comparisons of contour shapes and were conducted using the emmeans package (Lenth 2018). Visual inspection of residual plots did not reveal any obvious deviations from homoscedasticity or normality. Effect sizes were calculated by obtaining Ω₀² scores (Xu 2003) and R² (using the sjstats package). Graphs were created using the ggplot2 package (Wickham 2016) following suggestions from Weatherholtz (2015).

Figure 4.5 shows the mean response ratings (including standard errors) for the critical items (i.e. rising declaratives). Across the six different shapes of rises, lower excursion was always rated higher than higher excursion. For low excursion, rises with short duration (M = 2.47, SE = 0.09) were rated higher than those with mid duration (M = 2.37, SE = 0.09), which in turn were rated higher than those with long duration (M = 2.33, SE = 0.08). We see the same pattern for high excursion: rises with short duration (M = 2.02, SE = 0.10) were rated higher than those with mid duration (M = 1.91, SE = 0.08), which in turn were rated higher than rises with long duration (M = 1.77, SE = 0.09). None of the rises had a positive rating for speaker confidence (i.e. >2.5).
Figure 4.5: Mean response ratings for speaker confidence for critical items (error bars represent ±SE)

For the analysis, I fitted a linear mixed effects regression model to predict the relationship between confidence ratings and duration in combination with excursion (see Section 4.2.3.4 for details). A full model with duration and excursion as fixed effects and a random effect structure of intercepts and by-item and by-subject slopes for subjects, items, speaker, topic, and experiment order did not converge. I therefore excluded by-item slopes as the least meaningful variable, effectively treating items as invariant for effects of speaker, topic, and experiment order (cf. Barr et al. 2013).

For the critical stimuli (rising declaratives), duration affected mean response ratings on speaker confidence (χ² (2) = 11.256, p = 0.003596): compared to the longest duration, ratings increased by 0.08 ± 0.05 (standard errors) for mid duration and by 0.19 ± 0.05 (standard errors) for short duration. Hence, shorter duration positively correlated with higher ratings for speaker confidence. A pairwise comparison revealed that these correlations were only significant between the shortest and longest duration (F (86.96) = -3.437, p = 0.0113). Pitch excursion also affected mean response ratings on speaker confidence (χ² (1) = 74.269, p < 0.001): contours with low excursion were rated 0.48 ± 0.04 (standard errors) higher than contours with high excursion. Hence, a decrease in excursion positively correlated with higher ratings of speaker confidence. There was no significant interaction for duration and excursion. Statistics of the final model are summarized in Table 4.3.
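The p-values attached to the χ² statistics reported above can be reproduced from the upper tail of the chi-square distribution. The Python sketch below is a minimal illustration that relies only on closed forms for one and two degrees of freedom; the function name is hypothetical, and the sketch makes no claim about how the statistics themselves were obtained in R.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or df = 2."""
    if df == 1:
        # A chi-square variable with 1 df is a squared standard normal,
        # so the tail follows from the complementary error function.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # A chi-square variable with 2 df is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")


# Duration effect for the critical items: chi-square(2) = 11.256.
p_duration = chi2_upper_tail(11.256, 2)

# Excursion effect for the critical items: chi-square(1) = 74.269.
p_excursion = chi2_upper_tail(74.269, 1)

print(round(p_duration, 6))
print(p_excursion < 0.001)
```

The duration effect comes out at about 0.0036 and the excursion effect far below 0.001, in agreement with the values given in the text.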
Model structure: Response ~ Excursion + Duration + (1|Subject) + (1|Sound1) + (1|Semantics) + (1|speaker) + (1|ExperimentName) + (0+Semantics|Subject) + (0+speaker|Subject) + (0+ExperimentName|Subject)

Fixed effects        B          CI                        p
Intercept            1.80641    1.5352636; 2.0775564      <0.001
Low excursion        0.48029    0.3935796; 0.5670004      <0.001
Medium duration      0.08089    -0.0253224; 0.1871024     0.134
Short duration       0.18777    0.0815968; 0.2939432      0.001

Random effects       τ00        ICC        N
Subject              0.276      0.2237     40
Items                0.035      0.0287     121
Speaker              0.036      0.0288     3
Semantics            0.009      0.0073     3
Order                0.097      0.0787     2

σ² = 0.782; Observations = 4518; R² (conditional) = 0.398; Ω₀² = 0.321

Table 4.3: Q1 model of speaker confidence by pitch and excursion for critical items

For control items (i.e. falling declaratives, see Figure 4.6), duration was not a significant predictor of the mean ratings of speaker confidence. Higher excursion, however, received higher response ratings for speaker confidence at each duration. For short duration, higher excursion (M = 3.98, SE = 0.09) was rated higher than low excursion (M = 3.91, SE = 0.09); for mid duration, higher excursion (M = 3.98, SE = 0.09) was rated higher than low excursion (M = 3.85, SE = 0.10); for long duration, higher excursion (M = 3.93, SE = 0.10) was rated higher than low excursion (M = 3.91, SE = 0.09). All contour shapes received a positive rating for speaker confidence.

Figure 4.6: Mean response ratings for speaker confidence for controls (error bars represent ±SE)

As with the critical items, a full model for a linear mixed regression analysis of controls with duration and excursion as fixed effects and a random effect structure of intercepts and by-item and by-subject slopes did not converge. I excluded the by-item slopes for subjects, items, speaker, topic, and experiment order, but kept these variables as by-subject slopes and as intercepts.
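The estimates in Table 4.3 can be read as an additive recipe for the predicted rating of each rise shape, with high excursion and long duration as the reference cell. The Python sketch below illustrates this reading and, as a second step, recomputes intraclass correlations from the listed variance components. All names are hypothetical. Dividing each random-intercept variance by the sum of all listed intercept variances plus the residual variance is an assumption: it ignores random-slope variances and need not hold for other models.

```python
# Fixed-effect estimates from Table 4.3
# (reference cell: high excursion, long duration).
INTERCEPT = 1.80641
EXCURSION = {"high": 0.0, "low": 0.48029}
DURATION = {"long": 0.0, "mid": 0.08089, "short": 0.18777}


def predicted_rating(excursion, duration):
    """Additive model prediction for one rise shape (no interaction term)."""
    return INTERCEPT + EXCURSION[excursion] + DURATION[duration]


# Predicted confidence rating for each of the six rise shapes.
predictions = {
    (e, d): round(predicted_rating(e, d), 2)
    for e in EXCURSION
    for d in DURATION
}

# Random-intercept variances and residual variance from Table 4.3.
TAU = {
    "subject": 0.276,
    "items": 0.035,
    "speaker": 0.036,
    "semantics": 0.009,
    "order": 0.097,
}
SIGMA2 = 0.782

# Assumed decomposition: share of each component in the total variance.
total_variance = sum(TAU.values()) + SIGMA2
icc = {name: tau / total_variance for name, tau in TAU.items()}

for shape, rating in predictions.items():
    print(shape, rating)
for name, value in icc.items():
    print(name, round(value, 4))
```

Under these assumptions, the prediction for low excursion with short duration is about 2.47, matching the observed mean for that shape, and the recomputed intraclass correlations agree with the tabulated values to within rounding.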
For the controls (falling declaratives), duration did not significantly affect ratings of speaker confidence. Pitch excursion affected mean response ratings on speaker confidence (χ² (1) = 12.53, p = 0.0004005): contours with low excursion were rated 0.12 ± 0.03 (standard errors) lower than contours with high excursion. The significance of excursion was confirmed in the pairwise comparison (F (99.17) = 3.674, p = 0.0051). Hence, an increase in excursion positively correlated with higher ratings of speaker confidence. Statistics of the final model are summarized in Table 4.4.

Model structure: Response ~ Excursion + Duration + (1|Subject) + (1|Sound1) + (1|Semantics) + (1|speaker) + (1|ExperimentName) + (0+Semantics|Subject) + (0+speaker|Subject) + (0+ExperimentName|Subject)

Fixed effects        B        CI               p
Intercept            4.03     3.82; 4.24       <0.001
Low excursion        -0.12    -0.18; -0.06     <0.001
Medium duration      0.03     -0.05; 0.11      0.414
Short duration       0.06     -0.02; 0.14      0.139

Random effects       τ00      ICC        N
Subject              0.273    0.2467     40
Items                0.014    0.0125     120
Speaker              0.005    0.0043     3
Semantics            0.006    0.0051     3
Order                0.001    0.0013     2

σ² = 0.604; Observations = 4530; R² (conditional) = -; Ω₀² = 0.405

Table 4.4: Q1 model of speaker confidence by pitch and excursion for controls

4.4.3 Discussion

The results largely confirm the predictions for the effects of duration and excursion on Commitment applied to the response metric of speaker confidence for rising declaratives. The shorter the duration of the rise and the smaller its excursion, the more confident the speaker is rated. It is important to note, however, that confidence ratings remained in the lower half of the scale even for the highest rat