UBC Theses and Dissertations

How do current approaches to communicating ambiguity in risk estimates influence decisions? Hicklin, James Gregory 2018

Full Text

HOW DO CURRENT APPROACHES TO COMMUNICATING AMBIGUITY IN RISK ESTIMATES INFLUENCE DECISIONS?

by

James Gregory Hicklin
B.A., The University of British Columbia, 2015

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Pharmaceutical Sciences)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

April 2018

© James Gregory Hicklin, 2018

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, a thesis/dissertation entitled: How do current approaches to communicating ambiguity in risk estimates influence decisions? submitted by James Hicklin in partial fulfillment of the requirements for the degree of Master of Science in Pharmaceutical Sciences.

Examining Committee:
Nick Bansback, School of Population and Public Health (Co-supervisor)
Larry Lynd, Pharmaceutical Sciences (Co-supervisor)
Mark Harrison, Pharmaceutical Sciences (Supervisory Committee Member)
Wei Zhang, School of Population and Public Health (Additional Examiner)

Additional Supervisory Committee Members:
Peter Loewen, Pharmaceutical Sciences (Supervisory Committee Member)
Mohsen Sadatsafavi, Pharmaceutical Sciences (Supervisory Committee Member)

Abstract

Background: Uncertain outcomes are an unavoidable fact of medicine. First-order uncertainty (e.g. "10 in 100 people can expect an outcome in the next year") has well-established guidelines as to how it should best be presented, but it is not clear if and how to present second-order uncertainty, referred to as ambiguity (e.g. "10 [95% CI 5,15] in 100").

Objectives: To explore the ways in which ambiguity in risk is currently being described to patients by (1) identifying existing presentation techniques and evidence for their potential impact on decision-making, (2) investigating how presentation techniques influence decision-related outcomes, including intention, trust, worry, decisional uncertainty, risk perception, knowledge and preference, and (3) determining which techniques should be investigated further.

Methods: The literature on current techniques to present ambiguity was systematically reviewed through an electronic search of the Medline/PubMed database and an existing database of patient decision support interventions. The influence of each identified communication technique was evaluated through the design and implementation of a web survey using a hypothetical atrial fibrillation vignette.

Results: Nine distinct presentation techniques were identified as having been used in the past, and were shown to influence decision-making outcomes. Of these techniques, the visual and textual range techniques were found to result in statistically significant changes in intention (in both directions), while other techniques decreased trust, increased decisional uncertainty, and resulted in greater knowledge.

Conclusions: Techniques that resulted in the worst knowledge of the range in risk scores tended to be the ones that were preferred by participants. Yet, without good knowledge of the risks involved with different medical options, informed consent and value-based decisions are challenging. Findings from this work indicate that some techniques for presenting uncertainty, such as the visual and textual range techniques, impact various psychometric outcomes related to decision-making, including intention to take oral anticoagulation, trust in risk estimates, decisional uncertainty and knowledge of ambiguity. Further research should focus on testing the influence of these techniques on decision-making related outcomes.

Lay summary

With most medical treatments there are risks. It is important to communicate these risks (referred to as uncertainty) to patients so they can make informed decisions, but many patients struggle to interpret these risks. Furthermore, often we do not know how precise our estimates of these risks are; this uncertainty is known as ambiguity. There are good guidelines on how to present risks to patients, but less clear recommendations on how to present the ambiguity around risk estimates. This thesis examines the available literature to identify the most common techniques used to present ambiguity in risk estimates, and how they can potentially impact decisions. It then tests the impact of the identified techniques for presenting ambiguity using a web survey with a hypothetical scenario involving a heart condition. Several techniques appeared to influence decision-making. Based on the results, some techniques may be good candidates for future research.

Preface

The work involved in this thesis was performed primarily at the University of British Columbia, the Center for Health Evaluation and Outcome Sciences at St. Paul's Hospital, and my home address. I was involved in the conceptualization, study design, data collection, analysis and interpretation at all stages in this program of research. I performed the systematic review and synthesis of data, which makes up Chapter 2 of this thesis, with the support of Dr. Nick Bansback. The hypothetical survey (Chapter 3) was designed and developed by me with the support of the committee (Drs. Nick Bansback, Larry Lynd, Mark Harrison, Peter Loewen, and Mohsen Sadatsafavi) and external field experts (Drs. Paul Han and Liana Fraenkel). I conducted the analysis of the survey results with help from Drs. Daphne Guh and Huiying Sun.
Survey deployment required approval from the UBC Behavioural Ethics Board (H17-02094).

Table of contents

Abstract
Lay summary
Preface
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
1 Introduction
1.1 Uncertainty in health care
1.2 Types of uncertainty in health
1.3 Risk communication
1.4 The evolving role for ambiguity in risk communication
1.5 Case studies
1.5.1 Atrial fibrillation
1.5.2 Rheumatoid arthritis
1.6 Ambiguity aversion
1.7 Thesis goals
1.7.1 Research questions
1.7.2 Hypotheses
1.7.3 Objectives
1.8 Summary
2 Systematic review
2.1 Objectives
2.2 Methods
2.3 Results
2.3.1 Qualitative prefix
2.3.2 Quality of evidence
2.3.3 Range
2.3.4 Other techniques
2.4 Discussion
3 Survey design
3.1 Objectives
3.2 Methods
3.2.1 Overview
3.2.2 Survey development
3.2.3 Survey outline
3.2.3.1 Consent
3.2.3.2 Vignette
3.2.3.3 Main Survey
3.2.3.4 Demographic questions
3.3 Outcomes and sociodemographics
3.3.1 Primary outcome
3.3.2 Secondary outcomes
3.3.2.1 Knowledge
3.3.2.2 Perceived magnitude of risk
3.3.2.3 Decisional uncertainty
3.3.2.4 Worry
3.3.2.5 Trust
3.3.3 Sociodemographics
3.3.3.1 Ambiguity aversion
3.3.3.2 Subjective numeracy
3.4 Population
3.5 Piloting
3.5.1 Expert review
3.5.2 Online pilot
3.5.3 Open-ended feedback
3.5.4 Changes before final survey
3.6 Sample size
3.7 Discussion
4 Survey results
4.1 Statistical analysis methods
4.2 Survey completion results and randomization
4.3 Sociodemographic characteristics
4.4 Primary outcome: Intention
4.4.1 Descriptive results
4.4.2 Logistic regression
4.4.3 Linear regression
4.5 Secondary outcomes
4.5.1 Risk perception
4.5.2 Worry
4.5.3 Trust
4.5.4 Decisional uncertainty
4.5.5 Knowledge
4.5.6 Preference for technique
4.6 Conclusion
5 Discussion
5.1 Overview
5.2 Key Findings
5.3 Implication of results
5.4 Strengths and limitations
5.5 Future research and recommendations
References
Appendix A: CHERRIES Checklist
Appendix B: Ambiguous presentation techniques from survey
Appendix C: Pilot survey intention question
Appendix D: Correlation coefficients between actual intention change and baseline intention across all techniques
Appendix E: Logistic regression univariate test results (no technique in model)
Appendix F: Omnibus chi2 test results for intention
Appendix G: Logistic analyses for directional decrease in intention
Appendix H: Linear regression univariate test results (no technique in model)
Appendix I: Omnibus F-test results for linear regressions
Appendix J: 1, 2 and 3 point proportion changes across secondary outcomes
Appendix K: Table of responses for "low" knowledge question. Shaded row is the "correct" answer.
Appendix L: Table of responses for "high" knowledge question. Shaded row is the "correct" answer.

List of tables

Table 1: Sources of ambiguity as defined by Han's taxonomy
Table 2: Terms used to search Medline
Table 3: Balshem, 201159: outline of old and new GRADE scoring systems
Table 4: Summary of ambiguity presentation techniques as identified by the literature review, along with potential influences on outcomes associated with decision-making
Table 5: AA-Med scale80 to measure ambiguity aversion
Table 6: Pilot demographic results
Table 7: Pilot descriptive results
Table 8: Powers for different plausible sample sizes using normal approximation methods. Highlighted cells are those taken into consideration to determine sample size.
Table 9: Sample size powers for between technique comparisons
Table 10: Demographic results
Table 11: Descriptive results for intention change
Table 12: Cohen's d effect size for all outcomes
Table 13: Logistic regression model for 2-point change in intention
Table 14: Linear regression results for intention change
Table 15: Descriptive results for secondary outcomes
Table 16: Linear regression results for risk perception and worry
Table 17: Linear regression results for trust
Table 18: Linear regression results for decisional uncertainty
Table 19: Proportion of participants who answered knowledge questions correctly
Table 20: Summary of effects on outcomes

List of figures

Figure 1: Theory of planned behaviour
Figure 2: PRISMA for systematic review findings
Figure 3: Grant et al. "about" textual prefix55
Figure 4: MAGIC app
Figure 5: BMJ clinical evidence categories
Figure 6: Schapira's range41
Figure 7: Johnson's range19
Figure 8: Han's range49
Figure 9: Han's gradient range49
Figure 10: Bansback's gradient range60
Figure 11: Correll's violin plot71
Figure 12: First-order visualization with randomly placed icons, illustrating the potential first-order of uncertainty74
Figure 13: Simple animated GIF example
Figure 14: Hypothetical AF scenario with ambiguity
Figure 15: Knowledge questions from survey
Figure 16: Behavioural intention question from survey
Figure 17: Histograms of change in intention by technique
Figure 18: Order of techniques by proportion change in the 1-point, 2-point, and 3-point case
Figure 19: Predictions for adjusted 2-point proportion change in intention
Figure 20: Linear predictions for adjusted change in trust by technique
Figure 21: Linear predictions for adjusted change in decisional uncertainty by technique
Figure 22: Knowledge by technique
Figure 23: Preferred technique by presentation technique

List of abbreviations

AF: Atrial fibrillation
AIDS: Acquired immune deficiency syndrome
BIC: Bayesian information criterion
CHERRIES: Checklist for reporting results of Internet E-surveys
CPM: Clinical prediction model
CVS: Chorionic villus sampling
DCS: Decisional conflict scale
DMARD: Disease-modifying anti-rheumatic drug
GIF: Graphics interchange format
GRADE: Grading of recommendations assessment, development, and evaluation
HINTS: Health information national trends survey
IPDAS: International patient decision aid standards
MAGIC: Making GRADE the irresistible choice
MRI: Magnetic resonance imaging
MTurk: Mechanical Turk
NOAC: Non-vitamin-K oral anticoagulant
OAC: Oral anticoagulation
PtDA: Patient decision aid
RA: Rheumatoid arthritis
SDM: Shared decision making
TPB: Theory of planned behaviour
UBC: University of British Columbia

Acknowledgements

This thesis would not have been possible without the unconditional support of friends, family, and colleagues. I owe special thanks to Dr. Nick Bansback, for the array of support and opportunities that he has provided before and during my graduate degree, for answering my constant email questions, even on weekends, and for always being available to discuss issues and concerns. I look forward to the future of DCIDA.

To Daphne and Huiying, for your advice and feedback as I progressed through my data analysis, and for helping me to address any and all concerns that I had.

To my Mom and Dad, who supported me throughout my degree. Thanks for always being on call to celebrate and/or complain, and for the advice and knowledge you have instilled in me over the years.

Finally, to someone who went from being my girlfriend, to my fiancée, to my wife, all within the space of a few months. Gabi, this would never have been possible without your unwavering support, and I look forward to our future adventures together.

1 Introduction

Uncertainty exists across nearly all aspects of daily life. Checking the weather report, driving to work in the morning, and dealing with your springtime allergies all contain some degree of uncertain information. Will it rain, even though the weather forecast says there is a 90% chance of sunshine? Will the car break down on the way to work? Should you take that antihistamine that never seems to help your symptoms? Benjamin Franklin once stated: "The only two certainties in life are death and taxes". With uncertainty in some shape or form playing such a big role in our lives, it is imperative that the scientific community investigates the effect it can have on our decisions and the decision-making process as a whole.

To understand uncertainty, it is vital to discuss risk, and how the two are interconnected. Definitions for risk vary, but the Merriam-Webster dictionary describes it as "the possibility of loss or injury".1 In other words, it can be considered the uncertainty of encountering a poor outcome.
Of note, however, is that risk can also result in a good outcome, for example, capital gains in stock trading.2 To accommodate for the inclusion of a chance of benefit, the term "risk" is used hereafter interchangeably between benefit and harm.

Some fields, in particular psychology, have conducted extensive research to assess how people understand and interpret risk. In a classic study published in Science, Tversky and Kahneman discuss how the framing of risk can influence the actions that people take, and use the example of survival rates versus mortality rates.3 They find that choices involving gains (surviving) are often risk averse, while choices involving losses (mortality) are often risk taking. This is a classic example of a framing effect, and illustrates how decision-making is inherently influenced by cognitive biases.

On top of framing effects influencing the interpretation of risk estimates, simple probabilistic statements have proven difficult for laypeople to understand. In one study examining how breast cancer risk is perceived, 77% of participants were found to overestimate their risk of death by a factor of 10 or more.4 Another study found that 20% of highly educated individuals could not correctly identify the largest risk given values of 1%, 5%, and 10%.5 Gigerenzer's weather-based study tested how laypeople understood the probability of rain in a weather forecast.6 Results indicated that the majority of people in European cities misinterpreted the probability as being related to the area covered by rain, or the amount of time tomorrow that it would rain. The correct interpretation relates to understanding past data, and the proportion of days where conditions were similar. For example, a 50% chance of rain tomorrow, correctly interpreted, means that in the past, when conditions were similar or identical to today, it rained on 50% of those days. Therefore, without understanding the reference class (the number of days in the past), participants struggled to interpret simple probabilistic statements. This weather-based study highlights an important issue: laypeople's understanding of risk is not guided solely by their own psychometric properties (knowledge, biases, etc.), but also by disclosure failures on the presenter's side (in this case, the forecasters failed to disclose the reference class).

Even when an individual risk is properly understood, there may be attitudes and perceptions that people apply to it. For example, a 10% risk of rain may be construed as unlikely, but a 10% risk of death may be construed as extremely likely: the context in which the risk estimate is given and the consequence of said risk can play a vital role, especially when risk representations are purely textual.7 Additionally, individual preferences may vary with respect to how a risk is perceived. In finance, for example, it is common for a financial advisor to construct an "investor profile", in which the investor's attitudes towards risk are assessed. Some may see a volatile investment as an opportunity for significant capital gain, while others may see it as an obstacle to be avoided due to the potential loss of wealth.

1.1 Uncertainty in health care

Naturally, uncertainty also prevails in health care, whether it is through assessing symptoms to diagnose a patient, weighing the risks and benefits of a given therapy, or constructing a patient preference profile.
A primary role that clinicians play in their relationship with patients is to communicate these uncertainties, which involves an array of complex challenges.8–11 Patient preferences, numeracy, and the desire for information are all factors that must be considered when deciding how best to communicate these uncertainties.

Clinicians are increasingly being challenged to involve patients in decision-making about their care.12 The shared decision-making (SDM) paradigm contrasts with traditional practice, where the physician plays the paternal role and prescribes without patient participation.13 The literature identifies several key characteristics of SDM, including the presence of at least two participants (the patient and clinician), the sharing of information between parties, and a shared consensus on the preferred treatment.8

The communication of risks and uncertainties is complicated further because the evidence surrounding most treatments' benefits and harms has a certain degree of dispersion around the risk estimates, and because competition in the pharmaceutical market has led to multiple treatments becoming available to treat the same condition. The resulting comparison, where numerous benefits and harms with varying risk estimates and degrees of imprecision are being compared across multiple treatment options, becomes an indisputably difficult task. In the case of pharmaceuticals, rare or even unknown harms such as adverse effects are all potential outcomes that must be considered and perhaps communicated to patients.

The evidence on most treatments is uncertain: a review by Politi et al. found that "nearly half (47%) of all treatments for clinical prevention or treatment were of unknown effectiveness and an additional 7% involved an uncertain trade-off between benefits and harms".14 Yet a study published in 2015 found that 59% of Americans were taking prescription drugs in 2012,15 so society must have some degree of belief that their medication is effective. This false sense of certainty might result in problematic treatment decisions being made by patients and their clinicians, and an over- or under-prescription of drugs.

Similar to investor profiles, patients can legitimately have different attitudes or aversions to risk. A recent study found that 73% of Canadians were averse to health risks, based on a representative sample's responses to the Health-Risk Attitude scale.16,17 This means that two patients who both have a good understanding of an identical risk may make different decisions if one is more risk averse and the other more risk seeking.

Scientific health research is, in many cases, dependent on uncertainty, as the widths of confidence intervals and magnitudes of p-values are scrutinized to establish statistical significance and minimize the chance of making incorrect interpretations.18 Yet the phenomenon of uncertainty continues to confuse and complicate even the most basic decisions, especially in the general population.14,19,20

Conceptually, uncertainty can be difficult to understand. There are multiple sources from which uncertainty can arise, and identifying and conveying the various attributes associated with uncertainty has proven to be problematic.21 In the health context, there are many stakeholders, including clinicians, patients, family members, and policymakers, all of whom must understand uncertainty to varying extents. As a result of the complexities involved, our understanding of the optimal methods to communicate uncertainty is limited.
1.2 Types of uncertainty in health

Han's taxonomy of uncertainty provided a vital first step in conceptualizing a framework within which uncertainty in health can be understood.21 By grouping uncertainty into a set of orders of uncertainty, techniques to convey these abstract components are more easily designed. This also enables the construction of scales to measure uncertainty-related attributes and thus conduct better-quality research.

The first conceptual order of uncertainty, also known as aleatory uncertainty, arises from the inherent indeterminacy in a probabilistic event. Herein, this type of uncertainty is referred to as first-order uncertainty. Han's framework simply terms this "probability". For example, a hypothetical drug used to treat atrial fibrillation (AF) may claim that "10% of patients may experience stomach ache while taking this drug". Given a random sample of 100 patients, and assuming that the risk estimate is incredibly precise, it is not possible to determine which 10 patients will be struck by stomach ache.

Second-order uncertainty, often termed epistemic uncertainty, arises from a variety of sources, and is conceptually more intricate than its first-order counterpart. In Han's taxonomy, he refers to this type of uncertainty as "ambiguity". Hereafter, I will refer to second-order uncertainty as "ambiguity" to stay consistent with Han's taxonomy. Ambiguity is often thought of as a probability in which discrete estimates are difficult or impossible to determine. In most academic research, ambiguity is expressed in the form of confidence intervals. Using the same example as previously, but represented in its second-order format: "5% to 15% of patients may experience stomach ache while taking this drug". So given a sample of 100 patients, somewhere between 5 and 15 people (though possibly more or possibly less) will be struck by stomach ache. Table 1 illustrates the ways in which ambiguity can arise.

A critical difference between the orders of uncertainty is the fact that first-order uncertainty is essentially a population parameter; it is a value that investigators attempt to estimate. Estimates may change as evidence is refined and perfected, or the population redefined, but the true population value remains constant. Second-order uncertainty, on the other hand, can change, decreasing with further evidence.

1.3 Risk communication

While significant research has investigated layperson understanding of probability, investigators have also investigated this in health contexts, with some enlightening results. In Schwartz's study examining the role of numeracy in understanding the benefit of screening mammography, the investigators found that only 16% of participants could answer three simple probability questions correctly, including: (a) the number of times a fair coin would come up heads in 1000 flips; (b) converting 1 in 100 to a percentage; and (c) converting 1% to an absolute value out of 1000.22

Clinicians often call on data from scientific articles and other established sources when consulting with patients.23 In many cases, this data is presented in the form of an effect estimate. For example, clinicians may state that patients should see a 20% chance of improvement, but fail to acknowledge that the base case (absolute risk) is small, so a 20% change is actually very small. In relative terms, a 20% chance of improvement may sound large, but if the absolute risk is only 1%, a 20% relative improvement equates to an absolute value of 1.2% for the percentage of patients who improve. This is a change of only 0.2 percentage points in absolute risk, which emphasizes the importance of the denominator. This occurrence is a common source of misunderstanding in medicine, and many studies highlight clinician and patient sensitivity to the form of the point estimate.24,25
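To make the arithmetic behind this distinction explicit, the short sketch below converts a relative improvement applied to a baseline absolute risk into the corresponding absolute change. It is an illustrative aid only, using the hypothetical numbers from the example above rather than any data analysed in this thesis.

```python
def absolute_change(baseline_risk: float, relative_improvement: float) -> float:
    """Absolute change in risk implied by a relative improvement on a baseline risk.

    Both arguments are proportions (e.g. 0.01 = 1%, 0.20 = a 20% relative change).
    """
    new_risk = baseline_risk * (1 + relative_improvement)
    return new_risk - baseline_risk

# The example from the text: a "20% improvement" on a 1% baseline risk.
baseline = 0.01
relative = 0.20
print(absolute_change(baseline, relative))  # 0.002, i.e. 0.2 percentage points (1% -> 1.2%)
```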
In the past, patients expected their clinicians to make therapy-related decisions for them, and were not typically involved in the decision-making process. Evidence has shown that older generations tend to prefer that their physician make all of the decisions on their behalf.26 Recently, however, the medical world has seen an upsurge in SDM, a process by which patients and clinicians work together to agree on health-related decisions. SDM can be implemented using various techniques: verbal communication between the clinician and patient, websites, and information pamphlets could all be interpreted as forms of SDM, as they are educational tools developed by clinicians and designed to improve patient understanding. A newer tool that encourages SDM is the patient decision aid, a tool that enables patients to ascertain their values, understand the options associated with their given health-related concern, and choose which option they think is best for themselves. The International Patient Decision Aid Standards (IPDAS) group has defined a minimal decision aid as a tool that explains potential options and their associated outcomes.27 At their core, however, decision aids are merely informational tools. Evidence exists to support the theory that when patients are more informed regarding their treatment options, they are more likely to adhere to their prescribed therapy.12,28

A large portion of a decision aid is commonly devoted to explaining the chance that an outcome will occur (first-order uncertainty). Outcomes, however, are never certain. As mentioned previously, outcomes are often communicated as effect estimates, or in other words, the probability that a specific outcome may occur. These estimates are Bayesian in the sense that they are derived from the analysis of large numbers of people, as opposed to exact probability estimates. In essence, there is always some degree of ambiguity around an effect estimate, which changes as the Bayesian prior is updated and gets closer to, or further away from, the true population value.

As the primary goal of a decision aid is to help rationalize decisions, it makes sense to discuss a more general model of decision-making: the theory of planned behaviour (TPB). This framework is commonly used to link various psychometric properties to the intention to perform an action, which, in theory, leads to the actual performance of that action. It is a useful model to consider, as it includes several measurable attributes that theoretically indicate a change in behaviour. For example, in the TPB, intention leads to behavioural change (Figure 1). Therefore, if intention is a measurable psychometric property (through some scale), changes in intention, and therefore behaviour, should be detectable in the context of scientific research. As a result, intention is frequently used in research around decision aids to help validate their effectiveness.29,30
1.4 The evolving role for ambiguity in risk communication

The focus of this thesis is on whether patients can understand ambiguity in risk estimates, and how it influences their decision-making. When deciding whether to take a treatment or receive a therapy, patients must weigh up the benefits and harms, which is typically done using the individual point estimates (aleatory uncertainty). But what if the true effects could be quite different due to ambiguity in the point estimate? Would decision-making be influenced? Treatments may have similar point estimates for the risk of a side effect, but one may have greater ambiguity. Should ambiguity be communicated in this case? Alternatively, if treatments have clearly different risk estimates with low ambiguity, does ambiguity even play a role in the decision-making? The ethical imperative for presenting uncertainty is obvious in the context of fully informed consent, but if said uncertainty is not fully understood, or even worse, misunderstood, there are concerns about whether this consent is valid.31

The need to answer these questions is becoming more urgent with the increasing interest and funding around precision medicine, also known as personalized medicine.32 The National Cancer Institute in the United States defines personalized medicine as "a form of medicine that uses information about a person's genes, proteins, and environment to prevent, diagnose, and treat disease". Yet a recent commentary highlights a critical issue with the risk profile that precision medicine allows us to construct.33 A contradiction exists in the use of the term "precision", given that tailoring treatments to increasingly small subgroups will "demand a greater tolerance of and ability to calculate and interpret probabilities of uncertainty by clinicians and patients".33 Statistically, lower sample sizes often result in greater variance. There exists a serious trade-off between improving the accuracy of a point estimate (i.e. decreasing first-order uncertainty) and accepting lower precision with respect to these estimates (i.e. increasing ambiguity), which is manifest in wider confidence intervals.

To supplement the advent of precision medicine, clinical prediction models (CPMs) are in rapid development to provide individualized patient risk estimates based on a set of input parameters related to the patient. Worryingly, most CPMs output precise risk estimates (to many decimal points), which sometimes convey a false degree of confidence despite potentially large confidence intervals, given the contradiction described by Hunter and as emphasized in the case studies below.33

1.5 Case studies

To further highlight the potential importance of describing ambiguity in risk estimates, several case studies have been identified in which an understanding of ambiguity is important from the perspective of the patient.

1.5.1 Atrial fibrillation

Atrial fibrillation has a new generation of treatment options. The gold-standard treatment for AF is warfarin, a drug that has been on the market since 1954. Warfarin has an enormous body of research behind it, including multiple meta-analyses and systematic reviews, which provide evidence for its effectiveness.34–36 In recent years, new generations of non-vitamin-K oral anticoagulants (NOACs) have entered the market to compete with warfarin. While studies have shown these NOACs to be similar in effectiveness compared to warfarin, the confidence intervals around the reductions in stroke risk and the potential for major bleed are more ambiguous. These drugs are potentially favorable as far as theoretical expected utility is concerned, and presenting these degrees of ambiguity could potentially influence which treatment the patient ultimately chooses to have.
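To make this comparison concrete, the sketch below represents two hypothetical treatment options whose annual stroke-risk point estimates are identical but whose confidence intervals differ in width. The numbers are invented for illustration and are not the estimates reported in the trials cited above; the "x (range y to z) in 100" output format is simply one way such ambiguity could be phrased for patients.

```python
from dataclasses import dataclass

@dataclass
class RiskEstimate:
    """A risk estimate with its second-order uncertainty (95% CI), as proportions."""
    point: float
    ci_low: float
    ci_high: float

    def ci_width(self) -> float:
        return self.ci_high - self.ci_low

    def as_frequency(self, denominator: int = 100) -> str:
        """Express the estimate as a natural frequency with its range."""
        return (f"{self.point * denominator:.0f} "
                f"(range {self.ci_low * denominator:.0f} to {self.ci_high * denominator:.0f}) "
                f"in {denominator}")

# Hypothetical annual stroke risks on two treatments: the same point estimate,
# but the newer drug's estimate is more ambiguous (wider interval).
older_drug = RiskEstimate(point=0.03, ci_low=0.02, ci_high=0.04)
newer_drug = RiskEstimate(point=0.03, ci_low=0.01, ci_high=0.06)

for name, est in [("older drug", older_drug), ("newer drug", newer_drug)]:
    print(f"{name}: {est.as_frequency()} (95% CI width {est.ci_width():.2f})")
```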
On top of this, many of the estimates used when communicating risks and benefits in AF rely on CPMs. The CHADS₂ model, in particular, outputs a patient's risk of stroke given that they have been diagnosed with AF. If a patient has a CHADS₂ score of 1, their risk of stroke is 2.8% (2.0-3.8), but when their CHADS₂ score is 6, their risk of stroke is 18.2% (10.5-27.4). This difference in confidence intervals is extremely large, and if the patient were to understand this degree of uncertainty, their decision-making might be influenced. While CHADS₂ is one of many CPMs used in clinical practice, it is one of the few that actually includes confidence intervals in its outputs, which raises another issue: actually acquiring the ambiguity data to present to patients.

1.5.2 Rheumatoid arthritis

Rheumatoid arthritis (RA), an autoimmune disease in which the body mistakenly attacks the joints of the sufferer, is traditionally treated with non-biologic disease-modifying anti-rheumatic drugs (DMARDs), for example, methotrexate. In recent years, a new generation of biologic drugs has been developed and introduced to treat RA, and these have been shown through comparative research to be equally or more effective than the first-generation non-biologics. Five to ten years after their introduction, it was found that these drugs carry a small but significant increase in the risk of cancer for patients who take them for RA.37

New types of biologics and small molecules are being introduced to the market each year, and there is a possibility that these new drugs have similar potential adverse effects over the long term, such as an increased risk of cancer, or potentially different adverse events altogether. But because new drugs have a smaller body of long-term research behind them, these risks are unknown. In essence, the new drugs have more ambiguous outcomes compared to the older drugs. Depending on patient preferences and their risk profile, understanding this ambiguity could potentially alter their decision-making when choosing between RA treatment options. However, most rheumatologists will only describe the first-order uncertainty, and may only hint at the ambiguity by stating that the drug is new.

1.6 Ambiguity aversion

Just as people can have different attitudes to first-order risks, people can have different attitudes to ambiguity. Previous studies have shown that individuals tend to be ambiguity averse, a phenomenon in which people tend to prefer more certain outcomes over ambiguous, but potentially favorable, outcomes.38 Early evidence for ambiguity aversion in the decision theory world exists in the Ellsberg paradox, in which people's choices violate what traditional expected utility theory dictates. In Ellsberg's thought experiment, 90 red, blue and yellow marbles are mixed in a bag.38 Participants were told that 30 marbles were red, and that the remaining 60 were a mix of blue and yellow. They were then given the choice between two gambles: (A) $100 if they picked a red marble; or (B) $100 if they picked a blue marble. Overwhelmingly, participants chose option A, as it was a known probability, despite option B being up to twice as likely.
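A short calculation makes the paradox explicit. The sketch below is an illustration of the probabilities involved, not part of Ellsberg's original presentation; it compares the known probability of drawing red with the range of possible probabilities for blue.

```python
# Ellsberg's urn: 90 marbles, 30 known to be red, the other 60 split
# in some unknown way between blue and yellow.
total = 90
red = 30
blue_plus_yellow = total - red

p_red = red / total                      # known: 1/3
p_blue_min = 0 / total                   # blue could be absent entirely
p_blue_max = blue_plus_yellow / total    # or make up all 60 marbles: 2/3
p_blue_expected = (p_blue_min + p_blue_max) / 2  # 1/3 under a symmetric prior

print(f"P(red) = {p_red:.3f}")
print(f"P(blue) lies between {p_blue_min:.3f} and {p_blue_max:.3f} "
      f"(expected {p_blue_expected:.3f} if all splits are equally likely)")
# Betting on red and betting on blue have the same expected chance of winning,
# yet most people prefer the unambiguous red gamble: ambiguity aversion.
```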
Despite evidence indicating that people are averse to ambiguity, little research has investigated this phenomenon in health. While first-order uncertainty has been widely studied, and guidelines have been developed with respect to best practices, such as keeping denominators consistent, using absolute risks, and visually presenting risk information with icon arrays, few researchers have shifted their attention to the presentation of ambiguity.27 This would appear to be an area for urgent research, considering that ambiguity plays such a crucial role in the way that scientific research is conducted.

1.7 Thesis goals

1.7.1 Research questions

The primary research questions for this thesis are:

1. Is there heterogeneity in the way that ambiguity is currently presented to patients?
2. Do the current techniques used to present ambiguity to patients impact key aspects of decision-making?
3. Which techniques for communicating ambiguity to patients are most commonly used and should be compared for their impact on decision-making?

1.7.2 Hypotheses

1. There will be heterogeneity in the approaches currently used to communicate ambiguity to patients, in the absence of guidelines on the presentation of this information.
2. Different approaches to communicating ambiguity will influence decision-making in different ways because of the heterogeneity of individual preferences and interpretations of information. For example, since some individuals will be ambiguity averse, and others ambiguity seeking, I expect that the direction of changes in decision-making outcomes will differ between individuals. Consequently, the evaluation of change is expected to need to consider the absolute difference in outcomes when decisions are made using non-ambiguous and ambiguous estimates of risks/benefits.
3. Some techniques will be more influential in affecting decision-making than others, and will have more positive characteristics than others (for example, in influencing knowledge of the range of uncertainty). These techniques can be identified as candidates for further research.

1.7.3 Objectives

As a result of these research questions and hypotheses, my primary objectives are:

1. Through a review of the literature, categorize the ways in which ambiguity in risk estimates has been communicated in the health context in the past.
2. Design and implement a survey that will evaluate approaches on key aspects of decision-making, notably behavioural intention, knowledge, trust, risk perception, worry, decisional uncertainty, and preference for technique.
3. Synthesize results comparing techniques to prioritize the best candidate techniques for future research in the communication of ambiguity.

1.8 Summary

In the next chapter of this thesis I address my first objective, conducting a systematic review of the literature. Chapter 3 outlines the design process that was followed to create the survey, and Chapter 4 describes the results from the survey's implementation. Chapter 5 discusses the implications of the results along with the strengths, limitations, and future directions that this work could lead to.
Table 1: Sources of ambiguity as defined by Han's taxonomy

Source of ambiguity: Lack of reliable information
Example: Badly designed studies (low sample size; flawed methodology; confounders in design)

Source of ambiguity: Lack of credible information
Example: Research from questionable sources (funding bias; conflicts of interest)

Source of ambiguity: Lack of adequate information
Example: Not enough research (new drugs; extremely rare side effects)

Figure 1: Theory of planned behaviour

2 Systematic review

While research concerning ambiguity in risk estimates is sparse, various studies have tested how the presentation of probability in risks and benefits influences patient understanding of their risk. This research has been pooled to form a set of guidelines around optimal methods of describing first-order uncertainty.27 A significant factor regarding presentation choice is based on considerations of patient numeracy and visual literacy, and there are some overall recommendations that exist as defined by the IPDAS.27

When considering a risk estimate in the patient context, whether it be a risk of a side effect or a probability of benefit, it is vital that statistical probabilities be conveyed in a way that elicits a transfer of knowledge. Past studies have repeatedly shown that patients struggle to understand health statistics. When a physician tells a new patient that their 1-year risk of stroke is reduced by 14% if they take warfarin, it is unknown whether that patient will truly understand what that means, considering that 20% of adults are unable to tell which risk is higher: 1%, 5% or 10%.39 Consequently, the patient (and physician) may be under the belief that they are fully informed, when in fact they have misunderstood some key aspect of the given information. This has led researchers to test whether presenting probabilities as frequencies (e.g. 14 out of 100 people will be saved from having a stroke every year), instead of percentages, leads to better understanding. The general trend in the evidence has shown that patients exhibit a better understanding of medical statistics when they are presented as frequencies, which is consequently the recommended format.40–42

When presenting risk as a natural frequency, it is necessary to consider the effect of changing the denominator on patient understanding. One study found that only 353 of 633 participants (~56%) were able to correctly answer the question of which is greater, a risk of 1 in 112 or 1 in 384.43 In light of this phenomenon, a primary recommendation from IPDAS is that the reference class (the denominator of the risk estimate) stays consistent throughout multiple presentations. This allows for direct comparisons between estimates without the added mental strain of normalizing scales. Additionally, there is consensus that the denominator should be fixed at either 100 or 1000, depending on the magnitude of the risk, as these have been shown to be easily compatible with visual representations.44

Visual aids that accompany risk estimates play a vital role in the consent process. Taking into account widespread innumeracy in the population (the US Department of Education's National Center for Education Statistics estimates that 22% of Americans have "below basic" quantitative skills), non-numeric representations of risk may be necessary to ensure a thorough understanding across the population.45 Pictographs, also known as icon arrays, are the gold standard in health risk communication, as they are easily mapped to from the natural frequencies that IPDAS recommends. They have been shown to reduce a variety of biases and aid in patient understanding of more complex topics, such as incremental risks.46
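As a rough illustration of how natural frequencies map onto an icon array, the sketch below renders a text-based 10 x 10 array for a risk expressed as "x in 100", and re-expresses a "1 in N" risk over the recommended denominator of 100. It is a simplified stand-in for the graphical pictographs discussed above, not a tool used in this thesis.

```python
def text_icon_array(events: int, denominator: int = 100, per_row: int = 10) -> str:
    """Render a crude text icon array: 'X' marks people affected, 'o' those not affected."""
    icons = ["X"] * events + ["o"] * (denominator - events)
    rows = [" ".join(icons[i:i + per_row]) for i in range(0, denominator, per_row)]
    return "\n".join(rows)

# "14 out of 100 people will be saved from having a stroke every year"
print(text_icon_array(14))

# A "1 in 112" risk re-expressed over the recommended denominator of 100:
print(round(100 / 112, 1), "in 100")  # ~0.9 in 100
```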
Though the guidelines are clear on how to present statistical probability to patients, these risk estimates invariably have confidence intervals around them, numerically denoting the degree of ambiguity. When these confidence intervals are small, there may be no reason to include this ambiguity when consulting with the patient. Yet in many cases the degree of ambiguity may be large or unknown and warrant some mention. Consider a new drug, rivaroxaban, used to control AF. Rivaroxaban entered the drug market in 2011, and competes with an alternative drug, warfarin. Warfarin has been approved for medical use since 1954, and has an enormous body of evidence supporting its benefits.47 While rivaroxaban also has supporting evidence, the degree to which it has been studied is much smaller, simply as a result of the time it has been available on the market.48 While the point estimates describing the risks and benefits may be similar between the drugs, the confidence intervals may vary greatly. If the patient understood this discrepancy, their decision-making might be altered depending on their preference profile.

The literature is conflicted on the benefits of describing ambiguity to patients. Numerous studies provide evidence for a phenomenon known as "ambiguity aversion", in which patients avoid ambiguous but potentially favourable options in lieu of less ambiguous alternatives.38 While there is an extensive body of research devoted to this phenomenon in the psychology and finance domains, little has been done to understand its implications in health.

In one study, investigators tested the effect of presenting ambiguity in colorectal cancer lifetime risk estimates.49 Participants were randomized and presented with either a textual or visual representation of their lifetime risk, in which ambiguity was either present or not present. Results were consistent with the theory of ambiguity aversion, in that patients who received ambiguous representations reported greater cancer-related worry and were less optimistic about their outcomes.

Another study, conducted in 1991, examined the willingness of participants to move to a new location based on environmental risk factors that could cause nerve disease or cancer.50 Participants were iteratively asked which of two locations they preferred, where the risks were ambiguous in one location and precise in the other. The precise risk estimates were adjusted on each iteration of the questionnaire, until the participant indicated that the risks at each location were equivalent. The ambiguous estimates were kept consistent throughout iterations for a single participant, but participants were randomized to different degrees of ambiguity. The investigators found that in cases of more ambiguity, more adjustments to the precise estimates were required for participants to indicate equivalent risk, which supports the theory of ambiguity aversion.
The investigators found that in cases of more ambiguity, more adjustments to the precise estimates were required for participants to indicate equivalent risk, which supports the theory of ambiguity aversion.In light of this apparent aversion to ambiguous estimates, and in the context of a traditional paternalistic approach to medicine, it is unlikely that ambiguity would ever be presented to patients. But with the advent of shared decision-making in health, there is a moral and ethical imperative for patients to be fully informed. The overarching goal of this thesis is to determine how ambiguity is currently being presented to patients, and to assess the effect that this specific presentation format has on decision-making. The first step of assessing the quality of representations of ambiguity involves understanding methods that have been used to describe ambiguity in past research.2.1 ObjectivesA recent review found nearly half of patient decision aids (PtDAs) described ambiguity in risks, mostly using qualitative statements such as “about 15%” or “around 15%”.51 Additional techniques encountered were numeric techniques, such as “5-10 in 100” and “5-10% of all people”. However, the 26objective of this review by Bansback et al. was to understand broadly how risks were being described in PtDAs, and grouped different approaches into wide-ranging clusters based on their ‘qualitative’ or ‘quantitative’ nature.51 It did not tease out all the different ways ambiguity was being presented, nor did it provide any evidence for how well they worked.The current review therefore had 2 primary objectives:1. Understand the ways in which ambiguity in risk is being communicated in a health-based context.2. Understand how well each identified communication technique works, and the impact that it has on decision-making.2.2 MethodsStudies were sought that examined presentation of ambiguity in risk estimates from the health domain. The search terms that were used had previously been developed and tested for the exploration of risk communication in health.52 However, reviewing the results of this search presented problems. The number of papers identified was large (>20,000) reflecting the rise in interest in risk communication over the past 15 years. A review of a random sample of these papers suggested that very few included ambiguity. Further, many of the papers focused on risk communication tools such as PtDAs, but the actual 27decision aid was not included in the paper or appendix. As a result, in these publications, it was not possible to determine whether ambiguity was being described in the PtDA within such publications.It was therefore decided that the addition of references to ambiguity and more general uncertainty to the search terms (Table 2) might decrease the number of matched results. These search terms were tested to see whether they captured known papers reviewed in 2007, and found to perform well.14 Following this test of the modified search strategy, the Medline database was searched using these terms. The searching of non-health specific databases was explored, but it was found that the terms did not return many useful results as they were geared towards a health context, and it was thus recognized that there are nuances to health risk in the search terms that were not as relevant in the non-health domain. It was therefore decided to supplement the search through reference lists of included papers. 
Here, studies were not restricted to health, as in this context a health paper citing a non-health paper was considered more likely to be relevant.Eligibility was assessed in two distinct phases. The first phase was screening of titles and abstracts, and was based on a probable inclusion of representation of risk in a patient context. This was double-screened by two reviewers (James Hicklin and Nick Bansback), and any differences in inclusion 28decisions were resolved by discussion. Any study indicating that medical risks were presented to participants was included in the full-text screen. The full-text screen included articles based on whether risk representations included some depiction of ambiguity, whether it be textual or visual. Again, this was double-screened. To assess inter-rater agreement at the full-text review phase, Cohen’s kappa was used.53 In recognition of the fact that searching the peer-reviewed literature would miss many methods that have been used in risk communication tools developed for patients, as described previously as a publication omission, a secondary analysis of the database of patient decision aids51 was performed. These decision aids were extracted from various repositories, including: (a) the Ottawa Hospital Research Institute patient decision aid registry (the most comprehensive collection of PtDAS which includes tools developed by over 30 different organizations); (b) Choosing Wisely (USA and Canada); (c) Option-Grid Collaborative; and, (d) the National Health Service (UK). For each PtDA, all statements concerning benefits and harms of specific options were extracted and classified by their presentation of aleatory and epistemic uncertainty. This previous review by Bansback et al. only categorized broad groupings of decision aids based on whether statements were qualitative or quantitative.51 29The PtDAs that were categorized as having a description of ambiguity were thus considered further and the exact approach that was used was identified.2.3 ResultsThe PubMed search identified 2272 papers. These were supplemented by 7 additional studies from the grey literature. Of those, 110 (5%) passed the title and abstract screen while 2169 (95%) were rejected. In the full-text screen, articles were excluded for a variety of reasons, including no presentation of risk estimates (52%), no presentation of uncertainty (16%), no presentation of second-order uncertainty (33%), language not in English (<1%), and inaccessible articles (<1%). 19 articles passed the full-text screen and were included in the final results (Cohen’s kappa=0.87). A total of 460 PtDAs from eight main developers were in the database contributing 8956 uncertainty statements, averaging 10 statements per PtDA (range 0-45). Healthwise contributed 37% of the PtDAs, while all other contributed 13% or fewer. The majority of PtDAs were developed in the USA (76%), with the remainder from Australia (3%), Canada (12%) the UK (8%), and Malaysia (1%). A few PtDAs were developed prior to 2011 (5%), but thereafter PtDA development proliferated, with 40% developed between 2011 and 2014, and 55% developed in the final two years of the review (2015-2016).5130The PtDAs reviewed were evenly split between supporting decision-making around treatments (including surgeries, but not medications) (28%), medications (22%), non-imaging laboratory tests (17%), procedures (18%), and imaging tests (2%). 
A number of PtDAs (13%) aimed to support decisions concerning combinations of these options, while others supported decisions that did not fall into these major categories – e.g., long-term care for relatives.51

Two hundred and forty-three (53%) of the PtDAs presented at least one statement that attempted to communicate ambiguity, while the remaining 47% did not communicate ambiguity and were therefore excluded. The majority of PtDAs that did communicate ambiguity used qualitative approaches. Thirty-two percent of all PtDAs employed qualifying textual prefixes expressing that the estimate was imprecise (e.g. “about” or “approximately”). Twenty-seven percent of all PtDAs included statements conveying low evidentiary quality (e.g. “These are low quality studies”), lack of consistency of evidence (e.g. “Evidence surrounding this is controversial”) or conflicting expert opinion (e.g. “Experts disagree on these chances”). A few PtDAs attempted to express uncertainty by using ranges around numerical estimates, either qualitatively using implied ranges (8%) (e.g. “up to 50 in 100” or “as many as 20%”) or quantitatively as a numerical range (21%) (e.g. “between 20 and 40 in 100 people have a side effect”).

In combination, the search yielded 262 results, covering both decision aids and published studies (Figure 2). The variety of ways in which ambiguity was presented, and some of the outcomes associated with those presentation techniques, are outlined below.

2.3.1 Qualitative prefix

The most common techniques encountered were simple qualitative representations of ambiguity. In the case of a risk estimate or benefit, expressions such as ‘about’, ‘around’, and ‘approximately’ were frequently used as a prefix to indicate a degree of ambiguity.

In Gurmankin’s study investigating the effect of numerical statements on trust and comfort with hypothetical physician risk communication, the authors compared how participant worry and risk perception were affected by including ambiguous numeric representations of risk alongside a physician’s statement, compared to no numeric representation.54 An example scenario included the following presentation of risk: “There is about a 1 in 4 chance that this is prostate cancer”. While the effect of ambiguity was not discussed in the paper, results indicated that subjects were more confident in the physician’s risk communications when the numeric estimate was provided, though this was qualified by participant numeracy. There was also no discussion of presenting numeric estimates without the qualitative prefix, so its actual effect on worry and risk perception is unclear.

Grant et al. evaluated the use of a paper-based decision aid and audiotape in patients considering blood donation before open-heart surgery.55 A major portion of the decision aid was dedicated to presenting the probability of receiving a blood transfusion after surgery. In doing so, the investigators chose to imply ambiguity in the risk estimate using the ‘about’ prefix (Figure 3). Results indicated that patients exposed to the decision aid had greater knowledge, but no statistically significant difference in decisional conflict was found before and after the decision aid was administered. An increase in participant risk perception with respect to their chance of contracting AIDS, bringing participant perceptions more in line with the evidence, was also observed after completion of the decision aid, though it is unclear how these risks were presented in the actual tool.
Notably, authors discovered a significant decrease in participant uncertainty, which is considered a sub-construct of decisional conflict, after having gone through the decision aid. This is occasionally encountered in decision-aids, and may be attributed to a significant amount of newly acquired knowledge. However, investigators were not testing any effects with respect to the ambiguous presentation technique, therefore no findings can be interpreted in that context.33In Politi’s study looking at how to support shared decision making when clinical evidence is low, the authors lay out some conversational strategies to support low-evidence decisions.56 While there is little discussion regarding this prefixed format, an example scenario states that “about 2 out of 100 patients who take this pill will develop low blood pressure”. It is unclear whether the “about” prefix is recommended or not, but it tends to be used widely whether its use is intentional or not. In Cohn’s study looking at how adolescents interpret probability statements, they investigated how youth understood vague textual statements made by clinicians.57 They noted that many medical factsheets include statements like “out of 1000 black females who are now 15 years old, about 39 persons will probably die in the next 25 years”. They discovered that words such as “about” and “probably” were misinterpreted by as many as 33% of adolescents, and that they attached a much lower likelihood (magnitude of risk perception) to those statements than evidence indicates. The investigators argue that numerical estimates, alongside a clear visual aid, provide an optimal form of risk/benefit communication, and warn that verbal representations of uncertainty should be used with caution.Buetow’s discussion of risk communication techniques in the physician-patient relationship includes recommendations on how to present numeric 34estimates.58 Firstly, they suggest that numeric estimates be acquired from meta-analyses and systematic reviews of the subject. They suggest qualifying numeric estimates with prefixed statements, as per their example: “treatment increases the risk of unwanted symptoms from about one in five to almost one in two”. This technique uses absolute risks, as per the first-order recommendations, but fails to normalize the denominator, which may impede patient understanding. This study was published before such recommendations had emerged, though, and is thus outside the scope of criticism in that regard. Of the decision aids, there were a significant portion that included prefixed qualitative risk statements in those that presented ambiguity. In the UK National Health Service’s decision aid aimed at stroke prevention for atrial fibrillation and atrial flutter patients, they include the following risk statement: “someone with a CHADS2 score of 2 has about a 40 in 1000 chance of having a stroke over the course of a year”. Research is lacking on how patients perceive such qualitative statements when making treatment-related decisions.In the review of decision aids, the ‘around’ prefix was found to be the next-most common after ‘about’. A prostate cancer screening decision aid from option grid group of PtDAs discussed that “around 90 in every 100 men (90%) will not have any sign of cancer”. 
The inclusion of ‘around’ indicates that this point estimate is ambiguous, though the magnitude is unclear.

2.3.2 Quality of evidence

There are a number of factors that can influence the quality of evidence that informs an effect estimate of a risk or benefit. The quality of the study (e.g. sample size, randomized controlled trial or qualitative, existing body of work) is the primary factor that should be considered. While various rating systems exist to rate the quality of evidence behind a point estimate, it was chosen to group these into a single technique. The most widely used rating system is that of the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) (Table 3), though others exist.59 Important to note with respect to GRADE is that ratings are applied to entire bodies of evidence, as opposed to individual studies, and that the quality of evidence reflects confidence that the effect estimates are correct.59 A significant portion of this confidence is based on the size of the confidence interval, or in other words, the ambiguity around the point estimates. Because research is ongoing, these rating systems have changed over time, and variations of these systems exist across the literature. As a result, they are combined into a single technique hereafter.

In Bansback’s study comparing hypothetical treatment options, statements based on GRADE guidelines were compared against numerical statements and visual representations.60 Results indicated that 88% of participants understood this qualitative description, which supports existing evidence that qualitative methods are superior to quantitative methods.61 Despite this improved understanding, decisional conflict was significantly increased when imprecision was presented, but this is not necessarily a negative given a complex decision regarding health.

Developers of the Making GRADE the Irresistible Choice (MAGIC) project developed an online platform to produce electronic decision aids intended for clinical use.62 The tool allows developers to input the clinical evidence that their estimates are based on, and to appraise it using the GRADE criteria. The decision aids themselves, which are intended for use by patients, present the data as seen in Figure 4. In these decision aids, the quality of evidence is presented at the categorical level, with no expandable definition. While testing to date has been mostly qualitative, results indicate that the approach is appreciated by both patients and clinicians, and is consistent with the findings in Bansback’s study.60

In Politi’s review of uncertainty communication, it is noted that many of the literature’s recommendations for communicating uncertainty are unsupported by evidence, especially those that consider ambiguity.14 The investigators discuss several rating systems that are used in the medical context, and specifically note the British Medical Journal Publishing Group’s Clinical Evidence reports. These reports categorize evidence as indicated in Figure 5. Unfortunately, and as the authors note, data on the outcomes of using this approach are lacking, and few decision support tools actually include a description of the quality of the evidence.28 It is also worth noting that these rating systems convey ambiguity only in the presence of a specific risk estimate. If such statements are the sole representation of risk in a given context, they indicate only first-order uncertainty.
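GRADE ratings are structured judgments applied to whole bodies of evidence rather than the output of a formula, but a brief sketch can illustrate how the width of a confidence interval might be translated into the kind of qualitative confidence statement used by this technique. The cut-offs and wording below are hypothetical assumptions for illustration only and do not reproduce any published rating system.

```python
# Hypothetical illustration only: map the relative width of a 95% CI
# around a risk estimate (per 100) to a qualitative confidence statement.
# GRADE itself is a structured judgment, not a formula; the cut-offs
# below are invented for this sketch.

def confidence_statement(point: float, lower: float, upper: float) -> str:
    relative_width = (upper - lower) / point if point else float("inf")
    if relative_width < 0.25:
        label = "we are very confident that the true effect lies close to this estimate"
    elif relative_width < 0.75:
        label = "we are moderately confident in this estimate"
    else:
        label = "our confidence in this estimate is limited"
    return (f"{round(point)} in 100 chance of the event, but {label} "
            f"(the true chance may be as low as {round(lower)} "
            f"or as high as {round(upper)} in 100).")

print(confidence_statement(10, 9, 11))   # narrow interval -> high-confidence wording
print(confidence_statement(10, 5, 15))   # wide interval -> limited-confidence wording
```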
In Harrison’s discrete choice experiment investigating patient preferences in rheumatoid arthritis treatment, findings were relevant in that they found participant confidence (i.e. trust in the evidence about treatments) to be an important factor in preference ranking.63 The investigators tested different levels of GRADE imprecision statements, and discovered that participants placed great value on precise information, and avoided information that was ambiguous, providing evidence for ambiguity aversion.Harrison et al. investigated how preferences for “new” treatments are influenced using a discrete choice experiment, and modified the level of “newness” based on how long a drug had been available.64 Examples of the extremes were: “Old treatment: approved 10 years ago” and “New treatment: approved within the last 12 months”. When these statements were combined with GRADE statements indicating confidence in the effect estimates, the investigators found that individual preferences for new treatments were diminished, compared to no ambiguous information being present. The authors 38go on to suggest that clinicians be mindful of describing treatments as “new” when consulting with patients, or to qualify the implications around “new” treatments, as they have shown that there is a cognitive bias influencing how individuals make decisions with respect to new treatments.In Kirkegaard’s qualitative study investigating cholesterol-reducing decision-making in general practice, general practitioners were in disagreement as to whether recommending cholesterol-reducing medication to patients with high cholesterol, but without manifest morbidity, was recommended.65 Some relied on the “newest” evidence, while others relied on older evidence and felt too little is known regarding the risks and benefits in otherwise healthy patients. The authors conclude that interventions using shared decision making should acknowledge the presence of “epistemological uncertainty” (ambiguity) in any presented risk estimates, by expressing its inevitability. In the analysis of the included decision aids, a small portion included a qualitative statement indicating that experts disagree and that conflicting results may exist, though this technique was rare. An oophorectomy decision aid from the Ottawa Health Research Institute repository included a statement that “the evidence of HRT’s effects on cardiovascular disease (heart disease and stroke) is more controversial”, indicating a high degree of ambiguity around those risk estimates. A large portion of decision aids, however, included more general 39statements related to the quality of evidence. These were often based off of GRADE, though other quality grading systems were sometimes used.Occasionally, communication techniques included estimates taken directly from studies, in which the sample size of the study is directly communicated to the patient. In a pregnancy decision aid from the Ottawa Health Research Institute repository intended to help women choose whether to have chorionic villus sampling (CVS), risk estimates were presented in the following manner: “in one study of women who had CVS performed by a highly trained doctor, about 1 out of 400 women had a miscarriage after the test, while 399 did not”. The 400-person denominator implies a certain degree of imprecision, where higher numbers might indicate more precision and higher powered (i.e. higher quality) studies compared to lower numbers. 
Problematically, this technique breaks guidelines indicating that denominators should stay consistent, and cannot be differentiated from first-order uncertainty in the case of denominators being equivalent.2.3.3 RangeIn those that included explicit numeric representations of ambiguity, a simple range covering the confidence interval of the estimate was by far most common. Some studies presented explicit numeric ranges, with no visual 40accompaniment, while others included a visual representation alongside the textual range. Some studies tried to enhance the range to better represent the true probability of a risk or estimate, using varying degrees of opacity.Roberts et al. conducted a qualitative evaluation of patient-centered communication strategies in oncotype diagnosis testing to determine gaps in the knowledge exchange, by conducting telephone interviews with North Carolinian oncologists.66 They found that only 1/3 of providers identified and discussed uncertainty around the potential benefits of chemotherapy. Within that subgroup, some providers explicitly indicated that they show their patients visual displays of test results and risk of recurrence. These visuals include bar graphs that include confidence intervals, which the providers then talk through with the patient by verbally explaining that the extremes of the confidence interval are real possibilities. One physician is quoted as saying “Look, your cure rate might go up 5%. It might not go up at all. It could even be harmful”. This is a key point when discussing ambiguity, as there can sometimes be a real possibility of harmful effects that aren’t explicitly covered when only the point estimate is presented.Schapira et al. reported mixed findings in their qualitative study on risk communication formats used in health care.41 A portion of their study involved investigating how patients react to graphics that convey uncertainty. They 41presented patients with a graphic that presented a risk estimate presented as a discrete point estimate, alongside a graphic that presented the same risk as a range (Figure 6). Participants with lower levels of education reported the range as being “vague” and “wishy-washy”, along with decreased trust in the estimate. More educated participants were more accepting of the uncertainty and its presence in any scientific data. The higher-educated group additionally indicated that they felt the confidence interval should be presented to patients so that they are fully informed when making treatment-related decisions. Johnson et al. conducted a series of studies in which degrees of uncertainty around an environmental risk were varied.19 When presented with a range, participants were found to focus on the upper end of the range and indicated that the risk was much more uncertain, compared to point estimate presentations. They also found a significant increase in participant worry when presented with a range. Despite this, when participants were presented with a confidence interval, they believed that the information provider was being more honest. Additionally, the investigators tested the effect of a graphical representation, which indicated the upper and lower bounds of the risk estimate (Figure 7). They found that the use of a graphic significantly increased participant perception of risk magnitude, and decreased perceived trustworthiness of the estimate, compared to its textual counterpart. 
Despite 42these findings, participants liked the graphics and felt that they made interpretation easier.Longman and colleagues corroborated Johnson et al.’s findings by conducting a randomized control trial in which uncertainty around the benefit of a hypothetical acne medication was varied to assess risk perception, understanding, and perceived credibility of the risk information source.67 They randomized participants to receive an exact point estimate, a small range, and a large range depicting the benefit of the hypothetical drug. Results indicated that presenting uncertainty can lead to poorer understanding of the risk (assessed using three validated items from previous studies68,69), an increased perception of risk when the confidence interval grows, and a lower perceived credibility in the information source.Politi notes that the lay publication, Consumer Reports Best Buy Drugs, reports ranges that imply a confidence interval.14 In one example, they quote that in patients taking a given drug, “between 17% and 25% are pain-free”. It is vital to recognize that in some cases, point estimates are entirely omitted, and only the range is given. Evidence is lacking as to how perceptions of ranges differ between those that include point estimates of the mean/median values, and those that don’t. 43In O’Doherty’s discussion regarding the pitfalls of communicating risk, the authors briefly mention that ranges are occasionally presented to patients during genetic consultations.70 While this paper is primarily a discussion as opposed to an actual study, the authors provide some suggestions on how risk can be more generally presented to patients. Primarily, they suggest that health care practitioners discuss how population frequencies are important in estimating personal risk, but are not the whole picture, and recognize that population frequencies may be logically familiar to professionals, but puzzling when presented to patients. In more general terms, they encourage clinicians to present the uncertainties associated with risk estimates, but do so in a way that is understandable and applicable to a general patient audience.In Han’s study, the investigators tested how the presentation of ambiguity, using textual and visual ranges (Figure 8), was perceived and understood by laypeople.49 Their results indicated that communicating ambiguity (compared to no ambiguity) leads to ambiguity aversion, though they noted that these effects appear to be affective as opposed to cognitive. Equally important, the authors discovered that optimism and the use of visual communication methods could reduce ambiguity aversion. The authors additionally note that visual representation allows people to focus on specific aspects of risk information, and in this case, the lower end of the risk range.44Quantitative ranges accounted for a significant portion of the included decision aids. In a decision aid regarding obesity retrieved from the UK National Health Service repository, a risk statement noted that “bariatric surgery cures type-2 diabetes in two-thirds to three-quarters of very overweight people with type-2 diabetes”. It is of note that this example uses fractions as opposed to numeric percentages, which conflicts with existing recommendations to use whole numbers27.It is imperative to note that not all range statements are explicit. Commonly, prefixed textual statements such as “as many as”, “up to”, and/or “less than” were used to denote implicit ranges. 
A decision aid for abnormal pap smears from Ottawa Health Research Institute indicates that “regular pap smears every 2 years can prevent the most common type of cervical cancer in up to 90% of cases”. Taken literally, the implication here is that 0-90% of the most common cervical cancer types can be prevented with regular pap smears, which is not a useful interpretation. It is unclear how these implicit ranges might influence risk perception, worry, and other decision-related outcomes in patients. When visualized, simple ranges are limited in that they provide no representation of the underlying probability distribution that makes up the risk estimate. As a result, attempts have been made to use “gradient ranges”, which adjust the opacity of the range based on some underlying numeric data. There 45are several options that can be considered when choosing how to determine range opacity.In Han’s second experiment in the afore-mentioned study, they investigated how different representations of ambiguity might influence risk perception and cancer-related worry by comparing the previously stated simple textual ranges and solid ranges (Figure 8) to enhanced gradient ranges (Figure 9).49 These gradient ranges were constructed based on the kernel density function of the underlying data, outside the 95% confidence interval, while leaving everything inside the confidence interval fully opaque. Conceptually, this is a viable option, as the true point estimate could realistically be anywhere inside the 95% confidence interval. By adjusting the opacity of the edges outside the 95% confidence interval, there is a visual interpretation that the risk estimate might be outside the interval, even though the probability is lower than within the interval. The investigators found no significant change in risk perception or cancer-related worry. This led them to hypothesize that textual representations of ambiguity that are enhanced to include clear representations of the lower and upper ends of the range may be equally effective as visual representations, as far as communicating specific range intervals goes.Bansback et al. examined how imprecision around probabilities influences the way people value treatments, and tested how people understood imprecision 46when it was described quantitatively compared to qualitatively.60 Their results showed that 68% of people understood ambiguity when it was illustrated as a quantitative range, using a shaded confidence interval (Figure 10). In Bansback’s study, the shading of the entire CI is based off of the underlying probability distribution of the data. The entire CI is shaded with varying opacities, as opposed to only the edges outside the CI, as per Han’s study in 201149. Additionally, the investigators found that presenting imprecision in any form increased decisional conflict by a small but significant amount.2.3.4 Other techniquesExperts don’t typically recommend the use of complex visualizations to emphasize various aspects of second-order uncertainty. Some studies, however, have looked at how alternate encodings for mean and variance might improve individual inference in the presence of uncertainty.  
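Before turning to those alternate encodings, the gradient-range idea discussed above can be made concrete with a short sketch that maps an assumed probability distribution around a risk estimate to opacity values along the risk scale, fully opaque at the point estimate and fading toward the tails. The normality assumption and scaling choices are assumptions made for illustration and do not reproduce the specific encodings used by Han or Bansback.

```python
# Illustrative sketch: derive opacity values for a "gradient range" bar
# from an assumed normal distribution around a risk estimate (per 100).

import math

def gradient_opacities(point, lower95, upper95, steps=21):
    """Return (risk, opacity) pairs: opaque at the point estimate, fading in the tails."""
    sd = (upper95 - lower95) / (2 * 1.96)            # standard error implied by the 95% CI
    lo, hi = max(0.0, lower95 - sd), upper95 + sd    # extend slightly beyond the interval
    xs = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    # Normal density rescaled so that the point estimate has opacity 1.0
    return [(round(x, 1), round(math.exp(-((x - point) ** 2) / (2 * sd ** 2)), 2))
            for x in xs]

# Example: a risk of 10 in 100 with a 95% CI of 5 to 15 in 100
for risk, opacity in gradient_opacities(10, 5, 15):
    print(f"{risk:4.1f} in 100 -> opacity {opacity:.2f}")
```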
In Correll’s crowd-sourced experiments looking at how alternate encodings of uncertain data influence individual decision making, they tested layperson understanding of the uncommon violin plot (Figure 11), in which the shape of the “violin” is based on the underlying probability density function.71 Interestingly, they found that the use of violin plots mitigated “within-the-bar” bias, in which points contained within a standard error bar are seen as likelier than those outside the error bar. More generally, they found that participants were able to make complex inferential judgments using violin plots, while minimizing the issues of binary interpretation (i.e. the value is either within the margin of error, or not) and “within-the-bar” bias, which traditional error bars tend to suffer from. The authors suggest using visual encodings that are symmetric and visually continuous when attempting to convey specific ambiguous data, such as violin plots or gradient bars.

2.4 Discussion

In this chapter, the process followed to conduct a systematic review of the literature was discussed, and the results from the studies and decision aids that were ultimately included were outlined. The initial search returned 2279 results, along with 460 decision support tools. After abstract and full-text screening, 19 studies were included alongside 243 decision support tools. A variety of techniques for presenting ambiguity were identified from the included items, which were finally grouped into 9 distinct techniques (Table 4).

The primary findings from this review were that:

1. There is little evidence testing the impact (in terms of trust, worry, intention, knowledge, and other decision-making related outcomes) of ambiguity description, which is in stark contrast to the first-order literature. Additionally, of the evidence that does exist, there are some conflicting results.41,49,57,19,72,73

2. Probably as a result of (1), there are many different ways that ambiguity is being presented, and no consensus or recommendations on best practices. The identified methods were grouped into 9 distinct techniques.

In retrospect, it may have been beneficial to widen the scope of this review, given its reliance on the decision aid repository.51 Decision support tools exist in the form of discrete choice experiments, best/worst experiments, and other preference elicitation instruments that may not have been captured by this review. However, given the committee’s vast experience in the field of shared decision-making, it was expected that any missed technique groups would be brought up over the course of this research. Additionally, the use of reference lists in the included studies helped to discover potential techniques that were not found in the main Medline search (e.g. Correll’s violin plot).71

Given more time, there is a possibility that additional studies could have been included in this review. As mentioned earlier in the chapter, many of the discarded studies were based on decision aids, but failed to include those decision aids as appendices or within the papers themselves. If the original authors had been contacted, it is likely that I could have acquired the relevant decision support tools and included them as part of the review.
Though, given that 262 decision aids and papers were condensed into nine distinct techniques, it is unlikely that additional tools and papers would yield new techniques.This review of studies brings up important considerations in describing ambiguity beyond people’s knowledge. Intention, trust, worry, decisional uncertainty, and risk perception were all encountered as outcomes in studies testing the influence of ambiguity, and the direction of effect for these outcomes was not always clear. A more thorough introduction to these outcomes is discussed in Chapter 3, where a survey to test their influence based on ambiguity presentation technique is designed.50Table 2: Terms used to search MedlineSearch strategy for MedlineRisk (section 1) Communication (section 2)Health behaviour CommunicationsRisk CounselingRisk taking Genetic counselingRisk factors Health educationHealth promotionPatient educationUncertainty (section 3) Context (section 4)Uncertainty PatientAmbiguityImprecisionCredibilityReliabilityTable 3: Balshem, 201159- outline of old and new GRADE scoring systemsQuality levelCurrent GRADE definition Previous GRADE definitionHigh We are very confident that the true effect lies close to that of the estimate of the effectFurther research is very unlikely to change our confidence in the estimate of effectModerate We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially differentFurther research is likely to have an important impact on our confidence in the estimate of effect and may change the estimateLow Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effectFurther research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimateVery low We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effectAny estimate of effect is very uncertain51Table 4: Summary of ambiguity presentation techniques as identified by the literature review, along with potential influences on outcomes associated with decision-making.Technique Positives Negatives Example from searchQualitative prefix- Easy to implement- Magnitude of ambiguity unclear (can’t have different levels)55- Can be misinterpreted57“There is [about] a 25 in 100 chance that this is prostate cancer.”Quality of evidence statement (GRADE & other rating systems)- Granular – can distinguish multiple levels of ambiguity59- Evidence supports understanding60- Increases decisional conflict60- Can be challenging to implement as guidelines require substantial evidence59- Decreases trust63“1 in 100 chance of the event, but our confidence in the effect estimate is [limited/good]: The true effect may be substantially different from the estimate of the effect.”Time treatment has been available- Simple, and has potential to implicitly convey lots of information regarding ambiguity- Promotes ambiguity aversion when combined with risk estimates64- Inconsistent use by clinicians65“Old treatment: approved [10/2] years ago.”Number of people- Can increase trust with regard to where information came from- Violates recommendations to use constant denominator46- Requires mental math to make probabilistic sense“In one study of women who had CVS performed by a highly trained doctor, about 1 out of [400/40] women had a miscarriage after the 
test, while [399/39] did not.”Experts disagree- Truly articulates ambiguity in much of the scientific literature- Confusing for patient (they rely on expert opinion)- May impact trust“The evidence of HRT’s effects on cardiovascular disease (heart disease and stroke) is more controversial.”Range (visual)- Portrays the ambiguity numerically – high information transfer- Technically simple- Supported by numerate participants41- Can decrease ambiguity aversion49- Conceptually difficult – is the extreme as likely as the middle? Requires careful explanation- Can increase risk perception19- Can decrease credibility in estimate41,19 52Table 4: Summary of ambiguity presentation techniques as identified by the literature review, along with potential influences on outcomes associated with decision-making.Gradient Range- Accurate portrayal of numerical ambiguity- Complex for layperson to understand71- Not shown to be more effective compared to basic range49  Violin plot- Accurate portrayal of numerical ambiguity- Decreases “within-the-bar” bias71- May be overly complex- Cannot be integrated easily with icon arrays (recommended presentation format for risk in health) Range (textual)*same as visual range *same as visual range“5-15% of people may have adverse side effects”Figure 2: PRISMA for systematic review findings53Figure 3: Grant et al. "about" textual prefix55Figure 4: MAGIC appCertainty ratings are based on the GRADE system54Figure 5: BMJ clinical evidence categories55Figure 6: Schapira's range41Figure 7: Johnson's range1956Figure 8: Han's range49Figure 9: Han's gradient range49Shaded sides represent area outside the 95% confidence interval. Entire area within confidence interval is shaded in the same color to indicate that true value could be anywhere within the bar.Figure 10: Bansback's gradient range6057Figure 11: Correll's violin plot71Shape of violin is based on underlying probability distribution, mirrored across the y-axis.583 Survey designGiven the number of distinct techniques and unclear impacts on decision-making that were identified in the previous chapter, I chose to develop a survey that compares the presentation of risk using ambiguous and non-ambiguous techniques, and hereafter describe the process that was followed and rationale. Through the review of the literature described in Chapter two, nine distinct techniques were identified that have been used previously to describe ambiguity in risk estimates to patients. However, the potential consequences associated with using each of these techniques in describing risks to patients are unclear. The few studies investigating these techniques reported varying outcomes, and contradictory results were frequently encountered between papers.41,49,57,19,72,73It was chosen to not include the violin plot from the set of potential presentation techniques, as the expert committee felt it might be too difficult for patients to understand. Additionally, and given first-order recommendations, it was unclear on how a violin plot could be adapted for use with a standard icon array, and therefore considered ineligible for further testing.Furthermore, on review with the committee and field experts, it was felt that the variety of techniques encountered in the health domain were non-exhaustive. Throughout the review, many articles were discarded as they focused on visualizations of first-order uncertainty, as opposed to ambiguity. 59Yet, some of these visualizations were easily adaptable to display ambiguity instead. 
One such visualization was an animated tool developed at the University of Cambridge, showing risk in the typical icon array frequently used in health.74 The novelty that they explored was the ability to alter the first-order uncertainty dimension by randomly distributing the affected icons throughout the visualization (Figure 13).

Several studies have tested the influence of this presentation technique in the first-order context, but results have been mixed, and some have shown that it does not differ from the traditional “grouped” icon array with respect to risk perception, worry, subjective uncertainty, or dispositional optimism.75 This technique could easily be adapted to describe ambiguity by altering the number of people affected, as opposed to keeping it constant, yet no such examples were found in the literature review. As a result, a new animated visualization was developed that alters the number of affected icons, giving a visual representation of the range (95% confidence interval) that exists in the underlying data. A simple example is described in Figure 13, and a rough sketch of the approach is given at the end of the overview below.

3.1 Objectives

A survey was created that tested different ambiguity presentation techniques, either identified in the literature review or newly created, to determine how outcomes shown to improve decision-making may be influenced. The specific objectives of this survey were to:

1. Determine whether the technique used to present ambiguity in risks changes intention to take oral anticoagulation in a hypothetical atrial fibrillation treatment scenario (primary). The expected direction of this change was unclear, therefore change irrespective of direction, termed “distance”, and actual change were both considered.

2. In comparison to no ambiguity, and to different ambiguity techniques, investigate whether any technique(s):
a. Increases perceived magnitude of risk
b. Increases worry
c. Decreases trust
d. Increases decisional uncertainty
e. Increases knowledge

Of note is that the primary outcome’s direction is unclear as a result of insufficient evidence in the literature review. The secondary outcomes were identified directly from the literature review, and each had a generally identifiable direction, therefore these objectives were directional.

3.2 Methods

3.2.1 Overview

A survey was developed that used a hypothetical scenario in a sample of the general population. Since there is little to no evidence in the literature on the influence of these techniques on decision-making, it was determined that it would be unethical to test this on patients making real decisions, hence the decision to use a hypothetical scenario and a general population sample. A web survey was used primarily due to the exploratory nature of the study, as web surveys often provide data of equal quality to in-person surveys.76,77 The scenario was built around atrial fibrillation, in part due to familiarity within the committee, and also because multiple new treatment options have recently entered the pharmaceutical market, which may have ambiguous or unknown outcomes, so it served as a useful case study (see Chapter 1). A pilot survey was developed using the best and most validated measures associated with the primary and secondary outcomes that could be identified in the literature, and feedback was received from consultations with an expert group (see section 3.2.2). This pilot survey was tested in 100 individuals and revised on the basis of the feedback and exploratory analysis, before being sent to ~650 respondents.
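As a rough illustration of the animated icon array introduced above, the text-based sketch below redraws a 100-icon array over several frames, each time sampling the number of affected icons from within an assumed 95% confidence interval. The survey itself used a web-based implementation; the uniform sampling rule, frame count, and icon characters here are simplifying assumptions.

```python
# Rough text-mode approximation of an animated icon array in which the
# number of affected icons varies across frames to convey a 95% CI.
# The uniform sampling rule and frame count are assumptions for illustration.

import random

def icon_array_frame(affected: int, total: int = 100, per_row: int = 10) -> str:
    icons = ["X"] * affected + ["o"] * (total - affected)
    rows = [" ".join(icons[i:i + per_row]) for i in range(0, total, per_row)]
    return "\n".join(rows)

def animate(lower: int, upper: int, frames: int = 5, seed: int = 1) -> None:
    random.seed(seed)
    for frame in range(frames):
        affected = random.randint(lower, upper)   # draw a value within the 95% CI
        print(f"Frame {frame + 1}: {affected} in 100 affected")
        print(icon_array_frame(affected))
        print()

# Example: risk of stroke of 10 in 100 with a 95% CI of 5 to 15
animate(lower=5, upper=15)
```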
Approval from the UBC Behavioural Research Ethics board (reference H17-02094) was given prior to releasing the survey.623.2.2 Survey developmentOn completion of this review of the literature, advice was sought on which presentation techniques should be investigated further. It was important to understand whether the focus should be constrained to the more commonly utilized techniques or whether it would be of interest to look at all identified techniques. On consultations with field experts in risk communication (Paul Han and Liana Fraenkel) and the members of the supervisory committee, the consensus was to consider all broad types of techniques, as this work could help select the few that would be of most interest in future research. Advice was given to collapse some of the textual techniques (‘about’ was chosen, while ‘around’, ‘roughly’, and other textual techniques were omitted as it was felt they would all produce similar results).The survey development closely followed the Checklist for Reporting Results of Internet E-Surveys (CHERRIES) checklist.78 The CHERRIES list is predominantly a tool to report on the quality of web surveys and so requires authors to describe how the survey was designed, undertaken and implemented. Since it ensures adherence to good reporting practice and will be useful for publishing the study in the peer-reviewed literature, it was used as a structure for reporting methods and results. A full CHERRIES checklist is available in Appendix A.633.2.3 Survey outline3.2.3.1 ConsentThe first page of the survey was used to obtain consent. Participants were clearly informed that clicking next implied they had given consent to participate. They were informed that no identifying information would be collected, that the estimated time to complete the survey was between 5-10 minutes, that all response data would be stored on a secure server, and that results would only be reported in aggregate.3.2.3.2 VignetteThe vignette scenario used was a hypothetical situation describing a newly diagnosed AF patient. This scenario has been successfully used previously in the general population, so it was believed to be feasible and appropriate for use in this study.79 After introducing participants to the scenario, they were given a brief introduction to AF which described that the disease is caused by an irregular heartbeat and can significantly increase a patient’s risk of having a stroke. The difference between a major and minor stroke was explained, and participants were asked to focus only on their risk of having a major stroke for the remainder of the survey.  Participants were then given a simple introduction 64to the concepts and interpretation of probability and risk, along with an explanation of how icon arrays can be used to present risk estimates.The scenario described to the participant included specific point estimates regarding their individual risk of stroke, in the format which is frequently presented to patients based on the outputs from clinical prediction models (e.g. CHA2DS2-VASc). While the point estimates are important, there is also a potential for ambiguity surrounding these estimates that could factor in to the decision. In the case of new drugs where evidence is limited, this ambiguity is potentially large. Hypothetical point estimates and confidence intervals were used, as shown in Figure 14.If confidence intervals were not presented, the obvious choice would be to prefer oral anticoagulation (OAC) treatment because the risk of stroke is lower. 
However, when the confidence intervals are included, the choice becomes less obvious; the wide confidence interval around the chance of having a stroke when taking OAC therapy indicates that the true probability could actually be higher than the risk if no therapy is used, or even lower than the point estimate suggests. Depending on how risk and ambiguity averse the patient is (i.e. whether they look at either extreme of the confidence interval), they may now choose to have no therapy, or may still choose OAC. 653.2.3.3 Main SurveyThe main survey consisted of five pages which included questions used to collect the selected outcomes. Two separate scenarios were defined, and participants were asked to complete each scenario twice. The ambiguous versions of presentation techniques are available in Appendix B. Participants were randomized to one of the nine techniques, where they answered questions regarding trust, worry, risk perception, knowledge, preference, decisional uncertainty, and intention in both the ambiguous and non-ambiguous versions of each scenario.The first scenario included a single point estimate and consisted of three components on three separate pages, outlining the participant’s potential risk of stroke if they were to take an oral anticoagulant. The order of these three pages was not randomized. First, participants saw the ambiguous presentation technique for the technique that they were randomized to, and answered questions regarding their degree of worry, trust, and perceived magnitude of risk (see 3.3.3). They also answered the knowledge question on this initial page. On the second page, participants answered the same questions (minus the knowledge question), but about a non-ambiguous point estimate representing the lower end of the confidence interval. Finally, on the third page, participants were asked the risk perception and worry questions a final time using a non-ambiguous 66representation of the upper end of the confidence interval. As users progressed, it was made obvious that they were to disregard information seen on previous pages, and only consider information shown on the current page of the survey. The order of these pages was not randomized, as the knowledge question (Figure 15) depended on not having seen the lower and upper ends of the confidence interval. Participants would have otherwise been able to navigate back and assess the “correct” answer to the knowledge question.In the second scenario, participants were asked to consider two risk estimates regarding their risk of stroke: one if they were to take an anticoagulant and one if they were to forego treatment. They were then asked questions regarding their intention to take medication and their decisional uncertainty. Again, participants were asked to complete questions (see 2.3.4) twice: once with no ambiguity around the point estimates and once with ambiguity. Following these scenarios, a final question was asked requesting participants to choose whether they preferred the ambiguous version or the non-ambiguous version showing the same underlying data, just with different degrees of ambiguity. This question was used to explore potential relationships between personal preference and outcome changes, as personal preference clearly factors in to all decision-making.  673.2.3.4 Demographic questionsParticipants were then asked to complete three further survey pages, in which demographic information was collected, including age, sex, and level of education.  
Data regarding ambiguity aversion using the AA-Med scale and subjective numeracy using the subjective numeracy scale were also collected.80–83 Finally, participants were optionally invited to leave open-ended feedback in an open text box.3.3 Outcomes and sociodemographics3.3.1 Primary outcomeResearch around behavioral change has discovered that behavioral intention is a precursor to behavioral change, explaining up to 30% of the variance in actual health behavior.84 Consequently, simple behavioral intention questions are frequently used in similar studies to this. For example, in a study by Sheridan et al published in The Journal of the American Medical Association, the primary outcome was participant intention to accept prostate cancer screening, and was asked using the following question: “I plan to get screened for (name of screening test) in the next year”.85With Sheridan’s study in mind, additional evidence was sought on how to construct intention questions based on the theory of planned behavior. 68Johnston’s guide, a well-cited manual published by City University London exists to help health services researchers construct questionnaires based off the theory of planned behavior.86 This manual was used, alongside Sheridan’s work, to construct the question used for the primary outcome, shown in Figure 16.Johnston’s guide outlines three techniques that can be used to measure intention: intention performance, generalized intention, and intention simulation. Intention performance was better suited for studies where you could compare intention results to actual results (e.g. asking a physician how many patients he expects to refer for magnetic resonance imaging (MRI), and then comparing that response to how many people he/she actually referred). Intention simulation is typically geared towards measuring intention in health professionals, and was therefore not suitable for use. Recommendations suggest that generalized intention is best suited when measuring patient-reported intention, and as such, this technique was chosen as a template for the survey question.Of important note here is that, following the pilot survey, an adapted intention question was used in the final survey (Figure 16) given feedback related to the clarity and understanding of the original question from the pilot participants. The pilot results, presented in section 3.5.2, are based on the initial 69question (Appendix C). A description of differences between the old intention question and new question is given in section 3.5.4.3.3.2 Secondary outcomesThe following outcomes were considered as previous studies have found that ambiguity in risk communication can alter these in positive, negative, and unknown ways.3.3.2.1 KnowledgeKnowledge, in this context, is defined as an understanding that point estimates related to risks and benefits are imprecise, and in certain contexts and situations should influence decision-making. While the TPB doesn’t explicitly include knowledge as a precursor to behavioural change, evidence suggests that knowledge is highly correlated with attitude, subjective norm, and perceived behavioural control, which are the three primary predictors in the theory of planned behaviour model.87 It was chosen to measure knowledge using two questions, asking participants to state what they think is the lowest and highest possible risk of having a stroke given the ambiguous risk presented. These questions can be seen in Figure 15. 
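One way the responses to these two knowledge items could be summarized is sketched below, by comparing each participant’s stated lowest and highest plausible risks with the interval underlying the presentation (assumed here to be 5 to 15 in 100). This scoring rule and the toy responses are a hypothetical illustration only.

```python
# Hypothetical scoring sketch for the two knowledge items: compare each
# participant's stated lowest and highest plausible risks (per 100) with
# the 95% CI underlying the presentation (assumed here to be 5 to 15).
# The error metric is an assumption made for illustration.

TRUE_LOWER, TRUE_UPPER = 5, 15

def knowledge_error(stated_lower: float, stated_upper: float) -> float:
    """Mean absolute distance of the stated bounds from the true bounds."""
    return (abs(stated_lower - TRUE_LOWER) + abs(stated_upper - TRUE_UPPER)) / 2

responses = [
    {"id": 1, "lower": 5, "upper": 15},   # matches the presented interval exactly
    {"id": 2, "lower": 9, "upper": 11},   # underestimates the ambiguity
    {"id": 3, "lower": 0, "upper": 40},   # overestimates the ambiguity
]

for r in responses:
    print(f"Participant {r['id']}: stated range width = {r['upper'] - r['lower']}, "
          f"error = {knowledge_error(r['lower'], r['upper']):.1f}")
```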
It was predicted that participants randomized to the less precise techniques (e.g. a textual prefix) would show significantly more variation in the upper and lower bounds of their responses to these questions than those randomized to techniques that are more explicit about the amount of ambiguity present (e.g. range approaches).

3.3.2.2 Perceived magnitude of risk

Perceived magnitude of a risk was assessed by asking participants where they consider a given risk (either with or without ambiguity) to be, on a scale of very low to very high. This question was adapted from the Health Information National Trends Survey (HINTS), a validated questionnaire administered by the National Cancer Institute to assess cancer risk perception in the United States.88 Higher scores for this question represent higher perceived risk magnitudes.

There is a rich literature on the connection between risk perception and its influence on decision-making. Smith, for example, found that participants struggled to apply a binary 50% chance of Huntington’s disease and assimilate it into their decision-making processes, and that it was too simplistic for an individual case.72 Some participants explicitly sought additional data that could transform the statistic into a more skewed 70/30 risk, as they felt it was then more easily applied to themselves. It is possible that presenting ambiguity might provide some of this additional data and lead to improved risk perception. Conflicting evidence exists, however. Hivert’s study on diabetes risk perception found that, in primary care patients, a higher perceived risk of diabetes correlated with an actual elevated risk of diabetes, but did not have any impact on intentions to adopt healthier lifestyle choices.73

3.3.2.3 Decisional uncertainty

The uncertainty subscale of the decisional conflict scale (DCS) was used to understand participant uncertainty in the given risk information. Two questions were used to measure this construct: “Are you clear about the best choice for you?” and “Do you feel sure about what to choose?” The DCS is a validated scale frequently used to assess the influence of patient decision aids in the literature. It has a reported test-retest reliability coefficient of 0.81 and internal consistency ranging from 0.78 to 0.92, making its psychometric properties acceptable for use.89 While only the uncertainty subscale of the DCS (internal consistency of 0.92) was used in this survey, it is often analyzed on its own and used independently, hence my decision to use it.90,91 Lower scores on this scale represent more decisional uncertainty.

3.3.2.4 Worry

Evidence is unclear on whether patient worry increases or decreases when outcomes are uncertain.75 Additionally, worry has been shown to have a cognitive influence on decision-making. Metzger showed that high levels of worry resulted in slower deliberation and longer response times when ambiguous decisions were to be made.92 It would therefore be ideal to decrease patient worry while simultaneously conveying uncertainty to fully inform the patient. Worry was measured using a single item: “If you received these results, to what extent would you feel worried about having a major stroke?”. This item was also adapted from the HINTS.88 Higher scores on this scale represent higher worry.

3.3.2.5 Trust

Patients have been shown to trust point estimates provided without any representation of uncertainty more than those presented with some degree of uncertainty.
This is troubling: point estimates presented without any notion of ambiguity are perceived as highly trustworthy, yet when the additional information about the ambiguity is included, that trust diminishes. A technique that retains patient trust while still conveying ambiguity would be ideal for this outcome. Trust was gauged using a single question: "How well do the following adjectives describe the content you just read?". The adjectives were "accurate", "authentic", and "believable". Answers were recorded in a Likert format, from one (describes very poorly) to seven (describes very well). This question was constructed based on a previously validated message credibility scale, with high reliability (Cronbach's α = 0.87) and high content, criterion and construct validity.93 Higher scores on this scale represent greater trust.

3.3.3 Sociodemographics

3.3.3.1 Ambiguity aversion

The AA-Med scale, used to measure ambiguity aversion, asks 6 questions about participant reactions to uncertain medical tests and treatment outcomes. The questions are answered on a 5-point Likert scale ranging from "Strongly disagree" to "Strongly agree", and higher scores correspond to higher ambiguity aversion. The scale also displays relatively strong reliability, with a Cronbach α of 0.73.80 The items of the AA-Med scale are outlined in Table 5.

3.3.3.2 Subjective numeracy

Subjective numeracy data were collected using the subjective numeracy scale.81 The short version of the scale asks 3 questions: 2 ask participants to subjectively assess their ability to work with numerical data, and 1 probes whether they find numerical information useful. The version of the scale that was used reports a Cronbach α of 0.78 and is therefore considered reliable.83 Questions are answered on 6-point Likert scales, and higher scores represent higher levels of subjective numeracy.

3.4 Population

There were various options available for data collection, including advertisements and market research panels, but Amazon's Mechanical Turk (MTurk) platform was chosen as an efficient approach to recruit participants, as it charges per question completed, allowing a guaranteed sample size to be collected within the limited budget for the study. The last decade has seen an upsurge in studies using MTurk, and evidence has shown that MTurk samples produce good Cronbach alphas (0.73-0.93) and are therefore adequately reliable and useful for testing psychometric constructs.76 In the clinical domain, evidence shows that MTurk studies are fast and effective at producing high-quality data for psychiatric research, often producing sufficient data in as little as three days.77 In addition, patients may feel more comfortable disclosing information in an anonymous online survey as opposed to in-person focus groups or personal interviews.77 The use of MTurk was tested in the pilot (see 3.5) and produced results with sufficiently large change scores (see 3.5.2). It even exceeded expectations, as 6% of participants provided free-text responses on how they felt the survey could be improved (see 3.5.3) – something they were not being compensated to do. This feedback proved invaluable when adapting and improving the survey between completion of the pilot and launch of the full survey. It also improved confidence in the data, as it signaled that MTurk participants were not just doing the minimum required work to be remunerated.
At the time of the survey, MTurk was only available to respondents in the US (it has since become available in Canada). While there was a preference for a Canadian sample, it was decided that this was not of primary importance; in fact, most of the studies identified in Chapter 2 were from the US. This is a potential benefit, as results may be less affected by contextual differences between Canada and the US.

3.5 Piloting

3.5.1 Expert review

The initial design of the survey was sent to two field experts, Paul Han and Liana Fraenkel, as part of an exercise to rationalize the number of techniques to investigate further. While the general consensus was that testing in actual patients would require reducing the number of techniques, both experts felt that all of the candidate techniques should be tested using MTurk and then "throw out the ones that perform really poorly". Future research using real patients could then focus on those techniques that showed some promise.

Additionally, expert feedback indicated that the proposed vignette was too complex. The primary worry was that the cognitive effort required to understand the vignette would prime participants to be ambiguity averse and would therefore skew the data. The original vignette design featured two treatment-related attributes: stroke risk and major bleed risk. After carefully considering the design of the survey, it was deemed that the bleed risk attribute was unnecessary and that the same outcomes could be collected without it. As such, the up-front introduction to AF could be simplified, and participants were explicitly informed that they were to consider only one attribute: their potential risk of stroke.

3.5.2 Online pilot

After finalizing the survey design, it became clear that the required sample size could not be determined directly, as evidence for the mean (and SD) intention change was not available for the AF treatment context. Therefore a pilot test was conducted in which only two techniques for presenting ambiguity were tested: a visual range and a textual representation of ambiguity, which were expected a priori to represent the likely best and worst techniques at communicating ambiguity, respectively. One hundred participants were enrolled and randomized to the two presentation techniques: visual range (n = 48) and textual prefix (n = 52). Baseline characteristics were well balanced. Descriptively, those randomized to the visual range had higher levels of educational qualifications than those who saw the textual representation of ambiguity (Table 6). Additionally, the textual prefix group had a slightly higher proportion of females (56%) compared to the visual range group (50%).

There were two ways of analyzing the results for intention: by looking at the mean change, or by looking at the proportion of respondents who changed by a predetermined number of steps on the scale (e.g. 1-point change, 2-point change, etc.). In the ambiguous cases, intention to take medication decreased for both the visual range technique (from 3.73 to 3.43) and the textual prefix technique (from 3.85 to 3.67), as shown in Table 7. Cohen's D effect sizes were also computed; the visual range showed only a small effect (-0.28), while the textual prefix effect was negligible (-0.17). While these changes appear small, the direction was not hypothesized a priori, and therefore the 1-point, 2-point and 3-point proportion changes, in which direction is not considered, were also calculated.
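For concreteness, the sketch below illustrates how these pilot quantities can be computed from paired intention ratings. It is written in Python with made-up ratings rather than the pilot data (the thesis analyses were run in Stata), and it assumes Cohen's D is taken as the mean change divided by the pooled standard deviation of the two ratings, which is one common convention and is consistent with the magnitudes reported here.

```python
# Minimal sketch with hypothetical Likert intention ratings; not the pilot data.
import numpy as np

non_ambiguous = np.array([4, 5, 3, 4, 5, 2, 4])  # rating before ambiguity shown
ambiguous     = np.array([3, 5, 1, 4, 4, 2, 2])  # rating after ambiguity shown

change = ambiguous - non_ambiguous   # directional change score
distance = np.abs(change)            # non-directional change (distance)

# Effect size: mean change divided by the pooled SD of the two sets of ratings
pooled_sd = np.sqrt((non_ambiguous.std(ddof=1) ** 2 + ambiguous.std(ddof=1) ** 2) / 2)
cohens_d = change.mean() / pooled_sd

# Proportion of respondents changing by at least k points, ignoring direction
for k in (1, 2, 3):
    print(f">= {k}-point change: {(distance >= k).mean():.2f}")
print(f"mean change {change.mean():.2f}, Cohen's D {cohens_d:.2f}")
```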
When comparing the proportion that changed by at least 1 point on the scale, the visual range technique appeared to have a greater effect than the textual prefix technique (65% for the visual range compared to 25% for the textual prefix), though no additional analysis was conducted in this preliminary pilot study.

Secondary outcome results are also outlined in Table 7. Of particular note is that both trust and decisional uncertainty appeared to be affected more when ambiguity was described using an explicit visual range than when a simple textual prefix was used. Specifically, participants tended to trust the range less (mean score change of -0.29), compared to a mean score change of only -0.17 for the textual prefix. Participants were also more uncertain with the range (mean score change of 0.36) than with the textual prefix (mean score change of 0.14). Additionally, participants perceived the magnitude of risk as greater when using a range than when using the textual prefix (mean score of 1.96 for the range compared to 1.81 for the textual prefix). The change in worry between the techniques was minimal.

My conclusion with respect to intention was that a 1-point change, while greater than the 0.5-point change used in Sheridan's study, might be open to natural variation.85 This conclusion was based on the high proportion of people who changed by 1 point within the textual prefix group (proportion changing by 1 point = 0.25, SD = 0.44), which was predicted not to have an effect on intention to take oral anticoagulation. On top of this, the 0.65 (SD = 0.48) proportion change in the visual range group is extremely high, which suggested that there may be some degree of noise. The concern with using a 1-point change was that it might not indicate a real change, but merely uncertainty about which response option best reflects an individual's beliefs, and so might lead to over-interpretation of results. With this in mind, and alongside the changes made to the intention question (see 3.5.4), it was deemed that measuring a 2-point change would better reflect a real change, and the planned analyses were adjusted accordingly.

3.5.3 Open-ended feedback

Six percent of participants provided open-ended feedback on how the quality of the survey could be improved, which proved valuable as some common themes emerged. Several participants had difficulty with the nature of the AF treatment decision itself.

"It is impossible to disassociate other factors, especially drug side effects, when asked to choose between drug treatment and no treatment. I know if I take blood thinners to prevent stroke, I am 100% more susceptible to uncontrolled bleeding, which is an increased risk factor directly caused by a side effect of taking blood thinners."

"In the case of taking anticoagulants lowering the probability of having a stroke, if there were no costs or side-effects associated with treatment, I'm sure most people would opt for treatment. However, once you factor in cost and potential side-effects, it doesn't seem like a good decision for someone to take anticoagulants to minimize risk of stroke by a few percentage points in comparison to someone that isn't taking any treatments whatsoever."

Many existing studies looking at ambiguity use simple probabilistic questions. The famous Ellsberg paradox, for example, is based on simple decisions around pulling balls from bags and winning money.38 In the clinical world, decisions are inherently more complex.
Various factors related to the decision must be considered, and each of those factors may be precise or ambiguous. While the hypothetical AF scenario was made simpler by removing those extra attributes from the decision, some participants found that they couldn’t disassociate them.An additional piece of feedback noted the usefulness of seeing risk reductions in the treatment case side by side with the risk in the no treatment case.“When risk of stroke with medication is presented along with risk without it, it starts to make one think more about whether the medication is even useful and about whether there are other factors for me individually that are relevant besides medication. Presentation of risk with medication by itself doesn't seem to raise those other important questions as readily.”While this wasn’t an explicit goal of the study, it speaks to the amount of information that participants are given when making treatment decisions. 81Without knowing their baseline risk, it is difficult to assess whether medication is even going to be useful for them. Comments like this, if the sample is representative, should encourage clinical professionals to commit to shared decision-making, and ensure that patients have all the necessary information they need to make an informed decision, including relevant ambiguity associated with that information. Thus, it was deemed that a revision to the scenarios as presented should be considered and discussed with the expert group prior to launching the full survey, as described in the next section.3.5.4 Changes before final surveyBased on a second expert review and discussions with the thesis committee following the pilot survey, several changes were made to the survey before launching it with all presentation techniques included. An important addition was a new question asking participants to state which technique for presenting ambiguity they prefer. This question was asked after the primary and secondary outcome pages, but before completing demographics, as personal preference may have confounded earlier questions. Personal preferences and biases tend to be heavily emphasized when making decisions, and can often result in bad decisions (incongruent values and/or poor knowledge) being made. It was chosen to collect personal preference for presentation technique to 82understand whether patients preferred more explicit, less explicit, or no presentation of ambiguity at all. Participants were shown the non-ambiguous and ambiguous version for the technique that they were randomized to, and asked them to assume that there were two patient information leaflets – one with each technique. They were then asked which leaflet they would prefer to read, with ambiguous format, unambiguous format, equal, neither, and unsure being options. Some problems were experienced in analyzing several of the outcomes. In this pilot, worry, magnitude of risk perception, and trust, were measured over three scenarios, once using the ambiguous presentation format and twice using precise representations of the upper and lower confidence intervals. This proved to be problematic, as the response values were derived from different point estimates, therefore calculating a “change” score was deemed invalid. It was initially structured in this way because original interests were in how the different non-ambiguous point estimates influenced risk perception in the low-risk and high-risk cases. 
After further discussion with the committee, it was decided that this was outside the scope of this study, as it was not related to ambiguity. One of these scenarios was therefore dropped, and the point estimate in the non-ambiguous scenario was changed to match the ambiguous point estimate (as opposed to an extreme of the confidence interval), which allowed valid change scores to be computed in the full survey. Randomizing the order of these two scenarios also became possible, as the correct answers to the knowledge questions were no longer obvious if the non-ambiguous page was seen first.

Finally, as mentioned previously, issues with the question on respondents' intention to take oral anticoagulation were noticed after collecting the data. According to the theory of planned behavior, an attitude towards a behavior (e.g. preferring treatment over no treatment) is not equivalent to an actual change in intention,94 and it was therefore decided to go back to the literature to construct a better question, which uncovered Johnston's guide.86 The initial intention question, based on Sheridan's study,85 collected responses on a 5-point Likert scale, whereas the changes derived from Johnston's guide led to the use of a 7-point scale. Given the wider scale and the high variability seen in the pilot data, it was decided that a 2-point change would help filter out inherent variability and better indicate a real change, especially since a 2-point change on a 5-point scale reflects a far more drastic shift than a 2-point change on a 7-point scale.

3.6 Sample size

The sample size calculation for the study was primarily based on the proportion of respondents who would report at least a 2-point change in intention to take treatment when shown the ambiguous risk versus the non-ambiguous risk. Since the intention question differed in the pilot, the calculations in this section are based on the 1-point change; it was expected that the 1-point change in the pilot would be roughly equivalent to the 2-point change in the full survey, though more precise. It was also determined that 20% of respondents changing behavioural intention would be sufficiently large that the results would inform communicators of health risks. The null hypothesis is therefore that, for each technique, fewer than 20% of participants will change their intention by at least 2 points when shown the ambiguous risk compared to the non-ambiguous risk.

Determining the sample size requires consideration of not just the overall proportion but also the lower confidence interval. For example, a mean proportion with a 2-point change of 20% would not reject the null if the confidence interval overlapped zero (e.g. a 20% proportion change, 95% CI [0-40]). This requires an assumption about what proportion is 'different' for the lower bound of the confidence interval. Assuming a two-sided test with alpha = 0.05, powers were calculated for different plausible sample sizes (Table 8). Based on the pilot results, where a 40% difference in the proportion with a 1-point change was observed between the visual range group and the textual prefix group (65% for the visual range compared to 25% for the textual prefix), and acknowledging that there is likely a degree of variation (as the textual prefix change scores showed), it was determined that a sample size of ~60 participants per technique would be sufficient to test a 20% difference in the 2-point change hypothesis.
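The normal-approximation powers in Table 8 can be reproduced with a short calculation; a minimal sketch is given below, assuming the standard normal-approximation power formula for a one-sample proportion test with a two-sided alpha of 0.05 (illustrative Python rather than the software used for the thesis analyses).

```python
# Power to reject H0: p = p0 when the true proportion of respondents changing
# intention by >= 2 points is p1, using the normal approximation.
from math import sqrt
from statistics import NormalDist

def power_one_proportion(n, p0, p1, alpha=0.05):
    z = NormalDist().inv_cdf(1 - alpha / 2)
    numerator = abs(p1 - p0) * sqrt(n) - z * sqrt(p0 * (1 - p0))
    return NormalDist().cdf(numerator / sqrt(p1 * (1 - p1)))

# ~60 respondents per technique, null proportion 0.20, true proportion 0.40
print(round(power_one_proportion(60, 0.20, 0.40), 2))  # approximately 0.94
```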
This gives >90% power to detect a proportion change of at least 20% of people changing their intention by 2 points, allowing for some variation in the null proportion (P0). This sample size does not account for multiple statistical tests, which is discussed in the next chapter.A secondary sample size calculation explored comparisons of outcomes between techniques (comparing proportions of independent samples), as opposed to comparing a single technique to a null proportion. The decision to include all techniques rather than focus on a few meant this cross-technique comparison was not expected to be powered. Instead, it was expected that results would indicate which techniques would be appropriate for further comparative studies. Nevertheless, the calculations (Table 9) show, that a difference in ~50% of respondents changing by 2 points would have been detectable.863.7 DiscussionThis chapter describes the process that was followed to develop the final survey to test how presentation of ambiguity influences various outcomes in a hypothetical AF scenario. Based on the descriptive results from this pilot, it appears that the visual range technique influences intention, trust and decisional uncertainty more than the textual prefix technique, though this is unverified from a statistical standpoint. The results from this pilot study were used to perform a sample size calculation and determine that ~60 participants per technique was sufficient for the full survey. The final survey results are described in the next chapter.On reflection, the development of the survey was challenging. There were several outcomes that were useful to collect, but finding validated questions and scales proved to be difficult for some of them. There were additional constraints to keep the survey relatively short and minimize the cost, which meant items from validated questionnaires (e.g. the intention questionnaire) were sometimes removed, or slightly modified, to ensure that the questions fit in the AF context. It is recognized that there is a risk in doing so to maintain reliability and validity, but felt to be a good compromise between using invalidated questions and constructing an entirely new scale using factor analysis, which would have greatly increased the scope of this work. 87Given more time, the questions could have been validated in the AF context, additional data could have been collected using cognitive interviews and debriefs, and a more thorough analysis could have been conducted on the pilot data. However, within the constraints of the time and resources for this project, it was determined that the survey was of sufficient quality to proceed.Additionally, some issues were recognized with using Mturk. While evidence supports its use and it shows promise in clinical areas, it is still a relatively new way to collect data. Due to the nature of MTurk (people are already signed up to participate in surveys), extrapolating results to a more specific clinical population may be problematic. Additionally, due to its use as an online tool, it may result in skewed age distributions, as younger people tend to be more familiar with technology. This, however, could be controlled for in future studies by stratifying the sampling.88Table 5: AA-Med scale80 to measure ambiguity aversion1. Conflicting expert opinions about a medical test or treatment would lower my trust in the experts. COG3:2. I would not have confidence in a medical test or treatment if experts had conflicting opinions about it. AFF1:3. 
Conflicting expert opinions about a medical test or treatment would make me upset. AFF2:4. I would not be afraid of trying a medical test or treatment even if experts had conflicting opinions about it.*5. If experts had conflicting opinions about a medical test or treatment, I would still be willing to try it.*6. I would avoid making a decision about a medical test or treatment if experts had conflicting opinions about it.All questions answered on a 5-point numeric response scale, with end-points labelled as “strongly disagree” and “strongly agree”* = reverse coded 89Table 6: Pilot demographic resultsTechnique Characteristic All Participants (n = 100)Range (n = 48)Textual Prefix (n = 52)Sex N (%)    Male 47 (47%) 24 (50%) 23 (44%)Female 53 (53%) 24 (50%) 29 (56%)Age N (%)    18-25 3 (3%) 2 (4%) 1 (2%)26-35 44 (44%) 20 (42%) 24 (46%)36-45 30 (30%) 12 (25%) 18 (35%)46-55 12 (12%) 8 (17%) 4 (8%)56-65 9 (9%) 5 (10%) 4 (8%)66-75 2 (2%) 1 (2%) 1 (2%)75+ 0 (0%) 0 (0%) 0 (0%)Education N (%)    None 0 (0%) 0 (0%) 0 (0%)High School Diploma 34 (34%) 13 (27%) 21 (40%)Associate Degree 16 (16%) 7 (15%) 9 (17%)Undergraduate Degree 41 (41%) 22 (46%) 19 (37%)Post-graduate Degree 9 (9%) 6 (13%) 3 (6%)Ambiguity Aversion mean score (SD)5-point scale3.15 (0.83) 3.01 (0.80) 3.28 (0.85)Perceived Numeracy mean score (SD)6-point scale4.73 (1.00) 4.82 (0.89) 4.65 (1.09)90Table 7: Pilot descriptive resultsOutcome Visual range Textual prefixIntentionNo ambiguity (SD) 3.73 (1.05) 3.85 (1.00)Ambiguity (SD) 3.43 (1.13) 3.67 (1.12)Change (SD) -0.29 (1.07) -0.17 (0.58)Cohen’s D effect size -0.28 -0.17Distance (SD) 0.79 (0.77) 0.29 (0.54)1-point proportion change (SD) 0.65 (0.48) 0.25 (0.44)2-point proportion change (SD) 0.1 (0.31) 0.04 (0.19)3-point proportion change (SD) 0.02 (0.14) 0.0 (0.0)Decisional uncertaintyNo ambiguity (SD) 3.50 (1.10) 3.72 (1.13)Ambiguity (SD) 3.86 (1.08) 3.87 (1.05)Change (SD) 0.36 (1.28) 0.14 (0.39)Cohen’s D effect size -0.33 -0.14Distance (SD) 0.93 (0.95) 0.14 (0.39)Magnitude of risk perceptionNo ambiguity - low point (SD) 1.46 (0.80) 1.27 (0.53)No ambiguity - high point (SD) 2.83 (1.00) 2.87 (0.95)Ambiguity (SD) 1.96 (0.58) 1.81 (0.69)WorryNo ambiguity - low point (SD) 2.48 (1.03) 2.33 (1.00)No ambiguity - high point (SD) 3.27 (1.01) 3.35 (0.93)Ambiguity (SD) 2.85 (0.95) 2.79 (0.82)TrustNo ambiguity (low point) (SD) 4.36 (0.52) 4.31 (0.62)Ambiguity (SD) 4.09 (0.74) 4.31 (0.64)91Table 8: Powers for different plausible sample sizes using normal approximation methods. 
Highlighted cells are those taken into consideration to determine sample size.Within comparison: Normal ApproximationP1NTotal P0 0.4 0.45 0.5 0.55 0.6 0.65 0.730 0.20 0.74 0.88 0.96 0.99 1.00 1.00 1.0030 0.25 0.48 0.69 0.85 0.94 0.99 1.00 1.0030 0.30 0.24 0.44 0.65 0.83 0.94 0.98 1.0040 0.20 0.84 0.95 0.99 1.00 1.00 1.00 1.0040 0.25 0.58 0.80 0.93 0.98 1.00 1.00 1.0040 0.30 0.29 0.54 0.77 0.92 0.98 1.00 1.0050 0.20 0.90 0.98 1.00 1.00 1.00 1.00 1.0050 0.25 0.67 0.87 0.97 0.99 1.00 1.00 1.0050 0.30 0.35 0.63 0.85 0.96 0.99 1.00 1.0060 0.20 0.94 0.99 1.00 1.00 1.00 1.00 1.0060 0.25 0.74 0.92 0.99 1.00 1.00 1.00 1.0060 0.30 0.40 0.70 0.90 0.98 1.00 1.00 1.0070 0.20 0.97 1.00 1.00 1.00 1.00 1.00 1.0070 0.25 0.80 0.95 0.99 1.00 1.00 1.00 1.0070 0.30 0.45 0.76 0.94 0.99 1.00 1.00 1.0080 0.20 0.98 1.00 1.00 1.00 1.00 1.00 1.0080 0.25 0.84 0.97 1.00 1.00 1.00 1.00 1.0080 0.30 0.50 0.81 0.96 1.00 1.00 1.00 1.0090 0.20 0.99 1.00 1.00 1.00 1.00 1.00 1.0090 0.25 0.88 0.98 1.00 1.00 1.00 1.00 1.0090 0.30 0.54 0.85 0.98 1.00 1.00 1.00 1.00100 0.20 0.99 1.00 1.00 1.00 1.00 1.00 1.00100 0.25 0.91 0.99 1.00 1.00 1.00 1.00 1.00100 0.30 0.58 0.89 0.99 1.00 1.00 1.00 1.0092Table 9: Sample size powers for between technique comparisonsBetween Comparison  P2NPerGroup P1 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.650.25 0.09 0.19 0.36 0.56 0.74 0.88 0.95 0.990.30  0.08 0.18 0.34 0.53 0.72 0.87 0.950.35   0.08 0.17 0.33 0.52 0.71 0.860.40    0.08 0.17 0.32 0.52 0.710.45     0.08 0.17 0.32 0.520.50      0.08 0.17 0.330.55       0.08 0.17500.60        0.080.25 0.11 0.27 0.50 0.73 0.89 0.97 0.99 1.000.30  0.10 0.25 0.48 0.71 0.88 0.97 0.990.35   0.10 0.24 0.46 0.70 0.87 0.960.40    0.10 0.23 0.45 0.69 0.870.45     0.09 0.23 0.45 0.700.50      0.09 0.23 0.460.55       0.10 0.24750.60        0.100.25 0.12 0.34 0.62 0.85 0.96 0.99 1.00 1.000.30  0.12 0.32 0.59 0.83 0.95 0.99 1.000.35   0.11 0.30 0.58 0.82 0.95 0.990.40    0.11 0.29 0.57 0.81 0.950.45     0.11 0.29 0.57 0.820.50      0.11 0.29 0.580.55       0.11 0.301000.60        0.1193Figure 12: First-order visualization with randomly placed icons, illustrating the potential first-order of uncertainty74Figure 13: Simple animated GIF example.Imagine an animated image that could take 1 of 4 forms, represented by the 4 icon arrays shown here. Every second, the GIF could transform into the next image. Assume the confidence interval is 14 (row 2) to 30 (row 4). The same concept could be applied to a standard icon array.94Figure 14: Hypothetical AF scenario with ambiguityFigure 15: Knowledge questions from surveyFigure 16: Behavioural intention question from survey954 Survey resultsHaving refined the survey based on feedback and results from the pilot study, I released the modified survey to a larger sample. This chapter describes the results from a population sample completing the survey described in Chapter 3. It begins by describing the statistical methodology. Next, it describes the randomization, dropout rates, and sample characteristics using demographic data. Results of primary and secondary outcomes are analyzed using statistical regression models. Finally, a conclusion is provided based on the results. 4.1 Statistical analysis methodsThe raw data was coded and analyzed using Stata/SE 13. The analysis was structured into three distinct phases. First, the survey completion results and randomization was examined to ensure randomization worked appropriately, and whether dropouts indicated any problems with the survey. 
Second, descriptive analyses were run to visualize the data prior to running any statistical models. This included investigating the distributions of demographic variables alongside initial assessments of how outcomes were influenced by presentation technique and demographics. The third step made up the bulk of the analyses and included the specification of logistic and linear models for the collected outcomes, to determine associations between the techniques and outcomes and to identify any potential confounding variables.

The first two steps were conducted by examining tables of results and plotting histograms and boxplots of demographic variables against the outcomes of interest. These plots were used to look for correlations that could cause collinearity in the regressions, which would increase the variance of the model coefficients, make estimates sensitive to minor changes in the model, and reduce the interpretability of the model. The descriptive plots also showed whether respondent demographics were evenly distributed across techniques, so as to judge whether this might require adjustment in later analyses. Finally, due to the potential for noise in the data discussed in Chapter 3, it was chosen to investigate how the ordering of techniques (from highest to lowest proportion change) shifted between the 1-point, 2-point, and 3-point proportion change values. It was expected that the order would begin to normalize (stay the same) after the 2-point change, as the noise in the data was filtered out.

Since it was hypothesized that interactions might exist between certain demographics and each technique, the distributions of demographic results were examined and, where appropriate, recoded into binary variables. For example, age groups were collapsed, as analyzing interactions between techniques and each age group would have been prohibitively complex and the regression tables would have been very large. Additionally, the size of the sample may have limited the number of interactions that could be explored, as some combinations were uncommon and had no matches in the sample, resulting in small or empty cells. After recoding, age was split above and below 26 (the median value), education was split at the university degree level (the median value), and outlier sex values (Prefer not to say: <1%, Other: <1%) were grouped into the male category (which had a lower N than the female group). Evidence from previous studies suggests that an ambiguity aversion score greater than 4 can be considered "ambiguity averse" and that roughly 21% of the general population has high (≥4) ambiguity aversion.80 Results from the current study indicated that only a small proportion reached this threshold (~3%), so the threshold was reduced from ≥4 to ≥3.2 to achieve a similar proportion considered high. Similarly, subjective numeracy was dichotomized at a score of 3, as scores greater than 3 are considered "high".95

In all outcomes, changes were considered both with and without a direction. Directional change scores were calculated by subtracting the non-ambiguous result from the ambiguous result; for example, a lower trust score in the ambiguous scenario would produce a negative value, indicating a decrease in trust. This is important to keep in mind as results are analyzed and interpreted.
Non-directional change scores (distance) were calculated as the absolute change between scores.

After recoding, the third step of the analyses began, which consisted of univariate regressions. It is worth noting here that it would be possible to treat each individual arm as an independent study, because each group has its own control and the primary objective was not cross-technique comparison. However, doing so limits interpretability, as differences in demographics between groups might be influential; hence it was decided to analyze all techniques in a single model. Initial models examined outcomes against potential confounding variables. For example, a model with intention change as the dependent variable and sex as the independent variable was created. Baseline levels of the outcomes were treated identically to other potential confounders. This allowed the investigation of potential relationships between independent variables and the outcome of interest. If the resulting model found p < 0.2 for a potential confounder, it was further tested in a new model that included the presentation technique as an independent variable. If including the potential confounder resulted in a greater than 10% change in the model coefficients compared to the unadjusted model (the same model without the potential confounder), it was kept in the final multivariate model. The corresponding interaction between the confounder and technique was then explored to assess whether the outcome effect changed over different levels of the confounding variable.

Since most variables were categorical, it was important to consider how the reference class was chosen, especially for the presentation format. The reference class was chosen, based on the descriptive results, as the technique that appeared to be least influential across all the outcomes of interest. This reference class was held constant throughout the analysis to make comparisons more interpretable.

Finally, the overarching multivariable model was constructed based on the results from the univariate analyses. Variables that showed potential confounding were included in the final adjusted model to adjust for their effect. When confounding was deemed a possibility, interactions were explored between those confounding variables and the presentation technique.

An important consideration throughout the analysis was the problem of multiple comparisons, given that each technique's p-value for the outcome of interest was considered independently. To account for this potential issue, both unadjusted and Bonferroni-adjusted p-values are considered when interpreting results. However, due to this study's exploratory nature, it is not necessarily useful to apply strict rules regarding the adjusted values, and given that the literature is conflicted on how and when Bonferroni correction should be applied, I am careful not to rely strictly on the adjusted values and am therefore less conservative than a true Bonferroni-adjusted analysis.96

4.2 Survey completion results and randomization

Out of 610 respondents, 34 (6%) did not complete the survey and were not included in the analysis. Demographics were collected at the end of the survey, and as a result it was not possible to determine whether any demographic factors were associated with the dropout rate.
Dropouts happened early in the survey and so did not suggest any serious issues.The randomization process resulted in relatively well-balanced sample sizes between groups, with the largest group (visual range) having 78 participants and the smallest group (expert agreement) having 55 participants. Since randomization was not blocked, this variation is expected when allocating 610 respondents within 9 groups. 4.3 Sociodemographic characteristicsThe descriptive demographic results were assessed to determine whether demographic variables were evenly distributed across techniques. The table of demographic characteristics can be seen in Table 10.The sample differed from the general US population in some of the measured characteristics.97 The sample was highly educated, with 52% of 101respondents having a bachelor’s degree or higher. This contrasts with the US census records, which indicate that only 27% of Americans have that high a level of education.97 Additionally, there was only 1 participant (<1%) who indicated that they have completed no schooling, compared to 18% of the American general population. Age was slightly skewed as well, with many participants being in the younger age categories (41% aged 26-35 compared to only 15% in the American population).  Sex, on the other hand, was represented well overall and matched closely with US census data. When comparing sociodemographic characteristics between the samples randomized to different techniques, sex was relatively well distributed when considering the entire sample (47% male, 52% female, <1% other, <1% prefer not to say) (Table 10). However, the percentage of females randomized to the visual range (65%), the textual prefix (60%), and the textual range (60%) techniques was much higher than the percentage of males.Age was also relatively equally dispersed across techniques with the majority of participants falling in the 26-35 age range (41%) or the 36-45 age range (25%), though again, there were some exceptions to these proportions. The sample randomized to the expert disagreement technique, for example, roughly reversed these percentages, where the 26-35 range (29%) represented less than 102the 36-45 range (35%). All other age ranges represented less than 15% of the sample.Level of education was well spread across all techniques, with the majority of participants having completed a high school diploma (27%), an associate’s degree (21%), or an undergraduate degree (40%). Other relevant demographics include subjective numeracy and ambiguity aversion. Subjective numeracy was well balanced across all techniques with a mean value of 4.84 (SD=0.930, range 1-6). Ambiguity aversion was also well balanced, with a mean value of 3.18 (SD=0.396, range 1-5).4.4 Primary outcome: Intention4.4.1 Descriptive resultsChange in intention to take oral anticoagulation was the primary outcome (defined as the difference between intention in the non-ambiguous scenario and intention in the ambiguous scenario), but as described in the methods, this was considered in terms of both ‘distance’ (where the direction of the change is irrelevant) and average ‘change’ (where the direction of change was pre specified). Proportions were coded as a binomial 1-point, 2-point, and 3-point change in intention, based on the discussion in Chapter 3. 
The 1-point and 3-103point changes were also calculated to get a better indication of the magnitude of change within each technique (Table 11).Descriptive results appear to show a slight change in intention (Table 11) for most techniques, but not for those who saw ambiguity represented as a textual prefix (“about N out of 100”), ratings of point estimates based on existing evidence (GRADE - “Our confidence in this effect estimate is limited”), how long the treatment had been available (“This treatment has been available for N years”), or expert agreement on the evidence (“Experts disagree on the evidence”), as those change in intention scores appear to cluster around zero (Figure 17). Additionally, standard deviations were high (Table 10) which indicated substantial variability and the possibility of intention change in either direction. Compared to the textual prefix (change=-0.07, SD=0.94) and how long the treatment had been available (change=-0.09, SD=1.00), which appeared to have the least effect, all other techniques had change scores between -0.25 (expert agreement) and -0.62 (textual range), though some may have been influenced by outlier responses. For example, in the technique based on how much experts agree or disagree with the evidence, the mean drops to -0.15 (SD=0.94) if the outlier response (6-point change in intention) is dropped, though there is no reason to believe that this participant’s response was not valid therefore it is included in analyses.104By considering the directional change and distance (binomial) scores separately, two questions based on my hypothesis could theoretically be answered: is presentation technique influencing intention (distance), and if so, in which direction is intention influenced (directional change).A proportion of change in intention greater than the lower bound of 20% specified in the power calculation was identified when ambiguity was presented using three techniques: (1) visual range (icon array with explicit range) (proportion=0.37, SD=0.49), (2) the animated GIF (dynamic icon array with changing point estimate) (proportion=0.26, SD=0.44), and (3) the textual range (specific textual representation of a range – e.g. “between 3 and 13%”) (proportion=0.37, SD=0.49) (Table 11). The ordering of techniques was also compared using the 1-point, 2-point, and 3-point change in intention. Based on the pilot results, it was anticipated that the amount of signal relative to noise for the 1-point change would be low (see Chapter 3), but would be increasingly large and more informative for the 2-point and 3-point changes. The results of this descriptive verification are shown in (Figure 18). The technique order based on magnitude of intention change appears to become largely stable after the 2-point check, given that the variability between the 1 and 2-point steps is much greater than between the 2 and 3-point steps. Specifying a proportion of change greater than 3-points would result in extremely low proportions for all techniques, as the 105scale only covers 7 points. This provided further confidence that the 2-point change was the best to analyze and interpret.As part of the descriptive analysis, the correlation matrices between demographics were examined to identify potential collinearity issues in later analyses. 
Given Mukaka’s rules of thumb when interpreting correlation coefficients, the majority of coefficients were in the ‘negligible’ range (0-0.3), and all coefficients were below 0.5, which represents the upper end of the ‘low’ range, which gave me confidence that collinearity would not be problematic in later regressions.98Due to the primary outcome of interest being a change in intention from the non-ambiguous technique to the ambiguous technique, the baseline intention’s (the non-ambiguous case) influence on the change was considered, and whether this needed to be adjusted for in later analyses. Correlation matrices between intention change and baseline (Appendix D) were analyzed, and results showed primarily low to moderate correlations amongst variables. This observation, alongside similar baseline means between techniques, suggested that baseline intention might be having an effect and it was therefore included in future regressions.Since results were difficult to interpret given that meaningful changes in outcomes were unclear, it was useful to examine effect sizes using Cohen’s D 106(Table 12). D scores between 0.2 and 0.5 are typically considered “small”, but not insignificant effect sizes. Most techniques, apart from the textual prefix technique (D=-0.04) and the technique based on how long the treatment had been available (D=-0.05), show D scores < -0.2, which, when considered as absolute values, suggest that some techniques may be having a negative effect on intention.4.4.2 Logistic regressionA model for a 2-point change in intention was built using univariate logistic analyses to examine potential predictor variables. This model should answer the question as to whether presentation technique influences intention, but will not provide any answer regarding the direction of that change. Individual regressions were conducted to test for confounding between each demographic variable collected and the technique for presenting ambiguity. The use of regression was primarily to understand the relationship between the intention change, the technique used, and other independent variables collected through the survey. Analyses found that sex, education, and subjective numeracy were potentially confounding intention (Appendix E), though type 3 tests for the interactions with presentation technique were insignificant (Appendix F). Nevertheless, these variables were included in the final adjusted model (Table 13).107The resulting model (Table 13) found intention change to be significant (p < 0.001). The overall model fit was also significant (p < 0.001). The type-3 test for presentation technique resulted in a significant result (chi2=33.73, p<0.001), and it was therefore concluded that at least one of the techniques was having an effect on intention. Based on further analyses, three techniques showed a significant 2-point change in intention: the visual and textual representations of ambiguity using explicit ranges, and the dynamic GIF technique in which the specific point estimate of the icon array changes based on sampling the underlying distribution. Interestingly, the two range techniques produce nearly identical results, the difference in techniques being that one presents the information visually on an icon array while the other simply states the range. The adjusted proportion changes are illustrated in Figure 19. 
Only the range techniques have intention change scores whose lower confidence interval sits above the 20% proportion change threshold, and as a result it is possible to conclude that they influence intention more than the reference class (the technique based on how long the treatment has been available). Additionally, the means and 95% confidence intervals of these techniques are very similar, giving confidence that the range techniques, which are conceptually very similar, are genuinely having more of an effect than the other techniques.

Finally, given that a directional change in intention was possible, logistic analyses were also run for a 2-point decrease in intention (Appendix G). While results were similar, all techniques were insignificant at the 20% proportion change level, indicating that some participants were likely changing intention in the other direction.

4.4.3 Linear regression

In addition to the proportion change analysis using logistic regression, which tested whether 20% of participants changed intention by at least 2 points, a linear association was used to investigate whether there was a specific directional change in intention. Again, univariate analyses (Appendix H and I) were run to identify potential confounders and interactions. While sex, education, and subjective numeracy were found to be potential predictors in the logistic tests, this did not hold true for the linear tests. No potential confounders were found in the univariate models, so the final model includes only the intention change and presentation technique (Table 14). Although the textual range showed significance, the overall model and the presentation technique F-test were not significant (p > 0.05); the null hypothesis therefore could not be rejected, and it was not possible to conclude that any technique had a directional effect on intention.

4.5 Secondary outcomes

Similar analyses were conducted for the secondary outcomes and found that trust and decisional uncertainty were significantly influenced by the technique used to present ambiguity. Descriptive results are shown in Table 15. Baseline values were examined for each outcome to check for potential ceiling and floor effects, but no obvious outliers were found. Univariate linear models were constructed for each secondary outcome to test for potential confounders and interactions between sociodemographic/participant characteristics and outcome variables, using an approach identical to that used for the primary outcome, intention. Non-directional proportional changes were also investigated for all secondary outcomes (Appendix J), but the results were similar to the raw change results, so it was concluded that the non-directional change analysis (logistic regression) was not necessary.

4.5.1 Risk perception

Descriptive analysis of directional change identified two techniques that appeared to influence the perception of risk magnitude more than the others: the GRADE technique (change=0.23, SD=0.65), in which ambiguity is described as a rating of the quality of the evidence underlying the point estimate, and the gradient range technique, in which the opacity of the range changes based on the underlying probability distribution of the point estimate's evidence (change=0.18, SD=0.67). All other techniques resulted in a change in the perception of risk magnitude between -0.1 and 0.11 on the 5-point scale.
When considering distance, as opposed to directional change, all techniques produced a result between 0.15 and 0.37. There was not a large amount of variation, though the technique in which the denominator of the risk estimate is taken directly from existing studies (number of people technique) (distance=0.35, SD=0.60), the gradient range technique (distance=0.35, SD=0.59) and the textual range technique (distance=0.37, SD=0.67) appeared to influence risk magnitude more than the other techniques. Given that, apart from the gradient range, there was little consistency between the directional change and distance results in identifying the most influential techniques, it was unclear whether there was a true effect on this outcome. Despite this scepticism, the outcome was analyzed in regression analyses, but no linear associations were found (Table 16).

4.5.2 Worry

Analysis of the directional change for worry generated change estimates that were small (ranging from -0.02 for the textual prefix technique to 0.19 for the gradient range technique). When analyzing distance, the change scores for most techniques were larger, in the range of 0.30 (animated GIF, where the icon array dynamically changes) to 0.43 (gradient range, where the opacity of the range changes based on the underlying distribution), apart from the technique in which ambiguity is embedded in how long the treatment had been available (time available) (distance=0.18, SD=0.46) and the technique in which ambiguity is embedded in how much experts agree/disagree (distance=0.13, SD=0.34), which had less of an effect. Further regression analysis (Table 16) showed no association in the unadjusted case; the null hypothesis (that all techniques are equal in their effect on worry) therefore could not be rejected, and it was not possible to conclude that worry was influenced by one or more techniques.

4.5.3 Trust

Descriptive results indicated that presenting ambiguity decreased trust for most techniques when looking at raw change. The textual prefix technique (change=-0.06, SD=0.40) and the time available technique, in which the ambiguity is embedded in how long the treatment had been available (change=-0.03, SD=0.32), showed minimal changes in trust, whereas the majority of other techniques showed a change in trust ranging from -0.25 (gradient range technique) to -0.5 (animated GIF technique). Analysis of distance showed consistent findings; the textual prefix (distance=0.21, SD=0.35) and time available (distance=0.12, SD=0.30) techniques showed a smaller mean change in trust than the remainder, which ranged from 0.41 (expert agreement technique) to 0.69 (animated GIF technique). Since the directional and distance results were similar, there was confidence that technique type was having a significant effect on trust. Furthermore, effect sizes were in the range of -0.3 (gradient range technique) to -0.56 (animated GIF technique) for most techniques (excluding the textual prefix and time treatment available techniques), indicating small to moderate changes, which gave further confidence that statistical modelling might show an effect for some techniques.99 This was verified using linear regression, which found that the visual range, GRADE, expert agreement, animated GIF, number of people, textual range, and gradient range techniques significantly decreased trust (p < 0.05), with coefficients ranging from -0.25 to -0.50 (Table 17) (Figure 20). The model as a whole was also significant (R2=0.17, p<0.001).
The resulting model included age as a potential confounder, as some technique coefficients changed by greater than 10% when it was included in the model, and it was significantly associated with change in trust (p<0.2). This is consistent with my hypothesis that the more explicit presentations of ambiguity might decrease trust, and is also consistent with existing evidence. 1134.5.4 Decisional uncertaintyOut of the set of secondary outcomes, decisional uncertainty had the greatest variation in change scores, ranging from -0.05 (number of people technique) to -0.75 (textual range technique). The outlier techniques, based on visual inspections, were the visual range (change=-0.44, SD=1.18) and the textual range (change=-0.75, SD=1.28) techniques, as they had larger mean change scores compared to the other techniques. The distance results showed a similar pattern, with the technique based on how long the treatment had been available (distance=0.24, SD=0.46) having the lowest overall effect on decisional uncertainty, while the visual range, in which the range is explicit in the icon array (distance=0.77, SD=0.99), the textual range, in which the range is explicit numerically in text (distance=1.04, SD=1.05), changed decisional uncertainty the most. This was further investigated using linear regression, and both the visual range and textual range techniques were found to significantly increase decisional uncertainty (p < 0.05) (Table 18). Sex and subjective numeracy were adjusted for, as there was a potential for confounding (Figure 21).4.5.5 KnowledgeParticipant knowledge was influenced by presentation technique, though not to the degree that had been expected a priori. It was hypothesized that 114presentation techniques that specifically named or showed the likely extremes of an estimate (e.g. ranges) would result in participants being more knowledgeable with respect to the degree of ambiguity around the point estimates. This appears to be confirmed, though the less explicit techniques (e.g. experts disagree, number of people) resulted in means that were relatively close to the correct values, albeit with larger confidence intervals (Figure 22). This was confirmed by examining distributions of the responses, as it was expected that the more explicit techniques would cluster around the “correct” values, while the less explicit techniques would be much more varied (Appendix K and L).Given that this clustering was expected, and knowing that mean knowledge scores could be skewed by outliers, I additionally grouped participants into those who answered correctly and those who didn’t (Table 19). “Correctly”, in this case, was defined as participants having answered both the knowledge questions to within 1 point of the true value (i.e. true upper and lower bound ±1). Here, there is clearly a pattern, in that those who were randomized to the more numeric-based techniques were highly likely  (>70%) to answer both these questions correctly, compared to those randomized to the text-based techniques (<11%). I had planned to run subgroup analyses on those who answered correctly, but this proved to be infeasible as sample sizes were low for techniques in which knowledge was poor (some techniques had as low as 4 115people who were deemed to have answered both the knowledge questions correctly).4.5.6 Preference for techniqueParticipants were asked which technique they liked best to assess preference for whether ambiguity should be presented. The techniques providing ambiguity in a less explicit manner (e.g. 
time available technique, in which ambiguity is embedded in how long the treatment had been available) tended to be preferred over the more explicit techniques (Figure 23). This appears to correlate negatively with change in intention to take treatment: techniques that result in less change in intention tend to be preferred. Additionally, techniques that appear to negatively impact knowledge tend to be preferred.

4.6 Conclusion

This chapter described the results of a survey conducted in 610 respondents in the USA. It was found that a minimum of 20% of participants changed their intention score by at least 2 points for both the visual range technique (adjusted proportion=0.39, SE=0.06) and the textual range technique (adjusted proportion=0.37, SE=0.06). Five of the 9 techniques were found to decrease trust, including the textual range (adjusted change=-0.20, SE=0.13), GRADE (adjusted change=-0.35, SE=0.13), expert agreement (adjusted change=-0.26, SE=0.13), animated GIF (adjusted change=-0.46, SE=0.13), and number of people (adjusted change=-0.27, SE=0.13) techniques. Interestingly, the same techniques that changed intention were also significant in increasing decisional uncertainty, with the visual range increasing uncertainty by 0.31 (SE=0.15) and the textual range by 0.63 (SE=0.16). The statistical significance of these analyses needs to be interpreted carefully, given the multiple tests conducted.

It was also found that more explicit techniques (e.g. the visual range) were better at improving knowledge than the less explicit techniques (e.g. the textual prefix). While knowledge is not on its own predictive of intention change, it plays a key role in the theory of planned behaviour and can therefore be considered a relevant finding. Finally, participants tended to prefer techniques that presented ambiguity in a less explicit way (e.g. the expert agreement technique).
The implications of these results will be discussed in Chapter 5.117Table 10: Demographic resultsDemographic  Technique Overall (n=576)Visual Range (n=78)Textual Prefix (n=60)GRADE (n=60)Time Treatment Available (n=67)Experts Disagree (n=55)Animated GIF (n=61)Number Of People (n=62)Gradient Range (n=68)Textual Range (n=65)Sex, n (%)           Male 271 (47) 27 (34.62) 24 (40.00) 32 (53.33) 35 (52.24) 28 (50.91) 35 (57.38) 35 (56.45) 29 (42.65) 26 (40.00)Female 302 (52) 51 (65.38) 36 (60.00) 28 (46.67) 32 (47.76) 26 (47.27) 26 (42.62) 26 (41.94) 38 (55.88) 39 (60.00)Other 1 (0) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 1 (1.47) 0 (0.00)Prefer not to say 2 (0) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 1 (1.82) 0 (0.00) 1 (1.61) 0 (0.00) 0 (0.00)Age, n (%)           18-25 51 (9) 6 (7.69) 8 (13.33) 5 (8.33) 5 (7.46) 5 (9.09) 7 (11.48) 5 (8.06) 6 (8.82) 4 (6.15)26-35 238 (41) 33 (42.31) 22 (36.67) 26 (43.33) 33 (49.25) 16 (29.09) 25 (40.98) 28 (45.16) 35 (51.47) 20 (30.77)36-45 146 (25) 21 (26.92) 10 (16.67) 13 (21.67) 12 (17.91) 19 (34.55) 16 (26.23) 17 (27.42) 16 (23.53) 22 (33.85)46-55 76 (13) 10 (12.82) 11 (18.33) 9 (15.00) 9 (13.43) 9 (16.36) 7 (11.48) 6 (9.68) 3 (4.41) 12 (18.46)56-65 50 (9) 7 (8.97) 8 (13.33) 5 (8.33) 6 (8.96) 4 (7.27) 3 (4.92) 4 (6.45) 7 (10.29) 6 (9.23)66-75 15 (3) 1 (1.28) 1 (1.67) 2 (3.33) 2 (2.99) 2 (3.64) 3 (4.92) 2 (3.23) 1 (1.47) 1 (1.54)76+ 0 (0) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00)Education, n (%)           No schooling 1 (0) 0 (0.00) 0 (0.00) 1 (1.67) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00) 0 (0.00)High school 155 (27) 18 (23.08) 14 (23.33) 23 (38.33) 20 (29.85) 12 (21.82) 19 (31.15) 13 (20.97) 22 (32.35) 14 (21.54)Associate degree 122 (21) 20 (25.64) 16 (26.67) 5 (8.33) 18 (26.87) 7 (12.73) 13 (21.31) 14 (22.58) 14 (20.59) 15 (23.08)Undergraduate degree 235 (41) 29 (37.18) 23 (38.33) 22 (36.67) 26 (38.81) 26 (47.27) 22 (36.07) 29 (46.77) 25 (36.76) 33 (50.77)Post-graduate degree 63 (11) 11 (14.10) 7 (11.67) 9 (15.00) 3 (4.48) 10 (18.18) 7 (11.48) 6 (9.68) 7 (10.29) 3 (4.62)Subjective Numeracy, 1-6 (SD) 4.84 (0.930) 4.66 (1.01) 4.93 (0.93) 4.87 (1.03) 4.93 (0.88) 4.99 (0.81) 4.98 (0.96) 4.87 (0.85) 4.73 (0.93) 4.69 (0.91)Ambiguity Aversion, 1-5 (SD) 3.18 (0.396) 3.20 (0.40) 3.19 (0.43) 3.33 (0.40) 3.16 (0.35) 3.15 (0.38) 3.15 (0.39) 3.18 (0.45) 3.13 (0.39) 3.17 (0.36)118Table 11: Descriptive results for intention changeOutcomes  Technique Overall (n=576)Visual Range (n=78)Textual Prefix (n=60)GRADE (n=60)Time Treatment Available (n=67)Experts Disagree (n=55)Animated GIF (n=61)Number Of People (n=62)Gradient Range (n=68)Textual Range (n=65)Intention (1-7)           No ambiguity (SD) 4.71 (1.75) 4.58 (1.94) 5.07 (1.69) 4.62 (1.81) 4.85 (1.84) 4.58 (1.79) 4.44 (1.88) 4.90 (1.54) 4.65 (1.44) 4.74 (1.73)Ambiguity (SD) 4.34 (1.73) 4.08 (1.81) 5.00 (1.66) 4.13 (1.75) 4.76 (1.78) 4.33 (1.71) 3.95 (1.78) 4.50 (1.72) 4.21 (1.56) 4.12 (1.59)Change (SD)-0.38 (1.29)-0.50 (1.66)-0.07 (0.94)-0.48 (1.08)-0.09 (1.00)-0.25 (1.22)-0.49 (1.57)-0.40 (1.14)-0.44 (1.11)-0.62 (1.57)Distance (SD) 0.85 (1.04) 1.19 (1.25) 0.53 (0.77) 0.72 (0.94) 0.51 (0.86) 0.62 (1.08) 1.05 (1.26) 0.79 (0.91) 0.85 (0.83) 1.29 (1.07)Proportion change by 1 point (SD) 0.52 (0.50) 0.60 (0.49) 0.40 (0.49) 0.45 (0.50) 0.36 (0.48) 0.36 (0.49) 0.59 (0.50) 0.53 (0.50) 0.62 (0.49) 0.75 (0.43)Proportion change by 2 points (SD) 0.22 (0.41) 0.37 (0.49) 0.10 (0.30) 0.20 (0.40) 0.09 (0.29) 0.16 (0.37) 0.26 (0.44) 0.19 (0.40) 0.19 (0.40) 0.37 
(0.49)Proportion change by 3 points (SD) 0.07 (0.26) 0.15 (0.36) 0.03 (0.18) 0.07 (0.25) 0.03 (0.17) 0.04 (0.19) 0.10 (0.30) 0.06 (0.25) 0.04 (0.21) 0.12 (0.33)119Table 12: Cohen D's effect size for all outcomesPresentation TechniqueOutcomeOverall (n=576)Visual Range (n=78)Textual Prefix (n=60)GRADE (n=60)Time Treatment Available (n=67)Experts Disagree (n=55)Animated GIF (n=61)Number Of People (n=62)Gradient Range (n=68)Textual Range (n=65)Intention (1-7) -0.21 -0.27 -0.04 -0.28 -0.05 -0.14 -0.27 -0.25 -0.29 -0.37Magnitude of risk perception (1-5) 0.10 0.05 0.01 0.28 -0.14 0.12 0.13 0.07 0.26 0.07Worry (1-5) 0.11 0.12 -0.02 0.15 0.09 0.02 0.12 0.09 0.19 0.11Trust (1-5) -0.35 -0.48 -0.07 -0.51 -0.06 -0.52 -0.56 -0.34 -0.30 -0.32Decisional uncertainty (1-5) 0.23 0.39 0.06 0.21 0.08 0.23 0.14 0.04 0.23 0.69120Table 13: Logistic regression model for 2-point change in intentionCoef (S.E) OR pConstant -3.28 (0.61) 0.04 0.000Presentation techniqueVisual Range 1.97 (0.50) 7.14 0.000Textual Prefix 0.18 (0.61) 1.19 0.773GRADE 0.97 (0.54) 2.65 0.072Time Available --- --- ---Expert agreement 0.66 (0.57) 1.94 0.243Animated GIF 1.33 (0.52) 3.77 0.011Number of People 0.85 (0.54) 2.35 0.114Gradient Range 0.97 (0.53) 2.65 0.067Textual Range 1.88 (0.51) 6.58 0.000Baseline intention 0.11 (0.06) 1.11 0.088SexFemale -0.37 (0.22) 0.69 0.087Male --- --- ---EducationUndergraduate degree or higher 0.41 (0.22) 1.50 0.058Lower than undergraduate degree --- --- ---Subjective numeracy>4 0.45 (0.27) 1.57 0.102<4 --- --- ---Likelihood ratio chi2 47.01p > chi2 0.000121Table 14: Linear regression results for intention changeCoef (S.E) pConstant 0.09 (0.16) 0.570Presentation techniqueVisual Range -0.41 (0.21) 0.057Textual Prefix 0.02 (0.23) 0.921GRADE -0.39 (0.23) 0.087Time Available -- --Expert agreement -0.16 (0.23) 0.483Animated GIF -0.40 (0.23) 0.079Number of People -0.31 (0.23) 0.168Gradient Range -0.35 (0.22) 0.114Textual Range -0.53 (0.22) 0.020R2 0.02Adjusted R2 0.01F-test 1.41p 0.188122Table 15: Descriptive results for secondary outcomesOutcomes TechniqueOverall (n=576)Visual Range (n=78)Textual Prefix (n=60)GRADE (n=60)Time Treatment Available (n=67)Experts Disagree (n=55)Animated GIF (n=61)Number Of People (n=62)Gradient Range (n=68)Textual Range (n=65)Magnitude of risk perception (1-5)           No ambiguity (SD) 1.98 (0.81) 2.01 (0.92) 1.97 (0.88) 1.88 (0.76) 1.82 (0.69) 2.02 (0.89) 1.95 (0.74) 2.21 (0.83) 1.94 (0.64) 2.02 (0.87)Ambiguity (SD) 2.06 (0.82) 2.05 (0.8) 1.98 (0.75) 2.12 (0.96) 1.73 (0.64) 2.13 (0.9) 2.05 (0.76) 2.27 (0.83) 2.12 (0.74) 2.08 (0.89)Change (SD) 0.08 (0.61) 0.04 (0.55) 0.02 (0.57) 0.23 (0.65) -0.09 (0.48) 0.11 (0.37) 0.10 (0.57) 0.06 (0.70) 0.18 (0.67) 0.06 (0.77)Distance (SD) 0.27 (0.55) 0.24 (0.49) 0.25 (0.51) 0.27 (0.63) 0.18 (0.46) 0.15 (0.36) 0.23 (0.53) 0.35 (0.60) 0.35 (0.59) 0.37 (0.67)Worry (1-5)           No ambiguity (SD) 2.93 (0.94) 2.95 (0.94) 2.85 (0.94) 2.83 (0.92) 2.75 (0.96) 3.18 (0.82) 2.8 (0.85) 3.1 (0.92) 2.78 (0.99) 3.18 (1)Ambiguity (SD) 3.03 (0.96) 3.06 (0.89) 2.83 (1.01) 2.98 (1.05) 2.84 (0.98) 3.2 (0.89) 2.9 (0.79) 3.19 (1.01) 2.97 (0.96) 3.29 (0.96)Change (SD) 0.10 (0.61) 0.12 (0.62) -0.02 (0.62) 0.15 (0.63) 0.09 (0.42) 0.02 (0.36) 0.10 (0.57) 0.10 (0.67) 0.19 (0.78) 0.11 (0.71)Distance (SD) 0.31 (0.54) 0.35 (0.53) 0.32 (0.54) 0.32 (0.57) 0.18 (0.39) 0.13 (0.34) 0.30 (0.49) 0.39 (0.55) 0.43 (0.68) 0.38 (0.60)Trust (1-5)           No ambiguity (SD) 4.00 (0.74) 4.05 (0.69) 4.01 (0.74) 4.01 (0.7) 4.11 (0.67) 4.01 (0.52) 4.01 (0.79) 3.83 (0.91) 4.01 (0.8) 3.93 
(0.81)Ambiguity (SD) 3.72 (0.85) 3.67 (0.88) 3.96 (0.67) 3.61 (0.86) 4.07 (0.68) 3.68 (0.73) 3.51 (0.98) 3.52 (0.92) 3.76 (0.88) 3.67 (0.84)Change (SD) -0.28 (0.74) -0.38 (0.76) -0.06 (0.40) -0.39 (0.84) -0.03 (0.32) -0.33 (0.64) -0.50 (0.97) -0.31 (0.85) -0.25 (0.76) -0.26 (0.78)Distance (SD) 0.45 (0.65) 0.52 (0.67) 0.21 (0.35) 0.51 (0.78) 0.12 (0.30) 0.41 (0.59) 0.69 (0.84) 0.56 (0.71) 0.55 (0.58) 0.50 (0.65)Decisional uncertainty (1-5)           No ambiguity (SD) 3.25 (1.16) 3.35 (1.25) 3.45 (1.13) 3.3 (1.13) 3.4 (1.21) 3.18 (1.18) 3.3 (1.09) 3.35 (1.14) 3.03 (1.2) 2.88 (1.06)Ambiguity (SD) 3.51 (1.07) 3.79 (0.99) 3.52 (1.04) 3.53 (1.02) 3.49 (1.14) 3.43 (0.99) 3.46 (1.17) 3.4 (1.15) 3.29 (1.03) 3.62 (1.08)Change (SD) 0.26 (0.91) 0.44 (1.18) 0.07 (0.56) 0.23 (0.78) 0.09 (0.51) 0.25 (0.77) 0.16 (0.98) 0.05 (0.76) 0.26 (0.82) 0.75 (1.28)Distance (SD) 0.53 (0.79) 0.77 (0.99) 0.25 (0.51) 0.47 (0.67) 0.24 (0.46) 0.37 (0.72) 0.58 (0.80) 0.40 (0.65) 0.57 (0.64) 1.04 (1.05)123Table 16: Linear regression results for risk perception and worryOutcome Risk perception WorryStatistic Coef (S.E) p Coef (S.E) pConstant -0.09 (0.07) 0.225 0.09 (0.08) 0.234Presentation techniqueVisual Range 0.13 (0.10) 0.203 0.03 (0.10) 0.801Textual Prefix 0.11 (0.11) 0.322 -0.11 (0.11) 0.332GRADE 0.32 (0.11) 0.003 0.06 (0.11) 0.581Time Available --- --- --- ---Expert agreement 0.20 (0.11) 0.071 -0.07 (0.11) 0.524Animated GIF 0.19 (0.11) 0.079 0.01 (0.11) 0.936Number of People 0.15 (0.11) 0.148 0.01 (0.11) 0.947Gradient Range 0.27 (0.10) 0.011 0.10 (0.11) 0.338Textual Range 0.15 (0.10) 0.151 0.02 (0.11) 0.866R2 0.0211 0.0089Adjusted R2 0.0073 -0.0051F-test 1.53 0.64p 0.1448 0.7477124Table 17: Linear regression results for trustCoef (S.E) pConstant 1.42 (0.18) 0.000Presentation techniqueVisual Range -0.36 (0.11) 0.002Textual Prefix -0.05 (0.12) 0.694GRADE -0.39 (0.12) 0.001Time Available --- ---Expert agreement -0.31 (0.12) 0.013Animated GIF -0.5 (0.12) 0.000Number of People -0.37 (0.12) 0.002Gradient Range -0.25 (0.12) 0.035Textual Range -0.27 (0.12) 0.025Baseline trust -0.34 (0.04) 0.000Age>=26 -0.1 (0.06) 0.078<26 --- ---R2 0.17Adjusted R2 0.15F-test 11.19p <0.001125Table 18: Linear regression results for decisional uncertaintyCoef (S.E) pConstant 1.4 (0.15) 0.000Presentation techniqueVisual Range 0.3 (0.13) 0.022Textual Prefix -0.03 (0.14) 0.849GRADE 0.08 (0.14) 0.545Time Available --- ---Expert agreement 0.04 (0.14) 0.758Animated GIF 0.02 (0.14) 0.886Number of People -0.08 (0.14) 0.569Gradient Range 0.01 (0.14) 0.914Textual Range 0.44 (0.14) 0.002Baseline decisional uncertainty -0.37 (0.03) 0.000EducationUndergraduate degree or higher 0.1 (0.07) 0.123Undergraduate degree or lower --- ---Subjective numeracy>=4 -0.11 (0.08) 0.178<4R2 0.28Adjusted R2 0.26F-test 19.82p <0.001126Table 19: Proportion of participants who answered knowledge questions correctlyPresentation technique Incorrect CorrectVisual Range (N=78) 21 (27%) 57 (73%)Textual Prefix (N=60) 54 (90%) 6 (10%)GRADE (N=60) 56 (93%) 4 (7%)Time Available (N=67) 62 (93%) 5 (7%)Expert agreement (N=55) 50 (91%) 5 (9%)Animated GIF (N=61) 28 (46%) 33 (54%)Number of People (N=62) 55 (89%) 7 (11%)Gradient Range (N=68) 9 (13%) 59 (87%)Textual Range (N=65) 18 (28%) 47 (72%)“Correct” based on participant being within 1 point of the true upper and lower bounds (i.e. 
true value ±1)

Figure 17: Histograms of change in intention by technique (one panel per technique; horizontal axis: intention change from -6 to 6). Note how some techniques (e.g. textual prefix, expert agreement) cluster around 0, while others (e.g. textual range, visual range) show more variation.

Figure 18: Order of techniques by proportion change in the 1-point, 2-point, and 3-point case. Ordered from lowest proportion change to highest proportion change in each column.

Figure 19: Predictions for adjusted 2-point proportion change in intention, by technique. Orange line represents the minimum of 20% of respondents changing intention by 2 points that was defined as meaningful. Adjusted for baseline intention, sex, education, and subjective numeracy.

Figure 20: Linear predictions for adjusted change in trust by technique.

Figure 21: Linear predictions for adjusted change in decisional uncertainty by technique.

Figure 22: Knowledge by technique. Lower and upper horizontal lines represent the true values for the hypothetical scenario.

Figure 23: Preferred technique by presentation technique.

5 Discussion

5.1 Overview

This thesis sought to investigate how ambiguity is currently presented to patients, and the influence that presentation techniques have on various decision-making related outcomes. In this chapter, I summarize and synthesize the key findings, discuss the strengths and limitations, and make recommendations for further research.

5.2 Key Findings

There is considerable variability in the way that ambiguity in risk estimates is currently being presented to patients.

The first question sought to understand how ambiguity in risk estimates is currently being described to patients. It was hypothesized that there would be considerable variation, since in contrast to the first order context of risk communication, where recommendations are clear, there are no clear guidelines for how to describe second order ambiguity.27 Findings from the systematic review of the literature somewhat confirmed the initial hypothesis, in that many different techniques were found and then distilled down into 9 distinct techniques. But it was surprising how many qualitative approaches were being used, and it was often not clear whether their use was intentional or not.
For example, terms such as “about”, “around”, “approximately”, “roughly”, and “up to” were found as textual prefixes to risk estimates, which were then combined into a single technique. Of concern is that the developers of these communication techniques do not appear to understand how they influence the readers of this information, given that little evidence was found testing their effects. Some existing evidence shows that presenting ambiguity can result in changes to trust, worry, decisional uncertainty, knowledge and risk perception, though the degree to which presentation technique affects each of these outcomes remains relatively unknown.

Some of the techniques used in the literature change people’s intention to take medication.

The second question sought to understand how these approaches influenced decisions – providing preliminary feedback to developers on the impact of their choices regarding presentation format. It was found that the techniques that described a range, both visually in an icon array and textually (e.g. 10 to 15%), had the greatest influence on behavioral intention to take an oral anticoagulant to treat atrial fibrillation, based on the proportion of respondents changing by at least 2 points on a 7-point scale. It was further found that the less obvious range techniques, such as the gradient range (shaded confidence interval based on the underlying probability distribution), the animated GIF (dynamic image showing different “samples” of 100 people and how the value might change), and the number of people (risk estimates described directly from study samples – e.g. “in one study, 12 out of 453 participants experienced a side effect”) techniques, influenced behavioral intention by 1 point on the 7-point scale. However, my confidence in the degree to which this will result in an actual behavioral change is limited, given the variability in the 1-point change discovered in the pilot study. Interestingly, the 2 techniques that did change intention by 2 points were not found to result in a statistically significant mean change in intention in the linear regressions, meaning the null hypothesis of no mean change in intention could not be rejected for these techniques. It was initially hypothesized that the direction of intention change would vary depending on how ambiguity averse individuals were, where showing ambiguous risks to ambiguity averse individuals would tend to reduce intention (e.g. with a range they might focus on the higher value of the risk of a side effect), but would increase intention in ambiguity seeking individuals (e.g. with a range they might focus on the lower value). Statistical modeling found no relationship between ambiguity aversion and intention to take OAC, and therefore this could not be verified. It was speculated that the direction of change might be moderated by some other sociodemographic factor, as evidence suggests that higher age and lower education can influence ambiguity aversion, and might also influence intention.100 This finding could also not be verified, possibly due to insufficient sample size.

No evidence was found that presenting ambiguity has an effect on perceived risk magnitude.

The results from this study found no evidence that presenting ambiguity has an effect on the perceived magnitude of the risk, regardless of the presentation technique, though it is recognized that the study wasn’t necessarily powered to detect this.
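To give a rough sense of this power limitation, the sketch below uses a standard paired t-test power calculation (via statsmodels) to estimate how many respondents per arm would be needed to reliably detect a within-person standardized effect of about d = 0.10, the order of magnitude observed for risk perception in Table 12. This is an illustrative back-of-envelope calculation under those assumptions, not the power analysis used when designing the study.

from statsmodels.stats.power import TTestPower

# Sample size per technique needed to detect a small paired (within-person)
# effect of d = 0.10 with 80% power at a two-sided alpha of 0.05.
n_required = TTestPower().solve_power(effect_size=0.10, alpha=0.05, power=0.80,
                                      alternative="two-sided")
print(round(n_required))  # roughly 800 respondents per arm, far above the
                          # 55-78 per technique recruited in this survey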
In Han’s study testing the influence of ambiguous presentation techniques on colorectal cancer risk estimates, investigators found corroborating results: communicating ambiguity in visual and textual forms had small but non-significant effects on risk perception.49 Yet, there are numerous studies that contradict this finding, showing that ambiguous estimates lead to increased perception of risk.50,101 Most of the studies that found such a relationship were conducted outside the health domain, and so may point to differences in health-related behaviors with respect to risk perception. Additionally, these studies tend to test ranges of ambiguity, from no/low ambiguity to high ambiguity, whereas the current study only compared no ambiguity with ambiguity. This finding could be seen as a positive: if it is true that presentation technique has no effect on the magnitude of risk perception, then perhaps there is no harm in including this information when communicating risk estimates to patients, though effects on other possible decision-making outcomes would still need to be considered.

No evidence was found that presenting ambiguity has an effect on risk-related worry.

Analyses also found no evidence indicating that ambiguity presentation technique has an effect on perceived worry, which was contrary to the initial hypothesis, though again it is recognized that study power may have limited the ability to detect a change here. This also contradicts existing evidence suggesting that presenting ambiguity increases worry.49,19 Existing studies have examined this effect in a cancer-related context, which may explain the result discussed here, as this is the first study to have examined ambiguity presentation in an atrial fibrillation context. Perceptual differences between cancer worry and stroke worry may explain this finding, as there are major negative societal perceptions with respect to cancer risk. This finding is important: if presenting ambiguity truly has no negative effect on worry, it lends support to presenting it in many contexts.

Some presentation techniques decreased trust in the risk estimates.

Presenting ambiguity decreased trust in the estimate for most presentation techniques, which corroborates existing evidence.41,54,60,19 This was expected, as it was assumed that presenting point estimates as ambiguous would result in less trust in that information. It was speculated that higher levels of education might moderate this effect, since understanding that the inclusion of ambiguity actually provides more information than the point estimate alone should, in theory, increase trust in the information. There is potential for confusion here, as trusting the “information” (e.g. the point estimate and range) is substantially different to trusting the “provider of information” (e.g. the physician, the risk pamphlet, the decision aid, etc.), and the questions collecting this outcome may have been insufficient to discriminate between these constructs.

Some presentation techniques increased decisional uncertainty.

Decisional uncertainty was increased by two techniques: the visual range technique and the textual range technique. This finding is significant, as these are the same two techniques that had an effect on intention.
This is consistent with existing evidence showing that people may manifest aversion to ambiguity through increased decisional conflict, which is measured with a superset of questions that includes the decisional uncertainty questions used in this study.89

Some presentation techniques resulted in better knowledge than other techniques.

Participant knowledge differed across techniques, with some techniques producing good knowledge results and others poorer ones. Those techniques that performed well, namely the visual range (a range indicated on an icon array), textual range (a range described textually), gradient range (shaded range based on the underlying probability distribution), and to an extent the animated GIF (dynamic image showing different “samples” of 100 people and how the value might change) techniques, resulted in substantially better knowledge than the text-based techniques. This finding is important, as informed consent is fundamental to implementing shared decision-making. If knowledge is sub-optimal, and patients make decisions (and give consent) based on that information, determining whether consent is truly informed becomes problematic. It is worth considering dichotomizing the techniques into those that are numeric and those that are textual. Since the numeric techniques provide explicit values for the true upper and lower bounds of the range, while the more implicit text-based techniques describe this information qualitatively rather than as exact probability estimates, these knowledge results are expected. People cannot be expected to have such specific knowledge of the scope of the range without having been given this information in the past. Yet it is still relatively unknown whether such precise knowledge is required, and whether more generic approaches, such as grading point estimates by the quality of their evidence, convey knowledge to an adequate degree such that good decisions are made.

Some presentation techniques were preferred in their ambiguous format.

In general, respondents preferred non-ambiguous presentations. Interestingly, but perhaps not surprisingly, the two techniques that showed the worst results on the knowledge questions, specifically the expert agreement technique, in which ambiguity is embedded in how well experts agree with the existing evidence, and the time available technique, in which ambiguity is embedded in how long the treatment had been available, were the only two techniques that were preferred in the ambiguous format compared to the non-ambiguous format. A relevant parallel is the observed phenomenon that people generally misunderstand pie charts, yet prefer them over other graphical formats for presenting data.102 This result could be attributed to people’s perception of ranges as being “wishy-washy”, which would explain people’s general aversion to the more explicit range-based techniques.41 This finding needs to be considered carefully, as there is an important distinction between what people want to know, and what they should know if they are to be sufficiently informed to provide consent.

5.3 Implications of results

Table 20 synthesizes the results of all the outcomes for each technique.
Only the visual range (a range indicated on an icon array), textual range (a range described textually), gradient range (shaded range based on the underlying probability distribution), and animated GIF (dynamic image showing different “samples” of 100 people and how the value might change) techniques accurately helped patients understand what the true underlying risks were. Consequently, developers using the other techniques need to understand that they are potentially misleading patients and doctors and not disseminating accurate knowledge. Of the techniques that do disseminate more accurate knowledge, the textual range and visual range techniques do potentially change intention to take medication, while the others do not. Whether changing intention is a good outcome cannot be addressed by this thesis – understanding whether these intentions come from rational judgment and decision-making or are conflated by cognitive biases and heuristics requires a fuller understanding of each individual’s preference and knowledge profile. However, the finding that some techniques do influence intention and others do not is important, as this is essentially a choice of the developer, and their choice is therefore potentially influencing patient decisions. If developers use different techniques without considering their potential impact, they might unintentionally alter patient decision-making.

The findings of trends towards decreasing trust and increasing decisional uncertainty need to be weighed against the improvements in knowledge and changes in intention. While these tend to be outcomes that we do not want to worsen in patients, there is an argument that some patients oversimplify decisions, and that increased decisional conflict and worry could be an indicator of a better-quality decision. However, it seems sensible that, if the ambiguity in risks is similar between two arms and should not influence decision-making, there is an argument for not including it.

In the context of precision medicine, the findings are troubling, and reflect Hunter’s claims that “we are largely ill equipped” and “already struggle with” assessing and presenting ambiguous probabilities to patients, given the significant heterogeneity of techniques found and their varying effects on decision-making outcomes.33 Evidence is provided to support his assertion that “we need to develop methods to help our patients absorb large amounts of complex information”, as ambiguity is inherently complex and difficult to understand, and clearly has significant effects on psychometric properties associated with decision-making.

As shared decision-making has become more commonplace, it is important that the implications of presenting ambiguity with a given technique are understood. By identifying which techniques are influential, it is possible to begin constructing recommendations on how best to present ambiguity to patients, whether through prognostic risk tools, decision aids, or precision medicine. Yet, as this is preliminary work, there is insufficient evidence to make such recommendations; doing so remains a priority. In light of these results, the expert agreement and time available techniques should likely be avoided, as they can have negative effects on patient knowledge with respect to the true ambiguity around the given point estimate, and knowledge forms a base for informed decision-making.
Yet, despite the range techniques improving this knowledge, there were also associated negative effects on intention and trust, as well as increased decisional uncertainty. While these cognitive impacts may not be a negative in all scenarios, it is important to consider the direction in which they are influenced when making decisions on how to present ambiguity.

5.4 Strengths and limitations

A primary strength is that a randomized web survey was used to collect the data. Similar study designs have been used in past research and have resulted in the construction of meaningful recommendations; it was therefore felt that a randomized web survey might allow for appropriate sample sizes and thus power the study to detect meaningful changes.85 Had a more traditional face-to-face survey been used, it would have been substantially more difficult to collect as many outcomes and therefore to identify the trends that were observed.

An additional strength emerged in the data analysis, as the visual range and textual range techniques resulted in remarkably similar results across all outcomes. Given that, conceptually, these techniques are very similar, it was expected that they would have similar effects. Confirmation of this expectation provided confidence that there was a degree of internal consistency in the survey.

Given that research in the health domain is limited with respect to how presenting ambiguity influences patient decisions, a further strength is that this topic has received little prior exploration. It is expected that, as precision medicine and clinical prediction models become more common, the issue of when, and how, to present ambiguity will become more urgent, and this work provides an effective baseline from which to start understanding the potential influence of presentation technique.

Due to the large number of arms and the high required sample sizes, convenience sampling was chosen as opposed to a representative sample of AF patients. This resulted in an under-representation of older individuals, possibly because older populations are less familiar with technology and a web survey was used. AF is typically diagnosed in older individuals, and therefore results from this convenience sample may not be generalizable to the AF population. Additionally, the target demographic was not actual AF patients, as this was an exploratory study and it was felt that, in the context of pilot work, this was unnecessary. Actual AF patients may have had different interpretations of the information, as well as experience with the setting and potential treatments, which could affect their responses.

Another potential limitation is the use of a prevalent (complex) decision in the hypothetical vignette. The majority of existing evidence that considers ambiguity exists within experimental designs that use simple risk/reward scenarios.38 The choice of taking a treatment for atrial fibrillation obviously requires consideration of multiple factors, including cost, risk of side effects, stroke risk reduction, mode of administration, and more. Many of these factors have ambiguity, and constructing a scenario in which every factor and its ambiguity are considered would be extremely complex. Yet, many of the survey respondents noted that it was difficult, if not impossible, not to consider these other attributes when answering questions regarding whether treatment should be taken.
While simple risk/reward scenarios may be easier to understand and may yield cleaner data, most decisions are significantly more complex, and thus a more complex scenario was chosen as it was felt to be more generalizable, though clearly not perfect. Since rational and optimal decision-making rely on objective thinking, cognitive biases may exist that influence the results and make interpretation problematic. For example, despite explaining the hypothetical nature of the decision to be made, many participants noted that they had experience with heart disease and warfarin, and therefore couldn’t disregard this experience completely.

Though a review of the literature was conducted to identify existing techniques used to present ambiguity, the search was restricted to examples within health research. There are likely other presentation techniques that could be found in other contexts, especially in the finance and weather literature, but it was felt that, for a preliminary study, it was reasonable to stay within the health context. Additionally, the techniques found were grouped by similarity so as to keep the number of arms to a minimum. It is possible that ungrouped techniques may perform significantly differently with respect to the collected outcomes, which could be verified in future research.

Another potential limitation was the questions used in the survey. Best efforts were made to only use validated questionnaires, but most of those identified tended to be context-specific or too long. It was therefore decided in some cases to omit some items, or to adjust wording so that the items made sense in the AF context. This may have had an impact on their validity and reliability.

Finally, initial proposals for this work were to investigate cross-technique effects, in order to establish whether some techniques were influencing the collected outcomes more than others. Due to the number of outcomes and techniques, powering the study was problematic, as prohibitively large sample sizes would have been needed to make cross-technique comparisons feasible, and the sample was collected on a pay-per-question basis. As a result of limited funds, it was only possible to detect which techniques had an effect, rather than to compare techniques on the magnitude of their effects. This limited the ability to interpret the results and make specific recommendations about optimal techniques, but it does identify priorities and candidate techniques that future research could investigate further.

5.5 Future research and recommendations

The previously mentioned limitations offer potential opportunities for future research directions.

The first potential direction is to run a similar study using a real scenario. Initially, it had been planned to use real confidence intervals and point estimates, but it was found that the clinical prediction models used in atrial fibrillation failed to include confidence intervals around the output risk estimates. As a result, it was necessary to derive these point estimates and ranges from published studies and meta-analyses, which proved to be challenging. It was found that, for high CHA2DS2-VASc scores, the confidence intervals were extremely wide, likely due to the smaller sample sizes in those subgroups. However, including CHA2DS2-VASc references in the vignette would have required explaining what a high CHA2DS2-VASc score means, and would have added detail to the informational section before the survey started, which might have increased upfront burden and drop-out rates.
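For reference, the CHA2DS2-VASc score mentioned above is a simple additive stroke-risk score. The sketch below shows the standard scoring rules purely for illustration; it was not part of the survey materials, and the example values are hypothetical.

def cha2ds2_vasc(age, female, chf, hypertension, diabetes, prior_stroke_tia, vascular_disease):
    """Standard CHA2DS2-VASc stroke-risk score (0-9) for atrial fibrillation."""
    score = 0
    score += 1 if chf else 0                               # C: congestive heart failure
    score += 1 if hypertension else 0                      # H: hypertension
    score += 2 if age >= 75 else (1 if age >= 65 else 0)   # A2/A: age 75+ (2) or 65-74 (1)
    score += 1 if diabetes else 0                          # D: diabetes mellitus
    score += 2 if prior_stroke_tia else 0                  # S2: prior stroke/TIA/thromboembolism
    score += 1 if vascular_disease else 0                  # V: vascular disease
    score += 1 if female else 0                            # Sc: sex category (female)
    return score

# Hypothetical example: a 72-year-old woman with hypertension and diabetes scores 4.
print(cha2ds2_vasc(age=72, female=True, chf=False, hypertension=True,
                   diabetes=True, prior_stroke_tia=False, vascular_disease=False))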
As iterations of feedback were incorporated into the survey development, it was consistently found to be too complex, and that it needed to be simplified. As a result, it was chosen to make the scenario entirely hypothetical so as to avoid having to explain various other factors, CPMs, and other information that would have been necessary to complete the survey. Yet, clinical decisions are rarely this simple, and as a result, 149it is vital that researchers begin testing these effects in more complex decision-making processes, as simple risk/reward scenarios discussed in Ellsberg’s work are vastly different to a decision to take oral anticoagulation.38Additional work needs to be conducted with respect to the techniques identified as being the most influential. Findings from this work indicate that some techniques, such as the visual and textual range techniques, impact various psychometric properties, including intention, trust, decisional uncertainty and knowledge more than others. Other techniques, such as the time treatment available and textual prefix technique, have no impact on those psychometric properties, but sometimes result in poorer knowledge. Future studies could investigate the impact of these effects further by comparing technique effects and establishing best practices with respect to presentation technique for ambiguity. This would involve testing these techniques in real decisions, and assessing whether participants make good decisions – decisions that are congruent with their personal values and based off good knowledge.150Table 20: Summary of effects on outcomesVisual rangeTextual rangeAnimated GIFNumber of peopleGradient rangeGRADEExpert agreementTextual prefixTime availableBehavioural intention to take OAC ↔ ↔ - - - - - - -Perception of risk magnitude - - - - - - - - -Worry - - - - - - - - -Trust ↓ ↓ ↓ ↓ ↓ ↓ ↓ - -Decisional uncertainty ↑ ↑ - - - - - - -Knowledge ✓✓ ✓✓ ✓✓ ✓ ✓ ✓ X ✓ XPreference X X X X X X ✓ X ✓↔ = 2-point change in 20%, unclear direction   ↓= significant decrease in outcome   ↑= significant increase in outcomeKnowledge: ✓✓= low variation around both mean responses   ✓ = low variation around one mean responseX = high variation around both mean responses  - no significant relationship 151References1. Merriam-Webster, I. Merriam-webster online dictionary. Springfield, MA: Author. Retrieved July 7, 2005 (2008).2. Aronson, J. Good prescribing: Benefits, hazards, harms, and risks. BMJ 352, 2016 (2016).3. Tversky, A. & Kahneman, D. The Framing of Decisions and the Psychology of Choice Author ( s ): Amos Tversky and Daniel Kahneman Published by : American Association for the Advancement of Science Stable URL : http://www.jstor.org/stable/1685855 REFERENCES Linked references are avail. Science (80-. ). 211, 453–458 (1981).4. Black, W. C., Nease, R. F. & Tosteson, A. N. A. Perceptions of Breast-Cancer Risk and Screening Effectiveness in Women Younger than 50 Years of Age. J. Natl. Cancer Inst. 87, 720–731 (1995).5. Lipkus, I. M. et al. General Performance on a Numeracy Scale. 21, 37–44 (2001).6. Gigerenzer, G., Hertwig, R., Van Den Broek, E., Fasolo, B. & Katsikopoulos, K. V. ‘A 30% chance of rain tomorrow’: How does the public understand probabilistic weather forecasts? Risk Anal. 25, 623–629 (2005).7. Beyth-Marom, R. How probable is probable? A numerical translation of verbal probability expressions. J. Forecast. 1, 257–269 (1982).8. Charles, C., Gafni, A. & Whelan, T. Shared decision-making in the medical encounter: What does it mean? 
(Or it takes, at least two to tango). Soc. Sci. Med. 44, 681–692 (1997).9. Elwyn, G. J., Edwards, A., Kinnersley, P. & Grol, R. Shared decision making and the concept of equipoise: The competences of involving patients in healthcare choices. Br. J. Gen. Pract. 50, 892–897 (2000).10. Epstein, R. M., Alper, B. S. & Quill, T. E. Participatory Decision Making. Jama 291, 2359–2366 (2004).11. Sheridan, S. L., Harris, R. P. & Woolf, S. H. Shared decision making about screening and chemoprevention: A suggested approach from the U.S. Preventive Services Task Force. Am. J. Prev. Med. 26, 56–66 (2004).12. Lee, E. O. & Emanuel, E. J. Shared Decision Making to Improve Care and Reduce Costs. N. Engl. J. Med. 6–8 (2012). doi:10.1056/NEJMp121460513. Emanuel, E. J. & Emanuel, L. L. Four models of the patient-physician relationship. J. Am. Med. Assoc. 267, 2221–2226 (1992).14. Politi, M. C., Han, P. K. J. & Col, N. F. Communicating the Uncertainty of Harms and Benefits of Medical Interventions. Med. Decis. Mak. 27, 681–695 (2007).15. Kantor, E. D., Rehm, C. D., Haas, J. S., Chan, A. T. & Giovannucci, E. L. 152Trends in Prescription Drug Use Among Adults in the United States From 1999-2012. Jama 314, 1818 (2015).16. Bansback, N., Harrison, M., Sadatsfavi, M., Stiggelbout, A. & Whitehurst, D. G. T. Attitude to health risk in the Canadian population: a cross-sectional survey. C. Open 4, E284–E291 (2016).17. van Osch S.M.C., S. A. . The development of the Health-Risk Attitude Scale. Constr. Heal. State Util. (2014).18. du Prel, J.-B., Hommel, G., Röhrig, B. & Blettner, M. Confidence interval or p-value?: part 4 of a series on evaluation of scientific publications. Dtsch. Arztebl. Int. 106, 335–9 (2009).19. Johnson, B. B. & Slovic, P. Presenting uncertainty in health risk assessment: initial studies of its effects on risk perception and trust. Risk Anal 15, 485–94 (1995).20. Han, P. K. J. et al. Laypersons’ responses to the communication of uncertainty regarding cancer risk estimates. Med. Decis. Mak. 29, 391–403 (2009).21. Han, P. K. J., Klein, W. M. P. & Arora, N. K. Varieties of Uncertainty in Health Care: A Conceptual Taxonomy. Med. Decis. Mak. 31, 828–838 (2011).22. Schwartz, L. M., Wolshin, S., Black, W. C. & Welch, H. G. The role of numeracy in understanding the venefit of screening mamography. Ann Intern Med 127, 966–72 (1997).23. Mckibbon, B. K. A. & Sc, B. Evidence-based Practice. 86, (1998).24. Spiegelhalter, D. J. Understanding uncertainty. 14, 196–197 (2013).25. Lloyd, A. J. The extent of patients’ understanding of the risk of treatments. Qual. Saf. Heal. Care 10, i14–i18 (2001).26. Benbassat, J., Pilpel, D. & Tidhar, M. Patients’ preferences for participation in clinical decision making: a review of published surveys. Behav. Med. 24, 81–88 (1998).27. Elwyn, G. et al. Developing a quality criteria framework for patient decision aids: online international Delphi consensus process. Bmj 333, 417 (2006).28. Stacey, D. et al. Decision aids for people facing health treatment or screening decisions ( Review ) Decision aids for people facing health treatment or screening decisions. Cochrane Database Syst. Rev. (2012). doi:10.1002/14651858.CD001431.pub5.www.cochranelibrary.com29. Ickenroth, M. et al. A single-blind randomised controlled trial of the effects of a web-based decision aid on self-testing for cholesterol and diabetes. study protocol. BMC Public Health 12, 6 (2012).30. Tran, V. T., Kisseleva-Romanova, E., Rigal, L. & Falcoff, H. 
Impact of a printed decision aid on patients’ intention to undergo prostate cancer 153screening: A multicentre, pragmatic randomised controlled trial in primary care. Br. J. Gen. Pract. 65, e295–e304 (2015).31. Han, P. Conceptual, Methodological, and Ethical Problems in Communicating Uncertainty in Clinical Evidence. Med Care Res Rev. 70, 14S–36S (2013).32. Chouchane, L., Mamtani, R., Dallol, A. & Sheikh, J. Personalized medicine: a patient - centered paradigm. J. Transl. Med. 9, 206 (2011).33. Hunter, D. J. Uncertainty in the Era of Precision Medicine. N. Engl. J. Med. 375, 711–713 (2016).34. Aguilar, M. I. & Hart, R. Oral anticoagulants for preventing stroke in patients with non-valvular atrial fibrillation and no previous history of stroke or transient ischemic attacks. (2005). doi:10.1002/14651858.CD001927.pub235. Hart, R. G., Pearce, L. A. & Aguilar, M. I. Annals of Internal Medicine Review Meta-analysis : Antithrombotic Therapy to Prevent Stroke in Patients Who Have Nonvalvular Atrial Fibrillation. (2014).36. Hicks, T., Stewart, F. & Eisinga, A. NOACs versus warfarin for stroke prevention in patients with AF: a systematic review and meta-analysis. Open Hear. 3, e000279 (2016).37. Solomon, D. H. et al. Comparative cancer risk associated with methotrexate, other non-biologic and biologic disease-modifying anti-rheumatic drugs. Semin. Arthritis Rheum. 43, 489–497 (2014).38. Ellsberg, D. Risk, Ambiguity, and the Savage Axioms. The Quarterly Journal of Economics 75, 643 (1961).39. Kirsch, I. S., Jungeblut, A., Jenkins, L. & Kolstad, A. Adult Literacy in America: A First Look at the Results of the National Adult Literacy Survey. Natl. Cent. Educ. Stat. 178 (2002). doi:NCES 1993-27540. Hoffrage, U. & Gigerenzer, G. Using natural frequencies to improve diagnostic inferences. Acad Med 73, 538–540 (1998).41. Schapira, M. M., Nattinger, A. B. & McHorney, C. A. Frequency or probability?  A qualitative study of risk communications formats used in health care. Med. Decis. Mak. 21, 459 (2001).42. Fagerlin, A., Zikmund-Fisher, B. J. & Ubel, P. A. Helping patients decide: Ten steps to better risk communication. J. Natl. Cancer Inst. 103, 1436–1443 (2011).43. Grimes, D. A. & Snively, G. R. Patients’ understanding of medical risks: Implications for genetic counseling. Obstet. Gynecol. 93, 910–914 (1999).44. Galesic, M., Garcia-Retamero, R. & Gigerenzer, G. Using Icon Arrays to Communicate Medical Risks: Overcoming Low Numeracy. Heal. Psychol. 28, 210–216 (2009).15445. Education, U. D. of. US. Department of Education, National Center for Education Statistics. Digest of Education Statistics 1 (2016). Available at: http://nces.ed.gov/fastfacts/display.asp?id=80. 46. Trevena, L. J. et al. Presenting quantitative information about decision outcomes: a risk communication primer for patient decision aid developers. BMC Med. Inform. Decis. Mak. 13 Suppl 2, S7 (2013).47. Lopez-Lopez, J. A. et al. Network meta-analysis of oral anticoagulants for primary prevention, treatment and secondary prevention of venous thromboembolic disease, and for prevention of stroke in atrial fibrillation. Value in health 1, A374 (2015).48. Aryal, M. R. et al. Meta-analysis of efficacy and safety of rivaroxaban compared with warfarin or dabigatran in patients undergoing catheter ablation for atrial fibrillation. Am. J. Cardiol. 114, 577–582 (2014).49. Han, P. K. J. et al. Communication of Uncertainty Regarding Individualized Cancer Risk Estimates: Effects and Influential Factors. Med. Decis. Mak. 31, 354–366 (2011).50. Viscusi, W. 
K., Magat, W. A. & Huber, J. Communication of ambiguous risk information. Theory Decis. 31, 159–173 (1991).51. Bansback, N. et al. Communicating Uncertainty in Benefits and Harms: A Review of Patient Decision Support Interventions. Patient - Patient-Centered Outcomes Res. 10, 311–319 (2016).52. Matthews, E. J. et al. Efficient literature searching in diffuse topics: lessons from a systematic review of research on communicating risk to patients in primary care. Health Libr. Rev. 16, 112–120 (1999).53. Cohen, J. A Coefficient of Agreement for Nominal Scales. Educ. Psychol. Meas. 20, 37–46 (1960).54. Gurmankin, A. D., Baron, J. & Armstrong, K. The Effect of Numerical Statements of Risk on Trust and Comfort with Hypothetical. (2004). doi:10.1177/0272989X0426548255. Grant, F. C., Laupacis, A., O’Connor, A. M., Rubens, F. & Robblee, J. Evaluation of a decision aid for patients considering autologous blood donation before open-heart surgery. CMAJ 164, 1139–44 (2001).56. Politi, M. C., Lewis, C. L. & Frosch, D. L. Supporting Shared Decisions When Clinical Evidence Is Low. Med. Care Res. Rev. 70, 113S–128S (2013).57. Cohn, L. D., Schydlower, M., Foley, J. & Copeland, R. L. Adolescents’ misinterpretation of health risk probability expressions. Pediatrics 95, 713–6 (1995).58. Buetow, S., Cantrill, J. & Sibbald, B. Risk communication in the patient-health professional relationship. Heal. Care Anal. 6, 261–270 (1998).59. Balshem, H. et al. GRADE guidelines: 3. Rating the quality of evidence. J. 155Clin. Epidemiol. 64, 401–406 (2011).60. Bansback, N., Harrison, M. & Marra, C. Does Introducing Imprecision around Probabilities for Benefit and Harm Influence the Way People Value Treatments? Med. Decis. Mak. 36, 490–502 (2016).61. Muscatello, D. J., Searles, A., Macdonald, R. & Jorm, L. Communicating population health statistics through graphs: A randomised controlled trial of graph design interventions. BMC Med. 4, (2006).62. Agoritsas, T. et al. Decision aids that really promote shared decision making: the pace quickens. Bmj 350, g7624–g7624 (2015).63. Harrison, M., Marra, C., Shojania, K. & Bansback, N. Societal preferences for rheumatoid arthritis treatments: evidence from a discrete choice experiment. Rheumatology (Oxford). 54, 1816–1825 (2015).64. Harrison, M., Marra, C. A. & Bansback, N. Preferences for ‘New’ Treatments Diminish in the Face of Ambiguity. Heal. Econ. (United Kingdom) 752, 743–752 (2016).65. Kirkegaard, P., Risor, M. B., Edwards, A., Junge, A. G. & Thomsen, J. L. Speaking of risk, managing uncertainty: Decision-making about cholesterol-reducing treatment in general practice. Qual. Prim. Care 20, 245–252 (2012).66. Roberts, M. C. et al. Patient-Centered Communication for Discussing Oncotype DX Testing. Cancer Invest. 34, 205–212 (2016).67. Longman, T., Turner, R. M., King, M. & McCaffery, K. J. The effects of communicating uncertainty in quantitative health risk estimates. Patient Educ. Couns. 89, 252–259 (2012).68. Price, M., Cameron, R. & Butow, P. Communicating risk information: The influence of graphical display format on quantitative information perception-Accuracy, comprehension and preferences. Patient Educ. Couns. 69, 121–128 (2007).69. Hawley, S. T. et al. The impact of the format of graphical presentation on health-related knowledge and treatment choices. Patient Educ. Couns. 73, 448–455 (2008).70. O’Doherty, K. & Suthers, G. K. Risky communication: Pitfalls in counseling about risk, and how to avoid them. J. Genet. Couns. 16, 409–417 (2007).71. Correll, M. & Gleicher, M. 
Error bars considered harmful: Exploring alternate encodings for mean and error. IEEE Trans. Vis. Comput. Graph. 20, 2142–2151 (2014).72. Smith, J. A., Michie, S., Stephenson, M. & Quarrell, O. Risk Perception and Decision-making Processes in Candidates for Genetic Testing for Huntington’s Disease: An Interpretative Phenomenological Analysis. J. Heal. Psychol. London 7, 1359–1053 (2002).15673. Hivert, M.-F., Warner, A. S., Shrader, P., Grant, R. W. & Meigs, J. B. Diabetes Risk Perception and Intention to Adopt Healthy Lifest yles Among Primary Care Patients. Diabetes Care 32, 1820–1822 (2009).74. Spiegelhalter, D. & Pearson, M. 2845 ways to spin the Risk. Understanding Uncertainty (2009). Available at: https://understandinguncertainty.org/node/233. 75. Han, P. K. J. et al. Representing randomness in the communication of individualized cancer risk estimates: Effects on cancer risk perceptions, worry, and subjective uncertainty about risk. Patient Educ. Couns. 86, 106–113 (2012).76. Buhrmester, M. et al. Amazon’s Mechanical Turk: A New Source of Inexpensive, Yet High-Quality,. Perspect. Psychol. Sci. 6, 3–5 (2011).77. Shapiro, D. N., Chandler, J. & Mueller, P. A. Using Mechanical Turk to Study Clinical Populations. Clin. Psychol. Sci. 1, 213–220 (2013).78. Eysenbach, G. Improving the quality of web surveys: The Checklist for Reporting Results of Internet E-Surveys (CHERRIES). J. Med. Internet Res. 6, e34 (2004).79. Ghijben, P., Lancsar, E. & Zavarsek, S. Preferences for Oral Anticoagulants in Atrial Fibrillation: a Best–Best Discrete Choice Experiment. Pharmacoeconomics 32, 1115–1127 (2014).80. Han, P. K. J., Reeve, B. B., Moser, R. P. & Klein, W. M. P. Aversion to ambiguity regarding medical tests and treatments: measurement, prevalence, and relationship to sociodemographic factors. J. Health Commun. 14, 556–72 (2009).81. Zikmund-Fisher, B. J., Smith, D. M., Ubel, P. A. & Fagerlin, A. Validation of the subjective numeracy scale: Effects of low numeracy on comprehension of risk communications and utility elicitations. Med. Decis. Mak. 27, 663–671 (2007).82. Fagerlin, A. et al. Measuring numeracy without a math test: Development of the subjective numeracy scale. Med. Decis. Mak. 27, 672–680 (2007).83. McNaughton, C. D., Cavanaugh, K. L., Kripalani, S., Rothman, R. L. & Wallston, K. A. Validation of a Short, 3-Item Version of the Subjective Numeracy Scale. Med. Decis. Mak. 35, 932–936 (2015).84. Sheeran, P. Intention — Behavior Relations : A Conceptual and Empirical Review. Eur. Rev. Soc. Psychol. 12, 1–36 (2002).85. Sheridan, S. L. et al. A Comparative Effectiveness Trial of Alternate Formats for Presenting Benefits and Harms Information for Low-Value Screening Services. JAMA Intern. Med. 176, 31 (2016).86. Johnston, M. et al. Constructing questionnaires based on the theory of planned behaviour. … Services Researchers (2004).15787. Tan, C. L. H., Gan, V. B. Y., Saleem, F. & Hassali, M. A. A. Building intentions with the theory of planned behaviour: The mediating role of knowledge and expectations in implementing new pharmaceutical services in Malaysia. Pharm. Pract. (Granada). 14, 1–8 (2016).88. Nelson, D. et al. The Health Information National Trends Survey (HINTS): Development, Design, and Dissemination. J. Health Commun. 9, 443–460 (2004).89. O’Connor, A. M. Validation of a Decisional Conflict Scale. Med. Decis. Mak. 15, 25–30 (1995).90. De Achaval, S., Fraenkel, L., Volk, R. J., Cox, V. & Suarez-Lmazor, M. E. 
Impact of educational and patient decision aids on decisional conflict associated with total knee arthroplasty. Arthritis Care Res. 64, 229–237 (2012).91. Kaplan, A. L. et al. Decisional conflict in economically disadvantaged men with newly diagnosed prostate cancer: Baseline results from a shared decision-making trial. Cancer 120, 2721–2727 (2014).92. Metzger, R. L. & Miller, M. L. Worry Changes Decision Making:The Effect Of Negative Thoughts On Cognitive Processing. Journal of Clinical Psychology 46, 5416–5421 (1990).93. Appelman, A. & Sundar, S. S. Measuring Message Credibility. Journal. Mass Commun. Q. 93, 59–79 (2016).94. Ajzen, I. The theory of planned behavior. Orgnizational Behav. Hum. Decis. Process. 50, 179–211 (1991).95. Galesic, M. & Garcia-Retamero, R. Statistical numeracy for health. Transparent Commun. Heal. Risks Overcoming Cult. Differ. 170, 15–28 (2013).96. Althouse, A. D. Adjust for Multiple Comparisons? It’s Not That Simple. Ann. Thorac. Surg. 101, 1644–1645 (2016).97. United States (Census Bureau). Demographic Trends. GPO (2010).98. Mukaka, M. M. Statistics corner: A guide to appropriate use of correlation coefficient in medical research. Malawi Med. J. 24, 69–71 (2012).99. Cohen, J. Statistical power analysis for the behavioral sciences. Statistical Power Analysis for the Behavioral Sciences 2nd, 567 (1988).100. Han, P. K. J. et al. Predictors of perceived ambiguity about cancer prevention recommendations: sociodemographic factors and mass media exposures. Health Commun. 24, 764–772 (2009).101. Kuhn, K. M. Communicating uncertainty: Framing effects on responses to vague probabilities. Organ. Behav. Hum. Decis. Process. 71, 55–83 (1997).102. Tufte, E. R. The Visual Display of Quantitative Information. Technometrics 2nd, 197 (2001).158Appendix A: CHERRIES ChecklistItem Category Checklist Item ExplanationDesignDescribe survey designDescribe target population, sample frame. Is the sample a convenience sample? (In “open” surveys this is most likely.)IRB (Institutional Review Board) approval and informed consent processIRB approval Mention whether the study has been approved by an IRB.Informed consent Describe the informed consent process. Where were the participants told the length of time of the survey, which data were stored and where and for how long, who the investigator was, and the purpose of the study?Data protection If any personal information was collected or stored, describe what mechanisms were used to protect unauthorized access.Development and pre-testingDevelopment and testingState how the survey was developed, including whether the usability and technical functionality of the electronic questionnaire had been tested before fielding the questionnaire.Recruitment process and description of the sample having access to the questionnaireOpen survey versus closed surveyAn “open survey” is a survey open for each visitor of a site, while a closed survey is only open to a sample which the investigator knows (password-protected survey).Contact mode Indicate whether or not the initial contact with the potential participants was made on the Internet. (Investigators may also send out questionnaires by mail and allow for Web-based data entry.)159Item Category Checklist Item ExplanationAdvertising the surveyHow/where was the survey announced or advertised? Some examples are offline media (newspapers), or online (mailing lists – If yes, which ones?) or banner ads (Where were these banner ads posted and what did they look like?). 
It is important to know the wording of the announcement as it will heavily influence who chooses to participate. Ideally the survey announcement should be published as an appendix.Survey administrationWeb/E-mail State the type of e-survey (eg, one posted on a Web site, or one sent out through e-mail). If it is an e-mail survey, were the responses entered manually into a database, or was there an automatic method for capturing responses?Context Describe the Web site (for mailing list/newsgroup) in which the survey was posted. What is the Web site about, who is visiting it, what are visitors normally looking for? Discuss to what degree the content of the Web site could pre-select the sample or influence the results. For example, a survey about vaccination on a anti-immunization Web site will have different results from a Web survey conducted on a government Web siteMandatory/voluntary Was it a mandatory survey to be filled in by every visitor who wanted to enter the Web site, or was it a voluntary survey?Incentives Were any incentives offered (eg, monetary, prizes, or non-monetary incentives such as an offer to provide the survey results)?Time/Date In what timeframe were the data collected?Randomization of items or questionnairesTo prevent biases items can be randomized or alternated.Adaptive questioning Use adaptive questioning (certain items, or only conditionally displayed based on responses to other items) to reduce number and complexity of the questions.Number of Items What was the number of questionnaire items per page? The number of items is an important factor for the completion rate.Number of screens Over how many pages was the questionnaire 160Item Category Checklist Item Explanation(pages) distributed? The number of items is an important factor for the completion rate.Completeness check It is technically possible to do consistency or completeness checks before the questionnaire is submitted. Was this done, and if “yes”, how (usually JAVAScript)? An alternative is to check for completeness after the questionnaire has been submitted (and highlight mandatory items). If this has been done, it should be reported. All items should provide a non-response option such as “not applicable” or “rather not say”, and selection of one response option should be enforced.Review step State whether respondents were able to review and change their answers (eg, through a Back button or a Review step which displays a summary of the responses and asks the respondents if they are correct).Response ratesUnique site visitor If you provide view rates or participation rates, you need to define how you determined a unique visitor. There are different techniques available, based on IP addresses or cookies or both.View rate (Ratio of unique survey visitors/unique site visitors)Requires counting unique visitors to the first page of the survey, divided by the number of unique site visitors (not page views!). It is not unusual to have view rates of less than 0.1 % if the survey is voluntary.Participation rate (Ratio of unique visitors who agreed to participate/unique first survey page visitors)Count the unique number of people who filled in the first survey page (or agreed to participate, for example by checking a checkbox), divided by visitors who visit the first page of the survey (or the informed consents page, if present). 
This can also be called “recruitment” rate.Completion rate (Ratio of users who finished the survey/users who agreed to participate)The number of people submitting the last questionnaire page, divided by the number of people who agreed to participate (or submitted the first survey page). This is only relevant if there is a separate “informed consent” page or if the survey goes over several pages. This is a measure for attrition. Note that “completion” can involve leaving questionnaire items blank. This is not a measure for how completely questionnaires were filled in. (If you need a measure for this, use the word “completeness rate”.)Preventing multiple 161Item Category Checklist Item Explanationentries from the same individualCookies used Indicate whether cookies were used to assign a unique user identifier to each client computer. If so, mention the page on which the cookie was set and read, and how long the cookie was valid. Were duplicate entries avoided by preventing users access to the survey twice; or were duplicate database entries having the same user ID eliminated before analysis? In the latter case, which entries were kept for analysis (eg, the first entry or the most recent)?IP check     Indicate whether the IP address of the client computer was used to identify potential duplicate entries from the same user. If so, mention the period of time for which no two entries from the same IP address were allowed (eg, 24 hours). Were duplicate entries avoided by preventing users with the same IP address access to the survey twice; or were duplicate database entries having the same IP address within a given period of time eliminated before analysis? If the latter, which entries were kept for analysis (eg, the first entry or the most recent)?Log file analysis Indicate whether other techniques to analyze the log file for identification of multiple entries were used. If so, please describe.Registration In “closed” (non-open) surveys, users need to login first and it is easier to prevent duplicate entries from the same user. Describe how this was done. For example, was the survey never displayed a second time once the user had filled it in, or was the username stored together with the survey results and later eliminated? If the latter, which entries were kept for analysis (eg, the first entry or the most recent)?AnalysisHandling of incomplete questionnairesWere only completed questionnaires analyzed? Were questionnaires which terminated early (where, for example, users did not go through all questionnaire pages) also analyzed?Questionnaires submitted with an atypical timestampSome investigators may measure the time people needed to fill in a questionnaire and exclude questionnaires that were submitted too soon. 
Analysis

Handling of incomplete questionnaires: Were only completed questionnaires analyzed? Were questionnaires which terminated early (where, for example, users did not go through all questionnaire pages) also analyzed?

Questionnaires submitted with an atypical timestamp: Some investigators may measure the time people needed to fill in a questionnaire and exclude questionnaires that were submitted too soon. Specify the timeframe that was used as a cut-off point, and describe how this point was determined.

Statistical correction: Indicate whether any methods such as weighting of items or propensity scores have been used to adjust for the non-representative sample; if so, please describe the methods.
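As a minimal sketch of the weighting approach named in the last item, the example below derives post-stratification weights by dividing assumed population proportions by the proportions observed in a hypothetical web sample. The strata, proportions, and outcome values are illustrative assumptions only and are not taken from this study.

# Illustrative only: strata and proportions are hypothetical.
population_share = {"age_18_25": 0.30, "age_26_plus": 0.70}  # target population
sample_share     = {"age_18_25": 0.55, "age_26_plus": 0.45}  # observed web sample

# Post-stratification weight = population share / sample share for each stratum.
weights = {stratum: population_share[stratum] / sample_share[stratum]
           for stratum in population_share}

respondents = [
    {"id": 1, "stratum": "age_18_25", "intention": 4},
    {"id": 2, "stratum": "age_26_plus", "intention": 2},
    {"id": 3, "stratum": "age_18_25", "intention": 5},
]

# Weighted mean of an outcome, so over-represented strata count for less.
numerator = sum(weights[r["stratum"]] * r["intention"] for r in respondents)
denominator = sum(weights[r["stratum"]] for r in respondents)
print(round(numerator / denominator, 2))  # 3.03

Propensity-score adjustment follows the same logic, except that the weights are estimated from a model predicting membership in the web sample rather than taken from known population shares.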
Appendix B: Ambiguous presentation techniques from survey

Appendix C: Pilot survey intention question

Appendix D: Correlation coefficients between actual intention change and baseline intention across all techniques

Presentation technique    Coefficient between intention change and baseline
Visual Range              0.5018
Textual Prefix            0.3085
GRADE                     0.3553
Time Available            0.3373
Expert agreement          0.4052
Animated GIF              0.4783
Number of People          0.1999
Gradient Range            0.2753
Textual Range             0.5397

Appendix E: Logistic regression univariate test results (no technique in model)

Sociodemographic variable                     OR      p
Sex (female)                                  0.77    0.186
Education (undergraduate degree or higher)    1.59    0.024
Age (>=26)                                    0.77    0.207
Subjective numeracy (>4)                      1.49    0.122
Ambiguity aversion (>3.2)                     0.81    0.296

Appendix F: Omnibus chi2 test results for intention

Interaction                        chi2    p > chi2
sex*technique                      6.10    0.636
education*technique                2.66    0.915
age*technique                      7.65    0.469
subjective numeracy*technique      7.29    0.505
ambiguity aversion*technique       3.50    0.899

Appendix G: Logistic analyses for directional decrease in intention

[Figure: Adjusted predicted probabilities for a 2-point decrease in intention. Y-axis: predicted proportion decreasing by 2 points (0 to 1); x-axis: presentation technique (Visual Range, Textual Range, Animated GIF, GRADE, Gradient Range, Number of People, Expert agreement, Textual Prefix, Time Available as reference).]

Appendix H: Linear regression univariate test results (no technique in model)

Outcome columns, in order: Intention; Risk perception; Worry; Trust; Decisional uncertainty.

Sex (female)
  Coef   0.06   0.09   0.07  -0.07   0.05
  p      0.572  0.075  0.190  0.264  0.473
Education (undergraduate degree or higher)
  Coef  -0.07   0.03  -0.03  -0.07   0.12
  p      0.11   0.560  0.590  0.266  0.102
Age (>=26)
  Coef   0.300  0.00  -0.07   0.15   0.00
  p      0.954  0.992  0.179  0.012  0.981
Subjective numeracy (>4)
  Coef  -0.13   0.07   0.05   0.09  -0.14
  p      0.331  0.258  0.460  0.225  0.115
Ambiguity aversion (>3.2)
  Coef   0.16   0.11  -0.01   0.14   0.00
  p      0.141  0.026  0.834  0.025  0.970

Appendix I: Omnibus F-test results for linear regressions

Outcome columns, in order: Intention; Risk perception; Worry; Trust; Decisional uncertainty.

sex*technique
  F      0.46   1.22   0.53   0.25   0.47
  p > F  0.881  0.284  0.834  0.980  0.875
education*technique
  F      1.01   1.63   1.58   1.30   0.50
  p > F  0.427  0.114  0.129  0.241  0.856
age*technique
  F      0.33   0.67   1.74   0.85   0.95
  p > F  0.954  0.715  0.086  0.557  0.475
subjective numeracy*technique
  F      0.72   0.78   0.98   1.06   0.89
  p > F  0.671  0.623  0.454  0.393  0.525
ambiguity aversion*technique
  F      0.34   0.70   0.89   0.64   1.13
  p > F  0.950  0.691  0.523  0.748  0.342

Appendix J: 1, 2 and 3 point proportion changes across secondary outcomes

Values are proportion (SD). Columns, in order: Overall (n=576); Visual Range (n=78); Textual Prefix (n=60); GRADE (n=60); Time Treatment Available (n=67); Experts Disagree (n=55); Animated GIF (n=61); Number Of People (n=62); Gradient Range (n=68); Textual Range (n=65).

Magnitude of risk perception (1-5)
  1-point change: 0.22 (0.41); 0.22 (0.42); 0.22 (0.42); 0.20 (0.40); 0.15 (0.36); 0.15 (0.36); 0.18 (0.39); 0.29 (0.46); 0.29 (0.46); 0.28 (0.45)
  2-point change: 0.04 (0.20); 0.03 (0.16); 0.03 (0.18); 0.03 (0.18); 0.03 (0.17); 0.00 (0.00); 0.05 (0.22); 0.06 (0.25); 0.06 (0.24); 0.08 (0.27)
  3-point change: 0.01 (0.07); 0.00 (0.00); 0.00 (0.00); 0.03 (0.18); 0.00 (0.00); 0.00 (0.00); 0.00 (0.00); 0.00 (0.00); 0.00 (0.00); 0.02 (0.12)

Worry (1-5)
  1-point change: 0.28 (0.45); 0.32 (0.47); 0.28 (0.45); 0.27 (0.45); 0.18 (0.39); 0.13 (0.34); 0.28 (0.45); 0.35 (0.48); 0.34 (0.48); 0.32 (0.47)
  2-point change: 0.03 (0.18); 0.03 (0.16); 0.03 (0.18); 0.05 (0.22); 0.00 (0.00); 0.00 (0.00); 0.02 (0.13); 0.03 (0.18); 0.07 (0.26); 0.06 (0.24)
  3-point change: 0.00 (0.04); 0.00 (0.00); 0.00 (0.00); 0.00 (0.00); 0.00 (0.00); 0.00 (0.00); 0.00 (0.00); 0.00 (0.00); 0.01 (0.12); 0.00 (0.00)

Trust (1-5)
  1-point change: 0.22 (0.41); 0.21 (0.41); 0.07 (0.25); 0.22 (0.42); 0.06 (0.24); 0.16 (0.37); 0.33 (0.47); 0.29 (0.46); 0.32 (0.47); 0.28 (0.45)
  2-point change: 0.05 (0.22); 0.08 (0.27); 0.00 (0.00); 0.05 (0.22); 0.00 (0.00); 0.04 (0.19); 0.15 (0.36); 0.08 (0.27); 0.04 (0.21); 0.03 (0.17)
  3-point change: 0.01 (0.11); 0.01 (0.11); 0.00 (0.00); 0.03 (0.18); 0.00 (0.00); 0.02 (0.13); 0.03 (0.18); 0.00 (0.00); 0.00 (0.00); 0.02 (0.12)

Decisional uncertainty (1-5)
  1-point change: 0.33 (0.47); 0.41 (0.50); 0.15 (0.36); 0.32 (0.47); 0.15 (0.36); 0.22 (0.42); 0.39 (0.49); 0.27 (0.45); 0.41 (0.50); 0.57 (0.50)
  2-point change: 0.12 (0.32); 0.21 (0.41); 0.02 (0.13); 0.10 (0.30); 0.03 (0.17); 0.09 (0.29); 0.11 (0.32); 0.08 (0.27); 0.09 (0.29); 0.29 (0.46)
  3-point change: 0.02 (0.15); 0.06 (0.25); 0.02 (0.13); 0.00 (0.00); 0.00 (0.00); 0.02 (0.13); 0.02 (0.13); 0.00 (0.00); 0.00 (0.00); 0.09 (0.29)

Appendix K: Table of responses for "low" knowledge question. Shaded row is the "correct" answer.

Counts by technique, in order: Visual Range; Textual Prefix; GRADE; Time Available; Expert agreement; Animated GIF; Number of People; Gradient Range; Textual Range.

Low response 0:   6; 9; 12; 15; 10; 5; 13; 3; 3
Low response 1:   5; 7; 6; 7; 4; 0; 8; 1; 5
Low response 2:   2; 3; 4; 3; 6; 3; 7; 2; 2
Low response 3:   62; 3; 4; 3; 1; 26; 1; 58; 50
Low response 4:   1; 10; 6; 10; 10; 11; 12; 1; 1
Low response 5:   0; 7; 10; 6; 12; 8; 4; 1; 1
Low response 6:   0; 5; 9; 9; 5; 2; 6; 0; 0
Low response 7:   0; 2; 1; 0; 1; 1; 0; 0; 1
Low response 8:   1; 12; 6; 9; 5; 4; 11; 1; 0
Low response 10:  0; 1; 0; 1; 1; 0; 0; 0; 0
Low response 13:  0; 0; 0; 0; 0; 0; 0; 0; 1
Low response 20:  0; 0; 1; 0; 0; 0; 0; 0; 0
Low response 87:  0; 0; 0; 0; 0; 0; 0; 1; 0
Low response 88:  1; 0; 0; 0; 0; 0; 0; 0; 0
Low response 90:  0; 1; 0; 0; 0; 0; 0; 0; 0
Low response 92:  0; 0; 1; 4; 0; 1; 0; 0; 1

Appendix L: Table of responses for "high" knowledge question. Shaded row is the "correct" answer.

Counts by technique, in order: Visual Range; Textual Prefix; GRADE; Time Available; Expert agreement; Animated GIF; Number of People; Gradient Range; Textual Range.

High response 1:    1; 0; 0; 0; 0; 0; 2; 0; 0
High response 2:    0; 0; 0; 1; 0; 0; 1; 0; 0
High response 3:    1; 0; 0; 0; 0; 0; 0; 0; 1
High response 4:    0; 0; 0; 0; 0; 0; 2; 1; 0
High response 5:    1; 0; 2; 0; 0; 0; 2; 1; 0
High response 6:    0; 0; 0; 2; 0; 1; 1; 0; 1
High response 7:    1; 1; 0; 0; 0; 1; 0; 0; 0
High response 8:    1; 26; 20; 31; 10; 6; 27; 1; 5
High response 9:    0; 2; 2; 0; 0; 0; 1; 0; 1
High response 10:   6; 12; 7; 11; 15; 3; 10; 1; 3
High response 11:   0; 0; 1; 1; 1; 1; 0; 1; 0
High response 12:   1; 6; 7; 6; 8; 5; 10; 0; 0
High response 13:   61; 0; 1; 0; 0; 33; 0; 63; 49
High response 14:   0; 1; 0; 2; 0; 1; 0; 0; 1
High response 15:   3; 5; 9; 4; 8; 7; 2; 0; 2
High response 16:   0; 0; 2; 3; 1; 0; 2; 0; 1
High response 18:   0; 0; 1; 0; 0; 0; 0; 0; 1
High response 20:   1; 4; 4; 0; 7; 1; 1; 0; 0
High response 25:   0; 1; 1; 0; 2; 1; 0; 0; 0
High response 30:   0; 1; 0; 2; 0; 0; 0; 0; 0
High response 34:   0; 0; 0; 0; 0; 1; 0; 0; 0
High response 50:   0; 0; 1; 0; 0; 0; 0; 0; 0
High response 87:   1; 0; 0; 0; 0; 0; 0; 0; 0
High response 90:   0; 0; 0; 0; 1; 0; 0; 0; 0
High response 94:   0; 1; 0; 0; 0; 0; 0; 0; 0
High response 100:  0; 0; 2; 4; 2; 0; 1; 0; 0
