Visual Techniques for Exploring Alternatives and Preferences in Group Preferential Choice

by

Emily Hindalong

M.Sc. Bioinformatics, The University of British Columbia, 2015
B.Sc. Cognitive Systems, The University of British Columbia, 2011

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Computer Science)

The University of British Columbia (Vancouver)

January 2018

© Emily Hindalong, 2018

Abstract

Group Preferential Choice is when two or more individuals must collectively choose among a competing set of alternatives based on their individual preferences. In these situations, it can be helpful for decision makers to visually model and compare their preferences in order to better understand each other's points of view. Although a number of tools for preference modelling and inspection exist, none are based on a comprehensive understanding of the demands of Group Preferential Choice in particular.

The goal of our work is to understand these demands and explore the space of possible visualizations to support them. We make progress toward this goal in three steps. First, we characterize the scope of Group Preferential Choice by examining a diverse set of real-world scenarios. In particular, we identify sources of variation in preference models, goals, and contexts. Second, we produce a detailed model of abstract tasks to support the goals identified in the first step. Finally, we analytically evaluate various designs with respect to these tasks and conclude with recommendations for different classes of users. We believe that these contributions will help designers produce more effective visual support tools for Group Preferential Choice.

Lay Summary

Sometimes a group of people must make a choice together. For instance, a board of directors may need to agree on a new office location. This can be challenging if there are multiple factors involved, or if the decision makers disagree about what is important. In these situations, effective communication is key.

One way to improve communication is to have each decision maker show his or her preferences graphically. For instance, they might use a bar chart to communicate their ratings of potential office locations. In this work, we try to understand what questions a graphic needs to be able to answer to support effective group decision making. Then, we present a variety of graphical options and discuss how well each one answers these questions.

Preface

Credit for the overarching vision of the project goes to my supervisor, Giuseppe Carenini - it was his idea to develop a design space of visualizations to support Group Preferential Choice. He also came up with the idea of a Preference Model Taxonomy (Section 3.1) and contributed substantially to its development.

Hooman Shariati observed and conducted the interviews for the XpertsCatch case (Section 3.2.6) and the department meeting portion of the Faculty Hiring case (Section 3.2.2).

Soheil Kianzad observed and conducted the interviews for the Gift case (Section 3.2.7).

All other original work is my own. There are no publications based on this work at this time.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
2 Background and Related Work
  2.1 Decision Theoretic Foundations
    2.1.1 Utility Theory
    2.1.2 Multiple Criteria Decision Making
    2.1.3 Group Multi-Attribute Decision Making
  2.2 Visual and Interactive Techniques For MADM and Related Data
    2.2.1 Group ValueCharts
    2.2.2 ConsensUs
    2.2.3 Web-HIPRE
    2.2.4 LineUp
    2.2.5 WeightLifter
    2.2.6 SurveyVisualizer
    2.2.7 DCPAIRS
    2.2.8 QStack
    2.2.9 Lessons from Evaluations
  2.3 Design Space Analyses
3 Characterizing Group Preferential Choice
  3.1 Preliminary Data Model for Group Preferential Choice
    3.1.1 Preference Model Taxonomy
  3.2 Seven Real-World Scenarios
    3.2.1 Best Paper at a Conference
    3.2.2 Faculty Hiring
    3.2.3 Campbell River Watershed
    3.2.4 MJS77 Project
    3.2.5 Nuclear Crisis Management
    3.2.6 Technology Selection at XpertsCatch
    3.2.7 Buying a Gift for a Colleague
  3.3 Data Model Revisions
    3.3.1 Participant Roles
    3.3.2 Criteria
    3.3.3 Evaluator Groups and Weights
    3.3.4 Preference Model Taxonomy
  3.4 Revised Data Model for Group Preferential Choice
    3.4.1 Preference Model Taxonomy
  3.5 Summary of Preference Synthesis Goals
  3.6 Summary of Contextual Features and Scale
  3.7 Conclusion
4 Data and Task Abstraction for Preference Synthesis in Group Preferential Choice
  4.1 Data Abstraction
  4.2 Task Abstraction
    4.2.1 High-Level Task Abstraction
    4.2.2 Low-Level Task Abstraction
5 A Design Space of Visualizations to Support Preference Synthesis in Group Preferential Choice
  5.1 Static Design Aspect
    5.1.1 Major Idioms
    5.1.2 Task-based Evaluation of Encodings
  5.2 Dynamic Design Aspect
    5.2.1 View Transformations
    5.2.2 Data Transformations
  5.3 Composite Design Aspect
    5.3.1 General Recommendations
    5.3.2 Class C: Casual Users
    5.3.3 Class B: Professional Users
    5.3.4 Class A: Specialized Users
6 Conclusion
  6.1 Summary of Contributions
    6.1.1 Characterization of Group Preferential Choice
    6.1.2 Data and Task Abstraction for Preference Synthesis
    6.1.3 Design Space for Preference Synthesis
  6.2 Critical Reflections
    6.2.1 Goals Elicitation
    6.2.2 A More Agile Approach?
  6.3 Future Work
    6.3.1 Validating the Data and Task Model
    6.3.2 Validating the Task-based Assessment
    6.3.3 Extending the Design Space to Other Levels of the Taxonomy
    6.3.4 Relating Existing Encodings to the Design Space
    6.3.5 Extending the Design Space to Hierarchical and Large Dimensions
Bibliography
A Analysis of Existing Encodings

List of Tables

Table 2.1 Eight Techniques for Visualizing Multi-Attribute Data
Table 2.2 Six Design Space Analyses
Table 3.1 Preference Synthesis Goals and Tasks for Best Paper Scenario
Table 3.2 Preference Synthesis Goals and Tasks for Faculty Hiring Scenario
Table 3.3 Preference Synthesis Goals and Tasks for Campbell River Scenario
Table 3.4 Preference Synthesis Goals and Tasks for MJS77 Scenario
Table 3.5 Preference Synthesis Goals and Tasks for Nuclear Crisis Scenario
Table 3.6 Preference Synthesis Goals for XpertsCatch Scenario
Table 3.7 Coverage of Preference Model Taxonomy
Table 3.8 Summary of Data Model Issues from Scenarios
Table 3.9 Goals for Preference Synthesis in Group Preferential Choice
Table 3.10 Contextual Features of Seven Scenarios
Table 4.1 Basic Measures in Group Preferential Choice
Table 4.2 Tasks to Support G1: Discover Viable Alternatives
Table 4.3 Tasks to Support G2: Discover Sources of Disagreement
Table 4.4 Tasks to Support G3: Explain Individual Scores
Table 4.5 Tasks to Support G4: Validate Model
Table 4.6 Target Values for Task Analysis
Table 4.7 Target Distributions for Task Analysis
Table 4.8 Auxiliary Tasks
Table 5.1 Possible Inputs for Each Auxiliary Task
Table 5.2 Applicable Rearrangements for Each Encoding
Table 5.3 Recommended Interactive Features for Class C
Table 5.4 Recommended Interactive Features for Class B

List of Figures

Figure 1.1 Web-HIPRE
Figure 1.2 Group ValueCharts
Figure 1.3 ConsensUs
Figure 2.1 A multi-attribute value function for choosing a hotel
Figure 2.2 Group ValueCharts
Figure 2.3 Web ValueCharts - individual view
Figure 2.4 Web ValueCharts - group view
Figure 2.5 ConsensUs
Figure 2.6 Web-HIPRE - main view
Figure 2.7 Web-HIPRE - analysis window
Figure 2.8 LineUp
Figure 2.9 WeightLifter
Figure 2.10 WeightLifter plus two additional views
Figure 2.11 SurveyVisualizer
Figure 2.12 DCPAIRS
Figure 2.13 QStack
Figure 3.1 Preference Model Taxonomy
Figure 3.2 Best Paper scenario - spreadsheet of ranks
Figure 3.3 Faculty Hiring scenario - matrix of histograms
Figure 3.4 Faculty Hiring scenario - interleaved histograms
Figure 3.5 Campbell River scenario - consequence table
Figure 3.6 Campbell River scenario - model validation
Figure 3.7 Campbell River scenario - weight range plots
Figure 3.8 Campbell River scenario - matrix of alternative ranks
Figure 3.9 MJS77 scenario - table of alternative scores
Figure 3.10 MJS77 scenario - table of alternative rankings by team
Figure 3.11 Nuclear Crisis scenario - alternative scores by group
Figure 4.1 Brehmer and Munzner's typology of visualization tasks [7]
Figure 5.1 Stacked Bar Chart
Figure 5.2 Multi-bar Chart Design 1
Figure 5.3 Multi-bar Chart Design 2
Figure 5.4 Tabular Bar Chart Design 1
Figure 5.5 Tabular Bar Chart Design 2
Figure 5.6 Strip Plot Design 1
Figure 5.7 Strip Plot Design 2
Figure 5.8 Box Plot Design 1
Figure 5.9 Box Plot Design 2
Figure 5.10 Parallel Coordinates Design 1
Figure 5.11 Parallel Coordinates Design 2
Figure 5.12 Troublesome radar chart
Figure 5.13 Radar Chart Design 1
Figure 5.14 Radar Chart Design 2
Figure 5.15 Tabular Bar Chart Design 1 with variable column widths
Figure 5.16 Stacked Bar Chart corresponding to Figure 5.15
Figure 5.17 Comparing the efficacy of different encodings for task AT2
Figure 5.18 Comparing the efficacy of different encodings for task AT3
Figure 5.19 Illustrating the value of sorting for task AT10
Figure 5.20 Efficacy of each encoding for each auxiliary task (when EvaluatorWeights are defined)
Figure 5.21 Efficacy of each encoding for each auxiliary task (when EvaluatorWeights are not defined)
Figure 5.22 Class C Option 2: Dual View
Figure 5.23 Class B Option 1: Dual View with Custom Strip Plot (1)
Figure 5.24 Class B Option 1: Dual View with Custom Strip Plot (2)
Figure 5.25 Class B Option 1: Dual View with Custom Strip Plot + flexible colour mapping (1)
Figure 5.26 Class B Option 1: Dual View with Custom Strip Plot + flexible colour mapping (2)
Figure 5.27 Class B Option 2: Dual View with Intelligent Plot Selection (1)
Figure 5.28 Class B Option 2: Dual View with Intelligent Plot Selection (2)
Figure 6.1 Dimensions and Measures by Preference Model Taxonomy level

Acknowledgments

I would like to thank my supervisor, Dr. Giuseppe Carenini, for taking me on as a student and helping me find a project that was a good match for my skills and interests. I am grateful for his patient tutelage and enthusiasm for novel research.

Special thanks go to my second reader, Dr. Tamara Munzner, for providing expert guidance at multiple stages of the project. I owe much of my success to her encouragement and advice.

Additional thanks go to Dr. David Poole for advising me during the early stages of the project, Aaron Mishkin for his incisive comments on the first draft of Chapter 3, and Joyce Poon for answering my (many) questions in a timely manner and keeping the Computer Science Graduate Program running smoothly.

Finally, I would like to thank my husband, Michael Gottlieb, for taking care of domestic matters and making sure I took meal breaks - I know it was about as easy as giving a cat a bath.

Chapter 1: Introduction

Group Preferential Choice is when two or more individuals, each with his or her own preferences, must collectively choose among a competing set of alternatives. These situations are common in organizational and public planning. Examples include selecting an office location, hiring a candidate for a position, or choosing a wastewater management system for a city.

Sometimes it is possible for the group to arrive at a satisfactory decision through discussion alone. However, this can be challenging if group members have differences in preferences and opinions. In fact, the group members may not even have a complete understanding of their own preferences. To complicate matters further, the group may wish to explicitly model trade-offs among competing criteria. For instance, a company might want to independently assess a job candidate's education, experience, and company fit.

As the number of group members, alternatives, and criteria grows, it becomes increasingly difficult to grapple with the complexities effectively. In fact, organizations often resort to pre-existing solutions because they do not have the resources to tackle this challenge [11] [26].
For instance, a municipality may elect to keep an outdated and costly wastewater management system simply because analyzing the pros and cons of alternatives is too daunting. Clearly, there is an incentive to develop processes and tools to facilitate such analyses.

One viable approach is to have each decision maker explicitly model his or her preferences over the alternatives and criteria. Then, the group members can compare how alternatives perform under different preference models in order to come to a better understanding of other points of view. There is evidence that explicit preference modelling can encourage reflection, promote transparency and inclusiveness, and ultimately lead to greater satisfaction with the outcome [4].

The benefits of this process depend on how quickly and effectively decision makers can glean insights from their own and others' preferences. It can be difficult to spot interesting differences and trends when the data is represented in text-based formats, such as traditional spreadsheets. Information Visualization solutions are more promising because they leverage the pattern recognition and pre-attentive capabilities of the human visual system [38]. However, not all graphical methods are equally effective, and a poorly-chosen graphic can actually diminish the efficacy of the decision making process [2].

Many existing tools for preference modeling and inspection come from the field of Multi-Criteria Decision Making (MCDM), a sub-discipline of Operations Research concerned with practical aspects of multi-criteria decision making [30]. There is considerable evidence that MCDM processes can improve group decision making by enhancing communication among group members [6] and lending transparency and legitimacy to the decision making process [51].

Although MCDA support tools are plentiful, few are able to integrate and display multiple preference models simultaneously [40] [51]. For this reason, most recorded applications of MCDA to group decision making involve joint construction of a single preference model by all members of the group [51]. Tools that do allow multiple users to input their preferences, such as M-MACBETH [18], D-Sight [1], and 1000Minds [28], typically show the aggregate performance of alternatives over all decision makers using non-interactive charts and tables.

One notable exception is Web-HIPRE, which allows groups to model each decision maker as a separate criterion in the overarching decision problem [39]. Web-HIPRE shows the performance of alternatives using stacked bar charts, where the score for each decision maker is a segment contributing to the total (Figure 1.1). Like other MCDA support tools, the main focus of Web-HIPRE is on the decision analysis process, not the graphical representation.

Figure 1.1: Web-HIPRE [40]. The total score for each alternative is represented by bar height, and the contribution of each decision maker to the total is represented by segment height (here, each decision maker is actually a group of people acting as a unit).

A few other tools to support joint preference inspection in Group Preferential Choice put a stronger emphasis on Information Visualization. One is Group ValueCharts (Figure 1.2), which is an interactive visual aid that uses a combination of stacked and multi-bar charts to show how different alternatives perform for different users [4].

Figure 1.2: Group ValueCharts [4]. The top right section shows the score of each alternative for each user. The bars are grouped by alternative and colour-coded by user.
The bottom left section shows the criteria hierarchy, and the bottom right section shows the breakdown of scores by criterion. The red outlines show the weights assigned by each user to each criterion.

Group ValueCharts is an extension of ValueCharts, a system that supports elicitation and inspection of linear preference models for individual decision makers [12]. ValueCharts was analytically evaluated based on a task model of individual preferential choice [5]. However, this task model may not generalize to Group Preferential Choice.

Another tool, ConsensUs, aims to support the consensus building process by highlighting sources of disagreement [36]. It uses strip plots to encode per-criterion scores for each alternative and user (Figure 1.3). This tool only allows individual users to compare their preferences against the group average or one other user at a time.

Figure 1.3: ConsensUs [36]. The Individual View (left) allows each user to score alternatives relative to each other on each criterion. The alternatives are colour-coded dots. The Group View (right) shows the individual scores (small dots) in the context of group averages (large dots). Red lines highlight points of disagreement.

A major shortcoming of all available tools and designs is that none (to our knowledge) are grounded in a comprehensive data and task model for Group Preferential Choice. This is not ideal, as the suitability of a design will likely depend on the characteristics of the decision making scenario. However, the diversity of Group Preferential Choice scenarios has not been studied. Future designers of such tools would benefit from a clearer understanding of this diversity and the implications for design. Our work addresses this problem in three steps:

First, we perform an in-depth analysis of seven real-world group preferential choice scenarios in order to characterize the variation in the data, goals, and decision making contexts (Chapter 3). For the analysis of goals, we focus solely on the stage where individual preference models are combined and discussed by the decision makers, a process we call preference synthesis. These scenarios were studied using a combination of structured interviews and analysis of secondary sources. The outputs of this analysis are:

1. A precise definition of Group Preferential Choice
2. A taxonomy of commonly-used preference models
3. A summary of the preference synthesis goals across scenarios
4. A summary of the decision making contexts across scenarios

Second, we translate the data and goals identified in Chapter 3 into domain-independent language in order to produce descriptions at various levels of abstraction (Chapter 4). These abstractions are intended to be suitable for visualization design and analysis. The data and task abstraction is performed in accordance with the typology of Brehmer and Munzner [7].

Finally, we present a prescriptive design space of visualizations to support preference synthesis in the context of Group Preferential Choice (Chapter 5). A prescriptive design space is a set of viable designs for a particular kind of data with recommendations based on goals and contexts. For now, our design space is limited to a subset of all Group Preferential Choice scenarios where there are no explicit criteria and no more than a dozen alternatives and decision makers. The design space is described in terms of the following design aspects:

1. Static design aspect - the basic idioms that are available and various options for mapping the dimensions and measures to marks and channels.
2. Dynamic design aspect - the mechanisms for transforming the view.
3. Composite design aspect - the options for arranging and coordinating different views relative to each other.

For inspiration, we look to other prescriptive design space papers, such as Brehmer et al. [9], which presents a design space for timelines in the context of storytelling. This work analyzes over 100 existing timelines in order to identify the major design dimensions, and then offers recommendations based on the narrative points that the storyteller would like to make. Similarly, we provide design recommendations based on the decision making context and the relative importance of various tasks.

We believe this work will provide a sound starting point for designers of Group Preferential Choice support tools. Depending on the situation, designers may wish to create standalone visual aids or integrate visualizations into complete decision support systems. We expect our recommendations to serve a wide variety of individuals, ranging from project managers who need to produce graphical summaries quickly to designers of MCDA support tools who want to incorporate more effective graphics into their decision analysis software.

Chapter 2: Background and Related Work

2.1 Decision Theoretic Foundations

Decision Theory is a field of study that is concerned with developing abstractions and techniques to support rational decision making. It defines a decision problem as one where a decision maker must select between two or more acts, each of which has an associated outcome [44]. The decision maker is presumed to prefer some outcomes over others. In some cases, the decision maker may not know for certain what the outcome of an act will be. Such scenarios are called decisions under risk.

2.1.1 Utility Theory

In economics, utility is a quantitative measure of satisfaction with a good, service, or situation. In order to apply certain decision theoretic methods to a decision problem, a decision maker must describe her preferences as a utility function [60]. There are two classes of utility functions: ordinal and cardinal.

An ordinal utility function ranks all possible outcomes from most to least preferred without specifying the strength of preference. More formally, it defines a preference relation between each pair of outcomes indicating a preference for one or the other or indifference between the two. The function must satisfy certain conditions, such as completeness, anti-symmetry, and transitivity [44].

A cardinal utility function maps outcomes to values along an interval scale such that preferences are preserved up to positive linear transformations. Cardinal utility functions are essential for decision making under risk [32]. However, many economists believe they are unnecessary in risk-free decision analysis, since they are more difficult to elicit and do not add much power to the decision analysis [48] [45].
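To make the distinction concrete, the following toy sketch (our own illustration with invented outcomes and values, not drawn from any of the cited works) contrasts what an ordinal and a cardinal utility function over the same three outcomes can and cannot express:

```python
# Toy contrast of ordinal vs. cardinal utility over three outcomes.
# All outcomes and values are invented for illustration.
ordinal_ranking = ["house A", "house B", "house C"]      # order only
cardinal_utility = {"house A": 0.9, "house B": 0.4, "house C": 0.35}

def prefers(x, y):
    """Ordinal preference relation induced by the ranking."""
    return ordinal_ranking.index(x) < ordinal_ranking.index(y)

# Both models agree that A is preferred to B...
print(prefers("house A", "house B"))                     # True

# ...but only the cardinal model expresses strength of preference:
# the A-B gap (0.5) dwarfs the B-C gap (0.05), and this relationship
# survives any positive linear transformation of the utilities.
print(cardinal_utility["house A"] - cardinal_utility["house B"])
print(cardinal_utility["house B"] - cardinal_utility["house C"])
```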
Aside from risk, another major consideration is whether the decision problem has one or multiple attributes [32]. In the single attribute case, the outcome is an atomic value; in the multiple attribute case, it is a composition of values for different attributes. For instance, if a decision maker is looking to buy a house and only cares about cost, the outcome can be expressed in terms of cost only. However, if the decision maker also cares about location, then the outcome needs to be expressed in terms of cost and location.

The optimal decision making strategy for the single attribute case without risk is straightforward - simply choose the option that yields the most preferred outcome on the sole attribute. The multiple attribute case is more complicated, and there is an entire field of study devoted to it.

2.1.2 Multiple Criteria Decision Making

Multi-Criteria Decision Making (MCDM) is a sub-discipline of Operations Research that is concerned with formalizing and developing methods for scenarios where a decision maker must choose among multiple competing alternatives in the presence of multiple competing criteria. Although MCDM draws from Decision Theory, its emphasis is more pragmatic than theoretical. Here, we focus on a subset of MCDM called Multi-Attribute Decision Making (MADM), which is concerned with scenarios where the options are finite and predefined [29].

A number of MADM methods have been developed, but they all require the following key ingredients:

1. A finite set of two or more alternatives
2. A finite set of two or more attributes (also called objectives or criteria)
3. A quantitative model of the individual's preferences over the alternatives and/or attributes

The main way in which these methods differ is in how preferences are elicited, expressed, and combined to produce a final score or ranking over the alternatives.

Multi-Attribute Utility Theory (MAUT)

Multi-Attribute Utility Theory is the most popular class of MADM methods and the only one with a solid foundation in Decision Theory [32]. Multi-attribute value theory (MAVT) is a special case of MAUT where there is no risk.

According to MAVT, a decision maker's preferences can be modeled using an additive multi-attribute value function (AMVF) as long as the attributes have additive independence, which means that the outcome on one attribute does not affect how the decision maker feels about the possible outcomes on other attributes [32]. An AMVF has three major components (illustrated in Figure 2.1):

• An attribute tree, which specifies a decomposition of high-level attributes into lower-level attributes. The attributes at the leaves are called primitive attributes, and each primitive attribute has a set of possible outcomes called its domain.
• A set of score functions for each primitive attribute specifying the value of each possible outcome to the decision maker. The best possible outcome is assigned a score of 1 and the worst a score of 0, and all other outcomes are scored relative to these two.
• An assignment of weights to the primitive attributes such that the sum of the weights over all primitive attributes is 1. The weight represents the value of switching from the worst possible outcome to the best possible outcome for that primitive attribute relative to the others.

The final value for each alternative can be computed by taking the weighted sum of the outcome scores on all primitive attributes.

Figure 2.1: A multi-attribute value function for choosing a hotel. An AMVF consists of an attribute tree (a) and a set of weights and score functions for each primitive attribute (b). The score for each alternative-attribute pair is the value assigned by the score function to the outcome of that alternative on that attribute (c). The score for an alternative (e.g. Days Inn) is the weighted sum of the scores on each attribute (d).
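To make the additive computation concrete, here is a minimal sketch of an AMVF in Python. It is our own illustration of the weighted-sum rule described above; the weights, score functions, and outcomes are invented stand-ins, not the actual values in Figure 2.1.

```python
# Minimal sketch of an additive multi-attribute value function (AMVF).
# The weights, score functions, and outcomes are illustrative only.

# Weights over the primitive attributes (must sum to 1).
weights = {"rate": 0.5, "distance": 0.3, "pool": 0.2}

# Score functions map each attribute's outcome to [0, 1], where 1 is
# the best possible outcome and 0 the worst.
score_functions = {
    "rate": lambda dollars: max(0.0, min(1.0, (200 - dollars) / 150)),
    "distance": lambda km: max(0.0, min(1.0, (10 - km) / 10)),
    "pool": lambda has_pool: 1.0 if has_pool else 0.0,
}

# Outcomes of each alternative on each primitive attribute.
alternatives = {
    "Days Inn": {"rate": 90, "distance": 6.0, "pool": False},
    "Hyatt": {"rate": 180, "distance": 1.5, "pool": True},
}

def amvf_value(outcomes):
    """Weighted sum of per-attribute scores (the AMVF total)."""
    return sum(weights[pc] * score_functions[pc](outcomes[pc])
               for pc in weights)

for name, outcomes in alternatives.items():
    print(f"{name}: {amvf_value(outcomes):.3f}")
```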
Other Methods

Aside from MAUT, there are two other popular classes of MADM techniques.

Outranking methods such as ELECTRE [49] are among the oldest MADM techniques [59]. They require users to set qualifying and indifference thresholds over the attributes. Then, alternatives are eliminated if they do not meet the qualifying thresholds or if they are outranked by at least one alternative - that is, at the same or a lower indifference class on every attribute. Although outranking methods have largely been replaced by more precise methods, they are still sometimes used to winnow the set of alternatives to a reasonable number [59].

The Analytical Hierarchy Process (AHP) is the main contemporary contender to MAUT [50]. In AHP, the decision maker is presented with pairs of alternatives and asked to indicate their degree of preference for one alternative over the other on each attribute. These comparisons are used to generate a score for each alternative-attribute pair. A similar procedure is used to elicit weights. AHP has been criticized for its susceptibility to rank-reversal, which means that adding a new alternative to the set may cause the relative ranks of two other alternatives to change [59].
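In Saaty's formulation, the per-attribute scores are commonly derived from the principal eigenvector of a pairwise comparison matrix. The sketch below is our own illustration of that step using power iteration; the matrix values are invented, and the consistency check performed by real AHP tools is omitted.

```python
# Sketch of deriving AHP priority scores from a pairwise comparison
# matrix via power iteration. Illustrative only.
import numpy as np

# M[i][j] = how strongly alternative i is preferred to alternative j
# on one attribute (Saaty's 1-9 scale; M[j][i] is the reciprocal).
M = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

v = np.ones(M.shape[0])
for _ in range(100):  # power iteration toward the principal eigenvector
    v = M @ v
    v = v / v.sum()   # normalize so the priorities sum to 1

print(np.round(v, 3))  # per-alternative scores on this attribute
```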
2.1.3 Group Multi-Attribute Decision Making

Multi-attribute decision making has been applied in group settings since its initial formulation. Group MADM is similar to individual MADM except that the selection process factors in the preferences of multiple stakeholders. Bose et al. reviewed several early applications of MAUT in group decision making contexts [6]. Based on their findings, they argued that MAU-based models considerably enhance communication and understanding among group members and should be supported in more computer-based group decision support systems.

Salo and Hämäläinen also argued that MADM methods can benefit group decision making by increasing transparency and legitimacy [51]. They analyzed several recent applications of MADM methods to group decision making and identified six basic steps common to all:

1. Clarification of the decision context and identification of group members
2. Explication of decision objectives
3. Generation of decision alternatives
4. Elicitation of preferences
5. Evaluation of decision alternatives
6. Synthesis and communication of decision recommendations

They note that steps 3 and 4 are sometimes reversed but recommend following the suggested order because listing alternatives first makes it easier for people to reason about preferences. They also note that decision makers often revisit earlier steps to refine the decision model.

One of the major theoretical challenges behind Group MADM is how to combine multiple preference models in a way that is both rational and fair. This is the main concern of a philosophical discipline called Social Choice Theory. Arrow's Impossibility Theorem states that it is not possible to aggregate individual preference rankings into a group preference ranking which is guaranteed to satisfy certain reasonable conditions [3]. This is also true for multi-attribute utility functions unless each decision maker is allowed to define her own objectives [31].

Fortunately, a theoretically-sound aggregate model might not be necessary in most cases. In small group decision settings, it may be more important for decision makers to understand their own and others' preferences on an individual basis so that they can negotiate effectively. To this end, high-quality visualizations have the potential to help decision makers communicate and reason about their preferences.

2.2 Visual and Interactive Techniques For MADM and Related Data

This section summarizes several published techniques for visualizing scores and preferences in the context of MADM, as well as techniques for visualizing similar multi-attribute data, such as rankings, surveys, and product reviews. These techniques are summarized in Table 2.1.

Table 2.1: Eight techniques for visualizing multi-attribute data. Techniques with 'Very High' relevance explicitly support Group MADM. The remaining techniques were assigned 'High' or 'Medium' relevance based on quality and novelty.

Technique             | Context                        | Main Encodings                             | Relevance
Group ValueCharts [4] | Group MADM                     | Stacked bar chart; tabular bar chart       | Very High
ConsensUs [36]        | Group MADM                     | Dot plots in small multiples               | Very High
Web-HIPRE [39]        | Group MADM                     | Stacked bar chart                          | Very High
LineUp [25]           | Multi-attribute rankings       | Slope graph; stacked or tabular bar chart  | High
SurveyVisualizer [10] | Multi-attribute survey results | Parallel coordinates tree                  | High
WeightLifter [41]     | MADM                           | Stacked bar chart; parallel coordinates    | Medium
DCPAIRS [20]          | MADM                           | Scatterplot matrix (SPLOM)                 | Medium
QStack [42]           | Multi-attribute rankings       | Stacked bar chart                          | Medium

2.2.1 Group ValueCharts

Group ValueCharts [4] is perhaps the most sophisticated tool designed specifically to support Group MADM. In particular, it is intended for large infrastructure decision problems where trade-offs must be considered between multiple criteria and multiple decision makers' preferences. It aims to make the decision process more participatory, transparent, and comprehensible.

The main encodings are multi-bar charts (also known as grouped bar charts), which show the total score of each alternative for each decision maker, and tabular bar charts (also known as faceted bar charts or small multiples bar charts), which show the breakdown of scores by criteria.

Figure 2.2: Group ValueCharts [4]. A colour-coded multi-bar chart shows the total score of each alternative for each decision maker (top right). A tabular bar chart shows the breakdown of scores for each decision maker by attribute (bottom right). Red outlines show the weights assigned to each attribute by each decision maker. The criteria hierarchy is shown using a rectilinear tree (bottom left).

A key strength of Group ValueCharts is that it is compact and information dense. The multi-bar chart makes it possible to compare the overall performance of each alternative across evaluators, while the tabular bar chart supports comparison on a per-attribute basis. The tabular bar chart also supports direct comparison of weights.

A limitation of Group ValueCharts is that it does not scale beyond a dozen attributes, alternatives, or decision makers. One reason for this is that it does not implement any data reduction strategies to cope with spatial constraints. Another is that colour is used to differentiate decision makers, and people can only differentiate up to around a dozen colour hues [38].

Web ValueCharts

Web ValueCharts is a successor to Group ValueCharts that integrates the capabilities of both the group version and the individual version (ValueCharts [12]) on a web platform. It is intended to bring structured decision support to a broader audience.

Web ValueCharts is a modular system with components to support chart definition, chart management, preference elicitation, and preference inspection.
The preference inspection component has two views - the individual view and the group view (Figures 2.3 and 2.4).

Figure 2.3: Web ValueCharts - Individual View. A colour-coded stacked bar chart shows the total score of each alternative (top right). A tabular bar chart shows the breakdown of scores by attribute (bottom right). A rectilinear tree shows the attribute hierarchy and the score functions for each attribute (bottom left).

Figure 2.4: Web ValueCharts - Group View. The overall design is the same as for Group ValueCharts.

The individual view (Figure 2.3) allows users to inspect their own scores and preferences in isolation. They can adjust their score functions and weights dynamically by clicking and dragging relevant components. The group view (Figure 2.4) shows the alternative scores and preferences for all users in a single view. The design is identical to that for Group ValueCharts, except that users may also inspect the score functions. Users can select a subset of decision makers to view by toggling the check boxes beside the names.

Both views support manual reordering of alternatives and attributes. There are various view options that can be turned on and off, including the score functions, the outcomes overlay, and the utility scale.

Web ValueCharts supports real-time synchronization of all group members' charts. Users can join a group chart, update their preferences, and even edit the criteria and alternatives in real-time. Although Web ValueCharts improves upon its predecessor in many ways, it has the same limitations when it comes to scalability.

2.2.2 ConsensUs

ConsensUs is another tool that is designed to facilitate multi-criteria group decision making [36]. In particular, it aims to support the consensus-building process by highlighting sources of disagreement. Its primary encoding is the strip plot, which uses point position on an axis to represent values.

Figure 2.5: ConsensUs - Individual View (left) and Group View (right) [36]. Each view has one strip plot per attribute. The alternatives are colour-coded dots, and their positions represent the scores assigned by the individual (left) or the group average (right).

The solution consists of an individual view and a group view. Individual evaluations are collected via the individual view before being displayed in the group view. Each evaluator scores each alternative relative to the others using a sliding scale.

The group view has two kinds of colour-coded dots: small dots showing the individual scores and large dots showing the group averages. It also emphasizes two kinds of disagreement for each attribute: the alternative with the largest difference between individual and group score (red line below) and the alternative with the largest variance in score within the group (red line on top). The line length encodes the degree of disagreement.

A few kinds of interaction are available in the group view. Users may click on the large dots to see the scores assigned to that alternative by each evaluator. Users may also filter alternatives (top-right) and change which other user is shown on the large dots (top-left).

Strip plots are notable for their succinctness and compactness relative to bar charts that show the same data [46]. However, there is the risk of occlusion if points have the same or nearly the same value. A weakness of this design in particular is that it only allows users to compare their scores to those of the average user or one other user at a time. Also, the use of colour to differentiate alternatives limits scalability.
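As a concrete reading of these two disagreement signals, the following sketch (our own reconstruction from the paper's description, with invented data; not ConsensUs source code) computes both for a single attribute:

```python
# Sketch of the two per-attribute disagreement signals that ConsensUs
# highlights. Reconstructed from the paper's description; the scores
# and names are invented for illustration.
from statistics import mean, pvariance

# scores[alternative][user] for one attribute, on a 0-100 scale.
scores = {
    "Site A": {"ana": 80, "bo": 75, "cam": 30},
    "Site B": {"ana": 50, "bo": 55, "cam": 60},
    "Site C": {"ana": 20, "bo": 90, "cam": 40},
}

def largest_deviation_from_group(user):
    """Alternative where `user` differs most from the group average."""
    return max(scores, key=lambda alt: abs(
        scores[alt][user] - mean(scores[alt].values())))

def largest_within_group_variance():
    """Alternative the group disagrees about most overall."""
    return max(scores, key=lambda alt: pvariance(scores[alt].values()))

print(largest_deviation_from_group("cam"))  # Site A
print(largest_within_group_variance())      # Site C
```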
2.2.3 Web-HIPRE

Web-HIPRE is one of the oldest interactive support tools for multi-attribute decision analysis [39]. It was originally designed for AHP analysis only but was later extended to support other decision analysis paradigms.

Web-HIPRE's main window (Figure 2.6) shows the attribute hierarchy and alternatives. From there, users can open other windows to inspect their preferences or analyze results.

Figure 2.6: Web-HIPRE - Main View [40]. The blue nodes represent the attribute tree. The yellow nodes represent the alternatives.

The Analysis Window (Figure 2.7) uses stacked bar charts to simultaneously show the total score and per-attribute score of each alternative. Users can select which data to map to bars and segments. Effectively, this means that they can reverse the mapping or select a different level of the attribute hierarchy.

Figure 2.7: Web-HIPRE - Analysis Window [40]. The bars encode alternative scores and the segments encode per-attribute scores.

Web-HIPRE supports multi-attribute group decision making by allowing users to define a new decision problem on top of multiple individual models. The aggregate model treats each user as an attribute in a new decision problem, as shown in Figure 1.1. Web-HIPRE is notable in that it is the only tool that explicitly supports the specification of different weights for different decision makers. However, it is limited in that it has few interactive options and its features are divided over multiple windows.

2.2.4 LineUp

LineUp is an award-winning interactive tool for comparing ranked entities across multiple attributes [25]. It supports a variety of tasks related to rank comparison and sensitivity analysis and is commendable for its power, flexibility, and attention to detail.

Figure 2.8: LineUp [25]. Each ranked entity is a row and each column is an attribute. Multiple rankings can be compared side-by-side, and same entities are connected with sloped lines. This figure compares seven rankings.

The solution is an elaborate hybrid of bar charts and slope graphs, which draw connecting lines between the same entities across different rankings. Each item is a row and each attribute is a column, and a ranking is an ordering of items based on the total score over multiple attributes. Categorical attribute columns display text, while numerical attribute columns encode the attribute scores with bars, which are colour-coded by attribute. Histograms above each column show the distribution of scores for that attribute. The slope graph feature can be used to compare two or more rankings side-by-side.

The columns within each ranking can be shown as a stacked bar chart or tabular bar chart based on the user's selection. In this respect, the core idiom is similar to that of ValueCharts. LineUp's extensive list of features allows users to:

• Sort and filter entities by attribute score
• Perform sensitivity analysis on attribute weights and score functions
• Identify missing values
• Scroll through rows or inspect a fish-eye view of the rows (supports scalability on entities)
• Collapse or combine columns (supports scalability on attributes)
• Select one of the following alignment strategies: stacked, diverging stacked, ordered stacked, or tabular

Although LineUp was not designed specifically for Group MADM, it could be adapted to it in a couple of ways. First, two or more decision makers' models could be compared in full using the slope-graph component of LineUp. Second, multiple decision makers' models could be condensed into a single model by defining a meta-column over all decision makers.

A possible criticism of LineUp is that it may have too many features for typical users. Not all of the features were mandated by the preliminary requirements analysis.
2.2.5 WeightLifter

WeightLifter is a novel visual and interactive technique to help system designers understand the impact of criteria weights on the decision outcome [41].

Figure 2.9: WeightLifter [41]. Sliders support exploration of two-way trade-offs between criteria, and a triangle with adjustable line intersections supports exploration of three-way trade-offs.

WeightLifter supports interactive exploration of two-way and three-way trade-offs. Two-way trade-offs are supported by sliders - users can put any number of criteria on either end and then adjust the slider position. Three-way trade-offs are supported by a triangle with intersecting lines perpendicular to each edge that the user can adjust. In Figure 2.9, the coloured regions (a) show the points at which the current top solution (c) would change or fall out of the top three, and black lines show the points at which the top solution would change. The sliders also have histograms (b) that show what fraction of the entire weight space given that trade-off has the current solution at the top. Users can constrain the weight space to sub-ranges on the sliders (d).

WeightLifter was integrated with two additional views to support all the tasks identified in the preliminary requirements analysis (Figure 2.10). The Ranked Solution Details view is akin to a simplified version of LineUp - it uses stacked bar charts to show the weighted sum of costs for each alternative over the criteria. Its one unique feature is a strip divided into coloured segments proportional to criteria weights. Each segment also contains a glyph showing the direction of the score function. The Criteria Value View uses parallel coordinates to show criteria outcomes for each alternative. It also allows users to set filters by brushing.

Figure 2.10: WeightLifter plus two additional views [41]. The Ranked Solution Details view allows users to inspect per-criterion scores for the top ranked alternatives. The Criteria Value View shows the criteria outcomes for each alternative.
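The weight-space fractions shown in WeightLifter's slider histograms can be approximated by sampling, as in this sketch. It is our own Monte Carlo approximation of the idea, not WeightLifter's actual algorithm, and the scores are invented; it assumes numpy is available.

```python
# Monte Carlo sketch of a WeightLifter-style question: over the space
# of all weight vectors, how often does each alternative rank first?
import numpy as np

rng = np.random.default_rng(0)

# Per-criterion scores in [0, 1]: rows = alternatives, cols = criteria.
scores = np.array([
    [0.9, 0.2, 0.5],   # alternative 0
    [0.4, 0.8, 0.6],   # alternative 1
    [0.5, 0.5, 0.9],   # alternative 2
])

# Sample weight vectors uniformly from the simplex (weights sum to 1).
weights = rng.dirichlet(np.ones(scores.shape[1]), size=100_000)

# For each sampled weight vector, find the top-ranked alternative.
winners = np.argmax(weights @ scores.T, axis=1)

for alt in range(scores.shape[0]):
    frac = np.mean(winners == alt)
    print(f"alternative {alt} is top in {frac:.1%} of the weight space")
```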
2.2.6 SurveyVisualizer

SurveyVisualizer is a tool that supports exploration of large, hierarchical satisfaction survey data [10]. It was originally designed to visualize customer satisfaction data for the public transportation system of Zurich. This data consisted of responses to 89 survey questions, which were grouped into 23 quality dimensions and 3 quality indices. The surveys were partitioned into analysis groups based on demographic information.

Figure 2.11: SurveyVisualizer [10]. A parallel coordinates tree (top) shows the survey results at three levels of aggregation. Each line corresponds to an analysis group. The analysis group selector (bottom) allows users to control which analysis groups are included.

The basis of SurveyVisualizer is a novel encoding called a Parallel Coordinates Tree, which shows the performance of every analysis group across criteria at three levels of aggregation. The groups are drawn in light grey by default, but individual groups can be emphasized temporarily by: (a) hovering over them, which highlights them and brings up details about them, (b) clicking on them, which turns them black temporarily, or (c) assigning them a permanent colour.

The navigation mechanism is the bifocal lens, which allows users to emphasize individual analysis groups. The Parallel Coordinates Tree and the analysis group selectors are coordinated - selecting an analysis group in one causes the same group to be emphasized in the other view.

This work is notable for its novelty and the fact that it has achieved some commercial success [10]. It combines the strengths of two different types of encodings - parallel coordinates and trees. The parallel coordinates component supports inspection of multiple items, while the tree component supports inspection at various levels of aggregation. Parallel coordinates scale well to hundreds of items and are effective for identifying outliers and trends between neighboring attributes, but they are sensitive to the ordering of axes [38]. SurveyVisualizer also makes effective use of linked highlighting, annotation, and the focus plus context design choice.

2.2.7 DCPAIRS

DCPAIRS is a compact tool for individual MCDA that allows users to explore trade-offs between alternatives without using colour to distinguish attributes [20]. This is motivated by the fact that colour is not a scalable identity channel, since people can only distinguish up to around a dozen hues [38]. This work investigates the use of colour for user annotation instead - a novel feature for MCDA tools.

Figure 2.12: DCPAIRS [20]. The six focal attributes are placed on the main diagonal (a), and the remaining attributes are arranged in the lower triangle (b). The points in each scatterplot are the alternatives, and their coordinates encode their weighted scores on each of the two attributes at that intersection (c). Points are coloured according to user-defined annotation groups.

The solution consists of a scatter-plot matrix that shows pairwise trade-offs between six attributes at a time. The six focal attributes are placed on the main diagonal (a), and the remaining attributes are arranged in the lower triangle (b). The points in each scatterplot are the alternatives, and their coordinates encode their weighted scores on each of the two attributes at that intersection (c).

The user can drag-and-drop attribute tiles into one of the six slots and adjust their weights using the sliders on the tiles (h). The current weight of each attribute is redundantly coded in gray-scale. The attribute score functions are positive linear by default, but the user can invert them by toggling 'high' and 'low' in each tile (g).

When the user clicks on a point, that alternative gets highlighted in all the plots, and the inspector (f) gets populated with the score information for that alternative. The user can interactively assign alternatives to colour-coded groups (e) based on features of interest. Finally, the user can filter alternatives on overall score using the threshold slider (d).

The dominant encoding - the scatter-plot matrix - is limited in that it is only effective at showing pair-wise trade-offs. Furthermore, it does not show the contribution of weighted attribute scores to total score. Nevertheless, the design does have its strengths, including the annotation feature and the use of details-on-demand and linked highlighting for selected alternatives. It also has relatively high scalability for number of alternatives.
2.2.8 QStack

QStack is a tool for ranking collections in multi-tag datasets based on tag frequency [42]. For instance, a user might want to find photo albums on Flickr with high incidence of the tags 'summer' and 'flowers.'

Figure 2.13: QStack [42]. Each bar in the focus view (top) corresponds to a collection, and each segment represents the frequency of a particular tag. The tags are coded by colour (left). The context view (bottom) shows the entire data-set, and the focus view is populated with data from the selected portion.

QStack is similar to ValueCharts and LineUp in that its primary encoding to show score totals is the stacked bar chart. The user enters a set of tags in the search bar, and a set of collections that contain any of these tags is returned. The focus view (Figure 2.13) is then populated with stacked bars, where each bar is a collection and the height of each coloured segment encodes the tag frequency for that collection.

The context view below the focus view shows the total tag frequencies of collections in seven different clusters. Users can brush the context view to select a subset of the data to inspect in the focus view.

Users can sort in ascending or descending order by a particular tag or total tag frequency. When the user hovers over an item, the Distributions column of the tag table (left) is populated with the tag distributions for that item.

For the most part, QStack is simply a weaker version of LineUp, but it is not without its merits. Its primary strength is that it uses the focus plus context design choice to achieve scalability by splitting the view into a focus view and context view.

2.2.9 Lessons from Evaluations

It is important to note that few of these techniques have been thoroughly evaluated, and many have not been evaluated at all.

Group ValueCharts was evaluated in a qualitative study with two groups involved in real-world decision making. The participants expressed a desire to see the average scores and disagreement levels, so these features were added to the tool [4]. The authors of ConsensUs performed a laboratory study and concluded that showing disagreement visually is more effective than showing verbal arguments and just as effective as showing both [36]. Finally, the authors of LineUp also conducted an experiment and discovered that a strong analysis tool enables novices to complete complex tasks faster than experts using Excel or Tableau [25]. The overarching finding in all studies was that people generally react positively to tools of this nature [4] [36] [25] [42].

What has yet to be established is which of the many features implemented by these tools are valuable in various Group Preferential Choice contexts. This is our primary motivation for developing a comprehensive data and task model for preference synthesis in the context of Group Preferential Choice.

2.3 Design Space Analyses

Another major goal of this work is to produce a design space of visual tools for preference synthesis in the context of Group Preferential Choice. For inspiration, we reviewed six papers that can be loosely described as design space analyses, but the papers vary in what that entails. These are summarized in Table 2.2.

Three of these are best described as design surveys - they review large bodies of literature on Information Visualization solutions for a particular domain or data class: disease epidemiology [13], traffic data [17], and sentiment in text [33].
They attempt to identify the major dimensions and classify the solutions according to these dimensions. A couple of these works conclude with some broad suggestions for design [13] [33], but none produce a complete set of design recommendations. The result of these works is a descriptive design space, which covers what designs currently exist.

Ceneda et al. [14] also has a design survey component, but rather than focusing on a particular domain, it considers a particular aspect of InfoVis solutions in general - guidance. Another difference is that it first develops the design space based on previous work and then describes the surveyed works in terms of this space. The result of this work is also a descriptive design space.

Brehmer et al. (2016) [8] is best described as a design study, which involves analyzing a specific problem faced by domain experts and developing a visualization solution to address the problem [52]. In this case, the goal was to produce a set of design guidelines for presenting time-oriented data in the energy analysis domain and develop a support tool based on these guidelines. A number of possible designs were proposed and evaluated, and a set of design guidelines was produced. The result of this work is a combination of a speculative design space, which describes what designs are possible, and a prescriptive design space, which describes what designs are recommended.

Brehmer et al. (2017) combines elements of all of these works [9]. First, it surveys over 100 existing timelines from various sources to produce a descriptive design space of timelines. Then, it considers all combinations of different facets of the design space, resulting in a speculative design space. Finally, the speculative design space is winnowed based on viability, and recommendations are made for different story-telling goals. The final product is a prescriptive design space. Viability was assessed based on existing principles, common sense, and the authors' intuition rather than any new empirical data.

Table 2.2: Summary of six design space analyses.

Work                            | Type of design space                   | # of works surveyed
Epidemiology Visualization [13] | Descriptive                            | 88
Sentiment Visualization [33]    | Descriptive                            | 132
Traffic Data Visualization [17] | Descriptive                            | Unclear (10s)
Characterizing Guidance [14]    | Descriptive                            | Unclear (10s)
Energy Portfolio Analysis [8]   | Speculative, Prescriptive              | N/A
Timelines for Storytelling [9]  | Descriptive, Speculative, Prescriptive | 145

Of the works above, Brehmer et al. (2017) is the closest to our goals, as we intend for our design space to cover all existing viable designs (that we know of), as well as potentially viable new designs.

However, there are some key differences worth noting. First, there are not enough existing tools designed specifically for Group Preferential Choice to support a design survey of the same scope. Hence, our design space may be more speculative. Second, the speculative component of Brehmer et al. (2017) only considers novel combinations of dimension values (for example, spiral layout with logarithmic scale), whereas ours may also propose novel dimension values. Finally, Group Preferential Choice data is more heterogeneous, which means that our design space will be more complex.

Chapter 3: Characterizing Group Preferential Choice

The goal of this chapter is to characterize sources of variation among real-world Group Preferential Choice scenarios that might have implications for the design of visual support tools for preference synthesis. In particular, we examine the following for each scenario:
1. The nature of the decision problem in terms of alternatives, decision makers, and criteria

2. The nature of the individual preference models

3. The goals of the decision makers during preference synthesis

4. The decision making context

Section 3.1 presents the tentative data model for Group Preferential Choice, which is grounded in the existing vocabulary of Multi-Attribute Decision Making (Section 2.1.2). This establishes the scope of our work and constrains which scenarios are suitable for analysis.

Section 3.2 analyzes seven real-world Group Preferential Choice scenarios that roughly conform to this model. These scenarios were selected to cover as much variation as possible.

Section 3.3 proposes revisions to the data model based on this analysis. The revised model is presented in full in Section 3.4.

Section 3.5 collates the scenario-specific goals into scenario-independent goals for preference synthesis in the context of Group Preferential Choice. This list of goals serves as input to the task analysis in Chapter 4.

Finally, Section 3.6 discusses contextual features of the seven scenarios, such as the stakes involved, the expertise of decision makers, and the amount of time invested. It also summarizes the number of alternatives, decision makers, and criteria in each scenario.

3.1 Preliminary Data Model for Group Preferential Choice

We define Group Preferential Choice as a situation where two or more decision makers, each with his or her own explicit preferences,[1] must jointly choose from a set of alternatives.[2] There may or may not be explicit criteria. More formally, the Group Preferential Choice data model has the following elements:

• A set of Alternatives A : {a_1...a_m}, m ≥ 1
• A set of Decision Makers D : {d_1...d_n}, n ≥ 2, each of whom has a Preference Model (described below)
• A set of Criteria C : {c_1...c_r}, r ≥ 0
• A set of Primitive Criteria PC ⊂ C : {pc_1...pc_s}, s ≥ 0
• A set of Abstract Criteria AC = C \ PC
• A Criteria Tree T where the set of nodes in T is equal to C, and the set of leaf nodes in T is equal to PC. This models the criteria hierarchy.

[1] We use explicit to differentiate formally-expressed preferences from hidden or informally-expressed preferences (e.g. through conversation).
[2] Here, an alternative is an entity that the decision makers evaluate. The number of actual options might be larger, as decision makers might have the option of choosing multiple or no alternatives.

Criteria may be objective or subjective depending on whether their outcomes are measurable facts or personal judgments. For example, size of lawn is an objective criterion, whereas attractiveness of lawn is a subjective criterion. Objective criteria outcomes are the same for all decision makers, while subjective criteria outcomes are individually defined by each decision maker.

For any objective criterion, there are the following additional elements:

• A Domain function dom(pc) where pc ∈ PC, which defines the possible outcomes for criterion pc. The domain may be a discrete set (ordered or unordered) or a continuous range.
• An Outcome function out(a, pc) ∈ dom(pc) where a ∈ A and pc ∈ PC, which defines the outcome of alternative a on criterion pc.

In keeping with the standard definition of Multi-Attribute Decision Making [30] [35], this model excludes scenarios where the number of alternatives is infinite, or where different decision makers have different explicit criteria. We also exclude scenarios where the criteria outcomes are uncertain (that is, decisions under risk).
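For readers who prefer code to notation, these elements can also be sketched as simple data structures. The following Python fragment is illustrative only; the class and field names (Criterion, Problem, and the example lawn criterion) are our own invention and not part of the formal model.

```python
# A minimal sketch of the preliminary data model, assuming nothing beyond
# the definitions above; all names and example values are hypothetical.

from dataclasses import dataclass, field
from typing import Callable, List, Optional, Union

@dataclass
class Criterion:
    name: str
    children: List["Criterion"] = field(default_factory=list)  # Criteria Tree edges
    # For objective primitive criteria only:
    domain: Optional[Union[list, tuple]] = None        # discrete set or (lo, hi) range
    outcome: Optional[Callable[[str], object]] = None  # out(a, pc)

    @property
    def is_primitive(self) -> bool:
        # Leaves of the Criteria Tree are the Primitive Criteria.
        return not self.children

@dataclass
class Problem:
    alternatives: List[str]     # A, |A| >= 1
    decision_makers: List[str]  # D, |D| >= 2, each with a Preference Model
    criteria_root: Optional[Criterion] = None  # r >= 0 criteria are allowed

# Example: an objective primitive criterion with a continuous domain.
lawn = Criterion("size of lawn", domain=(0, 500),
                 outcome=lambda a: {"house1": 120, "house2": 300}[a])
```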
3.1.1 Preference Model Taxonomy

There are numerous ways that preferences can be modelled in formal decision processes [30] [58]. Here, we describe a few common models that are appropriate for a variety of evaluation contexts. These are organized into a hierarchy of increasing complexity based on what is evaluated and how the preferences are expressed.

Level P0: The decision makers evaluate the alternatives holistically.

a. Ordinal evaluation. Each decision maker ranks the alternatives. Preferences can be modeled as a function r_d(a) ∈ [1, |A|] where:
1. a ∈ A and d ∈ D
2. If a_best is the most preferred alternative for decision maker d, then r_d(a_best) = 1
3. r_d(a_1) < r_d(a_2) if and only if d prefers a_1 to a_2

b. Cardinal evaluation. Each decision maker scores each alternative along a common linear scale. Preferences can be modeled as a function s_d(a) ∈ [min, max] where:
1. a ∈ A and d ∈ D
2. min and max are the minimum and maximum points on a linear scale common to all decision makers

Level P1: The decision makers evaluate each alternative with respect to each criterion.

a. Ordinal evaluation. Each decision maker ranks the alternatives with respect to each criterion. Preferences can be modeled as a function r_d(a, pc) ∈ [1, |A|] where:
1. a ∈ A, d ∈ D, and pc ∈ PC
2. If a_best is the most preferred alternative for decision maker d on criterion pc, then r_d(a_best, pc) = 1
3. r_d(a_1, pc) < r_d(a_2, pc) if and only if d prefers a_1 to a_2 on criterion pc

b. Cardinal evaluation. Each decision maker scores each alternative with respect to each criterion along a common linear scale. Preferences can be modeled as a function s_d(a, pc) ∈ [min_pc, max_pc] where:
1. a ∈ A, d ∈ D, and pc ∈ PC
2. min_pc and max_pc are the minimum and maximum points on a linear scale for pc common to all decision makers

b+w. Same as above, with the addition of weights specifying the relative value of switching from the worst to the best outcome on each criterion. This can be modeled as a function w_d(pc) ∈ [0,1], where d ∈ D, pc ∈ PC, and

Σ_{i=1}^{|PC|} w_d(pc_i) = 1    (3.1)

At this level, the raw (unweighted) preferences are specified by the function uws_d(a, pc), while the weighted preferences are specified by the function s_d(a, pc) = uws_d(a, pc) * w_d(pc).

Level P2: The decision makers evaluate each possible outcome of each criterion.

a. Ordinal evaluation. Each decision maker ranks the possible outcomes of each criterion. (This is only applicable for criteria with discrete domains.) Preferences can be modeled as a function r_d(out, pc) ∈ [1, |dom(pc)|] where:
1. d ∈ D, pc ∈ PC, and out ∈ dom(pc)
2. If out_best is the most preferred outcome for decision maker d on criterion pc, then r_d(out_best, pc) = 1
3. r_d(out_1, pc) < r_d(out_2, pc) if and only if d prefers out_1 to out_2 on criterion pc

b. Cardinal evaluation. Each decision maker scores each possible outcome of each criterion along a common linear scale.[3] Preferences can be modeled as a function s_d(out, pc) ∈ [min_pc, max_pc] where:
1. d ∈ D, pc ∈ PC, and out ∈ dom(pc)
2. min_pc and max_pc are the minimum and maximum points on a linear scale for pc common to all decision makers

b+w. Same as above, with the addition of weights specifying the relative importance of each criterion. More precisely, a weight is the relative value of switching from the worst to the best outcome on each criterion. This can be modeled in the same manner as Level P1b+w. At this level, the raw (unweighted) preferences are specified by the function uws_d(out, pc), while the weighted preferences are specified by the function s_d(out, pc) = uws_d(out, pc) * w_d(pc).

[3] This mapping from outcomes to scores is often called a Score Function, and the minimum and maximum values are typically 0 and 1.
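The arithmetic at the b+w levels is simple enough to express directly. The following sketch illustrates it for Level P1b+w with hypothetical data; the variable names uws and w mirror the functions uws_d and w_d defined above, and the alternatives and criteria are invented.

```python
# A minimal sketch of Level P1b+w scoring: one decision maker d supplies
# unweighted criterion scores uws_d(a, pc) and weights w_d(pc) that sum to 1.
# All data here are hypothetical illustrations, not part of the model.

uws = {  # uws[alternative][criterion] for one decision maker d
    "a1": {"cost": 0.9, "quality": 0.4},
    "a2": {"cost": 0.2, "quality": 1.0},
}
w = {"cost": 0.7, "quality": 0.3}  # w_d(pc); must satisfy Equation 3.1

assert abs(sum(w.values()) - 1.0) < 1e-9

def weighted_scores(alternative: str) -> dict:
    """Weighted criterion scores s_d(a, pc) = uws_d(a, pc) * w_d(pc)."""
    return {pc: uws[alternative][pc] * w[pc] for pc in w}

def total_score(alternative: str) -> float:
    """Additive total used at synthesis time: sum of weighted criterion scores."""
    return sum(weighted_scores(alternative).values())

for a in uws:
    print(a, round(total_score(a), 3))  # a1 -> 0.75, a2 -> 0.44
```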
Recommended Usage

This taxonomy is intended to be descriptive in that it captures several models that are used in practice. Here, we briefly discuss a few prescriptive considerations pertaining to preference models.

Raw preference data should be collected at the lowest level possible given the criteria. Level P2 is recommended whenever alternative outcomes can be defined globally, which is the case when the criteria are objective. This eliminates the bias associated with direct evaluation of alternatives. Otherwise, preferences must be collected at Level P1 (when the criteria are subjective) or Level P0 (when there are no explicitly-defined criteria). A decision maker's preferences may span multiple levels of the taxonomy if there is a mix of subjective and objective criteria.

If different scales are used for different criteria at Levels P1b or P2b, the scores must be normalized. Additionally, in forced-choice scenarios where decision makers must select at least one alternative, it is customary to scale decision makers' scores such that the best and worst alternatives for each criterion receive the maximum and minimum scores on the scale (see Campbell River, Section 3.2.3). This ensures that differences are maximally emphasized in the problem space. However, in scenarios where decision makers may elect to choose none of the alternatives, absolute performance matters, and the original assessments should be preserved (see Faculty Hiring, Section 3.2.2).

At Level P1b, this can be achieved simply by scaling the scores such that:

1. If a_best is the most preferred alternative for decision maker d on criterion pc, then s_d(a_best, pc) = max_pc
2. If a_worst is the least preferred alternative for decision maker d on criterion pc, then s_d(a_worst, pc) = min_pc
3. s_d(a_1, pc) > s_d(a_2, pc) if and only if d prefers a_1 to a_2 on criterion pc

At Level P2b, this also mandates restricting the domain of each criterion to those outcomes represented in the problem space. In other words, there must be a one-to-one relationship between each primitive criterion's domain and the set of outcomes achieved by the alternatives on that criterion. Then, each score function can be scaled such that:

1. If out_best is the most preferred outcome for decision maker d on criterion pc, then s_d(out_best, pc) = max_pc
2. If out_worst is the least preferred outcome for decision maker d on criterion pc, then s_d(out_worst, pc) = min_pc
3. Corollary: there must be at least two possible outcomes for each criterion, and no decision maker may be completely indifferent to the outcomes of any criterion.

At Level P1a and below, the criteria should be as independent as possible. That is, the performance of an alternative on one criterion should not change the way a decision maker feels about its performance on another criterion. This ensures that the decision maker's total score for each alternative can be attained by summing over the criteria scores. The more independent the criteria, the more accurate the additive model will be.
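The forced-choice rescaling described above amounts to an affine min-max transformation within each criterion. The following sketch illustrates it for a single evaluator and a single criterion; the raw scores are invented for the example.

```python
# A minimal sketch of the forced-choice rescaling for Level P1b: within one
# criterion, affinely rescale one evaluator's scores so the best alternative
# hits max_pc and the worst hits min_pc. Example data are hypothetical.

def rescale(scores: dict, min_pc: float = 0.0, max_pc: float = 1.0) -> dict:
    """Map raw scores {alternative: score} onto [min_pc, max_pc], order-preserving."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Degenerate case: the evaluator is indifferent on this criterion,
        # which the corollary above disallows for forced-choice scaling.
        raise ValueError("criterion needs at least two distinct scores")
    span = max_pc - min_pc
    return {a: min_pc + (s - lo) / (hi - lo) * span for a, s in scores.items()}

raw = {"a1": 3.0, "a2": 5.0, "a3": 4.0}  # raw scores on one criterion
print(rescale(raw))                       # {'a1': 0.0, 'a2': 1.0, 'a3': 0.5}
```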
Conversion Between Taxonomy Levels

Once preferences have been collected at a certain level, it is possible to move up the taxonomy by applying simple transformations to the data, as shown in Figure 3.1.

The left-to-right arrows indicate possible conversions between numeric levels of the taxonomy, each of which is coded in a different color. Preferences over alternative-criterion pairs (P1) can be derived from preferences over outcomes (P2) simply by looking up the score/rank of the alternative's outcome on that criterion. Preferences over alternatives only (P0) can be derived from preferences over alternative-criterion pairs (P1) by aggregating over criteria. Conversion from P1b to P0b is achieved by mapping the criteria scores to a common scale (if they are not on the same scale already) and summing over the normalized scores. Conversion from P1a to P0a involves combining ranks to form a new ranking. There are a number of established techniques for doing this, but none of them are guaranteed to satisfy all plausible fairness properties [3], so trade-offs must be considered. The dashed arrow is used to convey this ambiguity.

The right-to-left arrows show possible conversions between alphabetic levels of the taxonomy (a, b, and b+w). The conversion from b levels to a levels is straightforward, since a set of scores implies a ranking. The dotted arrow between Levels P2b and P2a means that this is only applicable for criteria with finite domains. The conversion from b+w to b involves calculating a weighted criterion score by multiplying the unweighted criterion score by the criterion weight.

Finally, it is possible to move from Level P1a to P0b by converting ranks across criteria into a numeric score for each alternative. One of the simplest and most widely-used methods of doing so is the Borda count, which gives each alternative one point for every alternative it beats on each ballot (in this case, each criterion) [21]; a short sketch of this conversion follows at the end of this section. Again, there is no single way to meaningfully convert a set of ranks into a score, so a dashed arrow is used to capture this ambiguity.

Figure 3.1: Preference Model Taxonomy. The numeric component of each level reflects what is evaluated (P0: alternatives, P1: alternatives by criterion, P2: outcomes by criterion), whereas the alphabetic component encodes how they are evaluated (a: ordinal, b: cardinal, b+w: cardinal + criteria weights). An upward arrow means that the level below implicitly encodes the level above, as discussed in the text.

Simplifying Assumptions

In order to constrain the scope of the analysis, we make the following tentative assumptions regarding the nature of the preference models:

1. The preferences do not span multiple levels. That is, there is not a mix of objective and subjective criteria or ordinal and cardinal evaluations.

2. All decision makers express their preferences at the same level(s) of the taxonomy.

3. The preferences are complete, that is:
(a) At level P0, every decision maker ranks (or scores) every alternative.[4]
(b) At level P1, every decision maker ranks (or scores) every alternative with respect to every criterion.
(c) At level P2, every decision maker ranks (or scores) every outcome for every criterion.[5]

4. The preferences and weights are treated as certain. That is, there is no fuzziness.

[4] In the case of P0a, this means that the preference model must specify a total order over the alternatives, but not necessarily a strict total order (which disallows ties).
[5] In the case of continuous domains, a complete score function may be attained by extrapolating from scores on a few sample points.
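Returning to the P1a-to-P0b conversion discussed above, the following sketch illustrates the Borda count on hypothetical ranks: each alternative earns one point for every alternative it beats on each criterion. The alternative and criterion names are invented.

```python
# A minimal sketch of the Borda-count conversion from Level P1a ranks to a
# Level P0b score: one point per alternative strictly beaten per criterion.
# Data are hypothetical illustrations.

ranks = {  # ranks[criterion][alternative], 1 = best, for one decision maker
    "cost":    {"a1": 1, "a2": 2, "a3": 3},
    "quality": {"a1": 3, "a2": 1, "a3": 2},
}

def borda(ranks_by_criterion: dict) -> dict:
    scores = {a: 0 for a in next(iter(ranks_by_criterion.values()))}
    for per_criterion in ranks_by_criterion.values():
        for a, r_a in per_criterion.items():
            # One point for every alternative strictly beaten on this criterion.
            scores[a] += sum(1 for r_b in per_criterion.values() if r_a < r_b)
    return scores

print(borda(ranks))  # {'a1': 2, 'a2': 3, 'a3': 1}
```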
Whether or not these assumptions are realistic varies from situation to situation. We revisit this taxonomy and list of assumptions in Section 3.3 after analyzing several real-world scenarios.

3.2 Seven Real-World Scenarios

This section describes seven real-world Group Preferential Choice scenarios. Four were assessed through one-on-one interviews with decision makers, and the remainder were drawn from secondary sources. We address the following questions for each scenario:

1. What is the decision problem, and what is the decision-making process?
2. What is the formal description in terms of the data model from Section 3.1?
3. What are the goals during preference synthesis, and how are they achieved?
4. What are other relevant characteristics of the decision making context? In particular:
(a) Is this decision made in a professional or casual setting?
(b) How high are the stakes?[6]
(c) How often does this decision recur?
(d) How much time is devoted to preference synthesis?
(e) Are the decision makers familiar with MCDA?

[6] For this question, the following broad categories suffice for this analysis:
• Low: minor impact on a few individuals
• Medium: major impact on a small organization or a few individuals
• High: major impact on a large organization (> 100 members)
• Very High: major impact on multiple large organizations or the general public

The first aim of this analysis is to validate and refine the data model. Section 3.3 proposes revisions to the data model to capture all relevant information and sources of variation. The updated model is presented in Section 3.4.

The second aim is to identify key preference synthesis goals and specific tasks that support them. Section 3.5 summarizes these findings.

The final aim is to characterize contextual factors that could inform system design. These findings are summarized in Section 3.6.

3.2.1 Best Paper at a Conference

This scenario was characterized through a one-on-one interview with a faculty member that was involved in selecting the best paper at a conference. Five researchers were tasked with choosing two papers for the best paper award out of four candidates that had been selected by the program chairs.

1. Decision Process

The five researchers met in person to choose the two best papers. They each ranked the papers according to their preferences, with ties permitted. Then, they summarized their rankings and had a discussion.

2. Formal Data Description

The decision makers were the five researchers, and the alternatives were the four papers. There were no explicit criteria. Each researcher ranked the alternatives, which corresponds to level P0a of the Preference Model Taxonomy.

3. Preference Synthesis Goals

The overarching goal was to arrive at a consensus through focused discussion. To achieve this, the decision makers combined their ranks in a spreadsheet with decision makers on rows, papers on columns, and ranks in cells (Figure 3.2). In the event of ties, the average of the spanned ranks was assigned to each of the tied papers. For instance, if Papers A - D were ranked 1, 2, 2, and 3 respectively, the adjusted ranks would be 1, 2.5, 2.5, and 4. A sum of ranks was computed for each paper to assess overall performance.
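The rank adjustment and rank-sum aggregation used in this scenario are easy to reproduce. The following sketch uses invented ranks rather than the actual data: it averages spanned ranks for ties and then sums the adjusted ranks per paper across researchers.

```python
# A minimal sketch of the aggregation used here: ranks with ties are replaced
# by the average of the spanned ranks, then summed per paper across the
# researchers. The example data below are illustrative, not the actual ranks.

def average_ranks(raw: list) -> list:
    """Replace tied ranks with the average of the positions they span."""
    order = sorted(range(len(raw)), key=lambda i: raw[i])
    adjusted = [0.0] * len(raw)
    pos = 0
    while pos < len(order):
        tied = [i for i in order if raw[i] == raw[order[pos]]]
        # Positions spanned by this tie group are pos+1 .. pos+len(tied).
        avg = sum(range(pos + 1, pos + len(tied) + 1)) / len(tied)
        for i in tied:
            adjusted[i] = avg
        pos += len(tied)
    return adjusted

print(average_ranks([1, 2, 2, 3]))  # [1.0, 2.5, 2.5, 4.0], as in the example

# Rank sum per paper across researchers (rows = researchers, cols = papers):
matrix = [average_ranks(r) for r in [[1, 2, 2, 3], [2, 1, 3, 4], [1, 3, 2, 4]]]
rank_sums = [sum(col) for col in zip(*matrix)]
print(rank_sums)  # lower sums indicate better overall performance
```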
Figure 3.2: Spreadsheet of ranks for each paper by researcher. (This is not the actual data.)

The researchers used the spreadsheet to identify disagreement among themselves, taking note of papers with high variability in rank. This focused the discussion on contentious points and encouraged the decision makers to reflect on their own assessments.

A pair of papers was chosen that minimized the total rank sum while also adhering to certain constraints (in this case, no two papers from the same author were to be selected). The decision makers agreed that the process was efficient and systematic compared to less formal approaches. The goals and tasks are summarized in Table 3.1.

Table 3.1: Preference Synthesis Goals and Tasks for Best Paper Scenario

High-Level Goals — Supporting Activities and Tasks
G1 Reach consensus
    Discussion, focused around G2 - G5.
G2 Identify papers with best overall performance
    T1: Compute rank sum for each paper
    T2: Compare papers with respect to rank sum
G3 Identify disagreement among decision makers
    T3: Compare paper ranks across decision makers
    T4: Identify rank discrepancies across decision makers and papers
G4 Encourage reflection on individual preferences
    T3: Compare paper ranks across decision makers
    T5: Identify discrepancies between a particular decision maker's rankings and others' rankings
G5 Understand reasons for disagreement
    Discussion

4. Contextual Features

This is a medium-stakes decision made in a professional setting over the course of a one-hour meeting. The decision is made annually, although the exact scenario may vary from year to year. The decision makers do not typically have MCDA knowledge.

3.2.2 Faculty Hiring

This scenario was characterized through one-on-one interviews with four faculty members of a research department at a major university. This department follows a semi-formal process to evaluate candidates for open faculty positions. The process is overseen by a hiring committee consisting of select faculty members and students. Two of the interviewees were members of the hiring committee, and the other two were voting members of the department.

1. Decision Process

The decision process has roughly four stages.

In the first stage, the candidate pool is winnowed via process of elimination. The applications are screened, and select candidates are asked to send letters. A subset of these candidates are contacted for Skype interviews. The candidates that pass the Skype interview are invited to visit the department in person.

In the second stage, the short-listed candidates give a talk at the department, meet students, and have one-on-one meetings with faculty. The department is invited to evaluate the candidate using a standardized form. Around 50 - 100 opinions are collected this way.

In the third stage, the hiring committee meets to decide who will receive an offer. If more than one candidate is approved for an offer, then the committee decides the order in which the offers will be given.

In the final stage, the committee presents its recommendation at a department meeting. Faculty members may vote to approve, disapprove, or abstain. These votes are consulted by the department head, who makes the final decision.

2. Formal Data Description

The overarching process is composed of many distinct decision problems. However, we focus on Stages 3 and 4, as these make use of formal evaluations collected from the department.

In this context, the alternatives are the short-listed candidates.
The explicit criteria are Research, Communication, Compatibility, Maturity, Research Fit, and Teaching Fit. In Stage 3, the decision makers are the members of the hiring committee. In Stage 4, the decision makers are the voting faculty members.[7]

Preferences are collected using a standard form that can be filled out by anyone in the department. It consists of 6-point scales for each of the six criteria. There is also an 'NA' (not applicable) option for each scale. Each scale is accompanied by a text field in which the user can justify their rating. Additionally, each evaluator is asked to rate their confidence in their evaluation as either 'Low', 'Medium', or 'High'. This preference model corresponds to level P1b of the taxonomy.

[7] Ultimately, the final decision is made by the department head, but this distinction is not critical to this analysis.

3. Preference Synthesis Goals

Prior to the hiring committee meeting, the quantitative results of the department survey are summarized in the form of histograms, as shown in Figures 3.3 and 3.4.

In Stage 3, committee members consider these results and discuss their own opinions. In addition to the explicit criteria, the committee members consider additional factors about the candidates, such as the number of papers published at top conferences and how well they complement current faculty members. Faculty members that work in the same area as the candidate (the 'in-area' faculty) are given more weight in the discussion.

Both interviewees on the hiring committee reported that the subjective feedback provided in text fields or expressed in conversation was much more important than the quantitative summaries. Because this is a high-stakes decision with a high level of personal investment, subtleties are taken seriously.

Figure 3.3: Here, the results are summarized as a matrix of histograms with criteria on rows and candidates on columns. Each bar encodes the frequency with which that candidate scored at a particular level for that criterion. The levels are VS: Very Strong, S: Strong, AP: At Par, W: Weak, VW: Very Weak, and NA: Not Applicable.

Figure 3.4: Here, the results are summarized separately for candidates A and B. The x-axis groups the results by score level, and the bars are colour-coded by criterion. Each bar encodes the percentage of reviewers that gave that candidate that score for that criterion. The levels are VS: Very Strong, S: Strong, AP: At Par, W: Weak, VW: Very Weak, and NA: Not Applicable.

In Stage 4, Figures 3.3 and 3.4 (or equivalent) are presented to the department along with selected text excerpts in order to justify the hiring committee's recommendation.

Two voting faculty members were interviewed following a department meeting. In order to assess how they used the information presented, they were asked the following questions:

1. Did the visual summary influence your decision? If so, how?
2. Is there any information that would have helped you make your decision?

Both interviewees said that the summary confirmed what they already suspected - that people generally liked the candidate. One interviewee said that the summary influenced him by corroborating his viewpoint. The other said it did not influence her because she was already convinced, but that it might have if it had revealed more controversy.

Both interviewees expressed a desire to see the breakdown of scores by role (student, in-area faculty, and other faculty) and confidence level (low, medium, and high), which was not provided by the visualization. One interviewee would have liked to read specific comments by people that gave negative feedback.
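The summarization behind the histograms of Figure 3.3 amounts to counting score levels per criterion for each candidate. The sketch below illustrates this with invented form responses; the score levels follow the form described above, but the data structure and example values are our own.

```python
# A minimal sketch of how a histogram row in Figure 3.3 can be derived:
# count, per criterion, how often each score level was chosen for one
# candidate. The form's levels are real; the responses below are invented.

from collections import Counter

LEVELS = ["VS", "S", "AP", "W", "VW", "NA"]

responses = [  # one dict per completed form for a single candidate
    {"Research": "VS", "Communication": "S", "Teaching Fit": "NA"},
    {"Research": "S", "Communication": "S", "Teaching Fit": "AP"},
    {"Research": "VS", "Communication": "AP", "Teaching Fit": "AP"},
]

def level_frequencies(responses: list) -> dict:
    """criterion -> {level: count}, i.e. one histogram row per criterion."""
    freq = {}
    for form in responses:
        for criterion, level in form.items():
            freq.setdefault(criterion, Counter())[level] += 1
    return freq

for criterion, counts in level_frequencies(responses).items():
    print(criterion, [counts.get(lvl, 0) for lvl in LEVELS])
```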
Table 3.2 provides a summary of the preference synthesis goals and tasks. For simplicity, the goals and tasks of Stage 3 and Stage 4 have been combined.

Table 3.2: Preference Synthesis Goals and Tasks for Faculty Hiring Scenario

High-Level Goals — Supporting Activities and Tasks
G1 Reach consensus through approval voting
    Facilitated discussion around G2 - G5.
G2 Gauge candidate performance across criteria
    T1: Count frequency of scores for that candidate on each criterion
    T2: Inspect distribution of scores for that candidate on each criterion
G3 Identify discrepancies in performance across candidates
    T3: Compare distribution of scores across candidates for each criterion
G4 Identify evaluators that might not be satisfied with a particular candidate
    T4: Count frequency of 'disagree' and 'strongly disagree' outcomes for each criterion and candidate
    (Identification of individual evaluators not supported)
G5 Identify disagreement across evaluator roles (student, in-area faculty, others)
    Not supported
G6 Identify disagreement across evaluator confidence levels (low, medium, high)
    Not supported
G7 Understand reasons for evaluator opinions
    Consult textual feedback (Stage 3 only)
G8 Understand reasons for voter opinions
    Discussion
G9 Give more weight to expert opinions
    Implicit (voters do this mentally)
    Grant experts dedicated time to make a case

4. Contextual Features

This is a high-stakes decision made in a professional setting. The hiring committee meeting lasts about 2 hours, while the department meeting devotes about 1 hour to the faculty hiring segment. This scenario recurs once or twice a month during recruiting season. The decision makers do not typically have MCDA knowledge.

3.2.3 Campbell River Watershed

This scenario was characterized by watching a webinar prepared by Compass, a Vancouver-based consulting firm that helps organizations tackle high-stakes decision problems using structured decision making techniques. In this scenario, Compass oversaw the selection of a new operation strategy for the Campbell River hydroelectric facilities on Vancouver Island. The process took three years and involved numerous stakeholders, including the Federal and Provincial Government, BC Hydro, local businesses, and First Nations.

1. Decision Process

The Campbell River Watershed is a major hydroelectric facility on Vancouver Island. The region is also one of cultural significance to First Nations peoples, home to multiple salmon species, and a popular recreation destination.

At the time, the Watershed consisted of three reservoirs and three river divisions. The goal was to devise a new operation strategy that would better appeal to a diverse set of interests.

A list of initial issues was collected through a series of public open houses. These issues were pared down and organized by interest group: flooding and erosion, fish and wildlife, recreation, water quality, and financial. Then, special subcommittees were formed for each interest group to identify key objectives and describe them in terms of measurable attributes. This process resulted in twelve objective-attribute pairs. The final set of criteria was produced by listing applicable objectives at each of five watershed locations, yielding a total of fifteen criteria (Figure 3.5).
A score function for each attribute was developed by an expert, which would apply to all stakeholders.

Meanwhile, six alternatives were devised by considering feasible strategic adjustments at each location in the watershed. The outcomes on each objective were estimated, and these were arranged in a consequence table (Figure 3.5).

Finally, fifteen stakeholders from different interest groups met to evaluate the alternatives with respect to their individual preferences. The process is described below. The best two alternatives were identified in this manner, and these were taken back to the drawing board for refinement. The final choice was made by consensus voting.

Figure 3.5: Consequence table for six operation strategies on fifteen criteria (derived by listing applicable objective-attribute pairs for each of five watershed locations).

2. Formal Data Description

The alternatives were the six operation strategies, and the decision makers were the fifteen stakeholders. The criteria were the twelve objectives.

Two types of preferences were collected: holistic and criteria-based. Holistic preferences were obtained by asking users to rank the alternatives in order of preference, with ties allowed. Then, they were asked to assign the highest ranked alternative a score of 100 and score the others relative to that. These preferences correspond to levels P0a and P0b in the taxonomy.

Criteria-based preferences were obtained by collecting weights using the SMARTER technique [23]. This corresponds to level P2b+w of the taxonomy, where the weights are supplied by individual decision makers and the score functions by an expert. A score for each alternative was calculated using Simple Additive Weighting (SAW) [30] over the weighted criteria scores.

The reason for collecting two types of preferences was to validate the model. Discrepancies between the two outcomes would indicate that one or both models is flawed: either the decision maker did not consider all criteria in her holistic assessment (the holistic model is flawed), or some criteria of interest to the decision maker are missing from the set (the criteria-based model is flawed). The results of this comparison were presented to each user in the form of a line graph (Figure 3.6).

Figure 3.6: Comparing two preference models for a participant.

3. Preference Synthesis Goals

The ultimate goal of preference synthesis was to support negotiation and help the stakeholders reach consensus. To this end, Compass provided each user with two sheets of paper, each featuring a different graphic.

Figure 3.7 shows each person her weights in the context of the range of weights for the whole group. The intent of this was to help the decision makers see how their priorities compare to the rest of the group, and to reveal criteria for which there was a wide range of opinions.

Figure 3.8 shows the ranking of each alternative for each person and scoring method. The purpose of this was to help the decision makers identify high-performing alternatives at a glance, and then to see which decision makers are not content with the top alternatives.

Figure 3.7: The range of weights assigned to each criterion across users. A yellow square denotes the weight assigned by that participant.

Figure 3.8: The performance of each alternative for each user's two preference models. The number in each cell represents rank, whereas colour encodes score.

After a period of discussion, alternatives G and H were selected for further refinement.
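The webinar did not detail the weight arithmetic, so the following sketch assumes the rank-order centroid (ROC) weights conventionally associated with the SMARTER technique [23]: the criterion ranked k-th out of N receives w_k = (1/N) * Σ_{i=k..N} 1/i. This is an illustrative assumption, not a confirmed account of Compass's computation.

```python
# A hedged sketch of SMARTER-style rank-order-centroid (ROC) weights,
# assuming the standard formula; whether Compass used exactly this variant
# is not confirmed by the webinar.

def roc_weights(n_criteria: int) -> list:
    """ROC weight for each rank position 1..n, descending in importance."""
    return [sum(1.0 / i for i in range(k, n_criteria + 1)) / n_criteria
            for k in range(1, n_criteria + 1)]

w = roc_weights(4)
print([round(x, 4) for x in w])  # [0.5208, 0.2708, 0.1458, 0.0625]
print(round(sum(w), 6))          # 1.0, as required of criterion weights
```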
The goals and tasks are summarized in Table 3.3.

Table 3.3: Preference Synthesis Goals and Tasks for Campbell River Scenario

High-Level Goals — Supporting Activities and Tasks
G1 Reach consensus
    Discussion, focused around G2 - G6.
G2 Identify differences in priorities among decision makers
    T1: Inspect distribution of weights for each criterion
G3 Identify differences in priorities between self and others
    T2: Compare own weight to distribution of weights for each criterion
G4 Identify strategies with best overall performance
    T3: Compare strategy scores and ranks across decision makers
G5 Identify decision makers that may not be satisfied with a particular strategy
    T4: Identify decision makers that assigned a low score to that strategy
G6 Understand reasons for disagreement
    Discussion

4. Contextual Features

This was a one-time, very high-stakes decision made in a professional setting. Preference synthesis took an entire day. The decision makers did not have MCDA knowledge themselves, but they were assisted by MCDA experts.

3.2.4 MJS77 Project

Dyer and Miles describe the first recorded application of MCDA methods to a real-world group decision problem [22]. The decision makers were NASA scientists, and the task was to choose a pair of trajectories for the Mariner Jupiter/Saturn 1977 (MJS77) Project. The resulting missions were later named Voyager 1 and Voyager 2.

1. Decision Process

The Jet Propulsion Laboratory (JPL) of NASA was tasked with selecting a pair of trajectories (flight paths) for two spacecraft that would be launched within days of each other. The trajectories had to be chosen jointly, as the merits of one depended on the other.

The choice of trajectory is a major factor in determining the mission's success, so the JPL recruited a team of eighty scientists for advice. The scientists were divided into ten teams by specialty. Each team was represented by its leader in an inter-team committee called the Science Steering Group (SSG).

An initial set of possible trajectory pairs was developed through back-and-forth consultation between the JPL and the science teams. This resulted in a set of 32 candidate pairs.

Then, each science team evaluated each of the candidate pairs by ranking and scoring them holistically, as described below. Each team was permitted to use whatever decision making process it wished. The JPL synthesized the results and presented them to the SSG. The final trajectory was selected by the SSG following a discussion.

2. Formal Data Description

The alternatives were the 32 trajectory pairs, and the decision makers were the ten science teams. Each team ranked the trajectory pairs, with ties permitted. This corresponds to level P0a in the taxonomy.

Scores on a cardinal scale were obtained using von Neumann-Morgenstern lotteries [60]. This elicitation method was selected due to its "theoretical consistency, wide acceptance, and ease of implementation" [22]. This yielded an expected utility score of 0 - 1 for each pair, which corresponds to level P0b in the taxonomy.

3. Preference Synthesis Goals

The synthesis of preferences was carried out by the JPL. The ranks were aggregated by summing across teams and dividing by the number of trajectory pairs. In the event of ties, the average of the spanned ranks was used. This aggregation method is identical to that used in the Best Paper scenario, with an additional rescaling step at the end.

The JPL tested eight different ways of aggregating the cardinal scores. In particular, they experimented with team weights, normalization procedures, and aggregation techniques. The purpose of this was to perform sensitivity analysis over different collective choice rules. The level of agreement was quantified using Kendall's coefficient of concordance, which came to 0.96. This is very high, and it suggests that this problem was not especially sensitive to any of these factors.
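For reference, Kendall's coefficient of concordance can be computed as follows. This sketch omits the correction for ties, so it is only an approximation of what the JPL would have computed for rankings that contain ties; the rankings shown are invented, not the MJS77 data.

```python
# A minimal sketch of Kendall's coefficient of concordance, without the tie
# correction: W = 12*S / (m^2 * (n^3 - n)), where m is the number of raters,
# n the number of items, and S the sum of squared deviations of rank sums.

def kendalls_w(rankings: list) -> float:
    """rankings: one full ranking (list of ranks over n items) per rater."""
    m, n = len(rankings), len(rankings[0])
    rank_sums = [sum(col) for col in zip(*rankings)]
    mean = sum(rank_sums) / n
    s = sum((r - mean) ** 2 for r in rank_sums)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))

# Three raters in near-perfect agreement over four items:
print(round(kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [2, 1, 3, 4]]), 3))  # 0.911
```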
Finally, the results of this analysis were presented to the SSG in the form of Figures 3.9 and 3.10. Three trajectory pairs (26, 29, and 31) were found to be in the top three for all collective choice rules. The scientists discussed the pros and cons of these three trajectories, and all but one team eventually agreed that trajectory 26 would be acceptable. Trajectory 26 was modified to address the concerns of the disapproving team and then approved by the project manager. The goals and tasks of preference synthesis are summarized in Table 3.4.

Table 3.4: Preference Synthesis Goals and Tasks for MJS77 Scenario

High-Level Goals — Supporting Activities and Tasks
G1 Reach consensus
    Facilitated discussion around G2 - G5.
G2 Identify high-performing trajectories
    T1: Identify trajectories ranked in top three across collective choice rules
G3 Identify teams that may not be satisfied with a particular trajectory
    T2: Identify teams that assigned that trajectory a low ranking (under 10)
G4 Understand sensitivity of outcome to collective choice rule
    T3: Compare trajectory rankings across nine collective choice rules
G5 Understand reasons for each team's rankings
    Discussion

Figure 3.9: The scores for the top 10 trajectory pairs on each of nine collective choice rules [22].

Figure 3.10: Ordinal rankings for the top 10 trajectory pairs for each of the 10 science teams (RSS ... PRA) [22]. In the event of ties, the numeric score for each pair is the average of the spanned ranks.

4. Contextual Features

This was a one-time, very high-stakes decision made in a professional setting. Preference synthesis took several days. The decision makers did not have MCDA knowledge themselves, but they were assisted by MCDA experts.

3.2.5 Nuclear Crisis Management

In this study, Mustajoki et al. ran a two-day workshop in which a group of participants planned countermeasures for a hypothetical nuclear emergency scenario [40]. It was one of the first attempts to demonstrate the efficacy of MCDA software in group decision making scenarios.

1. Decision Process

The participants in the study were authorities in nuclear emergency planning that would be responsible for devising a plan in the event of a real emergency.

The participants were split into six groups, and each group was assigned to a computer equipped with Web-HIPRE (HIerarchical PREference analysis on the Web), a decision support application based on MAVT [32].

The facilitator described the hypothetical scenario: an accident had taken place in a nuclear power plant in Finland. It was now a week later, and the fallout covered a major milk production area. The group was tasked with choosing the best strategy to mitigate damage.

The alternatives had been developed during prior meetings with experts. There were four possible strategies: provision of uncontaminated fodder ('Fod'), processing of milk into other products ('Prod'), banning the milk ('Ban'), and doing nothing ('–'). The alternatives were six realistic pairs of strategies over two time periods, weeks 2 - 5 and weeks 6 - 12 after the accident: ('–+–'), ('Fod+–'), ('Fod+Fod'), ('Prod+Fod'), ('Ban+Fod'), and ('Ban+Ban').

A preliminary set of criteria had also been developed during prior meetings with experts.
The conference group deliberated and narrowed these down to seven.

Each group then used the software to supply its preferences, as described below. The results were presented by the facilitator, and then approval voting was carried out for each of the possible alternatives. Two of the alternatives, ('Fod+Fod') and ('Fod+Prod'), were unanimously approved.

2. Formal Data Description

The decision makers were the six teams, and the alternatives were six pairs of strategies.

There were seven criteria arranged hierarchically into three groups: Health (Thyroid Cancer, Other Cancers); Social-Psychological (Reassurance, Anxiety, Industry, Feasibility); and Cost (Cost). These constituted a mix of subjective and objective criteria, although the authors did not specify which were which.

For subjective criteria, the groups directly rated each alternative on a 0 - 1 scale. For objective criteria, a common score function was defined by experts. Weights for criteria were obtained using a SWING weighting technique [61]. Taken together, the preference model is a hybrid of P1b+w and P2b+w.

3. Preference Synthesis Goals

The overarching goal was to reach consensus through approval voting on the alternatives.

First, the alternative scores for each team were projected onto a screen one by one. The facilitator led a discussion of each, pointing out essential characteristics and explaining how different criteria contribute to the overall score. The facilitator performed sensitivity analysis to demonstrate how changing the criteria weights can affect the outcome.

Individual models were combined by computing a weighted sum of total scores for each alternative. The results of this were projected onto the screen (Figure 3.11). At first, equal weights were assigned to the groups. Then, the facilitator performed sensitivity analysis to demonstrate how changing the weights of the groups can affect the outcome.

Web-HIPRE provides a visual breakdown of the scores of each alternative by group, as seen in Figure 3.11. The teams can also see a breakdown by criteria, or even switch the bars and segments such that the total score for each criterion is broken down by alternative.

The participants discussed the results, paying particular attention to groups whose preferences did not align with the others (Group 2). Eventually, the groups arrived at a consensus, and two of the alternatives, ('Fod+Fod') and ('Fod+Prod'), were unanimously approved.

Figure 3.11: Score for each alternative, broken down by group [40].

The process was well-received by the participants, and they were satisfied with their final decision. However, they felt it would be better suited for planning in advance than in the event of a real crisis.
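The group-weight sensitivity analysis performed by the facilitator can be approximated with a simple one-dimensional sweep. The sketch below uses hypothetical group totals and strategy names, not the workshop data, and simplifies to two groups for clarity.

```python
# A minimal sketch of group-weight sensitivity analysis: sweep the weight of
# one group and observe whether the winning alternative changes. The scores
# and group names below are hypothetical, not the Web-HIPRE workshop data.

totals = {  # totals[group][alternative]: each group's total score
    "g1": {"Fod+Fod": 0.8, "Ban+Fod": 0.6},
    "g2": {"Fod+Fod": 0.5, "Ban+Fod": 0.9},
}

def winner(weight_g1: float) -> str:
    """Best alternative when group g1 has weight w and g2 has weight 1 - w."""
    combined = {a: weight_g1 * totals["g1"][a] + (1 - weight_g1) * totals["g2"][a]
                for a in totals["g1"]}
    return max(combined, key=combined.get)

for w in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(w, winner(w))  # the preferred strategy flips as g1 gains weight
```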
Table 3.5: Preference Synthesis Goals and Tasks for Nuclear Crisis Scenario

High-Level Goals — Supporting Activities and Tasks
G1 Reach consensus through approval voting
    Facilitated discussion around G2 - G5.
G2 Identify best performing strategy for a particular group
    T1: Compute total score for each strategy for that group
    T2: Compare strategies with respect to total score for that group
    T3: Compare strategies with respect to criteria scores for that group
G3 Understand effect of criteria weights on outcome for a particular group
    T4: Perform sensitivity analysis on criteria weights for that group
G4 Identify best performing strategies overall
    T5: Compute total score for each strategy
    T6: Compare total scores of each strategy
G5 Identify disagreements on overall strategy performance
    T7: Compare strategies with respect to total score for each group
G6 Understand effect of group weights on outcome
    T8: Perform sensitivity analysis on group weights
G7 Understand contribution of each group to total score for each alternative
    T9: Inspect breakdown of total scores into scores for each group
G8 Understand reasons for disagreement
    Discussion

4. Contextual Features

This was a simulation of a one-time, very high-stakes decision made in a professional setting. Preference synthesis took an entire day. The decision makers did not have MCDA knowledge themselves, but they were assisted by MCDA experts (which may or may not be feasible in the event of a real crisis).

3.2.6 Technology Selection at XpertsCatch

This scenario was characterized by observing a team meeting of the software recruitment start-up XpertsCatch. During this meeting, the company decided which technology stack to use for their next product. The CTO and two senior employees were interviewed individually after the meeting.

This is not technically Group Preferential Choice since there was no formal preference modelling. However, the interviewees explained that it would have been feasible and useful to express their preferences formally, and they speculated about how they might do so in the future.

1. Decision Process

Prior to the meeting, the senior employees narrowed down their options to two stacks of interoperable technologies along six dimensions: language, database, data format, deploy target, back-end framework, and web server.

During the meeting, the CTO met with the engineering team, which consisted of two senior and two junior developers. The CTO and each engineer cast a vote for one of the two stacks and presented his or her supporting arguments. The CTO made the final decision, putting more weight on the arguments of the senior developers.

2. Formal Data Description

The decision maker was the CTO, and the alternatives were two possible stacks: (Javascript, MongoDB, JSON, Mobile HTML, Express, NodeJS) and (Python, MongoDB, XML, Android, Meteor, Apache).

Criteria and preferences were not explicitly modelled. However, the interviewees said that they implicitly evaluated the stacks based on the six technological dimensions and two whole-system criteria: learning curve and adaptability. Different interviewees expressed different priorities over the criteria.
For instance, the CTO cared most about deployment target because it affects the target demographic, whereas the back-end developer cared most about language because it affects his day-to-day productivity.

The interviewees said that if they were to use explicit preference modelling, they would treat each of the six technological dimensions as objective criteria and each of learning curve and adaptability as subjective criteria. Additionally, they would use weights to capture their priorities. They all agreed that explicit preference modelling would have been helpful for their analysis. Such a model would correspond to a hybrid of levels P2b+w and P1b+w.

3. Preference Synthesis Goals

As there was no formal preference modelling, there was no formal synthesis of preferences. Preferences were shared through conversation, and the expertise of each stakeholder was taken into account. As such, the elicitation, evaluation, and synthesis phases were intertwined.

Before the final decision was made, the CTO consulted with a developer that had voted for the other stack to confirm that he would accept the decision. He said that he would.

Table 3.6: Preference Synthesis Goals for XpertsCatch Scenario

High-Level Goals — Supporting Activities
G1 Choose best technology stack for the company
    Company meeting addressing G2 - G6
G2 Identify most preferred stack for each employee
    Elicit votes
G3 Identify most preferred stack overall
    Count votes
G4 Understand reasons for each employee's preference
    Hear supporting arguments
G5 Identify differences in preferences
    Compare supporting arguments
G6 Identify employees that do not prefer a particular stack
    Review votes
G7 Differentiate between senior and junior engineers
    Implicit (in CTO's head)

4. Contextual Features

This is a medium-stakes decision made in a professional setting over the course of a one-hour meeting. This or similar decisions are made about once a year. The decision makers do not have MCDA knowledge.

3.2.7 Buying a Gift for a Colleague

In this scenario, a research lab chose a gift to buy for a recently-graduated colleague, Oscar.[8] It was characterized by interviewing the person that led the process, as well as two other members of the lab.

[8] All names have been replaced.

Like the XpertsCatch case, this was not technically Group Preferential Choice since there was no formal preference modelling. However, the interviewees were able to speculate about how formal preference models might have been useful.

1. Decision Process

The lab members agreed that each person would contribute $10 - $20 toward the gift. One colleague, Sayid, volunteered to lead the selection process.

First, he made a list of possible gifts within that price range. He did not have time to consult the whole group, so he asked two colleagues that were close friends of Oscar to help him narrow down the list. After brainstorming, they agreed on some criteria: the gift should be around $150, high-quality, long-lasting, useful, and aesthetically appealing. One friend believed that the usefulness of the gift was more important than its aesthetic appeal, whereas the other thought that the colleague might prefer an artistic gift since he did a lot of sketching for his PhD. After conversing for about ten minutes, they narrowed down the list to three options.

Then, Sayid arranged a meeting with the whole lab. He presented the options and explained the criteria that they had considered. The lab then voted on the three options, and the gift with the most votes was purchased.
2. Formal Data Description

The decision makers were the lab members (ten in total), and the alternatives were the three gifts. The criteria were cost, quality, durability, usefulness, and aesthetic appeal. Another criterion, size, was used to screen options: only options that could fit in Oscar's backpack were considered. There was no formal preference modelling.

Two interviewees said that formal preference modelling would have been a good way to collect more opinions. There was a mixture of objective and subjective criteria, as well as differences in opinion over which criteria were most important. Therefore, the appropriate preference model would be a combination of levels P2b+w and P1b+w.

3. Preference Synthesis Goals

As there was no explicit preference modelling, there was no formal synthesis of preferences. Sayid gave a verbal synthesis of his and the two friends' opinions to the group. He presented the pros and cons of the three options, and the other colleagues used this information to decide how to vote.

Two of these colleagues were interviewed about the process. They were both happy with the outcome, but they wished that more opinions had been collected prior to the meeting. They noticed that not many people actively commented during the meeting, and they suspected this might have been due to the size of the group. One interviewee said that he would not feel comfortable expressing a contrary opinion in front of the others. The other interviewee said that she would put the most weight on the opinions of the organizer (Sayid) and Oscar's friends.

4. Contextual Features

This is a low-stakes decision made in a semi-professional setting over the course of a one-hour meeting. This or similar decisions are made about once a year. The decision makers do not have MCDA knowledge.

3.3 Data Model Revisions

This section proposes adjustments and extensions to the data model from Section 3.1 in order to fully and accurately capture the key aspects of all seven scenarios. These are summarized in Table 3.8. The final, updated data model is presented in Section 3.4.

3.3.1 Participant Roles

Several scenarios indicate that the current definition of decision maker is inadequate to capture the complexity of participant roles.

In the Faculty Hiring case, feedback from the department is considered by the hiring committee and by voters at the department meeting. In both cases, the preferences of non-voting stakeholders are taken into account.
The voters themselves may or may not be evaluators, depending on whether or not they completed the feedback form.

In the XpertsCatch case, the opinions and preferences of four engineers were considered, but ultimately, the CTO was responsible for the final decision.

Finally, in the MJS77 and Nuclear Crisis cases, groups of individuals operated as single decision-making units.

To address these subtleties, we add the following definitions to our model:

• A Stakeholder is an individual or group that is invested in the outcome of the decision.
• A Decision Maker is an individual, or a group functioning as an individual, that is responsible for reviewing a collection of preferences and making a decision accordingly, either through voting or acting independently.
• An Evaluator is an individual, or a group functioning as an individual, whose preferences are modelled and taken into account by the decision makers.

In light of the definition changes above, all references to Decision Maker in the data model definition are replaced with Evaluator (Section 3.4).

In the seven scenarios that we analyzed, the decision makers and the evaluators were all stakeholders. However, this might not always be the case, since non-stakeholder preferences may be taken into consideration for additional information. We limit the definition of Group Preferential Choice to scenarios where at least two of the evaluators are also stakeholders. But other types of problems may call for the synthesis of non-stakeholder preferences exclusively. For example, a shopper might want to inspect a summary of product reviews to inform her selection. This is not Group Preferential Choice, but the data and tasks may be similar, and so many of the same visual techniques may apply.

3.3.2 Criteria

There are situations where a common score function is defined for objective criteria (Campbell River, Nuclear Crisis). In these cases, all evaluators are assigned the same score function for that criterion. This may be appropriate when the relative values of different outcomes are generally agreed upon (e.g. cost) or require expert judgment (e.g. fish population). Our contact at Compass explained that this is normal, and that it is unusual for each evaluator to supply her own score function.

For this reason, we extend the definition of objective criteria to include an optional score function specification. Additionally, we add a new level P2w to the Preference Model Taxonomy that only includes weights. It sits below P2b+w because the score functions for the evaluators are implicitly encoded as the score functions for the criteria.

3.3.3 Evaluator Groups and Weights

The current data model does not provide a way to partition evaluators into groups. The Faculty Hiring, XpertsCatch, and Gift cases indicate that this would be useful for capturing different classes of evaluators. Furthermore, interviewees in the Faculty Hiring case expressed a desire to see a breakdown of the results by department role or confidence level. To support this, the data model would need to permit multiple ways of partitioning the evaluators into groups.

The current data model also does not provide a way to quantify evaluator importance, which may vary for a number of reasons including expertise, authority, or degree of investment in the outcome. For instance, in the Faculty Hiring case, the opinions of experts are valued more than those of non-experts.
As such, the data model should support weights for individual evaluators or evaluator groups.

To address these limitations, the following elements have been added to the data model:

• A set of Group Trees GT : gt_1...gt_u, |GT| ≥ 1, where:
– A Group Tree gt is a tree where the internal nodes are Groups and the leaf nodes are Evaluators from E.
– If preferences are collected at Level P0b or below:
∗ Each Group Tree has a weights function w(e) ∈ [0,1], where e ∈ Evaluators, and Σ_{i=1}^{|Evaluators|} w(e_i) = 1.

Each Group Tree represents a hierarchical partitioning of Evaluators into Groups, analogous to the hierarchical partitioning of Primitive Criteria into Abstract Criteria. We assume that GT always includes a default Group Tree that places every Evaluator under a single Group.

3.3.4 Preference Model Taxonomy

Table 3.7 shows which levels of the taxonomy are covered by which cases, indicated by checkmarks (X). In the XpertsCatch and Gift for Colleague columns, the checkmarks indicate the levels that would have been covered if preferences had been formally modelled (according to the interviewees).

There are no cases for levels P1a and P2a. In fact, a subsequent review of MCDM literature uncovered no recorded cases where this type of preference model was used, even when there is only one decision maker. Nevertheless, these levels will be retained for completeness.

Table 3.7: Coverage of Preference Model Taxonomy. The checkmarks indicate which levels are present in each scenario. Checkmarks with an asterisk are hypothetical.

        Faculty Hiring  Best Paper  Campbell River  Voyager  Nuclear Crisis  XpertsCatch  Gift for Colleague
P0a                     X           X               X
P0b                                 X               X
P1a
P1b     X
P1b+w                                                        X               X*           X*
P2a
P2b
P2b+w                                                                        X*           X*
P2w                                 X                        X

Finally, we return to the list of potential simplifying assumptions:

1. The preferences do not span multiple levels. That is, there is not a mix of objective and subjective criteria or ordinal and cardinal evaluations.

2. All decision makers express their preferences at the same level(s) of the taxonomy.

3. The preferences are complete, that is:
(a) At level P0, every decision maker ranks (or scores) every alternative.
(b) At level P1, every decision maker ranks (or scores) every alternative with respect to every criterion.
(c) At level P2, every decision maker ranks (or scores) every outcome for every criterion.

4. The preferences and weights are treated as certain. That is, there is no fuzziness.

Assumptions 2 and 4 hold in all scenarios.

Assumption 1 is violated by the Nuclear Crisis, XpertsCatch, and Gift scenarios, which each use a mix of objective and subjective criteria. This should not have major implications for visualization design - it simply means that designs may need to handle more heterogeneity.

Assumption 3 is violated by the Faculty Hiring scenario because evaluators can select 'NA' for any of the criteria. Furthermore, in cases where multiple candidates are being considered, evaluators are not required to evaluate all candidates. Missing values could pose a significant challenge for both the mathematical model and the visual design, but the problem does not appear to be ubiquitous in Group Preferential Choice scenarios.
For this reason, we will maintain this assumption going forward and leave the missing values problem to future work.

Table 3.8: Summary of Data Model Issues from Scenarios

Participant Roles
    Issue: The relationship between decision makers and evaluators may not be one-to-one (Faculty Hiring, XpertsCatch).
    Solution: Distinguish between decision maker, stakeholder, and evaluator.
    Issue: Decision makers and evaluators may be groups functioning as individuals (MJS77, Nuclear Crisis).
    Solution: Revise role definitions accordingly.
Criteria
    Issue: A common score function may be defined for an objective criterion (Campbell River, Nuclear Crisis).
    Solution: Add an optional score function to the objective criterion definition; add level P2w to the Preference Model Taxonomy.
Evaluator Groups and Weights
    Issue: Decision makers may want to partition the evaluators into groups, and then assign different weights to different groups (Faculty Hiring, XpertsCatch, Gift).
    Solution: Introduce Group Trees, Groups, and Group Weights.
    Issue: Decision makers may want to assign weights to individual evaluators (MJS77, Nuclear Crisis, XpertsCatch, Gift).
    Solution: Same as above (assign individual evaluators to their own group).

3.4 Revised Data Model for Group Preferential Choice

We define Group Preferential Choice as a situation where one or more decision makers must jointly choose from a set of alternatives based on two or more stakeholders' preferences over the alternatives.

• A set of Alternatives A : {a_1...a_m}, m ≥ 1
• A set of Evaluators E : e_1...e_n, |E| ≥ 2, each of whom has a Preference Model (described below)
• A set of Criteria C : {c_1...c_r}, r ≥ 0
• A set of Primitive Criteria PC ⊂ C : {pc_1...pc_s}, s ≥ 0
• A set of Abstract Criteria AC = C \ PC
• A Criteria Tree T where the set of nodes in T is equal to C, and the set of leaf nodes in T is equal to PC. This models the criteria taxonomy.
• A set of Group Trees GT : gt_1...gt_u, |GT| ≥ 1, where:
– A Group Tree gt is a tree where the internal nodes are Groups and the leaf nodes are Evaluators from E.
– If preferences are collected at Level P0b or below:
∗ Each Group Tree has a weights function w(e) ∈ [0,1], where e ∈ Evaluators, and Σ_{i=1}^{|Evaluators|} w(e_i) = 1.

Criteria may be objective or subjective depending on whether their outcomes are measurable facts or personal judgments. For any objective criterion, there are the following additional elements:

• A Domain function dom(pc) where pc ∈ PC, which defines the possible outcomes for criterion pc. The domain may be a discrete set (ordered or unordered) or a continuous range.
• An Outcome function out(a, pc) ∈ dom(pc) where a ∈ A and pc ∈ PC, which defines the outcome of alternative a on criterion pc.
• An optional Score function score(out, pc) ∈ [0,1] where pc ∈ PC and out ∈ dom(pc), with score(out_worst, pc) = 0 and score(out_best, pc) = 1.

3.4.1 Preference Model Taxonomy

Level P0: The evaluators evaluate the alternatives holistically.

a. Ordinal evaluation. Each evaluator ranks the alternatives. Preferences can be modeled as a function r_e(a) ∈ [1, |A|] where:
1. a ∈ A and e ∈ E
2. If a_best is the most preferred alternative for evaluator e, then r_e(a_best) = 1
3. r_e(a_1) < r_e(a_2) if and only if e prefers a_1 to a_2

b. Cardinal evaluation. Each evaluator scores each alternative along a common linear scale. Preferences can be modeled as a function s_e(a) ∈ [min, max] where:
1. a ∈ A and e ∈ E
2. min and max are the minimum and maximum points on a linear scale common to all evaluators

Level P1: The evaluators evaluate each alternative with respect to each criterion.

a. Ordinal evaluation. Each evaluator ranks the alternatives with respect to each criterion.
Preferences can be modeled as a function re(a, pc) ∈ [1, |A|] where:
1. a ∈ A, e ∈ E, and pc ∈ PC
2. If abest is the most preferred alternative for evaluator e on criterion pc, then re(abest, pc) = 1
3. re(a1, pc) < re(a2, pc) if and only if e prefers a1 to a2 on criterion pc

b. Cardinal evaluation. Each evaluator scores each alternative with respect to each criterion along a common linear scale. Preferences can be modeled as a function se(a, pc) ∈ [minpc, maxpc] where:
1. a ∈ A, e ∈ E, and pc ∈ PC
2. minpc and maxpc are the minimum and maximum points on a linear scale for pc common to all evaluators

b+w. Same as above, with the addition of weights specifying the relative value of switching from the worst to the best outcome on each criterion. This can be modeled as a function we(pc) ∈ [0,1], where e ∈ E, pc ∈ PC, and

∑_{i=1}^{|PC|} we(pci) = 1    (3.2)

At this level, the raw (unweighted) preferences are specified by the function uwse(a, pc), while the weighted preferences are specified by the function se(a, pc) = uwse(a, pc) ∗ we(pc).

Level P2: The evaluators evaluate each possible outcome of each criterion.

a. Ordinal evaluation. Each evaluator ranks the possible outcomes of each criterion. (This is only applicable for criteria with discrete domains.) Preferences can be modeled as a function re(out, pc) ∈ [1, |dom(pc)|] where:
1. e ∈ E, pc ∈ PC, and out ∈ dom(pc)
2. If outbest is the most preferred outcome for evaluator e on criterion pc, then re(outbest, pc) = 1
3. re(out1, pc) < re(out2, pc) if and only if e prefers out1 to out2 on criterion pc

b. Cardinal evaluation. Each evaluator scores each possible outcome of each criterion along a common linear scale. Preferences can be modeled as a function se(out, pc) ∈ [minpc, maxpc] where:
1. e ∈ E, pc ∈ PC, and out ∈ dom(pc)
2. minpc and maxpc are the minimum and maximum points on a linear scale for pc common to all evaluators

b+w. Same as above, with the addition of weights specifying the relative value of switching from the worst to the best outcome on each criterion. This can be modeled in the same manner as Level P1b+w. At this level, the raw (unweighted) preferences are specified by the function uwse(out, pc), while the weighted preferences are specified by the function se(out, pc) = uwse(out, pc) ∗ we(pc).

w. Same as above, except with weights only. (Assumes that a common score function is defined for each primitive criterion.)
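To make the Level P1b+w arithmetic concrete, here is a small sketch with invented numbers. The only formula taken from the model is se(a, pc) = uwse(a, pc) ∗ we(pc); rolling the weighted criterion scores up into a holistic score by summation is our assumption, consistent with the additive aggregation adopted later (Section 4.1):

```python
# Unweighted scores uws[e][a][pc] and criterion weights w[e][pc]; values invented.
uws = {"e1": {"a1": {"price": 0.8, "quality": 0.4},
              "a2": {"price": 0.3, "quality": 0.9}}}
w = {"e1": {"price": 0.7, "quality": 0.3}}  # must sum to 1, per Equation 3.2

def alt_crit_score(e, a, pc):
    """The weighted preference: se(a, pc) = uwse(a, pc) * we(pc)."""
    return uws[e][a][pc] * w[e][pc]

def alt_score(e, a):
    """Assumed roll-up: sum the weighted criterion scores for alternative a."""
    return sum(alt_crit_score(e, a, pc) for pc in w[e])

print(round(alt_score("e1", "a1"), 2))  # 0.8*0.7 + 0.4*0.3 = 0.68
```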
3.5 Summary of Preference Synthesis Goals

This section collates the scenario-specific goals into scenario-independent goals for preference synthesis in the context of Group Preferential Choice.

In all scenarios, the overarching goal is to arrive at consensus or make a well-informed decision that most stakeholders can accept. This is primarily achieved through discussion, with the quantitative summaries serving as a guide. This is a key point - in none of the scenarios did the quantitative summaries completely supplant verbal exchange. Rather, the role of quantitative summaries was to focus analysis on points of interest, which can greatly enhance the efficiency of the process. In particular, decision makers used the quantitative summaries to:

1. Discover viable alternatives
2. Discover sources of disagreement
3. Explain individual scores

The first item narrows the scope of analysis to alternatives that show promise. This task is often paired with identifying evaluators that gave these alternatives low scores or ranks (Faculty Hiring, XpertsCatch). Then, these evaluators can explain why they felt this way. If the decision makers have the option of selecting no alternatives, this also involves weighing alternatives against the status quo.

The second is concerned with identifying sources of disagreement among evaluators. In order to reach consensus, the decision makers need to understand how their preferences differ so they can negotiate and make compromises. Variations on this goal occur in all seven scenarios.

The third refers to the process of decomposing an individual evaluator's score into its constituents. This is necessary to support the second goal of understanding points of contention, and it also allows evaluators to understand how different aspects of their preferences (e.g. weights, score functions) contribute to their total scores.

These three goals pertain to understanding the model - an additional goal is to validate the model. In practice, it is not uncommon for evaluators to adjust their preferences after the first round of preference synthesis [51]. The accuracy and robustness of the model can be tested by encouraging reflection (Best Paper, Nuclear Crisis), collecting preferences at multiple levels of the taxonomy (Campbell River), or testing different aggregation techniques (MJS77). The process of observing how changes to inputs influence outputs is called sensitivity analysis. If inconsistencies or inadequacies are discovered, evaluators should be given an opportunity to adjust their preferences. In some cases, it may also be necessary for the decision makers to revise the criteria or alternatives.

Finally, quantitative models are seldom sufficient to fully capture individual preferences. So, a final goal is to discover nuances that the explicit preference models do not capture. This is achieved by engaging in discussion (all scenarios) or consulting textual feedback if not all evaluators are present (Faculty Hiring).

Table 3.9 presents these goals in list form and relates them to scenario-specific goals. The scenario goals are indexed by XX.YY, where XX is the scenario ID and YY is the goal ID in that scenario's Goals table. The scenario IDs are:

• BP = Best Paper (Table 3.1)
• FH = Faculty Hiring (Table 3.2)
• CR = Campbell River (Table 3.3)
• MS = MJS77 (Table 3.4)
• NC = Nuclear Crisis (Table 3.5)
• XC = XpertsCatch (Table 3.6)

Table 3.9: Goals for Preference Synthesis in Group Preferential Choice

G1 Discover Viable Alternatives
G1a Discover high-performing alternatives across evaluators/evaluator groups | BP.G2, CR.G4, MS.G2, NC.G4, XC.G3
G1b Discover high-performing alternatives across criteria (aggregated over evaluators) | FH.G2, FH.G3
G1c Discover high-performing alternatives for a single evaluator/evaluator group | XC.G2

G2 Discover Sources of Disagreement (i.e. find discrepancies across evaluators)
G2a Discover and explain disagreement about an alternative (across evaluators/evaluator groups) | BP.G3, FH.G4, FH.G5, FH.G6, CR.G5, MS.G3, NC.G5, XC.G6, XC.G7
G2b Discover differences in preference models (across evaluators/evaluator groups) | CR.G2, CR.G3, XC.G5

G3 Explain Individual Scores
G3a Analyze contribution of different criteria to an alternative's score (for a single evaluator/evaluator group) | NC.G2
G3b Analyze contribution of different parts of the preference model (e.g. weights) to an alternative's score (for a single evaluator/evaluator group) | (none)
G3c Analyze contribution of different evaluators and evaluator weights to an alternative's total score | NC.G7

G4 Validate Model
G4a Understand sensitivity of evaluator scores to evaluator preference models | NC.G3
G4b Understand sensitivity of total scores to group weights | MS.G4, NC.G6
G4c Understand sensitivity of total scores to aggregation method | MS.G4
G4d Discover discrepancies between one's own preferences and others' preferences | BP.G4

G5 Discover nuances in evaluators' preferences that are not captured by the preference models | BP.G5, FH.G8, CR.G6, MS.G5, NC.G8, XC.G4

3.6 Summary of Contextual Features and Scale

The scenarios divide roughly into three clusters based on contextual features (Table 3.10). The first cluster consists of very high-stakes, one-time decision problems that the decision makers devote one or more days to analyzing with the help of MCDA experts (MJS77, Nuclear Crisis, and Campbell River). The second consists of high and medium-stakes decision problems that recur annually or monthly, which the decision makers devote only a few hours to analyzing (Faculty Hiring, Best Paper, and XpertsCatch). The final cluster is a low-stakes decision made in a more casual setting (Gift).

Table 3.10: Contextual Features of Seven Scenarios
(columns: MJS77 Project | Nuclear Crisis | Campbell River | Faculty Hiring | Best Paper | XpertsCatch | Gift for Colleague)

Assessment Method: Journal Article | Journal Article | Webinar | Interview | Interview | Interview + observation | Interview
Work Context: Professional | Professional | Professional | Professional | Professional | Professional | Semi-professional
Frequency: Once | Once | Once | Monthly | Annually | Annually | Once
Stakes: Very High | Very High | Very High | High | Medium | Medium | Low
Preference Model(s): P0a, P0b | P2w, P1b | P0a, P0b, P2w | P1b | P0a | P2b, P1b | P2b, P1b
Time Allowance: Several days | 1 day | 1 day | 1 - 2 hours | 1 hour | 1 hour | 1 hour
# Evaluators: 11 | 6 | 15 | 50 - 100 | 5 - 10 | 5 | 10
# Alternatives: 32 | 4 | 6 | 1 - 4 | 4 - 15 | 2 | 3
# Criteria: NA | 7 | 12 | 6 | NA | 8 | 5

Each of these clusters is likely to have somewhat different requirements for its support system. Decision makers in the first cluster may benefit the most from advanced analytic features, since they have the time, incentive, and expertise to take advantage of them. Decision makers in the second cluster are more likely to benefit from systems that are easy to learn and deliver insights quickly. If the system is too complex or cumbersome, decision makers in this cluster may not be willing to put in the effort to learn and use them. The third cluster may have an even greater preference for usability over sophistication.

In most of these scenarios, the decision problem dimensions (number of evaluators, alternatives, and criteria) do not exceed twenty. This is reassuring from a design standpoint, as it suggests that a variety of problems can be addressed without encountering major scalability issues. The two exceptions are the number of alternatives in the MJS77 scenario and the number of evaluators in the Faculty Hiring scenario. In the former case, it is conceivable that the initial list could have been winnowed further prior to complex preference modelling.

The Faculty Hiring scenario, on the other hand, is an outlier in more ways than one. First, the number of evaluators is much higher than in any other scenario. Second, it is the only scenario in which not all evaluators are present during the preference synthesis. This may be why textual data is highly valued in this scenario - the evaluators are not always around to clarify their preferences in person.
It is unclear at this point if the Faculty Hiring scenario is sufficiently different from the others to warrant its own design space.

3.7 Conclusion

The primary goal of this chapter was to characterize sources of variation among Group Preferential Choice scenarios. We conclude with a summary of the similarities and differences that were discovered.

Data

Similarities. By definition, all Group Preferential Choice scenarios have alternatives, evaluators, and a rank or score for each alternative-evaluator combination. In all but two scenarios, the number of evaluators, alternatives, and criteria did not exceed twenty.

Differences. Between the seven scenarios, six different levels of the Preference Model Taxonomy were represented (Table 3.7). Two of the scenarios had non-flat criteria hierarchies (that is, they had abstract criteria other than the implicit root). Three scenarios had or would have benefited from a non-flat evaluator hierarchy, and four had or would have benefited from evaluator weights (Table 3.8). In two scenarios, the relationship between decision makers and evaluators was not one-to-one (Table 3.8).

Goals

In Section 3.5, the following goals were identified in three or more scenarios:

• G1a. Discover high-performing alternatives across evaluators/evaluator groups
• G2a. Discover and explain disagreement about an alternative
• G5. Discover nuances in evaluators' preferences that are not captured by the preference models

The remaining goals were associated with at most two scenarios each.

Context

As described in Section 3.6, the scenarios divide roughly into three clusters with similar features. There is considerable variation between clusters and some variation within.

Chapter 4

Data and Task Abstraction for Preference Synthesis in Group Preferential Choice

The goal of this chapter is to produce an abstract data and task model for preference synthesis in the context of Group Preferential Choice. The resulting model is intended to be broad enough to cover a variety of real-world scenarios but detailed enough to guide requirements analysis for support tools. This model will inform the analysis of potential visual encodings and interactions in Chapter 5.

Section 4.1 describes our existing data model (Section 3.4) in terms of a new abstraction based on tables, which is more suitable for visualization design and analysis.

Section 4.2 develops a task model by relating the goals identified in Section 3.5 to tasks described in terms of a taxonomy by Brehmer and Munzner [7]. Section 4.2.1 describes high-level tasks on Group Preferential Choice data, and Section 4.2.2 decomposes each of these tasks into lower-level tasks on generic data types.

4.1 Data Abstraction

In order to assess the suitability of different visual encodings, it is helpful to describe the data in terms of multidimensional tables, which are datasets consisting of attributes and dimensions, where an attribute is something that can be measured and a dimension is a set of entities for which an attribute can be defined [38]. The entities of a dimension are called keys, and the specific instances of an attribute are called values. Types of attributes include categorical, ordinal, and quantitative [38].

Multidimensional tables form the basis of OLAP (Online Analytical Processing), a popular Business Intelligence paradigm that is integral to Analytics tools such as Microsoft Excel and Tableau [16] [53]. In this context, multidimensional tables are called data cubes and the term measure is used in lieu of attribute.
Because attribute is also a synonym for criterion in MCDA, we will also use the term measure instead.

As an example of these concepts, say that the cosmetics department of Macy's Gotham made a $2000 profit on 11-11-2009. In this case, Department, Store Location, and Date are dimensions and Profit is a measure. Cosmetics, Gotham, and 11-11-2009 are keys, and $2000 is the value for Profit defined by these keys.

Measures can be further divided into basic measures, which the user supplies, and derived measures, which can be computed from the basic measures. The dimensionality of a measure is the set of dimensions whose keys map to a single value for that measure. In the example above, the dimensionality of Profit is {Department, Store Location, Date}.

In Group Preferential Choice, each level of the Preference Model Taxonomy is defined by one or two basic measures, as summarized in Table 4.1. All measures are quantitative except for Outcome, whose type depends on the domain of the criterion. Referring back to the terms and notation introduced in Section 3.8, the dimensions are Evaluators, Criteria, Alternatives, and Outcomes, and their keys are E, PC, A, and ⋃_{i=1}^{|PC|} dom(pci), respectively.¹

OLAP also supports the specification of hierarchies, which impose hierarchical arrangements on the entities of a dimension [54]. In Group Preferential Choice, the Criteria dimension has a hierarchy that is specified by the Criteria Tree. The Evaluators dimension has one hierarchy for each Group Tree that is defined.

¹ This abstraction makes the simplifying assumption that the Outcomes dimension is independent of the Criteria dimension. This is not especially problematic - we can simply treat the nonsensical intersections as undefined.

The derived measures at each level of the taxonomy include all basic measures that are defined by any of its descendants (see Figure 3.1). Additional derived measures can be obtained via roll-up, which is the process of aggregating over (that is, factoring out) a dimension or aggregating within a dimension to a higher level of some hierarchy. Applicable to this analysis are the following derived measures:

1. The aggregate of any basic measure except Outcome for an evaluator group (aggregating up the Evaluators hierarchy)
2. The aggregate of AltCritRank, AltCritScore, or CritWeight for an abstract criterion (aggregating up the Criteria hierarchy)
3. The TotalRank/TotalScore for an alternative, which is the aggregate of AltRank/AltScore over evaluators (factoring out the Evaluators dimension)

There are numerous ways that values can be aggregated, but we assume that aggregate totals are obtained via summation.
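As an illustration of these roll-ups, the sketch below derives TotalScores and evaluator-group aggregates from a toy AltScores cube. The dictionary representation and all data values are our own assumptions:

```python
# AltScores keyed by (alternative, evaluator); names and values are invented.
alt_score = {("a1", "e1"): 0.25, ("a1", "e2"): 0.5,
             ("a2", "e1"): 0.75, ("a2", "e2"): 0.5}
alternatives = ["a1", "a2"]
evaluators = ["e1", "e2"]

def total_score(a):
    """Derived measure 3: factor out the Evaluators dimension by summation."""
    return sum(alt_score[(a, e)] for e in evaluators)

def group_score(a, group):
    """Derived measure 1: aggregate within the Evaluators hierarchy."""
    return sum(alt_score[(a, e)] for e in group)

print({a: total_score(a) for a in alternatives})  # {'a1': 0.75, 'a2': 1.25}
print(group_score("a1", ["e2"]))                  # a single-evaluator 'group'
```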
Table 4.1: Basic Measures. The formulae refer to those defined in Section 3.4. The dimensionality of a measure is the set of dimensions whose keys map to a single value for that measure. In other words, they are the inputs to the formula for that measure.

Taxonomy Level | Basic Measure | Formula | Dimensionality
P0b and descendants | EvaluatorWeight | w(g) | {Evaluators}
P0a | AltRank | re(a) | {Evaluators, Alternatives}
P0b | AltScore | se(a) | {Evaluators, Alternatives}
P0b | UnweightedAltScore | uwse(a) | {Evaluators, Alternatives}
P1a | AltCritRank | re(a, pc) | {Evaluators, Alternatives, Criteria}
P1b | AltCritScore | se(a, pc) | {Evaluators, Alternatives, Criteria}
P1b+w | UnweightedAltCritScore | uwse(a, pc) | {Evaluators, Alternatives, Criteria}
P2a | OutRank | re(out, pc) | {Evaluators, Criteria, Outcomes}
P2b | OutScore | se(out, pc) | {Evaluators, Criteria, Outcomes}
P2b+w | UnweightedOutScore | uwse(out, pc) | {Evaluators, Criteria, Outcomes}
P1b+w, P2b+w | CritWeight | we(pc) | {Evaluators, Criteria}
P2a/b/b+w | Outcome | outa(pc) | {Alternatives, Criteria}

4.2 Task Abstraction

In the next two sections, we relate each of the goals in Table 3.9 to abstract tasks based on Brehmer and Munzner's typology of visualization tasks [7] (Figure 4.1). This typology is rooted in Munzner's nested model of visualization design, which separates data/task abstraction from consideration of visual encodings/interaction idioms [37]. The idea is that designers should be able to describe why a task is performed and what data it is performed on independently of how it is achieved. At this stage, we are only concerned with the what and why levels of description, as our aim is to develop a task model that is independent of any particular system.

Figure 4.1: Brehmer and Munzner's typology of abstract visualization tasks [7]. The why group consists of actions arranged hierarchically from high to low level. The what group encapsulates the targets, which are separated into inputs and outputs.

4.2.1 High-Level Task Abstraction

All of the goals in Table 3.9 are instances of the high-level task Consume: Discover, which covers many facets of inquiry [7]. The terms Explain (G3), Analyze (G3a, G3b), Verify (G4), and Understand (G4a - G4d) are all included in the list of vocabulary related to the Discover task [7].

Tables 4.2 - 4.5 relate each of the sub-goals in Table 3.9 to high-level tasks that support that goal. Although these tasks are lower-level than the goals, they are still high-level from the perspective of the task typology since they, too, fall under the umbrella of Discover. In these descriptions, criterion may refer to a primitive criterion or an abstract criterion, and evaluator may refer to a single evaluator or an evaluator group, unless otherwise noted. For abstract criteria and evaluator groups, the respective value will be an aggregate as described in the previous section.

These tasks were identified in two ways: (a) revisiting the scenario-specific goals and coding them as tasks (if they were more specific than the generic goal they were grouped with) and (b) brainstorming tasks that were missing from the scenarios but could clearly support the goal in question.

Table 4.2 shows tasks to support G1: Discover Viable Alternatives. Supporting tasks for G1a include finding alternatives with high overall scores (T1) or low variation in scores across evaluators (T2), as these may constitute viable 'compromise' alternatives. Another way to focus the analysis is to identify non-dominated alternatives (T3), which can minimize distraction and interference from others. To narrow the list further, it is essential to be able to consider trade-offs between competitive alternatives (T4). Finally, it may be necessary to look at the absolute pros and cons of one alternative (where the 'pros' are evaluators with high scores and the 'cons' are evaluators with low scores), especially if selecting no alternatives is an option (T5).
G1b is about identifying high-performing alternatives across criteria after evaluators have been factored out. In addition to a high overall score (T1), consistent performance across criteria may be desirable (T6). As with T3, it may be useful to focus analysis on non-dominated alternatives in criteria-space (T7). Finally, one might want to look at the relative strengths and weaknesses (that is, the trade-offs) between a pair of alternatives (T8), or the absolute strengths and weaknesses of one alternative (T9).

G1c is about identifying high-performing alternatives for a particular evaluator of interest, such as oneself. Tasks T11 - T14 are analogous to Tasks T6 - T9 except that they target a particular evaluator instead of the aggregate over all evaluators.

Table 4.2: Tasks to Support G1: Discover Viable Alternatives. Viable alternatives may include those with high overall scores (T1) or low variation in scores (T2), as these may constitute viable 'compromise' alternatives. Discovering non-dominated alternatives (T3) can focus the analysis on competitive alternatives and minimize distraction and interference from the others.

TASK | Applicable Levels

G1a. Discover high-performing alternatives across evaluators
T1 Discover alternative(s) with best TotalRank/TotalScore | Any
T2 Discover alternative(s) with low variance in AltRanks/AltScores across evaluators | Any
T3 Discover non-dominated alternatives across evaluators | Any
T4 Discover trade-offs in AltRanks/AltScores between alternatives a and b | Any
T5 Discover pros and cons in AltRanks/AltScores for alternative a | Any

G1b. Discover high-performing alternatives across criteria
T6 Discover alternative(s) with low variance in AltCritRanks/AltCritScores across criteria (aggregated over evaluators) | P1a and descendants
T7 Discover non-dominated alternatives across criteria (aggregated over evaluators) | P1a and descendants
T8 Discover trade-offs in AltCritRanks/AltCritScores between alternatives a and b (aggregated over evaluators) | P1a and descendants
T9 Discover strengths and weaknesses of alternative a (aggregated over evaluators) | P1a and descendants

G1c. Discover high-performing alternatives for a single evaluator
T10 Discover alternative(s) with best AltRank/AltScore for evaluator e | Any
T11 Discover alternative(s) with low variance in AltCritRank/AltCritScore across criteria for evaluator e | P1a and descendants
T12 Discover non-dominated alternatives across criteria for evaluator e | P1a and descendants
T13 Discover trade-offs in AltCritRanks/AltCritScores between alternatives a and b for evaluator e | P1a and descendants
T14 Discover strengths and weaknesses of alternative a for evaluator e | P1a and descendants

Table 4.3 shows tasks to support G2: Discover Sources of Disagreement. At the highest level, disagreement can be discovered by finding alternatives with high variance in scores, as these are more likely to be controversial (T15). Once an interesting alternative has been identified (either through G1 or G2a), one can zero in on evaluators with dissenting opinions (T16) or criteria that are responsible for the controversy (T17).

Another approach to discovering sources of disagreement is to look directly at the preferences. If weights are included in the model, one can look for criteria with high variance in weights (T18) and then identify dissenters (T19). This can also be done for score functions if they are included (T20 and T21).
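A sketch of how the variance and outlier browsing behind these tasks might be computed, assuming a simple standard-deviation rule (the task model does not prescribe a particular statistic) and invented scores:

```python
import statistics

# AltScores for one alternative across five evaluators (T15/T16); values invented.
scores = {"e1": 0.8, "e2": 0.85, "e3": 0.9, "e4": 0.75, "e5": 0.2}

variance = statistics.pvariance(scores.values())  # AT5: summarize spread
mean = statistics.mean(scores.values())
stdev = statistics.pstdev(scores.values())

# AT8 then AT1: browse for outliers, then identify them by key.
outliers = [e for e, s in scores.items() if abs(s - mean) > 1.5 * stdev]
print(round(variance, 3), outliers)  # e5 dissents on this alternative
```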
Table 4.3: Tasks to Support G2: Discover Sources of Disagreement. These tasks home in on where the disagreement is (T15) and who is disagreeing (T16), bringing dissenting viewpoints to light.

TASK | Applicable Levels

G2a. Discover and explain disagreement about an alternative
T15 Discover alternative(s) with high variance in AltRank/AltScore across evaluators | Any
T16 Discover evaluators that are outliers with respect to AltRank/AltScore for alternative a | Any
T17 Discover criteria with high variance in AltCritRank/AltCritScore across evaluators for alternative a | P1a and descendants

G2b. Discover differences in preference models
T18 Discover criteria with high variance in CritWeights across evaluators | P1b+w, P2b+w
T19 Discover evaluators that are outliers with respect to CritWeights for criterion c | P1b+w, P2b+w
T20 Discover criteria outcomes with high variance in OutRanks/OutScores across evaluators | P2b and descendants
T21 Discover evaluators that are outliers with respect to OutRanks/OutScores for outcome o of primitive criterion pc | P2b and descendants

Table 4.4 shows tasks to support G3: Explain Individual Scores. Each of these tasks involves breaking down a derived measure into its constituents.

Table 4.4: Tasks to Support G3: Explain Individual Scores. These tasks allow decision makers to analyze the constituents of global scores and individual evaluators' scores.

TASK | Applicable Levels

G3a. Analyze contribution of different criteria to an alternative's score
T22 Analyze breakdown of AltRanks/AltScores into AltCritRanks/AltCritScores for alternative a and evaluator e | P1a and descendants

G3b. Analyze contribution of different parts of the preference model to an alternative's score
T23 Analyze breakdown of AltCritScore into UnweightedAltCritScore and CritWeight for alternative a, evaluator e, and criterion c | P1b+w
T24 Analyze breakdown of OutScore into UnweightedOutScore and CritWeight for evaluator e, primitive criterion pc, and outcome out | P2b+w
T25 Understand mapping between AltCritRank/AltCritScore and OutRank/OutScore for a particular evaluator, alternative, and primitive criterion | P2a/b and descendants
T26 Analyze breakdown of AltCritRank/AltCritScore for alternative a, evaluator e, and abstract criterion ac | P1a and descendants

G3c. Analyze contribution of different evaluators and evaluator weights to an alternative's total score
T27 Analyze breakdown of AltScores into UnweightedAltScore and EvaluatorWeight for alternative a and evaluator e | P0b and descendants
T28 Analyze breakdown of TotalRanks/TotalScores into AltRanks/AltScores for alternative a | Any

Finally, Table 4.5 shows tasks to support G4: Validate Model. Tasks T29 - T32 support sensitivity analysis on various aspects of the model. The remaining tasks allow individuals to compare their preference models to those of others, which could inspire them to reevaluate their own preferences.

Table 4.5: Tasks to Support G4: Validate Model. These tasks support sensitivity analysis and encourage comparison of individual preferences with those of others.

TASK | Applicable Levels

G4a. Understand sensitivity of evaluator scores to evaluator preference models
T29 Discover differences in AltRanks/AltScores for evaluator e before and after changing CritWeights | P1b+w, P2b+w
T30 Discover differences in AltRanks/AltScores for evaluator e before and after changing non-weight component of preference model | Any
G4b. Understand sensitivity of total scores to evaluator weights
T31 Discover differences in TotalScores before and after changing EvaluatorWeights | P0b and descendants

G4c. Understand sensitivity of total scores to aggregation method
T32 Discover differences in TotalRanks/TotalScores from two different aggregation methods | Any

G4d. Discover discrepancies between one's own preferences and others' preferences
T33 Discover differences between CritWeights for evaluator e and CritWeights for other evaluators | Any
T34 Discover differences between non-weight component of preference model (e.g. AltScores at P0b, OutScores at P2b) for evaluator e and that of other evaluators | Any

4.2.2 Low-Level Task Abstraction

In this final stage of analysis, we decompose the high-level Discover tasks into low-level Search and Query tasks.

Task Targets

The what node in the typology presented in Figure 4.1 represents the targets of the tasks, which include inputs and outputs.

In Group Preferential Choice, there are two types of targets: values and distributions (which are simply sets of values). Referring back to Table 4.1, the value of a measure is defined by its complete key-set. For instance, an AltScore value is uniquely defined by an alternative and an evaluator. If any of the keys are missing, the result is a distribution.

The codes for various targets are provided in Tables 4.6 and 4.7. The distributions in Table 4.7 are the result of allowing one dimension for the measure in question to vary and fixing the others. For instance, D2(john) is the distribution of AltRanks or AltScores for evaluator john over all alternatives.

In each of these tables, a criterion may refer to an abstract or primitive criterion. Similarly, an evaluator may consist of a single evaluator or multiple evaluators in a group. If it is an abstract criterion or multi-evaluator group, the values in question will be aggregates.

Table 4.6: Target Values for Task Analysis. These include all measures applicable to the given level.

Code | Measure | For | Applicable Levels
V1(a) | TotalRank/TotalScore | a ∈ A | Any
V2(a,e) | AltRank/AltScore | a ∈ A, e ∈ E | Any
V3(a,e) | UnweightedAltScore | a ∈ A, e ∈ E | Any
V4(a,e,c) | AltCritRank/AltCritScore | a ∈ A, e ∈ E, c ∈ C | Any
V5(a,e,c) | UnweightedAltCritScore | a ∈ A, e ∈ E, c ∈ C | P1b+w and descendants
V6(e,pc,o) | OutRank/OutScore | e ∈ E, pc ∈ PC, o ∈ dom(PC) | P2a and descendants
V7(e,pc,o) | UnweightedOutScore | e ∈ E, pc ∈ PC, o ∈ dom(PC) | P2b+w
V8(c) | CritWeight | c ∈ C | P1b+w, P2b+w
V9(e) | EvaluatorWeight | e ∈ E | P0b and descendants
V10(a,pc) | Outcome | a ∈ A, pc ∈ PC | P2a and descendants

Table 4.7: Target Distributions for Task Analysis. The Across column specifies the dimension that varies and the For column specifies the dimensions that are fixed.

Code | Measure | Across | For | Applicable Levels
D1 | TotalRanks/TotalScores | Alternatives | All data | Any
D2(e) | AltRanks/AltScores | Alternatives | e ∈ E | P0a/b and descendants
D3(a) | AltRanks/AltScores | Evaluators | a ∈ A | P0a/b and descendants
D4(a,c) | AltCritRanks/AltCritScores | Evaluators | a ∈ A, c ∈ C | P1a/b and descendants
D5(a,e) | AltCritRanks/AltCritScores | Criteria | a ∈ A, e ∈ E | P1a/b and descendants
D6(pc,o) | OutRanks/OutScores | Evaluators | pc ∈ PC, o ∈ dom(PC) | P2a/b and descendants
D7(c) | CritWeights | Evaluators | c ∈ C | P1b+w, P2b+w
D8(e) | CritWeights | Criteria | e ∈ E | P1b+w, P2b+w
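To make these targets concrete, the sketch below reads them off a toy AltScores table at Level P0b. V2, D2, and D3 follow the codes in Tables 4.6 and 4.7, while the dictionary representation and all data values are our own assumptions:

```python
# AltScores keyed by (alternative, evaluator); values are invented.
alt_score = {("a1", "e1"): 0.2, ("a1", "e2"): 0.9,
             ("a2", "e1"): 0.8, ("a2", "e2"): 0.7}
A = ["a1", "a2"]
E = ["e1", "e2"]

def V2(a, e):
    """A single AltScore value, defined by its complete key-set."""
    return alt_score[(a, e)]

def D2(e):
    """AltScores for evaluator e, varying the Alternatives dimension."""
    return [alt_score[(a, e)] for a in A]

def D3(a):
    """AltScores for alternative a, varying the Evaluators dimension."""
    return [alt_score[(a, e)] for e in E]

print(V2("a1", "e2"), D2("e1"), D3("a2"))
```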
Auxiliary Tasks

All of the high-level tasks from the previous section can be accomplished using a combination of just ten auxiliary tasks on these targets. These are defined in terms of an action, an input type, and an output type (Table 4.8).

Brehmer and Munzner define four types of Search tasks based on whether the identity and location of the search target are known [7]. The target of a Locate or Lookup task is an element with a particular identity, whereas the target of a Browse or Explore task is an element with particular features. The search space for Locate and Explore tasks is the whole dataset, while the search space for Lookup and Browse is restricted.

We use three of these tasks - Locate, Lookup, and Browse. In this context, Locate and Lookup involve finding the value or distribution for a measure given a key-set (e.g. the AltScore for a particular alternative and evaluator), whereas Browse involves looking through distributions or sets of distributions for interesting subsets (e.g. find outliers in a set of AltScores).

Brehmer and Munzner define three types of Query tasks based on the number of items involved: Identify (single item), Compare (two items), and Summarize (3+ items) [7]. Query tasks are often performed on the outputs of Search tasks. When paired with Locate or Lookup, Query returns features; when paired with Browse or Explore, it returns identities [7].

Table 4.8: Auxiliary Tasks. In AT3 and AT4, 'matched distributions' means distributions of the same type (i.e. same row in Table 4.7).

Code | Action | Input | Output | Supported by
AT1 | Query: Identify | A single value or distribution | Its key-set | -
AT2 | Query: Compare | A pair of values | Difference | -
AT3 | Query: Compare | A pair of matched distributions | A tuple of differences | AT2
AT4 | Query: Compare | A pair of matched distributions | A dominance relation | AT3
AT5 | Query: Summarize | A single distribution | A summary of variance | -
AT6 | Search: Locate | A key-set | A single value or distribution | -
AT7 | Search: Lookup (in context) | A key-set + a single value or distribution | A single value or distribution | AT6
AT8 | Search: Browse | A single distribution | Outliers | AT2
AT9 | Search: Browse | A single distribution | Top/bottom values | AT2
AT10 | Search: Browse | A set of distributions | Non-dominated distributions | AT4

Equipped with these and the targets from Tables 4.6 and 4.7, we can now describe how to accomplish each high-level task. Auxiliary tasks and targets are referenced by code and accompanied by a short English description.

• T1: Discover alternative(s) with best TotalRank/TotalScore
1. AT9(D1) → X (Get top values in TotalScores distribution)
2. AT1(x) for x ∈ X (Identify top values)

• T2: Discover alternative(s) with low variance in AltRanks/AltScores across evaluators
1. AT5(D3(a)) for a ∈ A → X (Get variance of each AltScores distribution)
2. AT9(X) → Y (Get bottom values)
3. AT1(y) for y ∈ Y (Identify bottom values)

• T3: Discover non-dominated alternatives across evaluators
1. AT10({D3(a) for a ∈ A}) → X (Get non-dominated AltScores distributions)
2. AT1(x) for x ∈ X (Identify non-dominated AltScores distributions)

• T4: Discover trade-offs in AltRanks/AltScores between alternatives a and b
1. AT6(D3(a)) → X (Locate AltScores for a)
2. AT6(D3(b)) → Y (Locate AltScores for b)
3. AT3(X,Y) (Get differences between AltScores distributions)

• T5: Discover pros and cons in AltRanks/AltScores for alternative a
1. AT6(D3(a)) (Locate AltScores for a)
2. AT6(V2(a,e)) for e ∈ E (Locate every AltScore for a)
3. AT2(V2(a,e1), V2(a,e2)) for {e1,e2} ⊆ E (Pairwise compare every AltScore for a)

• T6: Discover alternative(s) with low variance in AltCritRanks/AltCritScores across criteria (aggregated over evaluators)
1. AT5(D5(a, e=all)) for a ∈ A → X (Get variance of each AltCritScores distribution)
2. AT9(X) → Y (Get bottom values)
3. AT1(y) for y ∈ Y (Identify bottom values)

• T7: Discover non-dominated alternatives across criteria (aggregated over evaluators)
1. AT10({D5(a, e=all) for a ∈ A}) → X (Get non-dominated AltCritScores distributions)
2. AT1(x) for x ∈ X (Identify non-dominated AltCritScores distributions)

• T8: Discover trade-offs in AltCritRanks/AltCritScores between alternatives a and b (aggregated over evaluators)
1. AT6(D5(a, e=all)) → X (Locate AltCritScores for a)
2. AT6(D5(b, e=all)) → Y (Locate AltCritScores for b)
3. AT3(X,Y) (Get differences between AltCritScores distributions)

• T9: Discover strengths and weaknesses of alternative a (aggregated over evaluators)
1. AT6(D5(a, e=all)) (Locate AltCritScores for a)
2. AT6(V4(a, e=all, c)) for c ∈ PC (Locate every AltCritScore for a)
3. AT2(V4(a, e=all, c1), V4(a, e=all, c2)) for {c1,c2} ⊆ PC (Pairwise compare every AltCritScore for a)

• T10: Discover alternative(s) with best AltRank/AltScore for evaluator e
1. AT6(D2(e)) → X (Locate AltScores for e)
2. AT9(X) → Y (Get top values)
3. AT1(y) for y ∈ Y (Identify top values)

• T11: Discover alternative(s) with low variance in AltCritRank/AltCritScore across criteria for evaluator e
1. AT6(D5(a,e)) for a ∈ A → X (Locate AltCritScores distributions for e)
2. AT5(x) for x ∈ X → Y (Get variance of each AltCritScores distribution)
3. AT9(Y) → Z (Get bottom values)
4. AT1(z) for z ∈ Z (Identify bottom values)

• T12: Discover non-dominated alternatives across criteria for evaluator e
1. AT6(D5(a,e)) for a ∈ A → X (Locate AltCritScores distributions for e)
2. AT10(X) → Y (Get non-dominated AltCritScore distributions)
3. AT1(y) for y ∈ Y (Identify non-dominated AltCritScore distributions)

• T13: Discover trade-offs in AltCritRanks/AltCritScores between alternatives a and b for evaluator e
1. AT6(D5(a,e)) → X (Locate AltCritScores for a and e)
2. AT6(D5(b,e)) → Y (Locate AltCritScores for b and e)
3. AT3(X,Y) (Get differences between AltCritScores in X and Y)

• T14: Discover strengths and weaknesses of alternative a for evaluator e
1. AT6(D5(a,e)) (Locate AltCritScores for a)
2. AT6(V4(a,e,c)) for c ∈ PC (Locate every AltCritScore for a and e)
3. AT2(V4(a,e,c1), V4(a,e,c2)) for {c1,c2} ⊆ PC (Pairwise compare every AltCritScore for a and e)

• T15: Discover alternative(s) with high variance in AltRanks/AltScores across evaluators
1. AT5(D3(a)) for a ∈ A → X (Get variance of each AltScores distribution)
2. AT9(X) → Y (Get top values)
3. AT1(y) for y ∈ Y (Identify top values)

• T16: Discover evaluators that are outliers with respect to AltRank/AltScore for alternative a
1. AT6(D3(a)) → X (Locate AltScores for a)
2. AT8(X) → Y (Get outliers)
3. AT1(y) for y ∈ Y (Identify outliers)

• T17: Discover criteria with high variance in AltCritRank/AltCritScore across evaluators for alternative a
1. AT6(D4(a,c)) for c ∈ C → X (Locate AltCritScores distributions for a)
2. AT5(x) for x ∈ X → Y (Get variance of each AltCritScores distribution)
3. AT9(Y) → Z (Get top values)
4. AT1(z) for z ∈ Z (Identify top values)

• T18: Discover criteria with high variance in CritWeights across evaluators
1. AT5(D7(pc)) for pc ∈ PC → X (Get variance of CritWeights for each criterion)
2. AT9(X) → Y (Get top values)
3. AT1(y) for y ∈ Y (Identify top values)

• T19: Discover evaluators that are outliers with respect to CritWeights for criterion c
1. AT6(D7(c)) → X (Locate CritWeights for c)
2. AT8(X) → Y (Get outliers)
3. AT1(y) for y ∈ Y (Identify outliers)

• T20: Discover primitive criteria outcomes with high variance in OutRanks/OutScores across evaluators
1. AT5(D6(pc,o)) for pc ∈ PC, o ∈ dom(pc) → X (Get variance of each OutScores distribution)
2. AT9(X) → Y (Get top values)
3. AT1(y) for y ∈ Y (Identify top values)

• T21: Discover evaluators that are outliers with respect to OutRanks/OutScores for outcome o on primitive criterion pc
1. AT6(D6(pc,o)) → X (Locate OutScores for pc, o)
2. AT8(X) → Y (Get outliers)
3. AT1(y) for y ∈ Y (Identify outliers)

• T22: Analyze breakdown of AltRanks/AltScores into AltCritRanks/AltCritScores for alternative a and evaluator e
1. AT7(D5(a,e), V2(a,e)) (Locate AltCritScores for a, e in context of AltScore for a, e)

• T23: Analyze breakdown of AltCritScore into UnweightedAltCritScore and CritWeight for alternative a, evaluator e, and criterion c
1. AT7(V5(a,e,c), V4(a,e,c)) (Locate UnweightedAltCritScore for a, e, c in context of AltCritScore for a, e, c)
2. AT7(V8(e,c), V4(a,e,c)) (Locate CritWeight for e, c in context of AltCritScore for a, e, c)

• T24: Analyze breakdown of OutScore into UnweightedOutScore and CritWeight for evaluator e, primitive criterion pc, and outcome o
1. AT7(V7(e,pc,o), V6(e,pc,o)) (Locate UnweightedOutScore for e, pc, o in context of OutScore for e, pc, o)
2. AT7(V8(e,pc), V6(e,pc,o)) (Locate CritWeight for e, pc in context of OutScore for e, pc, o)

• T25: Understand mapping between AltCritRank/AltCritScore and OutRank/OutScore for alternative a, evaluator e, and primitive criterion pc
1. AT6(V10(a,pc)) → X (Locate Outcome for a, pc)
2. AT6(V7(e,pc,X)) (Locate UnweightedOutScore for e, pc, X)

• T26: Analyze breakdown of AltCritRank/AltCritScore for alternative a, evaluator e, and abstract criterion ac
1. AT7(V4(a,e,c), V4(a,e,ac)) for c ∈ children(ac) (Locate AltCritScore for each child of ac in context of AltCritScore for a, e, ac)

• T27: Analyze breakdown of AltScores into UnweightedAltScore and EvaluatorWeight for alternative a and evaluator e
1. AT7(V3(a,e), V9(e)) (Lookup UnweightedAltScore for a, e in context of EvaluatorWeight for e)

• T28: Analyze breakdown of TotalRanks/TotalScores into AltRanks/AltScores for alternative a
1. AT7(D3(a), V1(a)) (Lookup AltScores for a in context of TotalScore for a)

• T29: Discover differences in AltRanks/AltScores for evaluator e before and after changing CritWeights
1. AT6(D2(e) before) → X (Locate AltScores for e in the 'before' dataset)
2. AT6(D2(e) after) → Y (Locate AltScores for e in the 'after' dataset)
3. AT3(X,Y) (Get differences between AltScores distributions)

• T30: Discover differences in AltRanks/AltScores for evaluator e before and after changing non-weight component of preference model
1. Same as T29

• T31: Discover differences in TotalScores before and after changing EvaluatorWeights
1. AT6(D1 before) → X (Locate TotalScores in the 'before' dataset)
2. AT6(D1 after) → Y (Locate TotalScores in the 'after' dataset)
3. AT3(X,Y) (Get differences between TotalScores distributions)

• T32: Discover differences in TotalRanks/TotalScores from two different aggregation methods
1. AT6(D1 method1) → X (Locate TotalScores in the 'method1' dataset)
2. AT6(D1 method2) → Y (Locate TotalScores in the 'method2' dataset)
3. AT3(X,Y) (Get differences between TotalScores distributions)

• T33: Discover differences between CritWeights for evaluator e and CritWeights for other evaluators
1. AT6(D8(e)) → X (Locate CritWeights for e)
2. AT3(X, D8(e')) for e' ∈ E (Get differences between CritWeights for e and every other evaluator)

• T34: Discover differences between non-weight component of preference model for evaluator e and that for other evaluators
1. Analogous to T33 - simply replace CritWeights with the distribution corresponding to the base measure for the taxonomy level
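As a worked example of these compositions, the sketch below implements T3 at Level P0b: the dominance test stands in for AT4, the filtering loop for AT10, and the surviving keys for AT1. The data and helper names are illustrative only:

```python
# AltScores keyed by (alternative, evaluator); values are invented.
alt_score = {("a1", "e1"): 0.2, ("a1", "e2"): 0.6,
             ("a2", "e1"): 0.5, ("a2", "e2"): 0.7,
             ("a3", "e1"): 0.8, ("a3", "e2"): 0.7}
A = ["a1", "a2", "a3"]
E = ["e1", "e2"]

def D3(a):  # AT6: locate the AltScores distribution for a
    return [alt_score[(a, e)] for e in E]

def dominates(x, y):  # AT4: x is at least as good everywhere and better somewhere
    return all(xi >= yi for xi, yi in zip(x, y)) and \
           any(xi > yi for xi, yi in zip(x, y))

# AT10 + AT1: keep the alternatives whose distributions no other dominates.
non_dominated = [a for a in A
                 if not any(dominates(D3(b), D3(a)) for b in A if b != a)]
print(non_dominated)  # ['a3'] (it dominates both other alternatives here)
```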
Chapter 5

A Design Space of Visualizations to Support Preference Synthesis in Group Preferential Choice

This chapter presents a design space for visualizations to support inspection and exploration of multiple evaluators' preferences in the context of Group Preferential Choice. This is not intended to cover all possible designs, but rather a viable subset that designers can choose from to suit their needs. We discuss the strengths and weaknesses of the various options, analytically evaluate their efficacy for different tasks, and offer recommendations based on contextual features. Such designs can be used in isolation or integrated into more sophisticated decision support systems.

At this time, we focus solely on Level P0b of the Preference Model Taxonomy (Section 3.8). Furthermore, we limit the design space to Group Preferential Choice scenarios where:

1. There are no more than a dozen alternatives or evaluators.¹
2. The Evaluator hierarchy is flat - that is, there is only one group that contains all evaluators.
3. Preferences are expressed on a scale with no negative values. This is important because diverging scales have somewhat different design implications [47].

¹ This threshold was selected because colour is effective for encoding up to a dozen distinct identities. Beyond this, other strategies are required.

Chapter 6 will briefly discuss how the design space might be extended to cover other levels of the taxonomy and scenarios that do not meet the restrictions above.

The inputs to this analysis are the data and task abstractions developed in Chapter 4, with the exception of the tasks related to sensitivity analysis (T29 - T33), which we leave to future work. At this time, we only consider tasks that do not involve manipulating the underlying data. The design space is described in terms of the following design aspects:

1. Static design aspect (Section 5.1) - the basic idioms that are available and various options for mapping the dimensions and measures to marks and channels.
2. Dynamic design aspect (Section 5.2) - the mechanisms for transforming the data and the view, including:
(a) View transformations, which change how the data is shown
(b) Data transformations, which change what data is shown
3. Composite design aspect (Section 5.3) - the options for arranging and coordinating different views relative to each other.

This chapter will use a running example of seven friends - Beth, Bob, Darnell, Janelle, Jessica, Joel, and Taycee - trying to choose a hotel to stay at: Budget, Days Inn, or Fairmont. Each friend scored each hotel on a scale from 0 to 1.

5.1 Static Design Aspect

This section describes the static design aspect for Level P0b of the Preference Model Taxonomy. It introduces the major competitive idioms for presenting small-scale tabular data with categorical keys and numeric values. As such, it provides the basic building blocks from which the entire design space for all levels of the taxonomy may be built. We limit our discussion to idioms that encode values using position on a common scale, as it is the most effective channel for encoding magnitude [37].
Idioms that use less effective channels in exchange for greater information density, such as heatmaps, are more appropriate for larger datasets. These are discussed in Chapter 6.

Section 5.1.1 discusses each of these idioms in turn, and Section 5.1.2 performs an analytic evaluation of the idioms based on the tasks identified in Chapter 4.

5.1.1 Major Idioms

We start with the simplest case in which there are no evaluator weights. At this level, evaluators score the alternatives holistically according to their preferences. To recap, the data consists of:

• 2 - 12 Evaluators (E)
• 2 - 12 Alternatives (A)
• |A| × |E| AltScores
• |A| TotalScores

The data abstraction is a two-dimensional table with Evaluators and Alternatives as categorical keys and AltScores as numeric values. The TotalScores are obtained by summing AltScores over Evaluators.

Note that all non-radial designs described in this section can be oriented horizontally or vertically. For succinctness, we show the horizontal orientation only.

Bar-based Idioms

One of the most common ways to represent tabular data is the bar chart [38]. Bar charts redundantly encode values using two perceptual channels: position and length. There are three styles of bar charts that are suitable for presenting two-dimensional tabular data: stacked bar charts, multi-bar charts, and tabular bar charts [25].

Stacked Bar Chart

Stacked bar charts are appropriate when a one-dimensional measure is the sum of a two-dimensional measure, as is the case with TotalScores and AltScores [38] [25]. The stacked bar chart in Figure 5.1 maps alternatives to bars and evaluators to segments. The TotalScore of each alternative is encoded by the length and position of its bar, while the AltScore for each evaluator is encoded by segment length. To improve discriminability, the segments are typically assigned different colour hues [38].

Because unaligned lengths are more difficult to compare than aligned lengths, the stacked bar chart is not particularly effective for tasks that require comparison of AltScores [38] [55]. However, stacked bar charts are effective at supporting TotalScore comparisons while also providing extra information about the relative contribution of each AltScore to the TotalScore.

Figure 5.1: Stacked Bar Chart. Each bar encodes the TotalScore for each hotel. The segment lengths correspond to the AltScores for each evaluator.

Multi-bar Chart

Multi-bar charts map spatial regions to dimensions in a nested fashion such that all bars are aligned to a common baseline. Additionally, colour hue may be mapped to the secondary grouping to facilitate comparison across regions. Figures 5.2 and 5.3 show the two possible designs given the available mappings from spatial region and colour hue to Evaluators and Alternatives.

Figure 5.2: Multi-bar Chart Design 1: bars are grouped by alternatives, and colour is mapped to evaluators.

Figure 5.3: Multi-bar Chart Design 2: bars are grouped by evaluators, and colour is mapped to alternatives.

Tabular Bar Chart

Tabular bar charts map dimensions to spatial regions in a grid. There are four possible designs given the available mappings from spatial region and colour hue to evaluators and alternatives. Figures 5.4 and 5.5 show the versions that map colour to the column's dimension.

Note that Figure 5.4 pairs nicely with Figure 5.1, since a stacked bar chart can be transformed into a tabular bar chart simply by pulling apart the segments and aligning them to their own baseline. This pairing would allow users to easily transition between the tasks of comparing TotalScores, inspecting the breakdown of TotalScores into AltScores, and comparing AltScores for a particular evaluator.
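The part-whole arithmetic behind Figure 5.1 (and the per-column bars of Figure 5.4) is easy to sketch with a generic plotting library. The scores below are invented, as the data behind the published figures is not reproduced here, and a vertical orientation is used for brevity:

```python
import matplotlib.pyplot as plt

# Made-up AltScores for the hotel example, one list per evaluator.
hotels = ["Budget", "Days Inn", "Fairmont"]
scores = {"Beth": [0.2, 0.5, 0.8],
          "Joel": [0.6, 0.7, 0.7],
          "Taycee": [0.9, 0.4, 0.3]}

bottoms = [0.0] * len(hotels)
for evaluator, vals in scores.items():
    # Each evaluator contributes one segment per bar; stacking sums to TotalScore.
    plt.bar(hotels, vals, bottom=bottoms, label=evaluator)
    bottoms = [b + v for b, v in zip(bottoms, vals)]

plt.ylabel("TotalScore")
plt.legend(title="Evaluator")
plt.show()
```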
Figure 5.4: Tabular Bar Chart Design 1: alternatives on rows and evaluators on columns. Colour is mapped to evaluators.

Figure 5.5: Tabular Bar Chart Design 2: evaluators on rows and alternatives on columns. Colour is mapped to alternatives.

Tabular bar charts are more compact than multi-bar charts of the same size, but they are also less precise because the same axis range is compressed and repeated across columns. Another weakness of tabular bar charts is that each column has its own baseline, so comparisons across columns are less accurate than comparisons across regions in multi-bar charts [55].

Point-based Idioms

Strip Plot

The simplest of the point-based idioms is the strip plot, which uses position along a common axis to encode values. Two-dimensional tabular data can be represented as a series of strip plots with one dimension separated by region (each with its own strip plot) and the other distinguished using another channel, typically colour hue.² Figures 5.6 and 5.7 show the two possibilities.

² Colour hue is the second most effective channel for encoding categorical attributes after spatial region [38]. Another option is mark shape, which is sometimes used redundantly along with colour hue [46].

Figure 5.6: Strip Plot Design 1: alternatives on axes and evaluators on points.

Figure 5.7: Strip Plot Design 2: evaluators on axes and alternatives on points.

The key strength of strip plots relative to bar charts is that they place an entire dimension along a single axis. In doing so, they unite the precision of multi-bar charts with the compactness of tabular bar charts. This property also makes them superior to bar charts for tasks related to spread, such as identifying clusters and outliers, since the user only needs to scan a single spatial dimension to obtain all relevant information.

However, strip plots are less effective than bar charts at supporting look-up tasks because the secondary dimension is differentiated using colour alone. The necessity of colour also limits their scalability, since people can only differentiate up to around a dozen hues [38]. Their efficacy is contingent on the quality of the colour palette, which should be highly discriminable and accessible to individuals with colour-blindness [38].

Another challenge associated with strip plots is that occlusion may occur if two or more points have the same (or nearly the same) value. This is especially likely to become a problem if a discrete evaluation scale is used. There are several ways to address this challenge, including mark transparency, fill removal, jittering or stacking, or using another channel such as shape to redundantly encode point identity [24] [19]. Perhaps the most scalable option is a combination of stacking and fill removal, which means plotting multiple unfilled points (as in Figure 5.6) in a vertical 'stack' at the same x-coordinate.

Finally, point-based idioms are ill-suited to showing part-whole relationships. An additional plot could be added to Strip Plot Design 2 that shows evaluator averages, but this would not show how the parts contribute to the total.
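Returning to the occlusion problem, the sketch below shows the jittering strategy with invented scores; the stacking-plus-fill-removal option recommended above would replace the random offset with deterministic stacking of unfilled marks:

```python
import random
import matplotlib.pyplot as plt

# Three coincident AltScores that would occlude each other on a strip plot.
scores = {"Beth": 0.6, "Joel": 0.6, "Taycee": 0.6, "Janelle": 0.9}

random.seed(0)
for evaluator, s in scores.items():
    y = random.uniform(-0.05, 0.05)  # small off-axis offset to separate points
    plt.scatter(s, y, label=evaluator)

plt.yticks([])  # the jitter axis carries no meaning
plt.xlim(0, 1)
plt.xlabel("AltScore for one alternative")
plt.legend()
plt.show()
```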
Strip Plot Enhancements

Strip plots can be augmented in one of two ways to support comparison of distributions across either dimension. First, each axis can be overlaid with distribution information in the form of range plots, box plots, or violin plots. This further increases their efficacy for tasks related to spread along an axis. For succinctness, we will only consider box plots.

Figure 5.8: Box Plot Design 1. (Note: the gray fill is a feature of Tableau's box plot design. We do not recommend using a fill, as it makes it more difficult to differentiate the colours.)

Figure 5.9: Box Plot Design 2

Alternatively, the points corresponding to items in the secondary dimension can be connected with straight lines of the same colour. This enhancement transforms the strip plot into another popular idiom - parallel coordinates. This design supports tasks related to inspection and comparison across axes. However, its effectiveness for these tasks depends on the order of the axes [38].

Figure 5.10: Parallel Coordinates Design 1: alternatives on axes and evaluators on lines.

Figure 5.11: Parallel Coordinates Design 2: evaluators on axes and alternatives on lines.

A variation on parallel coordinates is the radar chart, which arranges the axes radially (Figures 5.13 and 5.14). For the most part, radar charts are effective for the same tasks as parallel coordinates. However, they are less effective for comparison of values across axes since the axes are not aligned. Furthermore, their cyclic layout may be misleading if the data itself is not cyclic [38].

Yet another problem with radar charts is that a value of zero on one axis will cause the polygon to collapse on top of the neighboring axes. Figure 5.12 illustrates this problem using a simple example where Ann and Carol have assigned scores of 0 to Days Inn and Budget respectively. Also, crowding gets worse the closer the scores are to 0. For these reasons, the overlap problem for radar charts is much more complicated than it is for strip plots and parallel coordinates.

Figure 5.12: Troublesome radar chart.

One benefit of radar charts is that polygon area is roughly proportional to the squared sum of the axis scores. This means that Radar Chart Design 2 (Figure 5.14) roughly encodes TotalScores. Although area is a less effective channel for encoding magnitude than position and length [38], it may be useful to have this information for additional context.

Figure 5.13: Radar Chart Design 1: alternatives on axes and evaluators on polygons. (Note: this figure was generated using onlinecharttool.com, and the polygon fill is a feature of their radar chart design. It is not recommended, as the blending of colours makes it more difficult to identify the boundaries.)

Figure 5.14: Radar Chart Design 2: evaluators on axes and alternatives on polygons.

With Evaluator Weights

Introducing evaluator weights means adding the following measures to the dataset:

• |A| × |E| UnweightedAltScores (scores before applying the weights)
• |E| EvaluatorWeights

Integrated View

The most straightforward way to show the relationship between the original scores (UnweightedAltScores), the weighted scores (AltScores), and the evaluator weights is with a modified version of Tabular Bar Chart Design 1 (Figure 5.4) where the column widths are proportional to the corresponding EvaluatorWeights. This effectively compresses each axis into an amount of space proportional to the weight of that evaluator. This design is shown in Figure 5.15.

Figure 5.15: Tabular Bar Chart Design 1 with variable column widths. The width of each column encodes the weight of each evaluator. The relative width of each bar within its column encodes the UnweightedAltScore. The absolute width of each bar encodes the AltScore (that is, the product of the UnweightedAltScore and EvaluatorWeight).
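The arithmetic behind this encoding is just the Level P0b weighting from Section 3.4; a sketch with invented numbers:

```python
# UnweightedAltScores and EvaluatorWeights for one alternative; values invented.
unweighted = {"Beth": 0.8, "Joel": 0.6}
evaluator_weight = {"Beth": 0.7, "Joel": 0.3}  # the column widths in Figure 5.15

def weighted_alt_score(e):
    """AltScore = UnweightedAltScore * EvaluatorWeight (a bar's absolute width)."""
    return unweighted[e] * evaluator_weight[e]

# Summing the weighted AltScores gives the corresponding bar in Figure 5.16.
total = sum(weighted_alt_score(e) for e in evaluator_weight)
print(round(total, 2))  # 0.8*0.7 + 0.6*0.3 = 0.74
```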
This encoding pairs nicely with a stacked bar chart where the segments correspond to the AltScores (Figure 5.16). No other idiom can be as easily adapted to show the part-whole relationship between AltScores and EvaluatorWeights.

Figure 5.16: Stacked Bar Chart corresponding to the Tabular Bar Chart in Figure 5.15. The width of each segment encodes the weighted AltScore, and the length and position of each bar encode the TotalScore.

Separate Views

An alternative approach is to show the AltScores, UnweightedAltScores, and EvaluatorWeights independently in separate views. This is not recommended, as it obscures the relationship between the measures. We especially advise against showing the AltScores apart from EvaluatorWeights, as this may lead users to erroneously attribute differences in scores to differences in preferences when they are actually due to differences in weights.

However, it may be sensible to supplement the integrated view with additional views that better support certain tasks. We will return to this discussion in Section 5.3.

5.1.2 Task-based Evaluation of Encodings

Section 4.2 showed how various tasks identified in our analysis can be decomposed into auxiliary tasks on particular values and distributions. This section performs an in-depth assessment of the suitability of each encoding for each task-input pair that supports some high-level task for Level P0b. Table 5.1 summarizes the possible inputs to each task.

Table 5.1: Possible Inputs for Each Auxiliary Task

Input | Auxiliary Tasks
A single AltScore mark | AT1
A single TotalScore mark | AT1
A single EvaluatorWeight mark | AT7
A pair of AltScore marks for one evaluator | AT2
A pair of AltScore marks for one alternative | AT2
The set of AltScore marks for one evaluator | AT9
The set of AltScore marks for one alternative | AT5, AT8
The set of AltScore marks for a pair of alternatives | AT3, AT4
The set of AltScore marks for a pair of evaluators | AT3
The set of all AltScore marks | AT10
The set of all TotalScore marks | AT9
A single evaluator | AT6, AT7
A single alternative | AT6, AT7
An evaluator/alternative pair | AT6

Tasks that apply to more than one type of input (AT2, AT3, AT6, AT7, and AT9) are split into cases in the descriptions below. Note that much of this evaluation is speculative and will require empirical validation. The results of this assessment are summarized in Figures 5.20 and 5.21.

AT1: Identify a mark

The input to this task is a single AltScore mark, and the output is the alternative and evaluator it corresponds to.

Bar charts are the most effective for this task because each mark occupies a labeled region, and the user does not need to consult a colour key. Furthermore, there is no risk of marks overlapping. Tabular bar charts may be superior to multi-bar charts because they do not nest labels, and this could improve legibility.

Whether or not there are differences among the point-based idioms is less definitive. The connecting lines in parallel coordinates and radar charts may improve identification speed by increasing the salience of the colour. Box plots do not provide anything useful for this task.

AT2: Compare values (Case A: one evaluator, two alternatives)

The input to this task is a pair of AltScore marks for one evaluator, and the output is an approximate difference. There are numerous factors to consider when ranking the encodings for this task.
The box plots are omitted because the example data set is very small.

Figure 5.17: What is the best encoding for comparing Budget and Days Inn for Bob? This figure divides the encodings into four efficacy groups according to key principles. (a) Highly effective - comparisons are performed along a single axis or within a single region; (b) Less effective - comparisons are made across axes or regions; (c) Less effective - axes are condensed and offer less precision; (d) Least effective - requires comparison of unaligned widths or positions. The rankings within groups (a) and (b) are nuanced, as discussed in the text.

The most critical factor is whether the values to compare are plotted on aligned axes. This is not the case for the Stacked Bar Chart, Tabular Bar Chart Design 2, and Radar Chart Design 1, so these are the least effective encodings for this task (Figure 5.17d).

Another important factor is precision, or how much space is allocated to each axis. Tabular Bar Chart Design 1 and Radar Chart Design 2 offer less precision because the axes are shorter relative to the area of the plot (Figure 5.17c). Furthermore, Radar Chart Design 2 may be at a disadvantage because not all axes are perpendicular to the line of sight.

Of the remaining point-based idioms, Strip Plot Design 2 is superior to Strip Plot Design 1 (and its derivatives) because the positions to compare are located on the same axis. Similarly, Multi-bar Chart Design 2 is superior to Multi-bar Chart Design 1 because the comparison is made within rather than across regions [56]. This division is illustrated in Figures 5.17a and 5.17b.

Within these two groups, it is unclear whether the box plot or parallel coordinate overlays for the strip plots would improve or interfere with performance - our intuition is that they might interfere by distorting the perception of distance.

It is also difficult to rank the multi-bar charts relative to the point-based idioms, as there are several factors that may contribute in subtle ways. Bar charts redundantly encode values using both the position and length channels, which may strengthen their efficacy for comparison tasks. However, they are more cluttered than point-based idioms [46], and their efficacy is sensitive to sort order - non-adjacent bars are more difficult to compare than adjacent bars because they are further apart and the viewer must ignore the bars in between [56]. This problem can be mitigated by giving users the ability to filter alternatives.

Point-based idioms are more succinct than multi-bar charts because they do not use length to encode values. Also, there is a more direct relationship between relative positions and relative values - it is simply the distance between the points. Finally, the fact that each plot uses just one spatial dimension means that ordinal relationships can be identified at a glance simply by checking which point lies to the left or right of the other. Point-based idioms also risk points overlapping, but there are several effective strategies for dealing with this [24] [19].

In light of these factors, we surmise that Strip Plot Design 2 is the best for this task overall.

AT2: Compare values (Case B: one alternative, two evaluators)

The input to this task is a pair of AltScore marks for one alternative, and the output is an approximate difference.

The evaluation of encodings for Case B mirrors that of Case A with the Design numbers reversed. In other words, the most effective encodings are the Design 1 non-radial point-based idioms and Multi-bar Chart Design 1. We surmise that Strip Plot Design 1 is the best overall.
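As a rough illustration of the within-axis comparisons these rankings turn on, the following minimal matplotlib sketch (our own, with invented hotel scores) draws Strip Plot Design 2 for Case A: one horizontal axis per evaluator with alternatives as coloured dots, so the two marks being compared for Bob lie on the same axis. Swapping the roles of the two dimensions would yield Design 1 for Case B.

```python
import matplotlib.pyplot as plt

# Hypothetical hotel data: one list of AltScores per evaluator.
evaluators = ["Ann", "Bob", "Carol"]
alternatives = ["Fairmont", "Budget", "Days Inn"]
scores = {
    "Ann": [0.9, 0.5, 0.3],
    "Bob": [0.4, 0.8, 0.6],
    "Carol": [0.7, 0.6, 0.2],
}
colours = ["tab:blue", "tab:orange", "tab:green"]  # one hue per alternative

fig, ax = plt.subplots(figsize=(6, 2.5))
for row, evaluator in enumerate(evaluators):
    # Strip Plot Design 2: each evaluator is a horizontal axis, so comparing
    # two alternatives for one evaluator is a comparison along a single axis.
    for value, colour, alt in zip(scores[evaluator], colours, alternatives):
        ax.scatter(value, row, color=colour, label=alt if row == 0 else None)
ax.set_yticks(range(len(evaluators)))
ax.set_yticklabels(evaluators)
ax.set_xlabel("AltScore")
ax.legend(loc="lower right")
plt.tight_layout()
plt.show()
```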
AT3: Compare distributions (Case A: all evaluators, two alternatives)

The input to this task is the set of AltScore marks for two alternatives, and the output is a rough approximation of the pairwise differences.

This task requires the user to keep two distributions in focus while performing multiple comparisons in sequence. As such, it is a hybrid of AT2 Case A and AT6 Case B, and the ranking of encodings reflects this (Figure 5.18).

Figure 5.18: What is the best encoding for comparing Fairmont and Budget across all evaluators? Parallel Coordinates Design 2 makes it easy to perform multiple precise comparisons in sequence, especially if filtering is permitted. The other three encodings shown here are also effective, but each has its weaknesses.

Interestingly, the most effective encoding for this task may be Parallel Coordinates Design 2, as the connecting lines make it easy to keep the distributions in focus while the strip plot base makes it easy to perform individual comparisons. Furthermore, trade-offs can be identified at a glance by looking for line intersections. The same is true of Radar Chart Design 2, although the radial layout might make it harder to perform repeat comparisons with accuracy. A drawback of both is visual interference from other lines - this can be mitigated by allowing users to filter alternatives.

Another relatively effective encoding is Tabular Bar Chart Design 1. The grid structure allows users to compare two rows of bars one column at a time, albeit with less precision than some of the other encodings. This is much easier if the two rows are adjacent. Parallel Coordinates Design 1 is also effective if the plots to be compared are adjacent.

Multi-bar charts are less effective because they require comparisons to be made across regions regardless of how the bars are sorted. Plain strip plots and box plots are also less effective because the absence of a grid or connecting lines makes it difficult to visually isolate each pair for comparison. This is true whether the distributions of interest lie along the plots (Design 1) or across the plots (Design 2). Again, the least effective encodings are those that require comparison of unaligned positions and widths - the Stacked Bar Chart, Tabular Bar Chart Design 2, and Radar Chart Design 1.

AT3: Compare distributions (Case B: all alternatives, two evaluators)

The input to this task is the set of AltScore marks for two evaluators, and the output is a rough approximation of the pairwise differences.

The ranking of encodings for Case B mirrors that of Case A with the Design numbers reversed.

AT4: Identify a dominance relation

The input to this task is a set of AltScore marks for a pair of alternatives, and the output is an assessment of whether one dominates the other.

This task is a special case of AT3 Case A, so the evaluation of encodings is similar. Notice that there is a dominance relationship between Fairmont and Budget in Figure 5.18, so it applies to this task as well.

The best encodings for this task are Parallel Coordinates Design 2 and Radar Chart Design 2, as a dominance relation can be easily identified by checking if the lines intersect. In Radar Chart Design 2, this amounts to checking for enclosure. As in AT3 Case A, interference from other lines can be eliminated by filtering alternatives.

The next most effective encoding is Tabular Bar Chart Design 1, as users can identify dominance by comparing two rows, one column at a time. This is easier if the two rows are adjacent.

Finally, the Design 1 non-radial point-based idioms are also somewhat effective, since a dominance relationship can be identified by checking whether each point lies to the left of the same-coloured point in the other plot. This is easier to do if the plots are adjacent. The connecting lines in Parallel Coordinates Design 1 may help, since all connecting lines will tilt in the same direction or not at all if one alternative dominates the other (see Figure 5.18).

The remaining encodings are not effective for this task for reasons similar to those discussed in AT2 and AT3.
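The visual test ("do the two lines cross?") corresponds to a simple pairwise check on the underlying score vectors. A minimal sketch, with illustrative scores of our own:

```python
def dominates(a, b):
    """Return True if score vector `a` dominates `b`: at least as good on
    every evaluator and strictly better on at least one."""
    assert len(a) == len(b)
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

# Hypothetical AltScores across three evaluators (Ann, Bob, Carol).
fairmont = [0.9, 0.8, 0.7]
budget = [0.5, 0.8, 0.6]

print(dominates(fairmont, budget))  # True: the two lines would not cross
print(dominates(budget, fairmont))  # False
```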
AT5: Summarize variance

The input to this task is a set of AltScores for a single alternative, and the output is a rough approximation of how much variation there is in the set.

Box Plot Design 1 is the best encoding for this task, as it provides direct information about the distribution and range. The next best encoding is Strip Plot Design 1 and its other derivatives, as it enables the user to inspect the range and distribution by scanning a single spatial dimension. Radar Chart Design 1 may be at a slight disadvantage because not all axes are perpendicular to the line of sight.

Variance can be roughly assessed using Multi-bar Chart Design 1 by looking at the variation in bar length within a region. This can also be done with Tabular Bar Chart Design 2, albeit with less precision. This is more challenging with Multi-bar Chart Design 2 since the comparisons must be made across regions.

Variance can also be roughly assessed using Parallel Coordinates Design 2 and Radar Chart Design 2 by examining the smoothness of the line or polygon. However, this relationship is sensitive to axis order - the impression of variance may be exaggerated if clusters are split.

The remaining encodings are not effective for this task for reasons similar to those discussed in AT2 and AT3.

AT6: Locate a value for a key-set (Case A: one alternative, one evaluator)

The input to this task is an alternative/evaluator pair and the output is the AltScore value for that pair.

Bar charts are best for look-up tasks for the same reason that they are good for identification tasks - each mark is assigned to a particular region, so it is possible to look up values without discriminating colour.

Which style of bar chart is best may depend on the size of the dataset and the amount of space allocated to the view. The nesting of labels in multi-bar charts may result in more crowding, but it may also reduce the amount of area the user needs to scan to find the labels of interest.

AT6: Locate a distribution for a key-set (Case B: one alternative)

The input to this task is an alternative and the output is the distribution of AltScores for that alternative.

The best encoding for this task is Tabular Bar Chart Design 2, since it differentiates alternatives using both contiguous spatial region and colour hue. The next best encodings are the Design 1 encodings, as these assign alternatives to contiguous spatial regions.

The Stacked Bar Chart and Multi-bar Chart Design 2 assign alternatives to non-contiguous spatial regions, and users must tune out the bars in between. Parallel Coordinates Design 2 and Radar Chart Design 2 map alternatives to connected lines, but users must tune out the other lines that occupy the same space.

For the remaining encodings, the user must visually group disconnected marks based on colour alone, which is substantially more difficult.
Filtering can reduce the amount of interference in all cases.

AT6: Locate a distribution for a key-set (Case C: one evaluator)

The input to this task is an evaluator and the output is the distribution of AltScores for that evaluator.

The evaluation of encodings is the same as in AT6 Case B, except with the Design numbers reversed.

AT7: Look-up value in context (Case A: AltScore in TotalScore)

The input to this task is an evaluator and a TotalScore mark for some alternative, and the output is the AltScore for that evaluator and alternative. In other words, the task is to identify the contribution of some evaluator's AltScore to the TotalScore.

If EvaluatorWeights are defined, then the only applicable encoding for this task is the Stacked Bar Chart. Otherwise, Radar Chart Design 2 is also weakly effective for this task, since the area of each polygon is roughly proportional to the TotalScore squared.

AT7: Look-up value in context (Case B: UnweightedAltScore in EvaluatorWeight)

The input to this task is an alternative and an EvaluatorWeight mark for some evaluator, and the output is the UnweightedAltScore for that alternative. This task is only applicable when EvaluatorWeights are defined.

The only applicable encoding for this task is the Tabular Bar Chart with Variable Widths. Using this encoding, the task can be achieved by assessing what fraction of the evaluator's column is filled by the bar.

AT8: Browse for outliers

The input to this task is a set of AltScores for a single alternative, and the output is a set of outliers.

Box Plot Design 1 is best for this task, since it encodes outliers explicitly. Strip Plot Design 1 (and its other derivatives) are also effective, since outliers can be identified simply by finding points that are relatively far from the others.

Outliers can be detected in bar charts by identifying bars that are much longer or shorter than others in their region. Large outliers are more perceptually salient than small outliers because they 'stick out' from the others. Multi-bar Chart Design 1 is the most effective of the bar-based idioms due to its precision and the proximity of the bars. Tabular Bar Chart Design 2 is less precise, while Multi-bar Chart Design 2 requires comparison of bars across regions. Sorting based on AltScore could increase the efficacy of bar charts for this task.

Outliers can be detected using Parallel Coordinates Design 2 or Radar Chart Design 2 by looking for non-recurrent spikes. Unlike bar charts, these are not perceptually biased toward large outliers, but they are disadvantaged in that the lines overlap with each other and may interfere perceptually.

The remaining Design 2 point-based idioms are not effective because it is too difficult to visually isolate the distribution of interest (see AT6: Case B). The Stacked Bar Chart and Tabular Bar Chart Design 1 are the least effective because they require comparison of unaligned widths.
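Box Plot Design 1 "encodes outliers explicitly" according to whatever rule the implementation adopts; a common choice is the Tukey convention of fences at 1.5 x IQR beyond the quartiles. A minimal sketch assuming that convention, with invented scores:

```python
import numpy as np

def tukey_outliers(values, k=1.5):
    """Flag values beyond k * IQR from the quartiles (the common box plot
    convention; the assessment above does not depend on a specific rule)."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

# Hypothetical AltScores for one alternative across eight evaluators.
scores = [0.55, 0.60, 0.58, 0.62, 0.57, 0.61, 0.95, 0.59]
print(tukey_outliers(scores))  # [0.95]
```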
AT9: Browse for top/bottom values (Case A: one evaluator)

The input to this task is a set of AltScores for a single evaluator, and the output is a set of top or bottom values.

The relative strengths of the encodings for identifying top and bottom values are similar to those of task AT8, except this case is concerned with a distribution over alternatives.

The best encoding for this task is Strip Plot Design 2 (and its derivatives), as the top and bottom values are simply the points furthest to the left or right along a single axis. This could be harder with Radar Chart Design 2 because the axis of interest might not be perpendicular to the line of sight.

The remainder of the assessment mirrors that of AT8 with the Design numbers reversed. In this case especially, the ability to sort bar charts by AltScore could significantly improve task performance.

AT9: Browse for top/bottom values (Case B: all data)

The input to this task is a set of TotalScores for all alternatives, and the output is a set of top values. If EvaluatorWeights are defined, the only applicable encoding for this task is the Stacked Bar Chart. Otherwise, an 'Average Evaluator' can be added to any of the other plots to show the average scores (which is effectively the same as showing the total scores). In this case, the efficacy of each encoding is the same as for AT9 Case A.

AT10: Browse for non-dominated distributions

The input to this task is all the AltScores, and the output is a set of dominance relationships between the alternatives.

In the worst case, this simply requires performing task AT4 for every pair of alternatives, but this is not necessary most of the time. Dominance relationships can be identified at a glance using Parallel Coordinates Design 2 or Radar Chart Design 2 by looking for lines or sets of lines that do not intersect.

The efficacy of the Design 1 non-radial point-based idioms for this task can be improved by sorting the plots by TotalScore so that fewer comparisons need to be made. The same is true of Tabular Bar Chart Design 1 (Figure 5.19).

Figure 5.19: It is easier to identify dominated alternatives when the rows are sorted by TotalScore (right) than when they are not (left). When the rows are sorted, each row only needs to be compared to the rows above it. In this example, Grandma's Basement is dominated by Budget, which is dominated by Fairmont.

The remaining encodings are not effective for this task for reasons discussed in AT4.

Summary of Task-based Assessment

Figure 5.20 summarizes the results of the task-based assessment when EvaluatorWeights are defined. For each task, the best encodings are assigned a score of 3, strongly effective encodings are assigned a score of 2, weakly effective encodings are assigned a score of 1, and ineffective encodings are assigned a score of 0. Inapplicable encodings are marked with a hyphen.

Figure 5.21 summarizes the same information when EvaluatorWeights are not defined. Note that the only differences between the two figures are:

1. Figure 5.21 does not have a row for Tabular Bar Chart Design 1 with Variable Widths (it is not applicable).
2. Figure 5.21 does not have a column for AT7: B (it is not applicable).
3. Some of the scores for AT7: A and AT9: B are different for reasons discussed in the text.

Figure 5.20: Support for each auxiliary task by encoding when EvaluatorWeights are defined. 3 = best, 2 = strongly effective, 1 = weakly effective, 0 = ineffective. Gray cells indicate that the encoding is not applicable to that task. The rows are sorted by the Total Score column, which contains the sum of scores for each row.

Figure 5.21: Support for each auxiliary task by encoding when EvaluatorWeights are not defined. 3 = best, 2 = strongly effective, 1 = weakly effective, 0 = ineffective. Gray cells indicate that the encoding is not applicable to that task. The rows are sorted by the Total Score column, which contains the sum of scores for each row.
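The Total Score column in these figures is a simple row sum over the per-task scores, skipping inapplicable (gray) cells. A minimal sketch with an invented fragment of the score matrix (the values below are illustrative placeholders, not the actual scores from Figure 5.20):

```python
# Hypothetical per-encoding task scores; None marks an inapplicable task.
scores = {
    "Tabular Bar Chart Design 1 (variable widths)": [3, 2, 2, 2, None, 3],
    "Parallel Coordinates Design 2": [1, 2, 3, 3, 2, None],
    "Stacked Bar Chart": [0, 0, 1, 0, None, 3],
}

# Total Score = sum of the applicable cells in each row.
totals = {
    encoding: sum(s for s in row if s is not None)
    for encoding, row in scores.items()
}
for encoding, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{total:2d}  {encoding}")
```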
What is apparent is that most tasks are strongly supported by at least one of the top two encodings in Figure 5.20: Tabular Bar Chart Design 1 (with variable widths) and Parallel Coordinates Design 2. This suggests that these two encodings can be used in conjunction to support most tasks.

Another observation is that parallel coordinates dominate radar charts except in AT7 Case A, and then only when there are no EvaluatorWeights. In other words, the only benefit that radar charts confer is that they weakly encode TotalScore via polygon area. In light of this and the numerous problems with radar charts discussed earlier, we eliminate them from further consideration.

We will return to this discussion in Section 5.3 when we consider how different design choices may be combined to effectively support a variety of tasks.

5.2 Dynamic Design Aspect

This section describes a number of options for transforming the view so that the user can perform multiple analytic tasks in sequence or perform particular tasks more effectively.

5.2.1 View Transformations

Rearrange: Reorder and Sort

Allowing users to manually reorder rows, columns, and plots gives them control over which items are adjacent, and this can improve their performance on comparison tasks (AT2, AT3, and AT4). This is especially true for the bar-based idioms.

Allowing users to sort elements by TotalScore or AltScore for a particular evaluator or alternative can improve their performance on tasks related to identifying top values (AT9) or looking for dominance relationships (AT10). It can also help them perform further analysis on top-performing options only. For instance, one evaluator might want to inspect how her top alternatives perform for other evaluators.

Rearrange: Change Mapping

Allowing users to change the mapping from dimensions to regions/marks gives them the flexibility to toggle between Designs 1 and 2 of each idiom. Whether or not this is advised depends on which idioms are already provided and how potential conflicts in the use of colour will be resolved (Section 5.3).

If a multi-bar chart or tabular bar chart is in use, users might also be permitted to select which dimension to map to colour, since it is not strictly dictated by the spatial mapping. This functionality would not greatly add to the users' ability to perform any of the identified tasks. Furthermore, it is not recommended if the tabular bar chart is paired with a stacked bar chart, as this would break the correspondence between the two.

Table 5.2 summarizes which rearrangements are applicable to each encoding. We do not recommend allowing users to manually reorder bars within regions of a multi-bar chart, as this could lead to inconsistency across regions. We also do not recommend allowing users to reorder the segments of a stacked bar chart. However, if the stacked bar chart is paired with a tabular bar chart, then changing the order of columns in the tabular bar chart should change the order of the segments as well.
Table 5.2: Applicable rearrangements for each encoding (✓ = justifiable; ? = applicable but ill-advised; rearrangements not listed for an encoding are impossible or nonsensical).

• Stacked Bar Chart: manually reorder alternatives ✓; manually reorder evaluators ?; sort alternatives by TotalScore ✓; sort alternatives by AltScore (for an evaluator) ✓; sort evaluators by AltScore (for an alternative) ✓
• Multi-bar Chart Design 1: manually reorder alternatives ✓; manually reorder evaluators ?; all three sorts ✓; swap region/mark mapping ✓; swap colour mapping ✓
• Multi-bar Chart Design 2: manually reorder alternatives ?; manually reorder evaluators ✓; all three sorts ✓; swap region/mark mapping ✓; swap colour mapping ✓
• Tabular Bar Chart (Designs 1 and 2): both manual reorders ✓; all three sorts ✓; swap region/mark mapping ✓; swap colour mapping ?
• Point-based Design 1: manually reorder alternatives ✓; sort alternatives by TotalScore ✓; sort alternatives by AltScore (for an evaluator) ✓; swap region/mark mapping ✓
• Point-based Design 2: manually reorder evaluators ✓; sort evaluators by AltScore (for an alternative) ✓; swap region/mark mapping ✓

Add Emphasis

A final type of view transformation is the ability to emphasize or highlight an entity of interest. This technique alters the appearance of a mark to make it stand out - possible alterations include changing the hue, increasing saturation, or magnifying the mark. Linked highlighting adds emphasis to a set of entities that are related to the selected entity. In this case, related entities would be those of the same colour - that is, all other marks for a particular evaluator (in the case of Design 1 encodings) or alternative (in the case of Design 2 encodings). Linked highlighting could improve users' ability to locate distributions (AT6) and compare distributions (AT3), especially in cases where the distributions of interest are spread across regions or axes.

This design choice is coupled with the select design choice, which is the mechanism by which users choose items for further action (in this case, highlighting) [38]. One common mechanism that we recommend is hover, which selects an item for as long as the mouse hovers over it. It may also be worthwhile to allow users to select multiple items for highlighting at once - this is typically done via mouse click. This would make it easier for users to keep two or more distributions in focus for comparison tasks.

5.2.2 Data Transformations

Filtering

There are two types of filtering a designer might want to support: filtering on entities and filtering on values.

Filtering on entities is the ability to select a subset of alternatives or evaluators to inspect at any time. This can facilitate any number of tasks by removing distracting elements. Filtering is especially important when working with parallel coordinates or radar charts, since the distributions occupy the same space.

Filtering on values is the ability to exclude alternatives based on TotalScore or AltScore for a particular evaluator. This would allow users to set satisficing thresholds that must be met for an alternative to be considered. This feature is not required to support any of the tasks we identified, but it could be useful in scenarios where satisficing thresholds are important.

Both types of filters are applicable to all encodings. There are a number of ways to implement filter controls, such as checklists for categorical entities or range sliders for quantitative entities. Another mechanism for filtering is brushing, which allows users to specify a region to filter out or leave in with a drag of the mouse. If this design choice is used, there also needs to be a clear mechanism for reversing the action.
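Both filter types amount to simple predicates over the underlying table. A minimal sketch (with invented data; the evaluator chosen and the 0.5 threshold are arbitrary) of filtering on entities and filtering on values:

```python
# Hypothetical table of AltScores: {alternative: {evaluator: score}}.
alt_scores = {
    "Fairmont": {"Ann": 0.9, "Bob": 0.4, "Carol": 0.7},
    "Budget": {"Ann": 0.5, "Bob": 0.8, "Carol": 0.6},
    "Days Inn": {"Ann": 0.3, "Bob": 0.6, "Carol": 0.2},
}

# Filtering on entities: keep only a chosen subset of alternatives.
keep = {"Fairmont", "Budget"}
by_entity = {a: s for a, s in alt_scores.items() if a in keep}

# Filtering on values: a satisficing threshold on one evaluator's AltScore.
threshold = 0.5
by_value = {a: s for a, s in alt_scores.items() if s["Bob"] >= threshold}

print(sorted(by_entity))  # ['Budget', 'Fairmont']
print(sorted(by_value))   # ['Budget', 'Days Inn']
```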
Details-on-demand

Another type of transformation involves augmenting the display with more detailed information. For example, users might want to query the precise value encoded by a bar or mark, as this information may be difficult to glean from the graphical representation alone. Possible implementations of this feature include a label overlay that can be turned on or off or a tool-tip that appears when the user hovers over a mark. The tool-tip could also include the label for the mark in order to expedite identification (AT1).

Other forms of textual information designers might consider making available on demand include averages, variances, and axis details. If evaluators supplemented their scores with text explanations, this could be displayed whenever a user clicks on the corresponding mark.

5.3 Composite Design Aspect

In this section, we offer recommendations on how different encodings and interactions can be integrated to create a complete interactive tool for preference synthesis in the context of Group Preferential Choice at Level P0b of the taxonomy. Note that all recommendations are tentative and may be revised as we collect more empirical data.

We start with some general recommendations that apply to all cases. Then, we present recommendations for each of the three classes of users identified in Section 3.6, starting with the least sophisticated. We recognize that not all cases fall cleanly into one of these three classes, but designers should be able to pick and choose recommendations from each to suit their exact situation.

5.3.1 General Recommendations

Number and Arrangement of Views

It is clear from the task-based assessment that no single encoding is sufficient to support all tasks. For this reason, many of our recommendations employ the multiform design choice, in which the same data is faceted into two views that use different encodings [38]. If the intended platform is a desktop or laptop computer, we recommend splitting the window horizontally and populating each half with a single encoding in the horizontal orientation, since this arrangement offers the most precision. It may also be beneficial to allow users to adjust the size of the views in order to devote more screen real estate to one or the other.

There is a cost associated with multiple views, both in terms of cognitive load and screen real estate [62], so we do not advise supporting more than two views. As the next few sections will demonstrate, it is possible to strongly support every task using combinations of just two encodings and a few basic interactions.

Evaluator Weights

For designers of general purpose tools, we recommend including support for evaluator weights, since this feature was desired in the majority of the cases we studied. If EvaluatorWeights are included, then the only viable option is Tabular Bar Chart Design 1 paired with a Stacked Bar Chart, as it is the only combination that supports joint inspection of the three related measures (AT7 Cases A and B) and identifying alternatives with the top weighted scores (AT9 Case B). From this point forward, we will treat these two encodings as a unit due to their complementary nature.

If the designer does not intend to support evaluator weights, then the options are more flexible. In this case, there may be no need to compute total scores in the first place. In fact, it may be more useful to show the average scores, since these are on the same scale as individual scores and can be conveyed by adding an 'Average Evaluator' to any plot.
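Adding an 'Average Evaluator' is a purely data-side operation, and because the average is the TotalScore divided by a constant, ranking by either is equivalent when there are no weights. A minimal sketch with invented scores:

```python
import numpy as np

# Hypothetical AltScores: rows = alternatives, columns = evaluators.
scores = np.array([
    [0.9, 0.4, 0.7],
    [0.5, 0.8, 0.6],
    [0.3, 0.6, 0.2],
])

# The 'Average Evaluator' is just one more column: the row-wise mean.
# It stays on the same 0-1 scale as the individual scores.
average_evaluator = scores.mean(axis=1)
print(average_evaluator)   # e.g. [0.667 0.633 0.367]
print(scores.sum(axis=1))  # unweighted TotalScores: the same ordering
```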
5.3.2 Class C: Casual Users

This class includes users involved in low-stakes decision making in a casual setting. Examples include selecting a gift for a colleague or choosing a hotel to stay at. We now present two viable options that ought to be suitable for this class of users.

Option 1: Tabular Bar Chart Design 1 + Stacked Bar Chart (single view)

This is the simplest option if the designer intends to support evaluator weights, as it only requires one view. It is not effective for tasks that require comparison across evaluators (AT2 Case B, AT5, AT8), and it is only weakly effective for tasks that require comparison across alternatives (AT2 Case A, AT9 Case A). For the latter, the designer might include a text overlay that labels each bar with its value to facilitate more precise comparison. Additionally, users could be given the option to collapse the Stacked Bar Chart to devote more space to the Tabular Bar Chart.

Option 2: Option 1 + Box Plot Design 1 (dual view)

In order to identify potentially strong combinations for a dual-view design, we computed a score for each pair of encodings by taking the sum of the maximum score on each task (a sketch of this computation follows the list below). The parallel coordinate designs were excluded from consideration due to the moderate learning curve and the fact that most people are not familiar with them [38] [41]. Including unfamiliar idioms may confuse casual users and make them less likely to stick with the tool.

Of the pairs that were included, the top scoring combinations were:

1. Tabular Bar Chart Design 1 + Box Plot Design 1
2. Tabular Bar Chart Design 1 + Strip Plot Design 2
3. Box Plot Design 1 + Strip Plot Design 2
4. Box Plot Design 1 + Multi-bar Chart Design 2
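A minimal sketch of this pairwise scoring (the per-task scores below are invented placeholders rather than the values behind Figures 5.20 and 5.21):

```python
from itertools import combinations

# Hypothetical per-encoding task scores (0-3) over four tasks.
task_scores = {
    "Tabular Bar Chart Design 1": [3, 1, 2, 1],
    "Box Plot Design 1": [0, 3, 1, 3],
    "Strip Plot Design 2": [2, 2, 3, 1],
    "Multi-bar Chart Design 2": [2, 1, 2, 1],
}

def pair_score(a, b):
    # A pair is scored by the better of the two encodings on each task.
    return sum(max(x, y) for x, y in zip(task_scores[a], task_scores[b]))

ranked = sorted(combinations(task_scores, 2),
                key=lambda pair: -pair_score(*pair))
for a, b in ranked:
    print(pair_score(a, b), a, "+", b)
```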
Of these, only the first uses the same colour mapping in both encodings. This is desirable because it preserves the semantics of colour across views [43]. The remaining pairs would require two distinct, non-overlapping colour palettes. Otherwise, they would risk implying connections between unrelated marks [43]. This limits their scalability to about a dozen entities in total (alternatives and evaluators). For this reason, we recommend the first pairing above all others.

When combined with the Stacked Bar Chart (Figure 5.22), this pairing strongly supports all tasks except AT9 Case A, which is weakly supported by the Tabular Bar Chart. This weakness can be mitigated by including sort functionality and text labels for bar values. The one drawback of this design (and multiform designs in general) is that users might get confused shifting attention back and forth between the two views since they use different idioms and the axes do not correspond.

Figure 5.22: Class C Option 2: Dual View with Tabular Bar Chart Design 1 + Stacked Bar Chart (top view) and Box Plot Design 1 (bottom view). Note that this and other figures in this section are intended for rough illustration only - we would expect an actual implementation to be more polished and include appropriate interaction controls.

Interactions

At the very least, users should be able to sort rows and plots by TotalScore or AltScore for a particular evaluator, as this is essential for inspecting top values (AT9) and identifying dominance relationships (AT4). Ideally, users should also be able to reorder plots, rows, and columns manually to support particular comparisons of interest. The ability to sort columns by AltScore for a particular alternative is not essential, but would be nice to have. Whenever the columns in the tabular bar chart are reordered, the segments in the corresponding stacked bar chart should be reordered too. We leave it to the designer to choose the mechanism for implementing these features.

Another essential feature is the ability to filter alternatives and evaluators, as this allows users to remove distractions and narrow the scope of analysis. Filtering on values is not essential for small data sets and may be too advanced for casual users. We recommend a global scope for filter controls in order to preserve consistency between views [43].

Table 5.3: Recommended Interactive Features for Class C

    Essential    | Sort alternatives by TotalScore/AltScore for evaluator
                 | Filter alternatives and evaluators
    Ideal        | Manually reorder alternatives and evaluators
                 | Tool-tips for dots (AltScore and identity)
                 | Label overlay for bars (AltScore)
    Nice-to-have | Sort evaluators by AltScore for alternative
                 | Linked highlighting (on hover)

If linked highlighting is implemented, it should be applied to same-colour marks across all views. This will help users stay oriented when shifting attention between views. Highlight-on-hover may be sufficient for casual users.

Finally, we recommend tool-tips for dots that show their identity and value. As previously mentioned, we also recommend text overlays for the bar charts that show the values of the segments and bars. If EvaluatorWeights are defined, the text overlay should specify the AltScore (not the UnweightedAltScore) for consistency between the tabular and stacked bar charts. The text colour should be discernible against the bar colour, and the user should have the ability to turn the overlay on and off.

5.3.3 Class B: Professional Users

This class of users includes professionals involved in medium to high-stakes decision making in a work setting. Examples include faculty hiring and software stack selection. In the cases we studied, this class of decisions was also recurrent, but this may have been a coincidence within our sample.

The space of viable options for this class is somewhat larger than for Class C, since designers may want to provide more or less flexibility depending on the exact work context and expertise of potential users. Here, we describe two possible options that might be worth considering.

Option 1: Dual View with Custom Strip Plot

This design is identical to Option 2 for Class C except that the second view contains a custom strip plot that allows users to select:

1. which dimension to map to plots
2. which overlay to apply (box plot, parallel coordinates, or none)

Figure 5.23: Class B Option 1: Dual View with Tabular Bar Chart Design 1 + Stacked Bar Chart (top view) and Custom Strip Plot (bottom view). The user may select a dimension to plot and an overlay. In this example, the user has selected Evaluators with a Parallel Coordinates overlay, producing Parallel Coordinates Design 1.

This design allows users to access the capabilities of all six strip plot-based designs with just a little exploration. All tasks are strongly supported by at least one encoding in this space. The only problem is that it introduces the risk of two different colour mappings in the same window (Figure 5.24). Again, this is not ideal because it reduces scalability by a factor of two, but it might be acceptable if both dimensions are small.

Figure 5.24: The user has constructed Parallel Coordinates Design 2, resulting in two different colour mappings. (In this example, the colour palettes overlap - we recommend using distinct colour palettes.)

A possible solution would be to let users define the colour mapping at a global level. That way, they can choose the mapping that is most helpful for their current task while preserving consistency between views. If Evaluators is selected, then Strip Plot Design 2 dots belonging to the same axis will all have the same colour (and vice versa). To preserve some degree of discriminability in all cases, the designer might also choose to map different shapes to the items of the secondary dimension (Figure 5.25).
If Alternatives is selected, then the segments of the Stacked Bar Chart will be the same colour. A dividing line can be drawn between them to keep them distinguishable (Figure 5.26).

Figure 5.25: The user has selected to map colour to Evaluators. The bottom view contains Box Plot Design 2, where shape is used to preserve some discriminability of hotels along each plot.

Figure 5.26: The user has selected to map colour to Alternatives, causing the segments of the Stacked Bar Chart to be the same colour. White dividing lines preserve some discriminability.

Option 2: Dual View with Intelligent Plot Selection

Notice that all tasks are strongly supported by at least one of the following: Stacked Bar Chart, Tabular Bar Chart Design 1, Box Plot Design 1, and Parallel Coordinates Design 2. In fact, Box Plot Design 1 and Parallel Coordinates Design 2 are the reason that Option 1 achieves complete task coverage.

However, transitioning back and forth between these two encodings in Option 1 requires three toggles - one to change the colour mapping, one to change the dimension mapping in the strip plot, and one to change the overlay. Furthermore, the user might not realize the complementary power of these two encodings and may end up wasting time with less effective intermediaries.

An alternative approach is to populate the two views with effective, complementary encodings given the selected colour mapping. If colour is mapped to Evaluators, then the secondary view is populated with Box Plot Design 1. Otherwise, it is populated with Parallel Coordinates Design 2. On its own, each pair of designs strongly supports most tasks, but the combination of all four strongly supports every task.

Figure 5.27: Intelligent plot selection when the user has selected to map colour to Evaluators.

Figure 5.28: Intelligent plot selection when the user has selected to map colour to Alternatives.

One problem with this design is that users might not expect the encoding to change when they toggle the colour mapping. A simple solution would be to change the name of the drop-down or other toggle mechanism to 'Analysis Mode.'
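The coupling between the colour-mapping toggle and the secondary encoding reduces to a one-line dispatch. A minimal sketch (the function and mode names are ours, not from any existing implementation):

```python
def secondary_view(colour_mapping: str) -> str:
    """Pick the complementary secondary encoding for the selected 'Analysis
    Mode' (a sketch of the coupling described above, not a full tool)."""
    if colour_mapping == "Evaluators":
        return "Box Plot Design 1"
    if colour_mapping == "Alternatives":
        return "Parallel Coordinates Design 2"
    raise ValueError(f"unknown colour mapping: {colour_mapping!r}")

# The primary view stays Tabular Bar Chart Design 1 + Stacked Bar Chart.
print(secondary_view("Evaluators"))    # Box Plot Design 1
print(secondary_view("Alternatives"))  # Parallel Coordinates Design 2
```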
Interactions

The recommended interactions for this group include all of those for Class C with a few additions (Table 5.4). The first addition is the ability to change the colour mapping, which is integral to both suggested designs. The second addition is linked highlighting with multi-select, which would allow users to apply persistent highlighting to items of interest. This could help them keep multiple items in focus while performing complex tasks involving parallel coordinates, such as AT3. The final addition is the ability to filter alternatives based on TotalScore or AltScore, which would enable users to set satisficing thresholds.

Table 5.4: Recommended Interactive Features for Class B. Items marked (new) are additions to the list for Class C (Table 5.3).

    Essential    | Sort alternatives by TotalScore/AltScore for evaluator
                 | Filter alternatives and evaluators
                 | Swap colour mapping (new)
    Ideal        | Manually reorder alternatives and evaluators
                 | Tool-tips for dots (AltScore and identity)
                 | Label overlay for bars (AltScore)
                 | Linked highlighting + multi-select (new)
    Nice-to-have | Sort evaluators by AltScore for alternative
                 | Linked highlighting (on hover)
                 | Filter alternatives on AltScore/TotalScore values (new)

5.3.4 Class A: Specialized Users

This class of users includes professionals and governing officials involved in very high-stakes decision making that impacts society at large. These users are often aided by consultants with expertise in formal decision processes - these experts are included in this group as well.

This class is the most likely to require sophisticated analysis software. However, this need typically comes hand-in-hand with more sophisticated preference models - that is, expressed at a higher level of the Preference Model Taxonomy. As such, the recommendations for Class A do not differ much from those for Class B at this level. The recommendations will diverge as we extend the design space to higher levels of the taxonomy.

If users in this class do express their preferences at Level P0b, then a likely task would be to assess the sensitivity of the final result to aggregation method and evaluator weights, as in the Mariner Jupiter-Saturn project [22]. Tasks related to sensitivity analysis are currently beyond the scope of this analysis - we leave this topic to future work.

Chapter 6

Conclusion

Group Preferential Choice can be challenging due to its multi-variate and interpersonal nature. There is considerable evidence that structured decision processes [6] [51] and individual preference modelling in particular [4] can promote more fruitful analysis and discussion, ultimately leading to greater satisfaction with the outcome.

The potential benefits of individual preference modelling are constrained by how effectively the data is presented to decision makers. Information Visualization solutions have great potential, but only a handful have been attempted [40] [4] [36]. Furthermore, no work thus far has attempted to characterize sources of variation among Group Preferential Choice scenarios.

This work makes progress on these fronts in three major steps, which are summarized in Section 6.1. Section 6.2 critically reflects on the limitations and vision of the work, and Section 6.3 presents possible directions for future work.

6.1 Summary of Contributions

This section summarizes the major contributions of this work and anticipates how they might be used by other academics or designers of Group Preferential Choice support tools. All contributions are works in progress - they may be extended or refined as new information is gathered.

6.1.1 Characterization of Group Preferential Choice

The goal of Chapter 3 was to characterize sources of variation in the data, goals, and decision making contexts of Group Preferential Choice. This was achieved by performing an in-depth analysis of a diverse set of Group Preferential Choice scenarios. The results of this analysis can help designers define the scope of their work by orienting them to the space of possibilities.

Section 3.4 presented a data model for Group Preferential Choice, including a taxonomy of commonly-used preference models. The model was extended to account for new sources of variation that were discovered during the analysis of scenarios. This model is the interface between specific decision problems and the rest of our work - if a decision problem can be described in these terms, then readers can easily identify which tasks and design recommendations are applicable to their situation.

Section 3.5 presented a summary of goals for preference synthesis in the context of Group Preferential Choice. It is worth reiterating that this is not intended to be an exhaustive list. Depending on the exact situation, designers may wish to support additional goals or only a subset of these goals.
As noted in Section 3.7, three goals were found in at least three scenarios - these would be good candidates for inclusion in any general-purpose support tool.

Finally, Section 3.6 summarized the variation in contextual features across scenarios. We found that the scenarios form roughly three clusters at different levels of sophistication. This result can help designers define the target audience for their tools by giving them a sense of likely classes of users.

6.1.2 Data and Task Abstraction for Preference Synthesis

The goal of Chapter 4 was to describe the data and goals identified in Chapter 3 in abstract terms that are suitable for visualization design and analysis. This is the bridge between the descriptions of Chapter 3 and the design recommendations in Chapter 5 and beyond.

Section 4.1 described the data in terms of multi-dimensional tables. This abstraction is useful because the pros and cons of different encodings for tabular data are well known [38], and there has been considerable work on representing large multi-dimensional data sets in particular [16] [54].

Section 4.2.1 presented a list of tasks to support each goal. Some of these tasks were derived from the scenarios we studied, and others were added based on intuition. Again, this list is not intended to be exhaustive and may be iteratively improved as more data is collected. Finally, Section 4.2.2 described each of these tasks in terms of a smaller set of low-level tasks from Brehmer and Munzner's task taxonomy [7]. This is useful because it allows potential designs to be evaluated more efficiently.

In addition to providing abstractions for the current set of goals, this analysis also serves as a template for abstracting new goals that are identified in the future.

6.1.3 Design Space for Preference Synthesis

Chapter 5 presented our final contribution, which is a prescriptive design space of visualizations to support preference synthesis in the context of Group Preferential Choice. At this time, the design space is limited to small-scale decision problems where preferences are expressed at Level P0b of the taxonomy - that is, each evaluator simply scores each alternative. Despite the limited scope of the current design space, the analysis underlying its construction lays the foundation upon which a complete design space may be built. As it stands, we believe that designers of Group Preferential Choice support tools will find plenty of useful suggestions regardless of the complexity of their data.

Section 5.1 introduced the major competitive idioms for presenting small-scale tabular data and analytically evaluated their suitability for each auxiliary task. Section 5.2 described how interactivity could be introduced to enhance the efficacy of the static encodings. Finally, Section 5.3 showed how a complete support system could be constructed from the aforementioned elements, with specific recommendations for each of the three contextual classes identified in Chapter 3.

Although the design space is tailored to Group Preferential Choice, many of our recommendations could also be applied to the design of visualizations for other preferential data sets, including but not limited to rankings, surveys, and evaluations. Furthermore, the task-based assessment of static encodings (Section 5.1.2) applies to tabular data in general.

6.2 Critical Reflections

6.2.1 Goals Elicitation

The procedure we used to elicit scenario goals in Chapter 3 was structured and systematic, but it was not without limitations.
According to human-centred design experts, the most effective way to attain a complete and accurate understanding of a situation is using a combination of in situ observation and interviews [57]. Due to time constraints, we were only able to do this for two scenarios - Faculty Hiring (department meeting portion) and XpertsCatch.

The Best Paper, Gift, and Faculty Hiring (committee meeting portion) scenarios were assessed through interviews conducted after the fact. This is not as effective, since interviewees may not be able to accurately identify, recall, or communicate key aspects of the situation [27]. The remaining three cases were assessed by reviewing second-hand reports, which is also error-prone due to the degree of separation between the original situation and the analyst. Another potential source of bias is the analyst's interpretation of the data - in our case, this involved compiling a list of scenario goals from the interview notes.

On the one hand, problem characterization is seldom done at all in Information Visualization [37], so any attempt to do so may constitute satisfactory progress. On the other hand, Group Preferential Choice is highly complex and human-centred, and so the risk of some elements getting lost in translation is high. Our hope is that the scenarios we examined are sufficiently rich that they converge upon key points despite the methodological limitations.

There are several immediate actions we could take to validate our model, which are discussed in Section 6.3. However, it is unlikely that we will fully understand the complexity of this problem space until support tools are deployed, which brings us to our next topic.

6.2.2 A More Agile Approach?

Thus far, our approach has been to perform a series of analyses on an entire class of problems in sequence. The strength of this approach is that we now have a solid framework for relating specific scenarios to the overarching problem space. This is useful because it allows us to iteratively refine our understanding as new data is encountered.

A major challenge with this approach is that the sheer amount of variation within each step makes it easy to lose sight of tangible realities. Errors in the early stages of analysis did not always become apparent until later stages, and recovery was sometimes costly due to the layers of complexity and abstraction that needed to be synchronized.

Now that we have a preliminary model in place, it may be worthwhile to switch to an alternative but complementary approach. Specifically, we could embark on a series of design studies following the methodology proposed in Sedlmair et al. [52]. In a design study, the needs of a particular group of users are identified, a visualization solution is implemented and evaluated, and insights are recorded. After several iterations of this process, we could compile our insights and update our data model, task abstractions, and design space recommendations accordingly. This would allow us to achieve breadth while maintaining agility and practical grounding.

6.3 Future Work

6.3.1 Validating the Data and Task Model

There are a number of actions we could take to improve the completeness and accuracy of our data and task models.

The first and easiest would be to have another researcher reproduce the descriptions of each scenario based on our interview notes and second-hand sources. Then, the two descriptions could be compared for discrepancies.
Another easy option would be to go back and validate the written description of each scenario with interviewees and authors of second-hand sources (where possible).

An even better approach would be to collect new data by observing Group Preferential Choice scenarios as they occur. This might be more productive, since we could apply the lessons learned to the new situation. This could be done in the context of complete design studies, as suggested in Section 6.2.

6.3.2 Validating the Task-based Assessment

It is important to emphasize that the task-based assessment in Section 5.1.2 contains a fair amount of speculation. We referred to reliable sources wherever possible, but there were a surprising number of cases where we could not definitively say which encoding was better based on available literature. In particular, there is a scarcity of literature devoted to comparing strip plots and bar charts, and that which does exist invokes general principles such as data-ink maximization rather than empirical data on task efficacy [15] [46].

This could be a rich territory for future research in the field of Vision Science. Questions that one might ask include:

1. Do the connecting lines on parallel coordinates plots affect perception of distance between points along each axis?
2. Under what circumstances do bar charts or strip plots support more accurate comparisons?
3. Do bar charts or parallel coordinates (single line) give a more accurate impression of variance?

It may well be the case that answers to these questions exist but are hard to find due to a scarcity of relevant surveys. In this case, conducting a review of relevant Vision Science literature could be a valuable avenue for future research. Otherwise, we hope that future research in Vision Science will shed light on these questions, as the answers would be valuable to anyone interested in the pros and cons of different ways of presenting tabular data.

6.3.3 Extending the Design Space to Other Levels of the Taxonomy

We are currently working on extending the design space to the remaining levels of the taxonomy while retaining the same constraints, that is:

1. There are no more than a dozen alternatives or evaluators.
2. The Evaluator and Criteria hierarchies are flat.
3. Preferences are expressed on a scale with no negative values.

Recall that the set of applicable designs depends on which dimensions and measures are defined (Figure 6.1). With the exception of EvaluatorWeights, this is determined wholly by the level of the Preference Model Taxonomy.

Figure 6.1: Overview of Dimensions and Measures defined at each level of the Preference Model Taxonomy.

Since each level of the taxonomy implicitly encodes all the levels above it, the design space at each new level is a superset of the design space of the levels above it. Therefore, we will also consider ways to support transitions between different levels of the taxonomy - for instance, factoring out the Criteria dimension to move from P1b to P0b.

6.3.4 Relating Existing Encodings to the Design Space

We have already performed an extensive analysis of each of the tools introduced in Chapter 2. This analysis currently exists as a detailed slide-deck, which is shown in Appendix A. It describes the capabilities of these tools in terms of our data model and identifies the static idioms, mappings, interactive techniques, and other design choices they employ.
The next step is to relate this explicitly to the design space once it is complete.

6.3.5 Extending the Design Space to Hierarchical and Large Dimensions

In the future, we hope to extend the design space to include hierarchical dimensions and dimensions with more than a dozen items. This promises to be an exciting area of research, as there are a number of interesting possibilities, including:

• More compact encodings for tabular data, such as heatmaps
• Data reduction strategies, such as:
  – Hierarchical aggregation
  – Histograms
  – Focus + context (juxtaposed views or focal lens)
• Hierarchy representation and traversal strategies, such as:
  – Node-link graphs
  – Rectilinear trees
  – Semantic zooming

We have already begun reviewing relevant literature in this area. Liu et al. [34] provides an overview of the pros and cons of different data reduction strategies. The main takeaway is that binned aggregation is the ideal data reduction strategy, since it captures both global trends and outliers. Other data reduction strategies include filtering, sampling, and model-fitting. Filtering and sampling may hide global trends, whereas model-fitting may hide interesting outliers.

Stolte et al. (2002a) [53] presents Polaris, a novel interface for exploring multi-dimensional tables, and Stolte et al. (2002b) [54] extends Polaris to support hierarchical dimensions. The latter proposes basic mechanisms to allow users to drill-down and roll-up hierarchies via drop-down selection. Polaris became the basis for the popular visual analytics tool suite Tableau. When extending the design space to include hierarchical dimensions, we will look to Tableau for guidance due to its long history and widespread use.
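As a concrete illustration of binned aggregation (our own sketch of the general strategy surveyed in [34], not the imMens implementation), the histogram below keeps the global shape of a large score distribution while the sparsely populated top bin still exposes a few injected outliers:

```python
import numpy as np

# Generate a large set of hypothetical scores and plant a few outliers.
rng = np.random.default_rng(0)
scores = np.clip(rng.normal(0.6, 0.1, 10_000), 0, 1)
scores[:5] = 0.99  # invented outliers near the top of the scale

# Binned aggregation: counts per bin preserve the overall trend, and the
# near-empty top bin still reveals that the outliers exist.
counts, edges = np.histogram(scores, bins=20, range=(0, 1))
for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"[{lo:.2f}, {hi:.2f}): {n}")
```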
Bibliography

[1] D-Sight. 4 Rue des Pères Blancs, 1040 Brussels, Belgium, 2015. URL http://www.d-sight.com/. → pages 2

[2] K. Abraham, F. Flager, J. Macedo, D. Gerber, and M. Lepech. Multi-attribute decision-making and data visualization for multi-disciplinary group building project decisions. In Working Paper Series, Proceedings of the Engineering Project Organization Conference, Winter Park, CO, 2014. → pages 2

[3] K. J. Arrow. Social Choice and Individual Values. Yale University Press, 1963. → pages 12, 37

[4] S. Bajracharya, G. Carenini, B. Chamberlain, K. Chen, D. Klein, D. Poole, H. Taheri, and G. Öberg. Interactive visualization for group decision-analysis. Submitted to: Decision Support Systems, 2017. → pages 2, 3, 4, 12, 13, 28, 137

[5] J. Bautista and G. Carenini. An integrated task-based framework for the design and evaluation of visualizations to support preferential choice. In Proceedings of the Working Conference on Advanced Visual Interfaces, pages 217–224. ACM, 2006. → pages 4

[6] U. Bose, A. M. Davey, and D. L. Olson. Multi-attribute utility methods in group decision making: Past applications and potential for inclusion in GDSS. Omega, 25(6):691–706, 1997. → pages 2, 11, 137

[7] M. Brehmer and T. Munzner. A multi-level typology of abstract visualization tasks. IEEE Transactions on Visualization and Computer Graphics, 19(12):2376–2385, 2013. → pages xii, 6, 78, 81, 88, 139

[8] M. Brehmer, J. Ng, K. Tate, and T. Munzner. Matches, mismatches, and methods: Multiple-view workflows for energy portfolio analysis. IEEE Transactions on Visualization and Computer Graphics, 22(1):449–458, 2016. → pages 29, 30

[9] M. Brehmer, B. Lee, B. Bach, N. H. Riche, and T. Munzner. Timelines revisited: A design space and considerations for expressive storytelling. IEEE Transactions on Visualization and Computer Graphics, 23(9):2151–2164, 2017. → pages 6, 29, 30

[10] D. Brodbeck and L. Girardin. Visualization of large-scale customer satisfaction surveys using a parallel coordinate tree. In IEEE Symposium on Information Visualization, pages 197–201. IEEE, 2003. → pages 12, 24, 25, 26

[11] V. A. Brown, J. A. Harris, and J. Y. Russell. Tackling Wicked Problems: Through the Transdisciplinary Imagination. Earthscan, 2010. → pages 1

[12] G. Carenini and J. Loyd. ValueCharts: Analyzing linear models expressing preferences and evaluations. In Proceedings of the Working Conference on Advanced Visual Interfaces, pages 150–157. ACM, 2004. → pages 4, 14

[13] L. N. Carroll, A. P. Au, L. T. Detwiler, T.-c. Fu, I. S. Painter, and N. F. Abernethy. Visualization and analytics tools for infectious disease epidemiology: A systematic review. Journal of Biomedical Informatics, 51:287–298, 2014. → pages 29, 30

[14] D. Ceneda, T. Gschwandtner, T. May, S. Miksch, H.-J. Schulz, M. Streit, and C. Tominski. Characterizing guidance in visual analytics. IEEE Transactions on Visualization and Computer Graphics, 23(1):111–120, 2017. → pages 29, 30

[15] J. M. Chambers, W. S. Cleveland, B. Kleiner, P. A. Tukey, et al. Graphical Methods for Data Analysis, volume 5. Wadsworth, Belmont, CA, 1983. → pages 142

[16] S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. ACM SIGMOD Record, 26(1):65–74, 1997. → pages 79, 139

[17] W. Chen, F. Guo, and F.-Y. Wang. A survey of traffic data visualization. IEEE Transactions on Intelligent Transportation Systems, 16(6):2970–2984, 2015. → pages 29, 30

[18] C. A. B. E. Costa and J.-C. Vansnick. The MACBETH approach: Basic ideas, software, and an application. In Advances in Decision Analysis, pages 131–157. Springer, 1999. → pages 2

[19] T. N. Dang, L. Wilkinson, and A. Anand. Stacking graphic elements to avoid over-plotting. IEEE Transactions on Visualization and Computer Graphics, 16(6):1044–1052, 2010. → pages 104, 114

[20] E. Dimara, P. Valdivia, and C. Kinkeldey. DCPAIRS: A pairs plot based decision support system. In EuroVis - 19th EG/VGTC Conference on Visualization, 2017. → pages 12, 25

[21] C. Dwork, R. Kumar, M. Naor, and D. Sivakumar. Rank aggregation revisited. Technical report, IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120, 2001. → pages 37

[22] J. S. Dyer and R. F. Miles Jr. An actual application of collective choice theory to the selection of trajectories for the Mariner Jupiter/Saturn 1977 project. Operations Research, 24(2):220–244, 1976. → pages 52, 53, 55, 56, 136

[23] W. Edwards and F. H. Barron. SMARTS and SMARTER: Improved simple methods for multiattribute utility measurement. Organizational Behavior and Human Decision Processes, 60(3):306–325, 1994. → pages 49

[24] S. Few and P. Edge. Solutions to the problem of over-plotting in graphs. Visual Business Intelligence Newsletter, 2008. → pages 104, 114

[25] S. Gratzl, A. Lex, N. Gehlenborg, H. Pfister, and M. Streit. LineUp: Visual analysis of multi-attribute rankings. IEEE Transactions on Visualization and Computer Graphics, 19(12):2277–2286, 2013. → pages 12, 20, 21, 28, 98

[26] J. S. Guest, S. J. Skerlos, J. L. Barnard, M. B. Beck, G. T. Daigger, H. Hilger, S. J. Jackson, K. Karvazy, L. Kelly, L. Macpherson, et al. A new planning and design paradigm to achieve sustainable resource recovery from wastewater. Environmental Science & Technology, 43(16):6126–6130, 2009. → pages 1

[27] J. T. Hackos and J. Redish. User and Task Analysis for Interface Design. Wiley, New York, 1998. → pages 140

[28] P. Hansen and F. Ombler. A new method for scoring additive multi-attribute value models using pairwise rankings of alternatives. Journal of Multi-Criteria Decision Analysis, 15(3-4):87–107, 2008. → pages 2
[28] P. Hansen and F. Ombler. A new method for scoring additive multi-attribute value models using pairwise rankings of alternatives. Journal of Multi-Criteria Decision Analysis, 15(3-4):87–107, 2008. → pages 2

[29] I. B. Huang, J. Keisler, and I. Linkov. Multi-criteria decision analysis in environmental sciences: Ten years of applications and trends. Science of the Total Environment, 409(19):3578–3594, 2011. → pages 8

[30] C. L. Hwang and K. Yoon. Multiple Attribute Decision Making: Methods and Applications, A State-of-the-Art Survey, volume 186. Springer Science & Business Media, 2012. → pages 2, 33, 49

[31] R. L. Keeney. Foundations for group decision analysis. Decision Analysis, 10(2):103–120, 2013. → pages 12

[32] R. L. Keeney and H. Raiffa. Decisions with Multiple Objectives. Wiley, New York, 1976. → pages 8, 9, 57

[33] K. Kucher, C. Paradis, and A. Kerren. The state of the art in sentiment visualization. In Computer Graphics Forum. Wiley Online Library, 2017. → pages 29, 30

[34] Z. Liu, B. Jiang, and J. Heer. imMens: Real-time visual querying of big data. In Computer Graphics Forum, volume 32, pages 421–430. Wiley Online Library, 2013. → pages 144

[35] J. Lu and D. Ruan. Multi-Objective Group Decision Making: Methods, Software and Applications with Fuzzy Set Techniques, volume 6. Imperial College Press, 2007. → pages 33

[36] N. Mahyar, W. Liu, S. Xiao, J. Browne, M. Yang, and S. P. Dow. ConsensUs: Visualizing points of disagreement for multi-criteria collaborative decision making. In Companion of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, pages 17–20. ACM, 2017. → pages 4, 5, 12, 17, 28, 137

[37] T. Munzner. A nested model for visualization design and validation. IEEE Transactions on Visualization and Computer Graphics, 15(6), 2009. → pages 81, 97, 140

[38] T. Munzner. Visualization Analysis and Design. CRC Press, 2014. → pages 2, 14, 25, 79, 98, 99, 102, 104, 106, 107, 124, 126, 128, 138

[39] J. Mustajoki and R. P. Hämäläinen. Web-HIPRE: Global decision support by value tree and AHP analysis. INFOR: Information Systems and Operational Research, 38(3):208–220, 2000. → pages 2, 12, 18

[40] J. Mustajoki, R. P. Hämäläinen, and K. Sinkko. Interactive computer support in decision conferencing: Two cases on off-site nuclear emergency management. Decision Support Systems, 42(4):2247–2260, 2007. → pages 2, 3, 19, 20, 56, 59, 137

[41] S. Pajer, M. Streit, T. Torsney-Weir, F. Spechtenhauser, T. Möller, and H. Piringer. WeightLifter: Visual weight space exploration for multi-criteria decision making. IEEE Transactions on Visualization and Computer Graphics, 23(1):611–620, 2017. → pages 12, 22, 23, 128

[42] P. G. Pham and M. L. Huang. Qstack: Multi-tag visual rankings. Journal of Software, 11(7):695–703, 2016. → pages 12, 27, 28

[43] Z. Qu and J. Hullman. Keeping multiple views consistent: Constraints, validations, and exceptions in visualization authoring. IEEE Transactions on Visualization and Computer Graphics, 24(1):468–477, 2018. → pages 128, 130

[44] M. D. Resnik. Choices: An Introduction to Decision Theory. University of Minnesota Press, 1987. → pages 7

[45] L. Robbins. An Essay on the Nature and Significance of Economic Science. MacMillan, 1932. → pages 8

[46] N. B. Robbins. Dot plots: A useful alternative to bar charts. Business Intelligence Network Newsletter, 2006. → pages 18, 102, 114, 142

[47] N. B. Robbins, R. M. Heiberger, et al. Plotting Likert and other rating scales. In Proceedings of the 2011 Joint Statistical Meeting, pages 1058–1066, 2011. → pages 97
[48] M. N. Rothbard. Economic Controversies. Ludwig von Mises Institute, 2011. → pages 8

[49] B. Roy. Classement et choix en présence de points de vue multiples. Revue Française d'Informatique et de Recherche Opérationnelle, 2(8):57–75, 1968. → pages 10

[50] T. L. Saaty. What is the analytic hierarchy process? In Mathematical Models for Decision Support, pages 109–121. Springer, 1988. → pages 10

[51] A. Salo and R. P. Hämäläinen. Multicriteria decision analysis in group decision processes. In Handbook of Group Decision and Negotiation, pages 269–283. Springer, 2010. → pages 2, 11, 73, 137

[52] M. Sedlmair, M. Meyer, and T. Munzner. Design study methodology: Reflections from the trenches and the stacks. IEEE Transactions on Visualization and Computer Graphics, 18(12):2431–2440, 2012. → pages 29, 141

[53] C. Stolte, D. Tang, and P. Hanrahan. Polaris: A system for query, analysis, and visualization of multidimensional relational databases. IEEE Transactions on Visualization and Computer Graphics, 8(1):52–65, 2002. → pages 79, 144

[54] C. Stolte, D. Tang, and P. Hanrahan. Query, analysis, and visualization of hierarchically structured data using Polaris. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 112–122. ACM, 2002. → pages 79, 139, 144

[55] M. Streit and N. Gehlenborg. Points of view: Bar charts and box plots. Nature Methods, 11(2):117, 2014. → pages 99, 102

[56] J. Talbot, V. Setlur, and A. Anand. Four experiments on the perception of bar charts. IEEE Transactions on Visualization and Computer Graphics, 20(12):2152–2160, 2014. → pages 114

[57] M. Tory and T. Möller. Human factors in visualization research. IEEE Transactions on Visualization and Computer Graphics, 10(1):72–84, 2004. → pages 140

[58] A. Tsoukias, M. Ozturk, and P. Vincke. Preference modelling. Multiple Criteria Decision Analysis: State of the Art Surveys, International Series in Operations Research and Management Science, 78:27–71, 2005. → pages 33

[59] M. Velasquez and P. T. Hester. An analysis of multi-criteria decision making methods. International Journal of Operations Research, 10(2):56–66, 2013. → pages 10, 11

[60] J. Von Neumann and O. Morgenstern. Theory of Games and Economic Behavior, 2nd rev. ed. Princeton University Press, 1947. → pages 7, 53

[61] D. Von Winterfeldt and W. Edwards. Decision Analysis and Behavioral Research. Cambridge University Press, 1986. → pages 57

[62] M. Q. Wang Baldonado, A. Woodruff, and A. Kuchinsky. Guidelines for using multiple views in information visualization. In Proceedings of the Working Conference on Advanced Visual Interfaces, pages 110–119. ACM, 2000. → pages 126

Appendix A
Analysis of Existing Encodings

Analysis of Existing Encodings
● Inclusion criteria: Explicitly visualizes the performance of alternatives with respect to multiple criteria and/or multiple people's preferences
● Excludes:
  ○ Visualizations of users (without showing the alternatives)
  ○ MODM visualization (infinite alternatives, i.e. design space exploration)
● Class 1: Interactive Tools (have some interaction)
  ○ Group and individual MCDA support tools
  ○ Tools for visualizing related datasets (evaluations, surveys, opinions)
● Class 2: Standalone Encodings (no interactions)
  ○ Encodings used in the scenarios from Ch. 1
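The analyses that follow describe each tool's encodings in terms of measures such as Outcome, UnweightedOutScore, AltCritScore, AltScore, CritWeight, and TotalScore. Under an additive value model these measures derive from one another; the Python sketch below illustrates the relationships with invented data, weights, and score functions (individual tools may compute them differently).

```python
# A minimal sketch of how the measures named in this appendix relate
# under an additive value model. All data, weights, and score functions
# are invented for illustration; individual tools may differ in detail.

crit_weights = {"Cost": 0.6, "Quality": 0.4}        # CritWeight
score_fns = {                                       # Outcome -> UnweightedOutScore
    "Cost": lambda x: 1.0 - x / 100.0,              # cheaper is better
    "Quality": lambda x: x / 10.0,                  # 0-10 rating, higher is better
}
outcomes = {                                        # consequence table (Outcome)
    "Alt A": {"Cost": 40.0, "Quality": 7.0},
    "Alt B": {"Cost": 70.0, "Quality": 9.0},
}

def alt_crit_score(alt, crit):
    """AltCritScore: per-criterion score, weighted by CritWeight."""
    return crit_weights[crit] * score_fns[crit](outcomes[alt][crit])

def alt_score(alt):
    """AltScore: sum of AltCritScores over all criteria."""
    return sum(alt_crit_score(alt, c) for c in crit_weights)

# TotalScore: combine AltScores across evaluators, here by weighted average.
evaluator_weights = {"Eva": 0.5, "Sam": 0.5}        # EvaluatorWeight
group_alt_scores = {"Eva": {a: alt_score(a) for a in outcomes},
                    "Sam": {"Alt A": 0.55, "Alt B": 0.70}}  # Sam's model elided

def total_score(alt):
    return sum(w * group_alt_scores[e][alt] for e, w in evaluator_weights.items())

for a in outcomes:
    print(a, round(alt_score(a), 3), round(total_score(a), 3))
```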
Class 1: Interactive Tools

Preview: Group MCDA
● ConsensUs [2]
● Group ValueCharts [1]
● Web-HIPRE (used in Nuclear case) [3,4]

Web-HIPRE [3,4]

What data is supported?
● Measures:
  ○ Taxonomy level? P2b+w (and above) *
  ○ Evaluator weights?
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies?

Overall Organization
● Main window shows the criteria hierarchy and alternatives
● From there, the user can open other windows:
  ○ Priorities window
  ○ Analysis window
  ○ Ratings window
● Group MCDA is achieved by treating evaluators as criteria in another decision problem

Main Window (Individual)
● In the individual MCDA context, the main window shows alternatives and the criteria hierarchy
● Additional windows can be opened from here
[Figure labels: Criteria hierarchy]

Main Window (Group)
● In the group MCDA context, the main window shows alternatives and the evaluators hierarchy
● Additional windows can be opened from here
[Figure labels: Evaluators hierarchy]

Priorities Window
● Supports preference elicitation
● Can be opened by clicking on a criterion in the main window
● Supports five elicitation methods, each in a different tab
● Can also be used to define or inspect the score function and alternative outcomes
[Figure labels: CritWeights; Outcomes; UnweightedOutScore]

Analysis Window (Individual)
● In the individual MCDA context, the analysis window supports evaluation phase tasks
● Can map different things to bars and segments:
  ○ Alternatives
  ○ Criteria (one level at a time)
● Interaction: roll-up/drill-down of the criteria hierarchy
● Can also do stuff like...
[Figure labels: AltScore; AltCritScore]

Analysis Window (Group)
● In the group MCDA context, the analysis window supports synthesis phase tasks
● Can map different things to bars and segments:
  ○ Alternatives
  ○ Evaluators (one level at a time)
[Figure labels: TotalScore; WeightedAltScore]

Analysis Window
● Another tab allows users to perform sensitivity analysis (i.e. inspect trade-offs)
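The sketch below illustrates the kind of one-way weight sensitivity analysis such a tab supports: sweep one criterion's weight from 0 to 1, renormalize the others, and watch which alternative comes out on top. The weights, scores, and proportional renormalization scheme are assumptions for illustration, not Web-HIPRE's actual implementation.

```python
# A minimal sketch of one-way weight sensitivity analysis. All weights
# and scores are invented; the renormalization scheme is an assumption.

weights = {"Cost": 0.5, "Quality": 0.3, "Risk": 0.2}   # CritWeights
scores = {                                             # unweighted per-criterion scores
    "Alt A": {"Cost": 0.9, "Quality": 0.4, "Risk": 0.6},
    "Alt B": {"Cost": 0.3, "Quality": 0.9, "Risk": 0.8},
}

def sweep(varied, steps=5):
    """Sweep one weight from 0 to 1, renormalizing the others, and
    report which alternative scores best at each step."""
    rest = [c for c in weights if c != varied]
    rest_total = sum(weights[c] for c in rest)
    for i in range(steps + 1):
        w = i / steps
        trial = {varied: w}
        for c in rest:  # keep the untouched weights in their original ratio
            trial[c] = (1 - w) * weights[c] / rest_total
        totals = {a: sum(trial[c] * s[c] for c in trial) for a, s in scores.items()}
        print(f"w({varied}) = {w:.1f} -> best: {max(totals, key=totals.get)}")

sweep("Cost")  # Alt B wins at low Cost weight, Alt A at high Cost weight
```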
Ratings Window
● Contains a consequence table
● Colors:
  ○ Yellow: min/max
  ○ Blue: unit
  ○ Green: value present
  ○ Red: value missing

How: Encode (Measures)
● Scores:
  ○ TotalScore: Analysis window (group), aligned stacked bar chart (dimension mappings customizable); length of bar
  ○ AltScore: Analysis window (group); length of segment
  ○ UnweightedAltScore: Analysis window (individual); length of bar
  ○ AltCritScore: Analysis window (individual); length of segment
● Weights:
  ○ CritWeights: Priorities window (any weights tab), horizontal bar charts + text field; length of bar, text
  ○ EvaluatorWeights *: horizontal bar charts + text field; length of bar, text
● Score functions:
  ○ UnweightedOutScore: Priorities window (ValueFn tab), interactive line graph; point on graph, text coordinates
● Outcomes:
  ○ Outcome: table (meaning of color unclear); color-coded text
  ○ Outcome: Ratings table; text in color-coded cell

How: Encode (Dimensions)
● Criteria: main window (individual), node-link graph (nodes color-coded by dimension); blue nodes
● Alternatives: main window (individual); yellow nodes
● Evaluators: main window (group), node-link graph (nodes color-coded by dimension); blue nodes
● Alternatives: main window (group); yellow nodes

How: Manipulate (Data Changing Interactions)
● Change weights:
  ○ Change the values of text fields in the Priorities windows
● Change score function:
  ○ Adjust the coordinates of a single point on the score function graph in the ValueFn tab of the Priorities window (click-and-drag)

How: Manipulate (View Changing Interactions)
● Change mapping:
  ○ Swap the selections in the Segments and Bars drop-downs in the Analysis window
● Change aggregation level: (How: Reduce -> Aggregate)
  ○ Change the selected dimension hierarchy level in one of the three drop-downs in the Analysis window
● Change data shown: (How: Reduce -> Filter)
  ○ Change the selected dimension in one of the three drop-downs in the Analysis window

Group ValueCharts [1]

What data is supported?
● Measures:
  ○ Taxonomy level? P2b+w (and above)
  ○ Evaluator weights?
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies?

Overall Organization
● One main window with two different views:
  ○ Individual view
  ○ Group view
● Each view has the following components:
  ○ Details component (with 3 tabs: Chart Details, Alternatives, and User List)
  ○ Criteria component
  ○ Scores component
  ○ Score functions component
● Another window may be opened to view score functions up close
[How: Facet -> Partition; How: Facet -> Linked views; How: Facet -> Superimpose]

Individual View

Individual View - Details Component
1. User List tab
2. Alternatives tab
3. Alternatives tab after clicking an alternative
[Figure labels: Outcomes]

Individual View - Criteria Component
[Figure labels: Criteria hierarchy; CritWeight]

Individual View - Scores Component
[Figure labels: AltScore; CritWeight; Outcome; AltCritScore; Max AltScore. How: Facet -> Superimpose]

Individual View - Score Functions Component
● Categorical score function and continuous score function
[Figure labels: UnweightedOutScore]
How: Encode (Individual View)
● Scores (Scores component; idioms: (1) aligned stacked bar chart, (2) tabular bar chart; dimension mappings: Alternatives -> columns; Criteria -> rows, color):
  ○ AltScore: height of bar (1) + text
  ○ AltCritScore: height of bar (2); height of segment (1)
  ○ Max(AltScore): red-colored text
● Weights:
  ○ CritWeights: Scores component, row height; Criteria component (rectilinear node-link graph), row height
● Outcomes:
  ○ Outcome: Scores component, text label in (2); Details component, tabular list, text
● Score functions:
  ○ UnweightedOutScore: Score Functions component, interactive bar chart / line graph; height of bar / y-coordinate of dot

How: Manipulate (Data Changing Interactions)
● Change weights: adjust the height of a box in the Criteria component (one plausible normalization policy is sketched at the end of this section)
  ○ Click-and-drag
  ○ "Pump" (double-click to inflate/deflate)
● Change score function:
  ○ Adjust the y-coordinate of a point/bar in the score functions graph (click-and-drag)

How: Manipulate (View Changing Interactions)
● Change arrangement:
  ○ Change orientation (vertical or horizontal)
  ○ Reorder Objectives (drag-and-drop)
  ○ Reorder Alternatives (drag-and-drop, alphabetical, or by Objective score)
● Change data shown: (How: Reduce -> Filter)
  ○ Choose the Alternative to see Outcomes for (click on its name in the Alternatives tab); a special case of filter where exactly one item may be chosen
● Change elements shown:
  ○ Toggle view options (average lines, score functions, outcomes, score labels, utility scale)
● Change viewpoint: (How: Navigate)
  ○ Expand score function (How: Navigate -> Geometric Zoom)

Group View

Group View - Details Component
1. User List tab
2. Alternatives tab
3. Alternatives tab after clicking an alternative
[Figure labels: Outcomes]

Group View - Criteria Component
[Figure labels: Criteria hierarchy; Max CritWeight]

Group View - Scores Component
[Figure labels: AltScore; TotalScore; CritWeight; Outcome; AltCritScore; Max CritWeight]

Group View - Score Functions Component
● Categorical score function and continuous score function
[Figure labels: UnweightedOutScore]

How: Encode (Group View)
● Scores (Scores component; idioms: (1) aligned multi-bar chart, (2) tabular multi-bar chart; dimension mappings: Alternatives -> columns (primary); Criteria -> rows; Evaluators -> columns (secondary), color):
  ○ TotalScore (AvgScore): vertical position of horizontal line (1)
  ○ AltScore: height of bar (1) + text
  ○ AltCritScore: height of filled bar (2)
  ○ Max(AltScore): red-colored text
● Weights:
  ○ CritWeights: height of unfilled bar (2)
  ○ Max(CritWeight): row height; Criteria component (rectilinear node-link graph), row height
● Outcomes:
  ○ Outcome: Scores component, text label in tabular bar chart; Details component, tabular list, text
● Score functions:
  ○ UnweightedOutScore: Score Functions component, interactive bar chart / line graph; height of bar / y-coordinate of dot

How: Manipulate (View Changing Interactions)
● Change arrangement:
  ○ Change orientation (vertical or horizontal)
  ○ Reorder Objectives (drag-and-drop)
  ○ Reorder Alternatives (drag-and-drop, alphabetical, or by Objective score)
● Change mapping:
  ○ Change the color for a user
● Change data shown: (How: Reduce -> Filter)
  ○ Filter users (toggle checkboxes)
  ○ Choose the Alternative to see Outcomes for (click on its name in the Alternatives tab); a special case of filter where exactly one item may be chosen
● Change elements shown:
  ○ Toggle view options (average lines, score functions, outcomes, score labels, utility scale)
● Change viewpoint: (How: Navigate)
  ○ Expand score function (How: Navigate -> Geometric Zoom)
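The drag-and-"pump" weight interactions above imply some policy for keeping the weights normalized. One plausible policy, sketched below, rescales the untouched weights proportionally so the total stays at one; whether Group ValueCharts implements exactly this scheme is not specified here.

```python
# A sketch of one common weight-normalization policy: set the touched
# criterion's weight, then rescale the others proportionally so the
# weights still sum to 1. Illustrative only, not necessarily what
# Group ValueCharts implements.

def set_weight(weights, criterion, new_value):
    """Return a new weight dict with `criterion` at `new_value` and the
    remaining weights scaled to keep the total at 1."""
    assert 0.0 <= new_value <= 1.0
    others = {c: w for c, w in weights.items() if c != criterion}
    remaining = sum(others.values())
    scale = (1.0 - new_value) / remaining if remaining else 0.0
    updated = {c: w * scale for c, w in others.items()}
    updated[criterion] = new_value
    return updated

w = {"Cost": 0.5, "Quality": 0.3, "Risk": 0.2}
print(set_weight(w, "Quality", 0.5))  # Cost ~0.357, Risk ~0.143, Quality 0.5
```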
ConsensUs [2]

What data is supported?
● Measures:
  ○ Taxonomy level? P1b
  ○ Evaluator weights?
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies?

Overall Organization
● One main window with two linked views:
  ○ Individual view
  ○ Group view
[How: Facet -> Partition; How: Facet -> Linked views]

Individual View
[Figure labels: AltCritScore(Individual, Sam, Academic); AltScore(Individual, Sam); Alternatives]

Group View
[Figure labels: Alternatives; Evaluators; TotalScore(Jim); Avg(AltCritScore(x, Jim, Readiness)); AltCritScore(Individual*, Sam, Academic); AltScore(Individual, Jim)]

How: Encode
● Scores (small multiples of dot plots, specifically Cleveland dot plots; dimension mappings: Alternatives -> color; Criteria -> rows; Evaluators -> size (two levels)):
  ○ AltScore: group and individual views; position of dot on plot (horizontal axis)
  ○ AltCritScore: group and individual views; position of dot on plot (horizontal axis)
  ○ TotalScore: group view; position of dot on plot (horizontal axis)
  ○ Avg(AltCritScore(x, a, c)) *: group view; position of dot on plot (horizontal axis)

How: Manipulate (Data Changing Interactions)
● Change scores:
  ○ Adjust the position of a dot along a criterion slider (click-and-drag)

How: Manipulate (View Changing Interactions)
● Change data shown: (How: Reduce -> Filter)
  ○ Filter alternatives (toggle checkboxes)
  ○ Choose an evaluator to map to the big dots (click on a name in the list); a special case of filter where exactly one item may be selected
● Change aggregation level: (How: Reduce -> Aggregate)
  ○ Drill down from an average criterion score to the breakdown by evaluator (click on a dot)

Preview: Individual MCDA
● WeightLifter [7]
● DCPAIRS [6]
● ValueCharts (now just a part of GVC) [5]

DCPAIRS [6]

What data is supported?
● Measures:
  ○ Taxonomy level? P2+w
  ○ Evaluator weights? (single evaluator)
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies? (single evaluator)

View (only one)
● SPLOM (scatter-plot matrix) over six criteria at a time
● Each cell shows the trade-off between its row/column criteria (each point is an Alternative)
● Other criteria can be swapped in, and are shown as tiles in the bottom corner
● Criteria weight is encoded on a slider
[Figure labels: UnweightedAltCritScores (for selected Alternative); AltCritScores; Criteria; CritWeight. How: Facet -> Partition; How: Facet -> Linked Views; How: Reduce -> Embed -> Focus + Context]

How: Encode
● Scores:
  ○ UnweightedAltCritScore: scatter-plot matrix (dimension mappings: Alternatives -> spatial coordinates; Criteria -> spatial regions); x or y coordinate of a point on a scatter plot (two criteria per plot); UnweightedAltCritScore(a, c) is shown on every plot in the row and column for c
  ○ UnweightedAltCritScore: bar chart; length of bar + text
● Weights:
  ○ CritWeights: position of knob on slider widget; color of tile (grayscale)

How: Manipulate (Data Changing Interactions)
● Change weights:
  ○ Adjust the position of a dot along a criterion slider (click-and-drag)
● Change score function:
  ○ Toggle positive linear / negative linear (click text)
● Define alternative groups:
  ○ Assign selected alternatives to a group

How: Manipulate (View Changing Interactions)
● Change mapping:
  ○ Change the color for an alternative group
● Change data shown:
  ○ Filter alternatives on score (adjust position on range sliders)
  ○ Choose the attributes to put on the main diagonal (drag-and-drop); a special case of filter where exactly six items may be chosen
● Change emphasis:
  ○ Highlight the selected alternative in all plots (click on a point in one plot)

WeightLifter [7]

What data is supported?
● Measures:
  ○ Taxonomy level? P2b+w
  ○ Evaluator weights? (single evaluator)
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies? (single evaluator)
Overall Organization
● One main window with three different views:
  ○ Ranked Solution Details view
  ○ Criteria Values view
  ○ WeightLifter view
[How: Facet -> Linked Views; How: Facet -> Superimpose; How: Reduce -> Embed -> Focus + Context]

Ranked Solution Details View
[Figure labels: CritWeight; UnweightedOutScores; AltScore; AltCritScore; Alternatives]

Criteria Values View
● Parallel coordinates plot where:
  ○ Each line is an Alternative
  ○ Each axis is a Criterion
  ○ Each coordinate is an Outcome
[Figure labels: Alternatives; Criteria; Outcome]

WeightLifter View

How: Encode
● Scores (Ranked Solution Details; table with embedded stacked bars; dimension mappings: Alternatives -> rows; Criteria -> color):
  ○ AltScore: length of bar
  ○ AltCritScore: length of segment
● Weights:
  ○ CritWeights: stacked bar (on top of the above-mentioned table); length of bar, text
● Score functions:
  ○ UnweightedOutScore: line graph glyph
● Outcomes:
  ○ Outcome: Criteria Values view, parallel coordinates (dimension mappings: Alternatives -> marks (lines); Criteria -> axes, color); coordinate of an Alternative's line on a Criterion's axis

How: Manipulate (Data Changing Interactions)
● Change weights:
  ○ Adjust the height of a box in the Criteria component (click-and-drag; "pump" by double-clicking to inflate/deflate)
● Change score function:
  ○ Adjust the y-coordinate of a point/bar in the score functions graph (click-and-drag)

How: Manipulate (View Changing Interactions)
● Change arrangement:
  ○ Reorder Alternatives (by selected criterion score)
● Change emphasis:
  ○ Highlight the selected alternative in all views (click on a point in one plot)
● Change data shown: (How: Reduce -> Filter)
  ○ Filter alternatives (toggle in the Ranked Solution Details view)
  ○ Filter alternatives by criterion value (brush values in the Criteria Values view)

Preview: Related Datasets
● LineUp [8]
● SurveyVisualizer [9]
● Rizoli (2009) [10]

LineUp [8]

What data is supported?
● Measures:
  ○ Taxonomy level? P2b+w (analogous)
  ○ Evaluator weights? (single evaluator)
● Dimensions:
  ○ Criteria hierarchies? (define on the fly)
  ○ Evaluator hierarchies? (single evaluator)

Overall Organization
● One interactive main view that allows users to dynamically define and compare multiple rankings
● Data-mapping (e.g. score function) editor available on demand
[How: Facet -> Linked Views; How: Facet -> Superimpose; How: Reduce -> Embed -> Focus + Context -> Distort]

Single-Ranking View
[Figure labels: AltCritScore; Sum(AltCritScore(...)); Outcome]

Multi-Ranking View

Data-Mapping Editors
● Used to define the mapping from domain values to scores (i.e. score functions)
● Can also filter values by not mapping them to any score
● (No need to get into the details beyond this)

How: Encode
● Ranks (not included in design space analysis):
  ○ AltRank: main view; slope graph / bump chart; table with embedded bars (with the option of aligned, stacked, or diverging bars; dimension mappings: Alternatives -> rows; Criteria -> columns, color); row order, text label
● Scores:
  ○ AltScore: length of stacked bar (available on demand)
  ○ AltCritScore: length of segment
● Weights:
  ○ CritWeights: column width
● Outcomes:
  ○ Outcome: text label on segment for AltCritScore (numeric Criteria); text label in cell for the criterion (categorical Criteria)
● Score functions:
  ○ UnweightedOutScore: data-mapping editor (?); coordinate on score axis
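The slope graphs that connect LineUp rankings visualize how ranks shift when weights change. A minimal sketch of that computation follows; the attributes, weights, and alternatives are invented for illustration, and the actual tool's scoring details may differ.

```python
# A minimal sketch of weighted ranking and rank deltas, the data behind
# a slope graph between two weight configurations (column width ~ weight).
# All attributes and weights are invented for illustration.

attrs = {
    "papers": {"A": 0.9, "B": 0.5, "C": 0.7},
    "awards": {"A": 0.2, "B": 0.9, "C": 0.6},
}

def ranking(weights):
    """Rank alternatives by their weighted attribute sum (1 = best)."""
    scores = {alt: sum(w * attrs[col][alt] for col, w in weights.items())
              for alt in ("A", "B", "C")}
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {alt: rank for rank, alt in enumerate(ordered, start=1)}

before = ranking({"papers": 0.8, "awards": 0.2})
after = ranking({"papers": 0.3, "awards": 0.7})   # user widens the "awards" column
for alt in sorted(before):
    print(f"{alt}: rank {before[alt]} -> {after[alt]}")  # the slope graph's data
```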
How: Manipulate (Data Changing Interactions)
● Change weights:
  ○ Adjust the width of a criterion's column (click-and-drag, or manually enter a percentage)
● Change score function:
  ○ Adjust boundaries in the data-mapping editor
● Define meta-criteria:
  ○ Assign selected criteria columns to a group

How: Manipulate (View Changing Interactions)
● Change arrangement:
  ○ Reorder Alternatives by column or meta-column score (click on header)
  ○ Change alignment strategy (stacked bars, aligned bars, diverging bars, or sorted bars)
● Change level of detail:
  ○ Expand/collapse a criterion column
  ○ See exact outcomes for an Alternative (hover over its row)
● Change data shown: (How: Reduce -> Filter)
  ○ Filter alternatives by categorical criterion value (enter a text filter in the widget in the column header)
  ○ Filter alternatives by numeric criterion value (adjust mappings in the data-mapping editor)
  ○ Filter missing values (checkbox toggle in the data-mapping editor)
● Change emphasis:
  ○ Highlight the selected alternative in all plots (hover for grey highlighting, click for yellow)
● Change navigation strategy:
  ○ Toggle between uniform and fisheye views of the rows
● Change viewpoint: (How: Navigate)
  ○ Change the position of the fisheye lens (How: Navigate -> Pan)
● Create new linked view: (How: Navigate)
  ○ Create a snapshot of the current view (which will appear next to it, connected by a slope graph)

SurveyVisualizer [9]

What data is supported?
● Measures:
  ○ Taxonomy level? P1b (analogous)
  ○ Evaluator weights? (single evaluator)
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies? (single evaluator)

Overall Organization
● One main window with two linked views:
  ○ Parallel Coordinates Tree view
  ○ Analysis Group Selector view
[How: Facet -> Linked views; How: Reduce -> Embed -> Focus + Context -> Distort]

Parallel Coordinate Tree
[Figure labels: Criteria Hierarchy; AltCritScores *]

Analysis Group Selector
● Narrows down the set of surveys included (here, surveys are analogous to alternatives)
● Nothing relevant to GPC

How: Encode
● Scores:
  ○ AltCritScore: parallel coordinate tree (rectilinear tree with embedded parallel coordinates plots; dimension mappings: Evaluators -> marks (lines); Criteria -> axes); coordinate of an Evaluator Group's line on a Criterion's axis
● N/A:
  ○ Criteria: line; region of the rectilinear tree

How: Manipulate (View Changing Interactions)
● Change mapping:
  ○ Change the color for an Alternative
● Change emphasis:
  ○ Highlight the selected alternative in red (hover over its line)
  ○ Highlight the selected alternative in black (click on its line)
● Change data shown: (How: Reduce -> Filter)
  ○ Filter alternatives by analysis group (expand tree, toggle checkbox)
  ○ Filter alternatives by criterion value (brush values)
● Change viewpoint: (How: Navigate -> Pan)
  ○ Change the position of the bifocal lens

Rizoli (2009) [10]

What data is supported?
● Measures:
  ○ Taxonomy level? P1b (analogous)
  ○ Evaluator weights?
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies?
Main View
[Figure labels: Count(AltCritScore(x, Yoyodyne, Image) == -1); possible values for AltCritScore (diverging scale); Sum(Count(AltCritScore(x, Yoyodyne, y) == -1)); TotalScore (Avg Score). How: Facet -> Partition]

How: Encode
● Scores:
  ○ Avg(AltCritScore(x, a, c)): small multiples of histograms (dimension mappings: Alternatives -> rows; Criteria -> columns); text label
  ○ Count(AltCritScore(x, a, c) == value): height of bar (histograms); height of segment (stacked bar chart; dimension mappings: Alternatives -> rows; Criteria -> color)
  ○ Sum(Count(AltCritScore(x, a, y) == value)): height of bar (stacked bar chart)
  ○ TotalScore (Avg(AltCritScore(x, a, y))): text label

Class 2: Standalone Encodings

Case Studies
● Faculty Hiring
● Campbell River
● Voyager [11]
● Best Paper

Faculty Hiring
[How: Facet -> Partition]

What is the data?
● Measures:
  ○ Taxonomy level? P1b (with discrete evaluation scale)
  ○ Evaluator weights?
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies?

View 1
● Small-multiples view faceted by criteria on rows, alternatives on columns
● Each cell shows the distribution of AltCritScores over possible values of AltCritScore:
  ○ Each bar encodes the count of AltCritScores with that value
● One bar for each combination of Alternative, Criterion, and possible AltCritScore value (aggregated over Evaluators)
[Figure labels: Count(AltCritScore(x, A, Research) == VS); possible values for AltCritScore (specified by the discrete evaluation scale)]

View 2
● Small-multiples view partitioned by Alternative
● Each view consists of a multi-bar chart, partitioned into regions by possible values of AltCritScore
● Each region shows the distribution of AltCritScores over criteria
● One bar for each combination of Alternative, Criterion, and possible AltCritScore value (aggregated over Evaluators)
[Figure labels: Count(AltCritScore(x, A, Research) == VS); possible values for AltCritScore (specified by the discrete evaluation scale)]

How: Encode
● Scores:
  ○ Count(AltCritScore(x, a, c) == value) *: View 1, small multiples (dimension mappings: Alternatives -> columns (primary); Criteria -> rows; Outcomes -> columns (secondary)); height of bar
  ○ Count(AltCritScore(x, a, c) == value) *: View 2, small multiples of multi-bar charts (dimension mappings: Criteria -> columns (secondary), color; Outcomes -> columns (primary)); height of bar; text labels on the Count axis
● Views 1 and 2 show the same data in different ways
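Both views are driven by the same aggregation: counting, for each (alternative, criterion) pair, how many evaluators assigned each value on the discrete evaluation scale. A minimal sketch of that aggregation follows; the scale labels and ratings are invented for illustration.

```python
# A minimal sketch of the counting behind Views 1 and 2. The three-point
# scale and the ratings below are invented for illustration.
from collections import Counter

SCALE = ["W", "S", "VS"]  # hypothetical scale, e.g. Weak / Strong / Very Strong
ratings = [               # (evaluator, alternative, criterion, value)
    ("E1", "A", "Research", "VS"), ("E2", "A", "Research", "VS"),
    ("E3", "A", "Research", "S"),  ("E1", "A", "Teaching", "S"),
    ("E2", "A", "Teaching", "W"),  ("E3", "A", "Teaching", "S"),
]

counts = Counter((alt, crit, value) for _, alt, crit, value in ratings)
for crit in ("Research", "Teaching"):
    histogram = {v: counts[("A", crit, v)] for v in SCALE}
    print(crit, histogram)  # one cell of the small-multiples view
```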
Campbell River

What is the data?
● Measures:
  ○ Taxonomy level? P0b and P2+w
  ○ Evaluator weights?
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies?

View 1
● Range plot, with one range glyph per Criterion
● Each range glyph shows the range (min and max) of CritWeights over that criterion, aggregated over Evaluators
[Figure labels: CritWeight(MikeM, Flooding); Criteria Hierarchy; Range(CritWeight(x, Erosion))]

View 2
● Table with Evaluators/elicitation method on rows, Alternatives on columns, and AltRanks in cells
● AltScores are sorted into three bins, and a different color is used for each bin (it is unclear what the scale is)
[Figure labels: AltRank(15, J) (number) and AltScore(15, J) (color)]

How: Encode
● Ranks (not included in design space analysis):
  ○ AltRank: View 2, table (dimension mappings: Alternatives -> columns; Evaluators -> rows); text
● Scores:
  ○ AltScore: color (three bins)
● Weights:
  ○ Range(CritWeights): View 1, range plot (dimension mappings: Criteria -> columns); range bar
  ○ CritWeight: point on range bar
● N/A:
  ○ Criteria Hierarchy: label groups (a tree, loosely)
● Views 1 and 2 show different data

Best Paper

What is the data?
● Measures:
  ○ Taxonomy level? P0a
  ○ Evaluator weights?
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies?

How: Encode
● Ranks (not included in design space analysis):
  ○ AltRank: table (dimension mappings: Alternatives -> columns; Evaluators -> rows); text
  ○ Sum(AltRank): text
[Figure labels: AltRank(Person 1, Paper D); Sum(AltRank(x, Paper D))]

Voyager [11]

What is the data?
● Measures:
  ○ Taxonomy level? P0a, b
  ○ Evaluator weights?
● Dimensions:
  ○ Criteria hierarchies?
  ○ Evaluator hierarchies?

How: Encode
● Ranks (not included in design space analysis):
  ○ TotalRank *: Table III; text
  ○ AltRank: Table II; text
● Scores:
  ○ TotalScore *: Table III; text
  ○ AltScore: Table II; text
● Weights:
  ○ EvaluatorWeights: Table III; text
● Table II: Alternatives on rows, Evaluators on columns, AltRank and AltScore in cells
● Table III: Alternatives on rows, collective choice rules (including EvaluatorWeights) on columns, TotalRank and TotalScore in cells
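The Sum(AltRank) measure in the Best Paper case corresponds to a rank-sum collective choice rule: add up each alternative's ranks across evaluators and prefer the smallest total. A minimal sketch, with invented ranks:

```python
# A minimal sketch of the rank-sum rule behind Sum(AltRank). The ranks
# below are invented for illustration.

ranks = {  # AltRank per evaluator (1 = best)
    "Person 1": {"Paper C": 1, "Paper D": 2, "Paper E": 3},
    "Person 2": {"Paper C": 2, "Paper D": 1, "Paper E": 3},
    "Person 3": {"Paper C": 3, "Paper D": 1, "Paper E": 2},
}

totals = {paper: sum(r[paper] for r in ranks.values())
          for paper in ranks["Person 1"]}
print(totals, "->", min(totals, key=totals.get))  # lowest Sum(AltRank) wins
```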
References

[1] Bajracharya, S., et al. "Interactive visualization for group decision analysis." International Journal of Information Technology & Decision Making. 2016. (in revision)
[2] Mahyar, N., et al. "ConsensUs: Visualizing points of disagreement for multi-criteria collaborative decision making." Companion of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. ACM, 2017.
[3] Mustajoki, J. and Hämäläinen, R.P. "Web-HIPRE: Global decision support by value tree and AHP analysis." INFOR: Information Systems and Operational Research 38.3 (2000): 208–220.
[4] Mustajoki, J., Hämäläinen, R.P., and Sinkko, K. "Interactive computer support in decision conferencing: Two cases on off-site nuclear emergency management." Decision Support Systems 42.4 (2007): 2247–2260.
[5] Carenini, G. and Loyd, J. "ValueCharts: Analyzing linear models expressing preferences and evaluations." Proceedings of the Working Conference on Advanced Visual Interfaces, 150–157. ACM, 2004.
[6] Dimara, E., Valdivia, P., and Kinkeldey, C. "DCPAIRS: A pairs plot based decision support system." EuroVis-19th EG/VGTC Conference on Visualization. 2017.
[7] Pajer, S., et al. "WeightLifter: Visual weight space exploration for multi-criteria decision making." IEEE Transactions on Visualization and Computer Graphics 23.1 (2017): 611–620.
[8] Gratzl, S., et al. "LineUp: Visual analysis of multi-attribute rankings." IEEE Transactions on Visualization and Computer Graphics 19.12 (2013): 2277–2286.
[9] Brodbeck, D. and Girardin, L. "Visualization of large-scale customer satisfaction surveys using a parallel coordinate tree." IEEE Symposium on Information Visualization. IEEE, 2003.
[10] Carenini, G. and Rizoli, L. "A multimedia interface for facilitating comparisons of opinions." Proceedings of the 14th International Conference on Intelligent User Interfaces. ACM, 2009.
[11] Dyer, J.S. and Miles Jr., R.F. "An actual application of collective choice theory to the selection of trajectories for the Mariner Jupiter/Saturn 1977 project." Operations Research 24.2 (1976): 220–244.
