UBC Theses and Dissertations


Visualization of construction management data Chiu, Chao-Ying 2011


VISUALIZATION OF CONSTRUCTION MANAGEMENT DATA

by

Chao-Ying Chiu

B.Sc., National Chiao Tung University, Taiwan, 1994
M.Sc., National Cheng Kung University, Taiwan, 1996

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Civil Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

October 2011

© Chao-Ying Chiu, 2011

Abstract

To date, the research and development effort as reported in the literature for presenting input/output data in support of human judgment for conducting construction management (CM) functions and associated tasks has been relatively limited. In practice, CM practitioners often find it difficult to digest and interpret input/output information because of the sheer volume and high dimensionality of data. One way to address this need is to improve the data reporting capability of a construction management information system, which traditionally focuses mainly on using tabular/textual reports. Data visualization is a promising technology to enhance current reporting by creating a CM data visualization environment integrated within a CM information system.

Findings from a literature review combined with a deep understanding of the CM domain were used to identify design guidelines for CM data visualization. A top-down design approach was utilized to analyze general requirements of a CM data visualization environment (e.g. common visualization features) that effect visual CM analytics for a broad range of CM functions/tasks. A bottom-up design process integrated with design guidelines and the top-down design process was then employed to implement individual visualizations in support of specific CM analytics and to acquire lessons learned for enriching the design guidelines and common visualization features. Taken together, these three components provide a potent approach for developing a data visualization tool tailored to supporting CM analytics.
A research prototype CM data visualization environment that has an organization of thematic visualizations categorized by construction conditions and performance measures under multiple views of a project was created. Features of images generated from the foregoing visualizations can be characterized by different themes, types, contents, and/or formats. The visualization environment provides interaction features for changing/setting options that characterize images and enhancing readability of images as well as a mechanism for coordinating interaction features to increase efficiency of use. Case studies conducted using this environment provide the means for comparing its use with current (traditional) data reporting for CM functions related to time, quality, and change management. It is demonstrated that visual analytics enhances CM analytics capabilities applicable to a broad range of CM functions/tasks.

Preface

The research reported in this thesis consists of: identification of research problems and questions; formulation of research methodologies in pursuing answers to research questions; a comprehensive and critical literature review; analysis of design guidelines and development methods of a CM data visualization environment; analysis, design, and implementation of a prototype CM data visualization environment; collecting and organizing construction management data from actual projects; and conducting case studies for assessing and seeking lessons learned from the application of data visualization technology and the prototype visualization environment developed. The topic of the dissertation was proposed by the author's Ph.D. program supervisor, Dr. Alan D. Russell. The data collection part of this research was an ongoing collaboration among the author, three former master's students (Miss Tanaya Korde, Mr. Ali Mehrdana, and Mr. Jehan Zeb), and a former co-op student (Mr. Phiradej Luechachandej).
The programming work for implementing a prototype CM data visualization environment was done by Mr. William Wang, a senior programmer in the Department of Civil Engineering, UBC. The author was solely responsible for the remaining components of this research, with guidance from Dr. Alan D. Russell. The thesis includes three manuscripts:

A version of Chapter 2 has been published. Russell, Alan D., Chiu, Chao-Ying, and Korde, Tanaya. (2009). "Visual Representation of Construction Management Data." Automation in Construction, 18(8), 1045-1062.

A version of Chapter 3 has been published. Chiu, Chao-Ying, and Russell, Alan D. (2011). "Design of a Construction Management Data Visualization Environment: a Top–Down Approach." Automation in Construction, 20(4), 399-417.

A version of Chapter 4 has been submitted for publication. Chiu, Chao-Ying, and Russell, Alan D. "Design of a Construction Management Data Visualization Environment: a Bottom–Up Approach".

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements

Chapter 1 Introduction
1.1 Problem statements and proposed solutions
1.2 Terminology definitions
1.3 Research questions, objectives, and hypothesis
1.4 Research scope and assumptions
1.4.1 Scope or focus
1.4.2 Assumptions
1.5 Research methodologies
1.6 State-of-the-art data visualization and its application to CM
1.6.1 Shortcomings of current applications of data visualization in CM
1.6.2 How state-of-the-art data visualization can be adopted/adapted
1.7 Research contributions
1.8 Structure of the thesis

Chapter 2 Visual Representation of Construction Management Data
2.1 Introduction
2.2 Motivation for use of visualization
2.3 Data visualization in construction
2.4 General principles of visual analytics design processes
2.4.1 Understanding the purposes of analytical reasoning
2.4.2 Organizing data representations and data transformations
2.4.3 Designing visual representations and interaction features
2.4.4 Design evaluation
2.5 Design of visual representations of change order data
2.5.1 Change order management
2.5.2 Visual representations for project 1 and design 1
2.5.2.1 Purposes of analytical reasoning for project 1 and design 1
2.5.2.2 Choice of data representations and transformations – project 1 and design 1
2.5.2.3 Choice of visual representations – Figures 2.1 and 2.2, project 1 and design 1
2.5.2.4 Evaluation – project 1
2.5.2.5 Lessons learned – project 1
2.5.3 Visual representations for project 2
2.5.3.1 Visual representations for project 2 and design 1, Figures 2.3 - 2.5
2.5.3.2 Visual representations for project 2 – design 2, Figure 2.6
2.5.3.3 Visual representations for project 2 and design 3, Figure 2.7
2.6 Some general observations
2.6.1 Applying identified principles for design process
2.6.2 Evaluation and feedback
2.6.3 Organizing lessons learned for development of a general CM visual analytics model
2.6.4 Issue of CM data management
2.6.5 Data exploration flexibility
2.7 Conclusions

Chapter 3 Design of a Construction Management Data Visualization Environment: a Top-Down Approach
3.1 Introduction
3.2 Approach and structure of chapter
3.3 Data visualization in construction management
3.4 Concepts of analytics and relation to development of a construction management data visualization environment
3.4.1 Analytics for construction management
3.4.2 Visual CM analytics supported by a data visualization environment
3.4.3 Analytics for time performance management
3.4.3.1 Visual CM analytics for planning/predicting time
3.4.3.2 Visual CM analytics for monitoring/diagnosing/controlling time
3.4.3.3 Visualization requirements deduced from time performance management analytic needs
3.5 Case study of CM analytics using a data visualization environment
3.5.1 Case study overview
3.5.2 Visualization for CM analytics for planning/predicting time
3.5.3 Visualization for CM analytics for monitoring/diagnosing/controlling time
3.6 Evaluation of and extensions to the current data visualization environment
3.7 Conclusions

Chapter 4 Design of a Construction Management Data Visualization Environment: a Bottom-Up Approach
4.1 Introduction
4.2 Visualization design 1--time performance measure variance visualization
4.2.1 Visualization requirements
4.2.1.1 CM variables involved
4.2.1.2 How characteristics of time performance variance measures can be observed for identifying potential causes of time performance as a function of project context dimensions
4.2.2 Visualization design specifications
4.3 Visualization design 2--PCBS attributes visualization
4.3.1 Visualization requirements
4.3.1.1 CM variables involved
4.3.1.2 How characteristics of PCBS attributes can be observed for identifying potential impacts on context dimensions
4.3.2 Visualization design specifications
4.4 Visualization design 3--time performance cause-effect visualization
4.4.1 Visualization requirements
4.4.1.1 CM variables involved
4.4.1.2 How characteristics of construction conditions related to a certain activity can be observed for identifying abnormalities and their timing
4.4.2 Visualization design specifications
4.5 Use of a state-of-the art CM data visualization environment
4.5.1 Description of projects and project data
4.5.2 Demonstration cases
4.6 General observations
4.7 Conclusions

Chapter 5 Conclusion-Summary, Answering the Research Questions, Contributions, Future Work
5.1 Overview of conclusions
5.2 Research summaries
5.3 Demonstrating and analyzing the merit of using a CM data visualization environment
5.3.1 Demonstration cases- project 1
5.3.1.1 Project 1- demonstration case 1
5.3.1.2 Project 1 - demonstration case 2
5.3.2 Demonstration cases- project 2
5.3.2.1 Project 2- demonstration case 1
5.3.2.2 Project 2- demonstration case 2
5.3.3 Demonstration cases- project 3
5.3.3.1 Project 3- demonstration case 1
5.3.3.2 Project 3- demonstration case 2
5.3.4 Analysis of demonstration results
5.3.5 Conclusions from the demonstration and analysis
5.4 Summary of research contributions
5.5 Future work

Bibliography

Appendices

Appendix A Data Visualization in Construction Management
A.1 Visual representations of planned/baseline construction conditions
A.2 Visual representations of predicted/ baseline construction performance
A.3 Visual representations of how changes of planned/baseline conditions affect predicted/baseline performance-optimizing construction plans
A.4 Visual representations of how changes of planned/baseline conditions affect predicted/baseline performance- identifying and analyzing construction risks
A.5 Visual representations of actual or actual vs. planned/baseline construction conditions
A.6 Visual representations of actual or actual vs. predicted/baseline construction performance
A.7 Visual representations of dependency/cause-effect between conditions and performance
A.8 Interacting with computerized visual representations of CM data

Appendix B Overview of State-of-the-Art Data Visualization
B.1 Introduction to data visualization
B.2 Toward better visual representations for analytics
B.3 Interacting with computerized visual representations
B.4 Designing and developing data visualization tools
B.5 State-of-the-art data visualization tools

List of Tables

Table 1.1 Correspondence between research questions and research methodologies for answering them
Table 1.2 Primary common CM visualization features: presenting construction conditions or performance measures mapped against primary project context dimensions
Table 1.3 Secondary common CM visualization features: secondary feature consideration after addressing the primary visualization features
Table 1.4 Design guidelines in addition to those presented in Chapter 2
Table 2.1 Change order properties of interest
Table 2.2 Summary of visual representation evaluations for Figures 2.3 and 2.4, project 2
Table 3.1 Data visualization environment requirements for visual CM analytics and conformance/non-conformance of current environment (monitoring/diagnosing for time performance without the use of explicit explanatory CM models)
Table 3.2 Visualization features that are available in or in development process for the current CM data visualization environment
Table 4.1 Summary of data representations/transformations for time performance variance measure visualization
Table 4.2 Summary of data representations/transformations for quantitative PCBS attribute visualization
Table 4.3 Summary of data representations/transformations for visualizing the distribution of construction conditions versus time
Table 5.1 Descriptions of the exploration-answer process using the CM data visualization environment and current (traditional) data reporting functionalities: project 1-demonstration case 1
Table 5.2 Descriptions of the exploration-answer process for both the use of a CM data visualization environment and the current data reporting functionalities in project 1-demonstration case 2
Table 5.3 Descriptions of the exploration-answer process for both the use of a CM data visualization environment and current data reporting functionalities: project 2-demonstration case 1
Table 5.4 Descriptions of the exploration-answer process for both the use of a CM data visualization environment and the current data reporting functionalities: project 2-demonstration case 2
Table 5.5 Descriptions of the exploration-answer process for both the use of a CM data visualization environment and current data reporting functionalities: project 3-demonstration case 1
Table 5.6 Descriptions of the exploration-answer process for both the use of a CM data visualization environment and the current data reporting functionalities in project 3-demonstration case 2
Table 5.7 Summarized comparison between the use of a CM data visualization environment and current (traditional) data reporting features for the six demonstration cases
Table B.1 Representative state-of-the-art data visualization tools and their general functionality

List of Figures

Figure 1.1 Data flows for the task "develop schedule" as suggested by PMBOK (2008)
Figure 1.2 Data flows for the task "monitor and control project work" as suggested by PMBOK (2008)
Figure 1.3 The relationship between several current state-of-the-art CM_IS, computer assistance, human judgment, and conducting stand alone CM tasks
Figure 1.4 One of many pages of CM_IS generated tabular reports of planned/actual schedule data and problems encountered during execution
Figure 1.5 One of many pages of Microsoft Word documents of deficiency lists
Figure 1.6 One of many pages of Microsoft Excel spreadsheets for change order data
9 Figure 1.7 A proposed solution in terms of enhancing data reporting functionalities in the relationship between a current state-of-the-art CM_IS, computer assistance, human judgment, and conducting stand alone CM tasks ................................ 10 Figure 1.8 A two way structured top-down and bottom-up CM data visualization environment development process................................................................. 34 Figure 1.9 A user interface for image selection by construction conditions that are grouped under project views (i.e. the tab items such as "process", "as-built"). This user interface only allows users to choose images representing distribution of values of construction conditions in certain project context dimensions. This figure showcases the primary common visualization feature of “image theme by construction conditions or performance measures” described in the 1st row of Table 1.2 ............................................................. 39 Figure 1.10 Two record distribution images visualizing number of deficiencies distributed in different project context dimensions: (a) product and location dimensions, (b) project participant and location dimensions. This figure showcases the primary common visualization feature of “image type by context dimensions” described in the 2nd row of Table 1.2 ............................................................ 39 Figure 1.11 Records distribution images visualizing the number of deficiencies distributed in the location dimension but at different levels of granularity: (a) location set (stories), (b) location (rooms). This figure showcases the primary xii  common visualization feature of “image contents by granularity of context dimensions” described in the 3rd row of Table 1.2 ........................................ 
40 Figure 1.12 Records distribution images visualizing the number of deficiencies distributed in the location dimension but of different data value ranges: (a) "all" deficiencies, (b) only "long lead time" deficiencies. This figure showcases the primary common visualization feature of “image contents by items selection of project context dimensions” described in the 4th row of Table 1.2 ..................................................................................................................... 40 Figure 1.13 The user interface (the left bottom combo box) for adjusting data status (i.e. planned, actual, and planned vs. actual work area received percentage). This figure showcases the primary common visualization feature of “image contents by data status states” described in the 5th row of Table 1.2 ........................... 41 Figure 2.1 Project 1 CO history in terms of ID & Location, timing and value of work ... 68 Figure 2.2 Project 1 History of COs by location, time, responsibility and number .......... 69 Figure 2.3 Project 2 Number and reasons for change orders ........................................... 76 Figure 2.4 Project 2 Distribution of value and reasons for change orders ....................... 79 Figure 2.5 Project 2 Stacked graphs for number, values and reasons for change order.... 83 Figure 2.6 Project 2 Distribution of change orders by physical system ........................... 85 Figure 2.7 Project 2 Causal model reasoning – number of COs and corresponding schedule update dates and projected completion dates ................................... 87 Figure 3.1 Differentiating between current state-of-art CM systems and potential of systems with a formal visualization environment applicable to a wide range of functions ....................................................................................................... 97 Figure 3.2 CM analytics flow charts ............................................................................ 
105 Figure 3.3 Product view – (a) project locations; (b) location attributes; and (c) photo of components ................................................................................................. 116 Figure 3.4 LP and Bar Chart schedule representations ................................................. 118 Figure 3.5 Tiled LP chart views of: (a) as-built schedule for all activities, and (b) planned vs. actual work trajectory for lead activity, excavate & mud slab. ................ 121 Figure 3.6 LP chart (a) and problem status charts- problem status by problem code vs. time (b); problem status by problem code vs. location (c) ............................ 123 xiii  Figure 3.7 Parallel comparisons of operation process of using a CM data visualization environment and the thought process of CM analytics it supports reflected in Figure 3.6.................................................................................................... 126 Figure 3.8 Generate and juxtapose (a) production rate chart, (b) activity status charts, and (c) environment condition charts ................................................................. 134 Figure 4.1Illustration of the definitions of time performance and time performance variances ..................................................................................................... 143 Figure 4.2 (a) A specific CM analytical reasoning task/question regarding time performance variance measures, (b) corresponding data dimensions and visual encodings, and (c) the generated visual representation ................................. 148 Figure 4.3 (a) A specific CM question related to product attributes, (b) corresponding data dimensions and visual encodings, (c) the generated visual representation for the entire data space, and (d) the generated visual representation for an area of interest in (c). .......................................................................................... 
156 Figure 4.4 Illustration of concept of cause-effect reasoning for identifying potential reasons for time performance. ..................................................................... 158 Figure 4.5 (a) A specific CM question pertaining to the construction conditions of activity execution status and problem status, (b) corresponding data dimensions and visual encodings, and (c) the generated visual representation....................... 164 Figure 4.6 (a) A screenshot of a 3D version of schedule variance graphics presenting the duration variance values of the activity sets related to constructing substructures for project 1 for all locations, (b) a screenshot of a 3D version of schedule variance graphics zooming into regions of interest seen in Figure 4.6(a) (i.e. the phase 5 locations, from locations F533L to F 458) ................ 168 Figure 4.7 Three 2D versions of PCBS attribute graphics presenting the distribution of planned vs. actual values for location attributes: (a) percentage work area received, (b) underground (UG) utilities relocation by others, (c) overhead (OH) utilities relocation by others. .............................................................. 171 Figure 4.8 A 3D version of schedule variance graphics presenting the activity time variance values of the sets of trade activities related to construction work at the parkade location for project 2 ...................................................................... 173 xiv  Figure 4.9 A multiple view image representing selected construction conditions associated with the activitiy "shotcrete shoring at parkade level": (a) comparison schedule (progress date of 31 December 03 vs. planned project early start date of 20 October 03), (b) activity status, (c) problems encountered, (d) temperature, (e) ground conditions, (f) daily and cumulative precipitation. ................................................................................................................... 
Figure 4.10 A multiple view image representing selected construction conditions associated with the activity "bulk excavate substructure at parkade level": (a) problems encountered, (b) daily and cumulative precipitation, (c) comparison schedule (progress date of 31 December 03 vs. planned project early start date of 20 October 03), (d) activity status, (e) Equipment (truck) planned resource usage, (f) Equipment (hydraulic excavator) planned resource usage (for (e) and (f) the early and late plots are identical because the activity is a critical one).
Figure 4.11 An illustration of the hierarchical relationship between various time performance variance measures.
Figure 4.12 In contrast to Figure 4.8, different bar shapes are used to represent different levels in the hierarchical relationship between various time performance variance measures.
Figure 4.13 An improvement on Figure 4.8 by using a mock-up of stacked-bar charts to represent the hierarchical relationship about summing together quantities of various time performance variance measures.
Figure 4.14 An improvement on Figure 4.7(c) by using two more distinctive colors for representing planned and actual data status and by reversing the ordering of labeling on the Z axis.
Figure 4.15 A mock-up 2D version image for the 3D graphics shown in Figure 4.8.
Figure 5.1 The PCBS (product) view of project 1. The dialogue box shows how users can define attributes (e.g. concrete quantity) for product items.
Figure 5.2 The PCBS view (physical work locations) of project 1
Figure 5.3 The process view for project 1 (corresponds to the original modeling of the project).
Figure 5.4 A photo of actual columns of project 1
Figure 5.5 The organization view of project 1
Figure 5.6 A traditional bar graph of as-planned (blue bars) vs. as-built (green bars) schedule representing the "excavate and mud slab" activity executed in the location range F 737 to F 654
Figure 5.7 A non-traditional as-planned (blue lines) vs. as-built (green lines) schedule generated from a CM data visualization environment representing the "excavate and mud slab" activity executed in the location range F 737 to F 654
Figure 5.8 First page of the 14-page tabular report of planned values of product attributes (e.g. concrete quantity, formwork area) for the foundations and columns at all work locations
Figure 5.9 (a&b) Number and lengths of piles by location, (c&d) number and lengths of rock anchors by location
Figure 5.10 (a) Planned formwork areas, (b) Planned concrete quantities, (c) Planned reinforcing bar lengths required by the foundations and columns at all locations
Figure 5.11 (a) Planned schedule for "pour footing" (left connecting lines) and "pour column" (right connecting lines) activities executed at the first 54 locations, (b) Planned concrete quantities required by the foundations and columns at all locations
Figure 5.12 The PCBS view (physical locations) of project 2
Figure 5.13 The PCBS view (products) of project 2
Figure 5.14 The organizational view (project participant) of project 2
Figure 5.15 The As-built view (deficiency records) of project 2. The dialogue box shows how users can associate a deficiency item with items from other views such as the project participant view and PCBS view
Figure 5.16 The first page of the 304-page tabular report of deficiency records that include information about project participants who are responsible for the deficiencies, deficient products, and locations of the products.
Figure 5.17 Number of deficiencies distributed in the location dimensions of two levels of location dimension granularity ("location set" and "location" levels) and the project participant dimension at the level of granularity of individual "project participant".
Figure 5.18 Number of deficiencies (excluding the ones of the Painter and Cleaning trades) distributed in two levels of location dimension granularity ("location set" and "location" levels) and the project participant dimension at the level of granularity of individual "project participant".
Figure 5.19 Number of (a) Painter trade deficiencies, and (b) Cleaning trade deficiencies distributed in the product dimensions of three different levels of granularity ("System", "Subsystem", and "Element" levels)
Figure 5.20 The first page of a 16-page tabular report of deficiencies that require a longer time to fix. It includes information about project participants who are responsible for the deficiencies, deficient products, locations of the products, and types of deficient work.
Figure 5.21 Number of long lead time deficiencies (i.e. deficiencies that need a longer time to correct) distributed in: (a) the location dimensions at two levels of granularity ("location set" and "location" levels) and the project participant dimension at the level of granularity of "project participant", (b) the keyword (i.e. deficiency type) dimension at the level of granularity of "2nd level of deficiency definitions" and the project participant dimension at the level of granularity of "project participant".
Figure 5.22 The PCBS view (both product and location) of project 3
Figure 5.23 The organizational view (project participant) of project 3
Figure 5.24 As-built view (change order records) for project 3. The dialogue box shows how users can associate a change order item with items from other views such as the project participant view and PCBS view
Figure 5.25 The first page of the 43-page tabular report of change orders that include information about project participants who will execute them, products and locations involved, and types of change order.
Figure 5.26 Number of (a) scope change orders, (b) design change orders, (c) site condition change orders, (d) owner change orders, distributed in the product dimension at the level of granularity of "system" and the location dimension at the level of granularity of "location".
Figure 5.27 Number of (a) scope change orders related to the building site work, (b) design change orders related to the interior and service systems, distributed in the product dimensions of three different levels of granularity ("System", "Subsystem", and "Element" levels) and the location dimensions of two levels of granularity ("location" and "sub-location" levels).
Figure 5.28 Number of change orders distributed in the project participant dimension at the level of granularity of "project participant".
Figure 5.29 The first page of the 24-page tabular report of change orders that includes information about project participants who will execute the change orders, products and locations involved with a change order, and change order issue date.
Figure 5.30 (a) Number of all change orders distributed in the time dimension at the level of granularity of "month" and the project participant dimension at the level of granularity of "project participant". Number of change orders issued between mid-February and June 2005 that are associated with the (b) Broadway trade, and (c) Celtic trade, distributed in the time dimension at the level of granularity of "month" and the location dimension at the level of granularity of "location".
Figure A.1 has been removed due to copyright restrictions.
It was a graphic visualizing planned resource allocation. Original source: O'Brien, J. J. (1965). CPM in Construction Management 1st Ed: Fig. 15.1., pp 186
Figure A.2 has been removed due to copyright restrictions. It was a graphic visualizing activity sequencing. Original source: de Leon, G. P. (2008). Project Planning using Logic Diagramming Method. AACE International Transactions: Figure 1, pp. PS.S05.03
Figure A.3 Visualizing temporal and spatial distribution of activities. Source: (The National Building Agency 1968)
Figure A.4 A construction schedule in bar graph format seen as early as 1917. Source: (Brinton 1939)
Figure A.5 Network diagram: activities on arrows. Source: (Fondahl 1962)
Figure A.6 Network diagram: activities on nodes. Source: (Fondahl 1962)
Figure A.7 has been removed due to copyright restrictions. It was a graphic visualizing an S curve. Original source: O'Brien, J. J. (1965). CPM in Construction Management 1st Ed: Fig. 14.5., pp 168
Figure A.8 Cash flow diagram. Source: (Cooke and Williams 2004)
Figure A.9 Changing planned construction conditions (resources) vs. changed forecast performance (time) in multiple views. Source: (Russell et al. 2009)
Figure A.10 Changing planned construction conditions (profit margin desired) vs. changed forecast performance (cash flow) in a single view. (© 2000 IEEE.
Reprinted, with permission, from Khosrowshahi, Information Visualization in aid of Construction Project Cash Flow Management, Proceedings of the International Conference on Information Visualisation, 2000)
Figure A.11 Changing planned construction conditions (resources) vs. changed forecast performance (unit cost/productivity) in a single view. T = team; S = saw; O = trucking old panels, N = trucking new panels. Source: (Zhang et al. 2008)
Figure A.12 A nomograph that encodes a mathematical model predicting required crew size. The model considers factors such as CPM duration, project deadline, number of sites for a repetitive activity. Source: (Elhakeem and Hegazy 2005)
Figure A.13 A four variable influence diagram representing uncertain relationships between them. Source: (Diekmann 1992)
Figure A.14 has been removed due to copyright restrictions. It was a graphic visualizing the distribution in time and space of risk from project participants. Original source: Korde, T., Wang, Y., Russell, A. D. (2005). Visualization of Construction Data. Proceedings of 6th Construction Specialty Conference: Figure 3, pp. CT-148-6
Figure A.15 has been removed due to copyright restrictions. It was a Tornado Plot visualizing how a negative/positive 10% change in independent variables affects the net present value. Original source: Vrijland, M. S. A. (2003). Visual Display of Sensitivity and Risk. AACE International Transactions: Figure 2, pp. RISK 11.4
Figure A.16 has been removed due to copyright restrictions.
It was a Spider Plot visualizing how negative and positive percentage changes in various independent variables affect net present value. Original source: Vrijland, M. S. A. (2003). Visual Display of Sensitivity and Risk. AACE International Transactions: Figure 7, pp. RISK 11.7
Figure A.17 has been removed due to copyright restrictions. It was a Radar plot showing sensitivity scores for the independent variables. Original source: Vrijland, M. S. A. (2003). Visual Display of Sensitivity and Risk. AACE International Transactions: Figure 8, pp. RISK 11.8
Figure A.18 Probability distribution (includes cumulative probability distribution) for right of way cost and construction cost. Source: (Washington State Department of Transportation 2010)
Figure A.19 Probability impact matrix showing likelihood and degree of schedule/cost consequence of risk events associated with a highway project. Source: (Washington State Department of Transportation 2010)
Figure A.20 Tabularized risk register of a highway project. Source: (Washington State Department of Transportation 2010)
Figure A.21 Representing as-built data of problems and site conditions. Source: (Russell and Udaipurwala 2002a) (with Permission from ASCE)
Figure A.22 has been removed due to copyright restrictions. It was a graphic visualizing actual crew size and planned vs. actual crew size. Original source: Pinnell, S. S. (1998). How to Get Paid for Construction Changes: Figures 10.38 and 10.39, pp. 330
Figure A.23 "Causes and who" responsible for change order costs. Source: (Cox et al.
1999)
Figure A.24 Actual daily status of activities. Source: (Hegazy et al. 2005)
Figure A.25 Actual time performance of activities is encoded in colors and projected onto product components corresponding to those activities. Source: (Song et al. 2005)
Figure A.26 Actual activity status at operational level of detail. (© 2008 Reprinted, with permission, from Vrotsou, K., Ynnerman, A., Cooper, M., Seeing Beyond Statistics: Visual Exploration of Productivity on A Construction Site, Proceedings of 2008 International Conference in Visualization, 2008)
Figure A.27 has been removed due to copyright restrictions. It was a graphic visualization of EVM index. Original source: Anbari, F. T. (2003). Earned Value Project Management Method and Extensions. Project Management Journal 34(4): Figures 4, 5, and 9, pp. 14 and 16.
Figure A.28 Actual vs. planned cost distribution in activities at different levels of detail. Source: (Nie et al. 2007)
Figure A.29 has been removed due to copyright restrictions. It was a tree-map representation of the cost index (i.e. actual cost of work performed/budgeted cost of work performed) of pay items at different levels of detail - the color scale is used for representing cost index values. Original source: Songer, A. D., Hays, B., North, C. (2004). Multidimensional Visualization of Project Control Data. Construction Innovation: Information, Process, Management 4(3): Figure 4, pp. 185.
Figure A.30 has been removed due to copyright restrictions.
It was a graphic juxtaposing weather conditions with activity status for validating/invalidating the cause-effect relationship between the two. Original source: Zeb, J., Chiu, C., Russell, A. (2008). Designing a Construction Data Visualization Environment. Proceedings of the 1st Forum on Construction Innovation: Figure 4, pp. 7
Figure A.31 Visual representation of output data of an explanatory model using generic C4.5 decision-tree classification rules for explaining reasons for delays in pipeline laying activities. Source: (Soibelman and Kim 2002)
Figure A.32 Visual representations of output data of an explanatory model using generic relevance partitioning/significance testing rules to explain reasons for the increase of budgeted cost. Source: (Roth and Hendrickson 1991)
Figure A.33 A multiple view created by the Vico Control commercial software including images of: (a) schedule in flow line format, (b) resource usage, (c) cash flow, and (d) activity status.
Figure A.34 A system that can generate computer graphics in (a) flow line, (b) bar graph, and (c) network formats for visualizing part of the schedule of a transit guideway project. Source: (Russell and Udaipurwala 2002b)
Figure A.35 A system incorporating the flow line representation with a simulation model for both computationally and visually optimizing a project's schedule. Source: (Hegazy and Kamarah 2008)
Figure A.36 Visualization of EVM indices for any selectable combination of location and product. Source: (Zhang et al. 2009)
Figure B.1 A graphic (Fig. 5) presenting data of "Expansion of Air" (vertical coordinate) and "Height of Mercury" (horizontal coordinate). Source: (Halley 1686)
Figure B.2 A graphic comparing wages of a "good mechanic" (line) with wheat prices (bars) from the year 1565 to 1821. Source: (Playfair 1821); Image from: (Friendly 2008)
Figure B.3 A specimen of a time line chart presenting the names, lengths of lives, birth and death dates, and occupations of the most distinguished persons from BC 1200 to AD 1800. Source: (Priestley 1744)
Figure B.4 has been removed due to copyright restrictions. It was a table of guidelines for using multiple views in terms of rules and their corresponding positive and negative impacts on the utility of information visualization. Original source: Wang Baldonado, M. Q., Woodruff, A., Kuchinsky, A. (2000). Guidelines for Using Multiple Views in Information Visualization. Proceedings of the Working Conference on Advanced Visual Interfaces: Table 1, pp. 118
Figure B.5 A classification of functions of interaction features in information visualization. (© 1996 IEEE. Reprinted, with permission, from Chuah, M. C., and Roth, S. F., On the Semantics of Interactive Visualizations, Proceedings of the 1996 IEEE Symposium on Information Visualization (INFOVIS '96), 1996)
Figure B.6 has been removed due to copyright restrictions. It was a graphic showing the use of sliders to specify attribute values of chemical elements. In the Periodic Table, elements that do not meet the query specifications are dimmed in real time while the user moves the buttons of sliders. Original source: Ahlberg, C., Williamson, C., Shneiderman, B. (1992). Dynamic Queries for Information Exploration: an Implementation and Evaluation. CHI '92 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Figure 2, pp.
621
Figure B.7 A data visualization system with intuitive interfaces for users to select data dimensions, filter data ranges, sort and group data, and specify how data dimensions are to be mapped to visual variables. (© IEEE 2002. Reprinted, with permission, from Stolte, C., Tang, D., Hanrahan, P., Polaris: A System for Query, Analysis, and Visualization of Multidimensional Relational Databases, Transactions on Visualization and Computer Graphics 8(1), 2002)
Figure B.8 Selecting data points in a scatter plot by a rectangular brush and data points in other scatter plots associated with the selected data get highlighted. Source: (Cleveland and Becker 1987) (Reprinted with permission from Technometrics Copyright 1987 by the American Statistical Association)
Figure B.9 Using a "mouse lasso" to replace with, intersect with, add, subtract, or toggle the previous rectangle-select data points. (© IEEE 1996. Reprinted, with permission, from Wills, G., Selection: 524,288 Ways to Say "This Is Interesting", Proceedings of IEEE Symposium on Information Visualization, 1996)
Figure B.10 has been removed due to copyright restrictions. It was graphics showing (a) A magnification function by which the scale of an image is determined by distance from the focal point, (b) application of the magnification function to the horizontal dimension, (c) application of the magnification function to both horizontal and vertical dimensions. Original source: Leung, Y. K., and Apperley, M. D. (1994). A Review and Taxonomy of Distortion-Oriented Presentation Techniques. ACM Trans. Comput. -Hum. Interact.: Figures 5(b)~5(d), pp. 133
Figure B.11 has been removed due to copyright restrictions. It was two graphics showing interaction features that allow users to specify sizes of 3D bars thereby reducing occlusion problems. The blue and red bars in (b) were adjusted to be thinner than the ones in (a) so that green bars can be seen more clearly. Original source: Chuah, M. C., Roth, S. F., Mattis, J., Kolojejchick, J. (1995). SDM: Selective Dynamic Manipulation of Visualizations. Proceedings of UIST '95 Symposium on User Interface Software and Technology: Figures 8 and 11, pp. 65
Figure B.12 An interactivity mechanism by which data density changes along with the zooming in/out of a map. (© IEEE 2003. Reprinted, with permission, from Stolte, C., Tang, D., Hanrahan, P., Multiscale Visualization Using Data Cubes, IEEE Trans. Visual. Comput. Graphics 9(2), 2003)
Figure B.13 An anomalous pattern at the "Morris" barley site can be identified in the left "Trellis view" but not in the right view while: in (a) dot charts are sorted by median yields of barley sites (from bottom to top) and by median yield of year; data points in each dot chart are sorted by median yields of barley varieties, in (b) dot charts and data points in a dot chart are simply sorted by alphanumeric ordering. Source: (Becker et al. 1996) (Reprinted with permission from Journal of Computational and Graphical Statistics Copyright 1996 by the American Statistical Association)
Figure B.14 has been removed due to copyright restrictions. It was a flow chart of a multiple view coordination model. In this model, an event is initiated by users in one of the two views (V1, V2). This event may include basic visualization processes such as filtering data (i.e. enhance), mapping data to visual variables (i.e. map), rendering an image (i.e.
render), and rotating view points (i.e. transform), which consist of a coordination space that is applicable to both views. Source: (Boukhelifa et al. 2003) (© IEEE 2003. Reprinted, with permission, from Boukhelifa, N., Roberts, J. C., Roberts, P. J., Rodgers, P. J., A Coordination Model for Exploratory Multi-view Visualization, Proceedings of the Conference on Coordinated and Multiple Views in Exploratory Visualization, 2003)
Figure B.15 has been removed due to copyright restrictions. It was a multiple view of many 2D vertical bar charts. Original source: Pirolli, P., and Rao, R. (1996). Table Lens as a Tool for Making Sense of Data. Proceedings of the Workshop on Advanced Visual Interfaces: Figure 1, pp. 69
Figure B.16 A multiple view of a choropleth map, a parallel coordinates chart, a scatter plot matrix, and a scatter plot for users to explore insights from different angles in a demographic dataset. (© IEEE 2005. Reprinted, with permission, from Feldt, N., Pettersson, H., Johansson, J., Jern, M., Tailor-made Exploratory Visualization for Statistics Sweden, Proceedings of the Coordinated and Multiple Views in Exploratory Visualization, 2005)
Figure B.17 Parallel coordinate representations for visualizing multi-dimensional fatal accident data at different levels of detail. (© IEEE 1999. Reprinted, with permission, from Fua, Y., Ward, M. O., Rundensteiner, E. A., Hierarchical Parallel Coordinates for Exploration of Large Datasets, Proceedings of the Conference on Visualization '99, 1999)
Figure B.18 has been removed due to copyright restrictions. It was a "same bin size Mosaic Plot" that represents the data about answers to the question of "how you heard about survey". The degree of grey shades represents counts of responses to certain answer options.
Original source: Hofmann, H. (2006). Multivariate Categorical Data-Mosaic Plots. Graphics of Large Datasets: Visualizing a Million: Figure 5.13, pp. 120
Figure B.19 A multiple view of three images that shows the biological data of stickleback and pufferfish with regard to synteny relationship at the: (a) genome, (b) chromosome, and (c) block level. (© IEEE 2009. Reprinted, with permission, from Meyer, M., Munzner, T., Pfister, H., MizBee: A Multiscale Synteny Browser, IEEE Trans. Visual. Comput. Graphics 15(6), 2009)
Figure B.20 has been removed due to copyright restrictions. It was a visualization called "Table Lens" that allows users to view detailed values (i.e. focus) and overall patterns (i.e. context) simultaneously. Original source: Rao, R., and Card, S. K. (1994). The Table Lens: Merging Graphical and Symbolic Representations in an Interactive Focus + context Visualization for Tabular Information. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Color Plate 1, pp. 481
Figure B.21 A tree map representation that shows planned values (rectangle sizes) and cost variances (colors) of both individual projects and project groups (grouped by departments). (© IEEE 2004. Reprinted, with permission, from Chintalapani, G., Plaisant, C., Shneiderman, B., Extending the Utility of Treemaps with Flexible Hierarchy, Proceedings of Eighth International Conference on IV, 2004)
Figure B.22 has been removed due to copyright restrictions. It was a graphic showing a non-Euclidean geometry providing a smoothly varying focus + context for visualizing hierarchical data. Original source: Lamping, J., and Rao, R. (1996). Visualizing Large Trees Using the Hyperbolic Browser.
CHI '96: Proceedings of Conference Companion on Human Factors in Computing Systems: Figure 1, pp. 388
Figure B.23 has been removed due to copyright restrictions. It was a graphic showing the use of 3D virtual space to house 2D and 3D graphics for visualizing data from Statistics Canada. Original source: Brath, R. (2003). Paper Landscapes: A Visualization Design Methodology. Proceedings of Conference on Visualization and Data Analysis (VDA 2003): Figure 4(right), pp. 131.

Acknowledgements

Many people contributed to the conception, growth, and blossoming of this challenging research undertaking. Of the many kinds of help I received during this journey, my deepest gratitude goes to my research advisor, Professor Alan Russell. His prophetic and realistic vision, wisdom, and knowledge provided invaluable guidance throughout this research program. He also gave me firm moral and financial support so that I could persist and persevere with the research work. Secondly, I would like to express my thanks to my supervisory committee members, Professor Scott Dunbar, Professor Sheryl Staub-French, and Professor Tamara Munzner, for their insightful advice and guidance throughout my research work. Special thanks go to Professor Tamara Munzner for her exceptional expertise in information visualization, which is core to this research. The programming work by Mr. William Wong helped to implement my research ideas and is particularly appreciated. Without a workable prototype, this research work would have been very challenging. Special thanks go out to the former and current fellow students who were involved in the data collection work: Tanaya Korde, Ali Mehrdana, Jehan Zeb, and Phiradej Luechachandej, thank you all. Concert Properties Ltd. and Scott Management and Group are also greatly appreciated for their generosity in providing their project data.
Lastly, I am very grateful for my family's full support of my pursuit of this Ph.D. research. My father, Hui-Chin Chiu, and mother, Su-Chen Chiang, played the most important role in encouraging me to hold on to my dream. My dedication goes to my wife, Hsiu-Wen Tsai, for her sacrifice in giving up her job to accompany me and care for our two lovely, healthy, and considerate daughters, Yan-Lin and Yan-Chi, who made this journey more joyful.

Chapter 1 Introduction

This thesis is a manuscript-based document describing the research topic of applying data visualization technology to construction management (CM). The research centers on seeking answers to three research questions. Answers to the first two research questions, "How should a CM data visualization environment be developed?" and "What are the key features of a CM data visualization environment that best reflect the functions expected for it?", are described in three manuscripts. Two have been published (Chapters 2 and 3) and the third (Chapter 4) has been submitted for review. In terms of the overall structure of the thesis, Chapter 1 describes the background, goals, methodologies, literature review (presented in the form of appendices), and contributions of the research. Chapters 2 through 4 describe the findings of phases 1 through 3 of the research work respectively (see section 1.5 Research Methodologies). Chapter 5 concludes this thesis by reporting on the fourth and final phase of the research work in terms of answering the last research question: "How does the use of a CM data visualization environment help conduct CM analytics that cannot be done or which are difficult to do with current data reporting practices?". It summarizes the research work and contributions, and suggests future work for tackling challenges/limitations encountered in this research endeavor.
The literature review that underpins the research work is organized in two major Appendices because of the number and richness of the images that accompany the text describing its results. Appendix A presents a full literature review of the use of data visualization in construction management (including the data visualization capabilities of current commercial CM information systems), while Appendix B presents an overview of state-of-the-art data visualization technologies.

1.1 Problem statements and proposed solutions
Construction management processes serve several functions including time, cost, scope, and quality management as well as their integration. Each function requires CM personnel to conduct one or more tasks, which, in general terms, involve referencing "input" information and knowledge and utilizing techniques to produce "output" information and knowledge useful for other tasks. Two examples of tasks are "develop schedule" (time management function; planning) and "monitor and control project work" (integration management function; monitoring and controlling), as elaborated upon in the Project Management Body of Knowledge (PMBOK) (Institute 2008). Data flows of inputs/outputs and their originating/destination management functions/tasks as suggested by PMBOK are illustrated in Figures 1.1 and 1.2. In Figure 1.1, the task of interest, "develop schedule", is enveloped in a red oval. For this task, users need to consider the impact of referencing input data, which in turn are outputs from various management functions/tasks as shown in the text boxes above the red oval in Figure 1.1. The outputs of this task, the schedule data, are in turn reference information for other management tasks (i.e. have impacts on other management tasks) as shown in the text boxes below the red oval in Figure 1.1.
The items enveloped by the blue lines in Figure 1.1 correspond to the scope of good practice suggested by PMBOK; contents outside of the blue envelope represent other inputs that may be relevant to the "develop schedule" task and other management functions/tasks that may be impacted by its outputs (i.e. schedule data). The conventions used in Figure 1.2 are the same as those for Figure 1.1, but here the task of interest is "monitor and control project work". Two layers of data flow for generating inputs for this task are depicted.

Explicit CM knowledge developed over many years has been codified into good practices/guidelines/rules/procedures regarding the tasks involved in each management function and the relevant inputs/outputs/tools for executing each task. A complete CM process encompasses many tasks, which can be grouped by CM functions (e.g. management of time, cost, scope, quality), project phases (pre-planning, execution, post-execution) and/or CM purposes (planning/predicting, monitoring/diagnosing/controlling). Usually inputs for one task are outputs from other tasks. A CM information system (abbreviated as CM_IS hereafter) can be developed to provide a computerized environment that implements good practices/guidelines/rules/procedures to support CM tasks. Automation in support of the execution of CM tasks constitutes "computer assistance", by which the CM_IS produces outputs given the requisite inputs. The manual process of executing CM tasks and generating outputs relies partly on the judgment of CM personnel, who reference relevant input/output information and their own CM knowledge.
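The observation that inputs for one task are usually outputs from other tasks can be sketched as a small task graph. The following is a minimal illustration only, not part of the thesis prototype or of any CM_IS: the `Task` class, the helper names `upstream` and `downstream`, and the reduced set of tasks and data items (loosely based on the PMBOK data flows for "develop schedule") are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical record of a CM task and the data items it consumes/produces."""
    function: str                     # management function, e.g. "Time Management"
    name: str                         # task name, e.g. "Develop Schedule"
    inputs: frozenset = frozenset()   # names of data items referenced by the task
    outputs: frozenset = frozenset()  # names of data items produced by the task


# Simplified, assumed subset of tasks and data items for illustration.
TASKS = [
    Task("Time Management", "Define Activities",
         outputs=frozenset({"activity lists"})),
    Task("Time Management", "Estimate Activity Durations",
         outputs=frozenset({"activity duration estimates"})),
    Task("Scope Management", "Define Scope",
         outputs=frozenset({"project scope statement"})),
    Task("Time Management", "Develop Schedule",
         inputs=frozenset({"activity lists", "activity duration estimates",
                           "project scope statement"}),
         outputs=frozenset({"schedule data"})),
    Task("Cost Management", "Estimate Costs",
         inputs=frozenset({"schedule data"})),
    Task("Quality Management", "Plan Quality",
         inputs=frozenset({"schedule data"})),
]


def upstream(task, tasks):
    """Names of tasks whose outputs serve as inputs to `task`."""
    return sorted(t.name for t in tasks if t.outputs & task.inputs)


def downstream(task, tasks):
    """Names of tasks that reference the outputs of `task` as inputs."""
    return sorted(t.name for t in tasks if task.outputs & t.inputs)


develop = next(t for t in TASKS if t.name == "Develop Schedule")
print(upstream(develop, TASKS))    # tasks feeding "Develop Schedule"
print(downstream(develop, TASKS))  # tasks affected by its schedule data
```

Matching tasks by shared data item names keeps the sketch close to the figures, where each arrow is labelled with the data that flows along it; a real system would more likely use stable identifiers than free-text names.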
Figure 1.1 Data flows for the task "develop schedule" as suggested by PMBOK (2008). Inputs shown include activity lists, activity attributes, activity resource requirements, schedule network, activity duration estimates, resource calendars, and the project scope statement; the output is schedule data. Contents enveloped by blue lines represent good practices suggested by PMBOK in terms of one-layer sources of inputs and usefulness of outputs for the task.

Figure 1.2 Data flows for the task "monitor and control project work" as suggested by PMBOK (2008). Inputs shown include work performance information, work performance measurements (planned vs. actual schedule, cost, and technical performance), project management plans, budget forecasts, and performance reports. Contents enveloped by blue lines represent good practices suggested by PMBOK in terms of two-layer sources of inputs and usefulness of outputs for the task.

Figure 1.3 The relationship between several current state-of-the-art CM_IS, computer assistance, human judgment, and conducting stand-alone CM tasks. Each CM_IS offers mostly tabular reports of input/output data, with limited data visualization.

This human judgment includes detecting and considering patterns hidden in input and output data that indicate potential effects (impacts) on and/or causes of CM variables. These identified potential effects and/or causes may be relevant to the CM tasks at hand and therefore actions to address the effects (i.e. correction) and/or causes (i.e.
prevention) can be taken for conducting/refining the execution of CM tasks. Examples of the different kinds of input/output data and associated management tasks can be observed in the shaded rectangles and round-edged rectangles in Figures 1.1 and 1.2. The relationship between several individual current state-of-the-art CM_ISs, computer assistance, human judgment, and conducting CM tasks is illustrated in Figure 1.3. Computer assistance and human judgment complement each other, each addressing weaknesses in its counterpart. For example, computer assistance can rapidly generate derived data that requires human inspection. Conversely, human inspection of the characteristics of CM variables that are relevant to a CM task can remedy shortcomings of predefined knowledge, which of necessity oversimplifies the complexity of reality by considering only a limited number of variables.

In the past, as compared to the focus on developing a CM_IS that enhances the degree of computerization and its role in automatically generating outputs, the development effort for presenting input/output data in support of human judgment for conducting individual CM tasks and complete CM functions has been relatively limited. As a result, CM users often find it difficult to digest and leverage input/output information because of the sheer volume and high dimensionality of data. Two main issues in current CM_ISs that create this difficulty are elaborated upon below.
1. Issues of report format: Currently, data reports (specifically, reports of metadata, along with some descriptive remarks, describing project models or contents of source documents) that can be generated from state-of-the-art CM_ISs are mostly tabular or textual, with very few of them being graphical.
Figure 1.4 represents one out of almost 60 pages of a CM_IS-generated data report that encompasses input/output data obtained from CM tasks carried out during the planning and execution phases of an actual project. Data treated include planned/actual schedules and records of problems encountered during execution. In fact, many CM staff simply use spreadsheets or even Word documents to generate and store input/output data relevant to CM tasks. Figures 1.5 and 1.6 show actual data reports of deficiencies and change orders collected using either Word documents or Excel spreadsheets. Such tabular reports can provide details of individual raw data for executing CM tasks (e.g. referencing items in a deficiency list to inspect the rectified work). However, they do not provide the overall insights that can be deduced from examination of all or part of a data set, which in turn provide leads as to potential effects (impacts) on and/or causes of CM variables. Therefore, a preliminary research question emerges:

Rather than using only tabular/textual reports, how should input/output data sets be presented so that CM users can reason about potential effects and/or causes that matter to the CM tasks at hand?

Figure 1.4 One of many pages of CM_IS generated tabular reports of planned/actual schedule data and problems encountered during execution
Figure 1.5 One of many pages of Microsoft Word documents of deficiency lists
Figure 1.6 One of many pages of Microsoft Excel spreadsheets for change order data

2. Issues of the need to continuously refine the search of several reports before generating the "right" reports: Another issue relates to the need of CM_IS users to go through many data reports, which may come from:
Inputs prepared for and outputs generated from computer assisted automation or any relevant required inputs for the task dictated by explicit knowledge (e.g.
shaded rectangles enveloped by the blue lines in Figures 1.1 and 1.2), or
Outputs from other tasks that are deemed by the implicit knowledge of CM_IS users, but not expressed in terms of explicit knowledge, as relevant inputs to the tasks at hand (e.g. shaded rectangles outside the blue envelopes in Figures 1.1 and 1.2).

The processes involved in going through many reports on demand help CM_IS users to pose and answer questions continuously until they are satisfied with their analytical reasoning. These processes may include iterations of: 1) looking for and selecting relevant reports and data contents, and 2) adjusting presentation formats of the reports to ensure readability. However, current CM_ISs provide limited interactive features with which users can readily conduct the foregoing processes. Also, while some prototypes of integrated CM_IS have been developed, most commercial CM_ISs only support specific CM tasks (e.g. Microsoft Project mainly supports CM tasks of time management functions both in the planning and monitoring phases). This creates a difficulty for CM users to seamlessly seek and view a variety of relevant data reports.

Figure 1.7 A proposed solution in terms of enhancing data reporting functionalities in the relationship between a current state-of-the-art CM_IS, computer assistance, human judgment, and conducting stand-alone CM tasks
Thus, another preliminary research question is:

How can CM users explore project data sets in order to generate the "right" data reports in visual form for assisting their analytical reasoning processes on an ongoing basis?

To address the foregoing issues, my preliminary idea for a solution is to improve the data reporting capability of a CM_IS, which focuses mainly on using tabular/textual reports. The proposed enhancement is conceptualized in Figure 1.7. As seen in this figure, the layer (the orange rectangle) between an integrated CM_IS and CM processes represents an enhanced "data reporting center". It adds the following two features, which are lacking in current CM_ISs and their data reporting features as depicted in Figure 1.3 (hereinafter, an enhanced data reporting center refers solely to a data reporting center that has these two features, excluding tabular data reporting):
1. In addition to tabular data reports, input/output data from a diverse range of CM tasks can also be viewed using "new" data presentation formats that can deliver messages hidden in datasets that are not readily discernible using traditional tabular/textual reports; and
2. These "new" presentations of input/output data should be available "through a single shop or interface" with sufficient interactive features to allow users to flexibly select and adjust the presentations.

The goal of this thesis is to demonstrate how the foregoing can be achieved through the use of data visualization. Data visualization is defined as "the use of computer-based, interactive visual representations of data to amplify cognition" (Card et al. 1999). Issues related to it have been research topics in other knowledge domains including computer science, statistics, and psychology.
Suitable visual representations of data can be designed and created by transforming original raw datasets into desired structured data representations and then representing data dimensions by effective visual variables (e.g. spatial positions, colors). For at least three reasons, visual representations of data help derive overall information/insights that tabular/textual/audio/video raw data cannot otherwise provide:
1. For unstructured raw data, important topics and their contexts are parsed into discrete data dimensions (i.e. variables) and values corresponding to them, which in turn provide a focused and structured CM variable data space for investigation. Thus, unstructured raw data is turned into tabular metadata. For example, one of the important topics of construction meeting minutes relates to locations of troubled activities, and therefore the specific locations and activities mentioned in each meeting are documented. Thus, the minutes can be structured into at least a two-dimensional data table with the dimensions being "location" and "activity" and the values of each dimension being codes or names of locations/activities mentioned in the meeting minutes. For the foregoing transformed tabular metadata, new data dimensions such as "counts of meeting minutes" can be derived, leading to information as to the number of meeting minutes related to a certain activity and/or location. It is the aforementioned data transformation of original raw data and the derivation of data tables that provide users with a focus for examining the variables that are most pertinent to their CM analytic tasks. The "data" in the "visual representation of data" usually refers to the foregoing transformed tabular metadata along with the additional data derived from the metadata.
2.
Visual features such as spatial position and color are far less similar to one another than are pieces of text or numbers, which is one of the key reasons why human beings can be visually attentive to certain symbols (Duncan and Humphreys 1989) and identify visual patterns. These patterns indicate important messages/insights hidden in a collection of data, prior to conscious attention (Ware 2004).
3. Large amounts of visual/diagrammatic information can be processed by the human visual perception system in parallel, as opposed to the serial processing required for textual or numeric information (Larkin and Simon 1987; Ware 2004).

In addition to the foregoing merits of visual representations of data, data visualization also enables interaction with visual representations of data, thereby allowing users to continuously formulate the data content and format of the visual representations during their exploration and insight generation processes. This facility provides users with a continuously updated information platform and aids the decision making process. A related concept is visual analytics, which is a new data analysis paradigm. It is defined as "the use of visual representations of data and interactions to accelerate rapid insights into data" (Thomas and Cook 2005). The purpose of visual analytics is similar to the needs of the CM community in terms of exercising human judgment when conducting CM tasks through interacting with an enhanced "data reporting center".

This thesis makes use of terminology employed by both the visualization and construction management communities, combined with the concepts presented in the first part of this chapter. To ensure a clear understanding of the vocabulary used, definitions of terms are presented in the next section and adhered to throughout this document.

1.2 Terminology definitions
1. CM data: the input/output data generated in support of CM tasks.
2.
Visual representation of CM data (or images of CM data or simply images): presentations of CM data that are either in their natural form (if the CM data represents tangible objects such as a building) or visual form that map data dimensions to visual variables (if the CM data represents non-tangible abstractions). Since this research focuses on abstract CM data, the latter definition is adopted herein.
3. Pre-coded image: an image whose default specifications (e.g. how data dimensions are mapped to visual variables, levels of detail of data to be presented) have been predefined in a data visualization system. The system has user interfaces for showing selections of pre-coded images, which can be in the form of menu items, check boxes, or small icons. If a user chooses one of them, the visualization system will generate the image selected based on the pre-defined specification of that image.
4. Thematic images: images can be categorized by certain themes. For example, various images that portray values of different attributes of a product can be categorized as product thematic images.
5. Interaction features: content and display controls that allow users to interact with images such as querying data ranges and then updating the images according to the query.
6. Interactivity: depending on the context in which this term is used, "interactivity" is loosely used to describe a feature in particular or a capability in general of interacting with images. It is a term commonly used in the computer science literature.
7. CM data visualization environment: an "enhanced CM data reporting center" that is created based on data visualization concepts and technologies.
8. Construction conditions: an umbrella concept covering construction strategies imposed, construction requirements dictated in contracts, construction constraints encountered, and so forth.
9. Construction performance measures: time, cost, scope, quality, safety, and risk.
10.
CM analytics: CM user focused analytical reasoning for identifying potential effects and/or causes from the characteristics of CM variables related to construction conditions and construction performance measures that pertain to the CM task/function at hand.
11. Visual CM analytics: the use of a CM data visualization environment for conducting CM analytics.
12. Analytic reasoning artifacts: this term was defined in (Thomas and Cook 2005) as "tangible pieces of information that contribute to reaching defensible judgments" about a question. These artifacts can be elemental ones such as relevant information and evidence, patterns, high-order knowledge constructs, and complex reasoning constructs.
13. Top-down development approach: A "top-down approach" involves a design process that is focused on identifying CM analytics applicable to one or more CM functions/tasks and visualization requirements responsive to them. The analysis of visualization requirements includes scoping what CM variables are involved in the CM analytics and identifying general rules of how their characteristics can be observed, on demand, in support of the CM analytics. Determination of the relevant CM variables provides the scope/direction of the visualization development; the analysis of how characteristics of these variables can be observed in general (i.e. in support of CM analytics common to various CM functions/tasks) results in required common visualization features. For example, from an overall CM perspective, the CM analytics common to most CM functions/tasks is the exploration of potential causes and/or effects amongst construction conditions/performance measures. Using the multiple view modeling of a project, these construction conditions/performance measures can be attributes of the project context dimensions of product, process, project participant, environment, etc.
This recognition helps form the scope/direction of CM data visualization environment development, i.e., CM variables in terms of construction conditions/performance measures to be visualized. Also, it is recognized that the characteristics of the aforementioned CM variables in general can be observed by project context dimensions, different levels of detail, and data status. Thus, the common visualization features should support presenting the distribution of values of construction conditions/performance measures in terms of different data status, different context dimensions, and at different levels of detail.
14. Bottom-up development approach: A "bottom-up approach" deals with the detailing needed in order to create an actual visualization. These detailing processes include:
Analysis of more specific CM analytics (i.e. CM questions) in relation to CM functions/tasks;
Identifying specific visualization requirements in terms of what CM variables are involved for specific CM analytics and how their characteristics can be observed, on user demand, in support of the CM analytics. Analysis of the latter is conducted by first analyzing specifics of the top-down common visualization features that fit the nature of particular CM analytics and/or CM variables. Additional new visualization features may also be sought. Inclusion of interaction features for changing default settings of a visual representation that are as flexible as possible is key to allowing users to decide how to observe the characteristics of CM variables on demand.
In the bottom-up design, the focus is on identifying "what cannot" be changed, i.e. settings for which changes would not add value to the usability and utility of the visualization;
Specifying required data representations/transformations, visual encodings and non-visual encoding attributes of a visual representation, and interaction features according to specific visualization requirements;
Implementing the specifications;
Evaluating the implemented visualization. Designers and/or end users can utilize the "inspection evaluation method" (Amar and Stasko 2005; Ardito et al. 2006; Zuk et al. 2006) to identify any deficiency in terms of non-conformance to the requirements/specifications, as well as new requirements/specifications that better help answer the CM questions posed.

Testing of an implemented visualization should make use of sizeable sets of actual as well as synthetic project data that are representative and reflective of real world projects in order to ensure the visualization tool can handle the realities of construction project data. The function of the bottom-up development process, from the perspective of developing a CM data visualization environment, is that by focusing on details and using the implemented visualization on actual and representative data sets, lessons can be learned that contribute to refining design guidelines and/or top-down common visualization features.
15. Environment architecture: an organization of thematic visualizations, mainly hierarchical from abstract to specific (e.g. time performance visualization vs. milestone finish date monitoring visualization), categorized by construction conditions and performance measures under multiple views of a project. Each visualization is developed in a consistent way by following design guidelines and addressing common visualization features required for supporting general CM analytics. These include the use of consistent user interfaces and mechanisms for users to adjust them.
A direct benefit is enhanced environment learnability.

1.3 Research questions, objectives, and hypothesis
Based on the observed problems, proposed solutions, and terminology described previously, several research questions have been formulated along with attendant research objectives which seek answers to these questions. The questions and research objectives are as follows:
1. How should a CM data visualization environment be developed?
   What are the status and shortcomings of current data visualizations for effecting visual CM analytics both in commercial CM_IS software and academic research?
   Are there concepts, theories, processes, and technologies from state-of-the-art data/information visualization that can be adopted or adapted to developing a CM data visualization environment?
   What methodology should be used to adopt or adapt state-of-the-art data visualization in order to develop/enhance a CM data visualization environment, which addresses shortcomings of the current use of data visualization in CM?
2. What are the key features of a CM data visualization environment that best reflect the functions expected for it?
   What are the key features of images for assisting CM analytics?
   What are the key features of interactivity and environment architecture that allow users to flexibly explore a variety of images useful for CM analytics?
3. How does the use of a CM data visualization environment help conduct CM analytics that cannot be done or which are difficult to do with current data reporting practices?

The formulation of, and search for answers to, the third question, along with research assumption 3 (see next section), form the research hypothesis:

The use of an appropriately developed CM data visualization environment helps CM personnel interpret CM data for assessing, learning, and communicating causes and/or effects amongst a wide range of CM variables of a construction project, thereby improving the quality of CM processes and decision-making.
1.4 Research scope and assumptions
In order to explore meaningful answers to the previously formulated research questions, it is essential to have a clear and focused scope of work along with supporting assumptions that are appropriate for a Ph.D. thesis. Doing so clarifies the context (assumptions/conditions) for which the answers sought are valid. Research assumptions and scope of work that underlie the thesis are as follows:

1.4.1 Scope or focus
1. The focus of this research is on understanding how to visualize a collection of abstract CM data for identifying potential causes of and/or effects on CM variables. This "abstract CM data" refers to structured metadata describing important topics/aspects of textual descriptions, project documentation (e.g. meeting minutes, videos, photos), and project models (e.g. built products, processes). Visualization of their actual contents (e.g. actual text in documents, appearance and geometry of products) is excluded from this research.
2. To the greatest extent possible, limited yet sufficiently diverse and sizeable sets of CM data originating from different CM tasks/functions (as-built records of change order data, deficiency data, schedule data, and product/location attributes data) are used to deduce general knowledge/insights surrounding the topic of utilizing CM data visualization to apply CM analytics to CM processes.
3. The use of CM data visualization is a complement to CM data analysis that uses a computational approach. In fact, the data visualization approach and the computational data analysis approach can leverage each other to achieve greater utility. Given the focus of my work on data visualization, a computational data analysis approach is not used as a comparison benchmark; i.e.
the research question (hypothesis) of interest does not relate to the superiority or inferiority of computational data analysis versus data visualization, but to the question of what benefits can be derived through visual analytics, and when.
4. Inquiring into industry end user performance differences arising from the deployment of data visualization tools and the accompanying visual analytics paradigm in a day-to-day work environment is excluded from this research. The purpose of developing a prototype of a CM data visualization environment in this research is to provide a test bench for generating and analyzing insights with regard to the design and use of such an environment. This does pose some challenges, however, if third party evaluation of the ideas developed is sought, as the focus can, and most often does, move to the full set of features and robustness one finds in commercial software rather than the concepts involved. It is believed that the insights generated through implementation of a prototype will contribute to future research work directed at extending the visualization and usability capabilities of the current prototype, as well as those developed by other researchers, to facilitate conducting full scale observations on applying data visualization and visual analytics to CM functions, and eventually to the development of commercial software that has the features and robustness required to effect adoption by industry.
5. The main focus of this research in response to the research question of "How should a CM data visualization environment be developed" is on how to develop a CM analytics-oriented one that helps identify potential causes and/or effects from a collection of data representing a variety of CM variables. The limited visualization capabilities found in the literature and commercial software, along with commonly used tabular/textual data reports, are mostly task-oriented for supporting execution of narrowly focused CM tasks (e.g.
read project scope statements and create work breakdown structure according to statements; read activities in the upcoming days in a schedule for executing construction work, etc). The kinds of visualization features my research is exploring have largely been unaddressed heretofore. Therefore, when evaluating an implemented new visualization the focus is on whether it serves the CM analytics for which it is implemented. In most cases, alternative visualizations in support of the same analytic task do not exist, precluding comparisons and evaluations as to superiority.
6. The author recognizes that in pursuing answers to the research questions, the front end part of the development process involves analysis of CM analytic needs and the corresponding visualization requirements (identification of CM analytic themes and important CM variables or data dimensions associated with these themes). This front end part is knowledge domain specific. A later stage of the development process involves turning requirements into visualization specifications (e.g. mapping multiple data dimensions to visual variables, considering choices of interaction features) and implementing the specifications, which is not knowledge domain specific and remains a difficult design problem in the general data visualization domain. It is a difficult general data visualization problem because there could be too many combinations of visual encodings and supporting interaction features for mapping the same data. In other words, there could be many alternative specifications that may all meet the visualization requirement (e.g. the need to map ten data dimensions important to a certain analytic need into visual forms) but which have different levels of usability (admittedly usability may have an impact on utility to some extent). Therefore, another focus of this research with regard to answering research question 2 "What are the key features of a developed CM data visualization environment…."
is on the front end analysis of CM analytic needs and visualization requirements. "Satisficing" specifications for some visualizations are identified and implemented, but they are only used to demonstrate the completeness of the development process and attention to state-of-the-art data visualization, and to provide a test bench for answering research question 3. Identification of the best specification amongst several alternatives is believed by the author to be a generic data visualization problem and is excluded as a focus of this research.

7. As part of this research work, various visualizations and supporting infrastructure were implemented to demonstrate the usefulness of these features. However, the focus of this implementation effort was not on developing novel programming approaches in support of the implementation work or on the use of new programming languages. Accordingly, the programming work associated with implementation is not claimed as a research contribution. It was assisted by a dedicated programmer who had extensive knowledge of the system in which the visual images are embedded.

1.4.2 Assumptions

1. As mentioned in the beginning of this chapter, a CM_IS is developed to implement explicit knowledge of CM processes. There could be different ways of organizing and recognizing CM knowledge and hence there will be variations amongst CM_IS in terms of system architectures, data structures, and terminologies used. One structured way of organizing CM knowledge is the concept that a construction project can be described and modeled by multiple views (e.g. process view, cost view, quality view) (Russell and Udaipurwala 2004), where each view has its unique data structures, data definitions, and data computation routines for implementing required inputs/outputs and techniques associated with the CM functions/tasks supported.
Multiple views are tightly associated by sharing common project context dimensions so that data from multiple views can be more readily used as inputs/outputs for specific CM tasks/functions. The exploration of how to develop a CM data visualization environment in this research makes use of this "multiple views of a project" concept and a research CM_IS (REPCON (Russell 1985; Russell and Udaipurwala 2004)) that implements this concept. This research CM_IS provided the platform for exploring the data visualization concepts described in this thesis. It provided important information system infrastructure for allowing full scale implementation and testing of concepts. However, it also imposed some constraints with respect to the visualization tool kit that could be used because the visualization application programming interface (API) needs to be compatible with the architecture of the research CM_IS. The visualization API used is ChartFX 6.2 Client Server (Software FX Inc.), and the visualization features are somewhat limited to the visualization components this API can offer (e.g. there is no control over attributes of the third coordinate).

2. CM data/information needs to be abstracted into structured and electronic data formats that are compatible with the CM information system(s) with which the CM data visualization environment is integrated. While, at least in theory, data abstraction can be done by computer through data mining techniques, this research relies on manually pre-processing messy and unstructured CM data into structured data formats. For example, for the case study that focused on deficiency data, considerable effort was expended by the author to review as-built data that was documented in plain sentences. Sentence contents were parsed into dimensions (e.g. participant, product, process, schedule) that match the data fields of the CM information system, and then the corresponding data was entered into the system.

3.
The cause-effect relationship between using a CM data visualization environment and improving CM processes is an indirect, two-layer cause-effect relationship: using a CM data visualization environment enhances the quality of CM analytics through strengthening human CM analytics ability, which in turn improves the quality of conducting CM functions/tasks and consequently CM processes. The latter relationship is an assumption used in this research, and hence the focus is on validating/invalidating the hypothesis that a CM data visualization environment can help enhance CM analytics capabilities.

1.5 Research methodologies

Correspondence between the research questions for which answers are sought and the research methodologies for answering them can be found in Table 1.1. The research techniques adopted are observational ones that include literature/state-of-the-art review, structured analysis, and prototype development through case studies. These research techniques were applied in the following sequential research phases:

1. Phase 1: To develop guidelines and principles for designing a CM data visualization environment based on experience gained from operating state-of-the-art data visualization software and structured analysis of past literature regarding visual analytics, data visualization, and construction management. A case study using a change order (CO) data analysis application and CO data of a complex rehabilitation project was utilized to:
• Demonstrate the merits of presenting data in visual form;
• Test the use of identified design guidelines/principles for analyzing visualization requirements and specifications for the change order analysis application; and,
• Identify lessons that can be learned regarding how to develop a CM data visualization environment and key features required of images and interaction features.
Table 1.1 Correspondence between research questions and research methodologies for answering them

| Research questions (objectives) | Research methodologies |
|---|---|
| What are the status and shortcomings of the current use of data visualization for effecting visual CM analytics both in commercial CM_IS software and academic research? | Literature reviews which focus on the use of either visual representations of data or data visualization to present CM data; review of the state-of-the-art of data visualization for CM_IS with the focus on mainstream commercial representations and an academic one available to me |
| Are there concepts, theories, processes, and technologies from state-of-the-art data/information visualization that can be adopted or adapted for developing a CM data visualization environment? | Literature reviews with the focus on fundamentals of how to develop data visualization tools and prevailing data visualization features/technologies |
| What methodology should be used to adopt or adapt state-of-the-art data visualization in order to develop/enhance a CM data visualization environment, which can address shortcomings of the current use of data visualization in CM? | Analysis and case studies (both application cases and project data cases) for executing the prototype development process in order to obtain lessons learned in terms of how to develop a CM data visualization environment |
| What are the key features of a CM data visualization environment that best reflect the functions expected for it? | Analysis and case studies (both application cases and project data cases) for executing the prototype development process in order to obtain lessons learned in terms of deducing key features of the images implemented and their interaction features |
| How does the use of a CM data visualization environment help conduct CM analytics that cannot be done or which are difficult to do with current data reporting practices? | Case studies, structured comparison evaluation, and analysis for comparing analytical reasoning capability between the use of a CM data visualization environment and past data reporting functionalities |

2. Phase 2: To develop a CM data visualization environment using a top-down approach, based on the guidance of design principles/guidelines and structured analysis of relevant data visualization and CM literature. The focus of the design is to identify requirements from the conceptual level for an overall CM data visualization environment that can serve CM analytics in general, to the more detailed level for the subpart of an overall CM data visualization environment that is specific to time performance management. An existing CM data visualization environment, which was designed and developed in the past using a bottom-up approach, was compared against prescribed requirements.
A case study using schedule/as-built data from an actual project and an existing CM data visualization environment were utilized for time performance management applications to:
• Demonstrate the merits of utilizing interaction features to explore and formulate visual representations of CM data,
• Test the use of a top-down approach for analyzing visualization requirements that relate to common/general CM analytics for CM functions/tasks, and
• Identify lessons that can be learned regarding how to develop a CM data visualization environment and key features of images and interaction features.

3. Phase 3: To develop a CM data visualization environment using a bottom-up approach in accordance with the design principles/guidelines and visualization requirements identified in phase 2. The focus of the development is to identify, implement, and evaluate new visualization features of a CM data visualization environment in support of specific CM analytics including schedule variance analysis, product/location attribute analysis, and reasons for time performance analysis within the CM function of time performance control. The inspection kind of evaluation method (Amar and Stasko 2005; Ardito et al. 2006; Zuk et al. 2006) is used to evaluate implemented features against prescribed specifications, requirements, and analytic needs in order to identify the need for refinements.
Case studies using product/schedule/location/as-built data from actual projects and the newly enhanced CM data visualization environment were utilized for time performance control applications to:
• Demonstrate the merits of the environment architecture proposed in this thesis,
• Test the use of the bottom-up approach for developing visualization features that are responsive to both common and specific CM analytics for CM functions/tasks, and
• Identify lessons that can be learned regarding how to develop a CM data visualization environment and key features required of images and interaction features.

4. Phase 4: Demonstrate how the key concepts used to define a CM data visualization environment may enhance CM analytics capability. The demonstration and analysis are done by comparing a CM information system that is integrated with the developed prototype CM data visualization environment against current data reporting practices that mainly use textual/tabular reports and a few graphics (e.g. bar graph schedule). Specifically, case studies, conventional data reporting functionality, use of the CM data visualization environment developed as part of this research work, and a structured comparison process were employed to demonstrate and analyze how the use of a CM data visualization environment may enhance CM analytics capability.

1.6 State-of-the-art data visualization and its application to CM

A full literature review of the use of data visualization in construction management (including the data visualization capabilities of current commercial CM information systems) and an overview of state-of-the-art data visualization technologies were carried out in order to understand the current state of applying data visualization technologies to facilitate CM visual analytics in support of CM processes.
Due to the number and richness of the images associated with the text for describing the results of the literature review, more detailed descriptions of them are presented in Appendices A and B (Appendix A: Data visualization in construction management; Appendix B: Overview of state-of-the-art data visualization). A distillation of the findings from the literature review documented in Appendices A and B contributes to answering the first two questions under the question heading of "How should a CM data visualization environment be developed?" (see page 16), which includes the identification of shortcomings of current applications that should be addressed and how state-of-the-art data visualization can be adopted/adapted to developing a CM data visualization environment. The findings of this review are summarized below.

1.6.1 Shortcomings of current applications of data visualization in CM

1. Data for limited CM tasks/functions is visualized: Exclusive of the visualization of the geometric aspects of a project (e.g. built products, equipment) and how the geometric aspects progress in time (e.g. 4D model, equipment operation; these topics have been extensively investigated in the literature and are not covered in my literature review), the use of visualizations of abstract CM data in the past focuses mainly on construction process planning, scheduling, and estimating. These visualizations usually present values of the independent and dependent variables that are explicitly defined in a prediction model (e.g. Figure A.9 in Appendix A) and are usually limited to "showing a plan" in terms of schedules, cash flows, and resource allocation. However, many CM variables that relate to planning assumptions/conditions are not explicitly recorded or visualized. These variables include planned values of project environment attributes in support of risk management (e.g.
political environment, economic environment, climatic environment), product attributes in support of scope, cost, and time management (e.g. product quantity, quality requirement/grades, design features), location attributes in support of risk management and planning (e.g. work area size, utility distribution), and organization/contractual attributes for contract management (e.g. organization performance appraisal).

It is only recently that researchers have started to explore the potential of visualizing as-built data to support the control function of CM. However, the focus is mainly on monitoring deviations between planned and actual cost (e.g. Figure A.27, Figure A.29 in Appendix A). Little was found about using visualization of CM data to monitor other performance measures and to seek reasons for performance from data representing actual values of planning conditions/assumptions, including visualization of metadata used to describe documents that record a wide range of as-built events/conditions (e.g. problems encountered, change directives, meeting minutes discussing project/construction issues, site reports, etc.).

2. No domain wide CM analytics-oriented visualization is supported: Most current CM data visual representations (e.g. bar graph schedule, earned value index graphics) are developed as part of data-oriented and task-oriented approaches and their use tends to be restricted to a single function, e.g. schedule data for planning and scheduling (e.g. Figure A.4 in Appendix A), earned value data for cost control (e.g. Figure A.36 in Appendix A), probability-impact register for risk analysis (Figure A.19 in Appendix A). They are not developed and used in conjunction with the need to know potential causes of and/or effects on CM variables that in turn could be helpful in conducting CM tasks (e.g. identify a schedule's potential impacts on cost implications of project participants and refine the schedule to eliminate the impacts).
Understanding potential causes and/or effects amongst CM variables relevant to a CM task can add significant value to all kinds of CM tasks/functions.

Currently, neither academia-developed nor commercial CM information systems provide a visualization environment that allows users to generate and interact with a wide variety of visual representations of CM data that support common and unique analytic tasks associated with the diversity of tasks/functions that comprise CM. For example, current commercial CM information systems are usually modularized into different software components or packages that support specific CM tasks/functions (e.g. Oracle-Primavera Risk Analysis software is for risk management and Oracle-Primavera P6 software is mainly for scheduling and schedule tracking). Hence the data and data visualization are specific to the tasks/functions supported and can only be accessed in the individual software component. To achieve the integrated use of data visualization in different commercial software packages, issues that have to be addressed include: 1) data integration, 2) image accessibility and composition across software packages, and 3) whether images designed for one package support common CM analytics shared by other CM tasks.

1.6.2 How state-of-the-art data visualization can be adopted/adapted

1. Limitations of state-of-the-art data visualization and focus of development: Although computer technology development and research accomplishments in data visualization have helped advance the use of data visualization from paper-based, static, and passively-presented graphics to computer-based, dynamic, and actively-exploring visual interfaces, the development of data visualization tools still faces two fundamental problems. The first one is the technical problem of limited resources, e.g., limited size of visualization space, number of visual variables, and human visual memory.
The second problem is the complex interactions amongst variables including characteristics of visual display, task demands, data complexity, human factors, and user characteristics (e.g. users' knowledge about the designed graph, context or information the graph represents). State-of-the-art data visualization can be creatively applied to overcome some of these technical limitations (e.g. use of interaction features to more quickly browse through many images representing different data dimensions/contents). To date, however, fundamental research in data visualization has at best answered a small part of the questions related to the aforementioned complex relationships (e.g. the simplistic relationship between visual encoding, data characteristics such as measurement scales of data, and effectiveness in judging data values). From the perspective of applying existing state-of-the-art data visualization principles and tools as opposed to adding to them, the challenge becomes one of utilizing the state-of-the-art to overcome technical issues (e.g. accommodate more images in one view with clarity for quicker scanning utilizing commonly used graphics).

2. The need to use domain wide analytics-oriented data visualization development methodologies: Many state-of-the-art information/data visualization tools tend to be one or more of technology-oriented (develop innovative technologies), data-oriented (solve issues of visualizing massive data; how to portray the same data in different ways), and/or specific task-oriented (solve specific analysis or visual analysis problems). In general their focus is not on the breadth of analytical reasoning associated with a complete knowledge domain, nor is it oriented to generalists as opposed to specialists within a domain. For example, task/specialist-oriented visualization development may focus on solving clutter issues associated with presenting activity sequences in a traditional bar graph schedule (i.e.
how can one see sequences between activities more effectively and efficiently). An analytics-oriented development that covers the entire domain of CM, on the other hand, will first determine the implications of activity sequencing for CM as design/development guidance (i.e. the conclusion may be that there is little or no benefit in showing sequences in bar graph schedules from the CM analytics perspective). Non-domain wide analytics-oriented visualization development usually goes through a structured development process of eliciting requirements in terms of specific task needs (or task problems, visualization problems) from a small number of end users, then designing and implementing, followed by conducting user studies to quantitatively or qualitatively appraise the usability and/or utility of the developed tools. However, the goal of a CM data visualization environment is to support CM domain wide analytics. Thus, the development approach proposed by Amar and Stasko (2005) appears to be more suitable for pursuit of this objective. This approach involves the idea of developing an analytics-centered visualization system, and it tries to identify heuristic rules¹ in terms of how a visualization system should support certain high-level analytics (e.g. a visualization system should assist in creating/acquiring/transferring knowledge about important domain parameters) as a reference for designing the system and inspecting whether the developed system meets these analytics needs. Therefore, development methodologies that are more in line with this idea should be adapted to formalizing structured processes for developing a CM data visualization environment, which includes an analytics-oriented, heuristic inspection kind of evaluation.

3. Understand the underlying fundamental functionalities of novel data visualization techniques/tools: Table B.1 in Appendix B demonstrates how computer technologies and two areas of data visualization fundamentals (i.e.
visual encodings and the use of interaction features) were applied to creating novel/advanced data visualization tools. While they help provide tangible examples of innovative visualization techniques, designers should understand the underlying functionalities (i.e. items in the left column of Table B.1 in Appendix B such as good practices of mapping data to visual variables, data query, adjusting visual attributes such as scales of images) supported by the special techniques of these novel tools and contemplate their underlying usefulness to CM analytics rather than being tempted to incorporate these new visualization techniques in a CM data visualization environment because of their novelty (i.e. applying their use irrespective of the nature of real analytics needs and data structures). For example, Hyperbolic Tree (Figure B.5.17 in Appendix B) visualization that is built on an innovative visual representation of data may enhance the ease of navigating and searching textual information organized in a hierarchical structure. This understanding of the fundamental functionalities of hyperbolic tree visualization (i.e. goal of enhancing usability; for navigating information; applicable to hierarchical categorical data) enables designers to assess its fit of use in support of the analytical tasks at hand (e.g. it is not suitable for identifying cause-effect relationships between CM variables). Designers should refrain from enthusiastically trying/pursuing the use of new visualization techniques simply because of their novelty or popularity.

¹ Heuristic rules are a set of pre-defined criteria, rules, or guidelines that need to be considered when developing a software interface or visualization (Carpendale 2008) and are for looking for problems in the developed products in terms of their compliance to the heuristic rules (Mankoff et al. 2003).

4.
Desired degree of generality/flexibility of a data visualization tool: Current state-of-the-art data visualization tools at one end of the spectrum are single function tools either supporting narrowly defined information search/data analysis tasks or dealing with particular data characteristics (e.g. a Treemap specifically for visualizing hierarchical data, Parallel Coordinate for visualizing multidimensional quantitative data) and at the other end of the spectrum are generic visualization systems (e.g. (Eick 2000; Stolte et al. 2002)) for general public use (i.e. require no programming to access state-of-the-art data visualization features). A specific tool may provide only one or two images and a set of interaction features to interact with these images for supporting a few particular analysis tasks. A generic system is flexible enough for users to choose whatever data dimensions from databases and specify the mapping between data dimensions and visual variables. In other words, many images could be created, if not an unlimited number (e.g. unlimited combinations of data dimensions to include in a visualization compounded by unlimited choices of visual encodings). The level of generality (or uniqueness) of a CM data visualization environment should be in between the two extremes of specific tools and generic systems, and there should be a limited number of thematic images that are particularly useful for CM analytics. However, a certain degree of flexibility can still be maintained through the use of essential interaction features to query background data for viewing different data contents associated with the same image theme, to adjust visual attributes to enhance readability, and to coordinate the use of interaction features for efficiency.

1.7 Research contributions

In this section, the first two of three major research contributions and related subcontributions made are described and explained.
The validation for each of contribution 1 and contribution 2 is done by describing how the research results answer the research questions in this section. While the third contribution is overviewed here, the assessment of the contribution is done through case studies, structured comparison evaluation, and results analysis, as documented in the conclusion chapter.

Contribution 1²: Answers the research question of "What methodology should be used to adopt or adapt state-of-the-art data visualization in order to develop/enhance a CM data visualization environment, which can address shortcomings of the current use of data visualization in CM?"

Contribution 1.1: Formulation of a set of design guidelines/principles in terms of how to apply state-of-the-art data visualization (or visual analytics) techniques to the CM domain data. These guidelines contribute to the application of structured data visualization design/development processes tailored to CM needs (e.g. all visualization designs start with analyzing analytical reasoning needs) and extracting knowledge fundamental to data visualization and CM data that pertains to the design/development of general CM data visualizations. The design guidelines address in a comprehensive manner the following issues that were not covered in previous methods proposed for developing CM information visualization (Lee and Rojas 2009; Shaaban et al. 2001; Songer et al. 2004):

• Recognizing that CM analytic needs govern visualization design (analytics-oriented design) and provide evaluation benchmarks. This recognition is a contribution because "CM analytics-oriented" design differs from traditional "task oriented", "data oriented", or "technology oriented" design approaches (Lee and Rojas 2009; Shaaban et al. 2001; Songer et al. 2004). Consider, for example, visualizing schedule data. The focus of CM analytics-oriented visualization is on "showing the rationales behind a construction plan (e.g.
potential effects (impacts) by or reasons for such a plan)" as opposed to "showing a plan", "visualizing available schedule data", or "visualizing something that available visualization technologies can offer".

• Emphasizing the need to organize CM data representations and transformations for portraying information and knowledge responsive to CM analytics needs, which leads to the recognition of project context/performance data dimensions and the consideration of data granularity and data derivation of those data dimensions. This recognition is a contribution because data reports are usually in textual formats and concepts of project context/performance data dimensions provide a direction for transforming fuzzy messages/insights contained within the data into focused and important CM variables in the form of structured data dimensions, which in turn can be mapped to visual variables thereby creating images that address CM analytic needs.

• Incorporating ideas from state-of-the-art data visualization and characteristics of CM users for designing visual representations of CM data, which include collectively considering conventions/good practices of visually encoding data, use of 2D/3D space, and leverage of interaction features. The recognition and use of these visualization design rules is a contribution because it unburdens designers by allowing them to conduct "satisficing" visual encoding designs so that they can focus on analyzing CM analytics and visualization requirements.

² Contribution 1 deals with the third question under the question heading of "How should a CM data visualization environment be developed?" (see page 16). The first two questions under this heading are addressed in Appendices A & B and section 1.6, which contribute to an in-depth understanding of the shortcomings of current applications of data visualization in CM and how state-of-the-art data visualization can be adopted/adapted to develop a CM data visualization environment.
• Including an evaluation step to ensure users are enabled to glean insights from the implemented visualizations that help answer CM questions, thereby leading to the understanding necessary to take appropriate management actions as required. This evaluation step can be integrated into the design process to assist in refining the design of the visual images and their interaction features. A qualitative kind of self-evaluation combined with contrasting the design against those proposed by others (if any exist) is recommended. (A methodology that was considered but excluded is qualitative evaluation by construction management practitioners. This user study type of evaluation is suitable for inquiring into performance differences of industry end users and degree of adoption by industry, a research topic that is beyond the scope of my research work. For assessing basic concepts or changed paradigms of thinking, it is fraught with difficulties given the action orientation of most CM personnel). This identification is a contribution because it not only addresses the needs of evaluation and evolving design/development, but also sets out guidance as to how to conduct CM data visualization evaluation that fits the reality of the CM domain (both in industry and academia).

Contribution 1.2: Introduction of a top-down design approach for identifying common CM analytic needs and the corresponding visualization requirements. Applying this approach results in the identification of the general/common CM analytic reasoning required by the overall CM function/task and time management function/task domains, and visualization requirements responsive to those analytics needs. Items identified are as follows:

• Overall CM function/task domains
  o Concept of understanding characteristics of construction conditions, construction performance, and construction dependency relations; CM analytics for predicting/planning purposes; CM analytics for monitoring/diagnosing/controlling purposes.
  o General visualization requirements that respond to the foregoing CM analytics such as representing different status value states for condition/performance couplets.
• Time management function/task domains
  o CM analytics for planning/predicting time (mainly for appraising quality of a schedule); CM analytics for diagnosing/monitoring/controlling time.
  o General visualization requirements responding to the foregoing CM analytic reasoning needs such as associating construction conditions that can be represented visually with individual activities.

The top-down design methodology is a contribution because it provides a structured way of identifying the scope and direction of visualization development and formalizing common and essential features of images and associated interaction features required of a CM data visualization environment (e.g. images to present distribution of values of construction conditions/performance measures in terms of primary project context dimensions). Through a top-down approach, an environment architecture for a CM data visualization tool can be established. In this environment architecture, an organization of thematic CM data visualizations, mainly hierarchical from the abstract (e.g. time performance visualization) to the specific (e.g. duration variances displayed by location and activity), can be developed in a consistent way. The major benefits of employing a top-down design approach to design a visualization environment architecture are twofold: 1) the environment developed possesses visualization features for supporting CM analytics that are common to a wide range of CM functions/tasks and hence can be utilized by a variety of CM functions/tasks, and 2) the learnability of the visualization environment is enhanced because different visualizations have consistent image features/attributes, allowing users to quickly apply what they learn from using a few visualizations to operating others.
The first benefit addresses a crucial property of a visualization environment architecture, i.e., the sharing of common design features amongst multiple CM functions/tasks, as shown in Figure 1.7.

Contribution 1.3: Employment of a structured development process that combines a bottom-up design process integrated with design guidelines and a top-down design process tailored to the needs of the CM discipline. The bottom-up development method treats the details of creating an actual visualization that supports a specific CM analytics task. The process embraces CM analytics, specific visualization requirements, and visualization specification analysis, implementation, and evaluation. Its integration with design guidelines ensures that state-of-the-art data visualization is applied. Its integration with a top-down approach, in terms of ensuring that common visualization features are addressed, helps to preserve the environment architecture described in Contribution 1.2. Most importantly, its integration with the design guidelines and the top-down approach means that lessons learned or new visualization features identified that may be common to other visual analytic reasoning needs can be added to the design guidelines or checklists of common visualization features. The concept of the foregoing integrated development process is depicted in the flow chart seen in Figure 1.8.
[Figure 1.8: flow chart. Top-down design/development process for analyzing common CM analytic reasoning needs and their corresponding visualization requirements: common CM analytics needs; visualization requirements (common visualization features); visualization requirements (scope of CM variables); design guidelines. Bottom-up design/development process for analyzing specific CM analytic needs, visualization requirements and specifications, implementing (or configuring), and inspecting: specific CM analytic reasoning needs; specific visualization requirements; specifications; use of data visualization APIs (or generic data visualization systems) to implement (or configure) visual representations and interaction features according to the specifications; application of the implemented visualization to actual project data; inspection against prescribed specifications, requirements, and CM analytic needs.]

Figure 1.8 A two-way structured top-down and bottom-up CM data visualization environment development process

Contribution 2: Answers the research question of "What are the key features of a CM data visualization environment that best reflect the functions expected for it?" Lessons learned from conducting this research contribute to understanding the key features of a CM data visualization environment that pinpoint the needs of assisting CM analytics and allow users to flexibly explore a variety of images useful for CM analytics. These key features can be described as follows:
1. An organization of thematic visualizations, mainly hierarchical from abstract to specific, that are categorized by construction conditions and performance measures under multiple views of a project.
2. Each visualization has common primary and secondary features as summarized in Table 1.2 and Table 1.3 and is designed in accordance with the design guidelines described in Chapter 2 plus the additions presented in Table 1.4.
In Table 1.2 and Table 1.3, the first column contains the common image features that need to be addressed when designing a new visualization; the second column contains settings for common image features that can be chosen to fit the uniqueness of a specific visualization designed to serve particular CM analytics. The last column presents the kind of interaction features corresponding to users selecting or changing the settings listed in the second column. The interaction features shown in the third column, and elaborated upon in Table 1.4 in terms of the ability to change visual encodings, are the "image specific" interaction features. During the bottom-up design process for specific visualizations, the focus of designing "image specific" interaction features is on "what settings of common image features and/or visual encoding cannot be changed". Principles of general interactivity for enhancing readability, which deal with attributes of scales, orientations, and positions that are independent of the visual encoding of data values, are treated as part of the design guidelines and can be observed in Table 1.4.
3. Different visualizations have commonalities in terms of shared visualization features and adherence to design guidelines. They are also unique because they respond to different specific CM analytics (e.g. different CM variables) and hence require different settings of the common visualization features or even new visualization features.

The primary common image features are the most fundamental ones and have been fully incorporated into the implementation of a prototype of a CM data visualization environment. Figures 1.9 to 1.13 illustrate these primary common image features as described in Table 1.2.
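As a reading aid, the five primary common image features of Table 1.2 can be pictured as one settings record per visualization, with an "image specific" interaction feature being any operation that changes a field of that record. The sketch below is illustrative only: the class name `ImageSettings`, its field names, the `locked` set (standing in for "settings that cannot be changed"), and the sample values are all assumptions introduced here, not part of the prototype described in this thesis.

```python
from dataclasses import dataclass, field, replace
from typing import FrozenSet, Tuple


# Hypothetical names throughout; this is not the thesis prototype's API.
@dataclass(frozen=True)
class ImageSettings:
    theme: str                           # row 1: condition or performance measure shown
    context_dimensions: Tuple[str, ...]  # row 2: dimensions the measure is mapped against
    granularity: Tuple[str, ...]         # row 3: level(s) of granularity included
    item_selection: str = "all"          # row 4: which items/values are included
    data_status: str = "actual"          # row 5: planned, actual, or a variance state
    locked: FrozenSet[str] = field(default_factory=frozenset)

    def change(self, **updates):
        """Return a new record; refuse changes to settings the designer fixed."""
        blocked = self.locked.intersection(updates)
        if blocked:
            raise ValueError(f"settings fixed by the designer: {sorted(blocked)}")
        return replace(self, **updates)


# Assumed example: a deficiency record distribution whose theme is fixed.
deficiencies = ImageSettings(
    theme="deficiency record count",
    context_dimensions=("product", "location"),
    granularity=("location set",),
    locked=frozenset({"theme"}),
)
# "Select data granularity": move from location sets (storeys) to locations (rooms).
by_room = deficiencies.change(granularity=("location",))
```

Treating the record as immutable keeps each user selection as a distinct state, which is one simple way a designer could decide, per visualization, which settings stay fixed.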
The identification of the common visualization features and refined design guidelines is a contribution because a CM data visualization environment can be developed in a structured way and tailored to CM user needs to explore visual representations of CM data at different levels of detail in order to identify visual patterns that represent the following insights that could not be readily deduced through other means:
1. Potential impacts of planned values of construction conditions on project context dimensions (e.g. activities, locations, project participants, time windows, pay items) and in turn inference of impacts on future construction performance (i.e. in support of evaluating quality of plans and identifying risk).
2. Potential impacts of planned and/or actual values of construction conditions on project context dimensions (e.g. activities, locations, project participants, time windows, pay items) and in turn inference of whether the planned or actual values of construction conditions have impacts on actual construction performance (i.e. in support of diagnosing reasons for actual performance).
3. Potential reasons for planned and/or actual values of performance measures by project context dimensions (e.g. activities, locations, project participants, time windows, pay items) and in turn inference that variances of performance measures are dependent on certain activities, locations, project participants, etc. (i.e. in support of diagnosing reasons for actual performance).
4. Potential cause-effect relationship amongst construction conditions and performance measures by comparing time stamped planned and/or actual values of them on the time dimension (i.e. in support of diagnosing reasons for actual performance).
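To make the third kind of insight concrete, the sketch below computes a duration variance (actual minus planned) per record and aggregates it over one context dimension at a chosen level of granularity, which is the kind of data a variance image would then encode. It is an illustration under stated assumptions: the record layout, the function name `variance_by`, and the sample values are invented here and are not project data or code from this research. The aggregation choices mirror those listed in Table 1.3 (sum, max, min, median).

```python
from collections import defaultdict
from statistics import median

# Aggregation options named in Table 1.3; the mapping itself is an assumption.
AGGREGATORS = {"sum": sum, "max": max, "min": min, "median": median}


def variance_by(records, dimension, how="sum"):
    """Aggregate (actual - planned) duration over the values of one context dimension."""
    groups = defaultdict(list)
    for record in records:
        groups[record[dimension]].append(record["actual_days"] - record["planned_days"])
    aggregate = AGGREGATORS[how]
    return {key: aggregate(values) for key, values in groups.items()}


# Made-up sample records, for illustration only.
records = [
    {"activity": "Formwork", "location": "Room 101", "location_set": "Storey 1",
     "planned_days": 5, "actual_days": 7},
    {"activity": "Rebar", "location": "Room 102", "location_set": "Storey 1",
     "planned_days": 4, "actual_days": 4},
    {"activity": "Formwork", "location": "Room 201", "location_set": "Storey 2",
     "planned_days": 5, "actual_days": 6},
]

by_storey = variance_by(records, "location_set")               # coarser granularity
by_activity_max = variance_by(records, "activity", how="max")  # worst case per activity
```

Switching `dimension` from `"location_set"` to `"location"` corresponds to the "select data granularity" interaction, and switching `how` corresponds to "select data aggregation method".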
Table 1.2 Primary common CM visualization features: presenting construction conditions or performance measures mapped against primary project context dimensions

Columns: Image features | Setting options for image features | Interaction features for changing option settings

Row 1
Image feature: Image themes by construction conditions or performance measures (scope, time, cost, safety, quality, etc), which are treated as measurement dimensions. Examples of choices of image themes are shown as selectable items in Figure 1.9.
Setting options: Kinds of CM data dimensions treated as construction conditions or performance measures:
• Attributes characterizing project context dimensions (e.g. activity-attributes, product-attributes, location-attributes, organization-attributes, environment-attributes)
• Derived attributes (e.g. productivity)
• Counts of data records related to values of context dimensions (i.e. number of data topics that are related to certain project contexts)
• Other attributes of context dimensions (e.g. daily weather conditions, daily site conditions, daily problems encountered)
Interaction feature: Select visualization

Row 2
Image feature: Image types by project context dimensions against which construction conditions or performance measures are mapped. Examples of different image types (distribution by product and location dimensions vs. distribution by project participant and location dimensions) of the same image theme (as-built record distribution) are shown in Figure 1.10.
Setting options: Kinds of CM data dimensions treated as project context dimensions:
• Definition of construction conditions or performance measures
• Activity
• Location
• Product
• Project participant
• Occurrence time (data version dates can be used as occurrence time if the occurrence time is difficult to know or there is no "occurring behaviour")
• Others (e.g. pay item, environment)
Interaction feature: Select visualization

Row 3
Image feature: Image contents by granularity of context dimensions. Examples of different image contents (record distribution in the location dimension that is at the level of granularity of “location set” vs. “location”) of the same image theme/type (record distribution; record distribution by the location dimension) are shown in Figure 1.11.
Setting options:
• Inclusion of all, singular, or multiple levels of granularity of the project context dimensions
Interaction feature: Select data granularity

Row 4
Image feature: Image contents by items selection of project context dimensions. Examples of different image contents (“all deficiencies” vs. “long lead time deficiencies” record distribution in the location dimension) of the same image theme/type (record distribution; record distribution by the location dimension) are shown in Figure 1.12.
Setting options:
• Inclusion of all, singular, or multiple (group or range) values of the project context dimensions. In addition to direct selection of items, they can be selected by values of attributes characterizing the context dimensions.
Interaction feature: Select data range

Row 5
Image feature: Image contents by data status states. Examples of choices of different image contents by different status states can be selected through a user interface such as the combo box shown in Figure 1.13.
Setting options:
• Non-variance case: planned, actual, and/or planned vs. actual
• Variance case: actual (current) - actual (target), actual (current) - planned (target), and/or planned (current) - planned (target)
Interaction feature: Select data status

Figure 1.9 A user interface for image selection by construction conditions that are grouped under project views (i.e. the tab items such as "process", "as-built"). This user interface only allows users to choose images representing distribution of values of construction conditions in certain project context dimensions.
This figure showcases the primary common visualization feature of “image theme by construction conditions or performance measures” described in the 1st row of Table 1.2.

Figure 1.10 Two record distribution images visualizing number of deficiencies distributed in different project context dimensions: (a) product and location dimensions, (b) project participant and location dimensions. This figure showcases the primary common visualization feature of “image type by context dimensions” described in the 2nd row of Table 1.2.

Figure 1.11 Records distribution images visualizing the number of deficiencies distributed in the location dimension but at different levels of granularity: (a) location set (stories), (b) location (rooms). This figure showcases the primary common visualization feature of “image contents by granularity of context dimensions” described in the 3rd row of Table 1.2.

Figure 1.12 Records distribution images visualizing the number of deficiencies distributed in the location dimension but of different data value ranges: (a) "all" deficiencies, (b) only "long lead time" deficiencies. This figure showcases the primary common visualization feature of “image contents by items selection of project context dimensions” described in the 4th row of Table 1.2.

Figure 1.13 The user interface (the bottom-left combo box) for adjusting data status (i.e. planned, actual, and planned vs. actual work area received percentage). This figure showcases the primary common visualization feature of “image contents by data status states” described in the 5th row of Table 1.2.

Table 1.3 Secondary common CM visualization features: secondary feature consideration after addressing the primary visualization features

Columns: Image features | Setting options for image features | Interaction features for changing option settings

Row 1
Image feature: At least two basic image formats
Setting options:
• Distribution of values of construction conditions or performance measures in project context dimensions. Examples of this image format are discussed in design cases 1 and 2 of Chapter 4.
• Distribution of values of several construction conditions or performance measures in the occurrence time dimension and definition dimension (i.e. definitions of construction conditions or performance measures). Examples of this image format are discussed in design case 3 of Chapter 4.
Interaction feature: Select visualization

Row 2
Image feature: Image types by non-variance or variance values
Setting options:
• Visualizing non-variance values
• Visualizing variance values
Interaction feature: Select visualization

Row 3
Image feature: Image contents by data versions (dates)
Setting options:
• Non-variance case: original data version (date), and/or several data update versions (dates)
• Variance case: one or several paired data versions (dates)
Interaction feature: Select data versions (dates)

Row 4
Image feature: Image contents by how to aggregate measurements
Setting options:
• Aggregation not allowed
• Sum, Max, Min, Median
• Others
Interaction feature: Select data aggregation method

Row 5
Image feature: Image contents by how to compute variances between the planned and actual values of measurements
Setting options:
• No variance computation
• Variance computation method
Interaction feature: Select variance computation method

Row 6
Image feature: Encode "holiday"/"non-working day" if the time dimension is involved
Setting options:
• Encode "non-working day" and by what working calendar
• Do not encode "non-working day"
Interaction feature: Select display options

Row 7
Image feature: How to include values of context dimensions
Setting options:
• Include all values of a context dimension (e.g. all activities modeled, continuous time points)
• Only include values of a project context dimension that have corresponding values of construction conditions or performance measures (e.g. only activities that have computed variance values)
Interaction feature: Select display options

Row 8
Image feature: Enhancing the visual grouping of certain items of context dimensions that are not visually encoded by spatial position
Setting options:
• Use connecting lines (e.g. using lines to connect bars representing activities of the same project participant)
• Use normal visual encoding such as coloring (e.g. bars representing activities of the same project participant have the same color)
Interaction feature: Select display options

Table 1.4 Design guidelines in addition to those presented in Chapter 2

Columns: Guidelines subjects | Guidelines

Subject: Choice of data representation/transformations
Guidelines:
1. The time dimension can be treated as the context dimension (i.e. occurrence time dimension) or as time performance measures.
2. Create a null item as one value of a project context dimension, because values of a construction condition may be associated with items of one context dimension but not associated with another context dimension (and are therefore associated with Null).

Subject: Choice of visual representations
Guidelines:
1. Choice of visual marks for different ways of value assignments for measurement dimensions
• Singular valued: bars (i.e. lines), points
• Multiple valued: points
• Value range: Gantt bar
2. Suggested visual encodings
• If the visualizations involve visually encoding more than one data status state, two colors can be used to encode planned and actual status; three other colors can be used for three variance status states.
• For image format 1 described in the first row of Table 1.3: Map context dimensions and the data version (date) dimension to positions on three orthogonal coordinates; map construction conditions/performance measures to Z axis (i.e. the coordinate that goes upward). How to encode other data dimensions (e.g. differentiating levels of granularity, data status states) will be treated on a case by case basis.
• For image format 2 described in the first row of Table 1.3: Map construction conditions/performance measures to Z axis (i.e. the coordinate that goes upward), map the time dimension (e.g. events occurring dates, data version dates) to positions on the axis perpendicular to Z axis. How to encode other data dimensions (e.g. context dimension other than the time dimension, differentiating levels of granularity, data status states) will be treated on a case by case basis.
• When multiple project context dimensions are mapped to spatial positions, visual marks representing all items of these dimensions at different levels of granularity will be positioned. The ordering of visual marks follows the rules of: 1) first ordered by different project context dimensions (the ordering is to be specified by users), 2) then ordered by dimensions of different levels of granularity (from the coarsest to the finest), 3) then ordered by data values of context dimensions (the ordering is to be specified by users if the type of value scale is categorical).
• Visually differentiate project context dimensions of multiple levels of granularity and different project context dimensions by applying visual variables to their corresponding chart gridlines, labels, and/or strips.

Subject: Choice of interaction features
Guidelines:
Image specific interaction features: Provide interfaces for users to change the settings of common visualization features and/or visual encodings. During the bottom-up design processes, designers can determine what settings and/or visual encoding cannot be changed.
General interactivity:
• Global 3D virtual space and local optimum image format
o Allow the users to globally change the position, orientation, and scale of the 3D virtual space
o Allow users to locally change positions, orientations, and scales of individual visual representations
o Optimum orientations and scales are fixed for elements (e.g. labels, titles) in a visual representation
• Coordinating image specific interaction features: It is suggested that the coordination be specified by users. Interfaces and mechanisms should allow users to specify what visual representations to include for interaction coordination and what kind of shared image specific interaction features should be coordinated.

Subject: Evaluation
Guidelines:
Method and focus of evaluation 1: evaluation that is done by the designer/developer and needs a shorter time to perform after implementation
• Conformance to specifications and requirements
• Identify features required by the CM analytics not addressed in the requirements/specifications analysis
• Improvement on usability, mostly for ease of use of interaction features and good visual effects (e.g. visual encoding does help provide visual effects of data grouping)
• Identify new lessons learned or new visualization features common to the visualization environment for updating design guidelines and checklists of common visualization features
Method and focus of evaluation 2: evaluation that is done by users and needs a longer time to perform after deployment
• Identify which visualizations in the organization of thematic visualizations are frequently used
• Identify new visualizations or new visualization features needed

Contribution 3: Answers the research question of "How does the use of a CM data visualization environment help conduct CM analytics that cannot be done or which are difficult to do with current data reporting practices?"

Contribution 3.1: Demonstrated and analyzed, through case studies, how the ability to carry out CM analytics useful for dealing with the CM tasks at hand may be enhanced (i.e. human judgment enhancement from the "light color" in Figure 1.3 to the "dark color" in Figure 1.7) using data reports in visual form that are responsive to these analytics tasks.

Contribution 3.2: Demonstrated and analyzed, through case studies, how CM analytics ability may be enhanced using interaction features and an environment architecture that allows users to flexibly explore CM data collected/computed for a range of CM tasks/functions presented in visual form.
Through the demonstration cases described in section 5.3 (sections 5.3.1 to 5.3.3) of the conclusion chapter, it is shown that a CM data visualization environment allows users to identify more analytic reasoning artifacts, and to do so more quickly, than when examining data in its original textual or tabular form. Furthermore, through the analysis conducted on the demonstration cases, it was found that certain features of a CM data visualization environment, which are related to “reports in visual forms”, “use of interaction features”, and “an environment architecture” and which are lacking in the current tabular data reporting mechanism, are the ones that contribute to enhanced analytical reasoning performance. How these features differ from the ones that accompany tabular data reporting and traditional data images to contribute to enhanced CM analytics was analyzed and explained in a concrete way by referring to the demonstration cases and their corresponding images. A detailed discussion can be found in section 5.3.4 of the conclusion chapter. The main features of a CM data visualization environment that contribute to the enhancement of existing CM analytics capabilities can be summarized as follows:
1. By presenting data in visual form, salient visual patterns representing values and patterns of behaviour of construction conditions or performance measures can be instantly observed.
2. Images that depict the distribution of values of either construction conditions or performance measures in various context dimensions provide helpful insights as to whether or not their causes or the impacts they generate can be inferred.
3. A thematic visualization representing CM data, which is collected/computed for a certain CM function/task, can also be utilized for CM analytics supporting other CM functions/tasks. Many such thematic visualizations have been organized for CM_IS users to access and conduct flexible analytical reasoning.
4. Users can adjust/select visual formats of data presentations of a specific thematic visualization on demand to meet their CM analytics needs.
5. Users can adjust/select data contents in terms of granularity of data sets of the visualization chosen on demand to meet their CM analytics needs. This involves choice of levels of granularity of context dimensions and aggregations of values over context dimensions of different levels of granularity.

1.8 Structure of the thesis

Chapter 1 Introduction: This chapter describes the background (problems and proposed solutions), goals (questions, scope or focus, assumptions), methodologies, literature review (presented in the form of appendices), and contributions of the research. The "contributions of the research" part of the chapter focuses on describing and justifying the first two main contributions, which piece together research findings identified in Chapters 2 through 4 to provide a complete view of answers to the first two research questions. The third contribution, related to answering the third research question, is also overviewed. The assessment of that contribution is done through case studies, structured comparison evaluation, and results analysis, as documented in the conclusion chapter.

Chapter 2 Visual Representation of Construction Management Data (a version of a published paper): This chapter describes the phase 1 research work related to developing guidelines and principles for designing a data visualization tool tailored to CM use. Research findings provide partial answers to the first and second research questions.
Chapter 3 Design of a Construction Management Data Visualization Environment: a Top-Down Approach (a version of a published paper): This chapter describes the phase 2 research work related to developing a CM data visualization environment using a top-down approach in order to identify visualization requirements for an overall CM data visualization environment that can serve CM analytics in general. Research findings provide partial answers to the first and second research questions.

Chapter 4 Design of a Construction Management Data Visualization Environment: a Bottom-Up Approach (has been submitted for publication): This chapter describes the phase 3 research work related to developing a CM data visualization environment using a bottom-up approach in order to identify, implement, and evaluate new visualization features of a CM data visualization environment in support of specific CM analytics including schedule variance analysis, product/location attribute analysis, and reasons for time performance analysis within the CM function of time performance control. Research findings provide partial answers to the first and second research questions.

Chapter 5 Conclusion - Summary, Answering the Research Questions, Contributions, Future Work: This chapter concludes the research, and incorporates reporting on the fourth and final phase of the research work in terms of answering the last research question. Six demonstration and analysis cases are used to: 1) show and assess how the use of a CM data visualization environment helps conduct CM analytics that cannot be done or which are difficult to do with current data reporting practices, and 2) provide an indicator of the degree of validity and generality of the research findings in answering the research questions set forth in Chapter 1.
Appendix A Data Visualization in Construction Management: This appendix provides a full literature review of the use of data visualization in construction management (including the data visualization capabilities of current commercial CM information systems).

Appendix B Overview of State-of-the-Art Data Visualization: This appendix provides an overview of state-of-the-art data visualization technologies in order to understand the current state of applying data visualization technologies to facilitate CM visual analytics in support of CM functions.

Chapter 2 Visual Representation of Construction Management Data [3]

2.1 Introduction

Construction project participants are confronted with the need to make high quality and timely decisions based on the information content that can be deduced from the very large data sets required to represent the various facets of a project through its development life cycle. How best to extract information from such data sets is a question that preoccupies researchers and practitioners alike across a number of disciplines, including construction. One approach to reasoning about data is visual analytics, the science of analytical reasoning facilitated by interactive visual interfaces (Thomas and Cook 2005). We believe it has special appeal to the construction industry because of its visual orientation, and because visual analytics has the potential to be directly usable by construction practitioners without a requirement for specialist knowledge or assistance.

Visual analytic models provide the building blocks for development of an interactive visualization environment which is tailored to the special needs of a particular industry and user audience profile in order to help personnel glean insights from large and complex data sets.
In this chapter, we use the term visual analytics environment to refer to a computerized information system which treats pre-coded scenes that consist of one or more visual representations and accompanying interaction features, and a user interface that assists users in interacting with data and pre-coded images for analytic reasoning purposes. Further, we use the term visual analytics model to refer to the specifications of requirements of components for implementing a visual analytics environment. Also with respect to terminology, a distinction is drawn between the terms visual analytics and data visualization, with the latter referring to the use of computer-based, interactive visual representations of data to amplify cognition (Card et al. 1999). In effect, data visualization corresponds to one of the components of visual analytics.

[3] A version of this chapter has been published. Russell, Alan. D., Chiu, Chao-Ying., Korde, Tanaya. (2009). "Visual Representation of Construction Management Data." Automation in Construction, 18(8), 1045-1062.

The design of effective visual analytics models is built on four pillars: (i) the purpose(s) of the analytical reasoning; (ii) the choices of data representations and transformations; (iii) the choices of visual representations and interaction technologies (i.e. data visualization); and, (iv) the production, presentation, and dissemination of the visual analytics findings. Of these four pillars, data representations and transformations constitute the foundation on which visual analytics is built, while the use of visual representations and interactions to accelerate rapid insights into complex data is what distinguishes visual analytics software from other types of analytics tools (Thomas and Cook 2005).
A data representation is a structured form, which is generated from the original raw data and which retains the information and knowledge content within the original data to the greatest degree possible (Thomas and Cook 2005). A data transformation deals with transforming data into varying levels of abstraction or deriving additional data that has new semantic meaning. Visual representations translate data into a visible form that helps the analyst perceive salient aspects of the data quickly. Interaction technologies support dialogue between the analyst and data (Thomas and Cook 2005). Thus, the second and third pillars identified previously constitute the core of a visual analytics model.

Described in this chapter are aspects of our work related to the application of visual analytics to the domain of construction management. Our particular focus is on how visual representations and interaction technologies, in concert with a nine-view data representation of a construction project (e.g. physical, process, cost, as-built, quality, change, organization/contractual, environmental, and risk views) (Russell and Udaipurwala 2004) that supports a range of construction management (CM) functions, can improve the construction management process. Two research hypotheses guide our work: (i) the application of visual analytics to CM functions (e.g. change order management, quality management, drawing control, schedule analysis, etc.) improves the CM process through enhanced understanding of project status and reasons for it, improved communication amongst project participants, assistance with the detection of potential causal relationships, and improved decision making; and, (ii) a visual analytics environment can be developed which is sufficiently general to serve the needs of a broad range of CM functions. Our concern is with the practical application of visual analytics, with practical meaning the use of visualization technologies that are compatible with the constraints associated with the construction industry (e.g. a heterogeneous user audience with highly variable education backgrounds, a focus on action and results as opposed to exploration, and ease of use without the requirement for specialist assistance). Implications of the foregoing statement are at least two-fold: (i) the user audience is comprised of generalists as opposed to specialists; and (ii) usability by practitioners of the visual representations developed depends on an implementation strategy that encodes the representations in a ready-to-use manner along with the ability to interact with them to extract the greatest meaning possible. To illustrate the concepts and principles presented, attention is focused on change management and its corresponding change-data view, which also interacts with one or more of the other eight project-data views. Nevertheless, the concepts and principles described are broadly applicable to other CM functions.

Designing good visual representations involves several challenges: fit of a visual representation with the characteristics of users within a given industry (e.g. visual perception and cognitive abilities); scalability of a visual representation; and, the degree to which a visual representation is useful. With the aid of interaction techniques and careful arrangement of data representations and transformations, these challenges can be addressed. In the construction world, there can be multiple target audiences, and the type of visual representations used may vary from one audience to another depending on their comfort with 2D, 3D, and more complex representations. The scalability issue of visual representations also exists due to the volume of data generated for a large scale, complex, construction project.
Thus, while construction data representations have many dimensions which need to be translated into visual representations, one must work with, at most, a three-dimensional visualization space. Therefore, one is confronted with the need to consider various combinations of dimensions to develop the understanding required for analytical reasoning. Considering the foregoing issues, substantive research and design challenges must be addressed in formulating data representations and transformations that reflect knowledge and information in the context of the analytic reasoning tasks of interest and crafting them into visual representations with relevant interaction techniques so that expert to novice users can extract important information hidden in the data. Success in meeting these challenges can be measured in terms of the breadth of analytical reasoning supported, including the number and total domain value of insights gleaned by industry users (i.e. the total sum of the significance of the insights generated) (Saraiya et al. 2004), and the ease of use of the visualizations supported.

Of several questions that need to be addressed in pursuit of proving the research hypotheses described previously, three are examined in this chapter, with emphasis being placed on the first two.

Q1: What principles and guidelines should be used for designing a visual analytics environment for a broad-based treatment of construction management functions, in terms of analytical reasoning tasks supported, data representations and transformations supported, and corresponding visual representations and interaction features?

Q2: Can the usefulness of individual visual representations designed in response to specific analytic reasoning tasks be perceived by users, thus lending support to the hypothesis that the application of visual analytics to construction management functions improves the CM process?
Q3: How should a visual analytics environment be designed and implemented so that it is responsive to the realities of the construction industry and satisfies the criterion or test of practicality?

The remainder of the chapter is structured as follows. A brief overview of the general motivation that drives research on visualization is provided. This is succeeded by a short description of recent construction data visualization work. Important principles or guidelines related to visual representations are then discussed, with emphasis on the viewing dimensions of data and thus how it may be portrayed, features desired of a visualization environment, and how the usefulness of various visual representations may be evaluated and validated. Our interest lies with examining collections of entities as opposed to individual entities (e.g. a register of change orders as opposed to an individual change order). Then, visual representations of change order data are explored in detail in order to examine issues associated with the questions posed and application of the principles/guidelines set out in the previous section. The chapter concludes with a discussion of findings from work performed to date, and their extension to other construction management functions and data types.

2.2 Motivation for use of visualization

Representing data in a visual format helps amplify cognitive ability or reduce complex cognitive work (Card et al. 1999; Keim et al. 2006). Humans can derive overview information from data better and faster if it is presented in a suitable visual format rather than as textual/numerical scripts or tables. This is because visual features such as spatial position or color exhibit lower similarity to one another than do text or numbers, which is one of the key reasons why human beings can be visually attentive to certain symbols (Duncan and Humphreys 1989) and identify visual patterns prior to conscious attention (Ware 2004).
Another explanation states that this is because large amounts of visual/diagrammatic information can be processed by the human visual perception system in parallel as opposed to the serial processing required for textual or numeric information (Larkin and Simon 1987; Ware 2004). Based on these theories, various attributes of the data of interest are mapped against certain features in the visual representation like color, size, shape, location or position, thereby reducing the need for explicit selection, sorting and scanning operations within the data (Shneiderman 1994; Tufte 1990). These techniques thus tailor the data to be retrieved, such that the large arrays of neurons in the eyes can rapidly extract features of visual representations and distinguish salient visual patterns (Ware 2004) that correspond to patterns hidden in the data. This helps the target audience achieve insights faster and better as to the information content of a data set that may otherwise be concealed or not easy to comprehend from its representation in tabular or text form.

In current state-of-the-art computerized visualization techniques, data representation is often coupled with real-time interactive tools like zooming and filtering, details-on-demand windows and dynamic query fields, which allow users to browse through and study the represented data. Emphasis is also placed on the rapid filtering of data to reduce the result sets (Ahlberg and Shneiderman 1994). This is called visual data exploration. Thus, visualization can be described as a two-fold process of data presentation and data exploration. Effective visual representation schemas assist the efficient scanning of different parts of an organization’s or project’s database, allowing users to instantly “identify the trends, jumps or gaps, outliers, maxima and minima, boundaries, clusters and structures in the data” (Brautigam 1996).
Exploration tools allow continuous interaction between users and the graphic displays by offering scope for “constant reformulation” of search goals and parameters as new insights into the data are gained (Ahlberg and Shneiderman 1994). They provide a continuously updated information platform to users, thereby aiding the decision making process.

Of the several references reviewed that describe frameworks for classifying visualization techniques, we mention one in particular because of its potential applicability to the construction domain. Specifically, to reduce the ‘complexity inherent in choosing a visualization technique for a particular application context’, Lengler and Eppler (Lengler and Eppler 2007) compiled a pre-selected group of a hundred visualization techniques thought to be applicable to management functions in the form of a periodic table analogous to Dmitri Mendeleev's periodic table of elements. Through this structure, the authors highlighted the fact that for a given requirement there need not be just one appropriate visualization method. Rather, there is the potential to employ a combination of different methods to enhance understanding. Such an approach may be particularly appropriate for the construction domain, as it has the potential to enhance practicality and ease of use of visual representations for different management functions.

2.3 Data visualization in construction

In carrying out the literature review on visualization techniques, we also undertook to identify the extent to which they have been applied to the field of construction. The majority of the work described in the literature has focused on visualizing the spatial and temporal aspects of construction project data, with very limited emphasis being placed on the visualization of abstract, non-spatial data.
A rich literature has developed over many years dealing with 2D, 3D, 4D and even nD visualization of the physical artefact to be constructed (Collier and Fischer 1995; Heesom and Mahdjoubi 2004; McKinney and Fischer 1998; U.S. General Service Administration 2008). For example, there is a growing use of 3D and 4D models to minimize the potential for design and construction errors in the construction product, to identify critical space and time during construction (Dawood et al. 2005), to determine the most suitable construction methods and sequence, and to monitor construction progress (e.g. (Sriprasert and Dawood 2003; Staub and Fischer 1998)). For visualizing some aspects of a project’s data, Song et al. (Song et al. 2005) proposed a 3D model-based project management control system where the visual platform (i.e. the 3D building model) itself serves as a construction information delivery platform. The system enables the user ‘to show a holistic picture of a project by applying the multiple project data sets to the geometric attributes (such as shapes, faces, and edges) of the 3D building model components through color-tone variation and motion’. The proposed control system uses a Project Dashboard as the user control interface, allowing the user to freely choose the sets of data to apply to the different visual attributes of the 3D model. Although this approach makes it relatively easy to visually associate project control data with components of the physical product, how best to generate insights from abstract construction data that require representations of salient spatial patterns together with one or more of temporal or organizational patterns (e.g. clusters, trends, and anomalies) is not obvious, because spatial positions have been dedicated to representing only the geometric data of the built product.
In contrast to visualizing the physical artefact to be built for purposes of constructability reasoning or workability of the methods selected for its construction, or even accessing relevant project information through the mechanism of a 3D model, our primary focus is on the visualization of abstract construction management data in support of exploratory data analysis and the application of project participant tacit knowledge. Specifically, our interest is with collections of entities (e.g. change orders, drawings, RFIs (Requests for Information), etc.) and their association with other entity collections, with the definition of a collection being determined by the choice of values for one or more properties of an entity associated with a management function. Somewhat surprisingly, there is very little literature that addresses visualization of construction data (Rojas and Lee 2007), particularly with respect to how data visualization can play an important role in aiding analytic reasoning for a range of CM functions. This observation results in both an opportunity and a challenge for researchers exploring the use of data visualization and visual analytics for construction. The opportunity is that it is a relatively virgin field of inquiry. The challenge is that, when positing ideas either for the design of visual images themselves or for complete visualization environments, there is very little with which to compare and contrast in order to validate those ideas. Consequently, it is important to set out some basic principles and guidelines against which one can assess the usefulness of the image and visualization environment designs proposed. A project’s database is voluminous, containing data that varies from textual form such as drawing specifications and contractual clauses, to quantitative data like number of change orders and related properties (e.g.
value, timing, number of participants), RFIs issued and turnaround times, SIs (site instructions), correspondence, photos, drawing control data, planned and actual schedule data, weather conditions on site, and cost breakdowns. The data is generally time and location variant and originates from or affects multiple project participants. The sheer volume and nature of the data pose significant management challenges. Further complicating these challenges is the observation that construction data is often poorly organized because it lacks proper grouping and sub-grouping, which can lead to missed opportunities to associate related data or facts, and more often than not it is incomplete. For effective management of a project, efficient handling, monitoring and control of all project data is essential. Buried within this data are important messages which relate to the reasons for performance to date, but extracting this information from any database, especially a poorly organized one, can be very difficult (even if a database is well organized, linkages amongst different data items may not be obvious – data visualization may in fact help one forge relevant links). As a consequence, explaining different aspects of construction project performance often qualifies as a classic case of “data rich - information poor” problems (Songer et al. 2004). Thus, the massive amount of data available to management personnel results in information overload (Songer et al. 2004) unless it is accompanied by a high level of organization and accompanying reporting mechanisms.

Songer et al. (Songer et al. 2004) explored the use of Treemaps and other visual aids like scatter plots and histograms for assessing cost performance. They described an iterative process of structure-filter-communicate while considering level of detail, density, and efficiency of data representation. Vrotsou et al. (Vrotsou et al.
2008) applied Time Geographical methods to visualize work sampling data to allow analysts to understand better the distribution of activities and the interdependencies amongst them. For assessing schedule quality and aiding communication (e.g. feasibility, matching production rates, avoiding trade stacking, achieving work continuity, making clear the location sequence of work, etc.), especially for projects characterized by repetitive work, Russell and Udaipurwala (Russell and Udaipurwala 2000a; Russell and Udaipurwala 2000b; Russell and Udaipurwala 2002) and Zeb et al. (Zeb et al. 2008) demonstrated the value of using linear planning charts. Combined with ancillary images pertaining to the distribution of resource usage in time and space, additional insights on the quality of a schedule can be gleaned. Zeb et al. (Zeb et al. 2008) also explored the visualization of as-built data in terms of job site conditions encountered, problems associated with individual activities, and the juxtaposition of site condition parameters with daily activity status in support of causal reasoning. Zhang et al. (Zhang et al. 2009) used an integrated building information system and digital images captured on site to semi-automate the calculation of progress measurements (e.g. cost and schedule variance) for items of a work breakdown structure and then facilitated their visualization using data filtering techniques (i.e. single work package selection) and a composite of images to represent various progress measurements.

A limitation of work to date on abstract construction data visualization as opposed to physical product visualization is that it is mainly exploratory rather than systematic in nature, with limited breadth in terms of the type of data and information entities examined, management functions examined, and guiding principles for designing relevant visual images.
Thus, while such individual explorations are useful, there has been a lack of an extensive program of research directed at determining what roles visualization can play across multiple functions using a common framework of principles, and what properties should be present in a visual analytics environment tailored to visualizing construction data.

2.4 General principles of visual analytics design processes

The beneficial application of visual analytics begins with understanding the purposes of the analytical reasoning involved in conducting the management functions of interest. This understanding in turn provides guidance as to what data representations and transformations are desired. Then, based on the structure of these representations and transformations, data can be collected or derived. Lastly, with structured data at hand, strategies of mapping the data onto visual representations can be explored while considering the limitations on visualization space, differences in end user cognitive and visual perception abilities, and interaction techniques available. The associated design process is an iterative one that integrates an evaluation process which captures and incorporates feedback from the intended user audience. In the following subsections, principles of conducting the aforementioned steps in the visual analytics design process as applied to construction management functions are explained. They have been gleaned from an extensive review of the literature and from hands-on design and exploration of visual images for various analytic reasoning tasks for a range of management functions. Later in the chapter they are applied in the context of change order management; nevertheless they are broadly applicable to a range of functions.

2.4.1 Understanding the purposes of analytical reasoning

Different project managers have different thinking styles, experiences, and knowledge (Tullett 1996).
Therefore, it is difficult to predict the steps a person takes to explore, acquire, organize, and use information to assist analytical reasoning (Stolte et al. 2002; Tullett 1996). However, in general, the analytical reasoning involved in construction management is about gaining understanding from the perspectives of different project context dimensions, the characteristics of construction conditions (e.g. constraints, requirements, environment) and construction performance dimensions (e.g. time, cost, quality), and then confirming/exploring how construction conditions and construction performance are interrelated – i.e. identifying potential causal relationships between the two. In essence, the focus of analytic reasoning is on assessing conditions and performance and communicating findings to different audiences in forms that facilitate interpretation. Condition and performance characteristics can be described at three different levels: overall characteristic (i.e. overall qualitative pattern); local characteristic (i.e. local qualitative pattern); and, individual characteristic (i.e. single value). This somewhat oversimplified categorization represents a generalization from the authors’ distillation of problem solving intentions observed in past CM research and project management principles. This distillation analysis, along with a more detailed taxonomy of analytical reasoning involved in construction management functions across different construction phases, is left for extended discussion in chapters 3 and 4. Nevertheless, suffice it to say that identification of the primary purposes (assessing and communicating) to be served by analytical reasoning is very important as it provides guidance and focus for the design of the components of a visual analytics model (e.g.
a collection of useful pre-coded images presenting various construction conditions, construction performance dimensions, and possible causal relationships amongst them), and serves as a benchmark for evaluating the efficacy of a design (Amar and Stasko 2005).

2.4.2 Organizing data representations and data transformations

In response to the purposes of analytical reasoning outlined in the foregoing, we identified several context dimensions (e.g. time (when), space (where), responsibility (who), physical system/component (what), work environment (natural & man-made work conditions)) for representing a project’s context. Each context dimension can be characterized by a number of quantitative and qualitative attributes to describe planned versus actual construction conditions. Further, one instance of a context dimension can be interrelated with another instance of the same dimension through the sharing of the same value for other dimensions (e.g. two activities can share the same time, space, responsibility and work environment). Complementing context dimensions are performance dimensions which include measures such as time (how long), cost (how much), quality (e.g. number of deficiencies), safety (e.g. days and man hours lost to accidents), and scope (e.g. value of change orders). The notion of characterizing a construction project in terms of both quantitative and non-quantitative context dimensions and quantitative performance dimensions provides a cornerstone for forming structured data representations that reflect the information and knowledge needed by construction personnel.

Data transformations are directed at qualitative abstractions and quantitative aggregations and disaggregations of both context and performance dimensions to reflect different levels of granularity, and at deriving new semantically meaningful dimensions.
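To make the distinction between context and performance dimensions more concrete, the following is a minimal sketch only, not the data model used in this work. The record type, field names, and the location-to-group mapping are all hypothetical, and cost is used as the single performance measure being aggregated at two levels of granularity along the space dimension.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ActivityRecord:
    """One activity described by context and performance dimensions (hypothetical fields)."""
    # Context dimensions: when, where, who, what
    start: date
    location: str
    responsibility: str
    component: str
    # Performance dimensions: how long, how much
    duration_days: int
    cost: float


# Assumed (illustrative) mapping from individual locations to a location group.
LOCATION_GROUP = {
    "2nd floor": "superstructure",
    "3rd floor": "superstructure",
    "foundation": "substructure",
}


def aggregate_cost(records, key):
    """Sum the cost dimension over whatever context value `key` extracts from a record."""
    totals = defaultdict(float)
    for record in records:
        totals[key(record)] += record.cost
    return dict(totals)


if __name__ == "__main__":
    records = [
        ActivityRecord(date(2010, 1, 4), "2nd floor", "GC", "slab", 5, 100.0),
        ActivityRecord(date(2010, 1, 11), "3rd floor", "mechanical", "ducts", 3, 50.0),
        ActivityRecord(date(2010, 1, 4), "foundation", "GC", "footing", 7, 200.0),
    ]
    # Fine granularity: cost per individual location.
    print(aggregate_cost(records, lambda r: r.location))
    # Coarse granularity: cost per location group.
    print(aggregate_cost(records, lambda r: LOCATION_GROUP[r.location]))
```

Passing the grouping as a function keeps the same aggregation usable for any context dimension (for instance responsibility or component), which is one simple way to express aggregation and disaggregation over a shared record structure.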
As an example of a data transformation dealing with level of granularity, the space dimension can be expressed at different levels of detail, such as a sub-location (e.g. east wing), an individual location (e.g. 2nd floor), or a group of locations (e.g. all superstructure locations). An example of a data transformation dealing with data derivation is the computation of a site or work location congestion index, which is defined as work space area divided by resource usage rate.

The aforementioned conceptual principles regarding organizing data representations and data transformations are based on the nature of CM analytical reasoning requirements and the characteristics of structured data (e.g. measurement scales of data values, data items being relationally and hierarchically related, etc.). Therefore, they are independent of construction project information models proposed by several researchers (e.g. (Abudayyeh and Al-Battaineh 2003; Froese 1996; Karim and Adeli 1999; Kim and Liu 2007)). However, because the essence of exploratory CM analytical reasoning is to be able to examine construction conditions, construction performance, and causal relationships amongst them for various project context dimensions, an integrated construction management information model (e.g. (Russell et al. 2004)) is essential to support analytical-reasoning-based construction data representations and data transformations.

2.4.3 Designing visual representations and interaction features

When trying to design visual representations and their interaction features, four major constraints need to be taken into account: (i) purposes of analytical reasoning; (ii) characteristics of the construction domain (e.g.
the rather broad spectrum of user cognitive and visual perception abilities encountered in the construction industry, and limited resources such as time and cost for conducting the analysis and communicating the results); (iii) space limitations on the visual display and the multidimensional data representations that need to be presented; and (iv) the extent to which the user can interact with data and its visual representation. With respect to the last constraint, its removal by maximizing interaction capabilities helps to cope with the other constraints, thereby facilitating the design and use of more flexible visual representations that best meet the purposes of analytical reasoning.

To date we have identified three main general rules of thumb for initiating draft designs of visual representations. We observe that such rules have not been systematically and integrally discussed in the CM literature:

1. Follow conventions and good practices: Use effective visual encoding principles (i.e. choice of encodings depends on measurement scales of data values (Cleveland and Robert 1984; Mackinlay 1986)); use conventions and good practice of graphing data (Bertin 1983 (originally published in French in 1967); Brath 1999; Cleveland 1985; Schmid 1983; Tufte 1986; Unwin 2008; Wainer 1997; Wilkinson 1999); and, use conventional graphics elements (e.g. orthogonal coordinate layout, points, lines, bars, pies) because natural standards and organized standards of graphics (Schmid 1978) have formed people’s basic graphics literacy over the years, which is one of the factors explaining how effectively people interpret visual representations of data (Shah 2005).

2.
Possible use of virtual 3D space: Equipped with interactive computer graphics, a tool far more advanced than the pen and paper used by Playfair to create the first static 2D bar chart some 200 years ago (Beniger and Robyn 1978), researchers have started exploring the opportunity of utilizing and enhancing its power in order to creatively generate dynamic 3D visualizations to assist data analysis (Cook 2009) and information search (Card et al. 1991). This opportunity should not be overlooked for designing visual representations of CM data, especially for purposes of exploratory data analysis. Recently, an innovative design methodology has been proposed (Brath 2003) in which 3D virtual space “houses” several 1D, 2D, and 3D statistical graphics representing data of several dimensions in order to treat one or more analytical functions in one image, to promote image aesthetics, and to adapt to the evolving desire for and comfort with 3D images by users. We speculate that addressing more than one analytical function in a single image as opposed to in several images may reduce the time needed to analyze multidimensional data. This is because the aesthetic appeal of and human preference for 3D scenes may prolong users’ patience (Cawthon and Moere 2007) and thus keep management staff more attentive/engaged. As well, communication may be enhanced (Brath et al. 2005). Therefore, this design approach should be explored for its potential use in CM data analysis applications, similar to what has been done for advancing the use of visualization of geometric data of 3D product models. However, the 2D version of the 3D image should also be produced in order to accommodate users who are more inclined to use 2D visualization.

3. Use interaction features to solve graphics problems encountered in the design of visual representations: Interaction features that allow users to interact with data and its visualization (e.g.
data query, view navigation, image editing, etc.) can deal with major issues encountered in the use of static graphics such as: image readability problems (e.g. occlusions, illegible labels); the inability to present large and multidimensional data in just one image, making it difficult to thoroughly examine datasets from different perspectives and levels of detail; and, the inability to change visual encoding to accommodate user visual perception preferences. Thus, with the leverage provided by interaction features, flexible designs are possible and rosters of designs can be supported so as to not be constrained by these issues. For example, given space limitations of the display medium, it is difficult to have visual representations of a large data set in which users can observe both overall patterns and detailed data simultaneously. With the use of an interaction feature for coordinated multiple views, one can design a visual representation to have two or more images, with one showing overview patterns and one or more showing details of data that users select in the overview image in order to know their exact values (Ball and Eick 1996). On the other hand, by using the interaction feature of differentially scaling an image on user demand, one can design a visual representation requiring only one image in which a focus (showing details of the data of interest) plus context (showing overall pattern of the data) effect can be observed (Rao and Card 1994). Classifications of the generic functionality of interaction features can be found in (Chuah and Roth 1996; Unwin 2006; Yi et al. 2007).

2.4.4 Design evaluation

The final product of the iterative design process is the set of implementation requirements for the components of the visual analytics environment for the management functions of interest. Two important aspects for validating the final product are usability and usefulness (Grinstein et al. 2003; Plaisant 2004; Scholtz 2006).
Usability refers to ease of use, while usefulness examines whether or not the models serve the intended purposes of analytical reasoning. Although usability plays a part in achieving usefulness, the requirements to achieve it are more technology dependent, while the requirements for usefulness are much more dependent on the fundamental concepts for designing visual analytics models. Our interest here is to validate the usefulness of the application of visual analytics. We believe that usability issues can be addressed by leveraging the capabilities of cutting edge technology. In terms of usefulness of a visual analytics model to serve the analytical reasoning purposes identified, one must demonstrate that users are enabled to glean insights and to apply their tacit CM knowledge through viewing the salient patterns shown in the visual representations of data while interacting with the data. These insights must then lead to the understanding necessary to take appropriate management actions as required. We suggest that the process of evaluation be integrated into the design process and take the form of a qualitative method that includes heuristic inspection, collecting opinions, and/or contextual interview (Carpendale 2008). Such an approach allows for the capture of the perceptions of CM experts as to the usefulness of visual analytics in assisting with analytical reasoning for complex CM data analysis tasks, and the identification of features that heighten reasoning capabilities. Contrast this approach with quantitative methods such as controlled experiments, which demand significant sample sizes and domain expert time, which in our experience is very difficult to obtain for the CM domain.
Therefore, it is recommended that the evaluation process be basically one of self evaluation (the self evaluators themselves are domain experts) combined with comparison and contrasting against visual representation designs proposed by others (which, based on a thorough review of the literature, tend to be very modest in number). Most importantly, the designs can be evaluated by construction personnel to test for the ability of the visual analytics model to provide the analytical reasoning capabilities sought at the outset and to obtain feedback to allow further refinement and analytical reasoning. These steps in the evaluation process are applied to the design images presented in the next section of the chapter.

2.5 Design of visual representations of change order data

For the remainder of this chapter, using data from two retrofit / rehabilitation projects (denoted as Project 1 and Project 2 herein), we focus on the design and evaluation of visual representations for change order data in order to demonstrate application of the thought processes and principles described in the previous section of the chapter. The representations developed can be readily adapted to the exploration of other management functions and data types. They illustrate how visual analytics can facilitate analytic reasoning by providing insights into reasons for performance to date, identifying potential cause-effect relations (e.g. an implicit causal model is that the impact of change orders on time performance is likely to be highest if they are clustered simultaneously in one or more of time, space, by project participant, or physical system), and improving communication amongst project participants. For both projects, our perspective is mainly that of the general contractor (GC) or construction manager (CM) in terms of the change order management function and the possible impacts of changes on project performance.
The examples given here are illustrative of the kinds of situations often encountered on capital projects, and which can be missed because of a preoccupation with individual items as opposed to the collection of many items and related patterns of occurrence – i.e. there can be a failure to see the big picture. This in turn can lead to several undesirable situations, including an underestimation of consequences, failure to initiate corrective action in a timely way, delays, management burnout, loss of entitlement, and loss of reputation, to name a few.

2.5.1 Change order management

A change order (CO) (also referred to as an extra herein) corresponds to an instance of one of the sub-dimensions that comprise the process/information context dimension. COs are tracked at the instance level whether in an integrated information management system or simply by spreadsheet. From a system design perspective, it is useful to treat CO properties in a separate data view (e.g. change view), which is the perspective adopted herein. Properties of interest include CO_ID (change order identification), date of initiation, date of approval, reason(s) for the change order, project participants affected, estimated vs. approved vs. actual cost, and associations with components used to define other project data views. Other properties derived from associations with other project data views include start and finish dates of the work and hence actual duration (As-built view), physical components affected, their locations, and related drawings (Physical view), and required procurement activities (Process view). Some of these properties are specified by system users while others are derived by the system based on information provided (e.g. durations). A list of change order properties of interest herein, their distribution across different project data views, data type and source are provided in Table 2.1. Typically for projects, a roster of change orders is maintained (e.g.
a spreadsheet), and depending on the type of project and procurement mode used, this roster can become very lengthy. As illustrated later, visual analytics provides one approach for extracting and communicating the information content in such a roster.

Table 2.1 Change order properties of interest

Change Order (CO) Property | View* | Data type | Source
---------------------------|-------|-----------|-------
CO ID (identity) | CO Mgmt | alphanumeric | User
Date CO process initiated | CO Mgmt | date | User
Date CO approved (cancelled) | CO Mgmt | date | User
Duration of CO initiation/approval process | CO Mgmt | number | Derived
Reason for CO (client initiated, design error/omission,) | CO Mgmt | alphanumeric | User
Date CO work started | As-built | date | User
Date CO work completed | As-built | date | User
Duration of executing CO work | As-built | number | Derived
Number of consultants involved with CO | CO Mgmt | number | Derived
Identity of consultants involved (e.g. architect, structural engineer, …..) | CO Mgmt | alphanumeric | User
Number of trades involved with CO | CO Mgmt | number | Derived
Identity of trades involved (e.g. GC, mechanical, electrical, ….) | CO Mgmt | alphanumeric | User
Basis for payment (lump sum, unit price, time & materials, ..) | CO Mgmt | alphanumeric | User
Base cost of CO and cost breakdown, exclusive of impact costs | CO Mgmt | numbers | User
Estimate of impact costs of CO if applicable | CO Mgmt | number | User
Physical component(s) of project affected by CO and locations | Physical | alphanumeric | User
Long lead time procurement items associated with CO | Physical | alphanumeric | User
Procurement item procurement sequence | Process | alphanumeric | User
Association with existing schedule activities | Process | alphanumeric | User
Number of existing activities affected | Process | number | Derived
Association with new activities as a consequence of CO | Process | alphanumeric | User
Number of new activities as a consequence of CO | Process | number | Derived
As-built problems associated with CO | As-built | alphanumeric | User
Identity of existing drawings revised due to CO | Physical | alphanumeric | User
Identity of new drawings due to CO | Physical | alphanumeric | User
Number of RFI’s associated with CO | As-built | number | Derived
Identity of RFI’s associated with CO | As-built | alphanumeric | User

* Use is made by the authors of a nine-view data representation of a project: product (physical), process, organizational/contractual, cost, quality, as-built, change (CO Management), environmental and risk (Russell and Udaipurwala 2004)

Changes and change orders are an inevitable part of any construction project. They can have a significant effect on a project and its participants in terms of productivity, and overall project performance. Further, they can give rise to contentious disputes because of their cumulative impact on the efficient execution of other work, and the additional load placed on management staff. Various researchers (e.g. Hanna et al. 2004; Moselhi et al. 2005; Thomas and Napolitan 1995) in the past have tried to quantify these impacts as well as the properties of change orders that have the most adverse consequences for performance.
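The distinction drawn in Table 2.1 between user-supplied and derived properties can be illustrated with a minimal sketch. The class and field names below are hypothetical and are not taken from the system described in this thesis; only a handful of the table's properties are shown, and missing dates are allowed because recorded data are often incomplete.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ChangeOrder:
    """Hypothetical CO record; names are illustrative only."""
    co_id: str                               # user-supplied (CO Mgmt view)
    reason: str                              # user-supplied (CO Mgmt view)
    initiated: Optional[date] = None         # user-supplied; None if unrecorded
    approved: Optional[date] = None          # user-supplied; None if unrecorded
    work_started: Optional[date] = None      # user-supplied (As-built view)
    work_completed: Optional[date] = None    # user-supplied (As-built view)
    trades: List[str] = field(default_factory=list)

    @property
    def approval_days(self) -> Optional[int]:
        """Derived: duration of the initiation/approval process, in days."""
        if self.initiated is None or self.approved is None:
            return None
        return (self.approved - self.initiated).days

    @property
    def execution_days(self) -> Optional[int]:
        """Derived: duration of executing the CO work, in days."""
        if self.work_started is None or self.work_completed is None:
            return None
        return (self.work_completed - self.work_started).days

    @property
    def n_trades(self) -> int:
        """Derived: number of trades involved with the CO."""
        return len(self.trades)


# Invented example values, for illustration only.
co = ChangeOrder(
    co_id="CO-017",
    reason="design error/omission",
    initiated=date(2005, 5, 2),
    approved=date(2005, 5, 20),
    work_started=date(2005, 6, 1),
    work_completed=date(2005, 6, 15),
    trades=["mechanical", "electrical"],
)
undated = ChangeOrder(co_id="CO-018", reason="client initiated")
```

Keeping the derived values as computed properties, rather than stored fields, means they cannot drift out of step with the dates and trade lists entered by users.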
In terms of analytic reasoning from the perspective of GC/CM or the client with respect to change orders, example questions of interest include the following:

Assessing
- To date, what is the distribution of change orders in terms of the context dimensions of time, space, physical system/component, project participant, etc., and what are the potential consequences of this distribution?
- To date, what is the distribution of change order cost (a performance dimension) in terms of time, space, physical system/component, project participant, etc.?
- What is the distribution of reasons for change orders, and are they limited to a specific facet of the project or a small subset of project participants?
- What causal relations appear to exist between the distribution of change orders and project performance as measured in terms of productivity and schedule?

Communicating
- How can the change order history to date be communicated in as factual and objective a manner as possible to key participants (e.g. client, architect)?

2.5.2 Visual representations for project 1 and design 1

As indicated previously, rather than focus on the properties of an individual change order, here we show how visual representations can provide a ‘big picture’ of what is happening to a project in the way of changes during its construction phase. In presenting the images in Figures 2.1 and 2.2 for Project 1, which we refer to as Design 1, use has been made of a data set of 122 change orders, including information related to value, timing, location and responsibility of the work. The impact of change orders on labour productivity and project duration became a contentious issue for this project. One approach applied to assess the impact of the value and number of change orders involved use of the kind of analysis offered by Moselhi et al. (1991). But such an analysis ignores the timing and location of the work, and implicitly contains a retroactivity principle (i.e.
future change orders impact work already done). By visualizing the distribution of COs using relevant meta-data (in this case timing, location and responsibility for the work), a more accurate assessment of potential impact of COs on productivity can be made and other assessment and communication issues addressed. In what follows, the properties of Figures 2.1 and 2.2 are analyzed in terms of the principles presented previously.

2.5.2.1 Purposes of analytical reasoning for project 1 and design 1

The analytical reasoning purpose of Design 1 is to examine the as-built change order history to identify trends of change orders versus time and the clustering of change orders in time, space, and by participant. The aim is to examine possible site congestion issues which could impact productivity or schedule performance, or overwhelm management’s capabilities to process and coordinate change orders to minimize the impacts on project performance.

Figure 2.1 Project 1 CO history in terms of ID & Location, timing and value of work

Figure 2.2 Project 1 History of COs by location, time, responsibility and number

2.5.2.2 Choice of data representations and transformations – project 1 and design 1

Data representations: Given the purposes set for the analytical reasoning, the relevant context dimensions include the process entity of change order (CO) in terms of identification (i.e. CO_ID), time window of execution and where executed, and performance dimensions of cost and number of change orders. In terms of the original data, time was measured in days and months, and the location dimension was highly aggregated into three values – on-site, off-site, and both off and on site, with the reasoning being that off-site COs would make little or no contribution to productivity loss or congestion on site. As a general observation, we note that it is important to support different granularities in the definition of time (e.g. day, week, month), location (e.g.
individual, group, class), project participants (individual, group, class) and physical components (e.g. individual, group, system).

Data transformations: To enhance clarity of the visual representations, we have transformed the original data by using a coarser definition of time in terms of months. In response to this transformation, a CO is counted once for each month it is active, and its dollar value is distributed uniformly over its duration. In order to reduce the original four dimensions describing a CO to three to facilitate 3D visual representation, a new CO data dimension was derived by concatenating space and identification number.

2.5.2.3 Choice of visual representations – Figures 2.1 and 2.2, project 1 and design 1

Three dimensions of change order data need to be translated into visual representations. We chose to use positions on three visual space dimensions to encode them in Figure 2.1 because some researchers have found promise in this approach (Robertson et al. 1998). In addition, all COs executed in a given month are mapped against one colour to add clarity to the visual representation. Along the X axis, individual COs are not serially ordered according to their IDs but are sorted by their location. Thus, as evident from the figure, the bars grouped at the left end are ‘off-site’ COs, those in the central area are ‘on-site’ COs, and COs classified in the ‘both’ category are found at the right end. From this figure, for a given time instance, one can deduce the total number of COs generated, total base costs associated with the COs, and their concentration in space in terms of an aggregated location descriptor. Figure 2.2 provides a deeper insight into the project’s set of COs and perhaps tells a more compelling story than Figure 2.1.
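The data transformations of Section 2.5.2.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to produce Figure 2.1: the record layout and function names are hypothetical, the sample values are invented, and "distributed uniformly over its duration" is interpreted here as a constant dollar amount per calendar day, summed within each month.

```python
from collections import defaultdict
from datetime import date, timedelta

# Assumed left-to-right ordering of the aggregated location categories.
LOCATION_ORDER = {"off-site": 0, "on-site": 1, "both": 2}


def to_monthly(change_orders):
    """Aggregate records (co_id, location, start, finish, dollars) by month.

    Returns (active, cells):
      active[(year, month)]          -> number of COs active in that month
      cells[(label, (year, month))]  -> dollars allocated to that CO and month
    """
    active = defaultdict(int)
    cells = defaultdict(float)
    for co_id, location, start, finish, dollars in change_orders:
        n_days = (finish - start).days + 1        # inclusive duration in days
        days_per_month = defaultdict(int)
        day = start
        while day <= finish:
            days_per_month[(day.year, day.month)] += 1
            day += timedelta(days=1)
        label = f"{location}:{co_id}"             # concatenated location and ID
        for month, n in days_per_month.items():
            active[month] += 1                    # counted once per active month
            cells[(label, month)] += dollars * n / n_days
    return dict(active), dict(cells)


def x_axis_order(cells):
    """Sort the concatenated labels by location first, then by identifier."""
    labels = {label for label, _ in cells}
    return sorted(labels, key=lambda s: (LOCATION_ORDER[s.split(":")[0]], s))


# Invented sample data, for illustration only.
sample = [
    ("012", "on-site", date(2005, 4, 21), date(2005, 5, 10), 20000.0),
    ("003", "off-site", date(2005, 5, 1), date(2005, 5, 31), 3100.0),
]
active, cells = to_monthly(sample)
```

In this sample, the first CO spans ten days in each of two months, so it is counted in both months and its value is split evenly between them. Sorting the concatenated labels by location reproduces the grouping of off-site, on-site and ‘both’ COs along the X axis.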
In Figure 2.2, each project participant is mapped onto its own colour (as observed later for designs of Project 2, the use of colour to identify participants can become problematic when a large number are involved). The participants are stacked over one another in a predefined order. In this case we have dealt with five participants in total: three on-site trades (Trade A, Trade B and Trade C) and two fabricators (Fab X and Fab Y). The vertical performance dimension axis represents the number of COs active for a specific participant in a given month (a dollar value could also have been used). The COs have also been sorted according to their location along the X-axis. This makes the available information easier to assimilate. A single cell in the horizontal plane of the graph yields the project participants involved, the number of COs active per participant, the active month and the location of the COs. For instance, the arrow in the figure indicates that in the month May-05, Trade B had 7 active ‘On-site’ COs. An interesting observation made from this representation is that Trade A and Fab X have been affected by more change orders in terms of number than any other project participant. This figure also reiterates the message delivered by Figure 2.1 that most of the change orders were generated towards the end of the project time line.

Figure 2.2 also highlights one of the challenges involved in designing visual representations to maximize the clarity and visibility of the data represented, especially for communication purposes with external parties when static or hard copy representations must be used. For larger data sets, if vertical columns had been used, the taller columns in the front of the image would obstruct the view of the bars behind them, thereby hiding much of the content of the image. (In an interactive environment, this problem is lessened as users can experiment with different view angles.)
To avoid this problem, we experimented with the use of cones and pyramids, and found the latter provided the most pleasing and useful image. However, perception problems can arise from such a representation. While only height of the pyramid is important, in looking at the image, most individuals implicitly use volume or surface area as the quantification metric, thereby underestimating (or overestimating) the level of effort of specific participants (e.g. Fab Y). Hence, for the representations produced for Project 2, only cylinders are used in order not to bias or distort the insights provided to the user.

2.5.2.4 Evaluation – project 1

Overall level – The two visual representations shown demonstrate that most of the change orders are clustered in the latter stages of the project, although a significant share of the total value of CO work was performed earlier and was associated with just a few on-site and off-site COs. Thus, from an analytical reasoning perspective regarding a potential causal relationship between number and value of COs and reduced productivity and schedule difficulties, one could argue that the clustering of the number of change orders in the latter stage of the project could have impacted productivity, schedule performance, and management’s ability to coordinate effectively all of the changes. In terms of explaining or reasoning about relative performance of project participants, it is clear that Trade A and Fabricator X were affected most by the COs, which could explain why their productivity and schedule performance suffered more than for other project participants. However, missing from the visual representations, but addressed for Project 2 is the link between schedule performance and change order occurrence, information that is crucial to strengthening the argument about CO impact.
In summary, the two visual representations provided insights about how change orders were distributed in time, space and by project participant, which in turn could assist (and did) the client, contractor, and those adjudicating the dispute resolve differences of opinion about the impact of change orders on project performance. The same benefits were not derived from examining the spreadsheet of change order data no matter how sorted by those assisting the project’s contractor. The individuals involved did not attempt to forge visual images of the contents of the roster of change orders in order to comprehend how they were clustered in terms of one or more of the project’s context dimensions. Instead, they simply relied on presenting a listing of change orders. We have witnessed first-hand similar approaches in practice, and in fact encountered such for Project 2, with these practices being an impediment to telling the construction story in a readily comprehensible manner. With respect to Design 1, Project 1, unfortunately, we are unable to compare and contrast the design of our visual representations with those proposed by others due to the lack of alternative designs being documented in the literature. However, our own critical evaluation of the images led to improvements in the design of the visual representations for Project 2.

Components level – Evaluation at this level is done by checking the choices of data representations and transformations, visual representations, and interaction features. In the design of Figures 2.1 and 2.2, change orders were represented by the context dimensions of time when change orders were active, trades responsible for executing change orders, and locations where the change orders were executed along with the performance measurement dimensions of number and dollar values of change orders.
This data representation consists of the information and knowledge fundamental to identifying trends of change orders versus time and the clustering of change orders in time, space, and by participant. The original data were then transformed by abstracting locations into three categories (on-site, off-site, and both on-site and off-site) and representing time by months, a more aggregated level of detail than by individual days. The former is essential for management to identify site congestion issues if change orders were executed on-site in clusters; the latter adds clarity to the visual representations and echoes industry’s practice of processing and monitoring change orders over a longer time interval. Other data transformations, such as deriving performance dimensions of change order percentage (cost of one change order / cost of all change orders, or cost of one change order / cost of original related work), could provide more insights and should be considered. As to the choices of visual representations, Design 1 (Figures 2.1 and 2.2) utilized three dimensional visualization space in order to maximize the use of spatial positions to encode multi-dimensional data. Lastly, because the primary purpose of Design 1 is to provide management with an overview of the entire distribution of change orders, interaction features supporting further data exploration were not considered essential to the analytical reasoning purpose of this design. However, interaction features for enhancing data readability like “details on demand” and “navigating visual representations” could be helpful for comprehensively examining both the details and profiles contained in Figures 2.1 and 2.2.

2.5.2.5 Lessons learned – project 1

Users may have preferences for adopting different visual representations of basically the same format. For example, instead of concatenating CO_ID and location together as was done in Figure 2.1, one could also concatenate time and location together.
It is left to the user to determine which visual representation best suits their cognitive abilities, but the main message is that the design of a visual analytics environment must allow the user to experiment with different representations. From the practical perspective of construction users, what this means is that a relatively large range of representations needs to be precoded, along with some guidance as to the advantages of each for analytic reasoning. Further, considerable care must be taken in choosing the shape of visual objects used to represent context entities or performance dimensions in order not to create misleading or false insights on the part of the user.

2.5.3 Visual representations for project 2

Having experimented with different visual formats to represent aspects of the change order data set for Project 1, we explored a broader range of images for a more extensive change order dataset for a complex rehabilitation project, Project 2. The sheer volume of the extra work orders generated (531) during the first 2/3 of the project duration and their occurrence frequency made change order management on this project a challenging task (the construction manager providing the data used the words extras, extra work order and change order as synonyms). These 531 change orders correspond to 560 subtrade involvements – i.e. if three subtrades are involved in a single CO the actual contribution to 560 is three. For the total project, slightly more than 750 change orders were generated. The issue confronting both the construction manager and client on this project was one of communication between the two as to the reasons for the large number and attendant cost of change orders. Printouts of the construction manager’s change order spreadsheet provided to the client did not resolve the communication problem.
A variation of the visual representation presented in Figure 2.3, developed as part of our interaction with the CM firm, assisted in clarifying the change order story of the project, especially with respect to the origins of the change orders.

A reality of current industry practice is that data sets for a number of functions are invariably incomplete, either because only a subset of the properties defined for the item of interest are recorded, and/or an incomplete set of properties have been defined. Since a primary focus of project management staff is to maintain momentum on the job, keeping and updating records in a comprehensive manner often takes a backseat. Thus, data records for many of the change orders generated on this project were found to have certain missing properties in terms of trades affected, issue date and/or date of approval, when the work was actually completed, dollar consequences for each of the affected trades, etc. Though our work is focused mainly on visualization of datasets, the usefulness of visualization is dependent on the completeness of the data set. Hence considerable effort was expended in trying to obtain as complete a data set as possible. To do so, we made use of relevant and associated documents like the contract register, site instruction (SI) and request for information (RFI) lists, we reviewed individual SIs and RFIs, and through discussions with on-site and off-site management personnel, we tried to track the missing links in the data. This allowed us to cluster data items using different attributes such as location of the work, physical system affected, trades involved, and turnaround times, thereby yielding more insightful visual representations, which proved to be beneficial to the CM when communicating with the client. We were able to accomplish this because of the direct access provided to the site, site records and management staff.
Moreover, the staff members were enthusiastic in offering their comments and providing us with prompt additional information as and when required, and finally, senior management was motivated to use findings of the work as appropriate to enhance communication with the client.

A total of three different visual representation designs were generated, corresponding to Figures 2.3 through 2.7. In the discussion that follows, observations are made about the specific features of these designs. A detailed critique of Figures 2.3 and 2.4 is summarized in Table 2.2 to show the kind of evaluation procedure that should be conducted as an integral part of the design process.

2.5.3.1 Visual representations for project 2 and design 1, Figures 2.3 - 2.5

Figure 2.3 represents the distribution and reasons for the change orders, and is particularly useful for communicating with the client while also providing valuable insights on how the project is evolving. This figure conveys the distribution of changes over time, trades affected, and primary reason for the change. To enable the user to identify trends in the datasets and thus obtain additional valuable insights, cumulative totals for all change orders versus primary reason for change integrated over time and for all change orders versus time integrated over reason for change are presented as an option on the side and back panels of the chart, respectively (an example of how additional information can be incorporated into the visualization space through an interaction feature).

Figure 2.3 Project 2 Number and reasons for change orders

The X-axis represents time in months when a change order was issued. A more fine-grained representation of time did not add value. The right most section on this axis flags time as ‘undated’. The COs included in this section are the ones for which the issue date could not be identified.
As noted previously, datasets are invariably incomplete, and thus mechanisms to treat incomplete data have to be incorporated into the design of visual images. How best to do this is not always clear and hence more exploration on this issue and related ones (e.g. zero value COs and COs involving multiple trades, see below) is needed. The Y-axis divides the entire graph into 4 separate zones depending upon the reasons for the issued COs. As described later, every change order in the datasheet was eventually allocated to a single primary reason for issuance. The vertical axis (Z-axis), which corresponds to the performance measure or variable of interest, represents the total number of COs affecting different trades issued in that particular month, as in the case of the previous image. The majority of change orders involved the work of a single trade. Nevertheless, for some COs, two or more trades were involved. In such cases, for accurate representation in Figure 2.3 when the breakdown by trade is also treated, a CO will be ‘double or triple counted’ for the month in which it was issued (hence the 560 count in Figure 2.3). We observe that if the facility to generate an image like that shown in Figure 2.3 were to be incorporated into construction management software, then the option to include a breakdown by trade should be included, and a footnote automatically included in regard to the counting issue. On the other hand, if the breakdown by trade was not chosen as an option, then the correct count of change orders would be shown on the figure.

Of the total number of change orders generated on this project, a significant number were issued as a result of design changes. A large fraction of these were found to be zero dollar changes, i.e. change orders having no dollar consequences. In generating this figure, $0 change orders have been coloured as though they belong to a trade, in this case $0 trade (see color legend in Figure 2.3).
From a work monitoring perspective, such changes would still have to be tracked on a trade-by-trade basis, but for keeping count of all COs issued, it was deemed acceptable to treat under a $0 trade designation. This particular case is mentioned as it highlights the kind of situations often encountered when attempting to represent data in a visual format. The need exists, however, to explore other ways of treating such situations in order to present as objective a view of data as possible.

Figure 2.3 was developed based on refinements to the CM’s dataset. In the original dataset, the construction manager used a suite of six reasons and allowed for a many-to-one relationship – i.e. many reasons to one extra. Some of these reasons overlapped to a certain extent, creating considerable ambiguity in interpreting the data and communication challenges with the client. Upon seeing a first draft of the figure, management personnel realized they needed to adopt a less ambiguous set of reasons, which led to the use of the 4 reasons shown and a one-to-one relationship between a change order and the primary reason for it. The CM revised the dataset, which provided the basis for Figure 2.3. The foregoing observations speak to the challenges of having data accurately, unambiguously and completely collected while it is current, a non-trivial task given the preoccupation of management to maintain momentum on the job.

Figure 2.4 looks at the distribution of the value of change orders, and assists with client communications while providing useful insights on budget matters. This image is very similar to Figure 2.3, the only difference being that the Z-axis now corresponds to dollar amount instead of number of change orders. The cumulative total of the dollar amount of COs integrated over time for each primary reason for change orders is shown on the side panel while the back panel has two separate line graphs for cumulative total of CO debits and CO credits vs.
time integrated over reasons for change. It is observed that the number of COs is not necessarily proportionate to the dollar consequences of change orders. There can be situations where a large number of COs generated in a month totals to an insignificant amount whereas in other cases a single CO may cost a very significant amount (as discussed later, such observations provide powerful motivation for being able to create and navigate scenes comprised of multiple visual representations). Management staff therefore faces a twofold challenge of managing the flow of change orders and observing the cost of change orders as they affect the overall project cost. Thus Figure 2.3 helps management assess the effect of distribution of changes by number as they affect the targeted project completion time while Figure 2.4 helps assess the effect of cost of change orders by value of work on the overall budget. Since for the latter case one is dealing with the dollar consequences of COs on different trades, the issue of double counting of COs does not exist. Another observation is that some of the change orders actually generate credits. In order to identify these credits with greater ease they have been allotted a separate zone at the forefront in the image. Again, the need exists to explore other alternatives of displaying such information.

Figure 2.4 Project 2 Distribution of value and reasons for change orders

Table 2.2 Summary of visual representation evaluations for Figures 2.3 and 2.4, project 2

EVALUATION ITEM: Overall Evaluation

CRITIQUE ANALYSIS FOR EVALUATION

Strength
The visual representations in Figures 2.3 & 2.4 provide clear visualizations for:
1. Showing the existence of data patterns that may help to identify potential root causes or impacts of change orders.
2. Showing an overview of characteristics of change orders for monitoring a project from a CO management perspective.
3. Communicating with the client.

Weakness
The visual representations don’t convey the full range of insights possible. For example, both Figures 2.3 & 2.4 do not provide information of how change orders are distributed by sub-trades, e.g., identifying ranking of trades by number or dollar value of CO. Another example is that this design is not able to present insights that can only be gleaned from a subset of the data such as dollar value exceeding a certain threshold.

EVALUATION ITEM: Components Evaluation (Data Representations and Data Transformation)

CRITIQUE ANALYSIS FOR EVALUATION

Strength
1. Representing COs by the contexts of reasons for change, time, & responsibility and the performance measurements of # and $ values of COs provides the information and knowledge essential to i) identify root causes or impacts of COs and ii) understand selected properties of COs.
2. Transforming CO data to different levels of abstraction or granularity can increase the clarity of a visual representation, which is essential to ensuring usefulness to the intended industry audience.
3. Transforming CO data by aggregating COs in counts or dollar values based on various data query conditions is essential for observing the distribution of COs in various context dimensions. This method matches the current state-of-art concept of OLAP and data cube.

Weakness
The aggregation is not exhaustive, and thus some insights may be missing – e.g. the design did not aggregate number of COs by COs that are of the same sub-trade. Another example is that this design did not aggregate number of COs if we query a subset of CO data with the filtering condition of dollar value being over a certain amount of money. However, this can be remedied by providing interaction features for users to choose level of aggregation on demand.

EVALUATION ITEM: Components Evaluation (Visual Representations and Interaction Features)

CRITIQUE ANALYSIS FOR EVALUATION

Strength
Compact as much information as possible into fewer and clear images for quick scanning by users.

Weakness
1. The visual encodings used would be undesirable if interaction features are not supported (e.g. use color hue to represent many sub-trades and bar length to represent breakdown of COs by trade).
2. Lacks interaction features for:
   o Enhancing image readability - use of 3D visualization space requires view navigation to find an optimum scale and angle of 3D chart so that occlusion is minimized. Brushing technique also can be used to alleviate issues of occlusions and ineffective color coding.
   o Querying data - visual analytics in essence is querying data that is presented in visual forms. Therefore, basic data query abilities such as filtering data value ranges, sorting/grouping data values, and simple data transformation (e.g. data aggregation) are a must.
   o Choosing visual representations - different users have different visual perception preferences or cognitive styles. This difference could be a factor affecting the effectiveness of analytical reasoning. The interaction feature of changing visual representations on users’ demand should be supported – e.g. users should be able to change from 3D charts in Figures 2.3 or 2.4 to 2D charts (e.g. Figure 2.5).
   o Coordinating views - when observing Figure 2.4, users may become interested in COs having dollar values that are over a certain threshold, and want to know whether this subset of COs cluster in time and/or reasons for change, which could be observed in Figure 2.3. This can be done by directly selecting visual marks in Figure 2.4 as an instruction of filtering data, and then Figure 2.3 would highlight visual marks representing data that are only related to the data selection in Figure 2.4.
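The counting and querying issues raised for Figures 2.3 and 2.4 and in Table 2.2 (optional breakdown by trade with its attendant double counting, the ‘undated’ zone, the $0 ‘trade’, and filtering by a dollar threshold) can be illustrated with a small aggregation sketch. The record layout, function name and sample values are hypothetical, and assigning every zero-dollar CO to the $0 trade is a simplifying assumption of this sketch.

```python
from collections import Counter

# Hypothetical records: (co_id, issue_month or None, primary_reason, trades, dollars)
cos = [
    ("CO-001", "2005-05", "design change", ["mechanical"], 12000.0),
    ("CO-002", "2005-05", "design change", ["mechanical", "electrical"], 800.0),
    ("CO-003", None, "site condition", ["GC"], 45000.0),
    ("CO-004", "2005-06", "design change", ["electrical"], 0.0),
]


def count_by_cell(records, by_trade=False, min_dollars=None):
    """Count COs per (month, reason) cell, optionally broken down by trade.

    With by_trade=True a CO involving several trades is counted once per
    trade, so the cell totals can exceed the number of change orders.
    """
    cells = Counter()
    for _co_id, month, reason, trades, dollars in records:
        if min_dollars is not None and dollars < min_dollars:
            continue                              # threshold filter on demand
        month = month if month is not None else "undated"
        if not by_trade:
            cells[(month, reason)] += 1
            continue
        # Assumption: zero-dollar COs are grouped under a '$0 trade' label.
        trade_list = ["$0 trade"] if dollars == 0 else trades
        for trade in trade_list:
            cells[(month, reason, trade)] += 1
    return cells


plain = count_by_cell(cos)                        # one count per CO
with_trades = count_by_cell(cos, by_trade=True)   # multi-trade COs repeated
large_only = count_by_cell(cos, min_dollars=10000.0)
```

Here the four sample COs yield a total of four in the plain view and five in the by-trade view, which is the same effect that produces 560 involvements from 531 change orders. Applying the threshold filter before aggregation is one way a selection made in a value-based view could drive what is highlighted in a count-based view.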
One important message from both Figures 2.3 and 2.4 is that incomplete information can result in the inability to derive completely accurate insights. Without being able to properly distribute the number and value of change orders in time (the undated missing data problem), especially when the numbers involved are significant, the potential impact of the cumulative effect of COs may not be properly gauged. By portraying the data in the way we have chosen, this problem is highlighted, and could provide the incentive needed to search out the data required and/or being more diligent in recording essential data.

Figure 2.5 is a 2D stacked graph presenting information similar to the content of Figures 2.3 and 2.4. As noted earlier, different users have different preferences and capabilities for visualizing data, especially when it comes to 3-D representations. Hence it becomes necessary to develop alternate formats for the same data. Figure 2.5 represents all of the information from Figures 2.3 and 2.4 in a single representation consisting of stacked graphs with time as a common context dimension. This figure can be read in two parts. The top part of the graph is a scatter plot representing the total number of COs issued each month over the project execution phase. The pie charts in the graph are comprised of an inner circle that corresponds to the reasons for initiating these COs while the outer ring depicts the fraction of the number of COs affecting individual trades. For this figure, the X-axis indicates the time when change orders were issued and the vertical Y-axis indicates the total number of change orders issued. One important advantage of this graph is that COs associated with multiple trades are not double counted, as is the case in Figure 2.3. In the bottom half of Figure 2.5, the total dollar amount of the COs issued each month is shown.

Figure 2.5 Project 2 Stacked graphs for number, values and reasons for change order
Also shown on this graph is the cumulative dollar amount of COs issued to date. Figure 2.5 thus enables the user to determine the number of change orders generated, corresponding trades affected and the subsequent dollar amount in one go. However this graph does not show the division of dollar amount by trade as per Figure 2.4. This could be achieved, however, in the bottom half of Figure 2.5. From our experience in dealing with construction personnel, we venture the opinion that Figure 2.5 is probably preferred to the images shown in Figures 2.3 and 2.4 simply because of their greater familiarity with 2D project representations (e.g. drawings, sketches, etc.). As 3D representations start to permeate the industry with the adoption of Building Information Modeling (BIM), 3D representations of construction management data are likely to receive greater acceptance.

Figures 2.3 through 2.5 also highlight the challenges involved in trying to represent as much information as possible or too much information on the same image. For example, for Figures 2.3 and 2.4, by including a breakdown of number and value of COs by trade, accurate counts for each are difficult to discern, especially when small numbers are involved. Further when many organizations are involved (for the case at hand 28 trades, including the $0 ‘trade’), the use of colour to distinguish between organizations breaks down – one simply runs out of a sufficient number of distinct and easily identifiable colours. This problem would only be exacerbated for much larger projects, when many more organizations are involved. Thus there are practical limits on how much information can be depicted on one image, even when supported by an array of user interaction features. Such challenges provide in part the motivation for examining data through coordinated data views, in which overview data can be portrayed along with supporting details (e.g.
number of COs in each month, and then breakdown by trade and reason for change).

2.5.3.2 Visual representations for project 2 – design 2, Figure 2.6
Figure 2.6 examines the distribution of change orders by physical system and time, and thus helps identify clustering of work and potentially speaks to the quality of design documents issued by the various professional disciplines involved. In this case the vertical Z-axis represents the total number of change orders generated, the X-axis indicates the time in months when the change orders were issued and the other horizontal Y-axis represents the physical systems affected. These physical systems are further grouped under different ‘Major elements’ (e.g. Substructure, Shells, Interiors, Services) along the Y-axis. A single cell in the graph represents the number of change orders issued in a particular month affecting a particular physical system. For example, a total of 8 change orders were generated in the month of May-05 affecting the Exterior closure which forms a part of the group ‘Shells’. In some cases a single change order is found to affect multiple physical systems. In such cases the CO gets ‘double or triple counted’ for that month and that ‘Major element’ group. Thus the numbers of change orders affecting the different physical systems of a group do not necessarily add up to the total number of change orders affecting that group. In the form shown, the use of colour does not add value. However, if it were desired to show additional information like the reasons for change, the use of colour coding would be beneficial.

Figure 2.6 Project 2 Distribution of change orders by physical system

2.5.3.3 Visual representations for project 2 and design 3, Figure 2.7
The analytical reasoning purpose behind the visual representation shown in Figure 2.7 is to explore the potential existence of a causal relationship between number and timing of change orders and schedule performance.
In generating this representation, use has been made of the first 402 change orders encountered. This 3D representation deals with the trajectory of forecast project completion time versus the cumulative effect of number of change orders with time (with the underlying causal model being that the greater the number of changes, the more the potential for an extended project duration). Note that number of changes, the Z axis, is used as the surrogate measure here, not value. For quick reference, the cumulative total of the change orders considered is also displayed on the back panel of the graph. To generate this representation, use was made of the sequence of project schedules generated by the CM (ready access to this data in the form of update date and projected completion date speaks to the advantage of having an integrated, multi-view data representation of a project, which was not a feature of the CM’s data). Across the horizontal axis (X-axis) is time, which serves two purposes: (i) to indicate the months when change or extra work was identified; and, (ii) to represent the dates of schedule update, starting with the original schedule before work started all the way to the last update observed by the research team. On the other horizontal axis (Y-axis) are listed the months when the project was forecast to be completed, with the dates of project completion reflecting the update version on the X-axis. The change order work is stretched out over these months of completion, to indicate how many more changes have occurred since the last update and projected completion date. The red line reflects the trajectory of movement of the forecast completion date. This visual representation portrays that a relation appears to exist between the number of changes occurring over time and the change in the projected completion date. 
However it would not be fair to state that all the movement in the projected completion date is solely due to number of changes (or for that matter value of changes if used instead of number) since there might be several other factors impacting the completion date (e.g. weather, labour shortages, etc.). We have, however, limited our scope to assessing the impact of changes on the project performance outcome as measured by project duration. While not straightforward to generate, this visual representation is a reasonably compelling one, and not only helps with identifying cause-effect relationships, but assists greatly in communication with the client. The main point here, however, is that carefully designed visual images that juxtapose data from two or more project data views can offer assistance in exploring potential causal relationships between project context dimensions and project performance dimensions.

Figure 2.7 Project 2 Causal model reasoning – number of COs and corresponding schedule update dates and projected completion dates

2.6 Some general observations
In this section we discuss a number of issues relevant to the development of a general visual analytics model for CM functions.

2.6.1 Applying identified principles for design process
As a prelude to pursuing the application of visual analytics to a CM function, the first step is to determine the analytical reasoning tasks that could benefit from the visual representation of associated data. Project context dimensions and performance dimensions involved in these reasoning tasks then need to be identified. Effective visual encodings of X, Y positions and colours can be utilized to map non-quantitative context dimensions while Z positions can be used to map quantitative performance dimensions (i.e. number or dollar value of COs).
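The encoding just described can be sketched as a small aggregation in which two context dimensions form the X and Y positions of a cell and the quantitative measures become candidate Z values. This is a minimal sketch under stated assumptions: the choice of month and trade as the two context dimensions, the sample records and the function name `encode` are all illustrative and are not drawn from the project data.

```python
from collections import defaultdict

# Hypothetical records as (month, trade, dollar value); values are illustrative only.
records = [
    ("2005-03", "Electrical", 5000.0),
    ("2005-03", "Mechanical", 2000.0),
    ("2005-03", "Electrical", 1500.0),
    ("2005-04", "Mechanical", 7000.0),
]


def encode(rows):
    """Aggregate rows into cells keyed by (x, y) = (month, trade).

    Each cell carries the two candidate Z values discussed in the text:
    the number of COs and their total dollar value.
    """
    cells = defaultdict(lambda: {"count": 0, "value": 0.0})
    for month, trade, value in rows:
        cell = cells[(month, trade)]
        cell["count"] += 1
        cell["value"] += value
    return dict(cells)


cells = encode(records)
for (month, trade), z in sorted(cells.items()):
    print(month, trade, z["count"], z["value"])
```

Keeping both measures in each cell lets the same grid drive a chart of number of COs or a chart of dollar value without re-querying the data.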
The use of conventional graphics elements such as orthogonal coordinates, bars, and lines is generally sufficient to accommodate CM users from a broad range of educational and experience backgrounds. Representing multi-dimensional data sets in a 2D or 3D space while maximizing image information content is difficult. Compactness is viewed as a virtue so that management can develop a holistic view (overview and details) as quickly as possible, without the requirement to navigate through multiple images. Thus, the design of visual images involves a great deal of iterative design and self evaluation using both hand drawn sketches and a variety of software tools in order to formulate visual representations in terms of their dimensionality (two-dimensional or three-dimensional), scale, viewing angles and colour-coding in order to maximize both the information content of each image and the insights that can be extracted. Issues like occlusion for 3D graphics and too many colours for effective colour coding might be alleviated to some extent through interaction features (e.g. linking and brushing, view navigation).

2.6.2 Evaluation and feedback
Included as part of the design and evaluation process for Project 2 was exposing earlier versions of the images to a group of construction personnel including senior management, and incorporating feedback received (interestingly, personnel normally worked with large spreadsheets or other tabulations of data, and had not explored on their own how data visualization could assist them in their management tasks). One non-definitive observation of the reaction by construction personnel was that the notion of image compactness can lead to information overload, and the use of multiple images as opposed to a single image to convey the insights involved may be a better choice.
However, the overwhelming reaction and positive feedback by management staff that Figure 2.3 would go a long way to having the client understand the change order story for this difficult project, a preoccupation of management at that time, outweighed any downside of too much information on a single image.

Apart from the industry evaluation, our self evaluation also identified a number of merits of the images designed by following the principles of the visual analytics design process described previously. A consensus of the evaluation results is that an overall qualitative understanding of change order characteristics and impacts on project performance dimensions (e.g. the majority of change orders are design changes as seen in Figure 2.3, and change orders related to contract issues increased as the project progressed as seen in Figure 2.6) can be perceived by glancing at those images for only a few seconds per image, particularly when Figures 2.3, 2.4, and 2.6 are placed closely together. We believe that such a quick and rich understanding could further trigger the tacit knowledge of project participants as to the impact of change orders on different project performance dimensions, thus leading to deeper insights.

2.6.3 Organizing lessons learned for development of a general CM visual analytics model
Based on the design/evaluation work we have done to date, we have identified three general structures of visual representation designs and a suite of interaction features that are tailored to CM use and that can be readily extended to other CM functions. However, some design details still need to be tailored to the unique analytical reasoning needs associated with specific CM functions.
Lesson 1: General structure for visual representations of CM data

Scene structure 1 - visualizing characteristics of construction conditions: A 3D scene of several charts could be generated to present the characteristics of construction conditions observed from different project context dimensions. Each chart uses an X axis and/or Y axis, and/or colour coding to represent up to three non-quantitative context dimensions (e.g. process, product, organization, etc.) and the Z-axis to represent a quantitative attribute dimension representing a construction condition (e.g. product quantity, resource usage, problems encountered, etc.). The side panel and back panel design are used to visualize aggregated data values of the condition similar to the use in Figure 2.3. Different charts represent conditions observed from different combinations of context dimensions if the investigated condition associates with more than three project context dimensions (e.g. time vs. space vs. trade, time vs. trades vs. activity, etc.).

Scene structure 2 - visualizing characteristics of construction performance: A scene structure that is similar to the one for visualizing characteristics of a construction condition can be used for performance dimensions (i.e. the Z-axis is used to represent performance dimensions such as number of deficiencies, time variances, cost variances, etc.).

Scene structure 3 - visualizing potential cause-effect amongst construction conditions and performance dimensions: Scenes juxtaposing or overlaying charts of construction conditions with charts of construction performance (similar to Figure 2.7) can be very useful for exploring hypotheses as to reasons for performance. Also, one should be able to juxtapose or overlay charts of construction conditions with construction conditions or construction performance with construction performance.
For this type of scene design, the “floor/wall” of the virtual 3D space could be flexibly used to position charts of construction performance/conditions.

Lesson 2: A suite of interaction features
In the near future, the visual representations developed should be coupled with interactive features like ‘zooming and filtering’, ‘details-on-demand windows’ or setting ‘dynamic query fields’, thus greatly enhancing the potential for analytic reasoning. For example, a simple click on a particular CO in Figure 2.2 would pop up a ‘detail-on-demand window’ listing all the required details of the specific data item, in this case CO properties (trade name, the month of interest and the Number of COs associated with the trade) selected from the list in Table 2.1 and contained in a user defined content profile. Further, by introducing filtering techniques, users would have the flexibility to view only data of current interest. For example, if a user prefers to obtain the distribution of extras only by number and trade, with the use of appropriate filter options one should be able to generate the required image which would represent a subset of the content in Figures 2.3 through 2.5. Such selection and filtering capabilities would help management absorb the content of images faster and improve the quality of insights obtained, allow users to adjust image content to reflect their own cognitive style, help pinpoint specific issues and assist with decision making directed at resolving existing or emerging problems. The range of interaction features that should be incorporated is identified in the critique of Figures 2.3 and 2.4 contained in Table 2.2.

2.6.4 Issue of CM data management
With respect to the data itself, during the process of designing a visual analytics model for change order diagnosis, we identified two major issues regarding current industry practice of data management: missing data values, and incomplete and dissociated data representations.
Both of these issues speak to the importance of good data management for generating useful visual representations of data for assessing and communicating performance and related issues. The problem of missing data values was observed and described in the previous section with respect to one or more properties of individual change orders. Although we addressed this problem through a combination of searching through project records, discussing items with management personnel and in some cases assigning default values to some properties (e.g. assigning ‘undated’ status to COs with missing date values), unless accurate recording of properties is achieved the actual patterns of data in practice could be quite different from what would be visualized using incomplete data. For example, as stated previously, the patterns shown in Figures 2.3 and 2.4 would no doubt be changed somewhat if all of the undated COs were positioned when they actually occurred. The problem of incomplete and dissociated data representations was also encountered with industry practice, either as reflected in the commercial software applications used or internally generated spreadsheets. As a result, data fields and data association simply do not exist with which to record several properties, including the association of a CO with the context dimensions of space and physical systems/components, which would help provide useful insights on potential causal relationships between context dimensions and performance dimensions. The reality is that management personnel are focused on maintaining project momentum and are often stretched to capacity, leading to only partially populating the predefined properties of different project records (e.g. COs, drawings, RFIs, etc.), with little consideration given to properties not explicitly defined.
Fundamental to persuading personnel to collect additional information is the ability to demonstrate that the benefits significantly outweigh the costs, a proposition that in most cases is not easy to prove.

2.6.5 Data exploration flexibility
Currently, many commercial data visualization systems are available for supporting generic visual data analysis. At first glance, it seems that they are sufficient for exploring integrated CM databases. However, based on our experience using these generic tools, we found that even with the ease of use facilitated by their state-of-the-art interaction and data query capabilities, users could still spend much of their time examining what data is available, deciding which data items would be useful to visualize, and determining how best to visualize them. Although these visualization environments provided very flexible interactive features (e.g. iteratively changing data query conditions and visual representations on user demand) thereby increasing the potential to detect interesting visual patterns representing unexpected phenomena hidden in data, the significant amount of time required for exploring data in this type of environment should not be underestimated. On the other hand, if the visual analytics environment imposes a strict analytic scenario and forces users to follow steps of viewing only certain images, the rigidity may limit the usefulness of visual analytics. How to strike a balance between these two extremes and optimize the level of data exploration flexibility when designing a CM visual analytics environment is a topic that needs further work.

2.7 Conclusions
Visual analytics, the science of analytical reasoning facilitated by interactive visual interfaces, has the potential to improve the construction management process through the enhanced understanding of project status and reasons for it, better informed decision making, and improved communication amongst project participants.
To date, while some useful exploratory work on data visualization has been carried out by a few researchers, no significant body of work exists on the application of visual analytics to the discipline of construction, despite successes in other disciplines. An approach for developing such a body of work for construction has been outlined in this paper. Of the four pillars of visual analytics, namely the purpose(s) of the analytical reasoning, the choices of data representations and transformations, the choices of visual representations and interaction technologies, and the production, presentation and dissemination of visual analytics findings, the focus herein has been on choices of visual representations. General principles to guide the design of visual representations useful for construction management processes have been identified, with emphasis on the two primary purposes served by analytical reasoning – i.e. assessing and communicating. In terms of assessing performance, visual analytics can assist with predicting the future based on lessons learned to date, examining the past in order to better understand the as-built situation, comparing performance, and identifying potential causal relationships. Other advantages offered by visual analytics include the ability to work on a more factual as opposed to perception (feeling) driven diagnosis of reasons for performance to date, and the quickness, versatility and relative ease with which data can be represented and interpreted without the need for specialist assistance. As part of the general principles identified, the notion of context dimensions vs. performance dimensions was introduced, which is of direct assistance in formulating visual representation designs. To demonstrate the application of the concepts presented, an in-depth examination of how visual analytics can assist with change order management was described.
Data sets from two different projects were used to demonstrate the design and practical data collection challenges involved in formulating visual representations that are useful for analytical reasoning. A detailed assessment of two of the images was presented, both in terms of strengths and weaknesses, and interaction features desired were highlighted. While somewhat obvious, it is important to use large scale datasets when designing and testing visual representations, as significant challenges exist with respect to scale in terms of the context dimensions of time, space, responsibility, physical components, process entities and work environment. It is believed that the lessons learned are readily extendable to other construction management functions, including the need to examine the use of coordinated data views as opposed to maximizing the compactness of an image in terms of providing both overview and detailed information in a single image, despite the desirability of doing so for a construction audience that is action driven.

In the near term, our focus will be on exploring visual analytics models for quality and risk management to demonstrate broad applicability of the approach and supporting principles. As part of this work, including previous work on change order management, comparisons of the utility of compact visual representations vs. coordinated data views will be made. Attention will also be directed to the design of visual representations for assisting in formulating and determining the validity of hypotheses for explaining construction performance (e.g. productivity, delays) – i.e. visual causal model reasoning. The most promising of the foregoing visual representations and accompanying interaction features will be implemented using state-of-the-art visualization tools and field-tested on actual projects.
Feedback from such tests is essential in order to ensure the usefulness of the representations and their responsiveness to the practicalities and constraints of the industry. Our ultimate goal is to contribute to the design of a visual analytics environment that is attuned to the needs and attributes of construction managers. We believe that this environment should include a palette of pre-coded images and related interaction features.

Chapter 3 Design of a Construction Management Data Visualization Environment: a Top-Down Approach4

4 A version of Chapter 3 has been published. Chiu, Chao-Ying, and Russell, Alan D. (2011). "Design of a Construction Management Data Visualization Environment: A top–down Approach." Automation in Construction, 20 (4), 399-417.

3.1 Introduction
The primary focus of this chapter is on the use of a top-down approach coupled with the extension of thought processes, principles and guidelines previously described (Russell et al. 2009b) for the design and development of a data visualization environment for construction management. For most projects, the large volume of data generated while executing a diverse set of CM functions poses significant challenges to their constructors, particularly in regard to generating insights and deducing cause-effect relations in a timely manner in support of decision making. Data visualization can play a pivotal role in addressing these challenges and improving project performance in terms of “cost and profit, time, scope, quality, safety and regulatory compliance” (Russell and Udaipurwala 2004). It deals with the effective portrayal of construction data to generate insights about the data and to unveil the undiscovered useful information embedded in it (Keim 1996).

Visualization of construction data offers several benefits. These include:
o identifying and communicating interdependent relationships across various data items thus enhancing the ability of the construction team to interpret data and improve decision making (Liston et al. 2000);
o amplifying cognition of quantitative data;
o improving and verifying the completeness and accuracy of data;
o reducing the time spent in comprehending and explaining information;
o providing managers with information rich overviews about the status of various project components;
o avoiding misconceptions due to inadequacies in data sets;
o explaining the divergence and disparity between the planned and as-built stories (Pilgrim et al. 2000; Shaaban et al. 2001; Songer et al. 2004); and,
o assessing the quality of a construction schedule (Russell and Udaipurwala 2000a).

The term visualization is defined by Card et al. (1999) as “the use of computer supported, interactive, visual representations of data to amplify cognition”, and as “the act or process of interpreting in visual terms or of putting into visible form” (Nielson and Erdogan 2007). Card et al. (1999) have identified the need to support three types of interaction for information visualization in order to be able to modify: (a) data transformation; (b) data mapping between the data and its visual representation; and, (c) view transformation (navigation) for navigating through the visual representation.

In using the term data visualization environment, which we consider herein as being synonymous with a visual analytics environment, we mean a computerized information system which enables users to create their own scenes composed of one or more pre-coded visual representations and an interface that assists users to interact (filter, sort, zoom, highlight & coordinate, etc.) with data and a palette of pre-coded images designed to facilitate analytic reasoning.
Distinguishing features of a holistic data visualization environment from that of data visualization capabilities of current state-of-the-art CM systems revolves in large part around two issues. First is support for dynamic versus static images, specifically the degree to which the user can interact in real time with an image or collection of images across the temporal, spatial, organizational, product and process dimensions of a project. And, second is the breadth of functions treated and associated analytical reasoning tasks supported, within the confines of individual tasks for a function, across multiple tasks within a function and across multiple functions.

Addressed in Figure 3.1 are several aspects of the foregoing issues for a subset of CM functions and associated tasks. Depicted is an approximate subjective assessment of the current state-of-the-art in terms of functionality versus what could be achieved in a holistic data visualization environment (the triangular / trapezoidal shapes reflect intensity (height) and breadth (length) of support offered). Significant opportunities exist for enhancing the insights that can be generated from the speedy exploration of large information spaces in terms of improving and broadening analytic reasoning at the individual task level, and amongst tasks and functions in terms of scene generation. Also, data visualization has the potential to help overcome some of the integration challenges created by the need to discretize the job of construction management into somewhat independent functions in order to make it manageable. In summary, data visualization can assist with at least three significant management needs: analytical reasoning, communication, and learning (Pich et al. 2002; Puddicombe 2006). Ideally, data in support of all functions would reside in a single system facilitating its ready access, but the reality is that multiple systems may be involved. An underlying assumption of our work is that however the data is stored (single system or multiple systems), all of it can be accessed for use in a visualization environment.

Figure 3.1 Differentiating between current state-of-art CM systems and potential of systems with a formal visualization environment applicable to a wide range of functions
[Figure content: for the functions of quality management, document control, risk management, cost control, scope management and time management and their tasks, support at the task level, across tasks and across functions is compared for (i) current CM systems (strong task visualization for a few functions with some scene capabilities for tasks within a function; low scene generation capabilities amongst functions; limited interaction capabilities) and (ii) a system with a holistic visualization environment (strong task visualization for all relevant functions with scene capabilities between tasks within a function and between functions; coordination amongst images; extensive interaction capabilities). Both draw on a project database of product, process, participant, cost, quality, environment, risk, change and as-built data views. Task level – capability within a task (horizontal); across tasks – vertical within a function; across functions – vertical across functions.]
3.2 Approach and structure of chapter
In Chapter 2, three questions posed and pursued, with emphasis on the first two, dealt with the principles and guidelines for design of a visual analytics environment, the ability of users to perceive the usefulness of individual visual representations, and how to design and implement an analytics environment which is responsive to the realities of the construction industry. With respect to the principles treated, topics addressed included understanding the purposes of analytical reasoning, organizing data representations and data transformations, designing visual representations and interaction features, and the evaluation of visual representation designs. In this chapter, using primarily a top-down design approach which is described later, we extend our past treatment of principles and guidelines by setting out some concepts of analytics as they apply to CM in general. We then elaborate on these concepts in the context of time management, with the focus being on planning, monitoring, diagnosing and controlling time. This allows us to address more fully the third question posed previously, namely “How should a visual analytics environment be designed and implemented so that it is responsive to the realities of the construction industry and satisfies the criterion or test of practicality?” (Russell et al. 2009b).

The visualization environment design task is aided by a top-down approach because it helps one:
o identify the functions to be treated as well as the linkages amongst functions and related tasks in terms of the use of shared data either directly or through transformations;
o identify the analytical reasoning processes about project performance and conditions to be served (i.e. the suite of questions for which answers / insights are sought);
o establish requirements for consistency in the formulation of images and interaction features developed (e.g. filtering, sorting, highlighting, coordination, and navigation capabilities);
o describe the degree of flexibility that users should have especially in terms of being able to create scenes to assist in establishing likely causal relationships;
o determine how the environment should be evaluated to assess the degree of conformance with established requirements; and,
o identify opportunities for data visualization to extend existing management capabilities in terms of reasoning, communication and learning by critically assessing strengths and weaknesses of current management practices (a particular benefit of a top-down approach).

A bottom-up approach on the other hand assists one to identify the properties required of individual images that could be helpful in addressing the analytic reasoning associated with individual tasks within a function and at the function level itself. It is at this detailed level where one applies many of the findings of researchers on image design and image specific interaction features (encodings, use of colour, etc.) (Meyer et al. 2009). In carrying out this approach, opportunities for improving insight-generation capabilities can be identified through the design of novel images that juxtapose information in a manner that aids reasoning (e.g. clustering of change orders in time versus the trajectory of forecast project completion date – see (Russell et al. 2009b)). In reality, one iterates between a top-down and bottom-up approach as the design of a visualization environment is evolutionary in nature – development of one set of capabilities generally leads to ideas for additional capabilities – from the detailed level up to the overall environment level and vice versa.

The remainder of the chapter is structured as follows. A brief overview of recent construction data visualization work is first provided.
Then, as part of the top-down approach, we introduce concepts and useful terminology related to a structured way of thinking about analytical reasoning and visual analytics, and their relationship with construction management functions. The focus of the latter then shifts to how a construction data visualization environment can support project participants’ analytical reasoning needs for the management of time, specifically planning/predicting and monitoring/diagnosing/controlling construction conditions/time performance. A case study of aspects of an actual project examined using the construction data visualization environment developed to date is then presented. Purposes served include demonstrating the breadth of support that can be offered for reasoning by such an environment and thus its usefulness for conducting CM analytic tasks, and providing a test case for demonstrating the kind of evaluation process one should engage in to assess how well an environment conforms to the requirements set out for it. Time management functions treated for this case study include assessing quality of a baseline schedule, assessing actual vs. planned construction conditions/time performance, and assessing reasons for deviations. An evaluation of the current environment is then made to assess conformance / nonconformance with the requirements established for it and to identify worthwhile extensions to it. This evaluation involves both a top-down and bottom-up approach. The chapter concludes with a discussion of lessons learned from work performed to date, and their application to create a more comprehensive visualization environment that supports key tasks within a CM function and multiple functions, as depicted in Figure 3.1.

3.3 Data visualization in construction management

Significant opportunities exist for the integration of advanced interactive tools and techniques along with visual analytic tools in support of a diverse range of CM functions.
To date, however, only a modest level of effort has been expended by the construction academic community on this topic. Despite the fact that the field of visualization has been instrumental in representing how physical artefacts are to be built from constructability reasoning and construction method workability perspectives (e.g. (McKinney and Fischer 1998; Sriprasert and Dawood 2003; Staub and Fischer 1998)), the literature reveals very little about the visualization of heterogeneous multi-source, multi-dimensional, and time varying data in the context of construction management.

With regard to visualizing structured-abstract construction data, Russell and Udaipurwala (2000b; 2002) demonstrated the value of using linear planning charts which, when combined with ancillary images pertaining to the distribution of resource usage in time and space, permitted additional insights on the quality of a schedule to be gleaned. Extensions to this work to include 4D CAD further enhanced the ability to generate insights on quality of schedule and strategies to improve schedule performance (Russell et al. 2009a). Songer et al. (2004) developed and evaluated four visual representations including scatterplot, linked histogram, hierarchical tree, and treemap to represent structured cost control data. Song et al. (2005) proposed a 3D model-based project management control system where the visual platform (i.e. the 3D building model) itself serves as a construction information delivery platform. Vrotsou et al. (2008) applied Time Geographical methods to visualize work sampling data to allow analysts to understand better the distribution of activities and the interdependencies amongst them. Zhang et al. (2009) used an integrated building information system and digital images captured on site to semi-automate the calculation of progress measurements (e.g. cost and schedule variance) for items of a work breakdown structure. Lee and Rojas (2009) recommended principles for designing effective visual representation for actual construction performance data as did the author in Chapter 2.

Based on the literature review of past research on construction management data visualization, it is observed that “a limitation of work to date on abstract construction data visualization as opposed to physical product visualization is that it is mainly exploratory in nature, with limited breadth in terms of the type of data and information entities examined, management functions examined, and guiding principles for designing relevant visual images” (Russell et al. 2009b). In particular, relatively little research seems to have been conducted that addresses a detailed and systematic process for designing either specific images or a comprehensive CM data visualization environment reflective of clearly identified analytical reasoning, decision making, communication, and learning needs. Such a process is required in order to develop a detailed specification / set of requirements for the design of a construction data visualization environment comprehensive enough to support the full spectrum of CM functions.

3.4 Concepts of analytics and relation to development of a construction management data visualization environment

As part of our top-down approach, we seek a structured way of thinking that will help in formulating visual images of data and supporting interaction features to assist with the analytical reasoning tasks associated with various CM functions, singly or in combination, and related performance measures. Introduced in this section is important terminology related to analytical reasoning and visual analytics, and its adaptation to the context of construction management.
We make extensive use of the term ‘model’ in reference to explicit and implicit (tacit) knowledge models and predictive versus explanatory (diagnostic) models. The term explicit model is used to describe a quantitative relationship between input and output variables, whether based on fundamental principles (e.g. Critical Path Method (CPM) network model) or empirically derived (e.g. regression equation, factor based mathematical relationship, neural net). The model structure may or may not be transparent to the user. Implicit models are viewed as residing in the minds of experienced construction management personnel, and in general cannot be readily documented. A predictive model refers to the application of either an explicit or implicit model to forecast or predict likely outcomes in the future based on some assumed fact pattern or set of input variables. In contrast, an explanatory model involves explaining actual outcome values, and basically involves working an explicit or implicit model in reverse and as measured against some originally assumed set of conditions or objectives.

3.4.1 Analytics for construction management

Here we present several analytic concepts as they relate to construction management in general and which assist in determining CM data visualization environment requirements. Construction project management involves an integrated process of managing construction conditions to achieve required levels of performance, dimensions of which include time, cost, scope, quality, safety and risk, and values of which depend on how the construction conditions imposed and/or encountered are managed. The term “construction condition” is an umbrella concept covering construction strategies imposed, construction requirements dictated in contracts, construction constraints encountered, and so forth. Dependency relationships exist amongst construction conditions and performance (e.g. a dependency relationship between productivity and activity duration). Therefore, the term “construction dependency relations” is used to refer to such relationships.

Construction conditions, performance, and dependency relations can be understood by their characteristics, which can be viewed from three perspectives. Firstly, they have a temporal status – one of planned, in progress, or actual. Secondly, conditions, performance and dependency relations can be characterized at different levels of detail, ranging from overall characteristic (i.e. overall qualitative pattern) through local characteristic (i.e. local qualitative pattern) to individual characteristic (i.e. single value). For example, the characterization of time performance can be described as stable overall (i.e. the entire project remains on schedule), the rate of production is accelerating in the middle of the project, or simply the duration of an individual activity. Also, the level of detail can be defined by data range (e.g. entire data set vs. a data subset) and/or data granularity (e.g. all footings vs. spread footings only). Lastly, conditions, performance and dependency relations can be examined from various perspectives such as time, space, product, process, and participant. In general, the primary purpose of the analytical reasoning (i.e. analytics) involved in construction management processes is to gain an understanding of the characteristics of construction conditions, performance, and dependency relations. This “analytics for construction management” is referred to as CM analytics herein. Most of the CM processes associated with different project phases involve the iterative application of CM analytics for the purposes of planning/predicting, monitoring/diagnosing/controlling, and re-planning/re-predicting construction conditions, performance, and dependency relations.
For example, construction time management during the planning phase involves assuming/imposing (i.e. planning) certain construction conditions, applying explicit or implicit dependency relations or models to obtain a forecast (prediction) of time performance and then seeing whether the forecast time performance satisfies the contractual requirements. Purposes served, tasks, and workflows of CM analytics applicable to construction management functions are depicted as CM analytics flow charts in Figure 3.2 and elaborated upon as follows:

1. CM analytics for predicting/planning purposes, Figure 3.2(a): Steps involved include the specification of inputs for the explicit CM prediction model of interest (e.g. (Babu and Suresh 1996; Chao and Skibniewski 1998; Chong et al. 2005; Motawa et al. 2007; Staub-French et al. 2003)), running the relevant model, and obtaining model outputs. The models can be mathematical-function based or artificial intelligence based, and deterministic or probabilistic. Having obtained model outputs, one then examines both inputs (i.e. construction conditions/performance assumed) and outputs (i.e. construction performance predicted) to gain insights (i.e. the purposes served) into:
   • characteristics of the inputs and outputs (e.g. crewing levels vs. milestone dates, undesirable construction conditions vs. performance impacts, etc.) so that project participants can identify additional construction dependency relationships possibly unique to the project at hand or limitations on the models used. Other purposes served include inspecting quality of data entries, and assessing validity of the models used; and
   • how changes to inputs affect outputs, with the goal being to identify the best plan possible.
2. CM analytics for monitoring/diagnosing/controlling purposes: The primary purposes served here deal with examining performance to date, explaining (diagnosing) reasons for it, and then determining the most relevant actions to take. Use is made of, and thus support is required for, both formal (explicit), diagnostic (explanatory) models as well as the implicit ones which reside in the minds of project participants. Specifically, diagnosis can be made:
   • with the use of explicit CM explanatory models (see Figure 3.2(b)) by preparing inputs for an explicit CM explanatory model (e.g. Battikha 2008; Moselhi et al. 1991; Russell and Fayek 1994; Soibelman and Kim 2002), running the model, and obtaining outputs. The models can be mathematical-function based or artificial intelligence based. By examining the inputs (i.e. deviation of actual vs. planned/baseline construction conditions/performance) and outputs (i.e. actual or planned/baseline construction conditions/performance that explain the deviations) of the explanatory models, insights into their characteristics can be gained, reasonableness of model results assessed, and changes made to planned/baseline construction conditions/performance as appropriate; and,
   • without the use of explicit CM explanatory models (see Figure 3.2(c)). Steps involved include examining actual vs. planned/baseline construction conditions/performance to gain insights into their characteristics so that likely reasons for any deviations between the actual and the planned/baseline can be inferred, using human-based reasoning processes. Once these reasons are identified, changes to planned/baseline construction conditions/performance may be pursued as part of the corrective/preventive actions.

[Figure 3.2 CM analytics flow charts: panels (a), (b), and (c)]

The solid flow lines connecting the text boxes in the CM analytics flow charts (Figure 3.2) represent the CM analytics workflows that utilize explicit, machine-based CM models (referred to later as machine-based CM analytics) while the dashed flow lines connecting text boxes and mind maps represent CM analytics workflows that utilize human tacit CM knowledge (referred to later as human-based CM analytics).
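
As a minimal sketch of the predicting/planning workflow of Figure 3.2(a), the Python fragment below treats a small CPM network as the explicit prediction model: activity durations and precedence links are the inputs, and early finish days are the outputs. The activity names, durations, and the `forward_pass` helper are hypothetical and are not drawn from the case study or from any particular scheduling system; only finish-to-start links are assumed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def forward_pass(durations, predecessors):
    """CPM forward pass, assuming finish-to-start links only.

    durations: {activity: duration in working days}
    predecessors: {activity: [preceding activities]}
    Returns {activity: early finish day}, counting from day 0.
    """
    graph = {act: predecessors.get(act, []) for act in durations}
    early_finish = {}
    # static_order() yields every predecessor before its successors.
    for act in TopologicalSorter(graph).static_order():
        early_start = max((early_finish[p] for p in graph[act]), default=0)
        early_finish[act] = early_start + durations[act]
    return early_finish


# Hypothetical inputs (assumed construction conditions).
predecessors = {
    "excavate": [],
    "footing": ["excavate"],
    "pier": ["footing"],
    "backfill": ["footing"],
}
baseline = {"excavate": 3, "footing": 4, "pier": 6, "backfill": 2}

# What-if input: a smaller footing crew is assumed to add two days.
what_if = dict(baseline, footing=6)

baseline_finish = max(forward_pass(baseline, predecessors).values())
what_if_finish = max(forward_pass(what_if, predecessors).values())
print(baseline_finish, what_if_finish)
```

Comparing the two predicted completion days is one simple instance of examining how changes to inputs affect outputs; a visualization environment would present such comparisons graphically rather than as printed numbers.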
Different project managers have different thinking styles, experiences, and knowledge (Tullett 1996) and it is difficult to predict the steps a person takes to explore, acquire, organize, and use information to assist analytical reasoning (Stolte et al. 2002; Tullett 1996). Nevertheless, the CM analytics processes depicted in Figure 3.2 account for most CM analytical reasoning scenarios that project participants use in carrying out construction management functions.

3.4.2 Visual CM analytics supported by a data visualization environment

For project participants to conduct the CM analytics tasks just described, an essential complementary task is to search out and analyze data relevant to construction conditions, performance, and dependency relations. The current state-of-the-art of CM data analysis is predominantly computational in nature and machine-based. However, computational data analysis requires pre-defined algorithms that capture explicit CM knowledge either in transparent or non-transparent form. While useful, these algorithms on the one hand can be too simplistic because they leave out complex and comprehensive real world phenomena, or on the other hand, require data analysis experts to operate them (Diekmann 1992), leading to reservations on the part of industry to use them. Computational data analysis for construction management constitutes the “machine-based CM analytics” mentioned in the previous sub-section. As an alternative to the foregoing, a relatively recent data search/analysis paradigm, described as “the use of visual representations of data and interactions to accelerate rapid insights into complex data” and termed “visual analytics” (Thomas and Cook 2005), is advocated herein. It transcends the use of computerized statistical graphics as an essential data analysis tool, which began in the 1960s and 1970s (Friendly 2008; Schmid 1983), and the development of visualization for information search emerging in the 1980s and 1990s (e.g. (Card et al. 1991; Shneiderman 1994)).
The paradigm of visual analytics has the potential of addressing some of the shortcomings of computational data analysis. Equally if not more importantly, it is also a particular fit for CM use because a construction project involves all types of data, including structured abstract data (our primary focus herein), geometric data, and unstructured data such as pictures and textual documents. With the advanced capabilities of graphics automation, interactivity, and computation provided by computer technology, it is possible to develop a computerized construction data visualization environment that allows project participants to go quickly through complex data presented in easily understood/natural visual forms. This facility allows them to gain important insights by applying their experience and knowledge to interpreting what they see in the images. Therefore, the term “visual CM analytics” can be defined as “conducting CM analytics by the visual analytics approach”. This kind of visual data analysis is mainly applied to support the “human-based CM analytics” mentioned in the previous sub-section and focused on hereafter.

A data visualization environment should allow users to conduct visual CM analytics in support of both human-based and machine-based CM analytics. Such an environment is meant to be an interactive one built as an integral part of a computerized construction information system. It should allow users to access and navigate galleries of visual representations on demand. These representations depict the complex data stored in the information system and illustrate salient characteristics of construction conditions, construction performance, and construction dependency relations hidden in this data.
The shaded text boxes of the CM analytics flow charts seen in Figure 3.2 illustrate construction data that should be turned into visual representations or images that are pre-coded in the construction data visualization environment for project participants to select and view. This gallery of images has to be capable of presenting the characteristics of a spectrum of construction conditions and performance reflective of different time statuses, at different levels of detail, and as observed from the perspectives of different project views (e.g. product, process, participant, as-built, change, quality, etc.). The dashed arrowed lines going from shaded boxes to mind maps in the same figures represent the possible CM analytics workflows that project participants may take by interacting with this visualization environment. This interaction takes the form of iteratively accessing and navigationally viewing the pre-coded images individually or in the form of image scenes, depending on participant cognitive styles and purposes of the CM analytics. Inclusion of a comprehensive set of interaction features is core to the successful development of a responsive visualization environment.

To support the types of analytics described in the foregoing, three essential requirements must be addressed in a CM data visualization environment, as follows:

1. An extendable gallery of ready-to-use (pre-coded) visual representations of CM data should be supported. These ready-to-use representations encapsulate a range of construction conditions/performance mapped against the primary CM data dimensions of time, space, product, process, and participant in support of analytic reasoning. Predefinition of these representations in as flexible and easy to use a manner as possible is seen as being essential for them to be employed for everyday use by construction practitioners.
2. When representing specific construction conditions/performance couplets, the ability to represent different status states must be supported (e.g. planned versus actual). This includes the ability to show the differences (deviations) in status values of performance measures as a function of the differences of status values of condition parameters.
3. A rich set of interaction features should be present to allow users to interact with the environment in support of the following essential tasks: (a) choose the visual representation to be generated; (b) adjust granularity and value range of data dimensions for the visual representation of interest; and, (c) view several visual representations which portray different aspects of construction conditions/performance either sequentially or simultaneously (i.e. multiple views (Wang Baldonado et al. 2000)). Being able to display multiple views is particularly useful because not only does it reduce the burden on human visual memory (Plumlee and Ware 2002), it also supports (i) scanning overviews and then viewing details, (ii) comparing differences (Roberts 2007), (iii) viewing a range of attributes (Wang Baldonado et al. 2000) related to the user’s chosen CM strategy, and (iv) comparing conditions with performance measures to observe likely cause-effect relationships.

3.4.3 Analytics for time performance management

Described in this section are the analytic reasoning processes associated with planning and then monitoring, diagnosing and controlling time performance. In the case study section of the chapter, we demonstrate an implementation of these processes in the form of a research visualization environment.

3.4.3.1 Visual CM analytics for planning/predicting time

An important and representative CM analytics task for planning/predicting analytic reasoning purposes involved in construction time management is appraising the quality of a planned schedule (i.e. inputs and outputs of the scheduling model).
This task encompasses many aspects of planned construction conditions/performance (e.g. crewing and sequencing conditions, early/late finish dates, etc.). A complementary reasoning task that involves the same data and which could benefit from the application of visual analytics is judging the quality of data entry, which includes identifying erroneous data values (e.g. crewing data, activity duration data, etc.).

Reasoning about a schedule and associated data in terms of both formulation and evaluation from the contractor’s perspective involves a number of considerations which are also applicable to the reasoning associated with monitoring/diagnosing/controlling. They include:

• Ordering of construction processes: The effective ordering or sequencing of construction processes requires consideration of at least three project dimensions, these being time, space, and project participant. The time dimension speaks to the logic or order in which work should be performed, and hence its placement in time. Intersecting with this ordering is consideration of the sequence in which trades (project participant) should perform their work to the extent that it is discretionary, so as to minimize deficiencies plus access and possibly congestion issues. For large scale linear projects in particular, consideration also needs to be given to the order in which work flows through locations – i.e. the location sequence for multi-location activities. For example, it is desired to execute such activities in order of physical location adjacency so as to reduce mobilization efforts to a minimum. Time ordering considerations other than technological ones and preferred trade sequencing deal with balancing production rates by adjusting resource levels to reflect different location and trade work scopes and achieving work continuity to minimize interruptions between work at the locations of a multi-location activity.
• Distribution of construction processes: With respect to the distribution of construction processes, consideration should be given to the project dimensions of time, space, project participant and physical component. The issue of granularity with which these dimensions are expressed also comes into play, as discussed below. Schedule properties of interest include: (a) how much work is/can be packed into a specific time window and over how many locations; (b) how much work is being executed during this time window at the same work location with emphasis on simultaneous or overlapping activities and what is the distribution of this work in terms of sub-locations, if any; (c) what is the footprint of the work being performed for a specific time window (i.e. number of locations in progress simultaneously); (d) how many work faces are there and their distribution over the project site; (e) what spans of control for project participants in terms of active locations are implied; and (f) how much time is required to complete an individual physical component or collection of similar components at a work location. All of these properties reflect on the workability and quality of a schedule.
• Granularity: Granularity refers to the level of aggregation or decomposition that is useful for communicating information to project participants and for extracting meaning from the schedule. Dimensions involved include time, space, project participant, physical component (product), and activity (process). Ideally, it would be desirable to be able to work at different levels of detail for each dimension. In terms of time, from hours to days to weeks to months to years; for space, from collections of locations to individual locations to sub-locations of a location; for participants, from individual trades to collections of participant types; for physical components, from individual constituents of a component to a complete component to collections of a specific component; and for activities, from individual activities to collections of activities for shared properties of interest – i.e. belonging to the same physical component or same project participant.
• Compliance: Compliance refers to meeting requirements specified by contractual language (e.g. milestone dates, including project completion and client specified time windows for specific aspects of the work), applicable codes, regulations and agreements (e.g. permits required, allowable working hours and days, etc.), and constraints imposed by prevailing natural and man-made conditions (e.g. black out times for certain types of work, resource availability limitations, restrictions on work face access, etc.).

3.4.3.2 Visual CM analytics for monitoring/diagnosing/controlling time

During the execution phase, construction time management is mainly an iterative three-step process of monitoring, diagnosing, and controlling construction conditions/time performance. As an overview of the primary roles that CM analytics can play, for the monitoring function, it can assist with identifying deviations between actual and planned/baseline characteristics of construction conditions/time performance. For the diagnostic function, it can help in identifying actual or planned/baseline characteristics of construction conditions/time performance that explain the deviations. And for control, it can help to identify changes to planned/baseline construction conditions/time performance that are reflective of findings from the diagnostic function and that may aid in mitigating problems encountered to date.
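
The monitoring role described above, identifying deviations between actual and planned/baseline time performance, can be illustrated with a short Python sketch. The `ActivityRecord` structure, the helper names, and all dates and locations are hypothetical and chosen only for illustration; they do not represent case study data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActivityRecord:
    name: str
    location: str
    planned_finish: date
    actual_finish: date


def finish_deviations(records):
    """Return {(activity, location): days late}; negative values mean early."""
    return {
        (r.name, r.location): (r.actual_finish - r.planned_finish).days
        for r in records
    }


def late_activities(records, tolerance_days=0):
    """Monitoring step: activities finishing later than planned, worst first."""
    late = [
        (key, days)
        for key, days in finish_deviations(records).items()
        if days > tolerance_days
    ]
    return sorted(late, key=lambda item: item[1], reverse=True)


# Hypothetical planned vs. actual finish dates for two work locations.
records = [
    ActivityRecord("footing", "P01", date(2000, 3, 6), date(2000, 3, 6)),
    ActivityRecord("footing", "P02", date(2000, 3, 10), date(2000, 3, 14)),
    ActivityRecord("pier", "P01", date(2000, 3, 20), date(2000, 3, 18)),
    ActivityRecord("pier", "P02", date(2000, 3, 24), date(2000, 4, 2)),
]

for (name, location), days in late_activities(records):
    print(f"{name} at {location}: {days} days late")
```

Keying each deviation by both activity and location keeps the result ready to be grouped along the process and space dimensions, which is the kind of slicing the diagnostic function then relies on.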
An additional function served by CM analytics is to assist in determining the completeness and accuracy of conditions data by providing big picture views of it in order to determine data gaps and anomalous values. Incomplete and inaccurate data can make it difficult to diagnose the true cause(s) of deviations encountered and can result in conflicting interpretations of reasons for performance to date.

In essence, in order to design a visualization environment for the monitoring, diagnosing and controlling functions in the context of time management, one seeks answers to the following question groups.

Question group (1) – Fact representation of time performance
(i) What are the key performance indices for construction time performance?
(ii) What are their values and patterns of behaviour and how can they provide useful insights into time performance versus various project dimensions and assist in positing cause-effect hypotheses?

Addressing these questions allows one to examine time performance as a function of one or more project dimensions (i.e. space, time, participant, product, process) and determine regions of inadequate performance – i.e. one seeks to determine the facts. This might then be followed by reflecting on possible reasons for it, and hence conditions that should be explored. For example, if activities (process dimension) only experienced delays in the later stages of a project (the time dimension), for one project participant (e.g. subtrade), then construction personnel could focus their attention on conditions pertaining to this stage of construction in order to identify potential causes of delay.

Question group (2) – Fact representation of conditions and other performance measures
(i) What construction conditions or other construction performance measures (i.e. measures not directly time related such as scope, safety) may have contributed to the time performance detected in addressing question group 1?
(ii) What are their values and patterns of behaviour and how can they provide useful insights into time performance versus various project dimensions and assist in positing cause-effect hypotheses?

The primary value in seeking answers to these questions is to determine if the factors contributing to unsatisfactory time performance are more broadly based than ones for which management personnel believe there is a direct causal relationship with time performance. This is reflective of the intricate web of interactions amongst the condition variables and performance dimensions that accompany construction projects. For example, it has been demonstrated that construction data collected for one management function has the potential to be used in CM analytics for more than one management function. This was illustrated by Lu and Anson (2004), who used quality control data to analyze the actual productivity of placing concrete. Thus, the facility must be present to allow users to explore a wide store of data, not just data directly related to one performance measure.

Question group (3) – Evidence in support of participant causal models
(i) What is the specific evidence in terms of comparisons between planned and actual conditions and between conditions and performance that support project participant causal models to explain the time performance of a particular collection of activities related to one or more project dimensions?

Here one seeks to demonstrate compelling causal links between conditions and performance determined by addressing question groups 1 and 2 as a function of known and generally accepted quantitative models, participant tacit knowledge, or by direct learning from the unique project context at hand. In so doing, one seeks to pinpoint conditions/performance evidence that can objectively explain the time performance of interest.
This can be a challenging task, given that the analytical reasoning involved may not be limited to simply identifying one layer of cause-effect relationship between construction conditions and construction performance.

3.4.3.3 Visualization requirements deduced from time performance management analytic needs

Based on the foregoing discussion of the time management function, the requirements for a generalized CM data visualization environment identified earlier in the chapter should be augmented by the following ones:

1. One focus should be on visualizing individual activity time performance, for which the time units of date and duration are used as performance metrics. The ability to represent visually contractual requirements with regard to planned and actual time performance of activities should also be supported in order to assess compliance with contractual requirements, both for the planning and execution phases of a project.
2. The ability to associate construction conditions that can be represented visually with individual activities should be supported. Thus, for activities that experience unsatisfactory time performance, conditions directly relevant to those activities can be readily displayed. The association can be explicitly specified by users or inferred because of their proximity in time and space, and possibly with participant and product.
3. The ability to display sequences of activities and associated conditions should be supported. Such sequences can relate to one or more paths in a network model, the sequence of work of an individual trade, or the sequence of work at a specific work location.
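
The second requirement above mentions inferring the association between conditions and activities from their proximity in time and space. One possible reading of that idea is sketched below in Python, under the simplifying assumption that "proximity" means the same work location and overlapping dates. The `Activity` and `Condition` structures, the `infer_conditions` helper, and all sample values are hypothetical and do not describe how any existing system performs the association.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Activity:
    name: str
    location: str
    start: date
    finish: date


@dataclass
class Condition:
    description: str
    location: str
    start: date
    finish: date


def overlaps(a_start, a_finish, b_start, b_finish):
    """True when two closed date intervals share at least one day."""
    return a_start <= b_finish and b_start <= a_finish


def infer_conditions(activity, conditions):
    """Conditions at the activity's location whose dates overlap its dates."""
    return [
        c
        for c in conditions
        if c.location == activity.location
        and overlaps(activity.start, activity.finish, c.start, c.finish)
    ]


# Hypothetical activity with unsatisfactory time performance.
pier = Activity("pier", "P02", date(2000, 3, 20), date(2000, 4, 2))

# Hypothetical as-built condition records.
conditions = [
    Condition("heavy rain", "P02", date(2000, 3, 22), date(2000, 3, 23)),
    Condition("utility conflict", "P03", date(2000, 3, 21), date(2000, 3, 25)),
    Condition("site access blocked", "P02", date(2000, 4, 10), date(2000, 4, 12)),
]

matched = [c.description for c in infer_conditions(pier, conditions)]
print(matched)
```

A fuller treatment would also allow associations specified explicitly by users, and could extend the matching rule to the participant and product dimensions, as the requirement suggests.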
3.5 Case study of CM analytics using a data visualization environment

A case study that treats construction time management during the planning, execution, and post-execution phases is used for a twofold purpose: (a) to demonstrate how a construction data visualization environment can assist with important CM analytic tasks; and (b) to elaborate on the process for developing a CM data visualization environment, with emphasis on evaluation to assess compliance with requirements developed as part of a top-down and bottom-up design approach. Analytic purposes served include: planning/predicting (e.g. assessing planned construction condition/time performance such as quality of schedule) and monitoring/diagnosing (e.g. assessing actual vs. planned/baseline construction conditions/time performance). Such case studies also help to identify the benefits that could be derived from additional features or requirements of a visualization environment. The current implementation of the visualization environment forms part of the REPCON research system (Russell and Udaipurwala 2004). It employs custom-designed schedule graphics routines and pre-coded as-built graphics and associated features using CHARTFX 6.2 Client Server (Software FX Inc.). During the original development of the system, when creation of a visualization environment was not the primary focus, a bottom-up approach was used for visual representations, with the focus being on the design of individual images. This led to recognition of the potential for a comprehensive data visualization environment for integrating the use of these images and hence pursuit of the current top-down approach. As a result, there has been a significant enrichment of the palette of images and user interface features supported.

3.5.1 Case study overview

The case study data comes from a 3 km segment of the original Advanced Light Rapid Transit Project (ALRT) in Vancouver, British Columbia built some 24 years ago.
It reflects the actual as-planned and as-built schedule data and the problems encountered, as seen through the eyes of the contractor. The scope of the work consisted of building 103 foundations and piers in support of a pre-cast beam elevated guideway, with installation of the beams being performed by others.

Use is made here of three project views: (i) the product view, which contains data related to what is to be built and the site context; (ii) the process view, which contains data pertaining to how, when, where and by whom a project is being built; and (iii) the as-built view, which captures data that describes what happened, why and actions taken. The product view (Physical Component Breakdown Structure (PCBS)) consists of a simplified list of project components and a listing of all work locations in planned location sequence, as partially shown in Figure 3.3(a). Location attributes related to problems encountered are depicted in Figure 3.3(b), and a photo of actual columns is shown in Figure 3.3(c).

Figure 3.3 Product view – (a) project locations; (b) location attributes; and (c) photo of components

The project plan and schedule (process view) was recreated using the original assumptions and constraints related to work location sequence, work continuity constraints, number of crews, contract milestone constraints, and decomposition into phases because of the then capacity constraints of the scheduling system used. (A much more elegant modeling of the project could now be done using current technology, which would enhance the clarity of the schedule images.) As can be observed in Figures 3.4(b) and (c), respectively, the bold red lines in the linear planning (LP) chart and filled bars in the bar chart schedule representations show the critical activities.
Because of the way the schedule was originally defined with start milestones to link the phases and no finished milestones with late date constraints, no critical path is shown from beginning to end of the project. The scope of activities includes: (i) survey & layout; (ii) excavate & mud slab; (iii) form & reinforce footings; (iv) pour footings; (v) form columns; (vi) pour columns; (vii) cure & strip columns; (viii) backfill & grade; and (ix) cleanup. Other activities not present at all locations include piling at locations of weak soil conditions and construction of column crossheads (form, reinforce, pour, cure & strip, and post tension).

The as-built view contains data pertaining to the daily status of site conditions as well as activity status from the available records maintained by the contractor. The activity status includes “start”, “ongoing”, “finish”, “same day activity start and finish”, “idle” (work started but later interrupted), and “postponed” (start of activity delayed). Daily site environmental data for the project’s time frame was retrieved from the Environment Canada web site as we did not have access to weather dat